RAG vs fine-tuning: how we pick for client builds

A decision tree we use when a product needs LLMs: when retrieval wins, when weights need to change, and what it costs in time and money.

Most products don’t need a custom model on day one. They need correct answers from their data with an honest latency and cost envelope.

When RAG is enough

RAG keeps the model generic; the “memory” lives in a store you control. For many B2B tools, that’s the right default.
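A minimal sketch of that shape (the in-memory store and word-overlap scorer are toy stand-ins for an embedding model and a real vector store; the doc contents are invented):

```python
# Toy "store you control": in a real build this is a vector DB you own.
DOCS = {
    "pricing": "Enterprise plan is billed annually; overages at $0.02/request.",
    "sla": "Uptime SLA is 99.9% measured monthly, excluding scheduled maintenance.",
}

def retrieve(question: str, k: int = 1) -> list[str]:
    # Toy lexical scorer: count shared words. Real builds use embeddings.
    def score(doc: str) -> int:
        return len(set(question.lower().split()) & set(doc.lower().split()))
    return sorted(DOCS.values(), key=score, reverse=True)[:k]

def build_prompt(question: str) -> str:
    # The model stays generic; the knowledge rides in via the prompt.
    context = "\n".join(retrieve(question))
    return f"Answer using only this context:\n{context}\n\nQ: {question}\nA:"
```

Updating the product's knowledge means updating the store, not retraining anything, which is why this is the right default for most B2B tools.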

When fine-tuning (or domain adaptation) enters the picture

Weights need to change when the problem is behavior rather than knowledge: strict output formats, a consistent voice, or cost and latency targets that push you toward a smaller specialist model. Adding documents to a retrieval store fixes none of those.
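Fine-tuning in practice means supervised examples of the behavior you want. A minimal sketch of turning logged interactions into a chat-format JSONL training file (one JSON object per line is the common convention; the example data and field names here are invented, and exact schemas vary by provider):

```python
import json

# Hypothetical logged interactions showing the behavior we want repeated.
EXAMPLES = [
    {"user": "Summarize this ticket.", "ideal": "One-line summary, then owner and severity."},
]

def to_chat_jsonl(examples: list[dict]) -> str:
    # One chat-format record per line: the shape most fine-tuning APIs expect.
    lines = []
    for ex in examples:
        record = {"messages": [
            {"role": "user", "content": ex["user"]},
            {"role": "assistant", "content": ex["ideal"]},
        ]}
        lines.append(json.dumps(record))
    return "\n".join(lines)
```

The expensive part is not this transformation but curating enough consistent examples of the target behavior.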
How we actually decide

We prototype with RAG + strong eval questions from the client, then measure. If the failure mode is “doesn’t know our edge cases,” we improve data and retrieval. If the failure mode is “can’t behave like this every time,” we talk fine-tuning or small specialist models.
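That measure-first loop can be sketched as a tiny triage step. The bucket names are ours, and the grades are assumed to come from a human reviewer or an LLM judge upstream:

```python
def triage(evals: list[dict]) -> dict[str, int]:
    # Bucket graded eval answers into the two failure modes we act on.
    buckets = {"pass": 0, "missing_knowledge": 0, "inconsistent_behavior": 0}
    for e in evals:
        buckets[e["grade"]] += 1
    return buckets

def next_step(buckets: dict[str, int]) -> str:
    # Act on the dominant failure mode, mirroring the decision above:
    # "doesn't know our edge cases" -> fix data and retrieval first.
    if buckets["missing_knowledge"] >= buckets["inconsistent_behavior"]:
        return "improve data and retrieval"
    return "discuss fine-tuning or a small specialist model"
```

The point of the sketch is the ordering: retrieval fixes are cheap and reversible, so they get tried before any weights change.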

No dogma, just evidence from your traffic and your risk tolerance.