Note on Building Search-Based RAG Using Claude, Datasette and Val.Town via Simon Willison
RAG is often implemented using vector search against embeddings, but there’s an alternative approach where you turn the user’s question into some full-text search queries, run those against a traditional search engine, then feed the results back into an LLM and ask it to use them to answer the question.
This is considerably easier to reason about than RAG (retrieval-augmented generation) using vector search based on embeddings, and can provide high-quality results with a relatively simple implementation.
It’s often much easier to bake FTS (full-text search) onto an existing site than to build a pipeline for embedding search.
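The search-based flow described above (question to search queries, queries to results, results back to the model) can be sketched in a few lines. This is a minimal illustration, not the implementation from the original post: it assumes an in-memory SQLite database with the FTS5 extension available, and the `llm` parameter is a hypothetical callable that takes a prompt string and returns the model's text. The function names `build_index`, `search`, and `answer`, and the prompt wording, are likewise made up for this example.

```python
import sqlite3
from typing import Callable


def build_index(docs: list[tuple[str, str]]) -> sqlite3.Connection:
    """Create an in-memory FTS5 table holding (title, body) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE docs USING fts5(title, body)")
    conn.executemany("INSERT INTO docs (title, body) VALUES (?, ?)", docs)
    return conn


def search(conn: sqlite3.Connection, query: str, limit: int = 3) -> list[tuple[str, str]]:
    """Run one full-text query and return the best-ranked (title, body) rows."""
    # Quote each term so punctuation in the query is not parsed as FTS5 syntax,
    # and join with OR so a document matching any term is returned.
    terms = " OR ".join('"' + t.replace('"', '""') + '"' for t in query.split())
    if not terms:
        return []
    rows = conn.execute(
        "SELECT title, body FROM docs WHERE docs MATCH ? ORDER BY rank LIMIT ?",
        (terms, limit),
    )
    return rows.fetchall()


def answer(question: str, conn: sqlite3.Connection, llm: Callable[[str], str]) -> str:
    """Search-based RAG: generate queries, retrieve matches, then answer from them."""
    # Step 1: ask the model (hypothetical `llm` callable) for search queries.
    raw = llm(
        "Write up to three short full-text search queries, one per line, for: "
        + question
    )
    # Step 2: run every query and de-duplicate the results by title.
    found: dict[str, str] = {}
    for query in raw.splitlines():
        for title, body in search(conn, query):
            found.setdefault(title, body)
    context = "\n\n".join(f"{title}\n{body}" for title, body in found.items())
    # Step 3: feed the retrieved text back to the model along with the question.
    return llm(f"Answer using only this context:\n{context}\n\nQuestion: {question}")
```

To try it, pass any function that maps a prompt to a string as `llm`. In a real deployment that function would wrap a call to a model API, and the FTS table would already exist on the site being searched.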