RAG is often implemented using vector search against embeddings, but there’s an alternative approach where you turn the user’s question into some full-text search queries, run those against a traditional search engine, then feed the results back into an LLM and ask it to use them to answer the question.

This is considerably easier to reason about than RAG (retrieval-augmented generation) built on embedding-based vector search, and it can produce high-quality results with a relatively simple implementation.
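To make the flow concrete, here is a minimal sketch of the three steps: generate search queries, run them against a full-text index, and hand the hits back to the model. It assumes SQLite's FTS5 extension stands in for the "traditional search engine" (most Python builds include it, but not all), and that `llm` is a hypothetical callable taking a prompt string and returning text. The function names, table layout, and prompts are illustrative and not taken from any particular implementation.

```python
import re
import sqlite3
from typing import Callable


def build_index(docs: dict[str, str]) -> sqlite3.Connection:
    """Create an in-memory FTS5 index from {title: body} pairs (illustrative only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE docs USING fts5(title, body)")
    conn.executemany("INSERT INTO docs (title, body) VALUES (?, ?)", docs.items())
    return conn


def to_match_expr(query: str) -> str:
    """Quote each word and OR them together so punctuation is never parsed as FTS5 syntax."""
    return " OR ".join(f'"{term}"' for term in re.findall(r"\w+", query))


def search(conn: sqlite3.Connection, query: str, limit: int = 5) -> list[tuple[str, str]]:
    """Return (title, body) rows for a query, best match first."""
    expr = to_match_expr(query)
    if not expr:
        return []
    return conn.execute(
        "SELECT title, body FROM docs WHERE docs MATCH ? ORDER BY rank LIMIT ?",
        (expr, limit),
    ).fetchall()


def answer(conn: sqlite3.Connection, question: str, llm: Callable[[str], str]) -> str:
    # Step 1: ask the model to turn the question into search queries, one per line.
    raw = llm(
        "Write up to three short full-text search queries, one per line, "
        f"that would help answer this question: {question}"
    )
    queries = [line.strip() for line in raw.splitlines() if line.strip()]

    # Step 2: run every query and de-duplicate the hits by title.
    hits: dict[str, str] = {}
    for query in queries:
        for title, body in search(conn, query):
            hits.setdefault(title, body)

    # Step 3: feed the results back and ask for an answer grounded in them.
    context = "\n\n".join(f"## {title}\n{body}" for title, body in hits.items())
    return llm(
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

Every intermediate value here is inspectable text: you can log the generated queries and the matched rows and see exactly why a document was or was not retrieved, which is a large part of what makes this approach easier to reason about.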

It’s often much easier to bake FTS (full-text search) into an existing site than to build a pipeline for embedding search.
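As one illustration of how small that retrofit can be, the sketch below adds a search index to a site that already stores its content in SQLite. The `posts` table, its columns, and the function names are hypothetical, and it assumes the SQLite build includes FTS5. It uses an FTS5 external-content table, so the text itself stays in the original table and only the index is new.

```python
import sqlite3


def add_search_index(conn: sqlite3.Connection) -> None:
    """Attach an FTS5 index to an existing, hypothetical posts(id, title, body) table."""
    conn.executescript(
        """
        CREATE VIRTUAL TABLE IF NOT EXISTS posts_fts USING fts5(
            title, body, content='posts', content_rowid='id'
        );
        -- Populate (or repopulate) the index from the rows already in posts.
        INSERT INTO posts_fts(posts_fts) VALUES ('rebuild');
        """
    )


def search_posts(conn: sqlite3.Connection, term: str, limit: int = 10) -> list[tuple[int, str]]:
    """Return (id, title) for posts matching a single search term, best match first."""
    # The term is wrapped in double quotes so FTS5 treats it as a literal phrase.
    phrase = '"' + term.replace('"', '""') + '"'
    return conn.execute(
        "SELECT p.id, p.title FROM posts_fts "
        "JOIN posts AS p ON p.id = posts_fts.rowid "
        "WHERE posts_fts MATCH ? ORDER BY rank LIMIT ?",
        (phrase, limit),
    ).fetchall()
```

An external-content index does not update itself when `posts` changes, so a real site would need triggers or a periodic rebuild to keep it in sync. Even so, there are no embedding models to run, no vectors to store, and no chunking strategy to tune.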
