by sgk284
171 days ago
Reranking is definitely the way to go. We found common reranker models a little too opaque (you can't explain to the user why a result was picked) and not quite steerable enough, so we use another LLM for reranking instead. We open-sourced our implementation just this week: https://github.com/with-logic/intent

We use Groq with gpt-oss-20b, which gives great results and adds only ~250ms to the processing pipeline. If you use mini / flash models from OpenAI / Gemini, expect 2.5s-3s of overhead.
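For anyone wondering what LLM-as-reranker looks like in practice, here is a rough sketch. To be clear, this is not the API of the library linked above: `rerank`, `build_prompt`, `Ranked`, and the `complete` callable are all hypothetical names, the prompt wording and 0-10 scale are assumptions, and `complete` stands in for whatever chat-completion client you use. The idea is just to ask the model for a score plus a one-line reason per candidate, so the ranking stays explainable.

```python
import json
from typing import Callable, List, NamedTuple


class Ranked(NamedTuple):
    doc: str
    score: float
    reason: str  # model-written explanation you can show to the user


def build_prompt(query: str, docs: List[str]) -> str:
    # Number the candidates so the model can refer to them by index.
    numbered = "\n".join(f"[{i}] {d}" for i, d in enumerate(docs))
    return (
        "Rate how well each candidate answers the query on a 0-10 scale.\n"
        'Reply with JSON only: [{"index": int, "score": number, "reason": str}]\n'
        f"Query: {query}\nCandidates:\n{numbered}"
    )


def rerank(
    query: str,
    docs: List[str],
    complete: Callable[[str], str],  # hypothetical: prompt in, model text out
    top_k: int = 3,
) -> List[Ranked]:
    raw = complete(build_prompt(query, docs))
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        items = None
    if not isinstance(items, list):
        # Unusable reply: keep the retriever's original order.
        return [Ranked(d, 0.0, "unparseable model reply") for d in docs[:top_k]]

    ranked, seen = [], set()
    for item in items:
        if not isinstance(item, dict):
            continue
        i = item.get("index")
        # Skip indices that are missing, out of range, or repeated.
        if not isinstance(i, int) or not 0 <= i < len(docs) or i in seen:
            continue
        try:
            score = float(item.get("score", 0))
        except (TypeError, ValueError):
            score = 0.0
        seen.add(i)
        ranked.append(Ranked(docs[i], score, str(item.get("reason", ""))))

    ranked.sort(key=lambda r: r.score, reverse=True)
    return ranked[:top_k]


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call, so the sketch runs offline.
    return json.dumps([
        {"index": 0, "score": 2, "reason": "off topic"},
        {"index": 1, "score": 9, "reason": "directly answers the query"},
    ])


if __name__ == "__main__":
    docs = ["how to bake bread", "reranking search results with an LLM"]
    for r in rerank("llm reranking", docs, fake_llm, top_k=2):
        print(r.score, r.doc, "|", r.reason)
```

Swap `fake_llm` for a real client call to try it against an actual model. The steerability comes from the prompt: you can add domain rules or user preferences to `build_prompt` in plain language, which is hard to do with a fixed reranker model.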