by jimmypk
60 days ago
The BM25-first routing bet is interesting. You mention 85% recall@20 on 500 artifacts, but the heuristic classifier routing "short lookups to BM25 and narrative queries to cited-answer" raises a practical question: what does the classifier key on to decide whether a query is narrative or short? Token count? Syntactic structure? I ask because in agent-generated queries the boundary is often blurry: an agent doing a dependency lookup might issue a surprisingly long, well-formed sentence. If the classifier routes those to the more expensive cited-answer loop, it could negate the latency advantage of putting BM25 first.
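To make the concern concrete, here is a minimal sketch of the failure mode. Everything in it is hypothetical: the function names, the 6-token threshold, and the lookup-cue regex are my own illustrations, not a description of how your classifier works.

```python
import re

# Hypothetical lookup cues; the post does not say what the real classifier keys on.
LOOKUP_HINTS = re.compile(
    r"\b(depends? on|dependenc(y|ies)|imports?|version of|defined in|where is)\b",
    re.IGNORECASE,
)


def route_by_length(query: str, max_tokens: int = 6) -> str:
    """Naive rule: short queries go to BM25, everything else to cited-answer."""
    return "bm25" if len(query.split()) <= max_tokens else "cited_answer"


def route_by_intent(query: str, max_tokens: int = 6) -> str:
    """Length rule plus lookup cues, so long but lookup-shaped queries stay on BM25."""
    if len(query.split()) <= max_tokens:
        return "bm25"
    if LOOKUP_HINTS.search(query):
        return "bm25"
    return "cited_answer"


# A long, well-formed dependency lookup of the kind an agent might emit.
agent_query = "Which packages in the payments service depend on requests version 2.31 or later?"

print(route_by_length(agent_query))  # length alone sends it to the expensive path
print(route_by_intent(agent_query))  # lookup cues keep it on BM25
```

If the real classifier is closer to `route_by_length`, agent traffic would likely skew toward the cited-answer loop. A cue-based rule like `route_by_intent` is only one possible mitigation and has its own false positives.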