by BoorishBears
1022 days ago
I guess this really boils down to your use case: if you can give your user a result with fully predictable latency (my biggest beef with non-Azure OpenAI), no additional round trip, and increased configurability, does MTEB performance move the needle? Considering the LLM is still doing the final pass, and the LLM's latency scales with output length, I find the UX significantly improved by just doing reranking in-process. I think there's been a bit of whiplash, where people went from gatekeeping "hard ML" to "I can shove this all at a REST API", but there's a golden path lying in between for use cases where UX matters. I even fall back to old-school NLP (like ML-less, glorified wordlist POS taggers) for LLM tasks and end up with significantly improved performance for almost zero additional effort.
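
For illustration only, here is a minimal sketch of what in-process reranking can look like with no model and no network call. It is not necessarily the setup described above: it swaps in plain BM25 lexical scoring over the retrieved candidates, and the function names (`tokenize`, `rerank`) and the `k1`/`b` defaults are assumptions chosen for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rerank(query, candidates, k1=1.5, b=0.75):
    """Return candidates sorted by BM25 score against the query, best first.

    Hypothetical helper: statistics (document frequency, average length)
    are computed over the candidate list itself, not a full corpus.
    """
    docs = [tokenize(c) for c in candidates]
    if not docs:
        return []
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n or 1.0
    # Document frequency: in how many candidates each term appears.
    df = Counter(term for d in docs for term in set(d))
    query_terms = set(tokenize(query))

    scored = []
    for text, doc in zip(candidates, docs):
        tf = Counter(doc)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scored.append((score, text))

    # Stable sort keeps the original retrieval order among ties.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored]
```

Everything runs in the same process as the caller, so the cost is a few dictionary lookups per candidate and the latency does not depend on a remote service. A small embedding or cross-encoder model loaded locally could take the place of the scoring loop if lexical overlap is too crude for the use case.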