by bardiapour 775 days ago
I think not: better results >>> better latency + cost.
1 comment

Maybe a combined approach beats either? Let some non-LLM reranker quickly spit out two results, and fill in the rest with the LLM.
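
Rough sketch of what I mean, in Python. Everything here is hypothetical: `hybrid_rerank`, `keyword_overlap_rerank`, and `fake_llm_rerank` are made-up names, and the "LLM" is a placeholder function so the snippet runs on its own. The only point is the shape: yield the cheap reranker's top two right away, then let the slow reranker order whatever is left.

```python
from typing import Callable, Iterator, Sequence

# A reranker takes (query, docs) and returns docs ordered best-first.
Reranker = Callable[[str, Sequence[str]], list[str]]


def hybrid_rerank(
    query: str,
    candidates: Sequence[str],
    fast_rerank: Reranker,
    llm_rerank: Reranker,
    fast_k: int = 2,
) -> Iterator[str]:
    """Yield fast_k results from the cheap reranker, then the LLM-ordered rest."""
    # Stage 1: cheap, low-latency reranker fills the first slots immediately.
    head = fast_rerank(query, candidates)[:fast_k]
    yield from head

    # Stage 2: the slower reranker only sees what has not been shown yet.
    shown = set(head)
    remaining = [doc for doc in candidates if doc not in shown]
    yield from llm_rerank(query, remaining)


def keyword_overlap_rerank(query: str, docs: Sequence[str]) -> list[str]:
    """Toy non-LLM reranker: order by number of words shared with the query."""
    terms = set(query.lower().split())
    return sorted(
        docs,
        key=lambda doc: len(terms & set(doc.lower().split())),
        reverse=True,
    )


def fake_llm_rerank(query: str, docs: Sequence[str]) -> list[str]:
    """Placeholder for a real LLM call (assumption): just sorts alphabetically."""
    return sorted(docs)


docs = [
    "llm reranker cost",
    "fast reranker latency",
    "cooking pasta",
    "banana bread",
]
results = list(
    hybrid_rerank("fast reranker", docs, keyword_overlap_rerank, fake_llm_rerank)
)
```

Because it's a generator, a caller can render the first two results before the LLM stage has even started. Whether that actually beats LLM-only on quality is exactly the open question; this only shows the latency side of the trade.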