by bustadjustme
377 days ago
Sorry if I missed it, but how is a single output token from an LLM comparable to a single search result from a search engine? The author compares 1k tokens (an estimate for an average LLM response to a single query) to 1k web search queries. How is this not a factor-of-1000 error?

> To compare a midrange pair on quality, the Bing Search vs. a Gemini 2.5 Flash comparison shows the LLM being 1/25th the price.

By that reading, the LLM would instead be 40x the price _per query_ on average (the query being the unit of user interaction). LLMs with web search will only multiply this value, as several queries are made behind the scenes for each user query.

EDIT: Thanks, zahlman. The author does quote LLM prices per 1M tokens, i.e. per 1k user queries, so the above concern is mistaken!
|
The author compares 1k uses of the LLM - resulting in an estimated 1M output tokens, and the prices are quoted per 1M tokens - to 1k uses of the search engine (the prices for which are directly quoted per 1k uses).
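
To make the unit conversion concrete, here is a minimal sketch of the arithmetic. The two prices are hypothetical placeholders chosen only to reproduce the 1/25 ratio quoted above; they are not the actual Bing or Gemini prices. The 1k tokens per response is the article's estimate, and the function name is made up for illustration.

```python
# Article's estimate: an average LLM response is about 1k output tokens.
TOKENS_PER_RESPONSE = 1_000
QUERIES = 1_000

def llm_cost_for_queries(price_per_token_unit: float, tokens_per_unit: int) -> float:
    """Cost of QUERIES responses when the price is quoted per `tokens_per_unit` tokens."""
    total_tokens = QUERIES * TOKENS_PER_RESPONSE  # 1k queries x 1k tokens = 1M tokens
    return price_per_token_unit * total_tokens / tokens_per_unit

# Hypothetical placeholder prices (NOT real quotes), picked to give a 1/25 ratio.
llm_price = 1.0                      # dollars per pricing unit of tokens
search_price_per_1k_queries = 25.0   # dollars per 1k searches

# Correct reading: the LLM price is per 1M tokens, which is exactly 1k queries.
correct_cost = llm_cost_for_queries(llm_price, tokens_per_unit=1_000_000)
correct_ratio = correct_cost / search_price_per_1k_queries

# Mistaken reading: treating the same price as if it were per 1k tokens.
mistaken_cost = llm_cost_for_queries(llm_price, tokens_per_unit=1_000)
mistaken_ratio = mistaken_cost / search_price_per_1k_queries

print(f"correct:  LLM costs {correct_ratio:.2f}x search per 1k queries")
print(f"mistaken: LLM costs {mistaken_ratio:.0f}x search per 1k queries")
```

Under these assumptions the correct reading gives the LLM at 1/25th of the search price, while the mistaken reading inflates it by a factor of 1000 to 40x, which is where the figure in the parent comment came from.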