Hacker News
by
wolttam
23 days ago
It depends on the use case. Yes, 90% of the cost is cache in agentic coding scenarios (actually 95% in my experience). But not when the model reasons for 200k+ tokens before answering a complex problem.
himata4113
23 days ago
Gemini models solve a problem with 80% fewer tokens, so that's something to think about.
johaugum
23 days ago
Source?
himata4113
23 days ago
https://help.kagi.com/kagi/ai/llm-benchmark.html