Hacker News
by
ludwik
316 days ago
There is "performance" as in "speed and cost", and "performance" as in "the model returning quality responses, without getting lost in the weeds". Caching only helps with the former.
2 comments
otabdeveloper4
316 days ago
If the context window is small enough, then only the tail of the prompt matters anyway.
HardCodedBias
316 days ago
"the model returning quality responses, without getting lost in the weeds"
I should edit, but that would be disingenuous. This is exactly what I meant.
Thank you!