Hacker News
esperent
111 days ago
Is it because of caching? If the context changes arbitrarily every turn then you would have to throw away the cache.
FuckButtons
110 days ago
So use a block based cache and tune the block size to maximize the hit rate? This isn’t rocket science.
wonnage
110 days ago
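This seems misguided. You have to cache a prefix because of attention: each token's cached state depends on every token before it, so a change early in the context invalidates everything after it.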
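To make the prefix point concrete, here is a rough sketch of a block-based cache key scheme in which each block's key is chained to the hash of everything before it. The function names (`block_keys`, `reusable_blocks`) and the SHA-256 chaining are illustrative assumptions, not any particular inference engine's implementation.

```python
import hashlib


def block_keys(tokens, block_size):
    """Return one key per full block; each key covers the whole prefix up to that block.

    Hypothetical helper: the key of block i is a hash of (key of block i-1, tokens in block i),
    so it changes whenever any earlier token changes.
    """
    keys = []
    prev = b""
    for start in range(0, len(tokens) - block_size + 1, block_size):
        block = tokens[start:start + block_size]
        payload = prev + b"|" + ",".join(map(str, block)).encode()
        prev = hashlib.sha256(payload).digest()
        keys.append(prev)
    return keys


def reusable_blocks(cached_tokens, new_tokens, block_size):
    """Count the leading blocks of new_tokens whose chained keys match the cached ones."""
    cached = block_keys(cached_tokens, block_size)
    new = block_keys(new_tokens, block_size)
    hits = 0
    for old_key, new_key in zip(cached, new):
        if old_key != new_key:
            break  # first mismatch ends the shared prefix
        hits += 1
    return hits


old = list(range(16))                      # 4 blocks of 4 tokens
appended = old + [16, 17, 18, 19]          # context only grows at the end
edited_early = [99] + old[1:]              # first token rewritten
edited_late = old[:12] + [99] + old[13:]   # a token in the last block rewritten

print(reusable_blocks(old, appended, 4))
print(reusable_blocks(old, edited_early, 4))
print(reusable_blocks(old, edited_late, 4))
```

Under these assumptions, appending to the context keeps every existing block reusable, while rewriting the first token leaves nothing to reuse no matter how the block size is tuned. Block size only controls how much is lost around the first point of divergence.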