by skeptrune 630 days ago
You could also just save the first atomic chunk that gets output, store it, and reuse it yourself each time. Easier and more consistent.
2 comments

I don't understand how that helps here. They're not regenerating each chunk every time; this is about caching the state after running a large doc through a model. You can only do this kind of thing if you have access to the model itself, or if the API you use provides it.
To be fair, that only works if you keep chunk windows static.
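For anyone following along, here's a rough sketch of the "store the chunk output and reuse it" idea and why it depends on static windows. Everything in it is made up for illustration: `split_fixed`, `ChunkOutputCache`, and `fake_model` are hypothetical names, and `fake_model` just stands in for an expensive model call. Note this caches outputs per chunk, which is not the same thing as caching model state after a long document.

```python
import hashlib
from typing import Callable, Dict, List


def split_fixed(text: str, size: int, offset: int = 0) -> List[str]:
    """Split text into fixed-size windows, starting at `offset`."""
    return [text[i:i + size] for i in range(offset, len(text), size)]


class ChunkOutputCache:
    """Memoize a per-chunk function, keyed by the SHA-256 of the chunk text."""

    def __init__(self, fn: Callable[[str], str]) -> None:
        self.fn = fn
        self.store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get(self, chunk: str) -> str:
        key = hashlib.sha256(chunk.encode("utf-8")).hexdigest()
        if key in self.store:
            self.hits += 1
        else:
            self.misses += 1
            self.store[key] = self.fn(chunk)
        return self.store[key]


def fake_model(chunk: str) -> str:
    # Hypothetical stand-in for an expensive model call.
    return chunk.upper()


doc = "abcdefghijklmnopqrstuvwxyz"
cache = ChunkOutputCache(fake_model)

# First pass: every chunk is new, so every lookup is a miss.
for c in split_fixed(doc, 8):
    cache.get(c)
first_pass = (cache.hits, cache.misses)

# Second pass, same windows: every lookup is a hit.
for c in split_fixed(doc, 8):
    cache.get(c)
second_pass = (cache.hits, cache.misses)

# Third pass, windows shifted by 3 characters: the chunk text changes,
# so the keys change and nothing is reused.
for c in split_fixed(doc, 8, offset=3):
    cache.get(c)
third_pass = (cache.hits, cache.misses)
```

With the same windows the second pass is served entirely from the cache. Once the windows shift, every chunk hashes to a new key and the cache stops helping, which is the "static chunk windows" caveat.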