Hacker News
by 9rx 260 days ago
It has become fast enough that another call isn't going to overwhelm your pipeline. If you needed this kind of functionality for high-performance computing, perhaps it wouldn't be feasible, but here it is being used to feed back into an LLM. The user will never notice.