Hacker News
by onedognight 817 days ago
No, you can cache some of the work you did when processing the previous tokens. This is one of the key optimization ideas designed into the architecture.
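
To make that concrete, here is a rough sketch of the kind of caching I mean. It assumes the comment is about key/value caching in causal self-attention, which the comment itself does not spell out. The names (`CachedAttention`, `project`, `attend`, `full_recompute`) and the single-head, pure-Python setup are made up for illustration and are not taken from any real library.

```python
import math


def project(matrix, vec):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]


def attend(query, keys, values):
    """Softmax attention of one query over every key/value pair given."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]


class CachedAttention:
    """Hypothetical single-head attention layer that keeps a key/value cache."""

    def __init__(self, w_q, w_k, w_v):
        self.w_q, self.w_k, self.w_v = w_q, w_k, w_v
        self.keys = []    # one cached key per token seen so far
        self.values = []  # one cached value per token seen so far

    def step(self, token_vec):
        # Only the new token is projected; earlier keys/values are reused.
        self.keys.append(project(self.w_k, token_vec))
        self.values.append(project(self.w_v, token_vec))
        return attend(project(self.w_q, token_vec), self.keys, self.values)


def full_recompute(w_q, w_k, w_v, tokens):
    """Uncached baseline: re-project every token to get the last output."""
    keys = [project(w_k, t) for t in tokens]
    values = [project(w_v, t) for t in tokens]
    return attend(project(w_q, tokens[-1]), keys, values)
```

Calling `step` once per token gives the same output as `full_recompute` on the whole prefix, but each step does only one new key projection and one new value projection instead of redoing them for every earlier token. The attention over the cached entries still grows with the sequence length, so this is "some of the work", not all of it.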