Hacker News
by
sadhorse
815 days ago
Does every token require a full model computation?
1 comment
onedognight
815 days ago
No, you can cache some of the work you did when processing the previous tokens. This is one of the key optimization ideas designed into the architecture.
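As a rough illustration of the caching idea, here is a minimal pure-Python sketch. It assumes the architecture in question is a transformer-style attention layer, where the cached work is each earlier token's key and value vectors; the thread itself does not name the architecture. The tiny 2x2 weight matrices and the names `CachedAttention`, `full_recompute`, and `attend` are all hypothetical, chosen only for this example.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(W, x):
    return [dot(row, x) for row in W]

def attend(q, keys, values):
    """Softmax-weighted average of `values`, weighted by how well `q` matches each key."""
    scores = [dot(q, k) / math.sqrt(len(q)) for k in keys]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    dim = len(values[0])
    return [sum(e * v[i] for e, v in zip(exps, values)) / total for i in range(dim)]

# Hypothetical fixed projection weights (assumption: one attention head, width 2).
WQ = [[0.5, -0.1], [0.3, 0.8]]
WK = [[0.2, 0.7], [-0.4, 0.1]]
WV = [[1.0, 0.0], [0.5, -0.5]]

def full_recompute(tokens):
    """No cache: re-project every token seen so far, then attend from the last one."""
    keys = [matvec(WK, t) for t in tokens]
    values = [matvec(WV, t) for t in tokens]
    return attend(matvec(WQ, tokens[-1]), keys, values)

class CachedAttention:
    """Keeps the keys and values of earlier tokens so they are projected only once."""

    def __init__(self):
        self.keys = []
        self.values = []
        self.projections = 0  # number of key/value projections performed

    def step(self, token):
        # Only the new token is projected; earlier keys/values come from the cache.
        self.keys.append(matvec(WK, token))
        self.values.append(matvec(WV, token))
        self.projections += 1
        return attend(matvec(WQ, token), self.keys, self.values)

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -1.0]]

layer = CachedAttention()
cached_outputs = [layer.step(t) for t in tokens]
full_outputs = [full_recompute(tokens[:i + 1]) for i in range(len(tokens))]
```

In this sketch the cached path performs one key/value projection per token, while the uncached path repeats the projections for the whole prefix at every step; both produce the same outputs. Note that the new token itself still goes through the full layer. What is saved is the repeated work on the tokens that came before it.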