| HN Mirror



	by sadhorse 862 days ago
	Does every token require a full model computation?

1 comment

onedognight 862 days ago

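No, you can cache some of the work you did when processing the previous tokens (in a transformer, typically the attention keys and values), so each new token only needs the computation for itself. This is one of the key optimization ideas designed into the architecture.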
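
To make the caching idea concrete, here is a minimal sketch, assuming the architecture in question is a transformer and the cached work is the per-token attention keys and values (usually called a KV cache). It is a toy single-head attention layer in plain Python; `CachedAttention`, `full_recompute`, and the random weights are hypothetical names and data for illustration, not anything from a real library.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def attend(q, keys, values):
    """Softmax attention of one query over the given keys and values."""
    scale = math.sqrt(len(q))
    scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    width = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) / total
            for j in range(width)]

class CachedAttention:
    """Toy attention layer that keeps keys/values of earlier tokens (assumed design)."""

    def __init__(self, Wq, Wk, Wv):
        self.Wq, self.Wk, self.Wv = Wq, Wk, Wv
        self.keys, self.values = [], []

    def step(self, x):
        # Only the new token is projected; older keys/values come from the cache.
        self.keys.append(matvec(self.Wk, x))
        self.values.append(matvec(self.Wv, x))
        return attend(matvec(self.Wq, x), self.keys, self.values)

def full_recompute(Wq, Wk, Wv, xs):
    """Output for the last token, recomputing keys and values for every token."""
    keys = [matvec(Wk, x) for x in xs]
    values = [matvec(Wv, x) for x in xs]
    return attend(matvec(Wq, xs[-1]), keys, values)

random.seed(0)
d = 4

def rand_matrix():
    return [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]

Wq, Wk, Wv = rand_matrix(), rand_matrix(), rand_matrix()
tokens = [[random.gauss(0, 1) for _ in range(d)] for _ in range(5)]

layer = CachedAttention(Wq, Wk, Wv)
for t, x in enumerate(tokens):
    cached = layer.step(x)
    full = full_recompute(Wq, Wk, Wv, tokens[: t + 1])
    # Both paths should agree; the cached one does far fewer projections.
    assert all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(cached, full))
```

In this sketch the cached path projects one token per step, while the recompute path projects every token seen so far. The new token's query still has to attend over all cached keys, so the cost per token grows with context length, but the projections for earlier tokens are not repeated.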
