	by blobbers 34 days ago
	Isn't the thinking part the part that burns the tokens? You're just outputting tokens.


Totally depends, but I think this is mostly just an illustration of overall speed, regardless of the content.

Okay, but I think the realistic thing is:

* burns 18000 tokens thinking of the solution
* outputs 1000 tokens of code

So you can easily follow the 1000 tokens of code, while the 18000 tokens of thinking are just you sitting around waiting for your GPU to run the LLM.
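
To put rough numbers on that split, here's a back-of-the-envelope sketch. The 50 tokens/s decode speed and the function name `generation_times` are assumptions for illustration, not figures from this thread, and it assumes thinking and output tokens are generated at the same rate.

```python
def generation_times(thinking_tokens, output_tokens, tokens_per_second):
    """Return (thinking_seconds, output_seconds, thinking_fraction).

    Assumes one constant decode speed for both thinking and output tokens.
    """
    thinking_s = thinking_tokens / tokens_per_second
    output_s = output_tokens / tokens_per_second
    fraction = thinking_tokens / (thinking_tokens + output_tokens)
    return thinking_s, output_s, fraction


# 18000 and 1000 come from the comment above; 50 tokens/s is a
# hypothetical decode speed picked only to make the arithmetic concrete.
thinking_s, output_s, fraction = generation_times(18000, 1000, 50.0)
print(f"thinking: {thinking_s:.0f} s, output: {output_s:.0f} s, "
      f"share spent thinking: {fraction:.1%}")
# prints: thinking: 360 s, output: 20 s, share spent thinking: 94.7%
```

Whatever speed you plug in, the ratio stays the same: with an 18:1 token split, about 95% of the wall-clock time goes to tokens you never read.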