| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by throwup238 507 days ago
	> max_tokens：The maximum length of the final response after the CoT output is completed, defaulting to 4K, with a maximum of 8K. Note that the CoT output can reach up to 32K tokens, and the parameter to control the CoT length (reasoning_effort) will be available soon. [1] [1] https://api-docs.deepseek.com/guides/reasoning_model

1 comments

gliptic 507 days ago

So yes, it's a limitation of their own API at the moment, not a model limitation.

link

throwup238 507 days ago

I’m using it through Kagi which doesn’t use Deepseek’s official API [1]. That limitation from the docs seems to be everywhere.

In practice I don’t think anyone can economically host the whole model plus the kv cache for the entire context size of 128k (and I’m skeptical of Deepseek’s claims now anyway).

Edit: a Kagi team member just said on Discord that they’ll be increasing max tokens next release

[1] https://help.kagi.com/kagi/ai/llms-privacy.html

link