| HN Mirror

	by aprentic 118 days ago
	Do you know if that's true of non-English models? As I said elsewhere, DeepSeek injects Chinese characters into its responses. Anecdotally, that seems to happen when the context gets longer. That suggests its models are trained primarily on Chinese text, and I would expect them to use fewer tokens for Chinese than for English.