Hacker News
by ForceBru 3 days ago
Wait, this isn't real, is it? Is there actually an intermediate model that translates DeepSeek's thinking from its "alien language" into human languages? That's not actually the case, right?

I thought "thinking" is literally the model generating additional text in a human language that shows its "thought process". It's added to the model's context, which helps it reason better because it now has this self-generated context.

The "their own language" idea seems to come from some recent science fiction where LLMs develop their alien language and take over the world by 2037 or something.

5 comments

Yeah, it's actually the case. Researchers have shown that the model's response doesn't always follow from the reasoning. Whether you consider that an internal language or not really depends on what you're speculating the neural network is doing. I think there was an Anthropic paper on it.
You're right, it's just additional text that lets the model exhibit thinking- or reasoning-like behavior. The big proprietary models hide the real output from the user and instead provide a friendly abridged version, but that's just to protect their secret sauce from distillation.
The parent is off; you’re right. They may reason in any language, typically whatever the user’s language is, and you’ll see the reasoning directly with an open model like DeepSeek.

Research only showed that thinking might be disconnected from the final output, but in my experience they are very strongly correlated in recent models.

> Research only showed that thinking might be disconnected from the final output

It is trivial to regularly spot obvious contradictions and inconsistencies if you read carefully. For example, I've encountered traces that amounted to "I can deduce X, therefore Y, so that means Z" but then the model turns around and outputs "the answer is W because X". It's even been demonstrated that having the model output placeholder tokens or other gibberish instead of "thoughts" still improves performance. However, the thinking traces can still be useful to the end user regardless.

I see those too and I think of it as the "thinking" in action. If you could replace their actual thinking trace with gibberish and get improved performance that scaled with the amount of gibberish you injected, that's what we'd do. But instead, we see that the quality of the model's output scales with the amount of 'thinking' tokens they generate before responding.

It has been my experience that yes, models make contradictions throughout their thinking process, but the conclusions they arrive at during/near the end of thinking more often than not align with the final output.

I may have misremembered, but I thought I had read somewhere that recent models by OpenAI and Anthropic tend to produce reasoning that is not always understandable to humans. But you're right that it's not the case for DeepSeek, so maybe I'm hallucinating ;)

Or maybe it was an article or a tweet about researchers trying really hard to steer the model to think in English, since otherwise interpretability / safety becomes a lot harder?

Current models simply generate additional text that gets added to the context for the trace. However, iterative models that "think" by repeatedly looping through several layers instead of outputting text have recently been demonstrated.
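
To make the "additional text that gets added to the context" point concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: `make_toy_model` is a hypothetical stub that replays a fixed script rather than a real LLM, and the `<think>` / `</think>` / `<eos>` markers are just placeholder delimiters, not any particular vendor's format.

```python
from typing import Callable, List, Tuple


def make_toy_model(script: List[str]) -> Callable[[List[str]], str]:
    """Return a hypothetical stub 'model' that replays a fixed script, one token per call."""
    tokens = iter(script)

    def next_token(context: List[str]) -> str:
        # A real model would condition on `context`; this stub ignores it.
        return next(tokens)

    return next_token


def generate(
    prompt: List[str],
    next_token: Callable[[List[str]], str],
    max_tokens: int = 50,
) -> Tuple[List[str], List[str], List[str]]:
    """Autoregressive loop: every generated token is appended to the same context."""
    context = list(prompt)
    trace: List[str] = []   # tokens emitted between <think> and </think>
    answer: List[str] = []  # tokens emitted outside the trace
    in_trace = False
    for _ in range(max_tokens):
        tok = next_token(context)
        context.append(tok)  # trace and answer tokens are fed back alike
        if tok == "<eos>":
            break
        if tok == "<think>":
            in_trace = True
        elif tok == "</think>":
            in_trace = False
        elif in_trace:
            trace.append(tok)
        else:
            answer.append(tok)
    return context, trace, answer


# Illustrative script: a short "thinking" trace followed by the visible answer.
script = ["<think>", "2+2", "=", "4", "</think>", "The", "answer", "is", "4", "<eos>"]
prompt = ["What", "is", "2+2?"]
context, trace, answer = generate(prompt, make_toy_model(script))
print("trace: ", " ".join(trace))
print("answer:", " ".join(answer))
```

The only thing separating the trace from the answer here is the pair of delimiters. Both end up in the same `context` list that the next step would read, which is the sense in which the trace is "just text". A UI that hides or abridges the trace would simply not display the `trace` part.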