Magistral Small seems wayyy too heavy-handed with its RL to me:
\boxed{Hey! How can I help you today?}
They clearly rewarded the \boxed{...} formatting during their RL training, since it makes it easier to naively extract answers to math problems and thus verify them. But Magistral uses it for pretty much everything, even when it's inappropriate (in my own testing as well).
It also forgets to <think> unless you use their special system prompt reminding it to.
Honestly a little disappointing. It obviously benchmarks well, but it seems a little overcooked on non-benchmark usage.
"Thinking" is a term of art referring to the hidden/internal output of "reasoning" models where they output "chain of thought" before giving an answer[1]. This technique and name stem from the early observation that LLMs do better when explicitly told to "think step by step"[2]. Hope that helps clarify things for you for future constructive discussion.
The point that was trying to be made, which I agree with, is that anthropomorphizing a statistical model isn’t actually helpful. It only serves to confuse laypersons into assuming these models are capable of a lot more than they really are.
That’s perfect if you’re a salesperson trying to dump your bad AI startup onto the public with an IPO, but unhelpful for pretty much any other reason, especially true understanding of what’s going on.
If that was their point, it would have been more constructive to actually make it.
To your point, it's only anthropomorphization if you make the anthrocentric assumption that "thinking" refers to something that only humans can do.[1]
And I don't think it confuses laypeople, when literally telling it to "think" achieves the very similar results as in humans - it produces output that someone provided it out-of-context would easily identify as "thinking out loud", and improves the accuracy of results like how... thinking does.
The best mental model of RLHF'd LLMs that I've seen is that they are statistical models "simulating"[1] how a human-like character would respond to a given natural-language input. To calculate the statistically "most likely" answer that an intelligent creature would give to a non-trivial question, with any sort of accuracy, you need emergent effects which look an awful like like a (low fidelity) simulation of intelligence. This includes simulating "thought". (And the distinction between "simulating thinking" and "thinking" is a distinction without a difference given enough accuracy)
I'm curious as to what "capabilities" you think the layperson is misled about, because if anything they tend to exceed layperson understanding IME. And I'm curious what mental model you have of LLMs that provides more "true understanding" of how a statistical model can generate answers that appear nowhere in its training.
[1] It also begs the question of whether there exists a clear and narrow definition of what "thinking" is that everyone can agree on. I suspect if you ask five philosophers you'll get six different answers, as the saying goes.
> It also begs the question of whether there exists a clear and narrow definition of what "thinking" is that everyone can agree on. I suspect if you ask five philosophers you'll get six different answers, as the saying goes.
And yet we added a hand wavy 7th to humanize a peice of technology.
I know this is the terminology, but I'd argue that the activations are the actual thinking. It's probably too late to change that, but I wish people would refer to thinking as the work Anthropic and Deepmind are doing with their mech interp
It's a misleading "term of art" which is more accurately described as a "term of marketing". Reasoning is precisely what LLMs don't do and it's precisely why they are unsuited to many tasks they are peddled for.
How are you defining "reasoning" such that you are confident that LLMs are definitely not doing it? What evidence do you have to that effect? (And are you certain that none of your reasoning applies to humans as well?)
These kind of comments are the equivalent of going to dog owners' forums, analyzing word choices in every post and warning the dog owners about the dangers of anthropomorphizing their pets, an effort as accurate as it is boorish and ineffectual.
Because my understanding is that how "thinking" works is actually still a total mystery. How is it we no for certain that the basis for the analog electric-potential-based computing done by neurons is not based on statistical prediction?
Do we have actual evidence of that, or are you just doing "statistical token prediction" yourself?