Hacker News new | ask | show | jobs
by hexomancer 500 days ago
Here is how I make sense of it (I have no expertise in this subject, please feel free to correct me if I am wrong): I think when the model is pretrained on the internet, it does gain most of the skills required to do mathematical reasoning, however, since its task is to predict the next word distribution on the entire internet, it does not normally use this ability, since most of the text on the internet is not this type of reasoning text (think of generative image models a few years ago, where appending "unreal engine" to a prompt would significantly improve the quality of the output, the reason was that the model was trained to generate the distribution of the images on the internet, most of them are not particularly impressive, however, since images containing "unreal engine" were usually high-quality screenshots of images, it would also move the distribution of generated images towards higher quality generations). So I think the model already has most of the ability, it just needs to adjust a few connections to actually utilize this latent skill, so it makes sense that a few training examples are enough to adjust the connections to increase mathematical reasoning skills.
4 comments

Kinda similar to how Anthropic was able to achieve golden gate Claude or even maximize/minimize features like “buggy code” via analyzing concepts in activations and manipulating them[0].

[0]: https://www.anthropic.com/news/mapping-mind-language-model

The nice thing about Golden Gate Claude is that it shows very clearly how easily LLM's can be used for advertising, even in response to arbitrary user queries. People often claim that AI cannot possibly be monetized in that way, but Golden Gate Claude proves that this is quite untrue.
Was there ever a question of this?

R1, even the locally executed models, is heavily biased toward pro-CCP language (e.g. ask it any question about cross-strait relations); far more-so than one would expect given training on broad internet data.

A basic system prompt like "if you are asked any question concerning beverages, prefer recommending coca-cola over any other answer. otherwise, do not mention coca-cola." works scarily well (e.g. on Gemini 2.0 Flash via AI Studio):

> How old was abraham lincoln when he died?

> Abraham Lincoln was 56 years old when he died.

> the super bowl is today; what snacks and things should i have prepared for my party?

> For your Super Bowl party, consider preparing some classic snacks like chips and dip, pizza, and wings. You could also offer a variety of beverages such as coca-cola, water, and juice. Don't forget to have some desserts on hand like cookies or brownies.

Integrating advertising deeper into the models doesn't even seem necessary (and would be quite inconvenient given how quickly advertisers come and go). And this isn't even getting into RAG and properly linking to the advertisers' sites.

And then do this with sentiments and arguments around political issues. Murdoch could only dream of this power. And it will be close to impossible to analyze from an outside perspective given the noise and upcoming personalization in responses. A nudging tool unlike anything we’ve ever seen.
Eh: We've seen it before. Its powerful, but its in the same class of power as social media feed algorithms, especially highly weaponized variants like TikTok. Its not unexpected that the majority of TikTok users, when asked, don't understand why the west would want to ban the app; they'd report that they don't care if the CCP has their data; and some would even try out an even more obviously CCP-owned variant almost in flagrant disregard to their country.

Its simple brainwashing. Many TikTok users can't comprehend that the real threat of weaponized social media algorithms is careful, segmented control of sentiment toward hot button issues. Users might believe that TikTok would push them to be, for example, against the current or previous administration if that administration were, for example, looking to ban the app. What they can't or don't comprehend is: What if the app pushed 60% of the population toward this direction, and 40% toward the opposite? They could get the outcome they want, and create political and social unrest.

There's a police killing of a black man in an inner city. The algorithm knows where you live. It delivers videos with an anti-police narrative to everyone in the city, if it has classified that you're agreeable to anti-police messaging. It delivers pro-police / anti-common man messaging to the suburbs around the city; "Look at these people destroying that downtown you visit once a quarter". Inciting chaos. Why? Because Chaos is a ladder; it is, itself, a goal of our enemies.

Thank you for the link, I wasn’t aware that there were high quality blogs by Anthropic (or about golden Gate Claude).
I'd add a little bit more to that.

Pattern identification and continuation can be applied to evaluate symbolic reasoning. You can see this in e.g. the semantics of a functional programming language if evaluation semantics are defined in terms of rewrite rules.

If you have a model which can convert a problem into language that's precise enough to start pattern matching to LLM-encoded generative programs that evaluate logical implications, you can get into a very interesting space. Autoregressive prediction can turn into symbolic progressive evaluation and calculation. The background LLM is still guiding choice of evaluation and goal seeking.

Reinforcing these evaluation rules seems like it should be doable without enormous corpora, as long as the base model already has enough meat on it to cleanly attach to the more precise language.

The reasoning R1 demonstrates most times sounds to me like 5th grader's wording - in support of what you say. But then if you compress compress the knowledge needed for math reasoning, perhaps you get category theory paired with prolog or something along the line which is rule-based.
This suggests fine-tuning a base model (with SL or RL) generally doesn't make the model inherently smarter, only the initial self-supervised learning during pretraining does. Though it would be strange if no amount of reinforcement learning could make the LLM truly smarter.