by turnsout 504 days ago
As someone who follows AI basically minute-to-minute, I'm a little confused about why everyone is freaking out so much about DeepSeek & R1. Normal (non-tech) people are asking me about it today. The stock market is freaking out.

Why is this news—which is mostly technical and incremental—causing such panic?

2 comments

They seem to have matched OpenAI's model capability with a fraction of the resources. R1 is roughly as good as o1 and can be run locally. There are some interesting contributions in the paper too. So I see it as two parts: (1) the money: a lot has been invested, and people are wondering whether it will ever be made back now; (2) China showing impressive results while being handicapped by the West.

https://arxiv.org/abs/2501.12948

Keep an eye on the effort to reproduce here: https://github.com/huggingface/open-r1

We will see in time if the (over?)reaction matches reality. The media sure loves to whipsaw us all around.

That's all cool, but this has been the cat and mouse game with "open" models for years now, right?
Open models have only really become useful in the last year or so, and they still required significant investment to build. They have also lagged behind closed models in capability. Though R1 still does not match the best models, it shows that the gap has closed significantly and that the cost to produce a model should be much lower.

Another thing to consider is that society and the market are not currently rational: lots of wild swings, overreactions, and messaging and beliefs at the extremes.

MIT license
Some people complain when the training code is not also open sourced, hence the scare quotes.
The big news with DeepSeek-R1 is that it only takes ~800k samples of 'good' RL reasoning to convert other models into RL-reasoners.

They successfully distilled the reasoning capabilities from larger models into much smaller ones. For example, their 14B model outperforms other 32B models.
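For anyone wondering what "good" samples might mean mechanically, here is a rough sketch, not DeepSeek's actual pipeline. It assumes the teacher model's reasoning traces are filtered by checking the final answer against a known reference, then formatted as prompt/completion pairs for ordinary supervised fine-tuning of the smaller model. The `Trace` class, `build_sft_examples` function, the exact-match check, and the toy data are all hypothetical and only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Trace:
    """One teacher-generated sample (hypothetical structure)."""
    prompt: str
    reasoning: str
    answer: str
    reference: str  # known-correct answer, used only for filtering


def build_sft_examples(traces):
    """Keep traces whose final answer matches the reference and format
    each one as a prompt/completion pair for supervised fine-tuning."""
    examples = []
    for t in traces:
        if t.answer.strip() != t.reference.strip():
            continue  # drop samples where the reasoning led to a wrong answer
        # Assumed output layout: reasoning wrapped in think tags, then the answer.
        completion = f"<think>\n{t.reasoning}\n</think>\n{t.answer}"
        examples.append({"prompt": t.prompt, "completion": completion})
    return examples


# Toy data: the second trace reaches a wrong answer and is filtered out.
traces = [
    Trace("What is 12 * 12?", "12 * 12 = 12 * 10 + 12 * 2 = 120 + 24.", "144", "144"),
    Trace("What is 7 + 8?", "7 + 8 is 16.", "16", "15"),
]
sft_data = build_sft_examples(traces)
print(len(sft_data))
```

The smaller model would then be fine-tuned on `sft_data` with a standard next-token loss. Exact-match filtering is the simplest possible check; a real setup would likely need something more forgiving for free-form answers.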