	by turnsout 504 days ago
	As someone who follows AI basically minute-to-minute, I'm a little confused about why everyone is freaking out so much about DeepSeek & R1. Normal (non-tech) people are asking me about it today. The stock market is freaking out. Why is this news—which is mostly technical and incremental—causing such panic?

2 comments

verdverm 504 days ago

They seem to have matched OpenAI model capability on a fraction of the resources. R1 is roughly as good as o1 and can be run locally. There are some interesting contributions in the paper too. So I see it as two parts: (1) the money: lots has been invested, and people are wondering if they will ever make it back now; (2) China showing impressive results while being handicapped by the West.

https://arxiv.org/abs/2501.12948

Keep an eye on the effort to reproduce here: https://github.com/huggingface/open-r1

We will see if the (over?) reaction matches reality in time. Media sure loves to whipsaw us all around


turnsout 504 days ago

That's all cool, but this has been the cat and mouse game with "open" models for years now, right?


verdverm 504 days ago

Open models have only really become useful in the last year or so and still required significant investment to build. They have still lagged behind closed models in capability. Though it still doesn't match the best models, this shows that the gap has closed significantly and the cost to produce should be much lower.

Another thing to consider is that society & the market are not currently rational: lots of wild swings, overreactions, and messaging & beliefs at the extremes.


stogot 504 days ago

MIT license


turnsout 503 days ago

Some people complain when the training code is not also open sourced, hence the scare quotes.


Jimmc414 504 days ago

The big news with DeepSeek-R1 is that it only takes ~800k samples of 'good' RL reasoning to convert other models into RL-reasoners.

They successfully distilled the reasoning capabilities from larger models into much smaller ones. For example, their 14B model outperforms other 32B models.
