| HN Mirror


	by Art9681 319 days ago
	None of the major players have ever been quiet. DeepSeek enjoyed about a week or two's worth of press before its spotlight was stolen by the next great model. It never held the top spot, ever, mind you. So I don't understand why you think major players had to say anything about it when the model was neither first, second nor third in real-world capability, or why they would have to say anything about it when the DeepSeek service processes maybe 1/8 of what OpenAI, Google or Claude process in any given span of time. I applaud their open efforts. But being "altruistic" and being best are two different things.

4 comments

sothatsit 319 days ago

DeepSeek's contributions to training efficiency were as important as, if not more important than, the models themselves. A lot of the worry people had about DeepSeek came from questions about the moat of the big AI players, since DeepSeek was able to train a competitive model with so much less compute.

Their innovations in training efficiency were almost guaranteed to have been heavily considered by the big AI labs. For example, Dario Amodei talks about the efficiency improvements being the real important contribution of DeepSeek V3 here: https://www.darioamodei.com/post/on-deepseek-and-export-cont...

> DeepSeek's team did this via some genuine and impressive innovations, mostly focused on engineering efficiency. There were particularly innovative improvements in the management of an aspect called the "Key-Value cache", and in enabling a method called "mixture of experts" to be pushed further than it had before.

laughingcurve 319 days ago

Almost all of High Flyer's achievements have more to do with scaling the process, but when scaling is all you need, it's darn effective.

laughingcurve 319 days ago

It crashed the market because retail investors, and perhaps non-retail as well, had a great deal of overconfidence in the ability of the USA to maintain a lead thanks to the chip gap. High Flyer's innovations allowed them to scale and show that this is not the case. This major event then likely spurred on many others. It was a mini "Sputnik moment".

laughingcurve 319 days ago

Genuinely, many times it seems most people need to find reasons to assume the best about DeepSeek and China in order to confirm their prior bias that “America bad” and “Capital is evil”. The reality is grey and fuzzy, with neither side landing on truth yet.

cma 319 days ago

How would people use DeepSeek to think "Capital is evil"? It was from a private hedge fund named "High Flyer," not a state university project or something.

laughingcurve 319 days ago

Yes, exactly. How the heck? It makes no sense to me either, but you can certainly find plenty of laymen/not-in-the-know folks making those kinds of comments, often in non-technical spaces. Often in the worst parts of the internet, where discourse is non-existent. Human psychology allows us to hold many contradictory positions all at once. Ideologies are the lenses through which we view the world, and they distort our perception.

cma 318 days ago

Usually what I see is not that, but the claim that DeepSeek stole from American capital by training on the o1 release to achieve chain of thought. There is a contradiction there, though, because o1 at the time didn't show its real chain of thought to train on.

benreesman 319 days ago

MLA (Multi-head Latent Attention) is just one example of a best-in-class technique from Hangzhou that's seen wide adoption in US prestige labs.

And the saltiness of US labs about DeepSeek is well-known. "O3, explain model distillation like I'm five."

No, Sam, explain intellectual property rights to the judge in the NYT test case, asshole.

laughingcurve 319 days ago

… wait did you just seriously tell SamA that he’s an asshole because of copyright issues… while praising Chinese labs who couldn’t give a rat fuck and won’t follow the same laws? Or pay creators? Physician, heal thyself

benreesman 319 days ago

Sam's an asshole for a lot of reasons, a ridiculous commons grab of intellectual property draped in threadbare rhetoric about human welfare (get those developing nation eyeballs SCANNED people!) being just one of them.

Watching the Chinese labs kick the shit out of better funded US enclaves of TESCREAL psychopathy in the public fucking domain is gravy.

I don't care that their internal calculus or that of the PRC is to Cloud Strife Limit Break a bunch of "shareholder value" in the form of a bloated NVIDIA cap feeding frenzy by bloated "public benefit corporations" with a bunch of creepy ties to Thiel et al: they're publishing papers, code and weights. So their hoovering up of the commons has something of value going back into the commons.

So yeah, fuck Sam, and it's going to be fun watching OpenAI and Anthropic pivot ever more towards trying to outlaw competition than they already have. Amodei already sounds like Donald Rumsfeld on Taiwan hawkishness; this is not the positioning of someone who loves their product roadmap.

It turns out that a zillion ScaleAI and SurgeAI turks don't have economics any better than paying NVIDIA to run 85% net earnings for CapEx that's obsolete by the time it's racked and powered.

laughingcurve 319 days ago

... You did not speak to the key point at all and went off on some massive, rambling, incoherent political commentary. I feel this comment is unworthy of the thread.

Native Sparse Attention matters. Your commentary is beneath this paper.
