by cratermoon 456 days ago
Deepseek showed us that the easiest way to catch up with OpenAI is.. <checks notes> hiring programmers who know what they are doing.
2 comments

Right, because Google, Apple, Facebook, Amazon, and X all don't have anyone who knows what they are doing - and also none of them had the bright idea to hire anyone who knows what they are doing either...

I'm going to go out on a limb and say it's not that reductive.

I think he forgot to add: and let them do their jobs and manage the resources smartly by themselves.

Pichai & co put a borderline-communist quota system in place, and getting a tiny extra slice of compute became a popularity contest. And then OpenAI ate their lunch (using Google tech).

Do you have a source for this “quota system”?
Are you stalking me, doyouhaveasourcebro?

Previously https://news.ycombinator.com/item?id=40273440

Self-quoting here:

That's in contrast to what OpenAI's David Luan said in "Why Google couldn’t make GPT-3" (https://www.latent.space/p/adept):

  And it turned out the whole time that they just couldn't get critical mass.
  So during my year where I led the Google LM effort and I was one of the
  brain leads, you know, it became really clear why. At the time, there was a
  thing called the Brain Credit Marketplace. Everyone's assigned a credit. So
  if you have a credit, you get to buy n chips according to supply and
  demand. So if you want to go do a giant job, you had to convince like 19 or
  20 of your colleagues not to do work. And if that's how it works, it's
  really hard to get that bottom up critical mass to go scale these things.
  And the team at Google were fighting valiantly, but we were able to beat
  them simply because we took big swings and we focused.

I still think that DeepSeek was dramatically overblown. It was 6-9 months behind the best frontier models in performance, at a dramatically lower cost... but dramatically lower cost at 6-9 months behind is basically what has been happening for a couple of years.

I'm mostly convinced that the only reason it blew up was that it was the first Chinese model that was even in the same ballpark as the American frontier models. That drove a lot of reporting, which caused a lot of normies who hadn't tried any AI model since ChatGPT first blew up to try it, and they were (understandably) blown away by the progress relative to what they remembered from 2 years previously.

The timing seems to indicate that the mainstream press publicity over DeepSeek R1 was due to an Nvidia short recommendation that had just gone viral, not to the disclosed cost (which related to V3, not R1).

However, there also seems to have been some genuine panic at OpenAI, maybe elsewhere too, over DeepSeek R1 since not only did they come close to matching the performance of o1, but they also described exactly how they did it (apparently very similar to what OpenAI had done, judging from the reaction), and therefore killed any competitive lead that OpenAI - who had been working on it for a long time - may have thought they had.