by codingwagie 482 days ago
I think the difference between DeepSeek and OpenAI/Anthropic is the difference between practitioners and academics. Of course there is world-class talent at OpenAI. But there are also a lot of "I went to Harvard and want to work in AI" types, and those people simply don't have the technical exposure to even think of building something like this.
5 comments

I would say most if not every large company in China has its own AI infra stack, partly because tech talent is relatively more abundant and partly because some of the tech leads have been exposed to Western tech via open source and work experience, so they have a good success rate (which makes it a more common practice). Anecdotally, ex-employees of Google's and FB's overseas offices, and of MSFT's and Intel's China offices, could be the key elements of this trend over the past two decades (Google left China around 2010).

Infra work is usually technically tedious, so I think it may become a lost art in the West, just like those manufacturing jobs.

As opposed to the US, where every large company has its own AI infra stack, often extending down to the silicon and up to large open source projects?

What's going on here, why are people forgetting what's around them? Does familiarity breed contempt? Are attention spans so shot that failure to participate in this week's news cycle is enough for "out of sight, out of mind"? Or is HN full of Chinese bots now?

That was to answer the previous question. Also, the point is why Chinese companies can produce infra work cheaply and quickly. With regard to US companies, I don’t see that being possible at MSFT, AMZN, AAPL, and likely GOOG as well. (Don’t get me wrong, they all have solid infra, except probably Apple.)
I think it would be a bit irrational to claim that so broadly. I've met some incredibly talented people in academia and I've also met people that made me question how they even pass.

My hypothesis is that there is not such a big difference at all. All three of the companies you mentioned are world-class competitors in this. DeepSeek was the last to have a "hit", but that isn't an indication that they'll be the one of the three (or of other yet unknown entities) to have the next hit. We try to predict what happens next, but perhaps we should rather focus on who or what we want to succeed. For me it's quite clear: it should be open source, or long term I'm not that interested.

Someone should write a blog post about the prestige/effectiveness negative feedback loop. This is also the Achilles heel of top tier SV VCs including YC.
The problem isn’t the prestige, it’s that prestigious institutions in America don’t produce high-quality talent. They’re instead mostly corrupt credentialing mills for the rich and well-connected. From what I understand, DeepSeek also only hires from the best universities in China, but “best” actually means something there, given how difficult those institutions are to get into and how demanding their coursework is.
I read this too, but there was no source for it. The founder, Liang Wenfeng, himself comes from Zhejiang University. Its admissions rate is 20%, which is much higher than that of traditional US "elite" schools. Wenfeng has said this about hiring, though:

"If you are pursuing short-term goals, it is right to find people with ready experience. But if you look at the long-term, experience is not that important. Basic skills, creativity, and passion are much more important."

Chinese college applications work way differently from American ones. The admission rate is meaningless. Zhejiang University is a state-designated 985 university and one of the nine in the C9 League. Believe me, any student at an elite high school in China would be very happy to be accepted into Zhejiang University. Most of them unfortunately don't have the score to even think of applying. Technically it is not even applying. Students take the once-a-year exam, and if they don't score in the top 500 in their province, they shouldn't even think about trying for Zhejiang University.
Can you expand on this?
Makes me wonder where the best place is to learn how to put together and operate something like this, then. Surely there are resources out there somewhere to teach yourself?
Weren't the flash attention authors not just from academia but in academia at the time?