Hacker News
by benrutter 469 days ago
I'm not massively knowledgeable about the ins and outs of DeepSeek, but I think I'm in the right place to ask. My understanding is that DeepSeek:

- Achieved LLM performance comparable to OpenAI's for a fraction of the cost, using more off-the-shelf hardware.

- Seem to be open sourcing lots of distributed stuff.

My question is, are those two things related? Did distributed computing enable the AI model somehow? If so, how? Or is it not that simple?

1 comment

These types of models need to be trained across thousands of GPUs, which requires distributed engineering at a much higher level than "normal" distributed systems.
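
To make the "distributed" part a bit more concrete, here is a toy sketch of data-parallel training, which is only one of several parallelism strategies used in practice. It is not DeepSeek's or anyone else's actual code: each "worker" is a plain Python stand-in for a GPU, the model is a single weight, and the function names (`local_gradient`, `all_reduce_mean`, `train`) are made up for illustration. Real systems do the averaging step with collective communication libraries such as NCCL or MPI.

```python
# Toy simulation of data-parallel training. Each "worker" stands in for a GPU.
# All names are hypothetical; this is an illustration, not a real framework.

def local_gradient(w, shard):
    """Gradient of mean squared error for y ~ w * x on one worker's shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def all_reduce_mean(values):
    """Stand-in for an all-reduce: every worker ends up with the same mean."""
    mean = sum(values) / len(values)
    return [mean] * len(values)

def train(data, num_workers=4, lr=0.01, steps=200):
    # Give each worker an equal-sized slice of the data.
    shards = [data[i::num_workers] for i in range(num_workers)]
    w = 0.0  # every worker holds an identical copy of the model
    for _ in range(steps):
        # 1. Each worker computes a gradient on its own shard (in parallel
        #    on real hardware, sequentially in this simulation).
        grads = [local_gradient(w, shard) for shard in shards]
        # 2. Workers exchange gradients so all of them see the same average.
        synced = all_reduce_mean(grads)
        # 3. Every worker applies the same update, keeping the copies in sync.
        w -= lr * synced[0]
    return w

# Synthetic data following y = 3x, so training should recover w close to 3.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
w = train(data)
```

The hard engineering is in step 2: with thousands of GPUs, that exchange happens constantly over the network, and making it fast and fault-tolerant is where much of the distributed work goes.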

This is true for DeepSeek as well as for others. There are a few companies giving insights into or open-sourcing their approaches, such as Databricks/Mosaic and, well, DeepSeek. The latter also did some particularly clever stuff, but if you look into the details, so did Mosaic.

OpenAI and Anthropic likely have even more sophisticated distributed tooling. It's just not open source.

Thanks, that's a really great/helpful explanation!