by ElectricalUnion
384 days ago
> Nobody can use ChatGPT offline or retrain it, but DeepSeek is fully opensource.

Well, you likely can't train DeepSeek yourself either. Most likely:

* philosophically, you don't have all the training data to train it yourself (so the claim that it's open source or open-whatever is dubious in the first place); or
* you don't have the compute to "press the train button" and get the weights back before the sun expires.

While considered ground-breakingly cheap, those costs were still estimated at around 6 million USD (DeepSeek claimed the model training took 2,788 thousand H800 GPU hours, which, at a cost of $2/GPU hour, comes out to a "mere $5.576 million").

I remember that when it was released, the mere thought that "people" could "train AI cheaply with only 6 million USD" caused one of the worst drops in Nvidia's valuation.
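As a rough sanity check on the figures quoted above, here is a minimal sketch of the arithmetic. The GPU-hour count and the $2/GPU-hour rate come from the comment; the function names and the GPU counts in the loop are hypothetical, chosen only to illustrate how long the same number of GPU hours would take on differently sized setups (assuming perfect scaling, which real training does not achieve).

```python
HOURS_PER_YEAR = 24 * 365  # ignoring leap years; good enough for an estimate


def training_cost_usd(gpu_hours, usd_per_gpu_hour):
    """Total rental cost for a given number of GPU hours."""
    return gpu_hours * usd_per_gpu_hour


def wall_clock_years(gpu_hours, num_gpus):
    """Wall-clock years needed, assuming perfect scaling across num_gpus."""
    return gpu_hours / num_gpus / HOURS_PER_YEAR


gpu_hours = 2_788_000  # "2,788 thousand H800 GPU hours" as claimed
cost = training_cost_usd(gpu_hours, 2.0)  # $2 per GPU hour, as quoted
cost_label = f"${cost / 1e6:.3f} million"
print(cost_label)

# Hypothetical cluster sizes, for illustration only.
for n in (1, 8, 2048):
    print(f"{n:>5} GPUs: {wall_clock_years(gpu_hours, n):.2f} years")
```

On a single GPU the same workload comes out to over three centuries of wall-clock time, which is the "before the sun expires" problem in concrete terms.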
Because the FineWeb dataset is already super good. You can train 7B or 32B parameter models at home.
The >600B parameter model isn't really using all the data effectively, but with a Mac Studio farm you can also train that one at home (if you have enough money to buy at least 100 of them).
Here's the easy way: https://github.com/FareedKhan-dev/train-deepseek-r1
More details: https://www.bentoml.com/blog/the-complete-guide-to-deepseek-...
Here's how DeepSeek-R1-Zero was built, basically from zero to hero, including the weights, the FULL training data, and everything you need to get it running locally or on servers: https://medium.com/@GenerationAI/how-deepseek-r1-zero-was-re...
For $30 USD you can also train a small DeepSeek at home!
https://github.com/Jiayi-Pan/TinyZero
https://huggingface.co/deepseek-ai/DeepSeek-R1-Zero (the model)