|
|
|
|
|
by epoch_100
500 days ago
|
|
> DeepSeek does not "do for $6M what cost US AI companies billions". I can only speak for Anthropic, but Claude 3.5 Sonnet is a mid-sized model that cost a few $10M's to train (I won't give an exact number). Also, 3.5 Sonnet was not trained in any way that involved a larger or more expensive model (contrary to some rumors). Wow! |
|
There are popular infographics floating around today which explain to the layman that deepseek invented MoE.