by paulddraper 291 days ago
Reductive.

Doesn’t explain DeepSeek.

1 comment

The DeepSeek story was way overblown. Read the gpt-oss paper: the actual training run is not the only expense. You also have multiple experimental training runs, as well as failed ones. Plus, they were behind SOTA even then.