Hacker News
by
paulddraper
291 days ago
Reductive.
Doesn’t explain DeepSeek.
1 comment
FergusArgyll
291 days ago
The DeepSeek story was way overblown. Read the gpt-oss paper: the actual training run is not the only expense. You also have multiple experimental training runs, as well as failed ones. Plus, they were behind SOTA even then.