by nl 2 days ago
> Anthropic confirmed the training run was under 10^26 FLOPs. You can't train 10T to Chinchilla and stay under 10^26.

I don't think Anthropic have said anything of the sort.
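
For what it's worth, the arithmetic half of the quoted claim is easy to sanity-check. A rough sketch, assuming the usual rules of thumb for a dense model: training compute C ≈ 6·N·D and a Chinchilla-style ratio of about 20 tokens per parameter. Neither number comes from a model card, and the constant names below are just illustrative:

```python
import math

# Assumptions (rules of thumb, not figures from any model card):
#   C ~= 6 * N * D   training FLOPs for a dense transformer
#   D ~= 20 * N      tokens for Chinchilla-style compute-optimal training
FLOPS_PER_PARAM_TOKEN = 6
TOKENS_PER_PARAM = 20
EO_THRESHOLD = 1e26  # reporting threshold mentioned in the thread


def chinchilla_flops(n_params: float) -> float:
    """Estimated training FLOPs for a dense model trained at ~20 tokens/param."""
    tokens = TOKENS_PER_PARAM * n_params
    return FLOPS_PER_PARAM_TOKEN * n_params * tokens


def max_params_under(budget_flops: float) -> float:
    """Largest dense parameter count whose estimate fits the budget."""
    return math.sqrt(budget_flops / (FLOPS_PER_PARAM_TOKEN * TOKENS_PER_PARAM))


n = 10e12  # 10T parameters
c = chinchilla_flops(n)
max_n = max_params_under(EO_THRESHOLD)

print(f"10T dense at ~20 tokens/param: {c:.1e} FLOPs")
print(f"over the 10^26 threshold: {c > EO_THRESHOLD}")
print(f"largest dense model under 10^26: {max_n:.2e} params")
```

Under those assumptions a 10T dense model comes out around 1.2*10^28 FLOPs, roughly 100x over the threshold, and the largest dense model that fits is a bit under 1T parameters. For a sparse/MoE model N would be the active parameter count, so the total could be larger.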

Microsoft published it as 6.1*10^27 FLOPs[1]

Elon has claimed they are also training a 10T model because "Some catching up to do"[2]

[1] https://x.com/scaling01/status/2061897540161728791

[2] https://x.com/elonmusk/status/2041754402239975479

1 comment

I must have confused Mythos with Opus 4.7. One of their recent model cards confirmed that training FLOPs were under the EO reporting requirement of 10^26 FLOPs.