| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by StrLght 636 days ago

Which ones are you referring to?

Just to make it clear, I see only 1 breakthrough [0]. Everything that happened afterwards is just application of this breakthrough with different training sets / to different domains / etc.

[0]: https://en.wikipedia.org/wiki/Attention_Is_All_You_Need

1 comments

cubefox 635 days ago

Autoregressive language models, the discovery of the Chinchilla scaling law, MoEs, supervised fine-tuning, RLHF, whatever was used to create OpenAI o1, diffusion models, AlphaGo, AlphaFold, AlphaGeometry, AlphaProof.

link