Hacker News
by mysterypie 1141 days ago
I keep hearing that there have been significant breakthroughs in the areas of AI / ML / LLM / Transformer Models between 2012 and present. Can someone summarize what the breakthroughs were, who was principally responsible, and which papers specifically?

This timeline has something like 60 papers and many papers have 8-30 authors. Are the breakthroughs spread out like that? Or are there one or two super important works? Sort of like Einstein's "On the Electrodynamics of Moving Bodies"?

3 comments

The most notable paper is probably the one that defined the concept of transformers, Attention Is All You Need: https://arxiv.org/abs/1706.03762

It's been very incremental and spread out.

Another commenter pointed to "Attention Is All You Need" as a particular breakthrough. Even that paper mostly merged the then-current trends in attention, normalization, and seq2seq, and it doesn't yet contain the interesting empirical results that were later found by using that architecture to scale up on other problems.

The papers with huge numbers of authors tend to be empirical results with a large element of systems/engineering; much large-scale LLM work is like this.

The graph helps clarify that. Look at the ones that are heavily-connected ancestors.