by danielmarkbruce 32 days ago
You are confused about what the two Ls in LLM mean, or about which data set she created, or both, or in general.

Or it is you who are confused. And I want to remind you that you can't retcon historical word use.
Fei-Fei was annotating images... the second L in LLM is for "language". The first models called LLMs were trained on language data, with an objective of predicting the next token. They had nothing to do with the ImageNet data. ImageNet data was used in... vision models.

The "Attention Is All You Need" paper never used the term LLM or "large language model", because the phrase didn't exist in the industry at the time.

Why comment on a field you know nothing about?