Hacker News
by jpdus 914 days ago
Hey, imho the best overall technical intro to LLMs (I guess that's your main interest, as you mentioned qlora + llama) is by Simon Willison [1]. Additionally, or if you prefer videos, the recent 1h "busy person's intro" by Andrej Karpathy is great and dense as well [2].

[1] https://simonwillison.net/2023/Aug/3/weird-world-of-llms/

[2] https://youtu.be/zjkBMFhNj_g?si=M6pRX66NrRyPM8x-

EDIT: Maybe I misunderstood, as you asked about papers, not general intros. I don't think that reading papers is the best way to "catch up", as the pace is rapid and the knowledge is very decentralized. I can confirm what Andrej recently wrote on X [3]:

"Unknown to many people, a growing amount of alpha is now outside of Arxiv, sources include but are not limited to:

- https://github.com/trending

- HN

- that niche Discord server

- anime profile picture anons on X

- reddit"

[3] https://twitter.com/karpathy/status/1733968385472704548

2 comments

This, but I'd replace Reddit with 4chan. There is a lot more information on how to build, finetune and run models there, compared to Reddit.
Is he referencing a particular “niche Discord server”?