Hacker News
by drdg 934 days ago
Very cool. The explanations of what each part is doing are really insightful. And I especially like how the scale jumps when you move from e.g. Nano all the way to GPT-3 ....