Hacker News
by 363849473754 746 days ago
You might have covered this topic before, but I'm curious about the main performance differences between nanoGPT and llm.c. I'm planning to take your "Zero to Hero" course, and I'd like to know how capable the nanoGPT chatbot you'll build is. Is its quality comparable to GPT-2 when used as a chatbot?

Zero To Hero doesn't make it all the way to a chatbot. It stops at pretraining, and even that only at a fairly small scale: a character-level transformer on TinyShakespeare. I think it's a good conceptual intro, but it doesn't get you very far toward a competent chatbot. I think I should be able to improve on this soon.
Thanks! So you are considering expanding the Zero to Hero series to include building a basic GPT-2 toy chatbot? I believe you mentioned in one of the early lectures that you planned to include building a toy version of DALL-E. Do you still have plans for that as well?
Please do! It's a fantastic series!