Hacker News
by mips_avatar 51 days ago
You can fully train a 1.6b model on a single 3090. That’s a reasonably big model.
1 comment

you can train it, but not fully
I trained Karpathy's d28 1.6b nanochat on a 3090. Took an extremely long time but I did it.
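For anyone wondering why a 1.6b model on a 24 GB card is tight, here is a rough back-of-the-envelope sketch of the memory needed for weights, gradients, and optimizer state. The bytes-per-parameter figures are generic assumptions for an Adam-style optimizer, not a description of how nanochat is actually configured. Activations, which depend on batch size and sequence length, are left out entirely.

```python
# Rough training-memory estimate: weights + gradients + optimizer state only.
# Byte counts per parameter are illustrative assumptions, not nanochat measurements.

GIB = 1024 ** 3


def training_state_gib(n_params, weight_bytes, grad_bytes, optim_bytes):
    """Return GiB needed for weights, grads, and optimizer state (no activations)."""
    return n_params * (weight_bytes + grad_bytes + optim_bytes) / GIB


N_PARAMS = 1.6e9  # the 1.6b figure from the thread
VRAM_GIB = 24     # memory on a single 3090

# Assumption: fp32 weights and grads, two fp32 Adam moments (4 + 4 + 8 bytes/param).
fp32_adam = training_state_gib(N_PARAMS, 4, 4, 8)

# Assumption: bf16 weights, grads, and Adam moments (2 + 2 + 4 bytes/param).
bf16_adam = training_state_gib(N_PARAMS, 2, 2, 4)

for name, gib in [("fp32 + Adam", fp32_adam), ("bf16 + Adam", bf16_adam)]:
    headroom = VRAM_GIB - gib
    print(f"{name}: {gib:.1f} GiB state, {headroom:.1f} GiB left for activations")
```

Under these assumptions the full-precision setup leaves almost nothing for activations. That fits with training being possible but slow, since small batches, gradient accumulation, or lower-precision state would be needed to make it fit.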