Hacker News
by make3 1976 days ago
also, he must be talking about training a much smaller model than the 1.5B one, because otherwise training it would maybe take years