Hacker News
by lsb 304 days ago
This is reminiscent of “Cramming”, a paper from a few years ago in which the authors tried to find the best language model they could train in one day on a single consumer GPU: https://arxiv.org/abs/2212.14034