Hacker News new | ask | show | jobs
by maciejgryka 683 days ago
Yes, compute is absolutely the limiting factor today. Not only because the space of hyperparameters is huge and having more compute would make it easier/possible to explore. But also, weirdly, because inference becomes increasingly important for training, which means even more compute! A lot of work these days goes into getting better data and it turns out that using an existing large model to create/curate data for you works really well.