Hacker News
by ReadEvalPost 1079 days ago
That advice makes sense if we're talking about 800B+ parameter models that require a gigantic investment of capital and time. For models that fit on a consumer GPU, you're leaving chips on the table by not taking advantage of training / fine-tuning. It's just too easy and powerful not to.