Hacker News
by gwern 1441 days ago
#1, and run at a loss. You have no idea if anyone even wants your startup idea or if you have any product-market fit; worrying about training your own model is putting the cart before the horse. If no one wants your product even ignoring the cost of ML, then you don't need to worry about any of that!

If you do potentially have a viable idea and it turns out there are >0 people out there willing to pay for it, then you can experiment with cheaper models and optimizations as necessary. ('People want our product we sell at a loss so much we may go bankrupt' is a good problem to have.) Also, consider that as time passes, your problem may be solved for you: lots of people moaned and whined about the OA API and wrung their hands about how no one would ever be able to afford to train their own GPT-3 - but here we are, just over 2 years later, and you have a wealth of alternatives in API or FLOSS models, like Jurassic or GPT-J/NeoX-20B or YaLM or OPT or BLOOM or... Even if none of those work for you or can be finetuned or something, it is also now easier than ever to train your own: countless bugs have been worked out, better training recipes have been documented, newer, better GPUs have come out (A100s are no longer rare, and H100s are coming soon), and older GPUs themselves are enjoying a pricing correction.

1 comment

This was really insightful, thanks for taking the time to respond.