Hacker News
by dylanbfox 1718 days ago
Author here. Thanks for your comments!

In general, this is expensive stuff. Training big, accurate models just requires a lot of compute, and there is a "barrier to entry" with respect to costs, even if you're able to get those costs down. I think it's similar to how startups can't really get into the aerospace industry unless they raise lots of funding (e.g., Boom Supersonic).

Practically speaking though, for startups without funding or access to cloud credits, my advice would be to just train the best model you can with the compute resources you have available. Try to close your first customer with an "MVP" model. Even if your model is not good enough for most customers, you can close one, get some incremental revenue, and keep iterating.

When we first started (2017), I trained models that were ~1/10 the size of our current models on a few K80s in AWS. These models were much worse than our models today, but they helped us make incremental progress to get to where we are now.