by shraremywin2 2639 days ago
I like the idea, but data collection seems like a big chunk of the costs. And if you're going to go that far, why not create an API to host it, charge for usage, and pay back the initial "investors"?
1 comment

An API to host the data? Data storage isn't really a huge cost (GPT-2 was trained on only about 40 GB of text, scraped from web pages linked on Reddit). It's the compute to train the model that's the costly part.