Hacker News
by syntaxing 1017 days ago
This sounds like a really fun project. Running small models would change a lot of industries, like games in their example. But how do people afford these projects?! If I am doing my numbers right, it'll cost them $50K to train this model for 3T tokens.
2 comments

That's less than a month's income for a few people on here. I recall a comment from an engineer at Nvidia a year or two ago saying $700k/year was about how much they were paid, in response to someone else who didn't believe those levels.

Get together 5 people in that position and it's less than a week's income for the group. That sounds doable as a hobby for those lucky people.

More realistically, it's within range for a grant, or use of someone else's hardware if they aren't using it, as the sibling comment from wongarsu said.

Also, cloud vendors sometimes give out large batches of credits to startups and the like as a marketing incentive to win future customers.
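
For anyone who wants to check the income comparison above, here is a quick sketch of the arithmetic. It assumes the $700k/year figure is gross pay spread evenly over 12 months or 52 weeks, and it uses the parent's $50K estimate as the training cost; the variable names are just illustrative.

```python
# Figures quoted in the thread; taxes and uneven pay schedules are ignored (assumption).
ANNUAL_INCOME_USD = 700_000   # salary recalled from the Nvidia engineer's comment
TRAINING_COST_USD = 50_000    # parent comment's estimate
GROUP_SIZE = 5                # "get together 5 people"

# One person's income per month, and the whole group's income per week.
monthly_income = ANNUAL_INCOME_USD / 12
group_weekly_income = GROUP_SIZE * ANNUAL_INCOME_USD / 52

print(f"one person, per month: ${monthly_income:,.0f}")
print(f"group of {GROUP_SIZE}, per week: ${group_weekly_income:,.0f}")
print("cost below one month's income:", TRAINING_COST_USD < monthly_income)
print("cost below the group's weekly income:", TRAINING_COST_USD < group_weekly_income)
```

Both comparisons hold under those assumptions, though only on pre-tax numbers.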

$38k, based on the "90 days using 16 A100-40G" and lambdalabs prices.

That's a lot for a hobby, but small enough that it might be running on a university machine (the TinyLlama devs provide a way to cite them and all seem to work or study at Singapore University of Technology and Design) or could be sponsored (no indication of that now, but "people made an awesome model in our cloud" is good advertising). Government grants, or grants in general, also aren't out of the question, especially for a topic with this much hype.
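
The $38k estimate can be reproduced roughly as follows. This is a sketch, not the commenter's actual calculation: the $1.10 per GPU-hour rate is an assumption, chosen because it is the on-demand rate the $38k figure implies for 16 GPUs over 90 days, and real pricing varies by provider and over time.

```python
# "90 days using 16 A100-40G" is quoted in the thread.
DAYS = 90
GPUS = 16
USD_PER_GPU_HOUR = 1.10  # ASSUMPTION: rate implied by the $38k figure, not a quoted price

# Total GPU-hours for the run, then the rental cost at the assumed rate.
gpu_hours = DAYS * 24 * GPUS
total_cost = gpu_hours * USD_PER_GPU_HOUR

# For comparison: the hourly rate that the parent's $50K estimate would imply.
implied_rate_for_50k = 50_000 / gpu_hours

print(f"GPU-hours: {gpu_hours:,}")
print(f"cost at ${USD_PER_GPU_HOUR:.2f}/GPU-hour: ${total_cost:,.0f}")
print(f"rate implied by $50K: ${implied_rate_for_50k:.2f}/GPU-hour")
```

So the gap between the two estimates in this thread comes down to the assumed hourly price per GPU, roughly $1.10 versus $1.45.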