by factoidforrest
1180 days ago
I'd like to know that as well. I went pretty far into looking at alternatives for the post. The best thing you can really do is try them yourself with a tool like this: https://github.com/oobabooga/text-generation-webui. Benchmarks are one thing, but I suspect that if any were truly on par we would know about it. Obviously it's partly compute cost, but I also suspect there's a lot of R&D that would need to be redone in the open. How many tricks does OpenAI's pretraining have that aren't found in some paper somewhere?
|
Training big models takes a lot of random reads/writes, and those tend to be pretty latency sensitive. There _may_ be a way to train this BitTorrent style with donated compute, but it's hard to say how many orders of magnitude slower that would be. (Do you need 2x more compute to do it distributed? 10x? 100x?)
It's an interesting space, and I'd like to see it explored more!
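
For a rough feel of the 2x / 10x / 100x question, here's a back-of-envelope sketch in Python. It is not a real benchmark. It assumes synchronous data-parallel training with a ring all-reduce and no overlap between compute and communication, which is my assumption and not something stated above. Every number in it (model size, link speeds, latencies, step time) is made up for illustration, and `step_time` and `slowdown` are hypothetical helper names.

```python
def step_time(grad_bytes, workers, compute_s, bandwidth_bytes_per_s, latency_s):
    """Crude time for one synchronous training step.

    Assumes a ring all-reduce: 2 * (workers - 1) transfer rounds, each
    moving grad_bytes / workers per worker, with no compute/comm overlap.
    """
    if workers == 1:
        return compute_s  # nothing to synchronize
    rounds = 2 * (workers - 1)
    bytes_sent = rounds * grad_bytes / workers
    comm_s = rounds * latency_s + bytes_sent / bandwidth_bytes_per_s
    return compute_s + comm_s


def slowdown(grad_bytes, workers, compute_s, fast_link, slow_link):
    """Ratio of step time on a slow link to step time on a fast link.

    Each link is a (bandwidth_bytes_per_s, latency_s) tuple.
    """
    fast = step_time(grad_bytes, workers, compute_s, *fast_link)
    slow = step_time(grad_bytes, workers, compute_s, *slow_link)
    return slow / fast


# All values below are illustrative guesses, not measurements.
GRAD_BYTES = 1_000_000_000 * 2      # 1B parameters, 2 bytes each
WORKERS = 8
COMPUTE_S = 1.0                     # seconds of pure compute per step
DATACENTER = (50e9, 5e-6)           # 50 GB/s, 5 microseconds
HOME_INTERNET = (12.5e6, 50e-3)     # 100 Mbit/s, 50 milliseconds

ratio = slowdown(GRAD_BYTES, WORKERS, COMPUTE_S, DATACENTER, HOME_INTERNET)
print(f"estimated slowdown: {ratio:.0f}x")
```

With these made-up inputs the estimate lands in the hundreds, and almost all of it comes from bandwidth, not latency. Gradient compression, less frequent synchronization, or overlapping communication with compute would all change the picture, so treat this as a way to poke at the question, not an answer.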