by kelnos
7 days ago
Is the issue that training with less compute takes more time? Or is it just not possible? I think a collective using distributed training could tolerate the idea that it takes 10x as long as Anthropic to train a model, or whatever.