by alchemist1e9
1164 days ago
The underlying motivation for my thoughts and comments is to investigate whether a decentralized but periodically coordinated algorithm for training LLMs exists. There are millions of GPUs distributed across the world. If they could somehow be put to work on training without extreme requirements on data transfer between them, that could enable training of large LLMs in an open source way, even if that training is technically energy suboptimal.