by moyix
1976 days ago
I agree. I've been training on 2x 3090s connected via NVLink and they're really fast for training language models. I'm actually tempted to try to replicate the OP's GPT-2 replication using Hugging Face, DeepSpeed, and OpenWebText, but the GPUs are occupied right now training a 774M-parameter GPT-2 language model for C code...