Hacker News
by bjt12345 633 days ago
> [1] The training code for AMD-135M is based on TinyLlama, utilizing multi-node distributed training with PyTorch FSDP.
I thought PyTorch didn't work well on AMD hardware, and I've read that many people use JAX instead?