by p1esk 2450 days ago
What was the bug?

Late reply, but it was a bug using XLA GPUs to add concurrency to the training process. Maybe someone figured it out or fixed it, but I've moved on already.
You shouldn’t need XLA for multi-GPU training. Have you tried training without it?