by rtkaratekid 2442 days ago
Late reply, but it was a bug that showed up when using XLA GPUs to add concurrency to the training process. Maybe someone has figured it out or fixed it since, but I've already moved on.
1 comment

You shouldn’t need XLA for multi-GPU training. Have you tried training without it?