by WithinReason 483 days ago
Why is this even necessary? Can you just shard your training set to the training nodes ahead of time instead?

No. Besides accessing training data, there is also logging and checkpointing. Once you run k8s over the cluster and there are multiple training jobs, isolated local storage is a nightmare.
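
For reference, the static sharding proposed in the parent comment could look roughly like the sketch below. It is only an illustration: `shard_for_node`, the file names, and the node count are hypothetical, and it covers reads of training data only, not the logging and checkpointing raised in the reply.

```python
def shard_for_node(files, node_rank, num_nodes):
    """Return the subset of files a given node would copy to its local disk.

    Hypothetical helper: assigns files round-robin by position in sorted order.
    """
    if not 0 <= node_rank < num_nodes:
        raise ValueError("node_rank must be in [0, num_nodes)")
    # Sort first so every node computes the same assignment independently.
    return sorted(files)[node_rank::num_nodes]


# Made-up file names, purely for illustration.
files = [f"train-{i:05d}.bin" for i in range(10)]

for rank in range(3):
    print(rank, shard_for_node(files, rank, 3))
```

Each node ends up with a disjoint slice, and together the slices cover the whole set. The assignment is fixed by `num_nodes`, so it has to be redone if the node count changes or a job is rescheduled onto different machines.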