by dmpetrov 600 days ago
Right, DVC caches data for consistency and reproducibility.

If you don't need caching and want streaming instead, we've created a sister tool, DataChain. It even supports WebDataset: it can stream from tar archives and filter images by metadata.

WebDataset example: https://github.com/iterative/datachain/blob/main/examples/mu...
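
To give a rough idea of what "stream from tar archives and filter by metadata" means, here is a minimal stdlib-only sketch. It is not DataChain's API. The helper names (`iter_samples`, `filter_by_metadata`, `build_demo_tar`) and the `label` field are made up for illustration, and it assumes the WebDataset-style layout where files sharing a basename (e.g. `0001.jpg` and `0001.json`) form one sample and sit next to each other in the archive.

```python
import io
import json
import tarfile


def iter_samples(fileobj):
    """Yield one {extension: bytes} dict per sample, reading the tar sequentially."""
    current_key, sample = None, {}
    # "r|*" opens the archive as a forward-only stream: no seeking, no full download.
    with tarfile.open(fileobj=fileobj, mode="r|*") as tar:
        for member in tar:
            if not member.isfile():
                continue
            # Assumption: the sample key is everything before the first dot.
            key, _, ext = member.name.partition(".")
            if key != current_key and sample:
                yield sample
                sample = {}
            current_key = key
            # In stream mode the member must be read before advancing.
            sample[ext] = tar.extractfile(member).read()
    if sample:
        yield sample


def filter_by_metadata(samples, predicate):
    """Keep only samples whose JSON sidecar satisfies the predicate."""
    for sample in samples:
        meta = json.loads(sample.get("json", b"{}"))
        if predicate(meta):
            yield sample, meta


def build_demo_tar():
    """Build a tiny in-memory archive so the sketch runs without real data."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for key, label in [("0001", "cat"), ("0002", "dog"), ("0003", "cat")]:
            files = {
                "jpg": b"fake-image-bytes",
                "json": json.dumps({"label": label}).encode(),
            }
            for ext, payload in files.items():
                info = tarfile.TarInfo(name=f"{key}.{ext}")
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    samples = iter_samples(build_demo_tar())
    for sample, meta in filter_by_metadata(samples, lambda m: m.get("label") == "cat"):
        print(meta, len(sample["jpg"]))
```

In practice the file object would be a network stream rather than an in-memory buffer; the point is only that samples are produced one at a time without caching the whole archive locally.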

1 comment

Thank you! That's news to me. I will absolutely give it a try.