by kvnhn 1409 days ago
I very much agree with you about DVC's feature creep. The other issue I have with it is speed. DVC has left me scratching my head at its sluggishness many times. Because of these factors, I've been working on an alternative that focuses on simplicity and speed[0]. My tool is often five to ten times faster than DVC[1]. I'd love to hear what you think.

[0]: https://github.com/kevin-hanselman/dud

[1]: https://kevin-hanselman.github.io/dud/benchmarks/


Thanks! I really like your clear explanation of how Dud differs from DVC (and I prefer your version in all cases).

Would it be possible for Dud to push/pull from a DVC remote and use the DVC shared cache? That would be really useful so I (iconoclastic free software user) could use Dud on my machine/account, but still share data and artifacts with other people (who don't give a shit what tool they use) using DVC on their machines/accounts.

Also: Does Dud support reflinks at all? Or does it only support symlinks?

Unfortunately, there are a few things that currently hinder compatibility with DVC caches. First, Dud uses the BLAKE3 checksum algorithm, and DVC uses MD5. This means the content-addressed caches will have completely different file names. Second, directories are committed to DVC differently than they are in Dud. For directories, not only will the committed file names not match (due to the first point), but the contents will not match either. Both of these things could be addressed, but it would take a lot of effort and would likely cost Dud in terms of its two main goals, speed and simplicity. I'm not opposed to this if we can make it work, though.
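
To make the first point concrete, here is a rough Python sketch of why the same bytes end up under different names in two content-addressed caches. It rests on several assumptions: BLAKE3 is not in Python's standard library, so BLAKE2b with a 32-byte digest stands in for it; the `cache_path` helper is hypothetical; and the `.dvc/cache` and `.dud/cache` roots and the two-character subdirectory layout are illustrative, not taken from either tool's source.

```python
import hashlib
from pathlib import PurePosixPath


def md5_hex(data: bytes) -> str:
    """Hex digest using MD5, the algorithm the thread attributes to DVC."""
    return hashlib.md5(data).hexdigest()


def blake2_hex(data: bytes) -> str:
    """Hex digest using BLAKE2b as a stand-in for BLAKE3 (assumption)."""
    return hashlib.blake2b(data, digest_size=32).hexdigest()


def cache_path(cache_root: str, digest: str) -> PurePosixPath:
    """Hypothetical layout: first two hex characters form a subdirectory."""
    return PurePosixPath(cache_root) / digest[:2] / digest[2:]


data = b"the same file contents\n"

# Identical bytes, two hash functions, two unrelated cache file names.
md5_path = cache_path(".dvc/cache", md5_hex(data))
blake_path = cache_path(".dud/cache", blake2_hex(data))

print(md5_path)
print(blake_path)
```

Because the file name is derived from the digest, one tool has no way to look up an object the other stored without rehashing the data or keeping a mapping between the two digests.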

Dud currently does not support reflinks, but I think adding reflink support would be fairly straightforward. Just curious: What filesystem and OS are you using for reflinks?
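
For readers unfamiliar with reflinks, here is a minimal Python sketch of the general idea, not Dud's or DVC's implementation. It assumes Linux, where the `FICLONE` ioctl asks the filesystem to share data blocks between two files (this works on filesystems such as Btrfs and XFS), and it falls back to an ordinary copy anywhere the clone is not supported. The `reflink_or_copy` helper and the file names are hypothetical.

```python
import os
import shutil
import sys
import tempfile

FICLONE = 0x40049409  # Linux ioctl request: clone a whole file's data blocks


def reflink_or_copy(src: str, dst: str) -> str:
    """Try to reflink src to dst; fall back to a plain copy.

    Returns "reflink" or "copy" to report which method was used.
    """
    if sys.platform.startswith("linux"):
        import fcntl  # POSIX-only module, imported lazily

        try:
            with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
                fcntl.ioctl(fdst.fileno(), FICLONE, fsrc.fileno())
            return "reflink"
        except OSError:
            pass  # filesystem does not support cloning; copy instead
    shutil.copyfile(src, dst)
    return "copy"


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "cached_object")
    dst = os.path.join(tmp, "workspace_file")
    with open(src, "wb") as f:
        f.write(b"example artifact\n")

    method = reflink_or_copy(src, dst)
    with open(dst, "rb") as f:
        copied = f.read()

print(method)
```

Unlike a symlink, the result is an independent regular file, so editing the workspace copy does not modify the cached object.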

I'd be happy to chat more about this. Feel free to open GitHub issues for these items. I welcome contributions as well. ;)