by DominoDataLab 4366 days ago

siganakis's explanation is correct, with a minor caveat [0]. We do some clever things (diffing, compressing), but if large amounts of data are always changing, the network transfer will take time. We have folks comfortably using Domino with about 50GB of data in their project directories. And only the amount of data in the current revision matters, not the cumulative total across all revisions. Anyway, if you have a use case with more than ~50GB, let us know; we're eager to engineer a more advanced solution, we just haven't had the need yet ;)

If your code pulls data from a database (e.g., kelv's test cases), one option is to save a DB snapshot out to a file when the code runs. After we finish executing your code, we snapshot all the new/changed files in your working directory (we call those the "results" of the run). Using this approach, you'd have a record of the DB snapshot for each run of your tests. But, again... if your snapshot is more than tens of GB, the network transfer time could get annoying.
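To make the snapshot idea concrete, here is a minimal sketch, assuming a SQLite database and Python. The helper name `save_db_snapshot` and the `snapshots` directory are made up for illustration and are not part of Domino; the only point is that the dump lands as a new file in the working directory, so it would be picked up with the rest of the run's results.

```python
import sqlite3
import time
from pathlib import Path


def save_db_snapshot(conn: sqlite3.Connection, out_dir: Path) -> Path:
    """Write a SQL text dump of the database into out_dir and return its path.

    Hypothetical helper: the name and file layout are assumptions for this sketch.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%dT%H%M%S")
    path = out_dir / f"db_snapshot_{stamp}.sql"
    with path.open("w", encoding="utf-8") as f:
        # iterdump() yields the SQL statements needed to recreate the database.
        for statement in conn.iterdump():
            f.write(statement + "\n")
    return path


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE results (id INTEGER PRIMARY KEY, score REAL)")
    conn.execute("INSERT INTO results (score) VALUES (0.93)")
    conn.commit()
    # Writing under the working directory means the file shows up as a new result.
    print(save_db_snapshot(conn, Path("snapshots")))
```

For a server database you would swap the dump step for that database's own export tool; the file-in-the-working-directory part stays the same.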

[0] We treat the working directory as a git repo, but since git breaks down with large files, we only use it to track your directory structure; we store actual file contents elsewhere.
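For anyone curious what "track the structure, store contents elsewhere" can look like in general, here is a rough sketch of one common approach: hash each file's contents, keep a small manifest of path to hash, and only transfer files whose hash changed. This is an illustration under my own assumptions, not a description of Domino's actual implementation; `build_manifest` and `files_to_upload` are hypothetical names.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in chunks so large files need not fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to the hash of its contents."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def files_to_upload(previous: dict[str, str], current: dict[str, str]) -> list[str]:
    """Paths that are new or whose contents changed since the previous manifest."""
    return sorted(p for p, digest in current.items() if previous.get(p) != digest)
```

The manifest is small text, so it is cheap to version; the large file contents would live in a separate store keyed by hash, and unchanged files cost nothing to re-sync.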