Hacker News
by nightfly 610 days ago
> MySQL and Oracle store a compact delta between the new and current versions (think of it like a git diff).

Doesn't git famously _not_ store diffs, and instead follow the same storage pattern postgres uses here, storing the full new and old objects?

5 comments

TBF, the quoted section doesn't say that git stores diffs (or anything about git storage), it just says that what MySQL and Oracle store is similar to a git diff.
It's a little too easy to misinterpret if you're skimming and still have memories of working with SVN, mercurial, perforce, and probably others (I've intentionally repressed everything about tfvc).
It’s not clear why they state “git diff” specifically. It’s simply a diff (git or otherwise).
That is correct. Each version of a file is a separate blob. There is some compression done by packing to make cloning faster, but the raw form git works with is these blobs.
git's model is a good example of layered architecture. Most of the code works in terms of whole blobs. The blob storage system, as an implementation detail, stores some blobs as deltas against other blobs. The use of deltas doesn't leak into the rest of the system. Good separation of concerns.
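To make the "whole blobs" point concrete, here is a rough Python sketch of how a blob gets its ID and how a loose object is laid out on disk. It assumes the default SHA-1 object format; `git_blob_id` and `loose_object_bytes` are names I made up for illustration, not anything from git itself. Packing and deltas are deliberately left out, since those live one layer down.

```python
import hashlib
import zlib


def _with_header(content: bytes) -> bytes:
    # Git prefixes the raw content with "blob <size>\0" before hashing/storing.
    return b"blob " + str(len(content)).encode() + b"\0" + content


def git_blob_id(content: bytes) -> str:
    # Object ID = SHA-1 over header + full content (assumes SHA-1 repos).
    return hashlib.sha1(_with_header(content)).hexdigest()


def loose_object_bytes(content: bytes) -> bytes:
    # A loose object is the zlib-compressed header + full content.
    return zlib.compress(_with_header(content))


# Two versions of the same file: each one is a complete, independent blob.
v1 = b"hello\n"
v2 = b"hello world\n"

print(git_blob_id(v1))
print(git_blob_id(v2))
```

Changing one byte gives a completely different ID and a completely separate object. Nothing at this level knows that `v2` is "v1 plus a word".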
1. The comparison was to MySQL and Oracle storage using git diff format as an analogy, not git storage.

2. git storage does compress, and the compression is "diff-based" of sorts, but it is not based on commit history as one might naively expect.

Others have mentioned that it said “git diffs”. However, git does use deltas in pack files as a low-level optimization, similar to the MySQL comparison. You don’t get back diffs from a SQL query either.
Git diffs are generated on the fly, but diffs are still diffs.
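A quick Python sketch of what "generated on the fly" means here: both versions are available as full snapshots, and the diff is computed only when someone asks for it. This uses the standard library's `difflib`, not git's own diff machinery, and `diff_on_the_fly` is a hypothetical helper name.

```python
import difflib


def diff_on_the_fly(old: str, new: str, name: str = "file.txt") -> str:
    # Nothing is stored as a diff: the unified diff is derived on demand
    # from two complete versions of the file.
    lines = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"a/{name}",
        tofile=f"b/{name}",
    )
    return "".join(lines)


old_snapshot = "a\nb\n"
new_snapshot = "a\nc\n"

print(diff_on_the_fly(old_snapshot, new_snapshot))
```

The output looks like what `git diff` prints, which is probably why the article reached for that analogy, even though the stored data is the two snapshots.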