by mtdewcmu 4571 days ago
I noticed that there is no delta compression until objects get incorporated into a pack.

>Think of each branch as a pointer. Then realize that you can make that pointer point anywhere on the DAG, even to parts of the DAG that have no connection to each other. The `reflog` is a (local, non-comprehensive) history of where that pointer has pointed.

I got that branches were pointers. Now that I'm aware that the DAG is fully represented inside objects, I can see that what's inside logs/ is actually just logs. Each log corresponds to a subgraph of the full DAG. Getting history from a log would be more efficient than from the objects themselves, because to get it from objects, you'd have to dereference a lot of object references.

>I'm not sure what branches living under .git/refs has to do with excessive hierarchies/trees. There are enough things stored in the .git directory, that if you mashed them all together it wouldn't make any sense.

Having to descend through layers of subdirectories makes things harder. I'd reduce the depth of the directory tree to the absolute minimum. It's hard to tell if this is the minimum without knowing exactly what all the implementation constraints might have been.

I can see that the real meat of this system is the object store. It's useful to know about `git cat-file` for inspecting it.
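
For anyone else poking at the object store, here is a rough Python sketch of what a loose object looks like, assuming the default SHA-1 object format. Each loose object is a zlib-compressed "<type> <size>" header, a NUL byte, and the raw body, stored whole with no deltas against other objects. The function names are my own, not anything git exposes.

```python
import hashlib
import zlib
from typing import Tuple


def encode_loose_object(obj_type: str, body: bytes) -> Tuple[str, bytes]:
    """Return (object id, compressed bytes) in the loose-object layout."""
    # Header is "<type> <size>", then a NUL byte, then the raw body.
    store = f"{obj_type} {len(body)}".encode() + b"\x00" + body
    object_id = hashlib.sha1(store).hexdigest()
    return object_id, zlib.compress(store)


def decode_loose_object(compressed: bytes) -> Tuple[str, bytes]:
    """Inverse of the above: roughly what `git cat-file` has to read."""
    store = zlib.decompress(compressed)
    header, _, body = store.partition(b"\x00")
    obj_type, size = header.decode().split(" ")
    if int(size) != len(body):
        raise ValueError("size in header does not match body length")
    return obj_type, body


object_id, compressed = encode_loose_object("blob", b"hello\n")
# On disk this would live at .git/objects/<first 2 hex chars>/<remaining 38>.
print(object_id[:2], object_id[2:])
print(decode_loose_object(compressed))
```

The object id is just the hash of the uncompressed header plus body, which is why identical content always ends up in the same file.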

1 comment

> Each log corresponds to a subgraph of the full DAG

I don't have the time to keep up this conversation, but this assertion is wrong. It is not a subgraph. It is a history of the values that the pointer has pointed to (e.g. "Pointer <branch_name> changed from pointing to value AAA to value BBB due to action XXX"). That is basically what all of those entries are. 'AAA' and 'BBB' may be in completely unconnected sections of the DAG.
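
To make the "AAA to BBB due to XXX" shape concrete, here is a small Python sketch that splits one line of a file under .git/logs into those parts. The on-disk layout is old value, new value, identity with timestamp, then a tab and the message. The sample line is made up (fake hashes, fake user), and `parse_reflog_line` is a hypothetical helper, not part of git.

```python
from typing import NamedTuple


class ReflogEntry(NamedTuple):
    old: str       # value the pointer had before (AAA)
    new: str       # value the pointer has after (BBB)
    who_when: str  # identity, timestamp and timezone, left unparsed
    action: str    # message describing what moved the pointer (XXX)


def parse_reflog_line(line: str) -> ReflogEntry:
    # Everything before the tab is "<old> <new> <identity> <timestamp> <tz>";
    # everything after it is the action message.
    meta, _, action = line.rstrip("\n").partition("\t")
    old, new, who_when = meta.split(" ", 2)
    return ReflogEntry(old, new, who_when, action)


# Made-up sample entry; real hashes are 40 hex characters of object id.
sample = (
    "a" * 40 + " " + "b" * 40
    + " Example User <user@example.com> 1700000000 +0000"
    + "\treset: moving to Z\n"
)
entry = parse_reflog_line(sample)
print(f"pointer moved {entry.old[:7]} -> {entry.new[:7]} because: {entry.action}")
```

Nothing in an entry says how the old and new values relate to each other in the DAG; it only records that the pointer moved.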

If you create a new repository and add a couple of commits, then yes the reflog files will look like a history, but only because the branch pointer has traversed the DAG from start to end with no deviations.

For example you can have a DAG like this:

   A - B - C - D - E

   X - Y - Z
If you move the branch pointer from B to Z, the resulting log is not a subgraph. Well, I guess technically you could call it a graph of the history of the branch pointer, but it in no way corresponds to the DAG, other than that all of the pointer values exist within the DAG. For example, the following operations:

  git clone
  git reset --hard Z
  git reset --hard X
Would create a graph like this (assuming that master pointed to E when you cloned):

  E - Z - X
Notice that this really doesn't correspond to the DAG, other than the fact that those objects exist in the DAG.
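
If it helps, the same point as a toy Python model. It assumes A and X are the root commits of the two chains in the example above and that each commit's parent is the one to its left. The `parents` dict stands in for the commit objects and the `reflog` list for the values recorded in .git/logs; both are illustrative, not read from a real repository.

```python
# Parent links for the two unconnected chains (assumed direction: left is older).
parents = {
    "A": [], "B": ["A"], "C": ["B"], "D": ["C"], "E": ["D"],
    "X": [], "Y": ["X"], "Z": ["Y"],
}


def ancestors(commit):
    """Every commit reachable from `commit` by following parent links."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def is_dag_edge(a, b):
    """True if a and b are directly linked as parent and child."""
    return b in parents[a] or a in parents[b]


# Values the master pointer took: clone (at E), reset to Z, reset to X.
reflog = ["E", "Z", "X"]

for old, new in zip(reflog, reflog[1:]):
    print(f"{old} -> {new}: edge in the DAG? {is_dag_edge(old, new)}")

# History recoverable from the objects alone, starting at the current tip.
print("reachable from X:", sorted(ancestors(reflog[-1])))
```

Neither step in the log is an edge of the DAG, and walking the objects from the current tip never visits E or Z, so the log and the object history are two different things.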

Note:

- All of this information is only contained within the .git/logs files. None of it is stored in the objects themselves.