
  $ git clone blah
  DAG:
    A - B - C - D - E
         \
          Z - X - Y
  Branches:
    master => E
    topic/new-feature => Y
  reflog:
    master
      E - clone from blah
    topic/new-feature
      Y - clone from blah

  $ git reset master C
  DAG:
    A - B - C - D - E
         \
          Z - X - Y
  Branches:
    master => C
    topic/new-feature => Y
  reflog:
    master
      E - clone from blah
      C - reset to C
    topic/new-feature
      Y - clone from blah
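
The same two steps can be pictured as a toy Python model. This is only a sketch: the `Repo` class and its `move` method are made up for illustration and are not how git stores anything. The point it shows is that a reset only moves the branch pointer and appends a reflog entry, while the commits themselves stay put.

```python
from dataclasses import dataclass, field


@dataclass
class Repo:
    """Toy model: branches are movable pointers, the reflog records each move."""
    branches: dict[str, str] = field(default_factory=dict)
    reflog: dict[str, list[tuple[str, str]]] = field(default_factory=dict)

    def move(self, branch: str, commit: str, reason: str) -> None:
        # Point the branch at a commit and log the move; no commit is deleted.
        self.branches[branch] = commit
        self.reflog.setdefault(branch, []).append((commit, reason))


repo = Repo()
repo.move("master", "E", "clone from blah")
repo.move("topic/new-feature", "Y", "clone from blah")
repo.move("master", "C", "reset to C")

print(repo.branches["master"])  # C
print(repo.reflog["master"])    # [('E', 'clone from blah'), ('C', 'reset to C')]
```

E no longer has a branch pointing at it, but it is still listed in master's reflog, which is how you would find it again.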

I see. I started reading the internals chapter at [1]. This free book seems better than the O'Reilly book, which I bought.

So the DAG is actually stored inside objects. The contents of the objects directory could be described by a relational schema, and I think that would make it easier for a lot of people to understand (myself included):

  Blob
  - sha1hash (primary key)
  - contents (blob)

  Tree
  - sha1hash (primary key)

  TreeEntry
  - treeid (foreign key into Tree)
  - mode (mode of blob/subtree)
  - type ("blob" or "tree")
  - objectid (foreign key into Tree or Blob)
  - name

  Commit
  - sha1hash (primary key)
  - tree (foreign key into Tree)
  - parent (foreign key into Commit; zero or more per commit)
  - author
  - committer
  - comment
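
To make the schema concrete, here is a minimal sketch using Python's `sqlite3` module. The plural table names, the short placeholder ids like `'b1'`, and the single nullable `parent` column are my own simplifications for illustration. Git does not use a database for any of this.

```python
import sqlite3

# Hypothetical table names; the columns follow the schema listed above.
SCHEMA = """
CREATE TABLE blobs (sha1hash TEXT PRIMARY KEY, contents BLOB);
CREATE TABLE trees (sha1hash TEXT PRIMARY KEY);
CREATE TABLE tree_entries (
    treeid   TEXT REFERENCES trees(sha1hash),
    mode     TEXT,
    type     TEXT CHECK (type IN ('blob', 'tree')),
    objectid TEXT,  -- points into blobs or trees, depending on type
    name     TEXT
);
CREATE TABLE commits (
    sha1hash  TEXT PRIMARY KEY,
    tree      TEXT REFERENCES trees(sha1hash),
    parent    TEXT REFERENCES commits(sha1hash),  -- NULL for a root commit
    author    TEXT,
    committer TEXT,
    comment   TEXT
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Placeholder ids stand in for real 40-character SHA-1 hashes.
conn.execute("INSERT INTO blobs VALUES ('b1', ?)", (b"hello\n",))
conn.execute("INSERT INTO trees VALUES ('t1')")
conn.execute(
    "INSERT INTO tree_entries VALUES ('t1', '100644', 'blob', 'b1', 'README')"
)
conn.execute(
    "INSERT INTO commits VALUES ('c1', 't1', NULL, 'me', 'me', 'initial commit')"
)

# List the files directly under the root tree of commit c1.
rows = conn.execute(
    """
    SELECT e.name, b.contents
    FROM commits c
    JOIN tree_entries e ON e.treeid = c.tree
    JOIN blobs b ON b.sha1hash = e.objectid
    WHERE c.sha1hash = 'c1'
    """
).fetchall()
print(rows)
```

Walking from a commit to its files is then just a couple of joins, which is roughly what a checkout does when it follows the pointers between objects.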

The tree entries are actually denormalized and stored as a list inside the tree. You could represent this more accurately with XML. But who likes XML?
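
Here is a rough sketch of what "stored as a list inside the tree" means in practice, assuming the SHA-1 object format. The helper names `hash_object` and `tree_body` are mine, and the sort is simplified (real git orders subdirectory names as if they ended in "/").

```python
import hashlib


def hash_object(kind: str, body: bytes) -> str:
    # Git hashes "<type> <size>\0" followed by the object body.
    header = f"{kind} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries: list[tuple[str, str, str]]) -> bytes:
    # Each entry is (mode, name, hex sha1). Entries are stored inline,
    # back to back, as "<mode> <name>\0" plus the 20 raw hash bytes.
    out = b""
    for mode, name, sha in sorted(entries, key=lambda e: e[1]):
        out += f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(sha)
    return out


empty_blob = hash_object("blob", b"")
print(empty_blob)  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391

body = tree_body([("100644", "README", empty_blob)])
print(len(body))                  # 34
print(hash_object("tree", body))  # id of a tree holding one empty README
```

So there is no separate TreeEntry object on disk. The entries are part of the tree's own bytes, and changing any one of them changes the tree's hash.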

[1] http://git-scm.com/book