| HN Mirror

by bananas 4529 days ago

I think this title is wrong.

Firstly some clarification - this appears to just be about the persistence format for his dive log. It was XML, now it's git based with plain text.

As someone who had to manage a system which worked with plain text files structured in a filesystem for a number of years in the 1990s, this is done to death already.

You now end up with the following problems: locking, synchronising filesystem state with the program, inode usage, file handles to manage galore and concurrency. All sorts.

Basically this is a "look I've discovered maildir and stuffed it in a git repo".
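For readers who have not met Maildir, here is a minimal sketch of the pattern being referred to: each entry is written into `tmp/` and then renamed into `new/`, so a reader never sees a half-written file and no lock is needed for delivery. The function names, the file-name scheme, and the dive-log payload are assumptions made for illustration; nothing here is taken from the program under discussion.

```python
import itertools
import os
import socket
import time

_sequence = itertools.count()  # per-process counter so names stay unique


def make_store(root):
    """Create the three Maildir-style subdirectories under root."""
    for sub in ("tmp", "new", "cur"):
        os.makedirs(os.path.join(root, sub), exist_ok=True)


def deliver(root, payload: bytes) -> str:
    """Write payload as a new entry and return its final path."""
    host = socket.gethostname().replace("/", "_")
    # Hypothetical naming scheme: timestamp, pid plus counter, host name.
    name = f"{int(time.time())}.{os.getpid()}_{next(_sequence)}.{host}"
    tmp_path = os.path.join(root, "tmp", name)
    new_path = os.path.join(root, "new", name)
    # O_EXCL makes creation fail instead of overwriting an existing entry.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(payload)
        handle.flush()
        os.fsync(handle.fileno())
    # A rename within one filesystem is atomic on POSIX:
    # readers see the whole entry or nothing at all.
    os.rename(tmp_path, new_path)
    return new_path
```

This removes the need for locks on delivery only. Inode usage, open file handles, and keeping in-memory state in step with the directory are still the application's problem.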

Not saying there is a better solution but this isn't a magic bullet. It's just a different set of pain.

4 comments

e12e 4529 days ago

> You now end up with the following problems: locking, synchronising filesystem state with the program, inode usage, file handles to manage galore and concurrency. All sorts.

Which is why he's reusing git to resolve those pain points? Well, presumably all except "synchronizing filesystem state with the program" -- where he's gone from using some kind of XML parser to marshal XML to objects/structs in RAM to using a (simple(r)?) text parser to do the same.

I'm guessing he just writes/reads a full log or part of one (a branch of the full tree, or whatever the program uses; maybe a list anchored at a date?) -- and lets git sort out the history/backup side.

So, yes, it's a different format, but I think the argument you're making is off -- seeing as he already has git for that? It's more like combining Maildir (or mboxes, only committed when valid) and git.

xsace 4529 days ago

Maybe you want to wait till he releases something. Because, you know, if he took months to get the big picture in mind, I doubt you can grasp what he envisions just by reading his comment.

bananas 4529 days ago

If it's not that, I'll eat my hat, and my pyjamas.

There's not much more to infer from the comment.

Unless he's invented a new ASN.1 encoding which plugs into libgit or something or a new text serialisation format (both unlikely).

bsder 4529 days ago

Yes, because his design of git was so well-formed.

Git is so well-designed that expert users manage to trash their repositories and propagate the damage.

Maybe that's not a problem of libgit. But tools are both the infrastructure and the UI.

taeric 4529 days ago

Not sure what you are referring to. What are some common ways "expert users" manage to "trash their repositories?"

bananas 4529 days ago

5 minutes with me and git rebase usually do the job :)

bsder 4529 days ago

Let's start here: http://randyfay.com/content/avoiding-git-disasters-gory-stor...

So, the solution to the fact that the merging UI is a pile of garbage is HAVE A SINGLE PERSON ALWAYS DO THE MERGE. Excuse me? The whole point of a distributed revision control system is so I don't have to have a single choke point. That's the definition of distributed.

Then there was the KDE disaster: http://jefferai.org/2013/03/29/distillation/

Yeah, the root fault wasn't Git. However, at no point did Git flag that something was going horribly wrong as the repository got corrupted and deleted. Other distributed SCM systems I have used tend to squawk very loudly if something comes off disk wrong.

Maybe the underlying git data structures are fine, but, man, the UI is a pile of crap.

And, I won't even get into rebase, because that seems to be a religious argument.

smharris65 4529 days ago

The issues in the randyfay.com post are due to a misunderstanding when using git as a "centralized" repo like SVN. Git, by design, does not enforce a central repo even if you designate one logically. These issues can be completely avoided if you merge the right way:

http://tech.novapost.fr/merging-the-right-way-en.html

pjc50 4529 days ago

Well, that confirms that the "obvious" workflow of "git pull" is dangerous. At least it explains all the spurious merges. Why on earth did it ship with this broken design? Why doesn't git pull do the right thing by default?

taeric 4529 days ago

I'm not sure I follow. The advice for the "single person always do the merge" is essentially make sure the people doing the merges are experts. These mistakes do not seem like the kind of thing I have heard of experts doing.

Seriously, you cannot call yourself a git expert if you think rebase is a difficult thing to explain.

Might you sometimes make mistakes? Sure. I hardly see this as a systemic thing, though.

The mirror shenanigans I agree suck. Not sure what the real takeaway is there, other than don't rely on mirror as a good form of backup.

jamesgeck0 4529 days ago

> So, the solution to the fact that the merging UI is a pile of garbage is HAVE A SINGLE PERSON ALWAYS DO THE MERGE. Excuse me?

That isn't what the post advocates. He says that having a single person approve the pull request is a good idea, but approving the pull isn't the same thing as manually doing a merge. Projects I've worked on required that the submitter merge master into their branch before their PR would be accepted.

crucialfelix 4529 days ago

I took this to mean that what he is replacing is a single XML file whose content is a tree of element nodes. Every time you make a change to that file (changing, removing or adding child nodes within the file) you would have to store a new copy of the file. The most efficient you can get is to store just the text diffs using git or something.

But what he replaces it with is a git object store. Each XML node becomes a git object. They each point to a parent (just as git commits point to a parent commit).

Now writing to this datastore means adding a new node to the git object database and changing the parent references.

Where git stores commits that are related sequentially in time, this stores nodes in a tree relationship that IS the document.
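A rough sketch of what such a store might look like, purely to make the idea concrete. It hashes each node the way git hashes a blob (SHA-1 over a `blob <size>\0` header plus the content) and keeps a parent id inside every node. The `NodeStore` class, the JSON node encoding, and the dive-log tags are all hypothetical; this is not the design under discussion, which had not been published at the time of the thread.

```python
import hashlib
import json


def object_id(data: bytes) -> str:
    """Hash data as git hashes a blob: SHA-1 of 'blob <size>', a NUL byte, then the data."""
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


class NodeStore:
    """Toy in-memory object database keyed by content hash."""

    def __init__(self):
        self.objects = {}

    def add_node(self, tag, text="", parent=None):
        # Hypothetical encoding: JSON with sorted keys, so equal nodes hash equally.
        record = {"tag": tag, "text": text, "parent": parent}
        data = json.dumps(record, sort_keys=True).encode("utf-8")
        oid = object_id(data)
        self.objects[oid] = data  # existing objects are never modified
        return oid

    def path_to_root(self, oid):
        """Follow parent pointers, much like walking commit history."""
        tags = []
        while oid is not None:
            node = json.loads(self.objects[oid])
            tags.append(node["tag"])
            oid = node["parent"]
        return tags


store = NodeStore()
log = store.add_node("divelog")
dive = store.add_node("dive", "2014-01-01", parent=log)
depth = store.add_node("depth", "18 m", parent=dive)
print(store.path_to_root(depth))
```

With parent pointers, adding a node is a single new object. The cost is that a node's id depends on its parent's id, so editing a node near the root gives every descendant a new id as well.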

If he's not talking about this then I'd like to officially take credit for my weird idea right now.

dangoor 4529 days ago

The impression I got was that he was going to store his data in a git object database and that the files would be virtual in there. It would be like the .git directory without the working files on disk. It's all just conjecture until his code is out.
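To illustrate what "files that only exist inside the object database" can mean, here is a small sketch that writes and reads a blob in git's loose-object layout: the content is prefixed with a `blob <size>\0` header, zlib-compressed, and stored under a directory named after the first two hex digits of its SHA-1. The function names are made up for this example, and it uses a bare directory rather than a real repository.

```python
import hashlib
import os
import zlib


def write_loose_object(objects_dir: str, data: bytes) -> str:
    """Store data as a loose blob under objects_dir and return its id."""
    store = b"blob " + str(len(data)).encode("ascii") + b"\0" + data
    oid = hashlib.sha1(store).hexdigest()
    directory = os.path.join(objects_dir, oid[:2])
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, oid[2:])
    if not os.path.exists(path):  # same content, same name: nothing to rewrite
        with open(path, "wb") as handle:
            handle.write(zlib.compress(store))
    return oid


def read_loose_object(objects_dir: str, oid: str) -> bytes:
    """Load a loose blob and strip its header."""
    path = os.path.join(objects_dir, oid[:2], oid[2:])
    with open(path, "rb") as handle:
        store = zlib.decompress(handle.read())
    _header, _, data = store.partition(b"\0")
    return data
```

No working file is ever created: the program hands bytes to the store and keeps only the returned id, which is what makes the files "virtual".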

Regardless, I would think that some applications are simple enough (store few enough separate objects in the file system) that the issues you cite are not likely to cause a problem.

mixedbit 4529 days ago

What you describe is quite similar to how gollum wiki uses git for storage: https://github.com/gollum/gollum
