Hacker News new | ask | show | jobs
by yid 3941 days ago
> It requires zero maintenance and runs on your own infrastructure.

I don't understand how these two claims are not contradictory.

> All conflict resolution happens locally in each peer using a deterministic algorithm.

How exactly does this happen? Via CRDTs? Consensus? How do you handle conflicts?

I'm very skeptical about this sort of largely content-free copy fronting what should be a pretty complicated distributed system.

2 comments

Good point, haven't heard that one before. Let me clarify. GUN requires no additional maintenance because it gets embedded into your app, compared to most databases which require you to run (and maintain) a database server.

If you do not want to even run your own app servers, I'm more then happy to let you use my free GUN backend so that way you do not even have to worry about that.

Consensus? Absolutely not. I definitely need to add more documentation and resources on this - in fact I'm doing a talk on how my conflict resolution algorithm works out in Berlin in a few weeks. Check the conference out at: https://2015.distributed-matters.org/ber/ . After that I should have much better materials on this available for people.

Briefly and naively, the way it works is a Vector Clock + Timestamp combo. Timestamps have bad exploits, and Vector Clocks don't work well in ephemeral environments. If you combine the two together they compensate for the vulnerabilities of the other. Every peer acts as a state machine operating inside of an open-closed boundary function (nothing fancy here, this is the same stuff you learned in middle/highschool). This does NOT give you Global Consistency, as different peers might have slightly different boundaries because of clock drift (you can use a separate service to minimize this), but they will all become Eventually Consistent.

This helpful? Any other questions?

> If you combine the two together they compensate for the vulnerabilities of the other.

Do you have a reference for this? I don't mean to keep sounding like a skeptic, but these are pretty big claims. I'm also genuinely curious about this claim and would really like to look into it further.

No worries, getting database stuff correct is important and you have a right to know how things work before building anything on top of it. You don't want to build a house on a cracked foundation, I'm the same way and I'm trying my best to engineer these things properly.

Here is a link explaining the algorithm https://github.com/amark/gun/issues/87#issuecomment-13636276... pretty briefly, it is also in a terrible location that nobody would know to look for (I need to move it out into the wiki or something).

And to answer your question directly, here is how they compensate each other:

1. Timestamps' vulnerability is to accidental or malicious clock drift. I can change my machine's local clock to be 2 years in the future. If you use timestamps to decide who "wins" in a conflict, my edits will win for the next 2 years. That sucks and is evil.

2. Vector clocks were invented to get around some of these problems. You increment a vector on every local change such that it is higher than the highest known vector. So Alice updates a value to "Hello World" at state 1, then to "Hello Mars" at state 2, if she then receives an update from Bob of "Hello Jupiter" at state 5, Alice then has to jump all the way up to and past Bob to change the value - say "Hello Pluto" at state 6. This gets around the timestamp vulnerability, because even if Bob were to say the update is at state 999998 all Alice has to do is increment it again to 999999, she doesn't have to wait 2 years or corrupt her clock.

3. Vector clocks' vulnerability is that since the clock is relative to the machine, if the machine reboots it loses its clock. Or even if the machine persists it, it has to play "catch up" when it comes back online - but while it is coming back online nothing stops two machines from accidentally incrementing to the same conflicting clock. At this point you are skrewed, unless you implement some other deterministic resolution - and plenty do exist. But the point of this is that vector clocks were designed for fairly permanent machines, but we now live in a emphemeral world where we might spin up a hundred servers to handle some load and then shut them down. If you do this, you lose the machine's vectors.

4. But timestamps don't have this problem, machines spinning up and down usually do some sort of NTP that gives them a rough estimate of time - drift aside, they don't need to remember anything. So when you combine these two together you get a vector timestamp relative to other vector timestamps. Aka every update includes its local timestamp (which might have drift) but the receiving computer calculates a vector relative to its own local timestamp (which might also have drift). If the sending peer is being malicious, the computed vector will be large which then receiving peer can use to mitigate the timestamp exploit. Equally as much, you can have any number of ephemeral peers coming and going through the network without fear of conflicts occurring or losing vectors.

Does that make sense? I'll be presenting on these subjects at the conference, which Kyle Kingsbury is doing the keynote. So I'll have an opportunity to talk to him and I'm hoping he'll also review the algorithm (the actual algorithm is in the code which you should look at, and I linked to a brief explanation of it at the top of this post) and help me set up Jepsen tests.

Overall, I need a LOT more documentation on this and I'm also wanting to formally verify it with TLA or Coq. We'll also be building a battle testing suite to hammer gun on in real deployed environments to see where problems lie. So please, give it a shot and slam me with any questions or problems you encounter. Have you seen this demo? https://medium.com/@marknadal/gun-0-2-0-pre-release-auto-rec...

Cheers!

Yes, I am skeptical about automatic conflict resolution too. But the project is interesting and uses many right concepts.