Hi Peter, sounds pretty awesome! Quick question though - on your front page you say that CockroachDB does SQL - but if it can't do a join, then how can you say it uses SQL? Or is distributed SQL a different thing entirely? It does sound like a very limited SQL subset though... I'm sure I must be missing something as I'm not familiar with your product.
Also, what levels of isolation do you actually offer? Serialized snapshot isolation appears to be MVCC, but I see you also have just snapshot isolation - what is the difference?
Edit: oh brother, the proof you link to, I just realized I bought that book some time ago and never got around to reading it... Transactional Information Systems by Weikum & Vossen, right? Time for me to hit the books I guess. Still trying to get my head around that second graph you have drawn, can't work out how you got it :(
SQL is not a single language but rather a family of languages. Each RDBMS that advertises support for SQL ends up implementing a different flavor of SQL, and most of the time they are not even compatible with each other :)
For now CockroachDB supports a subset of the SQL implemented by other databases, with some extensions of its own. This subset will grow over time.
The two levels of isolation offered are snapshot (SI) and serializable (SSI). Snapshot means that concurrent transactions are atomic with regard to each other and each "sees" a consistent initial state from the DB. Serializable adds that they can't introduce write skew, i.e. the result of concurrent transactions is as if they had run one after another in some order, each "seeing" the effects of the ones before it.
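To make the difference concrete, here is a rough toy model of write skew in Python. It is only a sketch and not how CockroachDB works internally: the names `Store`, `Txn`, `go_off_call` and `run` are made up for illustration, and the "serializable" mode uses a simple read-validation check at commit time as a stand-in for real SSI.

```python
# Toy model of snapshot isolation (SI) vs. a serializable check.
# All names are hypothetical; this is not CockroachDB's implementation.

class Conflict(Exception):
    """Raised when a transaction must abort at commit time."""


class Store:
    def __init__(self, data):
        self.data = dict(data)
        self.version = {key: 0 for key in data}  # commit counter per key


class Txn:
    def __init__(self, store, serializable=False):
        self.store = store
        self.serializable = serializable
        self.snapshot = dict(store.data)   # state "seen" at start
        self.seen = dict(store.version)    # key versions at start
        self.reads = set()
        self.writes = {}

    def read(self, key):
        self.reads.add(key)
        return self.writes.get(key, self.snapshot[key])

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # SI validates only the keys this transaction wrote.
        # The serializable mode also validates the keys it read
        # (a conservative assumption standing in for SSI).
        to_check = set(self.writes)
        if self.serializable:
            to_check |= self.reads
        for key in to_check:
            if self.store.version[key] != self.seen[key]:
                raise Conflict(key)
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.version[key] += 1


def go_off_call(txn, me, other):
    # Intended invariant: at least one doctor stays on call.
    if txn.read(other):
        txn.write(me, False)


def run(serializable):
    store = Store({"alice": True, "bob": True})
    t1 = Txn(store, serializable)
    t2 = Txn(store, serializable)
    go_off_call(t1, "alice", "bob")
    go_off_call(t2, "bob", "alice")
    t1.commit()
    try:
        t2.commit()
    except Conflict:
        pass  # t2 aborts; a real client would retry it
    return store.data


print(run(serializable=False))  # both transactions commit
print(run(serializable=True))   # the second transaction aborts
```

Under plain SI both transactions commit because they wrote different keys, so nobody is left on call even though each transaction checked the invariant against its own snapshot. With the extra read check, the second transaction aborts and the invariant holds.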
Hi Chris, we describe CockroachDB as a SQL database because that is what we're aspiring to. The missing functionality (i.e. joins) is on our near-term roadmap.
Joins over a distributed database aren't easy. Love to see what you have going! The main issue I see with distributed joins is that they need to be done in a single transaction - if even one table gets an insert, delete or update, then it invalidates the join. But this distributed serialised snapshot isolation sounds like it might be the best way around it.
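As a rough illustration of why snapshot reads help here, this is a toy multi-version store in Python where a join reads both tables at one timestamp. Everything in it (`VersionedTable`, `join_at`, the sample rows and timestamps) is hypothetical and only sketches the general MVCC idea, not CockroachDB's actual design.

```python
# Toy multi-version tables: every row carries its commit timestamp.
# Hypothetical sketch; deletes and updates are omitted for brevity.

class VersionedTable:
    def __init__(self):
        self.rows = []  # list of (commit_ts, row)

    def insert(self, commit_ts, row):
        self.rows.append((commit_ts, row))

    def scan(self, read_ts):
        # Return only rows committed at or before the read timestamp.
        return [row for ts, row in self.rows if ts <= read_ts]


def join_at(users, orders, read_ts):
    # Both sides of the join are read at the same timestamp,
    # so later writes cannot change the result.
    names = {u["id"]: u["name"] for u in users.scan(read_ts)}
    return [
        (names[o["user_id"]], o["item"])
        for o in orders.scan(read_ts)
        if o["user_id"] in names
    ]


users = VersionedTable()
orders = VersionedTable()
users.insert(1, {"id": 1, "name": "ann"})
orders.insert(2, {"user_id": 1, "item": "book"})

before = join_at(users, orders, read_ts=2)
orders.insert(3, {"user_id": 1, "item": "pen"})  # a later concurrent write
after = join_at(users, orders, read_ts=2)

print(before == after)  # the join at timestamp 2 is unaffected
```

A join that reads at a fixed timestamp is not invalidated by the later insert; a new join at timestamp 3 would see the extra row.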
You may find that you will have to choose at most two of these three: joins, speed and convenience. In other words, you won't be able to join quickly in a way that is convenient for users, or if you want users to join like they can in a SQL database, it will be much slower.