Hacker News
by brianwawok 3234 days ago
P.S. Google Spanner and FaunaDB both shard. They can call it something else. But unless every node has all data on it, it is sharded.
1 comments

It is true that Spanner and FaunaDB partition a cluster's dataset across multiple nodes, but that partitioning is handled transparently by the database. Whenever I've heard the term "sharding" it's usually in reference to the application-level sharding described in the article.
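For contrast, a minimal sketch of what application-level sharding tends to look like: the application, not the database, hashes the key and picks the node. The node names and the `shard_for` helper are hypothetical and are not taken from Spanner, FaunaDB, or the article.

```python
import hashlib

# Hypothetical node names; with application-level sharding the app owns this list.
SHARDS = ["db-0", "db-1", "db-2", "db-3"]

def shard_for(key: str, shards=SHARDS) -> str:
    """Pick a shard with a stable hash (Python's built-in hash() is salted per process)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]

# Every read and write in the application has to go through this routing step.
target = shard_for("user:42")
```

With a database that partitions transparently, this routing still happens, but inside the database rather than in application code.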

Partitioning the dataset isn't really novel these days (Cassandra, Riak, Mongo, et al. do the same, of course), but the significant difference is that both Spanner and FaunaDB implement ACID transactions distributed across partitions. It no longer matters for application correctness which partition key you choose if you can involve any arbitrary set of records in a single transaction.

(Ozgun from Citus Data)

> Whenever I've heard the term "sharding" it's usually in reference to the application-level sharding described in the article.

I wanted to drop a quick clarification note here. In the article, I used the term "sharding" to refer to both application-level and database-level sharding.

For anyone that's looking at sharding as an option for scaling, we're always happy to chat and help point you in the right direction. My email's ozgun @ citusdata.com

If you're looking for databases that come with built-in sharding, I'd definitely check out Citus (then again, I'm biased): https://www.citusdata.com/

Abstractions are leaky. As soon as you run into a query that runs fast when the data is on one shard and slow when it's not, you now need to know about sharding.
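A toy sketch of that leak, under loud assumptions: `ToyShardedTable`, its modulo placement, and the `country` column are all made up for illustration and do not model any real database. A lookup by the shard key touches one shard; a filter on any other column has to visit every shard.

```python
class ToyShardedTable:
    """In-memory stand-in for a table partitioned by user_id (hypothetical)."""

    def __init__(self, num_shards: int):
        self.shards = [dict() for _ in range(num_shards)]

    def _shard_index(self, user_id: int) -> int:
        # Simple modulo placement; real systems use hashing or key ranges.
        return user_id % len(self.shards)

    def insert(self, user_id: int, row: dict) -> None:
        self.shards[self._shard_index(user_id)][user_id] = row

    def get_by_user_id(self, user_id: int):
        """Lookup by shard key: returns (row, number of shards touched)."""
        shard = self.shards[self._shard_index(user_id)]
        return shard.get(user_id), 1

    def find_by_country(self, country: str):
        """Filter on a non-key column: returns (user_ids, number of shards touched)."""
        matches = []
        for shard in self.shards:  # scatter-gather across every shard
            matches.extend(uid for uid, row in shard.items() if row["country"] == country)
        return sorted(matches), len(self.shards)


table = ToyShardedTable(num_shards=4)
for uid in range(8):
    table.insert(uid, {"country": "US" if uid % 2 == 0 else "DE"})

row, touched_by_key = table.get_by_user_id(5)         # touches 1 shard
ids, touched_by_scan = table.find_by_country("US")    # touches all 4 shards
```

Both calls look the same to the caller, which is the point: the difference in work only shows up in latency.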