by moshmosh 1856 days ago
IME the answer to "how did they make [hard to scale thing] easily scalable?" is usually that they introduce limitations in how you can use [hard to scale thing] so you can't use it in ways that are hard to scale, then automate scaling it in well-known ways for the use cases that fit within those limits. Vitess's site mentions that it relies on horizontal sharding, so right off the bat, my guess is that you can't use it in ways that are sharding-unfriendly, or if you can, you'll lose much of the "magic" of it when you do.

Rarely is it the case that someone's actually discovered e.g. novel math or something to make the hard part easier. Better tools (to do well-understood things more easily for this use case) and restrictions (so you don't use it in ways the tools can't handle) are the usual way.
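To make "sharding-unfriendly" concrete, here is a minimal sketch of the general idea, not Vitess's actual routing. It assumes a hypothetical setup of four shards keyed on a `user_id` column; the names `NUM_SHARDS`, `shard_for`, and `shards_to_query` are made up for illustration. A lookup that includes the sharding key can go to exactly one shard, while a lookup that omits it has to fan out to every shard.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical shard count, chosen only for illustration


def shard_for(key: str) -> int:
    """Map a sharding-key value to a shard index using a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def shards_to_query(filters: dict, shard_key: str = "user_id") -> list[int]:
    """Return the shard indexes a lookup with these filters must touch."""
    if shard_key in filters:
        # Sharding-friendly: the key pins the lookup to a single shard.
        return [shard_for(str(filters[shard_key]))]
    # Sharding-unfriendly: without the key, every shard has to be asked.
    return list(range(NUM_SHARDS))
```

Under these assumptions, `shards_to_query({"user_id": 42})` returns a single shard, while `shards_to_query({"email": "a@example.com"})` returns all four. That second case is the kind of usage a sharded system tends to restrict or make more expensive.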

1 comments

The "ease" we used to refer to in Vitess primarily relates to its interaction with the application side, where it basically presents itself as "one big MySQL datastore". It uses a standard MySQL connector, and in general, once you have the infrastructure up and running with a compatible schema design, there's not too much to worry about from a coding standpoint. Sharding happens transparently to the application code, which generally translates to fewer code changes required.

Admittedly, that view left out the considerable challenge of actually deploying and running the infrastructure, designing and optimizing that schema, along with all the joys of managing large cluster environments.

That's what PlanetScale, the product, aims to solve. Dealing with clustering infrastructure IS a hurdle for most teams to overcome, and though Vitess' feature set and compatibility have expanded greatly to accommodate some of the most demanding use cases on the web today, a lot of its functionality can still be out of reach for a developer just trying to merge some code and a schema change.

Abstracting as much of that complexity as possible away from the end user is the goal, as well as making their lives easier with a ton of the functionality we've always wanted to see built with Vitess. I can confirm that that is not an "easy" job on our end. :)

> once you have the infrastructure up and running with a compatible schema design, there's not too much to worry about from a coding standpoint

Isn’t that the same for a normal sharded MySQL database though?

Depends. Have any examples of "normal" sharded MySQL databases?

EDIT: To clarify, sharding is not a standard feature included in Community Edition MySQL. Over the years, there have been various Oracle-initiated attempts at providing it as an enterprise scaling strategy through MySQL (NDB) Cluster, MySQL Fabric, etc., but these have ended up having limited applicability outside very specific use cases and are not widely in use.

Most large MySQL users (e.g. Facebook or YouTube) ended up rolling their own frameworks, like Vitess, which has since been open sourced and adapted to more diverse environments. Until that became more accessible, though, the rest of the world mostly made do with wobbly multi-master setups relying on circular replication, behind some kind of proxy, or had to implement the sharding logic themselves in their application code.
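For a sense of what "sharding logic in application code" tends to look like, here is a toy sketch under loud assumptions: in-memory SQLite databases stand in for separate MySQL servers, and the `ShardedUsers` class, its `users` table, and the crc32-modulo routing are all hypothetical. It is not how Vitess or any particular company does it. The point is only that the application, not the database, decides where each row lives.

```python
import sqlite3
import zlib


class ShardedUsers:
    """Toy application-level sharding: the app itself picks the database."""

    def __init__(self, num_shards: int = 2):
        # Each in-memory SQLite database stands in for one MySQL shard.
        self.shards = [sqlite3.connect(":memory:") for _ in range(num_shards)]
        for conn in self.shards:
            conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

    def _conn_for(self, user_id: int) -> sqlite3.Connection:
        # crc32 gives the same result in every process, so routing is stable.
        index = zlib.crc32(str(user_id).encode("utf-8")) % len(self.shards)
        return self.shards[index]

    def add(self, user_id: int, name: str) -> None:
        conn = self._conn_for(user_id)
        conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (user_id, name))
        conn.commit()

    def get_name(self, user_id: int):
        # Keyed lookup: only the one shard that owns this id is queried.
        row = self._conn_for(user_id).execute(
            "SELECT name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return row[0] if row else None

    def count_all(self) -> int:
        # Cross-shard question: the app must ask every shard and combine.
        return sum(
            conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
            for conn in self.shards
        )
```

Every query path in the application has to go through something like `_conn_for`, and anything that spans shards (counts, joins, schema changes) becomes the application's problem. Changing the number of shards with this naive modulo scheme would also remap most keys, which is part of why hand-rolled setups get painful.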

Damn, I never knew this. I think the first time I ever needed sharding it was just available in whatever version of MySQL we were using at the time (through some plugin, presumably). I never needed it again and ever since just assumed it was the default.

Thanks for the correction!