by chad_walters 3848 days ago
MySQL may work well for this small data set (200GB). Start working with 10s of TBs of data and you will start to understand why NoSQL stores were built.

My thought exactly. This is just a scale that can be solved either way; when you really can't fit your data on even a handful of machines with acceptable performance, then Cassandra can start to shine.
People are running petabyte-sized (1000TB) databases on SQL. One example is from Nasdaq, https://customers.microsoft.com/Pages/CustomerStory.aspx?rec...

Meanwhile, NoSQL does not automatically mean "10s of TBs of data". Check this slideshow explaining the challenges of MongoDB (the poster-child NoSQL database) "scaling to 100GB and beyond":

http://www.slideshare.net/mongodb/partner-webinar-the-scalin...

Did you look at Vitess [0]? It handles sharding/replication of MySQL up to PBs of storage and 10s of thousands of connections. Also, it implements caching at the proxy level, so you don't need to use memcached. If multiple requests for the same resource are sent to a vttablet (shard proxy) at the same time, only one is forwarded to the database and all of them receive the same result.

[0] http://youtu.be/midJ6b1LkA0
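
For anyone wondering what that "only one is forwarded" behaviour looks like, here's a rough Python sketch of request coalescing. To be clear, this is not Vitess's code (Vitess is written in Go); CoalescingProxy, slow_query, run_demo and the "user:42" key are all made up for illustration, and a real proxy would also have to deal with timeouts and invalidation.

```python
import threading
import time


class _Call:
    """State shared by every caller waiting on one in-flight key."""

    def __init__(self):
        self.done = threading.Event()
        self.result = None
        self.error = None


class CoalescingProxy:
    """Hypothetical stand-in for a shard proxy (an assumption, not Vitess)."""

    def __init__(self, backend):
        self._backend = backend      # callable: key -> result
        self._lock = threading.Lock()
        self._in_flight = {}         # key -> _Call currently being served

    def get(self, key):
        with self._lock:
            call = self._in_flight.get(key)
            leader = call is None
            if leader:
                # First caller for this key: it will run the real query.
                call = _Call()
                self._in_flight[key] = call

        if leader:
            try:
                call.result = self._backend(key)
            except Exception as exc:
                call.error = exc
            finally:
                with self._lock:
                    del self._in_flight[key]
                call.done.set()      # wake every waiting follower
        else:
            # Followers never touch the backend; they wait for the leader.
            call.done.wait()

        if call.error is not None:
            raise call.error
        return call.result


def run_demo(n_clients=5):
    backend_calls = []

    def slow_query(key):
        # Pretend database query; the sleep keeps it in flight long
        # enough for the other clients to pile up behind it.
        backend_calls.append(key)
        time.sleep(0.3)
        return "row-for-" + key

    proxy = CoalescingProxy(slow_query)
    results = [None] * n_clients

    def client(i):
        results[i] = proxy.get("user:42")

    threads = [threading.Thread(target=client, args=(i,)) for i in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(backend_calls), results


if __name__ == "__main__":
    calls, results = run_demo()
    print("backend calls:", calls)
    print("results:", results)
```

The thing to notice is that the in-flight entry is dropped as soon as the query finishes, so this only collapses requests that overlap in time. It is not a cache with a TTL, and a request arriving a moment later goes to the database again.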

The question then is: what is NoSQL? Is MongoDB a NoSQL engine? Can you scale it to 10s of TBs? Can you really scale out MongoDB? I can ask the same of Redis and a whole line of other NoSQL engines that are not really scale-out solutions.

(Yes, you can shard both Mongo and Redis, as well as MySQL, and get to 10s of TBs.)