by socratic 5331 days ago
This seems like an odd analysis if you mean that MongoDB is hitting the trough of disillusionment. MongoDB, Cassandra, HBase, and Redis all came out at roughly the same time (2008, 2009) according to Wikipedia and their project pages. Is there a reason they would be on totally different hype cycles?

As far as I can tell, no one has hated on Redis or HBase (except for the brief period when antirez tried to add VM to Redis) because they both (a) work and (b) solve real use cases. Has there been any suggestion that Redis or HBase lose data?

However, maybe you are right in a more general sense. Do you think that the idea of NOSQL itself is reaching the trough of disillusionment? Are we seeing a shake out of which of these data stores are actually designed by people who know what they are doing, both in (database) theory and in (systems coding) practice?


Oh I do mean that -- I don't think their time in existence affects the rate at which you move along the hype cycle; I think popularity and deployment do.

I would say Mongo is the most popular NoSQL data store at the moment, whether by mindshare or deployments, and that is what caused it to move along the cycle so much faster.

I don't mean to detract from any of the other NoSQL projects; they don't have the marketing or manpower budget that 10gen has, so I wouldn't expect them to be at the same place in the cycle. MongoDB came on the scene with the only NoSQL solution that promised SQL-esque queries, order-of-magnitude jumps in performance AND a big commercial company behind it. To anyone trying to understand "NoSQL", it was the clearest and safest place to look.

Since then we've seen the cracks in that original argument (fast and unstable means terror in production), and 10gen has changed focus as needed and addressed those. During that time 10gen wasn't just coding, like a lot of these other projects; they were putting on conference after conference, garnering mindshare and getting developers on board.

An open source Apache project just won't move along a hype path as quickly as a force like that.

(I am making no statement towards quality, performance or worthiness... just positions on the hype-cycle).

  > Do you think that the idea of NOSQL itself is reaching 
  > the trough of disillusionment? Are we seeing a shake out 
  > of which of these data stores are actually designed by 
  > people who know what they are doing, both in (database) 
  > theory and in (systems coding) practice?
I couldn't have phrased it better; yes I think this is exactly what is happening.

In the early days it was so exciting to see different ways to store/retrieve data. We had been with SQL for decade(s) and it was very exciting to see something new/fresh and fast pop up.

Then everyone started storing data every which way they could think of.

Then a few of us started solving problems with those new ideas... so far so good.

Then some of those projects, and new projects built on those new techniques, blew up in popularity, and suddenly the "real world" came knocking and we started to actually test the mettle of these things in production... with disk failures, network failures, power failures and administration failures.

Like shaking out a rug, the weakest approaches got shaken out and the strongest teams/products weathered the storm to grow stronger and more stable.

2011 was the year NoSQL "grew up". I imagine 2012 and 2013 will be the years that NoSQL comes all the way out of the trough of disillusionment and, in a metaphysical sense, "goes into production".

I mean that in the most hand-wavy way, not literally... literally LOTS of people have it in production.

I mean it in the sense that you stop seeing articles like the ones that sparked all the Mongo hype recently, or articles about horrible shortcomings or failures of XYZ datastore.

Early on the teams making the NoSQL solutions AND the users didn't really understand where this boat was going or how the puzzle pieces fit together... they just kept working and refining.

This entire year we've seen more and more specialization in the NoSQL community:

  - Antirez gave up on data-larger-than-ram approaches and 
    wants to focus Redis on what it is amazing at: being 
    fast, in memory.
  - CouchDB, building on its uniquely awesome m-m 
    replication, moves into the mobile space with data sync 
    solutions that are awesome.
  - MongoDB keeps replacing MySQL in production at many 
    large-scale startups in the valley; showing more and 
    more the exact migration path to take.
  - Cassandra becomes markedly easier to use with CQL and 
    combined with its CouchDB-esque replication behavior, 
    suddenly makes all sorts of sense in densely populated 
    deployments.
Back in 2010 I couldn't have told you which NoSQL solution was best for which job... closing in on the end of 2011 it is glaringly obvious to me when you would use Redis and when you would use CouchDB (for example).

This seems silly in hindsight, but I don't think we or the teams really honestly knew where this trip was taking the technology a year or more ago.

2012 will be a year of polish, stability and deployments.

2013 will be production deployments and replacing MySQL in more and more places.

2015, it all starts all over again as SSD-optimized data structures and data stores revamp our understanding of databases :) -- I am half-kidding.

That's my 2 cents anyway.

Cassandra has CouchDB-esque replication?
In the most general sense (master-master) yes, but in a more detailed sense... not really.

Cassandra and Riak have a similar replication model -- they are deployed into a "ring" and the data in the ring is distributed across some (or all) of the nodes depending on your ReplicationFactor (how many nodes to copy each piece of data to).

If you query for a piece of data that a node doesn't have, it hashes the query and routes you to the node that does have it.
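
To make the ring idea concrete, here is a rough sketch of the hash-and-route behavior described above. It is a toy model and not how Cassandra or Riak actually implement it: the `Ring` class, the node names, the use of MD5 for ring positions and the "next N nodes clockwise" placement rule are all assumptions made for illustration.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Map a node name or a key onto the ring (MD5 is an illustrative choice)."""
    return int(hashlib.md5(name.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy hash ring: each key is stored on `replication_factor` nodes."""

    def __init__(self, nodes, replication_factor):
        # Never ask for more copies than there are nodes.
        self.replication_factor = min(replication_factor, len(nodes))
        # Nodes sorted by their position on the ring.
        self._ring = sorted((ring_position(n), n) for n in nodes)
        self._positions = [pos for pos, _ in self._ring]

    def replicas_for(self, key):
        """Return the nodes holding `key`: the first node clockwise from the
        key's position, plus the following nodes up to the replication factor."""
        start = bisect.bisect_right(self._positions, ring_position(key))
        size = len(self._ring)
        return [
            self._ring[(start + i) % size][1]
            for i in range(self.replication_factor)
        ]

    def route(self, contacted_node, key):
        """If the contacted node holds the key, it answers; otherwise the
        request is forwarded to a node that does hold it."""
        owners = self.replicas_for(key)
        return contacted_node if contacted_node in owners else owners[0]


# Hypothetical four-node cluster where each piece of data lives on two nodes.
ring = Ring(["node-a", "node-b", "node-c", "node-d"], replication_factor=2)
owners = ring.replicas_for("user:42")
print(owners)                        # the two nodes that hold "user:42"
print(ring.route("node-a", "user:42"))  # node-a itself, or the node it forwards to
```

Setting the replication factor equal to the number of nodes gives you the "or all" case, where every node holds every key and nothing ever needs to be forwarded.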

CouchDB is a bit different, in that by default it treats every node as a master and replicates it in its entirety to any other nodes registered as a replication target.

You can shard with something like BigCouch, but that is 3rd party.

This is different than Mongo which is master-slave-slave-* or Redis which I believe is master-slave as well (I never got a clear answer on how "slave" nodes in Redis resolve or push changes back upstream to the master).