by pablohoffman 4782 days ago
We went with HBase. Cassandra would have been suitable too, but we already use Hadoop for data processing, so HBase was the natural choice within our infrastructure. We will write a follow-up about that.

Clouderan here! Glad to hear you guys went with HBase; I'm looking forward to your follow-up post. Will you detail your key design and architectural setup?

Did you guys roll your own HBase environment or did you go with CDH? If you're using CDH and have any questions, feel free to shoot an email to cdh-user.

We are using CDH4.2 and have had a very positive experience so far.

Cloudera has in fact been an inspiration for us. You guys have really struck the right balance between open source and commercial support. We follow the same philosophy with Scrapy (an open source web crawling framework) as you do with Hadoop and its ecosystem.

That's really awesome to hear, thanks for your kind words. I'm looking forward to the follow-up blog post. Depending on your key design, you may be able to take advantage of Impala for ad-hoc SQL queries.