Hacker News
by threeseed 4107 days ago
Everyone always considers Apache as being synonymous with old school Java technologies. But they have a set of newer big data tools that are integral components at almost every major corporation or serious startup these days:

Accumulo, Ambari, Avro, Cassandra, CouchDB, Falcon, Flume, Hadoop, HBase, Hive, Kafka, Knox, Oozie, Phoenix, Pig, Samza, Spark, Sqoop, Storm, Tez, ZooKeeper.

I can imagine many database companies, in particular Oracle and Teradata, wishing Apache weren't as fantastically competent as it is.

4 comments

But to the OP's question, many, if not the majority, of those technologies are written in Java. Definitely the majority are on the JVM, since a few of the non-Java ones are Scala and Clojure.

I think the simple answer is that while lots of people do not love programming in Java, it can be attractive for projects that want relatively good performance without trying to implement a system in C++. There are also a lot of tools and companies that have investments in deploying JVM applications.

This is pretty much exactly what I expected to see, once I figured out the potential for most of the hardest portability concerns to be isolated into the JVM. Building the future has always been about stable platforms.
Add Lucene (and, by extension, Elasticsearch) to that list. In retrospect, I don't know how we did things before some of these projects came out.
Don't forget about Solr!
Avro is awesome, and I miss working with it.

At my last company, our main method of IPC was sending Avro messages over AMQP (using Apache Qpid). It was the best IPC I've worked with.