by jdwittenauer
3365 days ago
I think the term "Hadoop" is becoming almost meaningless. It now seems to be more of a pointer to a basket of distributed processing technologies that run on YARN/HDFS. I agree completely about there being multiple technologies to solve every problem; that's one of the most confusing parts to learn. My own perspective is that there are lots of businesses that haven't yet needed the capabilities provided by a platform like Hadoop, but they likely will in the future. So the market may be saturated based on current needs, but that market will continue to expand. Whether it's Hadoop (YARN/HDFS/etc.) that wins that market share or some other stack like Spark/Mesos remains to be seen.
You reference the MapR distribution for their training material, and it's interesting that their version of HDFS is a reimplementation in C++ (MapR-FS). It's part of the reason I settled on MapR to use tools like Apache Drill, because the filesystem becomes usable to non-Hadoop tools (e.g. Awk) via NFS.
Given the shift in some categories away from MapReduce to other approaches, could Hadoop eventually just become a collection of distributed filesystems and job schedulers?