Enterprise Hadoop: The Ecosystem

I just finished the training “Designing and Building Big Data Applications” from Cloudera. Over the last two years, the Hadoop world has become more mature. What surprises me is how many open source projects there are, which also confuses me. If you plan to adopt Hadoop, which technology stack should you choose?

Here is Hortonworks’ view. The company is a big fan of Apache projects, but I didn’t see a clean enterprise solution. Its business model is more like Debian Linux.

[Screenshot of Hortonworks’ view of the Hadoop stack, 2014-07-11]

How about Cloudera? It has CDH, and in particular the Data Hub Edition. Its business model is similar to Red Hat’s: it builds on Apache projects but adds many enhancements. See Ref [5] for what’s in the latest CDH 5.0.3.

The Hadoop world is still growing lightning fast, so let’s see what happens in the next two years.

Pay attention to Spark.

References:

  1. http://cloudera.com/content/cloudera/en/training/courses/big-data-applications-training.html
  2. http://www.cloudera.com/content/cloudera/en/products-and-services/cloudera-enterprise.html
  3. http://www.cloudera.com/content/cloudera/en/products-and-services/product-comparison.html
  4. http://hortonworks.com/hadoop/
  5. http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH5/latest/CDH-Version-and-Packaging-Information/cdhvd_cdh_package_tarball.html