Enterprises are increasingly looking to take advantage of the open source Hadoop framework to store and process big data.
With several distributions to choose from, including ones from Hortonworks Inc., MapR Technologies, EMC and Cloudera, organizations have a number of options. Recently Dan Woods interviewed Cloudera co-founder and CTO Amr Awadallah about the company’s products, what it will do with the cash it recently raised (which includes US$740 million from Intel) and Awadallah’s view of competitors.
Along the way he gives an interesting insight into what a developer who bases products on an open source platform has to deal with.
Cloudera makes a mix of open source-based solutions such as CDH (which includes Hadoop plus a user interface, security and integration with enterprise hardware and software) and proprietary solutions such as Cloudera Manager for cluster management and Navigator for data management.
The free Cloudera Express comes with Manager, while the paid Cloudera Enterprise includes Manager and Navigator.
“In the first two years of Cloudera’s history, actually, we were 100 per cent open source, literally,” Awadallah says. “There was nothing in our platform that was proprietary. And right away, we would see some of the largest consulting organizations out there, the SIs, professional services organizations, saying, ‘Hey, we can do everything that Cloudera can do for you. All the software they develop goes into open source. It’s free. We can grab it and we will do it for a fraction of the price that Cloudera is quoting you.’
“Not only that, we saw some of the biggest data vendors in the space, the biggest IT vendors in the space, without naming names as well, saying, ‘Hey, we’re going to do Hadoop too. We’re going to have our own Hadoop distribution, we’re going to take everything Cloudera is doing, and, you, our customer, you’re already paying us a contract of US$10 million per year. We are going to support this Hadoop distribution for you for free. You don’t have to pay us anything.’ And we just can’t compete with that.”
“So that’s why we keep a proprietary component. It’s not about locking the customer in, it’s about locking competitors out.”
However, he stressed, Cloudera’s platform remains open source, so the data isn’t in a proprietary format. He also has a few choice words for one of the biggest names in open source computing, Red Hat.
It’s an interesting perspective, and one that IT managers should keep in mind when they’re evaluating open source solutions.