IT professionals are scrambling to get trained and certified in what’s expected to be the hottest new high-tech skill for 2012: Hadoop.
Apache Hadoop is open source data management software for analyzing large volumes of structured and unstructured data in a distributed manner. It is used by such popular Web sites as Yahoo, Facebook, LinkedIn and eBay.
Hadoop is gaining popularity as U.S. corporations in all vertical industries — financial services, utilities, media, retail, energy and pharmaceuticals — embrace the concept of “big data,” which refers to the analysis of huge volumes of real-time data to identify trends and increase profitability.
Several training courses and certification programs are available for IT professionals interested in Hadoop development and administration. Big-picture courses about Hadoop and the broader area of “big data” are also available for IT executives such as CIOs and CTOs.
Companies offering Hadoop training include Cloudera, Hortonworks, IBM, MapR and Informatica.
Cloudera has been offering Hadoop training for three years as a complement to its enterprise-class Hadoop software. The Palo Alto, Calif.-based company offers six Hadoop-related courses aimed at four different audiences: programmers, database analysts, system administrators and IT managers.
“Companies are struggling to hire Hadoop talent,” says Sarah Sproehnle, director of educational services at Cloudera. “We’ve been doing Hadoop for a while. We’ve put all of the best practices into our training so we can get people really quickly up to speed.”
Cloudera’s Hadoop courses are four days long, feature instructor-led training and lab work, and cost around $2,000. Cloudera provides a certification to IT professionals who complete the course.
“Our certification is quite popular,” Sproehnle says. “Some of the industries that are adopting Hadoop … want assurances that the people who are going to manage their petabytes of data are reliable. But if they’ve taken our courses and passed our certification, that’s proof. We’ve seen people bragging about our certifications in online posts, and we’ve seen job postings looking for our certifications.”
Demand for Cloudera’s Hadoop classes has risen dramatically in the last few months, with the number of people being trained in the first quarter of 2012 expected to be four times greater than during the first quarter of 2011.
In fact, Cloudera’s training courses were sold out at the third annual Hadoop World Conference in New York City last November. Nearly 2,000 people attended the conference.
“Over 10,000 people have come through our Hadoop curriculum in three years. But by the end of 2012, we will train another 10,000 people,” Sproehnle says. “People are coming out of the woodwork to take our courses.”
Charles Zedlewski, vice president of product at Cloudera, adds that the type of IT professionals taking Cloudera’s Hadoop courses has changed.
“In 2011, we saw our audience flip. We went from 70% of our customers being Web companies and 30% being enterprise to 75% being traditional enterprise and 25% being Web companies,” he explains. “Hadoop has become a very big thing for industry.”
Meanwhile, Hortonworks launched its Hadoop training program, Hortonworks University, in February. The Sunnyvale, Calif., Hadoop software vendor has provided custom training to a few hundred developers over the last six months. Now Hortonworks University will offer standard courses and certifications to IT professionals.
Hortonworks has two Hadoop certifications: One is for developers and the other is for administrators. The developer course is four days and costs around $2,500, while the administrator course is two days and costs $1,400.
“We’re seeing the demand for our training increase exponentially,” says Bob Mahan, senior director of worldwide field services at Hortonworks. “Enterprises are starting to collect data in a much more granular level — customer information and deal information — and they’re scrambling to figure out how to crunch that data cost-effectively.”
Training companies say that to succeed at Hadoop developer training, IT professionals need experience with Java. Administrator-oriented courses call for experience in Linux or Unix administration. Database courses require some experience with SQL, but there are no prerequisites for general courses on big data.
The bottom line for Hadoop-trained IT professionals is the chance to pursue new, higher-paying jobs.
“Within three to five years, half of the world’s data will be processed on Hadoop,” Mahan predicts. “It’s something we haven’t seen since the early SQL days or the early Java days. There is going to be demand for thousands and thousands of individuals who are trained in Hadoop.”
While Cloudera and Hortonworks focus on IT professionals with their Hadoop courses, IBM is targeting college students.
IBM has a new initiative called Big Data University aimed at training undergraduate and graduate students in the area of big data and exposing them to Hadoop. Since its launch last October, more than 14,000 students have registered for Big Data University's online courses.
IBM offers six online courses related to Hadoop, including Hadoop Fundamentals I and II. IBM says about 1% of its registered Big Data University students have completed enough courses in the past five months to qualify for a certification.
Meanwhile, the Massachusetts Institute of Technology requires students in its introductory computer science course to write a program using Hadoop’s MapReduce framework, and the University of California at Berkeley has an entire course dedicated to data science that uses Hadoop.
“Hadoop sounds a little bit like Java back in 1995,” Zedlewski says.