
Making Hadoop faster with GPU

By enabling the distributed processing of large data sets across clusters of servers, Hadoop has transformed the way organizations manipulate big data.

The open source framework can scale from a single server to thousands of machines and process data at high speed. But is it possible to make Hadoop even faster?

Researchers hypothesize that offloading calculations from the central processing unit (CPU) to a graphics processing unit (GPU), a chip designed for complex 3D and mathematical workloads, could boost Hadoop's performance, because GPUs can perform certain calculations 50 to 100 times faster than their CPU counterparts.
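
To make the offloading idea concrete, here is a minimal sketch in CUDA of moving a simple element-wise calculation from the CPU to the GPU. The kernel name, the squaring operation and the array size are illustrative assumptions rather than anything the researchers describe; the pattern is simply to copy the data to the GPU, launch a kernel that runs many threads in parallel, and copy the result back.

#include <cuda_runtime.h>
#include <cstdio>

// Illustrative kernel: each GPU thread squares one element of the input array.
__global__ void squareElements(const float *in, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = in[i] * in[i];
    }
}

int main() {
    const int n = 1 << 20;                // one million elements (illustrative size)
    size_t bytes = n * sizeof(float);

    // Host (CPU) buffers
    float *hIn = (float *)malloc(bytes);
    float *hOut = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) hIn[i] = (float)i;

    // Device (GPU) buffers
    float *dIn, *dOut;
    cudaMalloc((void **)&dIn, bytes);
    cudaMalloc((void **)&dOut, bytes);

    // Copy the input to the GPU, run the kernel across many threads, copy the result back
    cudaMemcpy(dIn, hIn, bytes, cudaMemcpyHostToDevice);
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    squareElements<<<blocks, threads>>>(dIn, dOut, n);
    cudaMemcpy(hOut, dOut, bytes, cudaMemcpyDeviceToHost);

    printf("out[42] = %f\n", hOut[42]);

    cudaFree(dIn);
    cudaFree(dOut);
    free(hIn);
    free(hOut);
    return 0;
}

Work like this, where every element is processed independently with the same operation, is exactly the kind of computation a GPU's thousands of lightweight cores handle well, which is where the large speedup figures come from.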


Using a GPU to speed up Hadoop is not a new concept. Past projects have combined the MapReduce approach behind Hadoop with GPUs. For instance, the Mars MapReduce-GPU project achieved a 1.5x to 16x increase in performance when analyzing Web data and processing Web documents.
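
The Mars code itself is not shown in the article, but the general pattern behind a GPU-side MapReduce is easy to sketch: each GPU thread applies the map function to one input record and emits a key/value pair. The record format and map logic below are hypothetical, chosen only to illustrate the idea, and the host-side allocation, copies and launch mirror the earlier sketch.

#include <cuda_runtime.h>
#include <cstdio>

// Illustrative Mars-style "map" stage: each GPU thread maps one input record
// to a key/value pair. The record format and map logic are hypothetical,
// not taken from the Mars project itself.
__global__ void mapStage(const int *records, int *keys, int *values, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        keys[i] = records[i] % 10;  // hypothetical key: last digit of the record
        values[i] = 1;              // hypothetical value: a count of one
    }
}

int main() {
    const int n = 1024;
    int hRecords[1024], hKeys[1024], hValues[1024];
    for (int i = 0; i < n; ++i) hRecords[i] = i * 7;

    int *dRecords, *dKeys, *dValues;
    cudaMalloc((void **)&dRecords, n * sizeof(int));
    cudaMalloc((void **)&dKeys, n * sizeof(int));
    cudaMalloc((void **)&dValues, n * sizeof(int));
    cudaMemcpy(dRecords, hRecords, n * sizeof(int), cudaMemcpyHostToDevice);

    // One thread per record, 256 threads per block
    mapStage<<<(n + 255) / 256, 256>>>(dRecords, dKeys, dValues, n);

    cudaMemcpy(hKeys, dKeys, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaMemcpy(hValues, dValues, n * sizeof(int), cudaMemcpyDeviceToHost);
    printf("record %d -> key %d, value %d\n", hRecords[3], hKeys[3], hValues[3]);

    cudaFree(dRecords);
    cudaFree(dKeys);
    cudaFree(dValues);
    return 0;
}

A reduce stage would then group and aggregate the emitted key/value pairs, mirroring Hadoop's own map and reduce phases.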

An article written by the research and development team at big data specialist Altoros Systems Inc. indicates that there is demand for accelerating parallel computing systems with GPUs. The article also illustrates how organizations can try the approach at a large scale.

Hardware vendors such as Cray have released machines equipped with GPUs and configured with Hadoop, according to the researchers.

In some scenarios, a GPU can accelerate computations by roughly five to 25 times per node. Some developers claim that when clusters of several nodes are deployed, the performance boost can reach 50x to 200x.

To find out more about the tools available to create a Hadoop + GPU system and the issues you will face, click here.

 

 
