Conference summary offers big data leads

The open source Hadoop framework for storing and processing large datasets is attracting a growing number of organizations that want to at least try it as their information stores rapidly expand.

After a recent big data conference, Ben Lorica, chief data scientist at O’Reilly Media, pulled together a detailed summary of things he heard that deserve follow-up.

For example, many companies are struggling with how to process and mine near-real-time data streams. An Intel official talked about how the chipmaker uses Spark and Shark, the in-memory cluster computing projects that came out of research at the University of California at Berkeley.

SQL-on-Hadoop solutions are drawing interest following the release of Cloudera’s Impala, a query engine that runs on top of Hadoop, and Hadapt’s data-driven schema. Both were discussed at the conference.

There was also a session on a corner of data science that I hadn’t heard of, called adversarial analytics: think of behavioral models that try to detect cyber intrusions, and the black hat hackers who try to evade them.

Sometimes you don’t have to go to a conference to pick up nuggets that are worth pursuing on your own time. That makes this column worth scanning.

Read the full column here
