In developing an effective big data analytics project, decision makers need to use a blend of three major techniques to achieve their goal: the ad-hoc approach, batch analytics and real-time analytics.
Very often, executives and managers focus on only one of these techniques, depending on which one best fits their use case, according to Jim Kaskade, CEO of big data analytics firm Infochimps.
For example: batch processing for big chunks of data; real-time analytics for streaming data; and ad-hoc analysis for cases that fall in between.
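To make the distinction concrete, here is a minimal sketch (not drawn from the article, with hypothetical data and names) of the same metric — brand mentions per channel — computed once over a full historical dataset in batch style, and incrementally as events arrive in a streaming style:

```python
from collections import Counter, defaultdict

# Hypothetical historical events already sitting in storage.
historical_events = [
    {"channel": "twitter", "brand": "acme"},
    {"channel": "blog", "brand": "acme"},
    {"channel": "twitter", "brand": "acme"},
]

def batch_mentions(events):
    """Batch analytics: scan the whole dataset at once and aggregate."""
    counts = Counter()
    for event in events:
        counts[event["channel"]] += 1
    return counts

class StreamingMentions:
    """Real-time analytics: update running totals as each event arrives."""
    def __init__(self):
        self.counts = defaultdict(int)

    def update(self, event):
        self.counts[event["channel"]] += 1
        return dict(self.counts)

print(batch_mentions(historical_events))          # e.g. Counter({'twitter': 2, 'blog': 1})

stream = StreamingMentions()
for event in [{"channel": "facebook", "brand": "acme"}]:
    print(stream.update(event))                   # totals updated per arriving event
```

Ad-hoc analysis, by contrast, would be a one-off exploratory query run against whichever of these stores answers the question at hand.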
This is a common misstep that results in the failure of a big data project, Kaskade said. He said it illustrates an initial focus on technology without the benefit of a clear understanding of the project’s realities, such as project goals, projected time-to-business value, the nature of the available data, and scope and budget.
Once a big data project is viewed in the light of these realities, he said, it becomes clear that the best path to take is a “multi-faceted approach.”
Kaskade cites as an example Infochimps’ work with the consultancy arm of Canadian media company Postmedia Network.
The firm wanted to bolster its media analysis offerings by adding services such as historical data analysis as well as new information drawn from real-time social media.
Postmedia needed to produce metrics around advertiser activity over a historical period as well as what was trending in the last second, Kaskade said. This kind of project, he said, needed a big data platform that could combine the two approaches within the company’s application.
It soon became clear to the media company that a single data scientist “experimenting with several big data instruments” would not be able to help it achieve its goal in time, he said.
Infochimps helped Postmedia develop a cloud-based system that could integrate different technologies such as batch, ad-hoc and real-time analytics. Using a blend of these techniques, Postmedia was able to provide its customers in the advertising industry with real-time media analysis across traditional and new media channels such as Facebook, Twitter and blog sites.
For example, Postmedia’s batch analysis of historical data also provides context for how customers’ real-time behaviour compares to their past behaviours. This, Kaskade said, allows Postmedia to predict future behaviours.
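A minimal sketch of that combined approach, assuming hypothetical numbers and names: a batch layer computes a historical baseline of hourly brand mentions, and the real-time layer scores incoming counts against that baseline so unusual spikes in chatter can be flagged (and, in principle, fed into a predictive model).

```python
import statistics

# Batch layer: hourly mention counts computed over historical data (hypothetical values).
historical_hourly_mentions = [120, 135, 110, 150, 142, 128, 131]

baseline_mean = statistics.mean(historical_hourly_mentions)
baseline_stdev = statistics.stdev(historical_hourly_mentions)

def compare_to_baseline(realtime_count, threshold=2.0):
    """Real-time layer: score the current hour's count against the historical baseline."""
    z_score = (realtime_count - baseline_mean) / baseline_stdev
    return {
        "count": realtime_count,
        "z_score": round(z_score, 2),
        "unusual": abs(z_score) > threshold,  # flag large deviations from past behaviour
    }

# Streaming counts arriving now (hypothetical values).
print(compare_to_baseline(140))   # within the historical norm
print(compare_to_baseline(320))   # flagged as unusual real-time chatter
```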
He said being able to use the different techniques in conjunction with one another helps Postmedia answer its customers’ questions, such as:
- How does real-time chatter affect my brand?
- What is the impact of media coverage of my brand?
- Did my brand campaign perform well?
- What risks or opportunities did this campaign present?