The open source Hadoop framework is ideal for distributed processing of large data sets across large numbers of servers.
What it isn’t good at is speed.
Actian Corp., which makes a number of specialized data management systems including the SQL-based Vectorwise analytic database, has been watching its customers try to bridge the gap with Hadoop by building their own connections.
The latest version of Vectorwise will save them a lot of effort: Version 3.0 comes with advanced Hadoop integration, allowing customers to do fast queries of unstructured data at what Actian says is a relatively modest price.
Hadoop has lots going for it, says Fred Gallagher, general manager for Vectorwise. Hadoop’s HDFS file system provides almost unlimited storage, and Hadoop itself is good at parallel processing. But, he added, because it’s a batch processor, it’s cumbersome to use for ad hoc queries or drill-down data discovery.
“So by integrating the large dataset capabilities of Hadoop with Vectorwise, people can get that responsiveness they’d like.”
Other changes include a more efficient storage engine, support for more data types and analytical SQL functions, and enhanced DDL (data description language) features.
Gallagher says that with the Hadoop Connector, Vectorwise on a Dell server with 12 cores can outperform a half-rack of data appliances on 90 per cent of queries at a cost of under $100,000 (including server).
“We’re able to move terabytes in an hour on a modest set of servers.”
Customers who use Vectorwise and Hadoop include a number of social media companies that have large amounts of subscriber data to process, he said. One has a Hadoop cluster with over 250 TB of data and needs to analyze 20 TB of data at a time. Another stores Web logs and brings 100 billion records into Vectorwise for processing.
Vectorwise runs on Windows Server and Linux. Pricing varies, starting at around US$60,000.
Vectorwise is sold direct to large accounts that have their own database and business intelligence applications; otherwise it is sold by solution providers who sell it with these applications.
One Canadian partner is Korem Inc. of Quebec City, which sells geospatial mapping solutions, some of which leverage Google Earth or postal codes.
Gallagher said Actian is looking for more solution providers here with expertise in big data and data warehousing. Partners will get training and discounts on products sold.