Microsoft adds Hadoop support to SQL Server, data warehouse

Microsoft is responding to the “Big Data” movement by adding support for the open source Hadoop framework for large-scale data processing to its SQL Server database and Parallel Data Warehouse platform.

The connectors will be available in CTP (community technology preview) form soon, according to a post this week on the official SQL Server Team blog.

Big Data refers to the ever-growing volumes of data being generated by enterprises, particularly from sensors and Web traffic.

“Our customers have been asking us to help store, manage, and analyze both structured and unstructured data — in particular, data stored in Hadoop environments,” Microsoft said in the blog post.

With the new connectors, customers will be able to exchange data among Hadoop environments, SQL Server and Parallel Data Warehouse, Microsoft said.

Hadoop, which is hosted at the Apache Software Foundation, was developed in large part at Yahoo and is based partly on the MapReduce programming model developed by Google. A growing commercial ecosystem has emerged around Hadoop, with companies such as Cloudera offering services and specialized distributions of the framework.

Microsoft’s move makes sense, given that competing data warehousing vendors such as EMC Greenplum and Teradata have already embraced Hadoop, said Forrester Research analyst James Kobielus.

More and more enterprises are running Hadoop clusters, and they want to be able to send data from those systems downstream to their data warehouse systems, he added.

But no one vendor can claim to have a fully built-out Hadoop offering, which would include distributions, connectors to Hadoop-related projects such as the Cassandra data store, modeling tools and other components, he said.

There is “no doubt” that like other vendors, Microsoft has serious plans for Hadoop, but it hasn’t made a long-term road map public, Kobielus added.

Microsoft is not embracing Hadoop at the expense of homegrown efforts, having recently released a MapReduce-based programming model, Project Daytona, for use on its Azure cloud platform.

Also this week, Microsoft announced that it has released a second Appliance Update for Parallel Data Warehouse. These updates combine new features for both hardware and software components.

The release includes new connectors for third-party BI (business intelligence) and data-integration tools from SAP, Informatica and MicroStrategy.

In addition, a version of the PDW based on Dell hardware is now available, Microsoft said. Pricing starts at less than US$12,000 per terabyte.


