IT World Canada

HP clandestinely develops data warehouse


LONDON – HP is stealthily developing a data warehousing appliance that will scale out to hundreds of processors and hundreds of terabytes of data. HP has been shipping it to trial customers since October 2006 and is expected to announce it formally later this year. The company said that it had sped up BI operations thirteen-fold for a trial customer.

It is based on 64-bit commodity Intel hardware and Tandem NonStop fault-tolerant micro-kernel operating software. The NonStop server consists of a virtualized, single-image cluster.

Hardware

This is a massively parallel system. Neoview, according to HP’s Owner’s Manual, comes in five versions, with nodes of 16, 32, 64, 128 or 256 Itanium processors. These are dual-core Itanium 2 CPUs (Montecitos) and operate in parallel. Each has 4GB of RAM and can use 10,000rpm Fibre Channel 146GB disk drives. The base servers are HP’s Integrity server n2620 boxes (see HP’s Neoview platform field support course outline). An n2620 can have up to two Itanium 2 processors, 32GB of RAM and three disk bays holding 36, 73, 146 or 300GB drives. Neoview uses pared-down n2620s.

A Neoview cabinet is a 42U rack unit. A 16-node Neoview needs two cabinets and a 32-node one needs three. The 256-node system needs 23 cabinets. A pair of HP ProCurve 2848 switches ties together the internal and external Neoview LANs. There are a total of seven or eight Gigabit Ethernet links from the customer-provided network to the Neoview platform.

The storage system is an HP StorageWorks product such as an XP12000 disk array.

Software

The Tandem NonStop operating system micro-kernel and database are used. A database loader extracts data from real-time line-of-business databases, transforms it and loads it into Neoview storage. Data analysis tools are provided by partners such as Cognos and SAS, which treat Neoview as a platform.

Users who input SQL queries against a loaded database need not consider Neoview’s parallelism. An Optimizer function ensures that its execution uses as many parallel resources as possible. Each disk drive has its own disk process and all disk drives are intelligently scanned in parallel for a single query with a scan in its execution plan. A database table can be scanned in hundreds of parallel streams.
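The parallel scan described above can be pictured with a short Python sketch. It is an illustration only, not HP's implementation: the function names, the sample call records and the use of a thread pool to stand in for per-disk processes are all assumptions made for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_partition(rows, predicate):
    """Scan one partition (standing in for one disk's data) and keep matching rows."""
    return [row for row in rows if predicate(row)]


def parallel_scan(partitions, predicate):
    """Run the same scan over every partition concurrently, then merge the results."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        # One worker per partition, loosely mirroring "one disk process per drive".
        per_partition = pool.map(scan_partition, partitions, [predicate] * len(partitions))
        return [row for matches in per_partition for row in matches]


# Hypothetical call records (caller, duration in minutes) spread across four "disks".
partitions = [
    [("alice", 12), ("bob", 45)],
    [("carol", 7), ("dave", 61)],
    [("erin", 30)],
    [("frank", 90), ("grace", 3)],
]

# The caller states only the condition; how the scan is split up is hidden from them.
long_calls = parallel_scan(partitions, lambda row: row[1] > 40)
```

The point of the sketch is the division of labour: the user supplies a single query condition, and the system decides how many parallel streams to use.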

The developing BI Market

Existing data warehouse suppliers like IBM and Teradata have specialized, proprietary hardware and software architectures to cope with mixed business intelligence (BI) workloads. Newer suppliers like Netezza use an appliance approach with commodity hardware. These appliances are characterized as good and fast for single-stream BI work, such as analyzing calls in a phone company customer database. HP’s Neoview aims to combine the speed, simplicity and lower prices of BI appliances with the ability to run multi-stream BI work as well.

Customers would then not be locked into proprietary hardware. Sun is trying to do the same thing with its X4500 hybrid servers and a partnership with BI software supplier Greenplum.

Neoview has a far higher ratio of processors to data than the X4500, resulting in almost real-time execution of reports and analytic queries, according to HP. This is because of its clustered parallel architecture. Sun’s X4500 has two Opteron CPUs and cannot be clustered, although clustering is expected in the future.

HP is using Neoview internally and is consolidating 750 separate data warehouses into a single company-wide data warehouse. CEO Mark Hurd once ran Teradata. HP’s CIO, Randy Mott, used to be CIO at Wal-Mart, which has a huge Teradata installation.

There is no price for Neoview yet, but HP people are talking of Teradata-class functionality at significantly lower prices. However, a lot of consultancy will be involved in Neoview implementations and its price will reflect this.


