Site icon IT World Canada

Operate a data warehouse or avoid it?

Shutterstock

Is your data warehouse causing heart failure in the executive suite due to development cost overruns and surprisingly high operating costs?

Advances in business intelligence (BI) software packages enable many organizations to achieve their BI goals by accessing their operational datastores directly. This approach can avoid building and operating a data warehouse.

Even though data warehouses have their place, they should be avoided or at least deferred because they are expensive to build and operate. Before you embark on a data warehouse project, confirm that your situation meets most of the criteria below. Only then are the development risks and operating expenses of a data warehouse justified.

Access operational datastores

Exporting data from your operational datastores and transforming that data for import to a data warehouse adds elapsed time to your BI project, costs development dollars and generates ongoing operational work. Never copy unless you absolutely must.

Purists will point to the many benefits of a data warehouse and the fact that operational datastores exist to support many, small business transactions and not large BI queries. While those observations are accurate, today’s BI software can, in many cases, provide a useful illusion of a data warehouse without its cost and complexity.

Copy data from your operational datastores to a data warehouse only if one or more of the following conditions are true in your organization:

  1. The BI workload is interfering with the superior online performance expected by end-users with online access to the operational datastores.
  2. The operational datastores do not retain the historical data required by your BI application.
  3. Outages of the operational datastores are preventing the BI workload from being completed on time.
  4. Access to the operational datastores must be restricted because of confidentiality and privacy issues.

Enhance operational datastores

Before building a data warehouse, add these features to your existing operational datastores to support your BI applications:

  1. New tables to house calculated values.
  2. New indices to improve performance.
  3. New cross-reference tables to create the impression of integration among disparate datastores.
  4. New views to simplify the development of your BI queries.
  5. New reports that highlight data issues that are undermining integration.

Build a data warehouse only when these features are no longer capable of meeting the demands of your BI applications.

Increasing number of data sources

The smaller the BI application is, the easier it is to avoid a data warehouse. The increasing size of a BI application is measured by higher:

  1. Number of data sources accessed.
  2. Total data volume housed across all the data sources.
  3. Number of data centers that house the data sources.

These numbers will grow as the BI application grows in complexity and importance. Eventually, a data warehouse will become unavoidable to ensure smooth access to all the data.

Complexity of data sources

Simpler data sources make it easier to avoid a data warehouse. The increasing complexity of data sources is measured by an increasing:

  1. Number of tables.
  2. Number of columns per table.
  3. Number of foreign keys per table.
  4. Number of indices per table.
  5. Complexity of aggregating data routines.
  6. Complexity of calculating values for calculated data columns.
  7. Number of concatenated or multi-column keys, especially if the individual columns are also foreign keys.

These numbers will grow as the BI application grows in complexity and importance. Eventually, a data warehouse will become unavoidable to manage the complexity.

Usage of BI application

Most BI applications start quite modestly with usage by a single work group or department. However, success will bring more interest that manifests itself by a growing:

  1. Number of end-users.
  2. Number and complexity of queries and reports run.
  3. Number of requests for high availability.

Eventually, this usage growth will require a data warehouse.

Complexity of queries and reports

Most BI applications start with a modest number of queries and reports. However, success will bring more requests that manifest themselves by a growing:

  1. Library of available queries and reports.
  2. Variety of ad hoc query tools in use.
  3. Set of requirements for distribution of reports.

Eventually, accommodating this growth will require a data warehouse.

Data warehouse will grow

Once you commit to a data warehouse, recognize that the data warehouse will inevitably grow over time. Therefore, your:

  1. Development costs will continue and not stop at the end of the initial project to build the data warehouse.
  2. Operating costs for staff and software licenses will continue and will likely grow.
  3. Maintenance cost for staff will continue and will likely grow.

Until you can justify these expenses, typically because of the emergence of an exciting benefit case, it’s best to defer starting a data warehouse.

How would you push back on the pressure to build a data warehouse when you think it’s premature? How would you make the case to build a data warehouse? Let us know in the comments below.

Exit mobile version