Site icon IT World Canada

A roadmap for improving reporting and data analytics

graphical illustration of big data

All custom software and software packages come with reporting functionality. That functionality is almost always sufficient to successfully launch the application. Soon after launch the world changes because successful applications grow in terms of:

  1. Number of concurrent, active end-users.
  2. Total number of end-users.
  3. Average length of end-user sessions.
  4. Volume of data.
  5. Resource consumption per active end-user.
  6. Number of integrations with other applications.
  7. Increasing expectations for functionality.

As this growth continues, the previously sufficient reporting functionality becomes inadequate and the chorus of end-user complaints becomes louder. Here’s a roadmap that CIOs can apply to respond to mounting frustrations. The roadmap will keep end-users and management happy as the application continues to grow in usage and value to the organization. The steps in the roadmap are listed in increasing order of cost, technical complexity, and business value.

Enhance reporting functionality

As end-user familiarity with the application and hopefully their sophistication grows, their frustration with the limitations of the available reports will increase. In addition to developing new reports, the existing reporting functionality can be enhanced with features such as:

  1. More sophisticated data selection options.
  2. More sort options.
  3. More output options beyond print. Frequent options include PDF and Excel. Sometimes shape files for mapping applications are also a useful option.

While this roadmap step often satisfies most end-users, it causes a proliferation of Excel workbooks that are not managed for consistency and lead to multiple versions of the truth.

Upgrade computing resources

Initially the online and the reporting functionality of the application both access the same datastore running on a single computing resource. As usage of the application grows, performance will degrade. This problem can be addressed by:

  1. Adding more computing resources.
  2. Shifting production of larger reports to overnight.

This roadmap step ensures excellent online response times. However, this action:

 

Add query functionality

As end-user sophistication continues to grow, the stream of enhancement requests for new reporting functionality can’t be fulfilled at a reasonable level of IT staff. At this point, the demand for better information access can be addressed by adding ad-hoc query functionality that includes features for:

  1. Selecting database columns to be included in the report or output dataset.
  2. Introducing standard formulas for calculated columns that do not persist in the database.
  3. Specifying data, or row, selections.
  4. Specifying sort columns.
  5. Specifying joins across multiple tables and databases.

This roadmap step greatly increases the sophistication of reporting and querying functionality available to all end-users. It takes the pressure off IT resources to produce more and more reports. However, this action:

  1. Requires end-users to become more knowledgeable in the details of the application’s underlying data structure and in the operation of the ad-hoc query facility.
  2. Requires the implementation of views that simplify the development of end-user queries by hiding some of the complexities of the underlying data model.
  3. Increases resource consumption and therefore operating costs.

Separate online from reporting

As the consumption of computing resources for reporting and querying grows, the ability to keep upgrading a single computing resource becomes increasingly difficult even in a cloud environment where adding resources is easy. Now it’s time to duplicate the online datastore onto a separate computing resource and point the report and query load to the new database instance.
Typically, the report and query instance is refreshed nightly. In most organizations, basing reports and queries on the data available at the end of the previous business day is sufficient.
Your DBMS supports replication and synchronization to refresh your report and query instance. You will likely start with replication because it’s fast and easy to develop and manage. As requirements become more complex, you may move to synchronization.
This roadmap step delivers:

  1. Excellent performance for all end-users.
  2. The ability to manage the two database instances independently.
  3. The ability to add high availability features to the online environment if necessary.

However, this action comes at a cost for more:

  1. Computing resources.
  2. IT staff time for software maintenance and operation.

Add data analytics functionality

As management asks for more variance analysis and trend forecasting, end-users will push Excel charts beyond its capability and request more sophisticated data analytics functionality. Business intelligence or data analytics can be addressed by adding functionality that includes features such as:

  1. Interactive chart development.
  2. Dynamic data transformations.
  3. Real-time data aggregation.

This roadmap step greatly increases the sophistication of querying and charting functionality available to all end-users. However, this action:

  1. Requires end-users to become more knowledgeable in the details of the application’s underlying data structure and in the operation of the data analytics software.
  2. Requires the implementation of templates that simplify the development of data analytics charts by hiding some of the complexities of the underlying data model and database relationships.
  3. Increases resource consumption and therefore operating costs.

Implement in-memory processing

As the consumption of computing resources for reports and queries continues to grow, the online query performance and elapsed time for report production will inevitably suffer.
The next roadmap step is to upgrade the memory of your computing environment significantly to implement in-memory processing by the DBMS. As we all know, the elapsed time to complete a data fetch in memory is a tiny fraction of the time required to complete the same data fetch from disk. This action will:

  1. Improve performance at a significant cost for additional compute resources.
  2. Not require software changes or end-users having to change how they work.
  3. Hide the processing inefficiencies associated with a report and query load accessing a datastore designed for transaction processing.

However, this action increases resource consumption and therefore operating costs.

Revise the report and query environment

As the consumption of computing resources for reporting and querying continues to grow, the inefficiencies associated with a large report and query load accessing a datastore designed for transaction processing causes costs to skyrocket and performance to plummet.
Now it’s time to consider changes to the data model for the datastore of the report and query environment. This roadmap step considers improvements including:

  1. Denormalizing the data model.
  2. Adding more complex views.
  3. Persisting common calculated values.
  4. Adding indices to improve performance.
  5. Implementing a data warehouse with its star schema.

These actions will:

  1. Improve ability to immediately respond to unplanned queries.
  2. Simplify adding new datastores to the report and query environment.

These actions are listed last in this article because they incur significant cost for:

  1. IT software development resources.
  2. Software licenses for ETL software.
  3. IT operation resources.
  4. IT staff time for software maintenance and operation.

What strategies would you recommend that will improve support of reporting, querying, and charting for your applications? Let us know in the comments below.

Exit mobile version