Drilling down to the root of an application performance problem used to be a headache for Montreal-based IT service provider CGI Group Inc.
The firm’s Toronto business unit, which serves some of the larger telecommunications companies in Canada, had noticed the Web-based cell phone activation application it was maintaining for one of its clients was experiencing some major performance problems. In the contract with this customer, CGI had committed to a 10-minute transaction completion time, but the transaction was actually taking 18 minutes and developers couldn’t figure out why, said Dalim Khandaker, manager, enterprise application performance and tuning at CGI in Toronto.
“To try to identify how much of that was really (the Web-based cell phone activation application) was a real challenge,” Khandaker said, partly because that application was interfacing with other software. Using Mercury Interactive Corp.’s LoadRunner didn’t help much either. The load and stress testing tool, which displays information on CPU usage, memory and the top 10 transactions, works well in a pre-production environment, but “you can’t really drill down and do a root-cause analysis” when several applications are interfacing with one another, he said.
According to David O’Leary, director of the Centre for Testing and Quality at CGI in Toronto, CGI’s problem is a common one for developers. “You may have systems that are so highly interconnected and you have so many kinds of diverse platforms, that when something fails, it’s difficult to trace…and it takes a long time to recover.”
Performance tests were also problematic in that CGI had to use a scaled-down version of the production environment during tests. “Production is a very big environment and is expensive to set up,” so companies will often run tests in the scaled-down environment to see if they can fix problems before they go into production, said Khandaker. But with the smaller version it is often difficult to get an accurate picture of how the production environment is going to look and behave, he said.
In 2003 some of CGI’s consultants investigated New York-based Identify Software Ltd.’s AppSight application problem resolution system for the Windows platform and started using it with some customer care applications. According to Khandaker, it made sense to tap into the capabilities of AppSight’s J2EE version to resolve the difficulties his team was experiencing.
Lori Wizdo, Identify’s vice-president of marketing, said AppSight acts much like a flight data recorder or black box on an airplane; the software “monitors and records everything that happens when the software is running,” including user actions, system events, performance metrics, configuration data and code execution flow. The logs can be replayed and analyzed to pinpoint the root cause of application problems, she said.
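The recording-and-replay idea Wizdo describes can be illustrated with a toy sketch. The code below is purely conceptual — it is not AppSight's API, and all names (`recorded`, `replay_log`, `_LOG`) are invented for illustration. It wraps function calls so that each one is logged with its arguments, duration and outcome into a bounded in-memory buffer, which can later be inspected slowest-first to triage a performance problem:

```python
import functools
import time
from collections import deque

# Bounded "black box": keep only the most recent 1,000 events.
_LOG = deque(maxlen=1000)

def recorded(func):
    """Decorator that logs every call to func into the flight recorder."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            result = func(*args, **kwargs)
            _LOG.append((func.__name__, args, kwargs,
                         time.perf_counter() - start, "ok"))
            return result
        except Exception as exc:
            # Record the failure, then re-raise so callers still see it.
            _LOG.append((func.__name__, args, kwargs,
                         time.perf_counter() - start,
                         f"error: {exc}"))
            raise
    return wrapper

def replay_log():
    """Return recorded events sorted slowest-first for root-cause triage."""
    return sorted(_LOG, key=lambda event: event[3], reverse=True)
```

A real product would capture far more (system events, configuration, code execution flow) with much lower overhead, but the principle is the same: record everything as it happens, then replay the log after the fact instead of trying to reproduce the failure.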
Wizdo said using this type of tool removes a major burden from developers’ shoulders. She pointed to a recent study by Stamford, Conn.-based analyst house Gartner Inc. that found 40 per cent of an application development team’s activities are associated with supporting applications after they have gone into production. Many developers still use a manual approach to problem resolution, even though it is a core process in the development cycle, she said.
In a survey Identify conducted among its own customers last August, the vendor found that identifying the root cause consumes 80 per cent of the time it takes to resolve an application problem. “It’s finding that tiny little needle in the haystack — once it’s identified, the problem is easy to resolve,” Wizdo said.
CGI’s O’Leary said it wasn’t difficult to make a business case for AppSight.
“Asking ‘What’s the ROI on a tool like this?’ is like asking ‘What’s the ROI on a $20 fire extinguisher in the kitchen?’” he said.