To manage a service-oriented architecture (SOA) successfully, IT shops need to stop playing the blame game when a problem arises, according to an IT manager at Louisville, Ky.-based Humana Inc.
The problem with most SOA environments — especially those that are mixed with legacy system environments — is the lack of insight IT administrators have when trying to manage their composite applications, said Craig Whitaker, technology manager of technical services at the health care insurance firm.
“The real change with SOA is that you don’t have the control you used to have,” he told conference attendees at last week’s IBM Impact 2009 in Las Vegas. Each application doesn’t perform one function, but rather multiple functions with dozens of other interconnected applications, Whitaker said.
When faced with a crisis in the early days, he added, his IT shop often resorted to “problem-solving by committee.”
At Humana, it has been common to have 40 or 50 support people on a troubleshooting conference call when faced with a technical issue, Whitaker said, with vendors and senior vice-presidents sometimes joining the call.
“Everybody has a stake in solving the problem and every tier support group has its own tools,” he said.
“Oftentimes, somebody on the call will mention that ‘six months ago we had this problem and it was DB2’ and that gives everybody something they can start blaming,” Whitaker added. “If you’re a (database administrator), you might know that it’s not DB2, but because everybody thinks it is, you spend all your time trying to defend your part of the network.”
For Whitaker, instead of pointing fingers and being reactive, IT departments must be predictive and proactive.
Humana recently partnered with Melville, N.Y.-based Nastel Technologies Inc. to gather operational, transactional and business metrics using one tool that monitors the overall health of the IT environment.
Nastel’s AutoPilot M6 suite, which includes transaction, business activity, application performance and middleware management capabilities, aims to give IT administrators a way to find and fix problems quickly without disrupting services.
“Applications are less reliable now because they rely on connectivity,” Richard Nikula, worldwide director of technology services at Nastel, said. “If everything’s on one server, life is simple. You keep the server up and you keep the app up.”
The landscape has now changed, as applications are siloed and more reliance is put on third-party tools, he added.
AutoPilot M6 works with a wide variety of technologies such as IBM WebSphere MQ, Oracle Database, Java 2 Platform Enterprise Edition, JBoss and SQL Server. It can monitor and capture metrics and performance data all across your IT environment, Nikula said.
Whitaker added that the need for transparent data monitoring across all areas of the IT department led Humana to explore business management tools for the company’s SOA environment.
“For example, you can pull things up in Tivoli and get views of other tools that report into the software, but you can’t see what each app is doing in a synchronized time fashion,” he said. “That’s why transparency is important and stops you from labeling something a compute problem, when it’s clearly being caused somewhere else.”
End-to-end, real-time visibility into the performance of Humana’s Web-based systems, including the ability to replay time-synced historical metrics for forensic study, is also crucial to preventing IT infighting, Whitaker added.
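To make the idea of time-synced metrics concrete, here is a minimal Python sketch of the general technique. It is not Nastel's or Humana's implementation; the tier names, sample values, thresholds and function names are all hypothetical and chosen only for illustration. It places samples from several tiers on one shared timeline, then reports which tier crossed its threshold first.

```python
from collections import defaultdict


def align_by_time(samples, bucket_seconds=10):
    """Group (timestamp, tier, value) samples into shared time buckets."""
    timeline = defaultdict(dict)
    for timestamp, tier, value in samples:
        bucket = int(timestamp // bucket_seconds) * bucket_seconds
        timeline[bucket][tier] = value
    # Return buckets in chronological order so they can be replayed.
    return dict(sorted(timeline.items()))


def first_breach(timeline, thresholds):
    """Return (bucket, tier) for the earliest reading above its threshold, or None."""
    for bucket, readings in timeline.items():
        for tier in sorted(readings):
            if tier in thresholds and readings[tier] > thresholds[tier]:
                return bucket, tier
    return None


# Hypothetical response times in milliseconds (invented data, not real measurements).
samples = [
    (100, "web", 120), (100, "mq", 15), (100, "db2", 30),
    (110, "web", 130), (110, "mq", 400), (110, "db2", 32),
    (120, "web", 900), (120, "mq", 450), (120, "db2", 35),
]
thresholds = {"web": 500, "mq": 100, "db2": 200}

timeline = align_by_time(samples)
print(first_breach(timeline, thresholds))  # (110, 'mq')
```

In this invented data the Web tier is the most visibly slow by the end, but the shared timeline shows the messaging tier degraded first and the database never crossed its limit, which is the kind of evidence that can settle a blame discussion early.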