While enterprises are grappling with big data, one company is looking to leverage data sets with an eye on improving IT operations and providing alerts with more context.
Rocana’s latest Ops 1.3 for managing IT operations now has an advanced analytics module to enhance its existing anomaly detection capabilities, said company CEO and co-founder Omer Trajman. The new module, dubbed Weighted Analytic Risk Notifications (W.A.R.N.), is a second-order analytic that looks at the history of anomalies by object (such as service, device, host, or location) and then computes a score that helps IT teams better understand the prevalence and severity of anomalies.
By applying historical context in creating a risk score, Rocana gives IT administrators the ability to identify which entities in their IT environment are behaving most unusually, and to drill down to investigate or take further action as needed.
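The idea of scoring each entity from its anomaly history can be illustrated with a short sketch. Rocana has not published the W.A.R.N. formula in this article, so everything below is an assumption made for illustration: the `risk_scores` function, the `(entity, timestamp, severity)` record layout, and the exponential decay with a 24-hour half-life are all hypothetical.

```python
from collections import defaultdict


def risk_scores(anomalies, now, half_life=24.0):
    """Rank entities by a recency-weighted sum of anomaly severities.

    anomalies: iterable of (entity, timestamp_hours, severity) tuples.
    now:       current time, in the same hour units as the timestamps.
    half_life: hours after which an anomaly counts half as much
               (an illustrative assumption, not Rocana's method).
    """
    scores = defaultdict(float)
    for entity, timestamp, severity in anomalies:
        age = now - timestamp
        weight = 0.5 ** (age / half_life)  # older anomalies count less
        scores[entity] += severity * weight
    # Highest score first, so the most unusual entities appear at the top.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up history: one host with frequent mild anomalies,
# another with a single severe one a day ago.
history = [
    ("web-01", 0.0, 1.0),
    ("web-01", 24.0, 1.0),
    ("db-02", 24.0, 3.0),
    ("web-01", 48.0, 1.0),
]
ranked = risk_scores(history, now=48.0)
print(ranked)
```

In this made-up history, the host with repeated mild anomalies ends up ranked above the host with one severe but older anomaly, which is the kind of prevalence-versus-severity trade-off a single score has to settle.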
Trajman acknowledged that the IT operations analytics space is crowded. “That’s part of the problem,” he said. Rocana, which was partly founded by Cloudera employees, wants to go beyond providing a single dashboard of alerts to providing more context, which is especially valuable now that enterprise IT has become more complex thanks to cloud applications, IaaS, PaaS and SDN.
When working for Cloudera, Trajman and others saw that the challenges affecting IT teams deploying big data apps were applicable to other apps, such as OpenStack and Docker. “Operations is used to monitoring discrete technology stacks,” he said, such as a storage system. And they aren’t looking for new tools, but want a new approach. They are also looking to proactively avoid problems, not just fix them quickly when they occur.
IT monitoring tools allow staff to set up alerts based on hundreds of thousands of metrics, but that leads to an overwhelming number of alerts with no context as to their severity, said Trajman. Rocana Ops aims to pull together all of those metrics from all of those hotspots, and then stack them and rank them by severity within the context of the overall infrastructure. “Metrics in and of themselves are not particularly useful,” he said.
Rocana Ops uses an adaptive anomaly detection feature that continually and autonomously fine-tunes itself based on historical data, constantly adjusting its anomaly thresholds as real-world conditions change. For example, the software can extract latency data from log files, convert that data into a metric, and then perform anomaly detection on website latency. This allows IT to get to the root cause of a problem, said Trajman.
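The log-to-metric-to-anomaly pipeline described above can be sketched in a few lines. This is not Rocana's implementation: the log format, the `latency_ms=` field, and both function names are hypothetical, and the detector is a generic exponentially weighted mean and variance with a three-sigma band, chosen only because its threshold moves with the data.

```python
import math
import re

# Hypothetical log format: each request line carries "latency_ms=<number>".
LATENCY_RE = re.compile(r"latency_ms=(\d+(?:\.\d+)?)")


def extract_latencies(lines):
    """Turn raw log lines into a latency metric, skipping lines without one."""
    values = []
    for line in lines:
        match = LATENCY_RE.search(line)
        if match:
            values.append(float(match.group(1)))
    return values


def detect_anomalies(values, alpha=0.2, k=3.0, warmup=5):
    """Return indices of values more than k standard deviations from an
    exponentially weighted moving average. The baseline is updated after
    every point, so the threshold keeps adapting to recent data."""
    if not values:
        return []
    mean, var = values[0], 0.0
    flagged = []
    for i, x in enumerate(values[1:], start=1):
        # Compare against the baseline learned from earlier points only.
        if i >= warmup and abs(x - mean) > k * math.sqrt(var):
            flagged.append(i)
        # Incremental exponentially weighted mean and variance update.
        diff = x - mean
        mean += alpha * diff
        var = (1 - alpha) * (var + alpha * diff * diff)
    return flagged


log_lines = [
    "GET /index.html status=200 latency_ms=100",
    "GET /index.html status=200 latency_ms=102",
    "GET /about.html status=200 latency_ms=98",
    "INFO cache refreshed",  # no latency field, so it is ignored
    "GET /index.html status=200 latency_ms=101",
    "GET /index.html status=200 latency_ms=99",
    "GET /index.html status=200 latency_ms=102",
    "GET /about.html status=200 latency_ms=100",
    "GET /index.html status=200 latency_ms=450",
    "GET /index.html status=200 latency_ms=101",
]
latencies = extract_latencies(log_lines)
anomalies = detect_anomalies(latencies)
print(latencies)
print(anomalies)
```

Because the baseline is updated on every point, a spike temporarily widens the band around it. A production detector would likely handle that case more carefully, for example by excluding flagged points from the update.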
By leveraging as much of the data in the infrastructure as possible, he said, Rocana Ops is able to create context automatically. If a host is acting funny, for example, it’s clear how its behavior compares with everything else.
The software is also built entirely on open formats, added Trajman, rather than being proprietary like most monitoring tools. “Having an underlying open system is key when you have volume of data and diversity of data.” Rocana also takes a different approach to licensing, he said, which is based on the number of users getting value from it, rather than on each data source that is connected.
“We don’t believe you should pay a tax on your system,” he said.