Real-time analysis: A key to heading off operational failures
Machine logs often go unappreciated while an application is up, or when IT planners can't correlate lost revenue with network or server slowdowns. But this complacency carries a risk. Many issues are easily solved with the insight that machine data offers. More importantly, failures or breaches that are imminent, or already silently “underway,” can be nipped in the bud.
A log management system allows retailers to:
- Respond instantly to an incipient failure or breach. Precious seconds are at stake when the application is at risk of a crash.
- Set up proactive monitors that alert IT planners to impending issues, whether in hardware or software.
In either case, retailers need the instant responsiveness that real-time analysis of machine logs makes possible. There is no time to wait for a batch job to export the data and then process it. That's why tools such as Hadoop or Cassandra, although powerful, are not ideal for this purpose: they're not "real time." Even 30 minutes of downtime could cost an online retailer tens of thousands of dollars in lost sales, as well as hard-to-quantify losses due to churn. According to Aberdeen Group, the average cost of downtime increased 65 percent from June 2010 to February 2012, to approximately $165,000 per hour.
Proactive monitoring provides critical alerts
Proactive monitoring is the corollary to real-time responsiveness. In short, it means one thing: forewarned is forearmed. And this forewarning comes by way of alerts that can highlight impending issues in the application, network, storage, or supporting IT infrastructure.
Take a hypothetical, but all too realistic, example. Let's say that your website went down two months ago, and troubleshooting after the fact revealed excessive loading of the network. Essentially, the network could not carry the traffic flowing between application nodes, and the site went down.
The solution: set up a real-time monitor or dashboard in your log management system that tracks network utilization and sends an alert when utilization reaches a specified level. The alert highlights the impending problem early enough to address it. IT can, for example, rebalance loads, or spin up additional servers in a separate physical location or public cloud to relieve the existing network infrastructure.
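As a rough illustration of the monitor described above, here is a minimal Python sketch. It is not tied to any particular log management product: the `Sample` and `UtilizationMonitor` names, the 80 percent threshold, and the 60-second window are all assumptions chosen for the example. It averages utilization over a recent time window and calls an alert hook once when that average reaches the threshold.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque


@dataclass
class Sample:
    """One utilization reading, as it might be parsed from a log line (hypothetical shape)."""
    timestamp: float    # seconds
    utilization: float  # fraction of link capacity in use, 0.0 to 1.0


class UtilizationMonitor:
    """Fire an alert when average utilization over a recent window reaches a threshold."""

    def __init__(self, threshold: float, window_seconds: float,
                 on_alert: Callable[[float, float], None]) -> None:
        self.threshold = threshold
        self.window_seconds = window_seconds
        self.on_alert = on_alert
        self._window: Deque[Sample] = deque()
        self._alerting = False  # prevents re-sending the alert on every sample

    def observe(self, sample: Sample) -> None:
        self._window.append(sample)
        # Drop samples older than the window; the newest sample always remains.
        cutoff = sample.timestamp - self.window_seconds
        while self._window[0].timestamp < cutoff:
            self._window.popleft()

        average = sum(s.utilization for s in self._window) / len(self._window)
        if average >= self.threshold:
            if not self._alerting:
                self._alerting = True
                self.on_alert(sample.timestamp, average)
        else:
            # Utilization has recovered; allow a future alert.
            self._alerting = False


# Example: utilization climbs over 90 seconds (made-up readings).
alerts = []
monitor = UtilizationMonitor(
    threshold=0.8,
    window_seconds=60,
    on_alert=lambda ts, avg: alerts.append((ts, avg)),
)
for ts, util in [(0, 0.5), (30, 0.7), (60, 0.95), (90, 0.99)]:
    monitor.observe(Sample(ts, util))
```

With these made-up readings, only the last sample pushes the windowed average past the threshold, so a single alert is recorded. Averaging over a window rather than alerting on each reading is a design choice that trades a little latency for fewer false alarms from brief spikes; a real deployment would tune both values to its own traffic.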
This scenario once again highlights the importance of real-time responsiveness in a log management system. In short, the impending problem can be prevented only because a real-time alert allows IT to take action in advance.