Wipro Ltd.
Historically, Chief Security Officers and security managers have had a difficult time justifying security investments or identifying a specific return on investment. Business sponsors tend to become truly interested in security-related issues only after something has gone wrong. When system breaches, malware attacks, system downtime, or customer identity theft (something that's been in the news a lot lately) occur, you can bet that everyone wants to know what money was spent, by whom, and on which security practices, processes and tools. On the other hand, when security investments are made wisely, nobody notices: systems do not go down, no customer data is stolen or lost, and business continues as usual.
Of course, there are situations where diligence and prudent security investments get noticed. Let's call them the Acute and Chronic scenarios. Wipro's Product Strategy & Architecture (PSA) Practice recently conducted a study on the patch management practices and associated costs of 90 large enterprises in North America and Western Europe. The study found that organizations that proactively address Chronic challenges on a systematic, ongoing basis can significantly reduce the impact and cost of Acute events. The entire text of the study can be found here.
Acute scenarios arise when specific, disruptive incidents occur. These are rare and often unique events. For example, when very public, automated malware attacks such as Nimda, RedAlert, SQL Slammer, and Blaster (to name a few) happen, some organizations are incapacitated. Other firms escape unaffected, and most organizations experience something in between. In all cases, Acute events send IT organizations into flurries of frantic activity, followed by cleanup and restoration tasks that are often long and arduous, and eventually by post-mortems. The balance sheet for these situations is hard to work out, but informally the costs break down as follows:
- How much did it cost to effectively respond?
- How much did it cost to restore systems to their prior state?
- How much did it cost the business?
The Cost of Response
Assessing the immediate IT cost of effective incident response is fairly straightforward. Labor cost comes down to counting the people-hours spent on the job, from at least two different perspectives: the dedicated, specialized IT people whose job it is to respond, and the spur-of-the-moment, informal help that other IT (systems admin, developers, help desk, etc.) and non-IT (business users) people provide.
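As a rough illustration of that people-hours count, the sketch below totals response labor from both perspectives. The group names, hours, and hourly rates are hypothetical placeholders chosen for the example, not figures from Wipro's study.

```python
from dataclasses import dataclass


@dataclass
class LaborEntry:
    group: str          # who helped, e.g. "incident response team"
    dedicated: bool     # True if responding to incidents is this group's job
    hours: float        # people-hours spent on the incident
    hourly_rate: float  # assumed loaded hourly cost (hypothetical)


def response_cost(entries):
    """Return (dedicated_cost, informal_cost, total_cost) for one incident."""
    dedicated = sum(e.hours * e.hourly_rate for e in entries if e.dedicated)
    informal = sum(e.hours * e.hourly_rate for e in entries if not e.dedicated)
    return dedicated, informal, dedicated + informal


# Hypothetical incident: every number here is made up for illustration.
incident = [
    LaborEntry("incident response team", True, 120.0, 90.0),
    LaborEntry("systems administrators", False, 60.0, 75.0),
    LaborEntry("business users", False, 40.0, 50.0),
]

dedicated, informal, total = response_cost(incident)
print(f"dedicated: {dedicated:.2f}  informal: {informal:.2f}  total: {total:.2f}")
```

Keeping the two perspectives separate makes the informal help visible, which is the part most likely to be left out of a post-incident tally.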
The Cost of Restoration
Another measurement is the cost to IT of restoring affected systems to their prior functional state. Post-mortem meetings can establish how much time IT people spent fixing systems (clearing malware and restoring data and applications) and rebooting them, and how many systems they had to attend to. Of course, this does not take into account lost user data or systems that must be replaced as a result of difficulties encountered in the restoration process.
The Cost to the Business
This is the most hotly contested set of costs associated with security. Trying to answer the question "what is the cost of an hour of downtime?" is akin to asking "how long is a piece of string?" The truth is that it all depends on the specifics: What time of day? What kind of system? Who is affected? How severe was the incident? Does this interrupt service to the customer? What we have seen at Wipro is that many companies have key performance indicators attached to the particulars of system downtime in their business. The best ones do not get caught up trying to compare themselves with other businesses, but instead focus on measuring themselves relative to their own business goals, and the resources allocated to avoiding downtime in the first place.
By contrast, Chronic scenarios arise regularly. They are neither rare nor unique, which makes them easy to measure and makes investments in improvement just as easy to justify. The prime example of a Chronic scenario in enterprise security is the process of Security Patch Management. Though each patching situation is a response to a different set of threats or vulnerabilities, it is possible to break the entire process down into similar units of measurement. Wipro has measured how organizations conduct patching events from two related perspectives. One approach measures the elapsed time that it takes an organization to successfully complete what we call a "patching event". The second approach measures the cost of patching events in terms of IT effort.
Patching events occur when organizations deploy updates to executables, data, or system configurations that reduce or eliminate known vulnerabilities or bugs. The activities that make up a patching event are shown in Figure X.
Each firm has different practices for deciding when to initiate patching events, but they typically have the following characteristics:
- Patches are generally applied to systems after a period of testing, often requiring a restart of systems or services.
- Typically, more than one software patch is applied, and more than one vulnerability is closed during a single patching event.
- Successful completion of a patching event occurs when an organization deploys the patches to a predetermined percentage of systems.
Using patching events as the base unit of measurement allows firms to put boundaries around the patching process in a way that is not possible when looking at single patches or vulnerabilities. As one CSO puts it: "We all know how long it took us to complete last week's patching event. Which specific patches were included? I'd have to look that up."
Effort represents the total number of hours IT professionals work when they respond to a security threat and successfully complete an effective patching event. The amount of effort required to complete a patching event is a vital component of estimating annual IT operations costs. A common way to view the patching event effort is per system.
Wipro's research shows that the average per-system effort to complete a patching event for a Microsoft Windows PC is just under 20 minutes. That doesn't seem like much until you project it over thousands of systems.
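To make that projection concrete, here is a minimal sketch of the arithmetic. It uses 20 minutes per system as an upper bound on the study's figure; the fleet size and the number of patching events per year are hypothetical assumptions.

```python
def event_effort_hours(systems, minutes_per_system=20.0):
    """People-hours of IT effort for one patching event across a fleet."""
    return systems * minutes_per_system / 60.0


def annual_effort_hours(systems, events_per_year, minutes_per_system=20.0):
    """People-hours of IT effort for a year of patching events."""
    return event_effort_hours(systems, minutes_per_system) * events_per_year


# Hypothetical organization: 6,000 Windows PCs, one patching event per month.
fleet_size = 6000
events_per_year = 12

per_event = event_effort_hours(fleet_size)
per_year = annual_effort_hours(fleet_size, events_per_year)
print(f"per event: {per_event:.0f} hours, per year: {per_year:.0f} hours")
```

Under these assumptions a single patching event consumes 2,000 people-hours, and a year of monthly events consumes 24,000.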
The elapsed time to complete a patching event is also important, since it provides insight into the overall time that an organization is at risk of exploit. Wipro calls this "deployment days of risk" since firms that take longer to complete patching events leave their systems at risk for a longer period of time. Firms can dramatically reduce this risk, but only once they have begun to measure it consistently.
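One possible way to measure this consistently is to record how many systems have been patched by the end of each day and count the days until the predetermined completion percentage is reached. The sketch below does that. The function name, the daily-count input format, and the sample numbers are assumptions made for illustration, not Wipro's methodology.

```python
def deployment_days_of_risk(cumulative_patched, total_systems, target_percent):
    """Days from the start of a patching event until the target is reached.

    cumulative_patched[i] is the number of systems patched by the end of
    day i + 1. target_percent is a whole-number percentage (e.g. 95).
    Returns None if the target was never reached in the recorded data.
    """
    for day, patched in enumerate(cumulative_patched, start=1):
        # Integer comparison avoids floating-point rounding at the threshold.
        if patched * 100 >= total_systems * target_percent:
            return day
    return None


# Hypothetical patching event: 1,000 systems, complete at 95% deployed.
daily_totals = [200, 550, 800, 930, 960, 990]
days = deployment_days_of_risk(daily_totals, total_systems=1000, target_percent=95)
print(f"deployment days of risk: {days}")
```

Tracking the same number for every patching event gives a firm a baseline against which to judge whether new tools or processes actually shorten its exposure.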
Measuring the Measurable
Firms that can get a grip on key measures of Chronic scenarios, such as patching event costs and elapsed time, can use those measures to implement continuous improvement programs throughout the firm, correlating the measures with investments in security tools and new processes. These programs can in turn drive down response times and costs for Chronic as well as Acute security scenarios, creating a virtuous cycle of investment and return. Wipro's research shows that everyone wins when these proactive patch management principles and practices are implemented. The CSO can cost-justify the allocation of resources to security efforts, and the enterprise can lower overall risk.
Theo Forbath is the Chief Strategist & Practice Leader and Patrick Kalaher is a Senior Technical Architect consultant with Wipro's PSA unit. They can be reached at email@example.com and firstname.lastname@example.org.