May 01, 2001, 8:56 AM — An SLA (service-level agreement) is traditionally a contract between an organization and an external service provider, such as an ISP or ASP (application service provider), that mandates specific performance levels. But the usefulness of an SLA is not limited to outside services; SLAs can be used internally to define requirements for everything from help desk services to network performance and availability, application performance and availability, and internal processes.
Internal SLAs between IT and other departments provide numerous benefits to the entire organization. Managing expectations, boosting productivity, and increasing employee morale are all direct advantages. SLAs also provide indirect benefits. They can help the IT group prioritize work, and by giving IT an incentive to provide good service, they can lead to better overall system performance. They can also help foster good relations between IT and other departments.
Creating an internal SLA is a simple five-step process. The first step is to set up meetings between IT and department managers and define the requirements and expectations of each party. For example, the IT department may want two weeks to process a new user request, whereas the managers who make these requests would love to have a one-day turnaround. Discussions may determine that one day is not realistic for the IT department and two weeks is not satisfactory for the department managers. In this scenario, a one-week response time may be acceptable to both parties.
A response time, or any other service measurement covered by the SLA, must be agreed on by all the parties involved, and the specific requirements and expectations should be documented. Without a clear, detailed record of what everyone expects, the SLA will not provide a means of managing expectations and identifying responsibilities.
Performance metrics
The second step is to identify the metrics and define the baseline requirements that will be used to measure the response time, performance, and availability covered by the SLA. For any service, the metric chosen should be a key, quantifiable indicator of service quality. The metric should also be realistic.
For example, if you want to measure response time, avoid insisting that every request receive a response within 1 second. That's unrealistically strict. Instead, it would be better to state that 95 percent of requests must receive a response within 1 second and that the remaining 5 percent must receive one within 5 seconds.
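A tiered target like this one is straightforward to check against logged measurements. The Python sketch below is only an illustration under stated assumptions: the function name `meets_response_sla`, its parameters, and the idea that response times are available as a list of seconds are all hypothetical, and the 5-second figure is treated as a hard ceiling for the slowest requests.

```python
from typing import Sequence


def meets_response_sla(
    response_times_s: Sequence[float],
    target_s: float = 1.0,         # most requests must finish within this time
    target_fraction: float = 0.95,  # share of requests that must hit target_s
    ceiling_s: float = 5.0,         # assumed hard limit for the slowest requests
) -> bool:
    """Return True if the sample satisfies the tiered response-time target."""
    if not response_times_s:
        raise ValueError("no measurements to evaluate")

    # Count requests that met the primary target.
    within_target = sum(1 for t in response_times_s if t <= target_s)
    fraction_ok = within_target / len(response_times_s)

    # The slowest request must still fall under the ceiling.
    worst = max(response_times_s)

    return fraction_ok >= target_fraction and worst <= ceiling_s


if __name__ == "__main__":
    sample = [0.4] * 95 + [3.0] * 5  # 95 fast requests, 5 slower ones
    print(meets_response_sla(sample))
```

Expressing the target as a fraction plus a ceiling keeps the metric quantifiable while leaving room for the occasional slow request.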
The key to this step is finding quantifiable factors that are easily measured and analyzed. This can be very difficult, especially when dealing with network performance. You may not have any control over many variables and environmental factors that affect performance, availability, and ultimately the success of your SLA.
For example, an accident such as a severed Internet backbone, or simply Internet congestion, could affect network response time. The internal IT group has no control over such events, so these variables should be addressed in the SLA. If your method of measuring the service fails to take these factors into account, you may have difficulty enforcing the SLA.
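One way to account for such events is to leave them out of the measurement window when availability is calculated. The Python sketch below shows that idea only; the `Outage` record, the `within_it_control` flag, and the choice to subtract excused downtime from the measured period are assumptions made for illustration, and the parties to a real SLA would need to agree on which events qualify.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Outage:
    minutes: float
    within_it_control: bool  # False for, e.g., a severed Internet backbone


def availability_percent(period_minutes: float, outages: Iterable[Outage]) -> float:
    """Availability over the period, ignoring outages outside IT's control."""
    outages = list(outages)

    # Downtime the IT group is not held responsible for (assumed exclusion rule).
    excused = sum(o.minutes for o in outages if not o.within_it_control)
    # Downtime that counts against the SLA.
    counted = sum(o.minutes for o in outages if o.within_it_control)

    measured = period_minutes - excused
    if measured <= 0:
        raise ValueError("no measurable time left in the period")

    return 100.0 * (measured - counted) / measured


if __name__ == "__main__":
    month = 30 * 24 * 60  # minutes in a 30-day month
    events = [
        Outage(minutes=90, within_it_control=True),    # hypothetical server failure
        Outage(minutes=240, within_it_control=False),  # hypothetical backbone cut
    ]
    print(f"{availability_percent(month, events):.2f}%")
```

Recording whether each outage was within the IT group's control at the time it occurs makes the exclusion auditable later, which helps when the SLA has to be enforced.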
Sticks and carrots
Once the requirements and metrics are defined, the next step is to come up with a system of rewards and penalties for compliance and noncompliance. An unenforceable SLA serves little purpose. It is all well and good to say that all requests should have a 1-second response time, but if the group responsible for system performance does not incur any penalties for slower response times or reap any rewards for faster ones, then it has no real incentive to comply.













