http://www.itsmwatch.com/itil/article.php/3801266/The-Evolution-of-Incident-Management.htm
By George Spafford
Feb 6, 2009

For years, real-world ITSM practitioners knew there were challenges with how Incident Management attempted to incorporate service requests and alerts from monitoring tools. As a result, they developed their own practices. Now, with ITIL v3, the Incident, Service Request and Event Management processes are independent, and that is a great thing. Let's step through each area and review why this is the case.

Service Requests

Traditionally, users who needed something contacted the service desk or opened their own incident record via a self-service portal. While it could be done, it was problematic. Requests for information, employee moves, and so forth were lumped in with real incidents.

To explain the challenge, an incident is a disruption to a service or something that may disrupt a service. A service request, on the other hand, can be viewed as everything else. Organizations found that if service requests were handled by the same group doing break/fix, the requests would often be handled at a lower priority, and the requestor would have to wait unpredictable amounts of time while incidents were handled first. This unpredictable performance damaged customer satisfaction and IT's credibility.

Now, by having a dedicated service request process, organizations can design the process in accordance with their requirements. For example, service requests can be used to initiate new user requests, hardware moves, additional software, and so forth, which then follow the appropriate workflows and integrate with other processes properly. This results in a more refined single point of entry into IT.

From an organizational perspective, some IT groups have further improved service by splitting the management of service requests and incidents into two different groups. This way, incidents have dedicated staff and so do the requests.
As a result, the predictability of service requests being handled in a timely and efficient manner increases, as does customer satisfaction.

Event Management

In ITIL v2, practitioners realized they needed to route alerts from monitoring tools into the Incident Management process and the supporting software. The traditional method of sending pages or emails directly to specific staff could cause incidents to be delayed or completely overlooked if someone was very busy, out of the office sick, or otherwise unavailable. Instead, by opening an incident record, there could be escalation, metrics collected and so forth. The approach was very dependent on the process and technology teams implementing the integration because there wasn't formal guidance.

Part of the challenge lies in the creation of alerts and responses. Incident Management stakeholders identify as many alerts as they can based on experience. When new or changed services are implemented, alerts are defined as incidents are experienced, i.e., in reaction to them: "If this happens again, we can detect it by monitoring X and will send message Y to tool Z." Sometimes, more emphasis is given to detecting incidents than resolving them.

Now, with Event Management, there is a robust process that begins in Service Design and flows into Service Transition to understand, for each service, what the potential incidents may be and to define them as events before going live in production. Detection criteria and responses, ranging from automatic to console alerts with manual handling, are formally defined. The monitoring and ITSM tools are then configured accordingly in a proactive manner. Of course, as new incident types are experienced, the criteria and actions to detect and respond are formally documented and enacted in the tools as well.

These documented approaches then give staff involved with incident and problem resolution ready access to symptoms and resolutions. They also allow automatic responses to be planned and implemented. For example, if a memory leak requires that a particular server be rebooted regularly, then why not establish an automated procedure to detect that the server is reaching a state that requires rebooting, wait until an off-hours time and then trigger an automatic reboot? With today's tools, automated responses are becoming increasingly feasible. This new approach allows IT to be far more proactive.
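
To make the memory-leak example concrete, here is a minimal sketch of how one event definition might pair a detection criterion with a range of responses, from automatic to a console alert for manual handling. It is illustrative only: the names `RebootRule` and `Response`, the 85% and 97% thresholds, and the 01:00 to 05:00 off-hours window are all assumptions made for this sketch, not anything specified by ITIL or by a particular monitoring tool.

```python
from dataclasses import dataclass
from datetime import datetime, time
from enum import Enum


class Response(Enum):
    """Possible outcomes of evaluating the event rule (hypothetical names)."""
    NONE = "none"                    # detection criterion not met
    DEFER_REBOOT = "defer_reboot"    # criterion met, wait for off-hours
    REBOOT = "reboot"                # automatic response
    CONSOLE_ALERT = "console_alert"  # hand off to an operator


@dataclass(frozen=True)
class RebootRule:
    """One event definition: a detection criterion plus its responses."""
    memory_threshold_pct: float = 85.0  # assumed "reboot needed soon" level
    critical_pct: float = 97.0          # assumed "cannot wait" level
    window_start: time = time(1, 0)     # assumed off-hours window start
    window_end: time = time(5, 0)       # assumed off-hours window end

    def in_off_hours(self, now: datetime) -> bool:
        """True when 'now' falls inside the off-hours window."""
        return self.window_start <= now.time() < self.window_end

    def evaluate(self, memory_used_pct: float, now: datetime) -> Response:
        """Map a memory reading and the current time to a response."""
        if memory_used_pct < self.memory_threshold_pct:
            return Response.NONE
        if self.in_off_hours(now):
            return Response.REBOOT
        if memory_used_pct >= self.critical_pct:
            # Too urgent to wait for the window: let a person decide.
            return Response.CONSOLE_ALERT
        return Response.DEFER_REBOOT


if __name__ == "__main__":
    rule = RebootRule()
    # Mid-afternoon reading above the threshold but below critical.
    print(rule.evaluate(90.0, datetime(2009, 2, 6, 14, 30)))
    # The same reading inside the off-hours window.
    print(rule.evaluate(90.0, datetime(2009, 2, 6, 2, 15)))
```

In this sketch the rule only decides what should happen; actually rebooting a server or raising a console alert would be left to whatever monitoring and ITSM tools are in place. Keeping the thresholds and the window as data rather than logic is one way to let them be agreed during design and adjusted later without changing the procedure itself.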

Knowledge about incidents gained during development and testing is carried forward into operations, shortening the learning curve normally associated with new, or heavily changed, services. This, then, results in improved mean time to repair (MTTR), service availability and customer satisfaction.

In closing, the separation of the Service Request and Event Management processes from the Incident Management process is a great move by the authors of v3. IT organizations that are currently following a v2 approach can look to the new processes to identify potential improvement opportunities. At the same time, groups looking to begin ITIL will be well served to look at these three processes and consider the costs and benefits in their unique situation. Most groups will get very real benefits from this combination of processes.

George Spafford is a principal consultant with Pepperweed Consulting and a long-time IT professional. George's professional focus is on compliance, security, management and overall process improvement.