
The Challenges of RCA in ITIL and the "New" Deming Cycle

Jun 5, 2008

Jan Vromant

High FCR and Poor pPM

Understanding First Call Resolution (FCR) metrics can be a major hurdle. As described in the above example, a high FCR can be the sign of a poorly functioning pPM. Most outsourcing contracts specify the FCR as a service level. A typical contractual phrasing is, “First Call Resolution will be greater than or equal to 75%.” Such a service level can undermine the use of pPM, because pPM normally drives down the FCR. You actually want the FCR to go down, as a decline can indicate a well-functioning pPM and a shrinking number of easy tickets over time. (A falling FCR might also point to declining service desk performance caused by, for example, skyrocketing attrition rates.) Improvement is possible only with a good reporting system and a keen understanding of the metrics and service levels of your IT operations.
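The effect of pPM on the FCR can be illustrated with a small calculation. The following is a minimal sketch, not a standard formula from any contract: it assumes FCR is simply the share of tickets resolved on first call, and the ticket counts and the function name `first_call_resolution` are hypothetical.

```python
def first_call_resolution(resolved_first_call: int, total_tickets: int) -> float:
    """Return FCR as a fraction of all tickets (0.0 when there are no tickets)."""
    if total_tickets == 0:
        return 0.0
    return resolved_first_call / total_tickets


# Hypothetical month before pPM: 800 of 1,000 tickets are resolved on first call.
before = first_call_resolution(800, 1000)

# Hypothetical month after pPM eliminated 400 recurring "easy" tickets,
# all of which used to be resolved on first call.
after = first_call_resolution(800 - 400, 1000 - 400)

print(f"FCR before pPM: {before:.1%}")
print(f"FCR after pPM:  {after:.1%}")
```

In this made-up scenario the service desk handles fewer tickets and nothing has gotten worse, yet the FCR drops from 80% to roughly 67% and would breach a contractual 75% threshold.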

Link with Change Management

A poor change management process can be another hurdle because of the cycle of Incident Management ► Problem Management ► Change Management. For example, the weekly Change Advisory Board (CAB) may lack members with sufficient financial knowledge, so the return on investment (ROI) of a particular change, in terms of its effect on the number of incidents, is poorly understood.

Proactive changes often fall through the cracks in a typical CAB meeting because they lack the urgency of changes tied to immediate operational needs and the prominence of mega-projects. In addition, the documentation of proactive changes often fails to convey the financial benefits of the proposed change and instead focuses on technological gobbledygook.


There are several countermeasures that you can put in place to boost your pPM efforts:

1. Understand How to Get to the Root Cause

There is a relatively easy way to tell when you have reached the root cause. You know you have gotten to the root of the problem when the elimination of that issue, through a formal change, will eliminate the recurrence of that set of tickets. When you dive deep enough into the root causes of most issues, the final answer you will find is usually “because we are human.”

Some examples include:

· Programming an application and failing to correct known bugs due to time constraints;

· Switching off the electricity of a major data center with 635 servers because of work on the air-conditioning system; and

· Not finding the “any” key on the keyboard.

A similar situation is when you determine that the problem is caused by something out of your control. The following example conveys the idea:

I know that my third-party application freezes whenever I do a certain thing. I have no idea what code is causing it, but I have reported it to the vendor, who provides me with a patch that fixes it. Do I know the root cause? No. Have I solved the problem and prevented its recurrence? Yes.

The real or fundamentally correct root cause does not matter, as long as you can eliminate the tickets and enhance your end-user productivity. The moment you can identify any change that costs less than the cost of the incidents it eliminates, you have a mini-investment project with a positive ROI.
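The comparison between the cost of a change and the cost of the incidents it eliminates can be sketched as a simple ROI calculation. This is only an illustration under stated assumptions: a one-year horizon, a flat average cost per incident, and hypothetical figures; the function name `change_roi` is invented for this example.

```python
def change_roi(change_cost: float, incidents_per_year: int,
               cost_per_incident: float) -> float:
    """One-year ROI: (avoided incident cost - change cost) / change cost."""
    if change_cost <= 0:
        raise ValueError("change_cost must be positive")
    avoided_cost = incidents_per_year * cost_per_incident
    return (avoided_cost - change_cost) / change_cost


# Hypothetical figures: a 5,000 change removes 300 tickets a year
# that cost 25 each in support time and lost end-user productivity.
roi = change_roi(change_cost=5000.0, incidents_per_year=300,
                 cost_per_incident=25.0)

print(f"ROI: {roi:.0%}")
print("Worth proposing to the CAB" if roi > 0 else "Not worth it on cost alone")
```

Any change with a result above zero pays for itself within the chosen horizon, which is the kind of figure a CAB can weigh against other requests.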

2. A Good Tool

Trying to implement pPM can be painful if you don’t have an effective Incident and Problem Management tool. Preferably, the tool should be linked with your Change and Configuration Management process. The absence of a tool makes the RCA research and the implementation and documentation of your pPM efforts particularly difficult in two areas.

The first area is the categorization of the incidents. An effective tool will make the proper categorization of the incidents mandatory. The tool can then spot and report the Top 10 incident categories and help you prioritize your pPM activities. The second area is the linkage with Change and Configuration Management, which will assist in the seamless changing and updating of your environment to reduce the likelihood of recurrence of the incidents.
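The Top 10 report described above amounts to counting tickets per category. The sketch below assumes a hypothetical export in which every ticket is a dictionary with a mandatory `category` field; real Incident Management tools will have their own data model and reporting features.

```python
from collections import Counter


def top_categories(tickets, n=10):
    """Return the n most frequent incident categories with their ticket counts."""
    counts = Counter(ticket["category"] for ticket in tickets)
    return counts.most_common(n)


# Hypothetical ticket export; field names and values are illustrative only.
tickets = [
    {"id": 1, "category": "password reset"},
    {"id": 2, "category": "printer"},
    {"id": 3, "category": "password reset"},
    {"id": 4, "category": "vpn"},
    {"id": 5, "category": "printer"},
    {"id": 6, "category": "password reset"},
]

for category, count in top_categories(tickets, n=2):
    print(f"{category}: {count}")
```

The categories at the top of such a list are natural candidates for root cause analysis, because eliminating them removes the most tickets.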

3. Process

A well-defined process structure with clear roles and responsibilities, metrics, and strong cross-functional links between users, support personnel, and service providers is the foundation of pPM.

Accountability: In an internal IT environment, a way to tackle the challenge of pPM is to assign responsibility for its implementation to an individual, not a committee! In addition, you should define appropriate metrics to judge the performance of the pPM process. These metrics should then flow into the personnel performance review. This is the “throat-to-choke” principle. You should be on the road to improvement the moment you define the accountability and the consequences for not reaching mutually agreed-upon goals.