Changes, Incidents & Unintended Consequences

It always happens when you least expect it, but there are some simple things you can do today to get away from unintended consequences, writes ITSMWatch columnist Jason Drubert of BT Consulting.
To help illustrate the relationship try this technique: plot on the same axis your total number of major incidents per month (going back six months to a year) and the total number of change records per month for the same period. I call this chart the Unintended Consequences Index. And I try to do one every time I work with a new organization. Nine times out of ten, the graph looks something like this:
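The charting step above can be sketched in a few lines. This is a minimal illustration with hypothetical record dates (in practice these would come from your ITSM tool's change and incident exports); it tallies both record types per month so the two series share one axis:

```python
from collections import Counter

# Hypothetical record-opened dates, as exported from an ITSM tool.
major_incident_dates = ["2010-01-04", "2010-01-19", "2010-02-08",
                        "2010-02-23", "2010-03-15"]
change_dates = ["2010-01-02", "2010-01-11", "2010-01-25",
                "2010-02-03", "2010-02-17", "2010-03-09"]

def monthly_counts(dates):
    """Tally records per YYYY-MM month so both series share one x-axis."""
    return Counter(d[:7] for d in dates)

incidents = monthly_counts(major_incident_dates)
changes = monthly_counts(change_dates)

# Print the two series side by side; these pairs are what you would plot.
for month in sorted(set(incidents) | set(changes)):
    print(month, changes[month], incidents[month])
```

The paired monthly totals can then be fed to any charting tool to draw the two lines on the same axis.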
In this situation, is there any doubt about a relationship between changes and incidents? At first glance, the lines have a similar shape, suggesting that changes raised to resolve incidents may have resulted in this correlation. However, in my experience, the data have never once shown this to be the case.
In the few cases where I have seen a poorly defined relationship, another overwhelming driver of major incidents was the cause, not effective change management. For example, in one case, unstable power and severe weather in Latin America caused more than half of all major incidents, drowning out any effect that wayward changes might have had.
If, as a metric, you are already tracking the number of incidents related to changes, then I would expect the graphs plotted on the same axis to have a similar shape. If not, further investigation would certainly be warranted.
There are a couple of ways to turn this concept into a process metric. On one hand, you can divide the number of major incidents by the number of changes to get a ratio. On the other, if you are more statistically minded, you can calculate the correlation coefficient of the two series. While these are useful, neither is as effective as simply showing the picture. When you present a graph like the one above during a meeting, people are genuinely taken aback. I have even been accused (half-jokingly) of doctoring the data to make a point.
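Both metrics are simple to compute. Here is a sketch with hypothetical monthly totals (the counts are invented for illustration; the Pearson formula itself is standard):

```python
from math import sqrt

# Hypothetical monthly totals over six months, plotted on the same axis.
changes   = [120, 95, 140, 80, 160, 100]
incidents = [12, 9, 15, 7, 17, 10]

# Ratio metric: major incidents per change, over the whole period.
ratio = sum(incidents) / sum(changes)

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

r = pearson(changes, incidents)
print(f"incidents per change: {ratio:.3f}, correlation: {r:.2f}")
```

A correlation near 1.0 means the two lines rise and fall together, which is exactly the pattern described above; the number, however, rarely lands with an audience the way the picture does.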
I have found that this relationship holds true even for companies that have good change management by ITIL standards: regular CAB meetings, minimal unauthorized changes, and well-defined approval processes. So, if a company is doing change control by the book, why would it see this relationship?
In rough order of occurrence, here is my list of the most common reasons:
- Evaluation of changes is based on procedural correctness as opposed to potential risk and impact.
- Information needed to properly evaluate the potential impact of a change is unavailable or inaccurate.
- Inadequately tested changes and releases, often because of a lack of suitable non-production test environments or a perceived lack of time.
These items are not mutually exclusive, and there are plenty of other possible reasons.
So, you have made this nifty graph to show what you probably knew intuitively: changes often result in unintended consequences. Perhaps you have an inkling as to why, but what should you do about it?
Analyze the Data
The graph will generate questions. Be able to tell a story to go along with the picture. As part of the proactive problem management process, understand the data and associate specific changes with specific incidents. Graphing by week instead of month may help add clarity.
Share the Information
Once the relationship is clearly demonstrated, it is more difficult to ignore the issue and easier to justify challenging changes and allocating the time and resources to test them properly.
Post-Implementation Reviews
A change has been implemented. What are you doing to ensure future changes benefit from the experience, good or bad? Review all of your most significant changes, along with any failed or problematic changes identified through the data analysis above; others can be selected at random. Evaluate each to determine what could have been done better, then follow up on those findings.
Major Incident Reviews
Once a major incident has been resolved, what are you doing to keep similar incidents from occurring again? If you aren't sitting down as a group, closely examining what happened, and taking action based on your findings, you should not be surprised when history repeats itself. This, too, should be part of the problem management process. Take great care to prevent these reviews from resembling the Spanish Inquisition in any way; keep them positive and forward-looking to the extent possible.
Repeat
Conduct reviews regularly, at least monthly, and continue graphing the data. If your efforts are effective, the lines of the graph should diverge and the patterns should become less similar. If you produce the graph only once, you will never know.
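One way to watch for that divergence is to recompute the correlation over a sliding window of months; if the improvement efforts are working, the windowed coefficient should trend downward. A sketch, again with hypothetical counts (here the relationship weakens in the second half of the year):

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs)) or 1.0  # avoid divide-by-zero
    sy = sqrt(sum((y - my) ** 2 for y in ys)) or 1.0
    return cov / (sx * sy)

# Hypothetical 12 months: incidents stop tracking changes mid-year.
changes   = [120, 95, 140, 80, 160, 100, 130, 90, 150, 85, 155, 95]
incidents = [12, 9, 15, 7, 17, 10, 9, 8, 8, 7, 6, 7]

WINDOW = 6
for start in range(len(changes) - WINDOW + 1):
    r = pearson(changes[start:start + WINDOW],
                incidents[start:start + WINDOW])
    print(f"months {start + 1}-{start + WINDOW}: r = {r:.2f}")
```

A falling windowed coefficient is the numerical counterpart of the lines on the graph drifting apart.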
None of the actions I have recommended here are drastic. You may eventually determine you need to hire consultants, add bureaucracy, buy software, or make changes to the organizational chart. However, none of these should be the first step; unfortunately, they too often are.
The first step should simply be making someone accountable for checking the results of each process and acting on the findings to improve those processes incrementally over time. When there is discussion of plan, do, check, act, we all nod our heads in agreement. In most cases, however, an organization's quality circle is heavily weighted toward the plan and do side and flat on the check and act side.
The No. 1 excuse for not performing the action items listed above is insufficient staffing, yet I maintain that every organization has enough staff to review at least a couple of changes and major incidents each month. That's all it takes to get the process-improvement ball rolling.
The ability to implement the timely changes the business needs, with minimal unintended consequences, is perhaps the Holy Grail of ITIL. If you are to succeed at this epic quest, my suggestion is to start by figuring out where you are.
Jason Druebert is a consultant with BT Professional Services. Jason has extensive experience in ITSM, IT operations, and project management.