Let’s talk about incident and problem management. So here’s the thing. ITIL has been around since the late 1980s. We’re currently on version four (ITIL 4), but while there are books, courses, and blog posts galore about ITIL, there’s still real confusion about where incident management stops and problem management begins, and about the difference between the two. If it were just a terminology issue, I wouldn’t be so worried about it, but the reality is that confusion about incident and problem management hurts us all.
If there’s confusion about the incident and problem management practices/processes, then we need to sit down and explain the difference between incidents and problems. Otherwise, the same incidents keep recurring, we stay reliant on individual heroes to fix things, no root cause analysis gets done (so nothing gets fixed permanently), and opportunities for continual improvement are missed.
We’ve all heard of Batman vs. Superman (the comics and the film), and to help end the incident and problem management confusion once and for all, I like to talk about: Batman vs. Columbo.
Batman vs. Columbo Explained
Let’s get back to basics with incident and problem management. Incident management is the process that restores service as quickly as possible, with as little adverse impact as possible. In other words, incident management personnel are the superheroes of the IT service management (ITSM) world, swooping in like Batman to save the day, i.e. to get operations and the business back up and running.
The primary objectives of problem management, by contrast, are to eliminate recurring incidents by addressing their underlying causes (problems) and to minimize the impact of incidents that cannot be prevented. In other words, problem management people pop up after normal service has been restored and, like Columbo, act as detectives to figure out what happened, what caused things to go wrong, how it was fixed, and how to stop recurrence. And, hopefully, no dead bodies are discovered.
Hopefully, this incident and problem management analogy helps 🙂
Top Tips for Incident and Problem Management In the Real World
So now that I’ve explained the difference between incident and problem management, here are my top tips for getting them right.
1. Capture the Right Information for Problem Management
Incident records are about service restoration, or break-fix as it’s commonly known. The following information is typically needed for an incident record (and hopefully, much of it will already be held in your help desk or ITSM tool):
- Contact details
- Employee ID
- Asset tag
- VIP/critical user status
- Service affected
- Assigned teams
- Resolution details
- Fix details
- Related problem record
Problem records and problem management, on the other hand, are all about root cause analysis. Problem records will typically contain the following information:
- Description of issue
- Service affected and business impact
- Remedial actions to date
- Support team details
- Root cause analysis
- Meeting minutes
- Next steps
- Related incidents
- Related changes
So don’t muddle the two, as this affects your incident and problem management capabilities.
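As a rough illustration of keeping the two record types separate, the field lists above could be modeled as two distinct structures that reference each other only by ID. This is a minimal Python sketch, not the schema of any particular ITSM tool; every class, field, and function name here is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class IncidentRecord:
    """Break-fix record: who is affected, what is down, how it was restored."""
    incident_id: str
    contact_details: str
    employee_id: str
    asset_tag: str
    service_affected: str
    vip_user: bool = False
    assigned_teams: List[str] = field(default_factory=list)
    resolution_details: str = ""
    fix_details: str = ""
    related_problem_id: Optional[str] = None  # a link only; no root cause analysis lives here


@dataclass
class ProblemRecord:
    """Root cause record: why it happened and how to stop it recurring."""
    problem_id: str
    description: str
    service_affected: str
    business_impact: str
    remedial_actions: List[str] = field(default_factory=list)
    support_teams: List[str] = field(default_factory=list)
    root_cause_analysis: str = ""
    meeting_minutes: List[str] = field(default_factory=list)
    next_steps: List[str] = field(default_factory=list)
    related_incident_ids: List[str] = field(default_factory=list)
    related_change_ids: List[str] = field(default_factory=list)


def link_incident_to_problem(incident: IncidentRecord, problem: ProblemRecord) -> None:
    """Cross-reference the two records without copying data between them."""
    incident.related_problem_id = problem.problem_id
    if incident.incident_id not in problem.related_incident_ids:
        problem.related_incident_ids.append(incident.incident_id)
```

The design choice worth noting is that restoration details stay on the incident and investigation details stay on the problem, with the link being the only thing they share.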
2. Have Separate Roles for Incident and Problem Management
Be organized with incident and problem management such that there’s no duplication or wasted effort. In short, the incident manager is concerned with speed, whereas the problem manager is concerned with investigation and diagnosis to improve the quality of the end-to-end service.
Key priorities for the incident manager will include coordinating the incident, managing communications with both technical support teams and business customers, and ensuring that the issue is fixed ASAP. The problem manager, by contrast, will focus on root cause investigation, trending (has this issue appeared before?), finding a fix (interim workarounds and permanent resolution), and ensuring that any incident and problem management lessons learned are documented and acted on.
3. Have Defined Handover Points Between Incident and Problem Management
It’s really important to keep an eye on business-as-usual operations, as seemingly small incidents can spiral out of control and hurt availability levels and customer satisfaction. Simple things can make a big difference here. For example, placing a whiteboard near the service desk with a list of the top ten problems makes it easy for service desk analysts to link incidents to problems so that trends can be identified later on. Or, if the service desk has a team meeting, ask the problem manager (or equivalent) to attend and brief the team on any new problems, plus progress and workarounds for existing ones.
Finally, don’t forget to close the problem management loop and let the service desk know when a problem record has been fixed and closed. There’s nothing worse for a service desk than to have to call a list of customers about an issue that was sorted out months ago (which happens when incident and problem management don’t work together well).
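The whiteboard and the closed loop can both be approximated in a few lines once incidents carry a link to their problem. The sketch below is hypothetical: it assumes incident-to-problem links are available as plain problem IDs (or `None` for unlinked incidents), and the function names are made up for illustration rather than taken from any ITSM tool.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional, Tuple


def top_problems(
    incident_links: Iterable[Optional[str]], limit: int = 10
) -> List[Tuple[str, int]]:
    """Rank problems by number of linked incidents; unlinked incidents (None) are skipped."""
    counts = Counter(pid for pid in incident_links if pid is not None)
    return counts.most_common(limit)


def incidents_to_update(
    open_incidents: Dict[str, Optional[str]], closed_problem_id: str
) -> List[str]:
    """Return the IDs of open incidents linked to a problem that has just been closed."""
    return sorted(
        iid for iid, pid in open_incidents.items() if pid == closed_problem_id
    )
```

The first function gives the service desk its top ten list; the second gives the problem manager a list of incidents to tell the service desk about when a problem is closed, so nobody ends up calling customers about an issue that was already sorted out.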
4. Don’t Forget Continual Improvement of Incident and Problem Management
Get proactive! Get incident and problem management to work as a team to review service performance throughout the month. Have a process to automatically raise a new proactive problem record if availability targets are threatened, so that action can be taken to prevent further issues. Don’t just sit there waiting to fail the SLA. And keep moving forward: small incremental improvements can really build up over time, an effect often described as the aggregation of marginal gains.
Build continual improvement into your incident and problem management processes by looking for “own goals” when reviewing major incidents and capturing lessons learned when resolving problems and known errors. Add them to your improvement register so that these improvement ideas are documented, tracked, and, most importantly, acted on. This matters especially for problem management.
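One way to picture the automatic trigger is as a downtime budget check. The sketch below is an illustration under stated assumptions, not a prescribed method: the 99.9% target, the 75% warning threshold, and all names are hypothetical, and a real process would raise the problem record in your ITSM tool rather than return a flag.

```python
from dataclasses import dataclass


@dataclass
class AvailabilityStatus:
    availability_pct: float
    remaining_downtime_min: float
    raise_proactive_problem: bool


def check_availability(
    downtime_min: float,
    period_min: float,
    target_pct: float = 99.9,      # assumed SLA target
    warn_fraction: float = 0.75,   # assumed share of the budget that counts as "threatened"
) -> AvailabilityStatus:
    """Flag a proactive problem once most of the period's downtime budget is used."""
    if period_min <= 0:
        raise ValueError("period_min must be positive")
    # Downtime the SLA allows over the whole period.
    budget_min = period_min * (1 - target_pct / 100)
    availability = 100 * (1 - downtime_min / period_min)
    threatened = downtime_min >= warn_fraction * budget_min
    return AvailabilityStatus(
        availability_pct=availability,
        remaining_downtime_min=budget_min - downtime_min,
        raise_proactive_problem=threatened,
    )
```

For a 30-day month (43,200 minutes) and a 99.9% target, the budget is roughly 43.2 minutes of downtime. With these assumed settings, 35 minutes of downtime would trip the flag while the SLA is still technically being met, which is the point: act before the target is failed, not after.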
5. Don’t Forget About Major Incidents in All This Incident and Problem Management Talk!
So many folks, incident and problem management personnel alike, panic and throw all their practices out the window when faced with a particularly challenging major incident, so let’s figure out a game plan in advance. Your service desk management team and incident managers are best placed to deal with major incidents. They deal with reactive and painful stuff day in and day out, so they’re best positioned to handle the initial firefighting and fix efforts. Busy, fast-paced, reactive activities are their happy place, so let them do what they do best: get a handle on the issue and fix it as quickly, and with as little adverse effect on the business, as possible.
But incident management doesn’t get to have all the glory! When the service has been restored and everyone has calmed down, what can add real value afterward is some effective root cause analysis. Enter your problem management team. Sometimes more can be accomplished in a calm, safe, non-judgmental space where everyone looks at what went wrong, how it was fixed, whether there were any lessons learned, and if there is anything that can be done to prevent a repeat occurrence rather than weeks of reactive firefighting. So, make space for both incident and problem management practices when dealing with major incidents.
6. Communication Is Everything
Incident and problem management are two sides of the same coin. One is focused on speed and efficiency; the other is more detail-oriented and looks after long-term service quality. Both are needed to manage and maintain excellent service levels, so ensure both teams communicate well. Have problem management team members attend service desk team meetings to give updates on problems, known errors, and workarounds. Conversely, have service desk representation at problem investigation meetings so that the business impact can be fully understood and any new technical details or fault symptoms can be captured.
In Summary: Incident and Problem Management
Done well, incident management fixes things quickly, effectively, and safely, and problem management complements that fix effort in the aftermath by confirming the root cause and preventing a recurrence.
The value of both incident and problem management processes grows considerably when they work together rather than in silos. When incident management is combined with problem management, it takes your IT support offering to the next level: instead of just focusing on break-fix, you’re moving to a model that looks at the underlying cause(s) and how to make things better for the customer.
Improved first-time fix rates, improvement in overall service quality, and increased customer satisfaction? Deal us in for problem management! Just make sure that you have both incident and problem management 🙂
Vawns Murphy holds qualifications in ITIL V2 Manager (red badge) and ITIL V3 Expert (purple badge), and also has an SDI Managers certificate. Plus she holds further qualifications in COBIT, ISO 20000, SAM, PRINCE2, and Microsoft. In addition, she is an author of itSMF UK collateral on Service Transition, Software Asset Management, Problem Management and the "How to do CCRM" book. She was also a reviewer for the Service Transition ITIL 3 2011 publication.
In addition to her day job as a Senior ITSM Consultant at i3Works, she is also an Associate Analyst at ITSM.tools.