Let’s talk about AIOps and IT service management (ITSM). It’s becoming a truism that the pace of change in modern life is faster than ever, and that it keeps accelerating. IT is a big part of the reason for that acceleration, as it becomes more and more interwoven with our daily lives. Artificial intelligence (AI) and machine learning, once obscure academic subjects, are now the talk of dinner tables and cocktail parties.
The downside for us as IT professionals is that our own domain is not exempt from this acceleration: physical infrastructure was virtualized fairly slowly, then cloud management frameworks were layered on top. Soon it was no longer virtual servers being managed, but rather containers, and now we talk about application architectures that are completely serverless. Each wave of innovation comes closer on the heels of the one before, often before we have had time to come to grips with all the implications of the last technology but one.
New Technologies and IT Support
There is a danger that the constant churn of new shiny objects can distract attention and resources from what is already there and working: the IT service desk. This is dangerous because, whatever is going on behind the scenes, the service desk remains the first point of contact between customers and IT.
There have been all sorts of attempts to move away from a reliance on human service desk agents over the years. Self-service is great – when it works. It tends to function well for routine tasks, which by definition constitute a large proportion of the load on a typical service desk. However, there will always be edge cases that cannot easily be addressed through a frequently asked questions (FAQ) list, interactive voice response (IVR) tree, or self-service portal.
These days, the oft-suggested fix is to add an AI chatbot that can guide people through the options available to them. The trouble is that chatbots often fail on unusual cases or unexpected phrasing, leading to user frustration.
There needs to be an escalation path to flexible human expertise, with its ability to adapt and understand situations that may be slightly outside normal parameters. Equally, it’s important to reserve those valuable (and expensive!) people for situations where they are genuinely required.
The way to square this circle is to integrate the IT service desk with the new technologies being considered. In most industries, the word “legacy” is a positive one; IT on the other hand treats it with disdain at best. Instead, we need to consider what is a valuable legacy, and what is ballast that needs to be discarded.
Getting Rid of Noise and Distractions with AIOps
One of the big issues IT operations teams face is “noise”: too many alarms, too many tickets – and too few of them actually useful. Instead of irritating chatbots, AIOps – the application of AI and machine-learning technologies to IT operations – can help filter those event streams to avoid overloading front-line operators with meaningless incident reports.
With less noise to wade through, operators can detect, diagnose, and resolve issues before end users are even aware of them, lessening the need for those users to call the IT service desk for help in the first place. ChatOps does have a place, enabling a more fluid merging of human-to-human and human-to-machine communication, but it takes knowledge and experience to take full advantage of it. The onus should not be placed on the end user to know what questions to ask the chatbot. On the other hand, busy operations staff can really benefit from having useful information brought to their attention.
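To make the idea of filtering an event stream concrete, here is a minimal sketch. It is a simple rule-based stand-in, not the machine-learning approach an actual AIOps product would use: it drops low-severity events and suppresses repeats of the same alarm within a time window. The `Event` fields, the `filter_noise` function, and the default thresholds are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical monitoring event; field names are illustrative."""
    timestamp: float  # seconds since an arbitrary start
    source: str       # host or service that raised the event
    check: str        # name of the failing check
    severity: str     # "info", "warning", or "critical"


def filter_noise(events, window_seconds=300.0, min_severity="warning"):
    """Keep only events worth an operator's attention.

    Drops events below min_severity, and drops repeats of the same
    (source, check) pair seen within window_seconds of the last kept one.
    """
    rank = {"info": 0, "warning": 1, "critical": 2}
    last_kept = {}  # (source, check) -> timestamp of the last kept event
    kept = []
    for event in sorted(events, key=lambda e: e.timestamp):
        if rank[event.severity] < rank[min_severity]:
            continue  # below the severity threshold
        key = (event.source, event.check)
        previous = last_kept.get(key)
        if previous is not None and event.timestamp - previous < window_seconds:
            continue  # repeat of an alarm the operator has already seen
        last_kept[key] = event.timestamp
        kept.append(event)
    return kept


if __name__ == "__main__":
    stream = [
        Event(0, "db1", "disk", "critical"),
        Event(60, "db1", "disk", "critical"),   # repeat inside the window
        Event(90, "web1", "cpu", "info"),       # too low in severity
        Event(100, "web2", "latency", "warning"),
        Event(400, "db1", "disk", "critical"),  # window has expired
    ]
    for event in filter_noise(stream):
        print(event)
```

Even a crude filter like this shows the principle: the operator sees three events rather than five, and each one says something new.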
Addressing Complexity with AIOps
The main reason for that high noise level in IT operations is the huge, and still-growing, complexity of modern IT environments. Companies are expecting orders-of-magnitude increases in the number of devices in their IT environments – and those devices are changing faster and faster, often without direct operator intervention, simply in response to changes in user demand.
As a consequence, the old assumption that each incident has one root cause and one owner no longer holds true. Foreseeable root causes have generally been addressed already, and so actual incidents tend to be triggered by multiple conditions all occurring in a particularly unlucky configuration. Because each trigger may well fall under a different team’s ownership, there is a risk of duplicate tickets being created, leading to wasted time and effort.
Instead, by identifying these correlations on the fly, AIOps can help IT professionals from different teams and backgrounds to “swarm together” into a virtual team, specific to that particular incident. By collaborating and sharing information and insights, these swarms can solve issues and problems much faster than traditional “waterfall” sequential processes can.
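As a rough illustration of correlating alerts on the fly, the sketch below groups alerts that affect the same service and arrive close together in time into a single incident, then lists the teams that would form the swarm. This is a deliberately naive heuristic, not how any particular AIOps tool works; the `Alert` and `Incident` types, the `correlate` function, and the 120-second window are all assumptions for the sake of the example.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    """Hypothetical alert raised by one team's monitoring."""
    timestamp: float  # seconds since an arbitrary start
    team: str         # team that owns the alerting component
    service: str      # business service the alert affects
    message: str


@dataclass
class Incident:
    """One incident gathering related alerts from several teams."""
    service: str
    alerts: list = field(default_factory=list)

    @property
    def teams(self):
        # The "swarm": every team with at least one alert in this incident.
        return sorted({alert.team for alert in self.alerts})


def correlate(alerts, window_seconds=120.0):
    """Group alerts for the same service that arrive close together."""
    open_incidents = {}  # service -> most recent incident for that service
    incidents = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        current = open_incidents.get(alert.service)
        if (current is not None
                and alert.timestamp - current.alerts[-1].timestamp <= window_seconds):
            current.alerts.append(alert)  # same incident, no duplicate ticket
        else:
            current = Incident(service=alert.service, alerts=[alert])
            open_incidents[alert.service] = current
            incidents.append(current)
    return incidents


if __name__ == "__main__":
    alerts = [
        Alert(0, "network", "checkout", "packet loss on edge switch"),
        Alert(30, "database", "checkout", "slow queries"),
        Alert(45, "app", "checkout", "request timeouts"),
        Alert(1000, "storage", "reports", "volume nearly full"),
    ]
    for incident in correlate(alerts):
        print(incident.service, incident.teams, len(incident.alerts))
```

In this example the three checkout alerts become one incident shared by the network, database, and app teams, rather than three separate tickets worked in isolation.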
Protect Your People
Another way to think about “legacy” is as the result of past investments. In IT terms, that means experience. Operations people who’ve been around for a while know how to prioritize and resolve issues, but they also know how to advise end users who come to them with requests instead of issues.
If on the other hand those experienced people are spending all their time on firefighting, they won’t have any time to think and act strategically. The classic organizational segmentation into level one-two-three, with junior staff acting as a firewall to allow more senior people to focus on the important issues, is only useful if the frontline people are empowered to learn and act themselves.
Machine learning techniques embedded into AIOps tools can identify and capture useful knowledge, making it available throughout the organization. This removes the bottlenecks that arise whenever a more senior person has to be called in, and it avoids burning out those senior people by wasting their time on activities that should be handled by others.
Trust Is Required for Success with AIOps
All of this requires a certain organizational maturity: the right combination of flexibility to adopt innovation and a stable framework to build on. ITSM has the potential to be that foundation, extended as necessary by new technologies and approaches as they become available.
Trust is required between different teams to avoid different bits of the organization going off in their own directions without considering how they might impact others – or what valuable inputs might make their own area better.
AI and machine learning are still new technologies, leading some to question how they can be integrated with, and build on, existing systems. AIOps represents ways of applying these innovative techniques to the domain of IT operations. However, this does not mean discarding what has gone before, but rather complementing it and enabling it to function better than before – by adapting to changes in the environment and the requirements and expectations of end users.
Dominic is Director of Strategic Architecture at Moogsoft, helping companies adopt AIOps to streamline their IT Operations and become more agile and responsive to ever-changing demands. He has been involved in IT operations for a number of years, working in fields as diverse as SecOps, cloud computing, and data center automation.