Many companies introduce service design to improve the suitability of the IT services that make it to “live” for their customers and clients, but few consider what makes service design successful.
There’s a misconception that if you put a framework and some standards around design, you’ll see a fundamental change in how fit for purpose IT services become. In reality, whilst a framework brings some improvement by encouraging repeatable process and design, it does not mitigate the fundamental issues affecting design.
So what are these issues we so often miss?
The best Geiger counter for service design is the effectiveness of your service level agreements (SLAs). If you were to take two SLAs, one for a business-critical service and one for a non-critical service, would they look the same? Would they have the same availability targets? Would they reflect the differences between the two services? Beyond that, how does that translate into the way you report back to the business? Are you in fact only reporting on infrastructure SLAs rather than on an end-to-end capability?
Once you start to dig into how SLAs are approached, you begin to see why service design isn’t working. If you’re fundamentally only reporting against your infrastructure, it’s no wonder that customers are aghast at service reviews when they see a sea of green on the report after experiencing some pretty hard-hitting impacts to service.
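To see how a “sea of green” can hide a missed service target, here is a minimal sketch. It assumes the service depends on every component being up at once and that component outages are independent, so end-to-end availability is roughly the product of the component availabilities. The component names, the figures and the 99.9% target are all hypothetical.

```python
from math import prod

# Hypothetical monthly availability for each component behind one service.
components = {
    "network": 0.999,
    "app_server": 0.999,
    "database": 0.999,
    "authentication": 0.999,
}
target = 0.999  # assumed target, applied to each component and to the service


def end_to_end_availability(availabilities):
    """Availability of a serial chain, assuming independent failures."""
    return prod(availabilities)


# Infrastructure view: is every component at or above its target?
every_component_green = all(a >= target for a in components.values())

# Business view: is the whole chain at or above the same target?
service_availability = end_to_end_availability(components.values())
service_green = service_availability >= target

print(f"every component green: {every_component_green}")
print(f"end-to-end availability: {service_availability:.4%}")
print(f"service green: {service_green}")
```

Under these assumptions every component reports green, yet the service as the customer experiences it falls short of the same target.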
So what can we do to set a good foundation on which to base service design?
Apart from looking at those two SLAs, the most important activity to undertake is a full review of risks, assumptions, issues and dependencies (a RAID assessment), although you can achieve similar results by focusing purely on risks and issues. These aren’t just the risks and issues you’d find in your IT risk register, though; there’ll be clues in business risk registers all over the place. For example: “we were unable to meet our target of increasing our leasing contract processing throughput to 600,000 interactions a month from 400,000 because we had numerous issues with technology”. Straight away this should ring alarm bells about capacity management, about whether requirements were adequately verified, about whether changes had caused major incidents, and so on.
Rather than looking at RAID from a technology-driven perspective, you can uncover far more telling problems by looking at it from a process perspective. Whether you use ITIL, PRM-IT, COBIT or any other process framework does not really make any difference; take whatever you use internally as your reference model. As you discover risks and issues across your business and throughout IT, you can begin to plot them against the processes, assessing the likelihood and relative impact of risks, and the actual impact of issues, as you go along. Once you have this in place, you can analyze the results, looking for the top threats to service design and the repeat offenders. You’ll also have a view of what mitigating activities are already underway from the records in the risk registers. Your analysis will result in a plan of action for the things that will make a fundamental difference to service design. This activity should be re-baselined every year; otherwise the effectiveness of service design will wane.
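As a rough illustration of plotting risks and issues against processes, the sketch below totals a score per process and flags repeat offenders. The scoring scheme is an assumption rather than part of any framework: 1 to 5 scales, risks scored as likelihood times impact, and issues treated as maximum likelihood because they have already happened. The process names and entries are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

MAX_LIKELIHOOD = 5  # assumed 1-5 scale


@dataclass
class Entry:
    process: str   # process in your reference model that the entry maps to
    kind: str      # "risk" or "issue"
    impact: int    # 1-5: relative impact for risks, actual impact for issues
    likelihood: int = MAX_LIKELIHOOD  # issues have already occurred

    @property
    def score(self) -> int:
        return self.impact * self.likelihood


def rank_processes(entries):
    """Return processes by total score, repeat offenders, and the totals."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for entry in entries:
        totals[entry.process] += entry.score
        counts[entry.process] += 1
    ranked = sorted(totals, key=lambda p: totals[p], reverse=True)
    repeat_offenders = sorted(p for p, n in counts.items() if n > 1)
    return ranked, repeat_offenders, dict(totals)


# Hypothetical entries gathered from IT and business risk registers.
register = [
    Entry("capacity management", "issue", impact=4),
    Entry("capacity management", "risk", impact=3, likelihood=3),
    Entry("change management", "issue", impact=3),
    Entry("requirements", "risk", impact=4, likelihood=2),
]

ranked, repeat_offenders, totals = rank_processes(register)
for process in ranked:
    print(f"{process}: {totals[process]}")
print("repeat offenders:", repeat_offenders)
```

Re-running the same scoring at each annual re-baseline would let you compare the rankings from one year to the next.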
For me, the number one fundamental has always been risk management: how do we know what the business uses the technology for, and what’s the frontline risk to the business if that technology is not available or not fit for purpose? What comes second will often depend on the culture of your organization; by this I mean, is your culture naturally collaborative, or is your organization siloed? It’s quite incredible how much this affects the effectiveness of service design.