Four Problem Management SLAs you really can't live without

Simon Higginson

This article has been contributed by Simon Higginson.

Problem Management is the intriguing discipline of the Service Management suite.  The IT Department is continually being asked to be proactive not reactive.

Often in IT we presuppose what our customers in the business require, then give them a solution to issues that they didn’t know that they had.  But what happens when that business customer is asking IT for a permanent solution to an issue we might not have known that we had, or to an issue where we know only a sticking plaster fix is in place?

Your Problem Manager is the key

Step up to the plate, the Problem Manager: the individual focussed on reacting to, and managing, issues that have already happened. They can’t really help but have a reactive mindset, rooted in the analysis of fact.  The incident might be closed, but the Problem Manager is the person entrusted with ensuring that appropriate steps are taken to guarantee the incident doesn’t repeat itself.  It can be a stressful role: the systems were down, trading has been impacted, and the company has perhaps lost money and may still be losing it.  People want to know what is being done.  So what SLAs can be put in place between the Problem Manager and the service owner to support the Problem Manager’s activities and maybe give them breathing space, whilst at the same time ensuring that there is some focus on resolution?

Let’s look at the four problem management SLAs that you really can’t live without.

#1 – Provision of Problem Management reference number

A simple SLA to get you started.  This is simply an acknowledgement by the problem management team that the problem has been logged, referenced and is in the workflow of the team.  It provides reassurance that the problem is going to be dealt with.

#2 – Time to get to the root cause of the issue

This is where some breathing space is provided.  The message given by this particular SLA is that there is a distinction between incident management and problem management.  Incident management has resulted in a temporary fix to an issue; now it is the turn of problem management to work out what lay at the heart of the matter, that is, the root cause.

Note that this is an SLA about identifying, not resolving, the root cause – resolution could take a significant period of time if it involves redevelopment of code.

The outcome measured by the SLA is the production of a deliverable, perhaps a brief document or even just an email, that highlights the results of the root cause analysis.  Each company will have to determine its own policy on what that deliverable should contain, but the SLA is there to measure the time between the formal closure of the incident and the formal delivery of problem management’s root cause analysis deliverable.
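As a rough illustration of the measurement described above, the elapsed time between incident closure and delivery of the root cause analysis can be computed as below. This is a minimal sketch, not a prescribed method: the function names, the sample timestamps and the three-day target are all assumptions made for the example, and each company would substitute its own agreed target.

```python
from datetime import datetime, timedelta


def time_to_root_cause(incident_closed: datetime, rca_delivered: datetime) -> timedelta:
    """Elapsed time from formal incident closure to delivery of the RCA deliverable."""
    if rca_delivered < incident_closed:
        raise ValueError("RCA deliverable cannot predate incident closure")
    return rca_delivered - incident_closed


def root_cause_sla_met(incident_closed: datetime, rca_delivered: datetime,
                       target: timedelta) -> bool:
    """True when the deliverable arrived within the agreed target."""
    return time_to_root_cause(incident_closed, rca_delivered) <= target


# Hypothetical example: incident closed Monday 09:00, deliverable sent Wednesday 17:00.
closed = datetime(2024, 3, 4, 9, 0)
delivered = datetime(2024, 3, 6, 17, 0)
elapsed = time_to_root_cause(closed, delivered)
print(elapsed)  # 2 days, 8:00:00
# The 3-day target is an assumed value for illustration only.
print(root_cause_sla_met(closed, delivered, target=timedelta(days=3)))  # True
```

This sketch counts calendar time; a company that measures the SLA in working hours would need to adjust the calculation accordingly.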

#3 – Measurement of provision of Root Cause Analysis documentation.  To be provided within X working days of initial notification.

So, you’ve acknowledged receipt of the problem, and you’ve determined the root cause. The next SLA is in place to ensure that a formal document is delivered in a timely fashion. It should follow a set format, setting out the timeline of events that caused the problem and the actions that have been taken to provide a workaround. It should then list all of the actions and recommendations needed to fix the problem, each with a clearly identified owner and a realistic completion date. A suggested target would be 3 days for simple problems, rising to 5 and 10 days for increasingly complex ones.
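The suggested targets can be turned into due dates with a small calculation. The sketch below assumes the targets are counted in working days (as the SLA title suggests), that weekends are the only non-working days, and that problems are graded into three tiers; the tier names "simple", "moderate" and "complex" are hypothetical labels, not terms from any standard.

```python
from datetime import date, timedelta

# Suggested targets from the text; the tier names are assumptions.
TARGET_WORKING_DAYS = {"simple": 3, "moderate": 5, "complex": 10}


def add_working_days(start: date, working_days: int) -> date:
    """Step forward one day at a time, counting only Monday to Friday."""
    current = start
    remaining = working_days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0 = Monday ... 4 = Friday
            remaining -= 1
    return current


def rca_document_due(notified: date, complexity: str) -> date:
    """Due date for the RCA document, counted from initial notification."""
    return add_working_days(notified, TARGET_WORKING_DAYS[complexity])


# Hypothetical example: problem notified on Thursday 7 March 2024.
notified = date(2024, 3, 7)
for tier in TARGET_WORKING_DAYS:
    print(tier, rca_document_due(notified, tier))
```

Public holidays are ignored here; a real implementation would need a holiday calendar agreed between the parties.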

#4 – Measurement of progress on root cause analysis actions as agreed (Target dates not to change more than twice)

In the previous SLA we have measured the time to produce the root cause analysis.  This SLA takes over where the previous clock stopped.

The root cause analysis work will have identified actions that need to be undertaken and implemented to effect a permanent fix to the original issue and allow the sticking plaster solution to be superseded.

However, not all resolutions will be equal in complexity, effort and duration, so there will be an initial estimate of a target date for live implementation of a permanent fix.  Moving the target completion date is allowed; however, this SLA limits how often this can occur (no more than twice) to prevent action timescales drifting.
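One way to track this SLA is to record every change of target date against each action and flag any action whose date has moved more than twice. The following is a minimal sketch under that assumption; the class, its field names and the sample action are hypothetical and would normally live in whatever problem management tool is in use.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List

MAX_TARGET_DATE_CHANGES = 2  # "Target dates not to change more than twice"


@dataclass
class ProblemAction:
    """A single action arising from the root cause analysis (hypothetical model)."""
    description: str
    owner: str
    target_date: date
    previous_targets: List[date] = field(default_factory=list)

    def move_target(self, new_date: date) -> None:
        """Record a change of target date; setting the same date is not a change."""
        if new_date != self.target_date:
            self.previous_targets.append(self.target_date)
            self.target_date = new_date

    @property
    def change_count(self) -> int:
        return len(self.previous_targets)

    @property
    def sla_breached(self) -> bool:
        return self.change_count > MAX_TARGET_DATE_CHANGES


# Hypothetical example: a fix whose go-live date slips three times.
action = ProblemAction("Deploy permanent fix", "Service owner", date(2024, 4, 1))
action.move_target(date(2024, 4, 15))
action.move_target(date(2024, 5, 1))
print(action.sla_breached)  # False: two changes are allowed
action.move_target(date(2024, 5, 20))
print(action.sla_breached)  # True: a third change breaches the SLA
```

Keeping the history of previous dates, rather than just a counter, also gives the service owner a record of how far the timescale has drifted.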

This article has been contributed by Simon Higginson of Frimley Green Ltd. Simon’s expertise is helping clients get the best out of their service suppliers and creating win-win partnerships.

Do SLAs hinder collaborative relationships with our supply chain?

Pretty much all outsourcing contracts in the IT Service Management world rely on, or at the very least, utilise the Service Level Agreement (SLA).

Certainly they are important, as they are the physical representation of the contracting party’s performance and are used as the measure by which trends in supplier performance are understood.

But is there too much reliance on SLAs as a measure of performance and are they often inserted by the eager contract or procurement manager to mitigate risk or provide a means for the insertion of penalty / reward clauses because “that is what is expected in a contract”?

In my personal experience SLAs are often poorly defined, or their alignment to the realities of IT service delivery is misunderstood.  Because of that, there have been many mitigating circumstances offered as to why an SLA has been failed by the contracted organisation, followed by significant discussion as to whether the mitigation can be accepted.  This has a tendency to suck up time, effort and therefore money from both organisations, which goes into managing the performance measures and drafting contract change notices, often without looking at the root cause of why SLAs are being missed or, in one case I have dealt with, perpetually exceeded (plainly in that case the SLAs were too generous or measuring the wrong outcomes).

Problem 1

The vendor will look to win the opportunity and only subsequently concern themselves with making the delivery side work (especially when the bid team is not going to be involved in delivery).  They will obviously try their utmost to meet the targets set, but also expect to provide mitigations in the event of failed SLAs.  They have the experience of dealing with a number of clients and so have reference points to support them, whereas the customer organisation does not have the same level of experience or number of reference points.

A common resolution is the instigation of a continuous performance improvement plan.  When that plan has been met, workable SLAs are redrafted and agreed by both parties; if it fails, the outcome is penalty clauses or litigation.

Problem 2

Poorly defined requirements from the customer.  Either they are unsure what they want, they have over- or under-specified the level of service really needed, they are looking to outsource a problem, or the business units have been poorly engaged, if at all, by procurement through the tender process.  In such circumstances the supplier is almost being set up to fail from the outset (which from their experience they will probably realise) and therefore they will look to manage their way around the issues as they arise.

The common resolution is a redefinition of the SLAs, probably with an element of contract renegotiation, once the customer has determined the service it expects or requires.

So what should a good SLA really be about?  A well-constructed SLA should be seen as an important measure to support a positive contractual relationship, and it should be periodically reviewed for its applicability in light of changing business demand.  However, the SLA should not replace or overshadow the development of the relationship between the customer and supplier.  Rather, the SLAs in place should support a collaborative attitude towards delivering a contract outcome that benefits both parties: the customer receives the service they need, and the supplier makes the profit margin they expected, at a level the customer is satisfied with.

Neither party should be wasting time and money negotiating mitigations; instead, the time saved can be spent on delivering future value.  Unfortunately, developing proper, mutually beneficial collaborative relationships in a business environment is not easy where customer and supplier aspirations are not aligned.