Target Corporation wanted to close the gap in its high-severity incident management processes. The goal was to speed up resolution while minimizing the operational and customer impact of those incidents across its global technology services organization.
The challenges were multi-faceted:
- Incidents were becoming more complex, increasing the probability and risk of major outages
- The geographically dispersed team of Subject Matter Experts (SMEs) required a high degree of coordination during the incident handling process
- The organization lacked a consistent, repeatable approach for taking control of incidents and keeping everyone “on the same page”
- Inconsistent data gathering reduced the effectiveness of Problem Management during potential RCA investigations at a later stage
- SLAs were at risk of being missed
To respond to these challenges, the company engaged Kepner-Tregoe to introduce a scalable approach to High Severity Incident Management and improve the core performance metrics: Time-to-Restore, Variation, and “Avoidance of Global Incidents” (incidents of the highest severity level).
A pilot project was implemented. Over a four-month period, Kepner-Tregoe worked with the High Severity Incident Management group of one of the major technology groups.
The major phases of the project included:
- An analysis of the Incident Management function in order to baseline performance and assess the underlying capabilities, processes, and IT ecosystem
- A streamlining of the High Severity Incident Management process with respect to the sequence of process steps and the execution of those steps
The pilot delivered the following results:
- 74% reduction in Mean-Time-to-Restore
- 77% reduction in Variation
- Increase in the percentage of “Global Incidents Avoided”
- Improvement in process quality and consistency
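For readers who want to track the same kind of metrics, the sketch below shows one way the percentage reductions above could be computed from incident records. It is an illustration under stated assumptions, not the method used in the pilot: the case study does not define how Variation was measured, so the population standard deviation of restore times is assumed here, and the incident durations are hypothetical sample values rather than project data.

```python
from statistics import mean, pstdev


def percent_reduction(before: float, after: float) -> float:
    """Return the percentage by which `after` is lower than `before`."""
    if before <= 0:
        raise ValueError("baseline value must be positive")
    return (before - after) / before * 100.0


# Hypothetical restore times in minutes (illustrative only, not pilot data).
baseline_minutes = [240, 90, 400, 150, 320]
pilot_minutes = [60, 50, 70, 55, 65]

# Mean-Time-to-Restore: the average restore time across incidents.
mttr_reduction = percent_reduction(mean(baseline_minutes), mean(pilot_minutes))

# Variation: assumed here to be the standard deviation of restore times.
variation_reduction = percent_reduction(
    pstdev(baseline_minutes), pstdev(pilot_minutes)
)

print(f"MTTR reduction: {mttr_reduction:.0f}%")
print(f"Variation reduction: {variation_reduction:.0f}%")
```

Measuring both the mean and the spread matters because a lower average alone can hide a few very long outages; a lower spread indicates that restore times have become more predictable.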