FY21-Q3 Infrastructure KR: Elevate MTTR to KPI => 5%
Current Situation
We lack accessible metrics for the time between the start of a customer-impacting incident and its resolution. As a result, we cannot effectively assess our efforts to improve our incident response workflow.
Desired Outcome
We are able to capture and display a metric indicating the mean time to resolution (MTTR) for incidents related to GitLab.com. Ideally, this metric is available in the KPI section of the handbook as a Sisense graph. A fallback position would be a link to a Google Sheet with the metric programmatically gathered from incident issue metadata.
Additionally, we will work with the Monitor:Health product team to identify metadata on alerting and incident issue types that can help us achieve this goal in future iterations of Monitor:Health.
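As a rough illustration of gathering the metric programmatically from incident issue metadata, the sketch below averages the time between an issue's creation and its closure. It assumes each incident is a record with `created_at` and `closed_at` ISO 8601 timestamps (the field names GitLab issue records carry) and that issue creation is an acceptable proxy for incident start, which may not hold in practice. The function names and sample values are hypothetical, not an agreed implementation.

```python
from datetime import datetime, timedelta
from typing import Iterable, Mapping, Optional


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def mean_time_to_resolution(
    incidents: Iterable[Mapping[str, Optional[str]]],
) -> Optional[timedelta]:
    """Average (closed_at - created_at) over resolved incidents.

    Assumption: issue creation time approximates incident start.
    Incidents without a closed_at value are still open and are skipped.
    Returns None when no incident has been resolved.
    """
    durations = []
    for incident in incidents:
        closed_at = incident.get("closed_at")
        if not closed_at:
            continue
        opened = parse_timestamp(incident["created_at"])
        durations.append(parse_timestamp(closed_at) - opened)
    if not durations:
        return None
    return sum(durations, timedelta()) / len(durations)


# Illustrative, made-up records; real data would come from incident issues.
sample_incidents = [
    {"created_at": "2020-08-01T10:00:00Z", "closed_at": "2020-08-01T12:00:00Z"},
    {"created_at": "2020-08-02T09:00:00Z", "closed_at": "2020-08-02T13:00:00Z"},
    {"created_at": "2020-08-03T08:00:00Z", "closed_at": None},  # still open
]

if __name__ == "__main__":
    print(mean_time_to_resolution(sample_incidents))
```

The resulting value could then be written to a Google Sheet or a data source that Sisense reads; that step is left out here because it depends on tooling choices not yet made.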
Acceptance Criteria
- Issues are generated for the work required to collect and display this metric and linked to this issue. (Recommendation is that there is a parent issue/epic containing the prioritized work for the team.)
- Timeline for completion is defined. (Recommendation is that this is also captured in a parent issue/epic.)
Retrospective
- This effort was halted before any significant work was done, due to other priorities and an unexpectedly high incident load in Q3.