| | |
|---|---|
| Triggers | Describe how the implemented change failed to work as expected. If available, include relevant data visualizations. |
| Impact | Describe how internal and external users were impacted during the incident. Include how many support cases were raised. |
| Detection | Report when the team detected the incident and how they knew it was happening. Describe how the team could have improved time to detection. |
| Response | Report who responded to the incident and describe what they did and when. Include any delays or obstacles to responding. |
| Recovery | Report how the user impact was mitigated and when the incident was deemed resolved. Describe how the team could have improved time to mitigation. |
| Timeline | Detail the incident timeline, using UTC to standardize across time zones. Include lead-up events, post-impact events, and any decisions or changes made. |
| Five whys root cause identification | Run a five whys analysis to understand the true causes of the incident. |
| Blameless root cause | Note the final root cause and describe, without placing blame, what needs to change to prevent this class of incident from recurring. |
| Related incidents | Check whether any past incidents could have had the same root cause. Note what mitigation was attempted in those incidents and ask why this incident occurred again. |
| Lessons learned | Describe what you learned, what went well, and how you can improve. |
| Follow-up tasks | List the Jira issues created to prevent this class of incident in the future. Note who is responsible, when they have to complete the work, and where that work is being tracked. |
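
The Timeline row asks for UTC timestamps, and the Detection and Recovery rows ask about time to detection and time to mitigation. The following is a minimal sketch of one way to normalize events and derive those two durations, assuming each event was recorded as an ISO 8601 timestamp with a UTC offset. The event names, timestamps, and the `to_utc` helper are hypothetical and only for illustration.

```python
from datetime import datetime, timezone


def to_utc(stamp: str) -> datetime:
    """Parse an ISO 8601 timestamp that carries a UTC offset and convert it to UTC."""
    parsed = datetime.fromisoformat(stamp)
    if parsed.tzinfo is None:
        # A timestamp without an offset is ambiguous, so reject it instead of guessing.
        raise ValueError(f"timestamp has no UTC offset: {stamp!r}")
    return parsed.astimezone(timezone.utc)


# Hypothetical events, each logged in the responder's local time.
events = {
    "impact_began": "2024-03-05T09:12:00-05:00",
    "detected": "2024-03-05T15:30:00+01:00",
    "mitigated": "2024-03-05T20:45:00+05:30",
}

timeline = {name: to_utc(stamp) for name, stamp in events.items()}

# Both durations are measured from the start of user impact (an assumption;
# some teams measure time to mitigation from detection instead).
time_to_detection = timeline["detected"] - timeline["impact_began"]
time_to_mitigation = timeline["mitigated"] - timeline["impact_began"]

# Print the timeline in chronological order, all in UTC.
for name, when in sorted(timeline.items(), key=lambda item: item[1]):
    print(f"{when:%Y-%m-%d %H:%M} UTC  {name}")
print("time to detection:", time_to_detection)
print("time to mitigation:", time_to_mitigation)
```

Rejecting timestamps that lack an offset keeps a local time from being silently read as UTC, which would otherwise distort both durations.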