July 1, 2025 5 min read

🧑‍🚀 Operational Readiness & Resilience III: Postmortems

Luke Curtis

Engineering Leader

Postmortems at a glance

Incidents are inevitable. Repeating them is a choice.

A postmortem helps engineering teams turn failure into insight. It documents what went wrong, why it happened, and what needs to change to prevent it from happening again. But more than that, it’s a cultural tool—it tells your team, “We learn, we improve, and we don’t hide from mistakes.”

Done right, postmortems create real change—not just reports that collect dust.

A postmortem template

incident.io has a really solid breakdown of what should be included in a postmortem.

Go deep, not wide

Most teams stop at surface-level causes: a bad deploy, a flaky test, someone clicked the wrong thing. But those aren’t root causes—they’re symptoms.

Use the 5 Whys approach to dig deeper. For example:

A bug was deployed → It wasn’t caught in review → The reviewer didn’t understand the change → There’s no documentation for this part of the system → Ownership isn’t clear.

The deeper you go, the more systemic and actionable your takeaways will be.
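If you want to capture that chain in a structured way, for example to attach it to an incident record, a minimal sketch could look like the following. The `WhyChain` class and its method names are hypothetical and made up for illustration; they don't come from any particular incident tooling.

```python
from dataclasses import dataclass, field


@dataclass
class WhyChain:
    """Hypothetical helper: a chain of causes, from the visible symptom downwards."""

    symptom: str
    causes: list[str] = field(default_factory=list)

    def ask_why(self, answer: str) -> "WhyChain":
        # Each answer becomes the next statement to question.
        self.causes.append(answer)
        return self

    @property
    def deepest_cause(self) -> str:
        # The last answer recorded; falls back to the symptom if nobody asked why.
        return self.causes[-1] if self.causes else self.symptom

    def render(self) -> str:
        # Join the symptom and every answer in the order they were recorded.
        return " → ".join([self.symptom, *self.causes])


# The example chain from the text above.
chain = (
    WhyChain("A bug was deployed")
    .ask_why("It wasn't caught in review")
    .ask_why("The reviewer didn't understand the change")
    .ask_why("There's no documentation for this part of the system")
    .ask_why("Ownership isn't clear")
)

print(chain.render())
print("Deepest cause so far:", chain.deepest_cause)
```

The point of keeping the whole chain, rather than only the last answer, is that each link is a candidate for its own action item.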

In practice

After an incident has been closed, schedule a meeting with the key people who were involved in it. Work through the postmortem template methodically and make sure everyone is aligned on what happened.

Write your postmortem so both engineers and stakeholders can understand what happened. Append technical deep-dives, but keep the core summary accessible.

It's worth noting that, depending on the size of your org, you may not need to go this deep on documentation. A simple sense check could suffice, so that you're not adding undue burden on operations.

Working in the open

Postmortems aren't just documents; they're signals. Writing them well, sharing them openly, and reviewing them together sends a clear message: we don’t blame, we learn. Keeping communications open and, at the very least, internally accessible to all (within reason) is good practice, and it encourages people not to shy away from raising the alarm when things go wrong.

Equally, once your postmortem has been written up, consider running an explicit synchronous meeting to present the findings if the severity of the incident is high enough. This allows engineers to learn from others' mistakes and helps the blameless culture continue within your organisation.

Luke Curtis

Engineering Leader with over 10 years of experience in building and leading high-performing teams. Passionate about transforming organizations through technical excellence and empowered engineering cultures.
