Weekly reliability retrospective in Notion
Every Monday at 8am, draft a weekly SRE retro in Notion that correlates Datadog incidents with GitHub deploys so your team starts the week with the full reliability story.
Build me an agent workflow that drafts our weekly reliability retrospective in Notion every Monday at 8am local time.
Trigger: cron, weekly on Mondays at 08:00 in the configured time zone.
What the agent should do each run:
1. Pull incidents from Datadog for the last seven days using Datadog "Search Incidents" (or "List Incidents" if a query is not provided). For each incident, capture severity, declared and resolved timestamps, computed duration, services impacted, commander, and a short title. Also call Datadog "Search Monitors" to list any monitors that triggered during the same window so the agent has alert context to weave into the narrative.
2. Pull the week's deploys from GitHub across the configured list of main repositories. For each repo, call GitHub "List Releases" to get formal releases published in the window, and GitHub "Search Issues and Pull Requests" with a query like is:pr is:merged merged:>=<start> base:main to enumerate merged PRs. Capture timestamp, title, author, and repo for each release and merge.
3. Cross-reference the two timelines. Any incident whose start time falls within the deploy correlation window (60 minutes by default) after a release or merge to main should be flagged as a likely deploy correlation, with the specific release or PR called out.
4. Summarize the week's reliability story in plain English: total incidents by severity, mean time to resolve (MTTR), the two or three standout incidents worth discussing, and any patterns the agent notices (e.g. repeated alerts on the same service, a noisy monitor, a cluster of incidents after a specific deploy). Extract action items from the incident timelines where the agent can find them.
5. Write the retro to Notion using "Create a Page" as a new child of the configured weekly review database. The page should be titled "Weekly reliability retro, week of <Monday date>" and contain these sections as headings with content underneath: Summary, Incidents, Deploy correlation, Action items.
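The cross-referencing in step 3 can be sketched as follows. This is a minimal illustration, not the agent's actual implementation: the Incident and Deploy records, their field names, and the correlate function are hypothetical, and the incident's declared timestamp is assumed to stand in for its start time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Deploy:
    # Hypothetical record for a release or merged PR gathered in step 2.
    repo: str
    title: str
    author: str
    timestamp: datetime


@dataclass(frozen=True)
class Incident:
    # Hypothetical record for a Datadog incident gathered in step 1.
    title: str
    severity: str
    declared: datetime  # assumed to be the incident's start time


def correlate(incidents, deploys, window_minutes=60):
    """Map each incident title to the deploys that preceded it within the window."""
    window = timedelta(minutes=window_minutes)
    flagged = {}
    for incident in incidents:
        matches = [
            d for d in deploys
            # The deploy must come first, and the incident must start
            # no later than `window` after it.
            if timedelta(0) <= incident.declared - d.timestamp <= window
        ]
        if matches:
            # Most recent deploy first, since it is the likeliest suspect.
            flagged[incident.title] = sorted(
                matches, key=lambda d: d.timestamp, reverse=True
            )
    return flagged


utc = timezone.utc
deploys = [
    Deploy("acme/api", "Release v1.4.0", "alice", datetime(2024, 5, 7, 14, 0, tzinfo=utc)),
    Deploy("acme/web", "Fix login redirect", "bob", datetime(2024, 5, 8, 9, 0, tzinfo=utc)),
]
incidents = [
    Incident("API 5xx spike", "SEV-2", datetime(2024, 5, 7, 14, 25, tzinfo=utc)),
    Incident("Queue backlog", "SEV-3", datetime(2024, 5, 9, 3, 0, tzinfo=utc)),
]
flagged = correlate(incidents, deploys)
```

In this made-up data only "API 5xx spike" is flagged, because it was declared 25 minutes after the acme/api release. A match is a prompt for discussion in the retro, not proof that the deploy caused the incident.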
Inputs the user configures once during setup:
- The Notion database ID of the weekly review database where new retro pages should be created.
- A list of GitHub repos in owner/repo form that count as the team's main repos for deploy tracking.
- Time zone for the cron schedule and the week boundary.
- Optional: the deploy correlation window in minutes, default 60.
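As a rough sketch of how these inputs might drive a run, the snippet below holds them in one settings object, derives the seven-day window from the run time, and builds one merged-PR search query per repo. RetroConfig, week_window, and merged_pr_query are hypothetical names, the database ID and repos are placeholders, and a fixed UTC offset stands in for the configured time zone to keep the example self-contained.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class RetroConfig:
    # Hypothetical container for the one-time setup inputs.
    notion_database_id: str
    repos: tuple
    time_zone: str
    correlation_window_minutes: int = 60  # default from the setup list


def week_window(run_time):
    """Return (start, end) covering the seven days before the run.

    `run_time` is assumed to be a timezone-aware datetime already
    expressed in the configured time zone.
    """
    return run_time - timedelta(days=7), run_time


def merged_pr_query(repo, start):
    """Build a GitHub search query for PRs merged to main since `start`."""
    return f"repo:{repo} is:pr is:merged merged:>={start.date().isoformat()} base:main"


config = RetroConfig(
    notion_database_id="<your-database-id>",  # placeholder, not a real ID
    repos=("acme/api", "acme/web"),           # placeholder repos
    time_zone="America/New_York",
)

# A fixed offset stands in for the configured zone in this sketch; a real
# run would more likely resolve config.time_zone with zoneinfo.ZoneInfo.
local = timezone(timedelta(hours=-4))
run_time = datetime(2024, 5, 13, 8, 0, tzinfo=local)  # a Monday at 08:00

start, end = week_window(run_time)
queries = [merged_pr_query(repo, start) for repo in config.repos]
```

The query uses a date rather than a full timestamp, so it can return PRs merged earlier on the start day; filtering the results against the exact window afterwards keeps the week boundary precise.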
Output: one new Notion page per run inside the configured weekly review database, ready for the on-call lead to review and share before Monday standup. If a week has zero incidents, still write the page with a short "clean week" summary plus the deploys shipped, so the cadence is never silent.
Additional information
What does this prompt do?
- Pulls every incident from the last seven days with severity, duration, services impacted, and commander.
- Lists the deploys and merged pull requests shipped in the same week across your main repos.
- Flags incidents that started within the correlation window (one hour by default) after a release or merge to main as likely deploy correlated.
- Writes a new page in your Notion weekly review database with Summary, Incidents, Deploy correlation, and Action items sections.
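For readers curious what the page written to Notion might look like underneath, here is a hedged sketch of a request body in the general shape the Notion API accepts when creating a page in a database. The helper names are hypothetical, the title property is assumed to be called "Name" (it varies by database), and the exact parent format can differ between Notion API versions.

```python
from datetime import date

SECTION_ORDER = ["Summary", "Incidents", "Deploy correlation", "Action items"]


def _rich_text(text):
    return [{"type": "text", "text": {"content": text}}]


def heading(text):
    return {"object": "block", "type": "heading_2",
            "heading_2": {"rich_text": _rich_text(text)}}


def paragraph(text):
    return {"object": "block", "type": "paragraph",
            "paragraph": {"rich_text": _rich_text(text)}}


def build_retro_page(database_id, monday, sections, title_property="Name"):
    """Assemble a create-page body with the four retro sections.

    `sections` maps a section name to its text; sections with no text
    get a short filler line so the page layout stays the same every week.
    """
    children = []
    for name in SECTION_ORDER:
        children.append(heading(name))
        children.append(paragraph(sections.get(name) or "Nothing to report this week."))
    title = f"Weekly reliability retro, week of {monday.isoformat()}"
    return {
        "parent": {"database_id": database_id},
        # "Name" is an assumption; use your database's title property.
        "properties": {title_property: {"title": _rich_text(title)}},
        "children": children,
    }


page = build_retro_page(
    "<your-database-id>",  # placeholder, not a real ID
    date(2024, 5, 13),
    {"Summary": "Clean week: no incidents, 4 deploys shipped."},
)
```

The sketch only builds the payload and sends nothing. Long sections would also need to be split across several paragraph blocks, since Notion limits the length of a single text item.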
What do I need to use this?
- A Datadog account with permission to read incidents and monitors.
- A GitHub login with read access to the repositories your team ships from.
- A Notion workspace with a weekly review database shared to your General Input connection.
- A short list of the repos you want covered and the Notion database where the retro should land.
How can I customize it?
- Change the schedule from Monday at 8am to whatever cadence and time zone fits your on-call handoff.
- Add or remove repositories in the GitHub repo list to match the services you actually care about.
- Adjust the deploy correlation window (default one hour) if your rollouts are slower or faster.
- Add extra sections to the Notion page, like a customer impact recap or a learnings field for the incident commander.
Frequently asked questions
Do I need Datadog Incident Management to use this?
What if my team uses GitHub Releases inconsistently?
How does the deploy correlation actually work?
Where does the retro show up in Notion?
Can I run it for a different time range, like the last month?
Stop assembling the SRE retro by hand every Monday morning.
Connect Datadog, GitHub, and Notion once, and Geni drafts the weekly reliability story for your team before standup.