On-demand incident RCA from your logs and runbooks
Upload alerts, logs, stack traces, and runbooks, and get a clear timeline, the most likely root cause, and remediation steps you can ship.
Build me an on-demand agent workflow that runs a focused incident root cause analysis from evidence I provide. Each run starts when I upload the evidence I have on hand: alert payloads, raw log excerpts, stack traces, dashboards, timeline notes, runbooks, and any prior incident write-ups. The agent must treat the uploaded evidence as the primary source of truth.
When the uploaded evidence has gaps, the agent can selectively pull additional context from these connected tools: Slack, Microsoft Teams, Linear, Asana, ClickUp, Notion, Google Drive, and Microsoft SharePoint. It should only reach into those tools when doing so closes a meaningful evidence gap or confirms fresher incident state. Concrete examples: searching the on-call Slack channel or Teams channel for messages near the incident window, reading a Linear, Asana, or ClickUp ticket that was referenced in an alert, retrieving a runbook page from Notion as markdown, or pulling a referenced document from Google Drive or SharePoint. It must not crawl those tools blindly.
The output is a single written RCA with these sections, in this order:
1. Summary. One paragraph in plain English describing the incident and customer impact.
2. Timeline. An ordered list of events with timestamps and the source each event came from (which log file, which Slack message, which ticket).
3. Affected systems. Services, environments, and customer surfaces that were affected.
4. Likely root cause. The most probable cause, with a confidence level of low, medium, or high, and the specific evidence that supports it. If multiple causes are plausible, list each one with its own confidence and evidence.
5. Remediation. Short-term fixes the team should ship now.
6. Follow-up actions. Longer-term hardening work, with suggested owners or teams when the evidence makes that obvious.
7. Open questions and unknowns. Every gap the agent could not close, stated explicitly. The agent must never invent a cause to fill a gap.
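For illustration only, the seven sections above could be modeled as a small data structure. This is a minimal sketch, not the workflow's actual schema: every class, field, and constant name here (`RcaReport`, `TimelineEvent`, `CauseHypothesis`, `SECTION_ORDER`) is hypothetical, and it assumes timestamps are `datetime` values that all share one timezone convention.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List

# Hypothetical constants mirroring the section order and confidence levels above.
SECTION_ORDER = (
    "Summary",
    "Timeline",
    "Affected systems",
    "Likely root cause",
    "Remediation",
    "Follow-up actions",
    "Open questions and unknowns",
)
CONFIDENCE_LEVELS = ("low", "medium", "high")


@dataclass
class TimelineEvent:
    timestamp: datetime
    description: str
    source: str  # e.g. a log file name, a chat message, or a ticket


@dataclass
class CauseHypothesis:
    description: str
    confidence: str      # one of CONFIDENCE_LEVELS
    evidence: List[str]  # names or quoted snippets supporting the cause

    def __post_init__(self) -> None:
        if self.confidence not in CONFIDENCE_LEVELS:
            raise ValueError(f"confidence must be one of {CONFIDENCE_LEVELS}")
        if not self.evidence:
            # A cause with no supporting evidence belongs under open questions.
            raise ValueError("a cause must cite at least one piece of evidence")


@dataclass
class RcaReport:
    summary: str
    timeline: List[TimelineEvent]
    affected_systems: List[str]
    likely_root_causes: List[CauseHypothesis]
    remediation: List[str]
    follow_up_actions: List[str]
    open_questions: List[str]

    def ordered_timeline(self) -> List[TimelineEvent]:
        """Return the timeline events sorted by timestamp, earliest first."""
        return sorted(self.timeline, key=lambda event: event.timestamp)
```

Rejecting a cause that has no evidence at construction time is one way to reflect the rule that the agent must never invent a cause to fill a gap.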
At the start of each run, let me choose three things: the report depth (short on-call note, full postmortem, or customer-facing summary), which connected tools the agent is allowed to search, and where to write the final report. Output destination options should include returning it in the run, creating a Notion page, posting a summary in a Slack channel or Microsoft Teams channel, and creating or commenting on a ticket in Linear, Asana, or ClickUp. I should be able to pick more than one destination.
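The three per-run choices could be captured in a small validated options object. The sketch below is an assumption about how such a configuration might look, not a real API: the identifiers (`RunOptions`, `DEPTHS`, `TOOLS`, `DESTINATIONS`) and the string keys are hypothetical labels for the options named above.

```python
from dataclasses import dataclass
from typing import FrozenSet

# Hypothetical keys for the choices described in the prompt.
DEPTHS = frozenset({"on_call_note", "full_postmortem", "customer_summary"})
TOOLS = frozenset({
    "slack", "teams", "linear", "asana",
    "clickup", "notion", "google_drive", "sharepoint",
})
DESTINATIONS = frozenset({
    "return_in_run", "notion_page", "slack_channel", "teams_channel",
    "linear_ticket", "asana_ticket", "clickup_ticket",
})


@dataclass(frozen=True)
class RunOptions:
    depth: str
    allowed_tools: FrozenSet[str]
    destinations: FrozenSet[str]

    def __post_init__(self) -> None:
        if self.depth not in DEPTHS:
            raise ValueError(f"unknown report depth: {self.depth!r}")
        unknown_tools = set(self.allowed_tools) - TOOLS
        if unknown_tools:
            raise ValueError(f"unknown tools: {sorted(unknown_tools)}")
        unknown_destinations = set(self.destinations) - DESTINATIONS
        if unknown_destinations:
            raise ValueError(f"unknown destinations: {sorted(unknown_destinations)}")
        if not self.destinations:
            # More than one destination is allowed, but at least one is needed.
            raise ValueError("pick at least one output destination")
```

An empty `allowed_tools` set is accepted here on purpose: it would mean the run relies on uploaded evidence alone.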
Two non-negotiables. First, every claim in the report must cite the evidence it came from, by name or by quoted snippet (a log line, a chat message, a runbook section, a ticket comment). Second, if the agent had to skip a connected tool because the integration was not connected or access was missing, it must call that out at the end of the report so I know what extra context I could unlock next time.
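Both non-negotiables lend themselves to a mechanical check before the report is published. The following is a rough sketch under stated assumptions: `Claim`, `uncited_claims`, and `skipped_tools_note` are hypothetical helpers, and the wording of the closing note is only an example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Claim:
    text: str
    citation: Optional[str] = None  # evidence name or quoted snippet


def uncited_claims(claims: Iterable[Claim]) -> List[Claim]:
    """Return every claim whose citation is missing or blank."""
    return [c for c in claims if not (c.citation and c.citation.strip())]


def skipped_tools_note(allowed: Iterable[str], connected: Iterable[str]) -> str:
    """Build the end-of-report note for tools that were allowed but unavailable.

    Returns an empty string when no allowed tool had to be skipped.
    """
    skipped = sorted(set(allowed) - set(connected))
    if not skipped:
        return ""
    return "Skipped tools (not connected or access missing): " + ", ".join(skipped)
```

A report would pass the first check only when `uncited_claims` returns an empty list, and any non-empty note from `skipped_tools_note` would be appended at the end of the report.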
Additional information
What does this prompt do?
- Reads the incident evidence you drop in: alerts, logs, stack traces, dashboards, timelines, and runbooks.
- Reconstructs the timeline of what happened and the systems and customer surfaces that were affected.
- States the most likely root cause with a confidence level and the exact evidence behind it.
- Recommends short-term fixes and follow-up hardening, and calls out every unknown instead of guessing.
- Pulls in linked Slack or Teams threads, Linear, Asana, or ClickUp tickets, Notion runbooks, and Google Drive or SharePoint docs only when they close a real gap.
What do I need to use this?
- Your incident evidence ready to upload: alert messages, log excerpts, stack traces, dashboards, timelines, and any existing runbooks or prior postmortems.
- Optional logins for the tools where supporting context lives: Slack, Microsoft Teams, Linear, Asana, ClickUp, Notion, Google Drive, or Microsoft SharePoint.
- Optional: a place you want the final RCA written, like a Notion page, a Slack or Teams channel, or a ticket in Linear, Asana, or ClickUp.
How can I customize it?
- Choose the report depth each run: a short on-call note, a full postmortem, or a customer-facing summary.
- Tell the agent which connected tools it is allowed to search so it does not chase irrelevant context.
- Point the finished RCA at the place your team already reviews incidents, like a Notion page, a Slack channel, or a Linear ticket.
Frequently asked questions
What evidence should I upload?
Does it need access to my logging or monitoring stack?
Can I trust the root cause it gives me?
Where does the final RCA go?
Will it work for past incidents too?
Turn raw incident evidence into a real RCA.
Drop in your logs, alerts, and runbooks, and Geni writes the timeline, the likely cause, and the next steps.