Monthly Grafana alert rule hygiene audit

On the first Monday of every month, find stale, silent, and missing-runbook alert rules in Grafana and file owner-grouped cleanup tickets in Linear.

Agentic Task
Grafana, Linear, Slack Bot, Engineering, Operations, Notifications & Alerts, Research & Monitoring, AI Reports

Build me a monthly Grafana alert rule hygiene auditor that runs on a cron the first Monday of every month. The reasoning about what counts as stale or misconfigured is judgment-heavy, so this should be an agent workflow.
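A note on the schedule, as a hedged sketch: in classic cron, restricting both the day-of-month and day-of-week fields usually matches when either one matches, so "first Monday" is often easier to express by triggering every Monday and exiting early. The helper name `is_first_monday` below is hypothetical.

```python
from datetime import date


def is_first_monday(day: date) -> bool:
    """Return True when `day` is a Monday within the first seven days of its month."""
    # weekday() == 0 means Monday; the first Monday always falls on day 1..7.
    return day.weekday() == 0 and day.day <= 7
```

A run triggered on any other Monday can simply return without auditing anything.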

Step 1: gather state. The agent should call Grafana's List Alert Rules to fetch every provisioned alert rule, List Contact Points to see what notification destinations exist, and Get Notification Policy Tree to see how rules route. Then call Find Annotations over the past 90 days filtered to alert-state annotations, so each rule has a recent fire history.
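For orientation, the four reads in Step 1 might map onto Grafana's HTTP API roughly as sketched below. The endpoint paths and the `from`/`to`/`type` query parameters reflect my understanding of the provisioning and annotations APIs and should be checked against your Grafana version; `build_requests` is a hypothetical helper that only builds URLs and performs no network calls.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode


def build_requests(base_url: str, now: datetime, lookback_days: int = 90) -> dict[str, str]:
    """Build the read-only URLs for one audit run (assumed endpoint paths)."""
    base = base_url.rstrip("/")
    start = now - timedelta(days=lookback_days)
    # The annotations API is assumed to take epoch milliseconds for from/to.
    query = urlencode({
        "from": int(start.timestamp() * 1000),
        "to": int(now.timestamp() * 1000),
        "type": "alert",
    })
    return {
        "alert_rules": f"{base}/api/v1/provisioning/alert-rules",
        "contact_points": f"{base}/api/v1/provisioning/contact-points",
        "policy_tree": f"{base}/api/v1/provisioning/policies",
        "annotations": f"{base}/api/annotations?{query}",
    }
```

Large instances may also need paging or a `limit` on the annotations query, which this sketch leaves out.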

Step 2: flag three classes of hygiene problems. (a) Stale: rules that have not fired at all in the last 90 days are candidates for deletion or threshold review. (b) Silent: rules whose labels do not match any route in the notification policy tree and therefore would not reach any contact point if they fired. The silent check is the most valuable, so treat it as P0. (c) Missing metadata: rules without a severity label or without a runbook annotation.
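The three checks in Step 2 can be sketched as follows. This is a simplified illustration, not Grafana's own matching logic: the rule shape (`uid`, `labels`, `annotations`), the `runbook_url` annotation key, and routes expressed as lists of `[label, operator, value]` matchers are all assumptions, and only top-level routes are inspected because nested routes are reachable only through a matching parent.

```python
import re


def _matcher_ok(labels: dict, matcher: list) -> bool:
    """Evaluate one [label, operator, value] matcher against a rule's labels."""
    name, op, value = matcher
    actual = labels.get(name, "")
    if op == "=":
        return actual == value
    if op == "!=":
        return actual != value
    if op == "=~":  # regex matchers are treated as fully anchored
        return re.fullmatch(value, actual) is not None
    if op == "!~":
        return re.fullmatch(value, actual) is None
    raise ValueError(f"unknown matcher operator: {op}")


def is_routed(labels: dict, routes: list) -> bool:
    """True if at least one top-level route has all of its matchers satisfied."""
    return any(
        all(_matcher_ok(labels, m) for m in route.get("object_matchers", []))
        for route in routes
    )


def classify(rule: dict, routes: list, fired_uids: set) -> list:
    """Return the hygiene problems found for one rule, silent first."""
    problems = []
    labels = rule.get("labels", {})
    if not is_routed(labels, routes):
        problems.append("silent")
    if rule["uid"] not in fired_uids:  # no alert-state annotation in the lookback window
        problems.append("stale")
    if "severity" not in labels or not rule.get("annotations", {}).get("runbook_url"):
        problems.append("missing_metadata")
    return problems
```

A rule can carry more than one problem, so the result is a list rather than a single label.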

Step 3: file cleanup tickets in Linear. For each cleanup item the agent should call Create Issue in the configured Linear team (Alerting Hygiene by default; the team name should be configurable). Each issue should have a clear title that includes the rule's folder and name, a severity for the hygiene problem itself (silent = high, stale = medium, missing metadata = low), and a checklist in the description of what to verify or fix. Group issues by Grafana folder so folder owners can claim their cleanup as a batch. Be conservative: never auto-delete or auto-modify rules, only file tickets.
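One possible shape for the ticket drafts in Step 3 is sketched below. The severity mapping comes from the step above; the checklist wording, the title format, and the `draft_tickets` helper are illustrative assumptions, and the sketch stops short of calling Linear.

```python
from collections import defaultdict

SEVERITY = {"silent": "high", "stale": "medium", "missing_metadata": "low"}

# Hypothetical checklist items; adjust to your team's runbook conventions.
CHECKLIST = {
    "silent": ["Confirm the rule's labels", "Add or fix a matching notification policy"],
    "stale": ["Confirm the rule is still needed", "Review the threshold"],
    "missing_metadata": ["Add a severity label", "Add a runbook annotation"],
}


def draft_tickets(findings: list) -> dict:
    """Group (folder, rule_name, problem) findings into per-folder ticket drafts."""
    by_folder = defaultdict(list)
    for folder, rule_name, problem in findings:
        by_folder[folder].append({
            "title": f"[{folder}] {rule_name}: {problem.replace('_', ' ')}",
            "severity": SEVERITY[problem],
            "description": "\n".join(f"- [ ] {item}" for item in CHECKLIST[problem]),
        })
    return dict(by_folder)
```

Keeping the drafts as plain dictionaries makes it easy to review them before any issue is actually created.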

Step 4: post a single Slack summary via Send a Message to a configured channel. The summary should say how many rules were audited, how many tickets were filed in each of the three categories, link the new Linear tickets in a bundle, and call out the silent alerts explicitly because those are the ones most likely to bite during an incident.
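The Slack summary in Step 4 could be assembled along these lines. This is a sketch of the message text only; the ticket dictionary keys (`category`, `title`, `url`) and the exact wording are assumptions, and sending the message is left to the Slack integration.

```python
CATEGORIES = ("silent", "stale", "missing_metadata")


def build_summary(audited: int, tickets: list) -> str:
    """Build one plain-text summary with counts, silent call-outs, and ticket links."""
    counts = {category: 0 for category in CATEGORIES}
    for ticket in tickets:
        counts[ticket["category"]] += 1
    lines = [
        f"Alert rule hygiene audit: {audited} rules audited.",
        f"Tickets filed: {counts['silent']} silent, {counts['stale']} stale, "
        f"{counts['missing_metadata']} missing metadata.",
    ]
    silent = [t for t in tickets if t["category"] == "silent"]
    if silent:
        # Silent alerts are called out first because they would not notify anyone.
        lines.append("Silent alerts (would not reach any contact point):")
        lines.extend(f"- {t['title']}" for t in silent)
    if tickets:
        lines.append("New tickets:")
        lines.extend(f"- {t['title']}: {t['url']}" for t in tickets)
    return "\n".join(lines)
```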

Configurables I want to be able to set: the Slack channel, the Linear team name, the lookback window (default 90 days), and whether to skip rules in specific folders (e.g. a sandbox folder).
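As a rough illustration, those four settings could live in one small configuration object. The class and field names below are hypothetical, and rules are assumed to carry a `folder` key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditConfig:
    slack_channel: str
    linear_team: str
    lookback_days: int = 90  # default lookback window from the prompt
    skip_folders: frozenset = frozenset()  # e.g. a sandbox folder


def rules_in_scope(rules: list, config: AuditConfig) -> list:
    """Drop rules that live in a folder the config says to skip."""
    return [rule for rule in rules if rule["folder"] not in config.skip_folders]
```

For example, `AuditConfig("#alerting", "Alerting Hygiene", skip_folders=frozenset({"sandbox"}))` would keep the 90-day default and ignore everything in a folder named `sandbox`.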

Additional information

What does this prompt do?
  • Audits every provisioned Grafana alert rule on the first Monday of each month so cruft does not pile up between incidents.
  • Flags three classes of hygiene problems: rules that have not fired in 90 days, silent alerts whose labels do not route to any contact point, and rules missing a severity label or runbook link.
  • Files a Linear ticket per cleanup item in your Alerting Hygiene team, grouped by folder, with a checklist of what to verify before changing anything.
  • Posts a single Slack summary that links the bundle of new Linear tickets so on-call owners can claim their cleanup in one place.
What do I need to use this?
  • A Grafana service account with read access to alert rules, contact points, the notification policy tree, and annotations.
  • A Linear workspace with an Alerting Hygiene team (or any team you want the tickets filed into).
  • A Slack workspace and the channel where you want the monthly summary posted.
How can I customize it?
  • Change the cadence or day. Monthly on the first Monday is a sensible default, but weekly, quarterly, or a different weekday all work.
  • Tweak what gets flagged. The 90-day lookback that defines stale, the silent-alert check, and the missing-metadata check can each be tightened or loosened.
  • Pick the Linear team, the issue title format, and the Slack channel that should receive the summary.

Frequently asked questions

Will this delete any alert rules for me?
No. The audit is read-only on the Grafana side. It only opens Linear tickets so a human can decide whether to delete, retune, or fix routing.
What is a silent alert and why does it matter?
A silent alert is a rule whose labels do not match any route in your notification policy tree, so even if it fires no one gets paged. These look healthy in the UI and are the most dangerous kind of drift, which is why the audit calls them out separately.
What if a rule legitimately should not fire in 90 days?
Plenty of safety-net rules are like that. The audit flags them as candidates for review, not for deletion. You can close the Linear ticket as wontfix and the rule keeps running.
Can I send the cleanup tickets to a team other than Alerting Hygiene?
Yes. Any Linear team works. You set the team name once when the workflow is set up, and the audit files tickets there each month.
Does this work with Grafana Cloud and self-hosted Grafana?
Yes. As long as the service account token can read alert rules, contact points, the notification policy tree, and annotations, both work the same way.

Stop letting silent alerts rot in Grafana.

Connect Grafana, Linear, and Slack once and Geni audits your alert rules on the first Monday of every month.