Auto-stop idle RunPod GPUs and log the savings
Every hour, stop any RunPod Pod that has been running too long, log it to a Google Sheet, and post a savings rollup to Slack.
Build a deterministic, code-based workflow that auto-shuts-down long-running RunPod GPU Pods on a schedule. The trigger is a cron that fires every hour.
On each run:
1) Call RunPod's List Pods operation to get every Pod on the account.
2) Filter that list down to Pods that meet ALL of these conditions: status is RUNNING (or desiredStatus is RUNNING), uptime is greater than a configurable threshold (default 6 hours, derived from the Pod's lastStatusChange or createdAt timestamp, depending on what RunPod returns), and the Pod name does NOT start with any of the protected prefixes in a configurable list (default ["keep-", "prod-"]).
3) For each Pod that qualifies, call RunPod's Start, Stop, Restart, or Reset Pod operation with the stop action, passing the Pod ID.
4) For each Pod that was successfully stopped, append a row to a Google Sheet via the Append Values operation. The columns are: pod id, name, GPU type, cost per hour (USD), uptime at stop (in hours), and ISO timestamp of the stop. The spreadsheet ID and tab name are workflow inputs.
5) After all stops complete, send a single rollup Slack message via Send a Message to a configurable channel. The message should look roughly like: "Auto-stopped {N} idle RunPod Pods. Reclaimed ${X}/hr. See audit log: {sheet link}." Include a short bullet list of the stopped Pods (name and GPU type) underneath.
6) If nothing qualifies on a given run, finish silently. Do not send a Slack message and do not write to the sheet.
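The filter in step 2 can be sketched as a single predicate. This is a minimal illustration, not RunPod's actual schema: it assumes each Pod is a dict whose `lastStatusChange` or `createdAt` field holds an ISO 8601 timestamp, and the function name `qualifies` is hypothetical. If RunPod returns these fields in a different format, the parsing line needs to change.

```python
from datetime import datetime, timedelta, timezone


def qualifies(pod: dict, now: datetime, threshold_hours: float = 6.0,
              protected_prefixes: tuple = ("keep-", "prod-")) -> bool:
    """Return True only if the Pod meets all three conditions from step 2."""
    # Condition 1: the Pod is running (either field may carry the state).
    running = "RUNNING" in (pod.get("status"), pod.get("desiredStatus"))
    # Condition 3: the name must not start with a protected prefix.
    protected = pod.get("name", "").startswith(tuple(protected_prefixes))
    # Assumption: one of these fields is an ISO 8601 timestamp string.
    started_raw = pod.get("lastStatusChange") or pod.get("createdAt")
    if not running or protected or not started_raw:
        return False
    # Accept a trailing "Z" on Python versions that do not parse it natively.
    started = datetime.fromisoformat(started_raw.replace("Z", "+00:00"))
    if started.tzinfo is None:
        # Assumption: timestamps without an offset are in UTC.
        started = started.replace(tzinfo=timezone.utc)
    # Condition 2: uptime strictly greater than the threshold.
    return (now - started) > timedelta(hours=threshold_hours)


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
pod = {"name": "dev-1", "desiredStatus": "RUNNING",
       "createdAt": "2024-01-01T02:00:00Z"}
print(qualifies(pod, now))
```

A Pod with no usable timestamp is skipped rather than stopped, which errs on the side of leaving a Pod running when its uptime cannot be determined.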
Workflow inputs (exposed as configuration, not hardcoded): uptime threshold in hours (default 6), protected name prefixes (default ["keep-", "prod-"]), Google Sheet spreadsheet ID, sheet tab name, and Slack channel ID.
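One way to keep these inputs out of the code is a small configuration object with the stated defaults. The class and field names below (`SweepConfig`, `spreadsheet_id`, and so on) are illustrative assumptions, not names defined by any of the three services.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SweepConfig:
    # Required inputs: there is no sensible default for these.
    spreadsheet_id: str
    sheet_tab: str
    slack_channel_id: str
    # Optional inputs with the defaults given in the prompt.
    uptime_threshold_hours: float = 6.0
    protected_prefixes: tuple = ("keep-", "prod-")

    def __post_init__(self) -> None:
        # Fail fast on misconfiguration instead of during the hourly sweep.
        if self.uptime_threshold_hours <= 0:
            raise ValueError("uptime_threshold_hours must be positive")
        for name in ("spreadsheet_id", "sheet_tab", "slack_channel_id"):
            if not getattr(self, name):
                raise ValueError(f"{name} is required")


config = SweepConfig(spreadsheet_id="sheet-id-here", sheet_tab="Audit",
                     slack_channel_id="channel-id-here")
print(config.uptime_threshold_hours, config.protected_prefixes)
```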
This is a code workflow, not an agent. The whole run is a fixed list, filter, act, log pipeline with no judgement involved. Use discrete deterministic nodes wired together: cron trigger, RunPod List Pods, a filter / transform node, a loop over the filtered Pods that calls RunPod Stop and, on success, Google Sheets Append Values, then a conditional Slack Send a Message at the end (sent only if at least one Pod was stopped). Handle errors per-Pod so one failed stop does not abort the whole sweep.
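The per-Pod error handling can be sketched as a loop that isolates each stop in its own try block. This is a hedged outline only: `stop_pod`, `append_row`, and `send_rollup` are hypothetical callables standing in for the RunPod, Google Sheets, and Slack nodes, and the choice to still count a Pod as stopped when its audit row fails to write is an assumption.

```python
from typing import Callable, Iterable


def sweep(pods: Iterable[dict],
          stop_pod: Callable[[str], None],
          append_row: Callable[[dict], None],
          send_rollup: Callable[[list], None]) -> tuple:
    """Stop each Pod independently; return (stopped, failed)."""
    stopped, failed = [], []
    for pod in pods:
        try:
            stop_pod(pod["id"])
        except Exception as exc:
            # One failed stop is recorded and the sweep moves on.
            failed.append((pod["id"], str(exc)))
            continue
        stopped.append(pod)
        try:
            # Only successfully stopped Pods are logged.
            append_row(pod)
        except Exception as exc:
            # Assumption: a logging failure does not undo the stop.
            failed.append((pod["id"], f"audit log failed: {exc}"))
    if stopped:
        # Silent finish when nothing was stopped: no Slack message.
        send_rollup(stopped)
    return stopped, failed
```

Passing the three actions in as callables keeps the loop testable without network access: fake functions can simulate a failing stop and confirm that the remaining Pods are still processed.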
Additional information
What does this prompt do?
- Checks every running RunPod GPU once an hour and stops the ones that have been on too long.
- Skips anything you mark as protected, like Pods named with a keep- or prod- prefix.
- Appends a row to a Google Sheet for every Pod it stops so you have an audit trail with cost, GPU type, and uptime.
- Posts one short Slack rollup for each run that stops at least one Pod, with how many Pods were stopped and how many dollars per hour you just reclaimed.
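As a rough illustration of the audit row and the rollup text, the sketch below builds both from a list of stopped Pods. The dict keys `gpuType` and `costPerHr` and the helper names are assumptions for the example; the real field names depend on what RunPod returns.

```python
from datetime import datetime, timezone


def audit_row(pod: dict, uptime_hours: float, stopped_at: datetime) -> list:
    """Columns: pod id, name, GPU type, cost/hr (USD), uptime (h), stop time."""
    return [pod["id"], pod["name"], pod["gpuType"], pod["costPerHr"],
            round(uptime_hours, 2), stopped_at.isoformat()]


def rollup_message(stopped: list, sheet_link: str) -> str:
    """One message per run: a summary line plus one bullet per stopped Pod."""
    total = sum(p["costPerHr"] for p in stopped)
    header = (f"Auto-stopped {len(stopped)} idle RunPod Pods. "
              f"Reclaimed ${total:.2f}/hr. See audit log: {sheet_link}.")
    bullets = [f"- {p['name']} ({p['gpuType']})" for p in stopped]
    return "\n".join([header, *bullets])


pods = [
    {"id": "p1", "name": "dev-1", "gpuType": "A100", "costPerHr": 1.5},
    {"id": "p2", "name": "dev-2", "gpuType": "RTX 4090", "costPerHr": 0.75},
]
print(audit_row(pods[0], 7.5, datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)))
print(rollup_message(pods, "https://example.com/sheet"))
```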
What do I need to use this?
- A RunPod account with an API key that can list and stop Pods.
- A Google Sheet you can write to, with a tab ready to receive the audit log.
- A Slack workspace and a channel where the savings rollup should land.
How can I customize it?
- Change the uptime threshold. The default is 6 hours, but a training team might want 12 and an inference team might want 2.
- Edit the protected name prefixes so production or long-running training Pods are never touched.
- Pick the Slack channel, the Google Sheet, and whether the sweep runs hourly, every few hours, or only during business hours.
Frequently asked questions
Will this stop my production GPUs?
What is the default uptime cutoff?
Does stopping a Pod delete my work?
What if nothing qualifies to be stopped?
Can I see exactly which Pods were stopped and what they cost?
Stop paying for GPUs you forgot to turn off.
Connect RunPod, Google Sheets, and Slack once, and Geni will sweep idle Pods every hour and tell you how much you saved.