Run your private RunPod model over a Google Sheet nightly

Every night at 2am, send pending rows from your Google Sheet to your private RunPod model, write the answers back, and post a Slack summary.

Agentic Task
RunPodGoogle SheetsSlackOperationsEngineeringAI ReportsLead EnrichmentContent Generation

Every night at 2am, run a batch LLM enrichment over a Google Sheet using my private RunPod serverless endpoint, then post a summary to Slack. This should be an agent (not code) because output parsing, retries on malformed responses, and picking the most interesting rows all need judgement.

Trigger: cron, every night at 2am in my local timezone.

Step 1. Read the sheet. Call Google Sheets Get Values on the configured spreadsheet and range. The sheet has a header row that includes a 'status' column plus one or more input columns (typically 'prompt' and 'context'). Build a list of rows where status is empty or equals 'pending'. If there are no pending rows, post a short Slack message saying nothing to do and exit.

Step 2. For each pending row, construct an input payload. Take a configurable system instruction (set when the workflow is created) and combine it with the row's input columns to form the user message. Submit the job to my RunPod serverless LLM endpoint using Run Serverless Job (Async). Poll the job until it completes or fails. The endpoint runs vLLM or SGLang and returns a completion plus usage stats (prompt tokens, completion tokens, worker-seconds).

Step 3. Parse the output. It may be JSON, free text, or a refusal. If JSON, extract the structured fields I asked for. If free text, capture the response verbatim. If the model refuses or returns malformed output, retry up to 3 times with a slightly adjusted prompt. After the final retry, mark the row as failed and record the error reason.

Step 4. Write back to the sheet. Call Google Sheets Batch Update Values to update the correct cells in that row: response text, model used, prompt tokens, completion tokens, worker-seconds, total cost in dollars (computed from worker-seconds and the endpoint's per-second price), and status set to 'done' or 'failed'. Use batch updates instead of one update per row so we don't hit the per-cell rate limit.

Step 5. When the whole batch is finished, post a summary to the configured Slack channel using Send a Message. Include: total rows processed, rows succeeded, rows failed, total spend in dollars, total worker-seconds used, and the three most interesting outputs (agent picks based on length, uniqueness, or anomaly compared to the rest of the batch) with a one-line reason for each pick and a link back to the spreadsheet.

Configuration the workflow should expose: spreadsheet ID, sheet/tab name, input column names, output column names, RunPod endpoint ID, system instruction, max retries per row, Slack channel, and the cron expression.

Additional information

What does this prompt do?
  • Pulls pending rows from your Google Sheet each night and sends them to your private RunPod model in batch.
  • Writes each model response, status, token counts, and seconds of GPU time back to the matching row so you can see exactly what was processed.
  • Handles messy outputs gracefully, retrying when the model returns garbled JSON or refuses to answer.
  • Posts a morning Slack summary with rows processed, rows failed, total spend in dollars, and the three most interesting results.
What do I need to use this?
  • A Google account with edit access to the spreadsheet you want to enrich.
  • A RunPod account with a private serverless model endpoint already deployed (vLLM, SGLang, or similar).
  • A Slack workspace where Geni can post the morning summary.
  • A 'status' column in your sheet so the workflow knows which rows are pending and which are done.
How can I customize it?
  • Change when it runs. 2am is a sensible default, but pick any time or cadence that fits your team.
  • Swap the model. Point it at any private RunPod endpoint you have, whether that's a different size, quantization, or fine-tune.
  • Pick which Slack channel gets the morning summary and how many of the most interesting results to highlight.
  • Tweak the system instruction that goes with every row to control tone, output format, or what counts as a refusal.

Frequently asked questions

Do I need to write any code?
No. The agent reads your sheet, calls your RunPod model, and writes the answers back. You configure it once with plain language, and Geni handles the rest.
What happens to rows that already have a result?
They are skipped. The workflow only touches rows where the status column is empty or marked 'pending', so it's safe to re-run.
Can I use a hosted model like GPT-4 instead of RunPod?
This prompt is built around RunPod so you can use a private model that's much cheaper than hosted APIs at batch scale. If you'd rather use a hosted provider, start from a different prompt.
What happens if my model returns invalid output?
The agent retries on malformed or refused responses a few times, then marks the row as failed with the reason. Failed rows show up in the Slack summary so you can review them in the morning.
Will this work with vLLM and SGLang endpoints?
Yes. Any private serverless RunPod endpoint that accepts a prompt and returns a completion will work, including vLLM, SGLang, and custom inference handlers.

Stop overpaying for batch AI jobs.

Connect RunPod, Google Sheets, and Slack once, and Geni runs your enrichment overnight, every night.