Qlik Cloud reload failure triage agent

Every hour during business hours, scan Qlik Cloud for failed or canceled reloads, post a triage note in Slack, and open a ServiceNow incident for business-critical apps.

Agentic Task
Qlik Cloud, Slack Bot, ServiceNow, Operations, Engineering, Notifications & Alerts, Research & Monitoring

Build me an agent workflow that triages Qlik Cloud reload failures on a schedule and routes them to Slack and ServiceNow.

Trigger: a cron schedule that runs once an hour during business hours (default Monday to Friday, 7am to 7pm in my local time zone). Persist the timestamp of the last successful run so each invocation only considers reloads that finished after that point.

On each run, do the following:

1. Determine the set of Qlik Cloud apps to watch. Accept a configurable list of app IDs as input. If the list is empty, fall back to every app the account can see by calling List Apps in Qlik Cloud.

2. For each app in scope, call List Reloads in Qlik Cloud and filter to reloads whose status is FAILED or CANCELED and whose end time is after the last run timestamp. Skip reloads that are still QUEUED or RELOADING.
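
For illustration, the filter in step 2 could look like the Python sketch below. It is a minimal example under stated assumptions, not the agent's actual implementation: each reload record is assumed to be a dict with a `status` key and an `endTime` key holding an ISO 8601 string with a UTC offset, and `parse_ts` and `new_failures` are hypothetical helper names.

```python
from datetime import datetime, timezone

# Final statuses that count as failures; QUEUED and RELOADING are not final.
FAILURE_STATUSES = {"FAILED", "CANCELED"}


def parse_ts(value):
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def new_failures(reloads, last_run):
    """Return failed or canceled reloads that ended after last_run.

    Assumption: last_run is a timezone-aware datetime and every endTime
    carries an offset, so the comparison is between aware datetimes.
    """
    selected = []
    for reload in reloads:
        if reload.get("status") not in FAILURE_STATUSES:
            continue  # still running, queued, or succeeded
        end_time = reload.get("endTime")
        if not end_time:
            continue  # no end time means the reload has not finished
        if parse_ts(end_time) > last_run:
            selected.append(reload)
    return selected
```

The comparison is strictly "after", so a reload that ended exactly at the stored timestamp is treated as already handled.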

3. For each failed or canceled reload, call Get App Details in Qlik Cloud to fetch the human-readable app name and owner. Capture the reload ID, end time, duration, and any error log or message available on the reload record.

4. Write a short triage note per failure (3 to 6 lines) that includes:

- The app name and a link back to the app in Qlik Cloud.

- When it failed (in the user's local time zone) and how long it ran.

- Final status (FAILED or CANCELED) and the reload ID.

- The most likely cause inferred from the error text. Pick from categories like script error, source connectivity, permissions, timeout, missing field or table, license or capacity, or unknown.

- One concrete "what to check first" suggestion tied to that likely cause.
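
One way to make the "most likely cause" and "what to check first" bullets concrete is an ordered keyword lookup, sketched below in Python. The keywords and suggestions are illustrative assumptions, not an official list of Qlik error messages, and `classify_cause` and `first_check` are hypothetical names. An agent may well infer the cause from the full error text instead.

```python
# Ordered keyword rules: the first category with a matching keyword wins.
# Keywords are illustrative assumptions, not an official Qlik error list.
CAUSE_RULES = [
    ("missing field or table", ("field not found", "table not found")),
    ("permissions", ("access denied", "permission", "unauthorized", "forbidden")),
    ("timeout", ("timed out", "timeout")),
    ("source connectivity", ("could not connect", "connection", "unreachable")),
    ("license or capacity", ("license", "capacity", "quota")),
    ("script error", ("script error", "syntax error")),
]

# One "what to check first" suggestion per category (also assumptions).
FIRST_CHECKS = {
    "missing field or table": "Compare field and table names in the load script with the current source schema.",
    "permissions": "Confirm the data connection's credentials still have access to the source.",
    "timeout": "Check how long the source query runs and whether a reload time limit applies.",
    "source connectivity": "Test the data connection and check that the source system is available.",
    "license or capacity": "Review tenant capacity and license usage.",
    "script error": "Open the reload log and find the first script line that errored.",
    "unknown": "Read the full reload log and look for the first error line.",
}


def classify_cause(error_text):
    """Map raw error text to one of the triage categories."""
    text = (error_text or "").lower()
    for cause, keywords in CAUSE_RULES:
        if any(keyword in text for keyword in keywords):
            return cause
    return "unknown"


def first_check(cause):
    """Return the suggestion for a category, falling back to 'unknown'."""
    return FIRST_CHECKS.get(cause, FIRST_CHECKS["unknown"])
```

Rule order matters here: a message that mentions both a connection and a timeout is classified as a timeout because that rule comes first.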

5. Post each triage note to a configured Slack channel (default #data-platform) by calling Send a Message in Slack Bot. Use Slack mrkdwn formatting. If a run produces more than three failures, combine them into a single message with each failure as its own short section; otherwise post one message per failure.
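
The formatting and grouping rule in step 5 could be sketched in Python as below. This only builds message text and does not call Slack. The function names, the argument list, and the header wording are assumptions for illustration; the bold (`*text*`) and link (`<url|label>`) forms are standard Slack mrkdwn.

```python
def format_note(app_name, app_url, status, reload_id, ended_local, duration, cause, check):
    """Render one triage note as Slack mrkdwn, four lines long."""
    return "\n".join([
        f"*<{app_url}|{app_name}>* reload {status}",
        f"Ended {ended_local} after {duration}. Reload ID: `{reload_id}`",
        f"Likely cause: {cause}",
        f"Check first: {check}",
    ])


def build_messages(notes, group_threshold=3):
    """Return the list of message bodies to post for one run.

    More than group_threshold notes are combined into a single message,
    each note as its own section; otherwise each note is its own message.
    """
    notes = list(notes)
    if len(notes) > group_threshold:
        header = f"*{len(notes)} Qlik reload failures since the last run*"
        return ["\n\n".join([header, *notes])]
    return notes
```

An empty list of notes yields no messages, which matches the "exit silently" behavior when nothing failed.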

6. For any failed reload on an app whose ID appears in a configurable businessCriticalAppIds list, also call Create Incident in ServiceNow. Set:

- short_description: "Qlik reload failed: <app name>"

- description: the full triage note plus the Qlik app link and failed reload ID.

- priority: configurable, defaulting to 2 (High) for business-critical apps.

- category: "Software" and a configurable assignment_group for the data platform team.

After creating the incident, append the incident number back into the Slack message so the team can click straight through.
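
As a rough sketch, the incident fields in step 6 could be assembled in Python like this. The function names and the wording of the appended Slack line are assumptions, and the sketch only builds the field values without calling ServiceNow. Some ServiceNow instances derive priority from impact and urgency, so setting it directly may need adjusting for your instance.

```python
def build_incident(app_name, triage_note, app_url, reload_id, assignment_group, priority=2):
    """Build the incident fields listed in step 6 as a plain dict."""
    description = "\n".join([
        triage_note,
        f"Qlik app: {app_url}",
        f"Failed reload ID: {reload_id}",
    ])
    return {
        "short_description": f"Qlik reload failed: {app_name}",
        "description": description,
        "priority": priority,  # defaults to 2 (High)
        "category": "Software",
        "assignment_group": assignment_group,
    }


def append_incident(slack_text, incident_number):
    """Add the incident number to the Slack message text."""
    return f"{slack_text}\nServiceNow incident: {incident_number}"
```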

7. If no failures were found since the last run, exit silently without posting to Slack or opening a ServiceNow incident.

8. After a successful run, update the stored last-run timestamp to the time this run started (not the time it finished), so a reload that ends while the run is in progress is picked up by the next invocation.
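
The last-run timestamp needs somewhere to live between invocations. The Python sketch below keeps it in a small JSON file; the file layout, the `last_run` key, and the function names are all assumptions, and a hosted agent would more likely use whatever state store its platform provides. The sketch stores whichever timestamp the caller passes in.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def load_last_run(path, default):
    """Read the stored timestamp, or return default on the first run."""
    state_file = Path(path)
    if not state_file.exists():
        return default
    stored = json.loads(state_file.read_text())
    return datetime.fromisoformat(stored["last_run"])


def save_last_run(path, timestamp):
    """Store a timezone-aware timestamp as a UTC ISO 8601 string.

    Call this only after the run has succeeded, so a failed run is
    retried over the same window next time.
    """
    as_utc = timestamp.astimezone(timezone.utc)
    Path(path).write_text(json.dumps({"last_run": as_utc.isoformat()}))
```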

Inputs the workflow should expose:

- watchedAppIds: optional list of Qlik app UUIDs (empty = all apps).

- businessCriticalAppIds: optional list of Qlik app UUIDs that escalate to ServiceNow.

- slackChannel: target channel name or ID (default #data-platform).

- serviceNowAssignmentGroup: assignment group for created incidents.

- defaultIncidentPriority: integer 1 to 5, defaulting to 2.

- timezone: IANA time zone for schedule and timestamps.
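
For reference, the inputs above could be modeled as a small Python dataclass like the sketch below. The field names follow the list above rather than Python naming conventions so they are easy to trace back. The class name, the helper methods, the empty default for the assignment group, and the `UTC` default for `timezone` are assumptions, since the text does not specify them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TriageInputs:
    watchedAppIds: List[str] = field(default_factory=list)  # empty = all apps
    businessCriticalAppIds: List[str] = field(default_factory=list)
    slackChannel: str = "#data-platform"
    serviceNowAssignmentGroup: str = ""  # assumption: no default is given
    defaultIncidentPriority: int = 2
    timezone: str = "UTC"  # assumption: no default is given

    def __post_init__(self):
        # Enforce the documented range of 1 to 5.
        if not 1 <= self.defaultIncidentPriority <= 5:
            raise ValueError("defaultIncidentPriority must be between 1 and 5")

    def watches(self, app_id):
        """True if the app is in scope; an empty watch list means all apps."""
        return not self.watchedAppIds or app_id in self.watchedAppIds

    def escalates(self, app_id):
        """True if a failure on this app should open a ServiceNow incident."""
        return app_id in self.businessCriticalAppIds
```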

Tone for the triage note: calm, factual, no marketing language, no emojis beyond a single status indicator. The note should read like an on-call summary, not a Slack rant.

Additional information

What does this prompt do?
  • Checks Qlik Cloud once an hour for reloads that failed or were canceled since the last run.
  • Writes a short triage note for each failure with the app name, when it failed, the likely cause, and what to check first.
  • Posts the notes to your data platform Slack channel so the team sees them in one place.
  • For apps you mark as business critical, also opens a ServiceNow incident with the priority and details linked back to the failed reload.
  • Stays quiet when nothing failed, so the channel only lights up when there is real work to do.
What do I need to use this?
  • A Qlik Cloud tenant where you can generate an API key and see the apps you care about.
  • A Slack workspace and a channel for the team that handles data platform issues.
  • A ServiceNow login with permission to create incidents, if you want incidents for business-critical apps.
  • Optional: the list of Qlik app IDs you consider business critical, so the agent knows when to escalate.
How can I customize it?
  • Change the schedule, for example run every 30 minutes, or widen or narrow the default window of weekdays between 7am and 7pm in your time zone.
  • Narrow the watch list to specific Qlik apps or spaces, or leave it on everything you can see.
  • Mark which apps are business critical so only those open a ServiceNow incident, and pick the default incident priority.
  • Pick a different Slack channel, switch to a private channel, or send to a DM instead.
  • Tune the triage note style, for example add owner mentions for specific apps or include a link to the reload log.

Frequently asked questions

What counts as a failure?
Any Qlik Cloud reload whose final status is FAILED or CANCELED since the last time the agent ran. Successful reloads are ignored.
Will it spam the channel if a reload keeps retrying?
No. The agent only looks at reloads that finished since the last run, so a single failure produces a single note. If the same app fails again on the next retry, it posts a fresh note for that new failure.
Do I have to list every app I want to watch?
No. You can leave it on all the apps your Qlik account can see, or you can give the agent a specific list of app IDs or a space to focus on.
How does it decide what is business critical?
You give it a list of Qlik app IDs that you consider critical. Only failures on those apps create a ServiceNow incident. Every other failure just goes to Slack.
What if my team uses Microsoft Teams instead of Slack?
Swap Slack for your preferred chat tool when you set the agent up. The Slack channel post is just the delivery step, so any channel the agent can post to will work.
Will it run overnight?
By default it runs once an hour during business hours. You can extend or restrict the schedule when you set it up, including overnight if your data loads run then.

Stop finding out about broken dashboards from your stakeholders.

Connect Qlik Cloud, Slack, and ServiceNow once, and Geni watches every hour so the data platform team sees failed reloads before anyone else does.