Enrich new Airtable leads from their website, sync to HubSpot

When a new lead lands in Airtable, an agent reads their company website, fills in a clean profile, and pushes the contact into HubSpot.

Agentic Task
Airtable, Proxy Scrape, HubSpot, Sales, Operations, Lead Enrichment, Data Sync

Build me an agent-based workflow that enriches new sales leads from their company website and writes the result into both Airtable and HubSpot.

Trigger: an Airtable poll trigger on the 'Leads' base, watching the 'Leads' table for the 'new_record' event. Each new row will have at minimum a contact email and a company website URL. The table also has writable columns for the enriched profile (see below).

When a new row fires, the agent should do the following:

1. Use Proxy Scrape's 'Fetch a Web Page' operation to pull the company homepage as markdown. Start on the default proxy tier. If that fails (blocked, 403, 429, empty body), retry on the 'premium' tier, then 'stealth' as the final fallback. Then fetch the '/about' page on the same domain using the same escalation. Skip the about fetch gracefully if it 404s.

2. From the combined markdown, extract a structured company profile with these fields: one-sentence company description, primary industry, apparent company size signals (e.g. team page count, 'series B', 'enterprise'), main product or service, target customer, and any recent news, launches, or funding mentioned on-site. Leave a field blank rather than guessing if the website doesn't support it.

3. Write the extracted fields back to the originating Airtable row using the 'Update Record' operation. Also set an 'enrichment_status' column to 'enriched' and stamp 'enriched_at' with the current timestamp.

4. Upsert the contact into HubSpot using the 'Batch Upsert Contacts' operation, matched by the email property. Map the company description onto a contact property (e.g. 'company_description' or 'about_company') and the industry onto HubSpot's standard 'industry' field. Include the company name and website on the contact too.
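
The tier escalation in step 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not Proxy Scrape's actual API: `FetchResult`, `fetch_with_escalation`, and the injected `fetch(url, tier)` callable are hypothetical names, and treating a 404 as final (no escalation to a stronger tier) is an assumption based on the rule to skip the about fetch when it 404s.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Tier names come from the prompt; tuple order is the escalation order.
TIERS = ("default", "premium", "stealth")


@dataclass
class FetchResult:
    status: int  # HTTP status reported for the fetch
    body: str    # page content as markdown ("" if nothing came back)


# Hypothetical signature for whatever performs the fetch: (url, tier) -> FetchResult
FetchFn = Callable[[str, str], FetchResult]


def fetch_with_escalation(url: str, fetch: FetchFn) -> Tuple[Optional[str], str]:
    """Return (markdown, note). markdown is None when no tier gave usable content."""
    note = "no tiers attempted"
    for tier in TIERS:
        result = fetch(url, tier)
        if result.status == 200 and result.body.strip():
            return result.body, f"fetched on {tier} tier"
        if result.status == 404:
            # Assumption: a missing page will not appear on a stronger proxy.
            return None, "domain returned 404"
        # Blocked, 403, 429, or empty body: remember why and try the next tier.
        note = f"no usable content on {tier} tier (status {result.status})"
    return None, note
```

For the '/about' page, a `None` result would simply mean extraction proceeds with the homepage markdown alone; for the homepage, the returned note could serve as the error text recorded on the row.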

Failure handling: if every proxy tier still fails to return usable content for the homepage, do not attempt extraction. Instead, update the Airtable row with 'enrichment_status' = 'enrichment_failed' and a short 'enrichment_error' note explaining why (e.g. 'site unreachable on stealth tier', 'domain returned 404'). Do not push a partial contact to HubSpot in that case.
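
The success and failure paths can be sketched as two small pure functions that build the data for each write. This is an illustrative sketch, not the platform's implementation: `CompanyProfile`, `airtable_fields`, and `hubspot_input` are hypothetical names, the profile column names are placeholders for your own, and the `id` / `idProperty` / `properties` shape is an assumption modeled on HubSpot's batch upsert input. Dropping blank values before the HubSpot write is also an assumption, made so that an empty extraction does not overwrite existing data.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Optional


@dataclass
class CompanyProfile:
    # Fields from step 2; an empty string means the site did not support a value.
    company_description: str = ""
    primary_industry: str = ""
    size_signals: str = ""
    main_product: str = ""
    target_customer: str = ""
    recent_news: str = ""


def airtable_fields(profile: Optional[CompanyProfile], error: str = "",
                    now: Optional[datetime] = None) -> dict:
    """Build the column values to write back to the originating row."""
    if profile is None:
        # Every tier failed: record the status and reason, nothing else.
        return {"enrichment_status": "enrichment_failed", "enrichment_error": error}
    now = now or datetime.now(timezone.utc)
    fields = asdict(profile)
    fields["enrichment_status"] = "enriched"
    fields["enriched_at"] = now.isoformat()
    return fields


def hubspot_input(email: str, company_name: str, website: str,
                  profile: Optional[CompanyProfile]) -> Optional[dict]:
    """Build one upsert input matched by email, or None on failed enrichment."""
    if profile is None:
        return None  # do not push a partial contact
    properties = {
        "email": email,
        "company": company_name,
        "website": website,
        "company_description": profile.company_description,  # example property name
        "industry": profile.primary_industry,
    }
    # Assumption: skip blanks so they do not clear values already in HubSpot.
    return {"id": email, "idProperty": "email",
            "properties": {k: v for k, v in properties.items() if v}}
```

Returning `None` from `hubspot_input` keeps the "no partial contact" rule in one place: the caller only upserts when it gets a dictionary back.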

Why an agent, not code: extracting a clean company profile from unstructured marketing copy is a judgement task. Site layouts vary wildly, and a rigid scraper will pull the wrong sentences. An LLM agent reading the markdown can produce consistent structured fields across very different sites.

Why Proxy Scrape specifically: many SaaS marketing sites sit behind Cloudflare or similar bot protections, so a plain HTTP fetch often returns a challenge page instead of the content. Proxy Scrape's tiered proxy handles that, and the markdown return format is already LLM-friendly.

Additional information

What does this prompt do?
  • Watches your Airtable Leads base and picks up each new row that has an email and a company website.
  • Reads the company's homepage and about page, even on sites that block normal bots, and pulls out a one-line description, primary industry, size signals, main product, target customer, and any recent news or launches.
  • Writes the enriched profile back onto the same Airtable row so your team sees it without leaving the base.
  • Upserts the contact into HubSpot, matched by email, with the company description and industry mapped onto the contact record.
What do I need to use this?
  • An Airtable base with a Leads table that has at least an email column and a company website column.
  • A HubSpot account where you can add or update contacts.
  • A General Input connection to Proxy Scrape, used to fetch sites that block normal traffic. No extra signup is needed; it is billed through your General Input usage.
How can I customize it?
  • Change which fields the agent fills in. Add things like tech stack guesses, pricing model, or whether the company has open roles.
  • Adjust how failed enrichments are logged. Send them to a Slack channel, drop a task on a teammate, or just mark a status column in Airtable.
  • Swap HubSpot for another CRM, or fan out to multiple destinations so the same enriched record lands in your CRM and your outbound tool.

Frequently asked questions

What happens if the company's website is blocked or down?
The agent first tries the default proxy tier, then escalates to the premium tier and finally the stealth tier for sites behind Cloudflare or similar protections. If every tier fails, the row is marked 'enrichment_failed' with a short error note, so a human can review it and the lead is not silently dropped.
Does the contact have to already exist in HubSpot?
No. Contacts are matched by email and either created or updated, so you can use this to onboard brand new leads as well as keep existing ones fresh.
How fresh is the data?
The agent reads the website live every time a new lead lands, so the description, products, and recent news reflect what the site says today, not a stale third-party database from last quarter.
Will it work if my Leads table uses different column names?
Yes. You map your own column names when you set up the workflow. As long as each lead has an email and a website URL, the agent can run.
Can I run this on leads I've already collected, not just brand new ones?
Yes. You can change the trigger to fire on any new or updated row, or run a one-time backfill by flipping a checkbox column the agent watches.

Stop hand-researching every new lead.

Connect Airtable and HubSpot once, and Geni enriches every new row from the company website in seconds.