Daily industry research library in Notion
Every morning, capture the most meaningful new articles in your industry to a Notion database with summaries, themes, and zero SEO listicles.
Build me an evolving research library in Notion that captures every meaningful new article in my industry. Run on a cron, every day at 08:00 in my local timezone.
At each run, use Ahrefs Firehose Stream with recent:24h against a rule that covers my industry keywords. I will supply the Lucene query when I configure the workflow, for example: (title:"vector database" OR added:"RAG" OR added:"retrieval augmented") AND (page_type:article OR page_type:guide) AND language:en. Treat that query as the rule and just stream the matches for the last 24 hours.
For each match, read the page title, URL, source domain, and a snippet of the added text from the Firehose event. Then decide what kind of article it is: thought-leadership essay, product launch, how-to, benchmark, or news. Drop pure SEO listicles and shallow roundups. If it is none of the keeper categories, skip it.
Before writing anything to Notion, call Notion Query a Database against my target database, filtered by the URL property, to check whether this article is already saved. If it is, skip it. This dedupes across days when the same article keeps appearing in the firehose.
For each surviving item, call Notion Create a Page in my target database. Set these properties: Title, URL, Source Domain, Theme (a short phrase you infer from the content, like "RAG evaluation" or "vector index pricing"), Article Type (one of: thought leadership, product launch, how-to, benchmark, news), and Published Date. In the page body, write a 2 to 3 sentence why-this-matters summary and quote the single most interesting line from the article.
At the end of the run, log how many items the firehose returned, how many passed the quality filter, how many were skipped as duplicates, and how many new pages were created in Notion. I want this workflow built as an agent because the quality filter, theme tagging, and why-this-matters summary all need real reasoning.
Inputs I will provide at setup time: my Lucene query for industry keywords, my Notion database ID, and the timezone for the 08:00 cron.
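For readers curious what the dedupe check and page creation in the prompt might look like at the API level, here is a minimal Python sketch that only builds the JSON request bodies. It assumes the public Notion REST API (POST /v1/databases/{database_id}/query and POST /v1/pages). The property types (URL as a url property, Article Type as a select, Source Domain and Theme as rich text), the `item` dictionary keys, and the helper names are assumptions for illustration; the agent's Notion integration may handle this differently.

```python
import json


def dedupe_query_body(url: str) -> dict:
    """Body for a database query that matches pages whose URL property equals `url`."""
    return {
        "filter": {"property": "URL", "url": {"equals": url}},
        "page_size": 1,  # one hit is enough to know the article is already saved
    }


def _text(value: str) -> list:
    """Notion rich-text array holding a single plain-text run."""
    return [{"type": "text", "text": {"content": value}}]


def create_page_body(database_id: str, item: dict) -> dict:
    """Body for creating one page. The `item` keys are hypothetical names for this sketch."""
    return {
        "parent": {"database_id": database_id},
        "properties": {
            "Title": {"title": _text(item["title"])},
            "URL": {"url": item["url"]},
            "Source Domain": {"rich_text": _text(item["source_domain"])},
            "Theme": {"rich_text": _text(item["theme"])},
            "Article Type": {"select": {"name": item["article_type"]}},
            "Published Date": {"date": {"start": item["published_date"]}},
        },
        "children": [
            # Why-this-matters summary as a paragraph block.
            {"object": "block", "type": "paragraph",
             "paragraph": {"rich_text": _text(item["summary"])}},
            # Single most interesting line as a quote block.
            {"object": "block", "type": "quote",
             "quote": {"rich_text": _text(item["quote"])}},
        ],
    }


# Placeholder values only; none of this is real article data.
example_item = {
    "title": "Example article title",
    "url": "https://example.com/article",
    "source_domain": "example.com",
    "theme": "RAG evaluation",
    "article_type": "benchmark",
    "published_date": "2024-01-15",
    "summary": "Two or three sentences on why this matters.",
    "quote": "The single most interesting line.",
}

print(json.dumps(dedupe_query_body(example_item["url"]), indent=2))
print(json.dumps(create_page_body("YOUR_DATABASE_ID", example_item), indent=2))
```

If your database uses different property types (for example a multi-select for Theme), the matching property objects would need to change accordingly.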
Additional information
What does this prompt do?
- Each morning at 08:00 in your local timezone, scans the last 24 hours of fresh web content for articles matching your industry keywords.
- Filters out SEO listicles and shallow roundups, keeping only thought leadership, product launches, how-tos, benchmarks, and substantive news.
- For every keeper, files a new Notion page with title, link, source, theme, article type, published date, a 2 to 3 sentence why-this-matters summary, and the single most interesting line quoted from the article.
- Automatically dedupes against your database so you never see the same article twice.
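The steps above can be sketched as a single loop with run counters. This is a rough Python outline under stated assumptions, not how the agent is implemented: `classify`, `already_saved`, and `create_page` are hypothetical callables standing in for the agent's reasoning and the Notion calls, and "kept" is taken to mean "passed the quality filter", so duplicates are counted among the kept items.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional

# Keeper categories named in the prompt; anything else is dropped.
KEEPER_TYPES = {"thought leadership", "product launch", "how-to", "benchmark", "news"}


@dataclass
class RunStats:
    returned: int = 0    # items the firehose returned
    kept: int = 0        # items that passed the quality filter
    duplicates: int = 0  # kept items already present in the database
    created: int = 0     # new pages created


def run_once(
    matches: Iterable[dict],
    classify: Callable[[dict], Optional[str]],
    already_saved: Callable[[str], bool],
    create_page: Callable[[dict, str], None],
) -> RunStats:
    stats = RunStats()
    for match in matches:
        stats.returned += 1
        article_type = classify(match)
        if article_type not in KEEPER_TYPES:
            continue  # listicle, shallow roundup, or unrecognized category
        stats.kept += 1
        if already_saved(match["url"]):
            stats.duplicates += 1
            continue
        create_page(match, article_type)
        stats.created += 1
    return stats


# Toy run with an in-memory set standing in for the Notion database.
saved_urls = {"https://example.com/old"}
sample_matches = [
    {"url": "https://example.com/old", "title": "Benchmarking vector indexes"},
    {"url": "https://example.com/new", "title": "How to evaluate RAG"},
    {"url": "https://example.com/list", "title": "10 best vector databases"},
]


def toy_classify(match: dict) -> Optional[str]:
    # Stand-in for the agent's judgment: treat "10 best ..." titles as listicles.
    return None if match["title"].startswith("10 best") else "how-to"


stats = run_once(
    sample_matches,
    toy_classify,
    lambda url: url in saved_urls,
    lambda match, article_type: saved_urls.add(match["url"]),
)
print(stats)
```

Passing the three steps in as callables keeps the counting logic separate from the reasoning and the API calls, which makes the end-of-run log easy to check.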
What do I need to use this?
- An Ahrefs Firehose account with a tap configured for your industry keywords.
- A Notion workspace and a database with properties for Title, URL, Source Domain, Theme, Article Type, and Published Date.
- A short list of keywords or phrases that define your industry (you supply these as a search rule).
How can I customize it?
- Change the time of day or run it twice (morning and afternoon) instead of once.
- Adjust your industry keywords or add language and content type filters to tighten the signal.
- Tell the agent which article categories you do or do not want, for example skip news but keep benchmarks.
- Swap the destination database or add extra columns like Priority or Reviewed.
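To make the scheduling option concrete, here is a small Python sketch of how "once or twice a day in my timezone" could be resolved to the next run time. It uses only the standard library; the function name, the example timezone, and the 15:00 afternoon slot are assumptions for illustration, and the actual scheduler behind the workflow may work differently.

```python
from datetime import datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo


def next_run(now: datetime, run_times: list[time], tz_name: str) -> datetime:
    """Return the first scheduled local time strictly after `now` (must be timezone-aware)."""
    tz = ZoneInfo(tz_name)
    local_now = now.astimezone(tz)
    # Check the remaining slots today, then fall back to tomorrow's slots.
    for day_offset in (0, 1):
        day = (local_now + timedelta(days=day_offset)).date()
        for slot in sorted(run_times):
            candidate = datetime.combine(day, slot, tzinfo=tz)
            if candidate > local_now:
                return candidate
    raise ValueError("run_times must not be empty")


# Once a day at 08:00, or twice a day with a hypothetical 15:00 afternoon run.
once_daily = [time(8, 0)]
twice_daily = [time(8, 0), time(15, 0)]

now_utc = datetime(2024, 1, 15, 8, 30, tzinfo=timezone.utc)  # 09:30 in Berlin
print(next_run(now_utc, once_daily, "Europe/Berlin"))
print(next_run(now_utc, twice_daily, "Europe/Berlin"))
```

Working in the configured timezone rather than UTC keeps the run at 08:00 local time across daylight-saving changes.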
Frequently asked questions
How is this different from Google Alerts?
Can I use this with my existing Notion research database?
Will I get duplicate entries if the same article keeps trending?
Do I need to write a complex query to define my industry?
What kinds of articles get filtered out?
Stop saving links by hand and let your research library build itself.
Connect Ahrefs Firehose and Notion once, and Geni files the best new articles in your industry every morning at 08:00 in your local timezone.