This example shows how to build a product extraction agent using Upsonic's Agent with the built-in FirecrawlTools. Point it at any shopping website and it scrapes the page, pulls out product names, prices, and short descriptions, then returns the results as a clean, structured table. The example targets books.toscrape.com, a publicly available, scraping-safe demo bookstore, but the same pattern works for any publicly accessible e-commerce site.

Overview

The agent has three components:
  1. Agent - an LLM-driven agent that orchestrates scraping and data extraction
  2. FirecrawlTools - the built-in Upsonic toolkit wrapping the Firecrawl API; only scrape_url is enabled to keep the tool surface minimal
  3. Task - defines the target URL and the exact output format

Project Structure

firecrawl_shopping_scraper/
├── main.py          # Entry point: Agent + FirecrawlTools + Task
├── requirements.txt # Python dependencies
└── .env             # API keys (never commit this file)

Environment Variables

Get your free Firecrawl API key: sign up at firecrawl.dev, navigate to your dashboard, and copy your key. No credit card is required to get started.
# Required: Firecrawl API key - https://firecrawl.dev
FIRECRAWL_API_KEY=fc-your-key-here

# Required: LLM provider key (example uses Anthropic Claude)
ANTHROPIC_API_KEY=your-anthropic-key-here

Installation

# With uv (recommended)
uv venv && source .venv/bin/activate
uv pip install upsonic "firecrawl-py" python-dotenv

# With pip
python3 -m venv .venv && source .venv/bin/activate
pip install upsonic firecrawl-py python-dotenv

Complete Implementation

main.py

import os
from dotenv import load_dotenv
from upsonic import Agent, Task
from upsonic.tools.custom_tools.firecrawl import FirecrawlTools

load_dotenv()

# ── 1. Configure FirecrawlTools - only scrape_url is needed ───────
firecrawl = FirecrawlTools(
    enable_scrape=True,
    enable_crawl=False,
    enable_map=False,
    enable_search=False,
    enable_batch_scrape=False,
    enable_extract=False,
    enable_crawl_management=False,
    enable_batch_management=False,
    enable_extract_management=False,
)

# ── 2. Define the extraction task ─────────────────────────────────
task = Task(
    description="""
    Scrape the homepage of http://books.toscrape.com and extract ALL
    products visible on the page.

    For each product return:
      - Name  (full book title)
      - Price (as shown, e.g. '£51.77')
      - Rating (word form, e.g. 'Three')

    Format the output as a Markdown table:

    | # | Book Title | Price | Rating |
    |---|-----------|-------|--------|

    Sort by price descending. Add a one-line summary at the top
    with the total number of products found and the price range.
    """
)

# ── 3. Create the agent ───────────────────────────────────────────
agent = Agent(
    model="anthropic/claude-sonnet-4-6",
    tools=[firecrawl],
)

# ── 4. Run ────────────────────────────────────────────────────────
result = agent.do(task)
print(result)

requirements.txt

upsonic
firecrawl-py
python-dotenv
anthropic

How It Works

| Step | What Happens |
|------|--------------|
| 1 | agent.do(task) sends the task prompt to the LLM |
| 2 | The LLM calls scrape_url("http://books.toscrape.com") via FirecrawlTools |
| 3 | Firecrawl fetches the page and returns clean Markdown |
| 4 | The LLM parses the Markdown, identifies every product block, and extracts name, price, and rating |
| 5 | Results are formatted as a sorted Markdown table and returned |
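The final sorting step can also be done deterministically in post-processing instead of relying on the LLM. A minimal sketch (helper names are illustrative, not part of Upsonic or Firecrawl):

```python
def price_value(price):
    """Parse a display price like '£51.77' into a float (strips common currency symbols)."""
    return float(price.lstrip("£$€"))

def sort_by_price_desc(products):
    """Sort (name, price, rating) tuples by numeric price, highest first."""
    return sorted(products, key=lambda p: price_value(p[1]), reverse=True)

books = [
    ("The Black Maria", "£52.15", "One"),
    ("Libertarianism for Beginners", "£59.69", "Two"),
]
# sort_by_price_desc(books) puts 'Libertarianism for Beginners' first
```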

Sample Output

Found 20 products · Price range: £10.00 – £59.69

| #  | Book Title                                   | Price  | Rating |
|----|----------------------------------------------|--------|--------|
| 1  | Libertarianism for Beginners                 | £59.69 | Two    |
| 2  | It's Only the Himalayas                      | £52.29 | Two    |
| 3  | The Black Maria                              | £52.15 | One    |
| 4  | Starving Hearts (Triangular Trade Trilogy…)  | £13.99 | Two    |
| 5  | ...                                          | ...    | ...    |
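If you want to post-process the agent's output programmatically, the Markdown table rows can be split back into records. An illustrative sketch (not part of the example's code):

```python
def parse_table(markdown):
    """Parse a Markdown table like the sample output into (title, price, rating)
    tuples, skipping the header row and the |---| separator row."""
    rows = []
    for line in markdown.strip().splitlines()[2:]:  # skip header + separator
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) == 4:  # expected columns: #, title, price, rating
            rows.append((cells[1], cells[2], cells[3]))
    return rows
```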

Extending the Example

Crawl multiple pages

Switch from scrape_url to crawl_website to follow pagination automatically:
firecrawl = FirecrawlTools(
    enable_scrape=False,
    enable_crawl=True,
    enable_crawl_management=True,
)

task = Task(
    description="""
    Crawl http://books.toscrape.com (up to 5 pages) and extract every
    product: name, price, and rating. Return a single Markdown table
    sorted by price descending.
    """
)

Structured JSON extraction

Use extract_data for schema-driven, LLM-powered extraction directly inside Firecrawl:
firecrawl = FirecrawlTools(
    enable_scrape=False,
    enable_extract=True,
)

task = Task(
    description="""
    Use extract_data on http://books.toscrape.com/* with this JSON schema:
    {
      "products": [
        {"name": "string", "price": "string", "rating": "string"}
      ]
    }
    Return the raw structured result.
    """
)
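Because extract_data returns LLM-generated JSON, a light shape check before downstream use is prudent. A hedged sketch (validate_products is a hypothetical helper, not a Firecrawl or Upsonic API):

```python
def validate_products(payload):
    """Check the extraction result matches the expected shape:
    {"products": [{"name": str, "price": str, "rating": str}, ...]}"""
    products = payload.get("products")
    if not isinstance(products, list):
        return False
    required = {"name", "price", "rating"}
    return all(
        isinstance(p, dict)
        and required <= p.keys()
        and all(isinstance(p[k], str) for k in required)
        for p in products
    )
```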

Point at a different shop

Replace the URL in the task description with any publicly accessible store:
task = Task(
    description="""
    Scrape https://your-target-shop.com and extract all visible products.
    For each product return name, price, and a short description (1-2 sentences).
    Format as a Markdown table sorted by price descending.
    """
)

Key Features

| Feature | Detail |
|---------|--------|
| Minimal tool surface | Only scrape_url is enabled; the agent cannot accidentally crawl, search, or batch-scrape |
| Clean Markdown input | Firecrawl strips boilerplate and returns structured Markdown, making product parsing straightforward |
| Model-agnostic | Swap anthropic/claude-sonnet-4-6 for any Upsonic-supported provider |
| Extensible | Switch to crawl_website or extract_data for multi-page or schema-driven extraction |

Security Notes

  • The agent only has access to scrape_url, so it cannot read local files, execute code, or access other systems.
  • Only point the agent at publicly accessible URLs. Firecrawl respects robots.txt by default.
  • Store FIRECRAWL_API_KEY and ANTHROPIC_API_KEY in .env; never hardcode keys in source files.

Repository

View the full example: Firecrawl Shopping Scraper