Lesson 1 — What is Firecrawl: Web Data Infrastructure for AI

⏱ Est. reading time: 5 min · Updated on 5/7/2026

1.1 Positioning

Firecrawl is web data infrastructure designed specifically for AI agents and LLM applications. It is not just a web scraper but a full platform for acquiring, transforming, and interacting with web data, turning unstructured web content into clean data ready for AI consumption.

Core Value Propositions:

  • Search: Built-in search engine; one call returns both search results and page content.
  • Scrape: Convert any URL into Markdown, HTML, screenshots, or structured JSON.
  • Interact: Perform clicks, form fills, and scrolls in the browser to handle dynamic content.
  • Autonomous Research (Agent): Allows AI Agents to browse multiple sites autonomously for complex research tasks.
  • File Parsing (Parse): Directly convert local PDF, Word, and Excel files into structured data.
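As a concrete illustration of the Scrape capability, here is a minimal sketch of the JSON body a /v1/scrape request might carry. The "url" and "formats" field names follow this lesson's tables and are illustrative, not the authoritative API reference.

```python
# Hypothetical sketch of a /v1/scrape request body; field names are
# assumptions based on this lesson, not the official API docs.

def build_scrape_request(url: str, formats: list[str]) -> dict:
    """Build the JSON body for a single-page scrape request."""
    return {
        "url": url,
        "formats": formats,  # e.g. ["markdown", "html", "screenshot", "json"]
    }

payload = build_scrape_request("https://example.com", ["markdown", "html"])
```

In real use this dict would be POSTed to the API with an authorization header; here it only shows the shape of the request.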

1.2 Core Capabilities at a Glance

| Capability | API Endpoint | MCP Tool Name | Description |
| --- | --- | --- | --- |
| Scrape | /v1/scrape | firecrawl_scrape | Scrape a single page with JS rendering support |
| Search | /v1/search | firecrawl_search | Integrated search and scrape |
| Crawl | /v1/crawl | firecrawl_crawl | Batch deep scraping of entire sites |
| Map | /v1/map | firecrawl_map | Discover all URLs of a site |
| Extract | /v1/extract | firecrawl_extract | Structured multi-page extraction via LLM |
| Interact | /v1/scrape + interact | firecrawl_interact | Browser interaction after scraping |
| Parse | /v1/parse | firecrawl_parse | Local file parsing |
| Agent | /v1/agent | firecrawl_agent | Autonomous browsing research agent |
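The table above can be expressed directly as a lookup, which is handy when routing MCP tool calls to raw HTTP requests. The pairs below are taken from this lesson's table; Interact is omitted because it rides on /v1/scrape with an interact option rather than having its own endpoint.

```python
# MCP tool name → REST endpoint, per the capability table in this lesson.
ENDPOINTS = {
    "firecrawl_scrape":  "/v1/scrape",
    "firecrawl_search":  "/v1/search",
    "firecrawl_crawl":   "/v1/crawl",
    "firecrawl_map":     "/v1/map",
    "firecrawl_extract": "/v1/extract",
    "firecrawl_parse":   "/v1/parse",
    "firecrawl_agent":   "/v1/agent",
}

def endpoint_for(tool_name: str) -> str:
    """Resolve an MCP tool name to its API path; raises KeyError if unknown."""
    return ENDPOINTS[tool_name]
```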

1.3 Architecture Overview

Firecrawl's underlying architecture is built to stay stable under high concurrency and aggressive anti-scraping measures:

  1. API Server (Express.js): Handles request dispatching, authentication, and routing.
  2. Worker Queue (BullMQ/Redis): Manages asynchronous tasks like Crawl and Agent jobs.
  3. Browser Engine (Playwright): A pool of headless browsers for JS rendering and interaction.
  4. Proxy Pool: Built-in global residential proxies providing three levels of anti-scraping protection.
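The interplay between the API server (step 1) and the worker queue (step 2) follows a standard async-job pattern: the server enqueues a job and returns an id, a worker processes it, and the client polls for status. The sketch below is a toy in-memory stand-in for the BullMQ/Redis queue, purely to illustrate the flow, not Firecrawl's actual implementation.

```python
# Toy async-job flow: enqueue → worker processes → client polls.
import uuid
from collections import deque

jobs: dict[str, dict] = {}   # job id → job record (stands in for Redis)
queue: deque[str] = deque()  # pending job ids (stands in for BullMQ)

def submit_crawl(url: str) -> str:
    """API server: accept a crawl request, enqueue it, return the job id."""
    job_id = uuid.uuid4().hex
    jobs[job_id] = {"url": url, "status": "queued", "pages": []}
    queue.append(job_id)
    return job_id

def worker_step() -> None:
    """Worker: pull one job off the queue and 'complete' it."""
    job = jobs[queue.popleft()]
    job["pages"].append({"url": job["url"], "markdown": "# Example"})
    job["status"] = "completed"

def poll(job_id: str) -> str:
    """Client: check job status, as a status endpoint would report it."""
    return jobs[job_id]["status"]
```

This is why Crawl and Agent calls return a job id first rather than results immediately: the heavy lifting happens asynchronously in the workers.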

1.4 Use Cases

| Scenario | Recommended Tool Combination |
| --- | --- |
| AI Agents getting real-time web info | Search → Scrape |
| Building RAG knowledge bases | Map → Crawl → Markdown |
| Competitor price monitoring | Extract + JSON Schema |
| Batch technical doc collection | Map (search) → Crawl |
| Scraping sites that require login | Scrape → Interact |
| Local PDF/Word doc parsing | Parse |
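The "Map → Crawl → Markdown" combination for RAG knowledge bases can be sketched as a two-step pipeline: discover a site's URLs, then fetch Markdown for each. The transport is injected so the flow runs without network access; the "links" and "markdown" response fields are assumptions for illustration, not guaranteed API shapes.

```python
# Hypothetical Map → Scrape pipeline for building a RAG corpus.
# request_fn(path, body) stands in for an authenticated POST to the API.

def build_rag_corpus(site: str, request_fn) -> list[dict]:
    """Discover a site's URLs, then collect each page as Markdown."""
    urls = request_fn("/v1/map", {"url": site})["links"]
    docs = []
    for url in urls:
        page = request_fn("/v1/scrape", {"url": url, "formats": ["markdown"]})
        docs.append({"url": url, "markdown": page["markdown"]})
    return docs

# Fake transport for demonstration only:
def fake_request(path: str, body: dict) -> dict:
    if path == "/v1/map":
        return {"links": [body["url"] + "/docs", body["url"] + "/blog"]}
    return {"markdown": "# " + body["url"]}

corpus = build_rag_corpus("https://example.com", fake_request)
```

In practice each Markdown document would then be chunked and embedded into a vector store; that step is independent of Firecrawl.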

1.5 Two "Agent" Concepts: Don't Get Confused

Within the Firecrawl ecosystem, there are two distinct types of Agents with very different responsibilities:

| Feature | Firecrawl Agent (FIRE-1) | LLM Agent (e.g., Claude Code) |
| --- | --- | --- |
| Location | Firecrawl Cloud Servers | Your local development environment |
| Decision Maker | Firecrawl AI decides what to search/visit | LLM decides which MCP tool to call |
| Execution Mode | One call completes all steps | Multiple calls in a loop |
| Billing | Dynamic billing (on-demand) | Standard API call billing |

Key Difference:

  • Cloud Mode: Claude Code makes one call, and the Firecrawl Cloud Agent autonomously performs all searching and browsing.
  • Local Mode: Claude Code acts as the Agent itself, calling different Firecrawl tools step-by-step to compile the final result.
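The two modes above can be contrasted in a toy sketch. Here `call_tool` is a hypothetical callable standing in for an MCP tool invocation, not a real SDK function: cloud mode issues a single firecrawl_agent call, while local mode loops over individual tools with the LLM choosing each step (reduced here to a fixed search-then-scrape plan).

```python
# Toy contrast of cloud mode (one call) vs. local mode (a loop of calls).

def cloud_mode(call_tool, task: str):
    # One round trip: the cloud agent searches and browses autonomously.
    return call_tool("firecrawl_agent", {"prompt": task})

def local_mode(call_tool, task: str):
    # Several round trips: search first, then scrape each result.
    urls = call_tool("firecrawl_search", {"query": task})
    return [call_tool("firecrawl_scrape", {"url": u}) for u in urls]

# Fake transport that records which tools were called:
def make_fake(log: list):
    def call_tool(name: str, args: dict):
        log.append(name)
        if name == "firecrawl_search":
            return ["https://a.example", "https://b.example"]
        return {"tool": name, "args": args}
    return call_tool
```

Running both against the fake transport shows the billing difference in miniature: cloud mode produces one tool call, local mode produces one call per step.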