Lesson 9 — Parse and Agent: From Local Files to Autonomous Research

⏱ Est. reading time: 3 min | Updated on 5/7/2026

Firecrawl doesn't just handle online webpages; it can also deeply parse local documents and even act as your autonomous AI research assistant.

9.1 Parse: Structured Parsing for Local Files

The Parse tool allows you to convert unstructured local documents directly into AI-readable Markdown or structured JSON.

Supported Formats:

  • PDF (Most common)
  • Word (.docx, .doc)
  • Excel (.xlsx, .xls)
  • HTML/RTF

Core Feature: Structured PDF Extraction

For contracts or financial reports, Parse's Extract mode can return specific fields as structured JSON, guided by a prompt and a schema:

{
  "filePath": "/path/to/contract.pdf",
  "formats": ["json"],
  "jsonOptions": {
    "prompt": "Extract the names of both parties, the start date, and the total amount",
    "schema": { ... }
  }
}

Tip: When parsing large PDFs, set the maxPages parameter to limit how many pages are processed and prevent token overflow.
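
The request shown above can also be assembled in code. The following is a minimal sketch, not the confirmed Parse API: it assumes the schema is ordinary JSON Schema and that maxPages is a top-level option. The helper name build_parse_request and the schema field names (party_a, party_b, start_date, total_amount) are hypothetical, chosen only to match the prompt.

```python
import json


def build_parse_request(file_path, max_pages=20):
    """Build a Parse request body for structured contract extraction.

    The schema is a hypothetical example; its field names are
    illustrative and not taken from the Firecrawl documentation.
    """
    schema = {
        "type": "object",
        "properties": {
            "party_a": {"type": "string"},
            "party_b": {"type": "string"},
            "start_date": {"type": "string"},
            "total_amount": {"type": "number"},
        },
        "required": ["party_a", "party_b", "start_date", "total_amount"],
    }
    return {
        "filePath": file_path,
        "formats": ["json"],
        # Assumption: maxPages is a top-level option that caps parsed pages.
        "maxPages": max_pages,
        "jsonOptions": {
            "prompt": "Extract the names of both parties, the start date, "
                      "and the total amount",
            "schema": schema,
        },
    }


if __name__ == "__main__":
    # Print the request body so it can be inspected or pasted into a tool call.
    print(json.dumps(build_parse_request("/path/to/contract.pdf"), indent=2))
```

Listing every field under "required" is a design choice for this sketch: it makes a missing party name or amount show up as a validation failure instead of a silently absent key.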


9.2 Agent (FIRE-1): Autonomous Research Assistant

The Agent is an advanced cloud-only feature of Firecrawl (executed asynchronously). You provide a research topic, and it automatically:

  1. Searches for relevant webpages.
  2. Browses multiple pages to extract information.
  3. Summarizes and outputs the final result.

Use Cases:

  • "Research and compare the pricing strategies of Firecrawl and Tavily."
  • "Summarize the functional differences of the most popular AI coding assistants in 2026."

9.3 Agent Asynchronous Workflow

Because Agent research tasks typically take 2–5 minutes, they run asynchronously: you start a job, then poll until the result is ready. The process is as follows:

  1. Start Task: Call firecrawl_agent to receive a Job ID.
  2. Poll Status: Call firecrawl_agent_status every 30 seconds.
  3. Retrieve Data: Once the task is complete, get the result from the data field in the response.
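
The three steps above can be sketched as a small polling loop. This is an illustration under stated assumptions, not Firecrawl's client code: start_job and get_status are caller-supplied callables standing in for the firecrawl_agent and firecrawl_agent_status tools, and the "id" key and the "completed" / "failed" status values are assumed. Only the data field and the 30-second interval come from the steps above.

```python
import time


def run_agent(prompt, start_job, get_status, interval=30, timeout=600,
              sleep=time.sleep):
    """Start an Agent job and poll until it finishes or times out.

    start_job(prompt) and get_status(job_id) are hypothetical wrappers
    around the firecrawl_agent and firecrawl_agent_status tools. The
    response keys and status strings used here are assumptions.
    """
    job_id = start_job(prompt)["id"]  # Step 1: start the task, keep the Job ID
    waited = 0
    while waited <= timeout:
        response = get_status(job_id)  # Step 2: poll the status
        if response["status"] == "completed":
            return response["data"]  # Step 3: result is in the data field
        if response["status"] == "failed":
            raise RuntimeError(f"Agent job {job_id} failed")
        sleep(interval)
        waited += interval
    raise TimeoutError(f"Agent job {job_id} not finished after {timeout}s")
```

The default timeout of 600 seconds is an arbitrary choice that leaves headroom over the typical 2–5 minute run. Passing sleep as a parameter lets the loop be tested without real waiting.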

9.4 Agent vs Other Tools

Your Need                                     | Recommended Tool
Know which URL to get data from               | Scrape
Know what keyword to search for summaries     | Search
Vague research goal requiring multi-site data | Agent
Processing PDF reports on your computer       | Parse
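
The comparison above can be read as a simple decision rule. The sketch below is a hypothetical helper (recommend_tool is not part of Firecrawl), and the order in which the conditions are checked is an assumption made for illustration.

```python
def recommend_tool(has_url=False, has_keywords=False, local_file=False):
    """Map a need to the tool recommended in the comparison above.

    The priority order (local file, then URL, then keywords) is an
    assumption; the lesson does not say how to break ties.
    """
    if local_file:
        return "Parse"   # documents on your own computer
    if has_url:
        return "Scrape"  # you already know the page to read
    if has_keywords:
        return "Search"  # you know what to search for
    return "Agent"       # vague goal that spans multiple sites


if __name__ == "__main__":
    print(recommend_tool(has_url=True))
    print(recommend_tool())
```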