Lesson 4 — Cloud vs Self-Hosted: How to Choose? — Mastering Firecrawl — The Ultimate Guide to AI Web Scraping

Before deciding between Firecrawl Cloud and Self-Hosted, you need to weigh your options across three dimensions: features, performance, and cost.

4.1 Feature Comparison Matrix

Feature	Cloud	Self-Hosted
Basic Scraping (Scrape)	✅ Fully Supported	✅ Fully Supported
Search Engine (Search)	✅ Built-in	❌ Requires external Search API
Research Agent (Agent)	✅ Exclusive Feature	❌ Not Supported
Residential Proxy Pool	✅ Built-in 3 levels	❌ Must provide your own
Structured Extraction (LLM)	✅ Ready to use	⚠️ Requires own LLM API Key
Browser Interaction	✅ Ready to use	⚠️ Requires Playwright service
Maintenance Cost	Zero Maintenance	Requires server management

Firecrawl Cloud bills through a credit system. In most cases, 1 credit corresponds to 1 scraped page.

Cloud: The Hobby plan ($19/mo) includes 3,000 credits, or roughly 3,000 pages. It suits small to medium-sized projects and teams that need rapid deployment.
Self-Hosted: You pay for the servers (approx. $20-$100/mo) plus any proxy IPs. Self-hosting usually becomes economically advantageous only when monthly volume exceeds 500,000 pages.
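The arithmetic behind these figures can be sketched as a per-page cost calculation. This is a rough illustration only: the helper name `cost_per_page` is hypothetical, the self-hosted figure covers the server alone (no proxy or staff time), and larger cloud plans, which this lesson does not cover, may price pages differently.

```python
def cost_per_page(monthly_cost_usd: float, pages_per_month: int) -> float:
    """Return the effective cost of one scraped page, in USD."""
    if pages_per_month <= 0:
        raise ValueError("pages_per_month must be positive")
    return monthly_cost_usd / pages_per_month


# Cloud: Hobby plan figures from the lesson ($19/mo, about 3,000 pages).
hobby = cost_per_page(19, 3_000)

# Self-hosted: the upper server estimate from the lesson ($100/mo) at the
# 500,000-page threshold. Proxy and maintenance costs are NOT included.
self_hosted = cost_per_page(100, 500_000)

print(f"Hobby plan:  ${hobby:.4f} per page")
print(f"Self-hosted: ${self_hosted:.4f} per page (server only)")
```

Plugging in your own hosting bill, proxy spend, and expected volume gives a quick first estimate before you commit to either option.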

Use this logic to quickly decide:

Do you need Search functionality?
- Yes → Cloud is preferred (self-hosted lacks a search index).
Is your monthly volume less than 100,000 pages?
- Yes → Cloud is preferred (you save time on maintenance).
Does your data have strict privacy or compliance requirements?
- Yes → Self-hosting is mandatory.
Do you have strong DevOps skills and want the lowest possible cost?
- Yes → Self-hosting is preferred.
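The checklist above can be written as a small function. This is a minimal sketch, not an official decision tool: the function name, parameter names, and return strings are hypothetical. Two choices are assumptions rather than part of the lesson: the compliance check runs first (because the lesson calls self-hosting mandatory in that case), and Cloud is the fallback when no rule applies.

```python
def recommend_deployment(
    needs_search: bool,
    monthly_pages: int,
    strict_compliance: bool,
    strong_devops: bool,
) -> str:
    """Return "cloud" or "self-hosted" based on the lesson's checklist."""
    # Assumption: compliance overrides everything else, since the lesson
    # describes self-hosting as mandatory in that case.
    if strict_compliance:
        return "self-hosted"
    # Self-hosted has no search index, so Search points to Cloud.
    if needs_search:
        return "cloud"
    # Below 100,000 pages per month, saved maintenance time favors Cloud.
    if monthly_pages < 100_000:
        return "cloud"
    # High volume plus strong DevOps skills favors self-hosting.
    if strong_devops:
        return "self-hosted"
    # Assumption: fall back to Cloud when no rule applies.
    return "cloud"


print(recommend_deployment(False, 800_000, False, True))
```

If your priorities differ, for example if Search matters more to you than data residency, reorder the checks accordingly.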

Three request options help keep costs and response times down:

Enable Caching: Set storeInCache: true. Repeated requests for the same URL within the cache period do not consume credits.
Targeted Scraping: Use Map to discover URLs first, then use includeTags to fetch only the core content of the pages you need. Smaller output means fewer tokens for downstream models, which lowers AI costs.
Remove Images: Set removeBase64Images: true to strip inline base64 image data from the output. The smaller response can significantly speed up response times.
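These options can be combined in a single request body, sketched below. The option names storeInCache, includeTags, and removeBase64Images come from this lesson. The helper name, the flat payload shape with a url key, the example URL, and the tag values are assumptions, so check them against the Firecrawl API reference before use. The sketch only builds the payload and sends nothing.

```python
import json


def build_scrape_options(url: str, include_tags: list[str]) -> dict:
    """Build a scrape request body using the cost-saving options above."""
    return {
        "url": url,                  # page to scrape (e.g. one found via Map)
        "includeTags": include_tags, # keep only these HTML tags
        "removeBase64Images": True,  # drop inline base64 image data
        "storeInCache": True,        # allow the result to be cached
    }


# Hypothetical URL and tags, for illustration only.
payload = build_scrape_options(
    "https://example.com/docs/intro",
    ["article", "main"],
)
print(json.dumps(payload, indent=2))
```

Keeping the options in one helper makes it easy to apply the same settings to every URL returned by a Map call.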