Self-host Firecrawl

Self-host Firecrawl only when control over the deployment matters more than the reliability of the managed service.

Self-hosting is useful for internal security boundaries and custom services. It also adds maintenance, queue, browser, network, and AGPL review work.

Infrastructure

  • Docker or Kubernetes runtime.
  • API server, worker, Redis, PostgreSQL where configured, and browser service.
  • Health checks for API, queue, and browser extraction.
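One way to wire up the three health checks is a small aggregator that runs one probe per component and reports a status for each. This is a minimal sketch, not part of Firecrawl: the function names and the probe callables are hypothetical, and each real probe would have to be written against your own API, queue, and browser service.

```python
from typing import Callable, Dict


def run_health_checks(checks: Dict[str, Callable[[], bool]]) -> Dict[str, str]:
    """Run each named probe and report 'ok' or a failure string per component."""
    results: Dict[str, str] = {}
    for name, probe in checks.items():
        try:
            results[name] = "ok" if probe() else "fail"
        except Exception as exc:
            # A probe that raises (timeout, refused connection) counts as a failure.
            results[name] = f"fail: {exc}"
    return results


def all_healthy(results: Dict[str, str]) -> bool:
    """True only when every component reported 'ok'."""
    return all(status == "ok" for status in results.values())


if __name__ == "__main__":
    # Placeholder probes (assumptions): replace each lambda with a real check,
    # for example an HTTP request to the API or a scrape of a JavaScript page.
    report = run_health_checks({
        "api": lambda: True,
        "queue": lambda: True,
        "browser": lambda: True,
    })
    print(report, all_healthy(report))
```

Keeping the probes as plain callables lets the same aggregator run from a container health check, a cron job, or a test suite.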

Environment names

  • PORT, HOST, and authentication mode.
  • Optional AI provider variables for structured extraction.
  • Proxy, SearXNG, Supabase, and parser keys only when needed.
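Validating these variables at startup catches a bad port or a missing host before the service begins accepting work. The sketch below reads only PORT and HOST, the two names listed above; the `ServerConfig` type, the `load_server_config` helper, and both default values are assumptions for illustration, so check them against your own deployment.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServerConfig:
    host: str
    port: int


def load_server_config(env: Optional[Mapping[str, str]] = None) -> ServerConfig:
    """Read HOST and PORT from the environment and validate the port."""
    if env is None:
        env = os.environ
    host = env.get("HOST", "127.0.0.1")  # assumed default, not taken from upstream
    raw_port = env.get("PORT", "3002")   # assumed default, not taken from upstream
    try:
        port = int(raw_port)
    except ValueError:
        raise ValueError(f"PORT must be an integer, got {raw_port!r}") from None
    if not 1 <= port <= 65535:
        raise ValueError(f"PORT must be between 1 and 65535, got {port}")
    if not host:
        raise ValueError("HOST must not be empty")
    return ServerConfig(host=host, port=port)


if __name__ == "__main__":
    print(load_server_config())
```

Passing the environment as a mapping keeps the function easy to test without touching the real process environment.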

Security

  • Strong database credentials.
  • Private database ports.
  • Protected queue admin access.
  • Rate limits and outbound domain policy.
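An outbound domain policy can start as an allow list with a deny list that takes precedence, checked before any URL reaches the scraper. The following is a minimal sketch under that assumption; `is_allowed` and the example domains are hypothetical, and the check covers only the URL as written, not redirects or what the hostname resolves to.

```python
from typing import Iterable
from urllib.parse import urlsplit


def is_allowed(url: str,
               allowed_domains: Iterable[str],
               denied_domains: Iterable[str] = ()) -> bool:
    """Return True if url is http(s) and its host falls under an allowed domain."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    host = (parts.hostname or "").lower().rstrip(".")
    if not host:
        return False

    def matches(domain: str) -> bool:
        # Match the domain itself or any subdomain, never a bare suffix:
        # "evilexample.com" must not match "example.com".
        domain = domain.lower()
        return host == domain or host.endswith("." + domain)

    # Deny rules win over allow rules.
    if any(matches(d) for d in denied_domains):
        return False
    return any(matches(d) for d in allowed_domains)


if __name__ == "__main__":
    allow = {"example.com"}
    deny = {"internal.example.com"}
    for candidate in ("https://docs.example.com/page",
                      "https://internal.example.com/admin",
                      "https://evilexample.com/"):
        print(candidate, is_allowed(candidate, allow, deny))
```

Logging each rejected URL alongside the rule that rejected it gives you the audit trail the compliance stage asks for.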

AGPL-3.0 checkpoint

Firecrawl is licensed primarily under AGPL-3.0. If you modify AGPL-covered code and run it as a network service that others can reach, plan source availability, notices, license text, and legal review before launch.

Self-host limits to document

The upstream self-host guide notes limitations around advanced managed capabilities. Public copy should not promise the same reliability, proxy coverage, or anti-blocking behavior as the official cloud unless you can prove it in your own deployment.

Stage | Decision | Evidence to keep
Local smoke | Can scrape a simple allowed URL. | Request, response, logs, queue status.
Browser path | Can handle JavaScript-heavy pages needed by the product. | Browser service logs, timeout settings, screenshot sample.
Scale path | Can crawl or batch scrape without queue collapse. | Limits, retry rules, memory and CPU observations.
Compliance path | Can enforce allowed domains, robots policy, and data boundaries. | Policy file, deny list, audit log sample.
License path | Can satisfy AGPL notices and source obligations. | License page, source link, modification changelog.