
feat(tools): add Cloudflare Browser Rendering backend for web_crawl #1008

Open
vu1n wants to merge 1 commit into NousResearch:main from vu1n:feat/web-crawl-cloudflare

vu1n commented Mar 12, 2026

Summary

  • Registers web_crawl as a tool (previously defined but unregistered) with two backends: Firecrawl (existing) and Cloudflare Browser Rendering (new)
  • Backend auto-detected from env vars (CF_BROWSER_TOKEN + CF_ACCOUNT_ID → Cloudflare, FIRECRAWL_API_KEY → Firecrawl), or forced per-call via the backend parameter
  • Cloudflare implementation: async POST to start crawl job, poll for completion, paginate results, normalize to same page format as Firecrawl
  • Adds web_crawl to the web toolset and _HERMES_CORE_TOOLS
  • Adds Cloudflare Browser Rendering as a provider option in CLI setup
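
For reviewers, the Cloudflare flow described above (start job, poll, paginate) can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `run_crawl`, the three injected callables, the status strings, and the cursor-based pagination are all assumptions made for the example. The actual request and response shapes of the Cloudflare /crawl API are not shown here.

```python
import time
from typing import Any, Callable, Dict, List, Optional, Tuple

Page = Dict[str, Any]


def run_crawl(
    start_job: Callable[[], str],
    get_status: Callable[[str], str],
    get_page: Callable[[str, Optional[str]], Tuple[List[Page], Optional[str]]],
    poll_interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Page]:
    """Start an async crawl job, wait for it to finish, then collect all pages.

    The callables are hypothetical stand-ins for the HTTP calls:
      start_job()            -> job id
      get_status(job_id)     -> "running" | "completed" | "failed" (assumed names)
      get_page(job_id, cur)  -> (batch of pages, next cursor or None)
    """
    job_id = start_job()
    deadline = time.monotonic() + timeout

    # Poll until the job reports completion, fails, or the deadline passes.
    while True:
        status = get_status(job_id)
        if status == "completed":
            break
        if status == "failed":
            raise RuntimeError(f"crawl job {job_id} failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"crawl job {job_id} did not finish in {timeout}s")
        sleep(poll_interval)

    # Walk the result pages until the backend stops returning a cursor.
    pages: List[Page] = []
    cursor: Optional[str] = None
    while True:
        batch, cursor = get_page(job_id, cursor)
        pages.extend(batch)
        if cursor is None:
            break
    return pages
```

Injecting the transport as callables keeps the loop testable without network access; the real implementation may structure this differently.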

Env vars

Variable          | Purpose
CF_ACCOUNT_ID     | Cloudflare Account ID
CF_BROWSER_TOKEN  | API Token with Browser Rendering - Edit permission
WEB_CRAWL_BACKEND | Optional global override (cloudflare or firecrawl)
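
A rough sketch of how these variables could drive backend selection. The function name `resolve_backend` is hypothetical, and the precedence shown (per-call `backend` first, then WEB_CRAWL_BACKEND, then auto-detection with Cloudflare checked before Firecrawl) is an assumption for illustration; the PR text does not spell out the ordering when both sets of credentials are present.

```python
import os
from typing import Mapping, Optional

VALID_BACKENDS = ("cloudflare", "firecrawl")


def resolve_backend(
    backend: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick a crawl backend from an explicit choice or from env vars."""
    env = os.environ if env is None else env

    # An explicit per-call value wins; otherwise fall back to the global override.
    forced = backend or env.get("WEB_CRAWL_BACKEND")
    if forced:
        choice = forced.strip().lower()
        if choice not in VALID_BACKENDS:
            raise ValueError(f"unknown backend {forced!r}; expected one of {VALID_BACKENDS}")
        return choice

    # Auto-detect: Cloudflare needs both the token and the account ID.
    if env.get("CF_BROWSER_TOKEN") and env.get("CF_ACCOUNT_ID"):
        return "cloudflare"
    if env.get("FIRECRAWL_API_KEY"):
        return "firecrawl"

    raise RuntimeError(
        "no crawl backend configured: set CF_BROWSER_TOKEN + CF_ACCOUNT_ID "
        "or FIRECRAWL_API_KEY"
    )
```

Passing `env` explicitly makes the resolution easy to exercise in tests without mutating the process environment.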

Test plan

  • uv run --extra dev pytest tests/tools/test_web_tools_config.py — 8/8 pass
  • Import verification: registry picks up web_crawl, toolset resolution includes it
  • End-to-end Cloudflare crawl tested against httpbin.org (2 pages crawled successfully)
  • Firecrawl backend regression (requires FIRECRAWL_API_KEY)
  • Test on Linux

Platform tested

  • macOS (Darwin 24.6.0)

vu1n force-pushed the feat/web-crawl-cloudflare branch 2 times, most recently from 2f61681 to c6639f1 on March 12, 2026 04:07

Register web_crawl as a tool with support for two backends:
- Firecrawl (existing) via the Firecrawl SDK
- Cloudflare Browser Rendering (new) via the /crawl REST API

Backend is auto-detected from env vars (CF_BROWSER_TOKEN + CF_ACCOUNT_ID
for Cloudflare, FIRECRAWL_API_KEY for Firecrawl) or forced per-call via
the `backend` parameter. WEB_CRAWL_BACKEND env var provides a global
override.

Cloudflare backend: POST to start an async crawl job, poll for
completion, paginate results, normalize to the same page format as
Firecrawl output.
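
The normalization step could look something like the sketch below. The field names on both sides (`url`, `title`, `markdown`, `content`, `metadata`) and the helper names are hypothetical, chosen only to illustrate mapping backend-specific records onto one shared page shape; the actual schemas used by this PR are not shown in the description.

```python
from typing import Any, Dict, Iterable, List

Page = Dict[str, Any]


def normalize_page(raw: Page) -> Page:
    """Map one backend-specific record onto a common page dict (assumed shape)."""
    metadata = raw.get("metadata") or {}
    return {
        # Prefer top-level fields, fall back to nested metadata, then empty string.
        "url": raw.get("url") or metadata.get("url", ""),
        "title": raw.get("title") or metadata.get("title", ""),
        "content": raw.get("markdown") or raw.get("content") or "",
    }


def normalize_pages(raw_pages: Iterable[Page]) -> List[Page]:
    """Normalize every non-empty record, preserving order."""
    return [normalize_page(p) for p in raw_pages if p]
```

With a shared shape, downstream code does not need to know which backend produced a page.
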

vu1n force-pushed the feat/web-crawl-cloudflare branch from c6639f1 to b55b469 on March 12, 2026 04:10