Fetch and CDP Design
This note records the intended direction for browser-backed fetch support.
Boundary
afhttp should stay a deterministic transport/acquisition tool. It should not
become a web-reading agent.
The tool may:
- fetch HTTP responses
- run a browser renderer when explicitly requested or when a future fetch mode requires it
- preserve browser state behind a session_id
- expose a low-level CDP command tunnel
- return raw facts and artifacts such as headers, body files, rendered HTML, screenshots, network logs, and CDP results
The tool must not:
- infer page intent
- decide that a page is a login page
- choose selectors
- fill forms or click buttons on its own
- parse HTML into “meaning”
- run readability/markdown extraction as a core behavior
- invent a Playwright-like action API
The agent decides what a page means and what to do next.
Fetch
fetch should mean “acquire this URL”, not “understand this page”.
The useful v1 shape is:
{"code":"fetch","id":"p1","url":"https://example.com","render":"auto","session_id":"work"}
fetch should return a transport result with facts:
{
  "code": "fetch_result",
  "id": "p1",
  "status": 200,
  "final_url": "https://example.com/",
  "headers": {},
  "body_file": "/tmp/afhttp/p1.body.html",
  "rendered_html_file": "/tmp/afhttp/p1.rendered.html",
  "screenshot_file": "/tmp/afhttp/p1.png",
  "network_file": "/tmp/afhttp/p1.network.json",
  "trace": {
    "render_used": true,
    "duration_ms": 850
  }
}
Small raw bodies may be returned inline using the existing body / body_base64 style.
Large or browser-produced outputs should use _file fields.
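On the consuming side, an agent-side helper can hide the difference between inline and file-backed bodies. The following is a minimal Python sketch, not part of afhttp. The helper name read_body is hypothetical, and it assumes that body carries text, body_base64 carries base64-encoded bytes, and body_file is a readable local path.

```python
import base64
from pathlib import Path


def read_body(result: dict) -> bytes:
    """Return the raw body bytes of a fetch_result, whichever field carries them.

    Assumed field semantics (not confirmed by this note):
    body = text, body_base64 = base64 bytes, body_file = local file path.
    """
    if "body" in result:
        return result["body"].encode("utf-8")
    if "body_base64" in result:
        return base64.b64decode(result["body_base64"])
    if "body_file" in result:
        return Path(result["body_file"]).read_bytes()
    raise KeyError("fetch_result carries no body, body_base64, or body_file field")
```

The helper only moves bytes around. Interpreting them stays with the agent.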
render should be a transport option:
- never: only HTTP
- auto: implementation may use browser rendering if the fetcher is configured to do so
- always: require browser rendering or return a structured unavailable error
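The three render values can be read as a small decision table. Below is a minimal Python sketch under stated assumptions: the function name resolve_render, the two boolean inputs, and the render_unavailable error code are all hypothetical and are not defined by this note.

```python
from typing import Optional, Tuple


def resolve_render(
    mode: str, browser_available: bool, auto_render_enabled: bool
) -> Tuple[bool, Optional[str]]:
    """Map a render mode to (use_browser, error_code). Names are illustrative."""
    if mode == "never":
        # Plain HTTP only; the browser is never started.
        return False, None
    if mode == "auto":
        # The browser is used only if one exists and the fetcher is configured for it.
        return (browser_available and auto_render_enabled), None
    if mode == "always":
        if browser_available:
            return True, None
        # Hypothetical error_code for the "structured unavailable error".
        return False, "render_unavailable"
    raise ValueError(f"unknown render mode: {mode!r}")
```

Note that auto falls back to plain HTTP without reporting an error, while always fails loudly. That asymmetry is the main reason to keep the two modes separate.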
fetch should not expose browser backend names as normal agent-facing choices.
Backend selection is an implementation detail. If a debug mode is needed later,
it should be explicitly marked as debug/diagnostic.
CDP Tunnel
If browser support is implemented through CDP, expose CDP as a pipe-mode escape
hatch instead of inventing click, type, or navigate actions.
Input:
{
  "code": "cdp",
  "id": "cmd1",
  "session_id": "work",
  "method": "Runtime.evaluate",
  "params": {
    "expression": "document.title",
    "returnByValue": true
  }
}
Output:
{
  "code": "cdp_result",
  "id": "cmd1",
  "session_id": "work",
  "method": "Runtime.evaluate",
  "result": {},
  "trace": {
    "duration_ms": 12
  }
}
Errors should use normal code: "error" envelopes with CDP-specific
error_code values such as cdp_unavailable, cdp_error, or
cdp_timeout.
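As an illustration of that envelope, here is a minimal Python sketch. The function name cdp_error and the message field are assumptions. The note only fixes code: "error" and the error_code values.

```python
import json

# The three CDP-specific error_code values named in this note.
CDP_ERROR_CODES = {"cdp_unavailable", "cdp_error", "cdp_timeout"}


def cdp_error(request_id: str, error_code: str, message: str) -> str:
    """Serialize an error envelope for a failed cdp command (illustrative only)."""
    if error_code not in CDP_ERROR_CODES:
        raise ValueError(f"not a CDP error_code: {error_code!r}")
    return json.dumps(
        {
            "code": "error",
            "id": request_id,
            "error_code": error_code,
            "message": message,  # hypothetical field, not specified in this note
        }
    )
```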
Events should not stream by default. CDP emits too much data for an agent
transcript. Prefer an explicit wait_for field first:
{
  "code": "cdp",
  "id": "nav1",
  "session_id": "work",
  "method": "Page.navigate",
  "params": {"url": "https://example.com"},
  "wait_for": {"event": "Page.loadEventFired", "timeout_ms": 10000}
}
Long-lived event subscriptions can be added later if a real use case requires them.
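One way the tool side could honor wait_for is to drain incoming CDP events until the named one arrives or the timeout expires. The Python sketch below assumes events are delivered as dicts on a queue.Queue, which is an implementation choice this note does not make, and the function name wait_for_event is hypothetical. CDP event messages do carry the event name in a method field.

```python
import queue
import time


def wait_for_event(events: "queue.Queue[dict]", event: str, timeout_ms: int) -> dict:
    """Block until a CDP event named `event` arrives, or raise TimeoutError."""
    deadline = time.monotonic() + timeout_ms / 1000
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise TimeoutError(f"{event} not seen within {timeout_ms} ms")
        try:
            msg = events.get(timeout=remaining)
        except queue.Empty:
            continue  # the loop re-checks the deadline and raises
        if msg.get("method") == event:
            return msg
        # Other events are dropped here, so they never reach the agent transcript.
```

A TimeoutError from this helper would map naturally onto a cdp_timeout error envelope.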
Sessions
Only expose logical session_id values. Do not expose CDP targetId,
browserContextId, or internal session identifiers in normal output.
Rules:
- missing session_id: use an ephemeral context
- provided session_id: create or reuse that browser context
- close: close all browser contexts
- optional future command: close one named session
The session may contain cookies, localStorage, sessionStorage, current page target, and CDP attachment state. These details are implementation internals.
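The session rules can be sketched as a small registry keyed by session_id. This is a minimal Python illustration, not the intended implementation: the class name SessionRegistry, its method names, and the dict that stands in for a browser context are all hypothetical.

```python
import uuid
from typing import Dict, Optional


class SessionRegistry:
    """Maps logical session_id values to browser contexts (dict stand-ins here)."""

    def __init__(self) -> None:
        self._contexts: Dict[str, dict] = {}

    def acquire(self, session_id: Optional[str]) -> dict:
        if session_id is None:
            # Missing session_id: ephemeral context, never stored for reuse.
            return {"context": f"ephemeral-{uuid.uuid4().hex}", "ephemeral": True}
        if session_id not in self._contexts:
            # Provided session_id seen for the first time: create the context.
            self._contexts[session_id] = {"context": f"ctx-{session_id}", "ephemeral": False}
        return self._contexts[session_id]

    def close_all(self) -> int:
        """close: drop every named context and return how many were closed."""
        closed = len(self._contexts)
        self._contexts.clear()
        return closed

    def close_one(self, session_id: str) -> bool:
        """Optional future command: close one named session."""
        return self._contexts.pop(session_id, None) is not None
```

Internal identifiers stay inside the registry. Callers only ever pass the logical session_id.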
Agent Workflow
For a page that requires credentials, the tool still only returns facts. A typical workflow is:
- fetch target URL with session_id.
- Agent sees the returned URL/body/rendered artifact and decides login is needed.
- Agent sends CDP commands with the same session_id to inspect DOM, fill fields, submit, and wait for a chosen event.
- Agent calls fetch again with the same session_id.
The tool does not decide that login is required and does not choose selectors.
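From the agent's side, that workflow amounts to a short sequence of messages that share one session_id. The Python sketch below only builds the messages. The function name, the id values, and the fill_expression parameter are hypothetical, and the JavaScript is written by the agent, never by afhttp. In practice the agent would send each message only after reading the previous result.

```python
import json
from typing import List


def login_flow_messages(url: str, session_id: str, fill_expression: str) -> List[dict]:
    """Build the messages an agent might send for a credentialed page (illustrative)."""
    return [
        # 1. Acquire the target URL inside a named session.
        {"code": "fetch", "id": "f1", "url": url, "session_id": session_id},
        # 2. Agent-authored JavaScript fills and submits the form, then waits for load.
        {
            "code": "cdp",
            "id": "c1",
            "session_id": session_id,
            "method": "Runtime.evaluate",
            "params": {"expression": fill_expression, "returnByValue": True},
            "wait_for": {"event": "Page.loadEventFired", "timeout_ms": 10000},
        },
        # 3. Acquire the target URL again with the same session state.
        {"code": "fetch", "id": "f2", "url": url, "session_id": session_id},
    ]


if __name__ == "__main__":
    for msg in login_flow_messages("https://example.com", "work", "/* agent-written JS */"):
        print(json.dumps(msg))
```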
Help Without Intelligence
It is acceptable to add a cdp_help or capabilities command that returns
static templates and supported features. That help should be factual, not
adaptive:
- common CDP method examples
- supported domains
- result size limits
- whether a browser backend is available
This helps an agent use the tunnel without making afhttp itself infer intent
or plan actions.
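A static help response might look like the Python sketch below. Everything in it is a placeholder: the capabilities_result code, the field names, the domain list, and the size limit are assumptions chosen for illustration. The point is only that the payload depends on configuration, not on any page or session.

```python
def capabilities(browser_available: bool) -> dict:
    """Return a static capabilities payload (all names and values are placeholders)."""
    return {
        "code": "capabilities_result",  # hypothetical response code
        "browser_available": browser_available,
        "supported_domains": ["Page", "Runtime", "Network"],  # placeholder list
        "max_result_bytes": 1_000_000,  # placeholder limit
        "examples": [
            {
                "code": "cdp",
                "method": "Runtime.evaluate",
                "params": {"expression": "document.title", "returnByValue": True},
            },
        ],
    }
```

The function takes no URL, page content, or session as input, so it cannot adapt its answer to what the agent is currently looking at.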