Fetch and CDP Design

This note records the intended direction for browser-backed fetch support.

Boundary

afhttp should stay a deterministic transport/acquisition tool. It should not become a web-reading agent.

The tool may:

fetch HTTP responses
run a browser renderer when explicitly requested or when a future fetch mode requires it
preserve browser state behind a session_id
expose a low-level CDP command tunnel
return raw facts and artifacts such as headers, body files, rendered HTML, screenshots, network logs, and CDP results

The tool must not:

infer page intent
decide that a page is a login page
choose selectors
fill forms or click buttons on its own
parse HTML into “meaning”
run readability/markdown extraction as a core behavior
invent a Playwright-like action API

The agent decides what a page means and what to do next.

Fetch

fetch should mean “acquire this URL”, not “understand this page”.

The useful v1 shape is:

{"code":"fetch","id":"p1","url":"https://example.com","render":"auto","session_id":"work"}

fetch should return a transport result with facts:

{
  "code": "fetch_result",
  "id": "p1",
  "status": 200,
  "final_url": "https://example.com/",
  "headers": {},
  "body_file": "/tmp/afhttp/p1.body.html",
  "rendered_html_file": "/tmp/afhttp/p1.rendered.html",
  "screenshot_file": "/tmp/afhttp/p1.png",
  "network_file": "/tmp/afhttp/p1.network.json",
  "trace": {
    "render_used": true,
    "duration_ms": 850
  }
}

Small raw bodies may be returned inline using the existing body / body_base64 fields. Large or browser-produced outputs should be written to disk and referenced through the _file fields.
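On the consuming side, the inline-versus-file split could be handled with a small helper like the one below. This is only a sketch: the function name read_body and the lookup order (body, then body_base64, then body_file) are assumptions, not part of the design.

```python
import base64
from pathlib import Path


def read_body(result: dict) -> bytes:
    """Return the raw response body from a fetch_result-style dict.

    Assumed precedence (not specified by this note): inline "body",
    then inline "body_base64", then the file named by "body_file".
    """
    if "body" in result:
        return result["body"].encode("utf-8")
    if "body_base64" in result:
        return base64.b64decode(result["body_base64"])
    if "body_file" in result:
        return Path(result["body_file"]).read_bytes()
    raise KeyError("fetch_result carries no body, body_base64, or body_file")
```

The helper only moves bytes around. It does not inspect or interpret the content, which stays the agent's job.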

render should be a transport option:

never: only HTTP
auto: the implementation may use browser rendering when the fetcher is configured to do so; otherwise it uses plain HTTP
always: require browser rendering or return a structured unavailable error
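The three modes reduce to a small decision function. The sketch below is illustrative only: resolve_render, its two boolean inputs, and the render_unavailable error code are hypothetical names, and it assumes that auto silently falls back to plain HTTP when no browser is available.

```python
from typing import Optional, Tuple

RENDER_MODES = ("never", "auto", "always")


def resolve_render(
    render: str, browser_available: bool, auto_uses_browser: bool
) -> Tuple[bool, Optional[dict]]:
    """Return (use_browser, error_envelope) for one fetch request."""
    if render not in RENDER_MODES:
        raise ValueError(f"unknown render mode: {render!r}")
    if render == "never":
        return False, None
    if render == "auto":
        # Assumption: auto never fails, it falls back to plain HTTP.
        return browser_available and auto_uses_browser, None
    # render == "always": the browser is mandatory.
    if browser_available:
        return True, None
    # "render_unavailable" is a placeholder error_code, not from this note.
    return False, {"code": "error", "error_code": "render_unavailable"}
```

Note that the function takes no backend name. Which browser backend answers the request stays outside the agent-facing surface.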

fetch should not expose browser backend names as normal agent-facing choices. Backend selection is an implementation detail. If a debug mode is needed later, it should be explicitly marked as debug/diagnostic.

CDP Tunnel

If browser support is implemented through CDP, expose CDP as a pipe-mode escape hatch instead of inventing click, type, or navigate actions.

Input:

{
  "code": "cdp",
  "id": "cmd1",
  "session_id": "work",
  "method": "Runtime.evaluate",
  "params": {
    "expression": "document.title",
    "returnByValue": true
  }
}

Output:

{
  "code": "cdp_result",
  "id": "cmd1",
  "session_id": "work",
  "method": "Runtime.evaluate",
  "result": {},
  "trace": {
    "duration_ms": 12
  }
}

Errors should use the normal error envelope (code set to "error") with CDP-specific error_code values such as cdp_unavailable, cdp_error, or cdp_timeout.
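One way to produce these envelopes is to classify the failure first and then wrap it. The sketch below assumes a mapping from Python exception types to the three error_code values, and a message field on the envelope; both are illustrative guesses, since this note does not define the envelope's full shape.

```python
def cdp_error_code(exc: Exception) -> str:
    """Map a failure to one of the CDP-specific error_code values."""
    if isinstance(exc, TimeoutError):
        return "cdp_timeout"
    if isinstance(exc, ConnectionError):
        # Assumption: a failed connection means no browser is reachable.
        return "cdp_unavailable"
    return "cdp_error"


def error_envelope(request: dict, exc: Exception) -> dict:
    """Build an error envelope that echoes the request's id and session_id."""
    envelope = {
        "code": "error",
        "id": request.get("id"),
        "error_code": cdp_error_code(exc),
        "message": str(exc),  # hypothetical field, not defined by this note
    }
    if "session_id" in request:
        envelope["session_id"] = request["session_id"]
    return envelope
```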

Events should not stream by default, because CDP emits too much data for an agent transcript. As a first step, prefer an explicit wait_for field on the command:

{
  "code": "cdp",
  "id": "nav1",
  "session_id": "work",
  "method": "Page.navigate",
  "params": {"url": "https://example.com"},
  "wait_for": {"event": "Page.loadEventFired", "timeout_ms": 10000}
}

Long-lived event subscriptions can be added later if a real use case requires them.
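A wait_for implementation can drain incoming events until the named one arrives or the deadline passes, discarding everything else. The sketch below assumes events are delivered on an in-process queue as dicts with a method key (the shape CDP uses for event messages). The function name and the choice to return None on timeout are hypothetical.

```python
import queue
import time
from typing import Optional


def wait_for_event(
    events: "queue.Queue[dict]", event_name: str, timeout_ms: int
) -> Optional[dict]:
    """Block until an event named event_name arrives, or return None on timeout."""
    deadline = time.monotonic() + timeout_ms / 1000
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return None
        try:
            event = events.get(timeout=remaining)
        except queue.Empty:
            return None
        # Unrelated events are dropped, so they never reach the transcript.
        if event.get("method") == event_name:
            return event
```

A caller would translate a None return into an error envelope with error_code cdp_timeout.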

Sessions

Only expose logical session_id values. Do not expose CDP targetId, browserContextId, or internal session identifiers in normal output.

Rules:

missing session_id: use an ephemeral context
provided session_id: create or reuse that browser context
close: close all browser contexts
optional future command: close one named session

The session may contain cookies, localStorage, sessionStorage, current page target, and CDP attachment state. These details are implementation internals.
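The session rules can be sketched as a small registry that maps logical names to browser contexts. SessionRegistry and its method names are hypothetical, and the context is left as an opaque object created by a caller-supplied factory, so no CDP identifiers leak out.

```python
from typing import Callable, Dict, Optional, Tuple


class SessionRegistry:
    """Maps logical session_id values to opaque browser contexts."""

    def __init__(self, new_context: Callable[[], object]):
        self._new_context = new_context
        self._named: Dict[str, object] = {}

    def resolve(self, session_id: Optional[str]) -> Tuple[object, bool]:
        """Return (context, ephemeral) for one request."""
        if session_id is None:
            # Missing session_id: a throwaway context that is never stored.
            return self._new_context(), True
        if session_id not in self._named:
            self._named[session_id] = self._new_context()
        return self._named[session_id], False

    def close_all(self) -> int:
        """Drop every named context and return how many were closed."""
        count = len(self._named)
        self._named.clear()
        return count
```

A future per-session close command would amount to removing a single key from the same mapping.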

Agent Workflow

For a page that requires credentials, the tool still only returns facts. A typical workflow is:

Agent fetches the target URL with a session_id.
Agent sees the returned URL/body/rendered artifact and decides login is needed.
Agent sends CDP commands with the same session_id to inspect DOM, fill fields, submit, and wait for a chosen event.
Agent calls fetch again with the same session_id.

The tool does not decide that login is required and does not choose selectors.
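Seen from the agent side, the workflow above is just a sequence of messages that share one session_id. The sketch below builds such a sequence. The function name, the message ids, and the use of wait_for on a Runtime.evaluate command are assumptions, and the script that fills and submits the form is supplied by the agent rather than by the tool.

```python
import json

SESSION = "work"


def login_workflow(url: str, fill_and_submit_js: str) -> list:
    """Return the messages an agent might send, in order."""
    return [
        {"code": "fetch", "id": "p1", "url": url, "session_id": SESSION},
        {   # The agent inspects the DOM itself; the tool only evaluates.
            "code": "cdp", "id": "c1", "session_id": SESSION,
            "method": "Runtime.evaluate",
            "params": {"expression": "document.documentElement.outerHTML",
                       "returnByValue": True},
        },
        {   # Agent-written script; selectors are the agent's choice.
            "code": "cdp", "id": "c2", "session_id": SESSION,
            "method": "Runtime.evaluate",
            "params": {"expression": fill_and_submit_js, "returnByValue": True},
            "wait_for": {"event": "Page.loadEventFired", "timeout_ms": 10000},
        },
        {"code": "fetch", "id": "p2", "url": url, "session_id": SESSION},
    ]


if __name__ == "__main__":
    for message in login_workflow("https://example.com", "/* agent-written */"):
        print(json.dumps(message))
```

In practice the agent would send these one at a time and decide on the next step only after reading each result.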

Help Without Intelligence

It is acceptable to add a cdp_help or capabilities command that returns static templates and supported features. That help should be factual, not adaptive:

common CDP method examples
supported domains
result size limits
whether a browser backend is available

This helps an agent use the tunnel without making afhttp itself infer intent or plan actions.
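A static help response could look like the sketch below. Every field name and value here is a placeholder chosen for illustration (capabilities_result, supported_domains, max_result_bytes, the domain list, the size limit). The point is only that the answer depends on configuration, never on the page being visited.

```python
import copy

# Placeholder values; a real build would fill these from its configuration.
_CAPABILITIES = {
    "code": "capabilities_result",
    "supported_domains": ["Page", "Runtime", "Network", "DOM"],
    "max_result_bytes": 1048576,
    "examples": [
        {
            "code": "cdp",
            "method": "Runtime.evaluate",
            "params": {"expression": "document.title", "returnByValue": True},
        }
    ],
}


def capabilities(request_id: str, browser_available: bool) -> dict:
    """Return the same factual template for every caller."""
    result = copy.deepcopy(_CAPABILITIES)
    result["id"] = request_id
    result["browser_available"] = browser_available
    return result
```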