Architecture

This document is the canonical contract for afhttp going forward. It supersedes the pipe-mode / curl-mode / JSONL chapters of design.md. The philosophy-only content of design.md (acquisition facts, structured errors, artifact field conventions) still holds and is referenced where it applies.

1. Motivation

Agents fail before they can reason about a page when URL acquisition is opaque. Plain HTTP returns enough that a human at a terminal can guess the next step, but not enough that a program can branch deterministically. The interesting failure surface is concentrated where plain HTTP cannot produce usable artifacts:

afhttp exists to make this whole surface deterministic for agents: the tool returns facts and artifacts that let the agent decide what to do next. It is not a curl replacement and not a browser automation framework.

The contract is deliberately observation-first. A successful fetch should leave the agent with enough raw evidence to choose the next branch: what the server returned, what the browser rendered, what the page said to the console, what network requests actually carried the data, what interactive elements exist, and what this host is capable of doing.

2. Boundary

The tool may:

The tool must not:

The agent decides what a page means and what to do next. Mechanical, well-defined transformations (innerText, accessibility tree projection, DOM bounding boxes, gzip decompression, base64 encoding, request/response body capture) are not “thinking” and are allowed as artifacts. Heuristic transformations are not.

3. Roles

The networked runtime has two roles. Every command that talks to a running browser-host plays one of these roles.

| Role | Responsibility |
|---|---|
| browser-host | Runs Chromium (or compatible). Holds an on-disk profile. Exposes a CDP endpoint. Optionally embeds the ops panel. |
| agent-driver | A CDP client. Issues commands to a browser-host endpoint. Writes artifacts to its own local disk. Comes in two flavors: programmatic (afhttp fetch, afhttp cdp, or the SDK from Rust code) and interactive (ops panel served by the host, opened in any browser). |

There is no third “operator UI” role. Interactive operation is a CDP client served by the browser-host, opened by a human in a normal browser. The host does not know who its CDP clients are; that is not its concern.

afhttp profile ... commands are local administration helpers. They inspect or modify profile directories on the machine where they run and do not join the endpoint protocol.

4. Topology

Connectivity is not afhttp’s problem. The tool assumes any two afhttp instances that need to talk can reach each other over the network. Mesh, VPN, SSH tunnels, or direct LAN — that is the user’s infrastructure.

Process lifecycle is also not afhttp’s problem. afhttp host is a long-running foreground process. The user starts it however suits them (systemd unit, tmux pane, docker container, shell &, etc.) and stops it with a normal signal. afhttp does not fork, does not write pidfiles, and does not maintain a registry of running hosts.

These two non-concerns mean afhttp only ever sees endpoint URLs. The CLI and SDK take an endpoint and speak CDP. Whether the endpoint is unix:///run/afhttp/work.sock, ws://localhost:9222, or ws://browser.mesh.internal:9222 does not affect the protocol layer.
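
To illustrate how little the transport matters above the endpoint string, here is a minimal sketch of classifying an endpoint URL by scheme. The `Endpoint` enum and `parse_endpoint` function are hypothetical names used only for illustration, not part of the afhttp SDK, and only the two schemes named above are handled.

```rust
/// Transport implied by an endpoint URL (hypothetical type, for illustration).
#[derive(Debug, PartialEq)]
enum Endpoint {
    /// unix:///absolute/path/to.sock
    Unix { path: String },
    /// ws://host:port
    Ws { host: String, port: u16 },
}

/// Classify an endpoint URL by scheme. Anything other than `unix://` or
/// `ws://` is rejected; the caller would speak CDP over either transport.
fn parse_endpoint(url: &str) -> Result<Endpoint, String> {
    if let Some(path) = url.strip_prefix("unix://") {
        if path.starts_with('/') {
            return Ok(Endpoint::Unix { path: path.to_string() });
        }
        return Err(format!("unix endpoint needs an absolute path: {url}"));
    }
    if let Some(rest) = url.strip_prefix("ws://") {
        // Drop any path component, then split host from port.
        let authority = rest.split('/').next().unwrap_or("");
        let (host, port) = authority
            .rsplit_once(':')
            .ok_or_else(|| format!("missing port in {url}"))?;
        let port: u16 = port.parse().map_err(|_| format!("bad port in {url}"))?;
        if host.is_empty() {
            return Err(format!("missing host in {url}"));
        }
        return Ok(Endpoint::Ws { host: host.to_string(), port });
    }
    Err(format!("unsupported endpoint scheme: {url}"))
}
```

Both `unix:///run/afhttp/work.sock` and `ws://browser.mesh.internal:9222` reduce to a connectable address; nothing after this step needs to know which one it was.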

5. CLI Surface

Seven subcommands. No legacy --mode pipe|cli|curl flag. No JSONL stdin protocol. No curl-shim subcommand.

afhttp host

Long-running foreground process. Holds one Chromium instance and one profile. Exposes one CDP endpoint.

afhttp host
  --listen <addr>            # required; tcp:0.0.0.0:9222 | tcp:127.0.0.1:9222 | unix:/path
  --profile <name|->         # profile dir name under $XDG_DATA_HOME/afhttp/profiles/<name>;
                             # "-" or omitted = ephemeral tempdir, removed on exit
  --display headless|headful # default: headless
  --browser auto|chromium|chrome|edge|brave|lightpanda
                             # default: auto (resolve via $PATH)
  --browser-bin <path>       # override binary discovery
  --token <string>           # optional bearer for TCP listeners; mesh-trusted listens skip it
  --ops on|off               # default: on; serves /ops panel at the listen address
  --health on|off            # default: on; serves /health and /capabilities
  --health-public off|minimal # default: off; minimal = unauthenticated /health is allowed but reports readiness only

Lifecycle: starts the browser, opens the listener, serves CDP plus host HTTP routes (/ops, /health, /capabilities), blocks until SIGTERM / SIGINT. On exit, terminates the browser, removes the profile dir if ephemeral, releases the listener.

afhttp fetch

One-shot URL acquisition. Either connects to an existing host or spawns an inline ephemeral one.

afhttp fetch <url>
  --endpoint <url>           # CDP endpoint; omit to spawn inline ephemeral host
  --token <string>           # bearer for endpoint, if required
  --render none|auto|always  # default: auto. none = HTTP only; always = require browser.
  --tab new|<id>             # default: new (fresh tab); pass an existing tab id to reuse
  --wait load|idle|selector:<css>|ms:<n>
                             # when to consider the page ready; default: load
  --want <list>              # artifacts to produce; default: all browser-applicable
                             # tokens: body,rendered_html,text,screenshot,network,console,observation
  --network-bodies off|xhr|all # default: off; save response bodies referenced from network.json
  --network-body-max-bytes <n> # default: 1048576 per captured response body
  --network-redact on|off    # default: on; redact Cookie/Auth/Set-Cookie-like headers
  --out <dir>                # artifact output directory on caller's machine;
                             # default: ./afhttp-out/<request-id>/
  --timeout <duration>       # overall timeout; default: 30s

Output: one JSON object on stdout. Includes status, final_url, tab_id, per-artifact file paths, trace (render decision, escalation reason, phase timings), and warnings (e.g. per-artifact failures).

Failure: one JSON object on stdout with error_code (enum) and retryable (bool). No partial-write to stdout; the artifact directory may exist with whatever files were captured before the failure.

The inline-ephemeral path (no --endpoint) forces an ephemeral profile. The --profile flag is not available there; persistent profiles require an explicit afhttp host first. Rationale: inline fetches are one-shot and parallel-safe; persistent profile semantics demand a single owner process, which is precisely what afhttp host is.

afhttp cdp

Raw CDP escape hatch. The agent’s tool when fetch is not enough.

afhttp cdp <method>
  --endpoint <url>           # required
  --token <string>
  --tab <id>                 # required (CDP commands are target-scoped)
  --params <json|@->         # method params; "@-" reads from stdin
  --wait <event>:<timeout_ms>  # optional; wait for an event before exiting

Output: one JSON object — the CDP method’s result, or a structured error with error_code. No wrapping of CDP semantics; no click/type/navigate convenience verbs. The agent writes whatever CDP it wants.

afhttp ui

Convenience: print or open the ops panel URL for a given endpoint.

afhttp ui
  --endpoint <url>
  --token <string>
  --open                     # optionally xdg-open / open in default local browser

The ops panel itself is HTML+JS served by the host at /ops. afhttp ui is a thin wrapper for xdg-open <endpoint>/ops?token=.... Power users skip it and paste the URL into their browser directly.

afhttp health

Convenience wrapper around the host /health endpoint. Intended for agents and supervisors to decide whether a host is usable before allocating work.

afhttp health
  --endpoint <url>           # ws:// or http:// host endpoint
  --token <string>

Output: one JSON object with readiness, version, uptime, backend status, current profile summary, active tab count, and a pointer to /capabilities. Health is shallow: it reports whether the host and browser process are alive, not whether a target site is reachable.

afhttp capabilities

Convenience wrapper around the host /capabilities endpoint. Intended for agents that need to branch by backend support before requesting artifacts.

afhttp capabilities
  --endpoint <url>           # ws:// or http:// host endpoint
  --token <string>

Output: one JSON object describing backend family/version, supported artifact tokens, supported wait modes, CDP domains known to be available, profile persistence, ops-panel support, network body capture support, limits, and security policy.

afhttp profile

Local profile lifecycle tooling. These commands operate on profile directories on the machine where they run; they do not manage profiles over a remote endpoint.

afhttp profile list
  --profile-root <dir>       # optional; default: $XDG_DATA_HOME/afhttp/profiles

afhttp profile info <name>
  --profile-root <dir>

afhttp profile lock-status <name>
  --profile-root <dir>

afhttp profile delete <name>
  --profile-root <dir>
  --confirm <name>           # required; refuses to delete locked profiles

afhttp profile prune
  --profile-root <dir>
  --older-than <duration>    # e.g. 30d; refuses to prune locked profiles
  --dry-run

Output is structured JSON for every subcommand. Profile creation still happens by starting afhttp host --profile <name>; there is no separate “create empty profile” operation because Chromium initializes the directory.

6. Host Health and Capabilities Endpoints

afhttp host serves JSON host metadata on the same listener as CDP and the ops panel.

| Route | Auth | Purpose |
|---|---|---|
| GET /health | Token required unless --health-public minimal is set | Liveness/readiness for agents and supervisors. |
| GET /capabilities | Token required | Detailed backend and artifact support for planning fetch requests. |

When --token is configured, authenticated requests use Authorization: Bearer <token> or the same token query parameter accepted by afhttp ui. Unauthenticated public health is intentionally minimal: it may return only { "status": "ok" } / { "status": "starting" } / { "status": "degraded" } and never exposes profile names, browser versions, paths, tabs, or network policy.
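
The route rules above can be sketched as a small decision function. This is an illustration under stated assumptions, not the host's actual implementation: the `HealthPublic` and `Access` types and the `authorize` function are hypothetical, and a listener with no `--token` configured is assumed to treat every request as trusted.

```rust
/// Mirrors the --health-public flag values.
#[derive(Debug, PartialEq, Clone, Copy)]
enum HealthPublic {
    Off,
    Minimal,
}

/// What a request is allowed to see (hypothetical type, for illustration).
#[derive(Debug, PartialEq)]
enum Access {
    /// Full response for the requested route.
    Full,
    /// Only the minimal {"status": ...} health body.
    MinimalHealth,
    /// Request rejected.
    Denied,
}

/// Decide access for a host HTTP route.
/// Assumption: with no token configured, the listener is treated as trusted.
fn authorize(
    route: &str,
    configured_token: Option<&str>,
    presented_token: Option<&str>,
    health_public: HealthPublic,
) -> Access {
    let authenticated = match configured_token {
        None => true,
        Some(expected) => presented_token == Some(expected),
    };
    if authenticated {
        return Access::Full;
    }
    // Unauthenticated callers get at most the minimal health body.
    if route == "/health" && health_public == HealthPublic::Minimal {
        return Access::MinimalHealth;
    }
    Access::Denied
}
```

A production host would likely compare tokens in constant time; the plain equality here keeps the sketch short.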

/health response shape:

{
  "code": "health",
  "status": "ok",
  "version": "0.5.0",
  "uptime_s": 42,
  "backend": {"family": "chromium", "version": "124.0.0.0", "connected": true},
  "profile": {"kind": "persistent", "name": "work", "locked": true},
  "tabs_active": 3,
  "capabilities_url": "/capabilities"
}

/capabilities response shape:

{
  "code": "capabilities",
  "backend": {"family": "chromium", "version": "124.0.0.0"},
  "artifacts": {
    "body": {"supported": true},
    "rendered_html": {"supported": true},
    "text": {"supported": true},
    "screenshot": {"supported": true},
    "network": {"supported": true, "body_capture": ["off", "xhr", "all"]},
    "console": {"supported": true},
    "observation": {"supported": true, "source": "accessibility+dom"}
  },
  "wait_modes": ["load", "idle", "selector", "ms"],
  "ops_panel": {"supported": true, "screencast": true},
  "profile": {"persistent": true, "ephemeral": true},
  "limits": {"network_body_max_bytes_default": 1048576}
}

Capabilities are descriptive, not a reservation. A later fetch can still return per-artifact warnings if the page crashes, permissions change, or a CDP method fails.
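
One way an agent might use /capabilities before a fetch is to split its wanted artifact tokens into those the host reports as supported and those it does not. The sketch below assumes the agent has already extracted the supported tokens from the response; `plan_wants` is a hypothetical helper, not an SDK function.

```rust
/// Split requested artifact tokens into (supported, unsupported).
/// `supported` stands in for the tokens whose "supported" flag is true
/// in the /capabilities response.
fn plan_wants<'a>(
    requested: &[&'a str],
    supported: &[&str],
) -> (Vec<&'a str>, Vec<&'a str>) {
    requested
        .iter()
        .copied()
        .partition(|token| supported.iter().any(|s| *s == *token))
}
```

Because capabilities are not a reservation, the first list is only a plan: the fetch response's warnings remain the source of truth for what was actually captured.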

7. Profile Model

Profiles are Chromium user-data directories. They are host-local on disk and never copied between hosts. A profile holds cookies, localStorage, sessionStorage, IndexedDB, service worker registrations, and cached browser fingerprint state.

Profile portability is explicitly out of scope. Sessions bound to a specific IP/device fingerprint should remain on a single host; the right way to “move a session” is to put afhttp host where the session needs to be and connect to it remotely.

Profile lifecycle metadata

Persistent profile directories include a small afhttp-profile.json metadata file maintained by afhttp host and afhttp profile ... commands:

{
  "schema_version": 1,
  "name": "work",
  "created_at": "2026-05-27T00:00:00Z",
  "last_used_at": "2026-05-27T01:23:45Z",
  "last_host_version": "0.5.0"
}

The metadata is advisory. If a legacy Chromium profile has no metadata file, afhttp profile list still reports it with metadata_present: false and infers size/mtime from the filesystem.

Profile lifecycle commands:

| Command | Behavior |
|---|---|
| profile list | Lists persistent profiles with kind, path, size, metadata status, last used time, and lock status. |
| profile info <name> | Reports metadata, profile path, approximate disk usage, active lock owner when known, and browser-family hints. |
| profile lock-status <name> | Returns whether the profile is locked and, when possible, the owning pid/start time. |
| profile delete <name> | Deletes an unlocked persistent profile after --confirm <name>. Refuses ephemeral profiles and locked profiles. |
| profile prune | Deletes unlocked persistent profiles older than --older-than; --dry-run reports the candidate list without deleting. |

profile delete and profile prune are intentionally local-only. Remote deletion over the CDP/HTTP endpoint would make a stolen token able to destroy browser identities.
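
The prune rule can be sketched as a pure selection step, which is also what a --dry-run would report. This is a sketch under assumptions: `ProfileEntry`, `parse_days`, and `prune_candidates` are hypothetical names, profile age is assumed to be measured from last use, and only the `<n>d` duration form shown in the CLI help is parsed.

```rust
use std::time::{Duration, SystemTime};

/// Minimal view of a persistent profile (hypothetical type, for illustration).
struct ProfileEntry {
    name: String,
    last_used: SystemTime,
    locked: bool,
}

/// Parse a duration of the form "<n>d" (days only, as in "30d").
fn parse_days(s: &str) -> Option<Duration> {
    let days: u64 = s.strip_suffix('d')?.parse().ok()?;
    days.checked_mul(86_400).map(Duration::from_secs)
}

/// Names of profiles that prune would delete: unlocked and last used
/// more than `older_than` before `now`. Nothing is deleted here.
fn prune_candidates<'a>(
    profiles: &'a [ProfileEntry],
    now: SystemTime,
    older_than: Duration,
) -> Vec<&'a str> {
    profiles
        .iter()
        .filter(|p| !p.locked)
        .filter(|p| {
            now.duration_since(p.last_used)
                .map(|age| age > older_than)
                .unwrap_or(false) // last_used in the future: never a candidate
        })
        .map(|p| p.name.as_str())
        .collect()
}
```

Keeping selection separate from deletion means the dry-run list and the real run come from the same code path.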

8. Artifacts

Seven artifacts, identified by a stable token.

| Token | Content | Filename | Notes |
|---|---|---|---|
| body | Raw HTTP response body | body.<ext> | Always produced when an HTTP response was received. Ext derived from content-type. |
| rendered_html | Post-JS DOM serialized to HTML | rendered.html | Only when render was used. |
| text | document.body.innerText | text.txt | Only when render was used. Mechanical, not heuristic. |
| screenshot | Full-page PNG | page.png | Only when render was used. |
| network | Deep request/response log from CDP Network.* events | network.json | Always produced when a browser was used; HTTP-only fetches produce a one-entry log. Optional captured bodies live under network-bodies/. |
| console | Console events | console.json | Only when render was used. |
| observation | Agent-readable accessibility/DOM snapshot | observation.json | Only when render was used. Mechanical projection of page state; no semantic ranking or intent inference. |

Files are written to --out <dir> (default ./afhttp-out/<request-id>/) on the agent-driver’s machine, not the browser-host’s. The response JSON references them as absolute paths.

Each artifact can fail independently of the overall fetch. A missing screenshot returns warnings: [{artifact: "screenshot", code: "backend_unsupported"}] rather than failing the whole fetch. The agent decides whether the partial result is useful.
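
The Notes column of the artifact table reduces to a small rule an agent can use to tell an expected absence from a failure. The function below is a sketch of that rule; `expected_artifacts` is a hypothetical helper, and it assumes "render was used" and "a browser was used" describe the same condition.

```rust
/// Artifact tokens a fetch is expected to produce, per the artifact table.
/// An expected token that is missing from the response should have a
/// matching entry in `warnings`.
fn expected_artifacts(http_response_received: bool, render_used: bool) -> Vec<&'static str> {
    let mut out = Vec::new();
    if http_response_received {
        out.push("body");
    }
    if render_used {
        out.extend(["rendered_html", "text", "screenshot"]);
    }
    // Produced for browser fetches, and as a one-entry log for HTTP-only fetches.
    out.push("network");
    if render_used {
        out.extend(["console", "observation"]);
    }
    out
}
```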

Observation artifact

observation.json is the artifact meant for LLM and agent planning loops. It is smaller and more action-oriented than full HTML, but still mechanical data:

{
  "schema_version": 1,
  "url": "https://example.com/dashboard",
  "title": "Dashboard",
  "viewport": {"width": 1280, "height": 720, "device_scale_factor": 1},
  "frames": [{"frame_id": "main", "url": "https://example.com/dashboard"}],
  "nodes": [
    {
      "ref": "obs-17",
      "frame_id": "main",
      "role": "button",
      "name": "Export",
      "text": "Export",
      "visible": true,
      "enabled": true,
      "bbox": {"x": 1032, "y": 88, "width": 91, "height": 36},
      "actions": ["click"]
    }
  ],
  "forms": [],
  "focused_ref": null
}

Refs are stable only within one observation snapshot and the current DOM revision. They are not durable selectors. An agent that wants to act still uses raw CDP and may resolve a ref by coordinates, accessibility node id, backend DOM node id, or a best-effort selector hint included in the node when available.
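
Resolving a ref by coordinates usually means clicking the center of its bounding box. A minimal sketch, using the bbox shape from observation.json; `BBox` and `click_point` are hypothetical names, and the resulting point is what an agent might pass as `x`/`y` to a CDP call such as `Input.dispatchMouseEvent`.

```rust
/// Bounding box as it appears in an observation node.
#[derive(Debug, Clone, Copy, PartialEq)]
struct BBox {
    x: f64,
    y: f64,
    width: f64,
    height: f64,
}

/// Center of the box in viewport coordinates, or None for an empty box
/// (zero-sized nodes have no meaningful click target).
fn click_point(b: BBox) -> Option<(f64, f64)> {
    if b.width <= 0.0 || b.height <= 0.0 {
        return None;
    }
    Some((b.x + b.width / 2.0, b.y + b.height / 2.0))
}
```

For the "Export" button in the example above, this yields (1077.5, 106.0). The point is only valid for the DOM revision the snapshot was taken from, so an agent should re-observe after any navigation or layout change.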

Allowed observation fields are mechanical: accessibility role/name/state, visible text, bounding box, frame id, href/src/action URLs, form ownership, enabled/checked/selected/focused states, input type, and redacted input value metadata. Disallowed fields: “important”, “likely login”, “best button”, “captcha”, “paywall”, or any page-intent label.

Observation node collection starts with native interactive elements (a[href], button, form controls, summary, iframe) plus explicit interaction markers (role, tabindex, contenteditable=true). It also appends non-semantic elements whose computed cursor is pointer, scanning at most the first 2,000 body elements and adding at most 100 pointer nodes per snapshot. This keeps framework-authored clickable div / span elements visible to agents while preserving a deterministic payload budget.
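
The two budgets in the pointer pass (scan limit and node limit) compose naturally as lazy iterator adapters. The sketch below is an assumption about how such a pass could look, not the actual collector: `Element` is a hypothetical flattened view of a body element, and `already_collected` stands for nodes picked up by the native and explicit-marker passes.

```rust
/// Hypothetical flattened view of one element in document order.
struct Element {
    /// Computed style has cursor: pointer.
    cursor_pointer: bool,
    /// Already collected as a native or explicitly marked interactive node.
    already_collected: bool,
}

const SCAN_LIMIT: usize = 2_000; // at most this many body elements are inspected
const POINTER_NODE_LIMIT: usize = 100; // at most this many pointer nodes are appended

/// Indices of extra pointer-cursor nodes to append to the snapshot.
fn pointer_node_indices(body_elements: &[Element]) -> Vec<usize> {
    body_elements
        .iter()
        .enumerate()
        .take(SCAN_LIMIT)
        .filter(|(_, e)| e.cursor_pointer && !e.already_collected)
        .map(|(i, _)| i)
        .take(POINTER_NODE_LIMIT)
        .collect()
}
```

Because both limits are fixed and elements are visited in document order, the same DOM always yields the same node list, which is what keeps the payload budget deterministic.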

Network artifact depth

network.json is a structured capture, not a flat HAR dump. It keeps enough information for an agent to discover whether the useful data came from an XHR/fetch request, GraphQL endpoint, document load, script, iframe, service worker, cache hit, or failed resource.

Each entry includes, when available:

Response body capture is opt-in because network logs often contain credentials, PII, and large binary resources.

| Mode | Behavior |
|---|---|
| --network-bodies off | Default. Metadata only; no response bodies saved. |
| --network-bodies xhr | Saves text/JSON/XHR/fetch response bodies up to --network-body-max-bytes each. |
| --network-bodies all | Attempts to save every response body up to --network-body-max-bytes each, including documents/scripts/images when CDP exposes them. |

Captured bodies are written under network-bodies/<request_id>.<ext> and referenced from network.json via body_file. Binary bodies may be base64 files if the original bytes cannot be represented as UTF-8. Per-entry body capture failures become warnings with artifact: "network" and do not fail the fetch.
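
The capture modes and the size cap can be sketched as two small functions. This is a simplified illustration: the local `NetworkBodies` enum mirrors the CLI flag values, `should_capture` and `clip_body` are hypothetical helpers, and the `xhr` mode is approximated by the CDP resource types `XHR` and `Fetch` only (the text/JSON MIME-type handling is omitted).

```rust
/// Mirrors the --network-bodies flag values.
#[derive(Clone, Copy, PartialEq)]
enum NetworkBodies {
    Off,
    Xhr,
    All,
}

/// Whether a response body should be saved for this resource type.
/// Simplification: "xhr" is matched on CDP resource type only.
fn should_capture(mode: NetworkBodies, resource_type: &str) -> bool {
    match mode {
        NetworkBodies::Off => false,
        NetworkBodies::Xhr => matches!(resource_type, "XHR" | "Fetch"),
        NetworkBodies::All => true,
    }
}

/// Apply the per-body cap. Returns the bytes to write and whether the body
/// was truncated (which would surface as a network_body_truncated warning).
fn clip_body(body: &[u8], max_bytes: usize) -> (&[u8], bool) {
    if body.len() > max_bytes {
        (&body[..max_bytes], true)
    } else {
        (body, false)
    }
}
```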

9. Ops Panel

The ops panel is a small static HTML+JS application embedded in the afhttp binary and served by afhttp host at /ops. It exists to let a human drive the browser without VNC, X server, or any system-level remote-desktop stack on the host machine.

Architecture. The panel page loaded in the operator’s local browser opens two WebSocket flows against the host:

Risk-control honesty. Capturing real human pointer/keyboard events and replaying them via CDP gives substantially higher fingerprint fidelity than synthesized events from a coarse UI. Specifically:

For sites where the residual ops-panel fingerprint is still detected, the only durable fix is to put afhttp host --display=headful on a machine where a human physically operates the actual Chromium window. The ops panel does not pretend to be that scenario.

Multi-attach. The ops panel is a CDP client. The agent (via afhttp fetch or afhttp cdp) is also a CDP client. CDP supports multiple flattened sessions, so both can be connected to the same browser at the same time. Whichever client sends commands is the one acting. There is no handoff protocol; coordination between agent and human is the agent’s concern.

10. Backends

afhttp’s protocol layer (fetch logic, CDP escape hatch, ops panel) is CDP-generic. The launcher layer (afhttp host) knows specific browser families.

| Backend | Launch profile in host | Capabilities |
|---|---|---|
| Chromium / Chrome / Edge / Brave | chromium (and aliases) | Full: body, rendered_html, text, screenshot, network, console, observation, network body capture, ops panel, health/capabilities, multi-attach. |
| Lightpanda | lightpanda | body, rendered_html (modulo JS engine limits), text, network metadata, console, limited observation. No screenshot, no screencast, no usable ops panel (no rendering), network body capture depends on backend support. |
| Any other CDP-compatible browser | none; user launches it themselves | Whatever the backend implements. afhttp clients connect via --endpoint. |

Unsupported per-artifact operations return per-artifact warnings (backend_unsupported), not whole-fetch failures.

11. Error Codes

All errors carry error_code (stable enum), error (human-readable detail), and retryable (bool). Agents match on error_code only.

Initial enum (extended during implementation):

| Code | Meaning | Retryable |
|---|---|---|
| navigation_timeout | Page did not reach --wait condition before --timeout | yes |
| render_unavailable | --render=always requested but no browser backend available | no |
| host_unreachable | Endpoint connection failed, or browser-side navigation failed with a net::ERR_* symbol that does not match a more specific code below | yes |
| dns_resolution_failed | Target hostname did not resolve (Chromium ERR_NAME_NOT_RESOLVED, ERR_ICANN_NAME_COLLISION) | yes |
| target_unreachable | TCP-level failure to the target (Chromium ERR_CONNECTION_*, ERR_ADDRESS_UNREACHABLE) | yes |
| tls_error | TLS handshake or certificate failure to the target (Chromium ERR_CERT_*, ERR_SSL_*) | no |
| tab_crashed | The CDP target crashed mid-fetch | yes |
| profile_locked | Another process holds the profile directory | no |
| browser_launch_failed | Subprocess could not start | depends on detail |
| cdp_unavailable | The endpoint speaks something other than CDP | no |
| cdp_error | CDP method returned an error | depends on method |
| cdp_timeout | Wait-for event did not arrive in time | yes |
| backend_unsupported | Per-artifact warning; backend lacks the CDP method to produce this artifact | no |
| artifact_capture_failed | Per-artifact warning; artifact capture failed even though the backend usually supports it | depends on artifact |
| network_body_truncated | Per-artifact warning; a captured network body exceeded --network-body-max-bytes and only the prefix was written to disk | no |
| profile_not_found | Requested local profile does not exist | no |
| profile_delete_locked | Local profile delete/prune refused because a host holds the profile lock | no |
| profile_invalid_name | Profile name failed validation or would escape the profile root | no |
| profile_root_unavailable | Profile root does not exist or cannot be read/written as required | depends on detail |
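
Since agents match on error_code only, the table above translates directly into a lookup. The sketch below encodes the table's Retryable column as a planning hint; `Retry` and `retry_hint` are hypothetical names, and the `retryable` field on an actual error response should take precedence over this static default.

```rust
/// Retry classification from the error-code table.
#[derive(Debug, PartialEq)]
enum Retry {
    Yes,
    No,
    /// The table says "depends on ..."; consult the response's `retryable` field.
    Depends,
}

/// Static retry hint for a known error_code. Returns None for codes not in
/// the initial enum, since the enum may be extended during implementation.
fn retry_hint(error_code: &str) -> Option<Retry> {
    let hint = match error_code {
        "navigation_timeout" | "host_unreachable" | "dns_resolution_failed"
        | "target_unreachable" | "tab_crashed" | "cdp_timeout" => Retry::Yes,
        "render_unavailable" | "tls_error" | "profile_locked" | "cdp_unavailable"
        | "backend_unsupported" | "network_body_truncated" | "profile_not_found"
        | "profile_delete_locked" | "profile_invalid_name" => Retry::No,
        "browser_launch_failed" | "cdp_error" | "artifact_capture_failed"
        | "profile_root_unavailable" => Retry::Depends,
        _ => return None,
    };
    Some(hint)
}
```

Returning None for unknown codes forces the caller to decide explicitly how to treat codes added after it was written, rather than silently retrying or giving up.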

12. Multi-Client Attach

CDP allows multiple flattened sessions per target. The agent and the ops panel are independent clients. The browser is shared state.

Coordination is the agent’s concern, not the protocol’s. Common pattern: the agent emits an out-of-band signal (e.g. to its own orchestrator) saying “I need help on <endpoint>/tab <id>”. A human runs afhttp ui --endpoint ..., does their part, closes the panel. The agent’s next afhttp fetch --tab <id> or afhttp cdp continues from the new browser state.

The Rust SDK keeps one lazy CDP WebSocket per Client and reuses it across fetch / cdp calls until Client::close().await or drop. This cache is per SDK client, not a browser-wide lease: the ops panel and other SDK clients still attach through their own CDP connections, and all of them can continue to multi-attach to the same target. Each one-shot cdp --tab / fetch --tab operation detaches its temporary flattened session when the call completes; --tab controls target lifetime, not connection ownership.

There is no “lease,” “lock,” or “active driver” in the protocol. Both clients can issue commands at any time; if they conflict, that is the user’s coordination bug to solve.

13. Library / SDK

The Rust library exposes the same surface as the CLI, in-process. It is not an embedded browser engine; it is an SDK that talks to a browser-host over CDP/HTTP. Everything that physically requires a Chromium process — launching, active profile locking, and the ops panel — stays in afhttp host. Local profile lifecycle helpers operate on disk and do not pull browser-launch dependencies into SDK-only consumers.

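use std::time::Duration;

// NetworkBodies is used below; it is assumed here to be exported at the crate root.
use afhttp::{Artifact, Client, NetworkBodies, RenderMode, Wait};
use serde_json::json; // assumed source of the json! macro used below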

let client = Client::connect("ws://chromium-host:9222")?;

let result = client.fetch("https://example.com")
    .render(RenderMode::Auto)
    .wait(Wait::Load)
    .timeout(Duration::from_secs(30))
    .want([Artifact::RenderedHtml, Artifact::Observation, Artifact::Screenshot])
    .network_bodies(NetworkBodies::Xhr)
    .send()
    .await?;
// result.rendered_html_file -> path on the caller's local disk
// result.observation_file -> agent-readable page snapshot

let health = client.health().await?;
let capabilities = client.capabilities().await?;

// `tab_id` is the id of an existing tab, e.g. the tab_id reported by an earlier fetch.
let cdp = client.cdp("Runtime.evaluate")
    .tab(tab_id)
    .params(json!({ "expression": "document.title" }))
    .send()
    .await?;

client.close().await; // optional: closes the cached CDP connection

// Dev / test convenience: spawn a private host in-process, use it, kill it
// on drop. Requires the `host` feature; pure `features = ["sdk"]` consumers
// connect to an externally started afhttp host instead.
let local = Client::inline_ephemeral().await?;

What the SDK exposes: Client, fetch/cdp/health/capabilities builders, the artifact and error enums, request/response/cookie/render-mode/network-capture types, and local profile-store helpers.

What the SDK does not expose: chromiumoxide types, host launch internals, ops panel internals, or a remote profile-administration API.

CLI is the first SDK consumer. afhttp fetch and afhttp cdp parse args, call into the SDK, format the response. They are not parallel implementations.

Cargo features.

[features]
default  = ["sdk", "cli"]
sdk      = []                                   # client-side; what library consumers want
host     = ["dep:chromiumoxide", ...]           # browser-launch deps; only in the bin
cli      = ["sdk", "host", "dep:clap",
            "dep:agent-first-data"]             # the afhttp binary

External consumers (e.g. the fetch service) depend on the crate with default-features = false, features = ["sdk"] and link only the SDK weight, not chromiumoxide or any browser-launch code.

14. Non-Goals