TODO

Tracked follow-ups beyond the N8 wave. Each item lists what to change, why an agent benefits, and where the work lives. Order is loose — pick by what the next session needs, not top-down.

SDK ergonomics

Custom headers / cookies / user-agent on FetchBuilder

Today an agent that needs to send Authorization, a custom User-Agent, or a session cookie has no choice but to spin up afhttp host --profile and warm cookies through the browser. That is far too heavy for one-shot scripts hitting an API or for sidestepping UA-based bot rules, and it is the first wall most agents hit.

Add three builder methods (mod.rs): .header(name, value) (repeatable), .cookie(name, value) (repeatable), .user_agent(s). Plumb through:

CLI flags: --header K:V (repeatable), --cookie name=value (repeatable), --user-agent <string>. Document in architecture.md §5 and cli.md.
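As a rough sketch of the shape these methods could take (not the actual code in mod.rs), the block below uses a stand-in FetchBuilder that holds only the new fields. The struct layout, the cookie_header helper, and parse_header_flag are assumptions made for illustration; the real builder has other fields and may validate names and values differently.

```rust
/// Hypothetical stand-in for the real FetchBuilder; only the proposed fields are shown.
#[derive(Debug, Default, Clone)]
pub struct FetchBuilder {
    pub headers: Vec<(String, String)>,
    pub cookies: Vec<(String, String)>,
    pub user_agent: Option<String>,
}

impl FetchBuilder {
    /// Repeatable: every call appends one more header.
    pub fn header(mut self, name: impl Into<String>, value: impl Into<String>) -> Self {
        self.headers.push((name.into(), value.into()));
        self
    }

    /// Repeatable: every call appends one more cookie pair.
    pub fn cookie(mut self, name: impl Into<String>, value: impl Into<String>) -> Self {
        self.cookies.push((name.into(), value.into()));
        self
    }

    /// Not repeatable: the last call wins.
    pub fn user_agent(mut self, s: impl Into<String>) -> Self {
        self.user_agent = Some(s.into());
        self
    }

    /// Assumed helper: fold all cookies into a single `Cookie` header value.
    pub fn cookie_header(&self) -> Option<String> {
        if self.cookies.is_empty() {
            return None;
        }
        let pairs: Vec<String> = self
            .cookies
            .iter()
            .map(|(name, value)| format!("{name}={value}"))
            .collect();
        Some(pairs.join("; "))
    }
}

/// Assumed parser for `--header K:V`: split on the first colon only,
/// so values that contain colons survive intact.
pub fn parse_header_flag(raw: &str) -> Option<(String, String)> {
    let (name, value) = raw.split_once(':')?;
    let name = name.trim();
    if name.is_empty() {
        return None;
    }
    Some((name.to_string(), value.trim().to_string()))
}
```

Splitting on the first colon only keeps values such as URLs or bearer tokens with embedded colons in one piece, which is the most likely surprise with a K:V flag.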

Pipeline / network

empty_html_shell auto-escalation

RenderMode::Auto currently escalates only on status≥400 or connect failure. SPAs that return 200 with an empty <div id="root"></div> shell slip through, and the agent gets useless body content unless it sets --render always defensively — which negates the whole point of auto.

In pipeline::execute’s Auto arm, after the Ok(r) success branch with status < 400, inspect the in-memory body bytes already on hand: if content-type is HTML and the body has a <script> tag plus either no <body> element or fewer than ~200 visible characters, escalate. Set Trace.escalation_reason = "empty_html_shell" so the agent sees why. Keep the threshold conservative — false positives are worse than false negatives.
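The check described above could look roughly like the following. This is a minimal sketch under stated assumptions, not the pipeline code: the function names, the exact 200-character constant, and the naive tag scanner (no real HTML parser, no comment handling) are all illustrative choices.

```rust
/// Assumed threshold, taken from the "~200 visible characters" note.
const MIN_VISIBLE_CHARS: usize = 200;

/// Count non-whitespace characters that sit outside tags and outside
/// <script>/<style> contents. A deliberately naive scan, not an HTML parser.
pub fn visible_char_count(html: &str) -> usize {
    // ASCII lowercasing keeps byte offsets stable, so slicing stays valid.
    let lower = html.to_ascii_lowercase();
    let mut rest = lower.as_str();
    let mut count = 0;
    while let Some(lt) = rest.find('<') {
        count += rest[..lt].chars().filter(|c| !c.is_whitespace()).count();
        let mut tag = &rest[lt..];
        // Skip the whole body of raw-text elements up to their closing tag.
        for (open, close) in [("<script", "</script"), ("<style", "</style")] {
            if tag.starts_with(open) {
                tag = match tag.find(close) {
                    Some(pos) => &tag[pos..],
                    None => "",
                };
            }
        }
        // Resume after the end of the current tag.
        rest = match tag.find('>') {
            Some(gt) => &tag[gt + 1..],
            None => "",
        };
    }
    count + rest.chars().filter(|c| !c.is_whitespace()).count()
}

/// True when a successful response looks like an empty SPA shell:
/// HTML content type, at least one <script>, and either no <body>
/// or fewer than MIN_VISIBLE_CHARS visible characters.
pub fn is_empty_html_shell(content_type: &str, body: &[u8]) -> bool {
    if !content_type.to_ascii_lowercase().contains("text/html") {
        return false;
    }
    let html = String::from_utf8_lossy(body);
    let lower = html.to_ascii_lowercase();
    if !lower.contains("<script") {
        return false;
    }
    let has_body = lower.contains("<body");
    !has_body || visible_char_count(&html) < MIN_VISIBLE_CHARS
}
```

The scanner also counts text in elements such as <title>, which pushes borderline pages toward "not a shell". That errs on the side of fewer escalations, in line with the conservative-threshold note.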

Error contract

wait_selector_unmatched error code

ErrorCode::NavigationTimeout is currently overloaded: both “HTTP overall timeout” and “selector never appeared” share the same code. An agent that retries a navigation_timeout will retry “slow site” cases correctly but will also blindly retry “selector typo” cases forever.

Add ErrorCode::WaitSelectorUnmatched. Emit it only from the Wait::Selector polling branch in pipeline.rs::wait_for_condition (currently maps to NavigationTimeout via the CdpTimeout translation near line 258). Wait::Load/Idle/Ms and the overall HTTP fetch timeout keep emitting NavigationTimeout. Update the error-codes table in architecture.md §11 and the “Agent should” table in reference.md.
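The split could be expressed as a single mapping from the wait condition to the timeout code. The sketch below is illustrative only: the variant payloads on Wait, the as_str method, and the timeout_code_for helper are assumptions, and the real translation lives in the CdpTimeout handling mentioned above.

```rust
/// Only the two codes relevant to this item; the real enum has more variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ErrorCode {
    NavigationTimeout,
    WaitSelectorUnmatched,
}

impl ErrorCode {
    /// Wire token an agent would see (snake_case naming is assumed).
    pub fn as_str(self) -> &'static str {
        match self {
            ErrorCode::NavigationTimeout => "navigation_timeout",
            ErrorCode::WaitSelectorUnmatched => "wait_selector_unmatched",
        }
    }
}

/// Assumed shape of the wait conditions named in the text.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Wait {
    Load,
    Idle,
    Ms(u64),
    Selector(String),
}

/// Pick the code to emit when waiting on `wait` times out.
/// Only the selector branch gets the new code; everything else is unchanged.
pub fn timeout_code_for(wait: &Wait) -> ErrorCode {
    match wait {
        Wait::Selector(_) => ErrorCode::WaitSelectorUnmatched,
        Wait::Load | Wait::Idle | Wait::Ms(_) => ErrorCode::NavigationTimeout,
    }
}
```

Listing Wait::Load, Wait::Idle, and Wait::Ms explicitly instead of using a wildcard arm means a future wait mode forces a compile error here, so its timeout code is chosen deliberately.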

Deferred

Stable escalation_reason token vocabulary

The agent-facing value is low until there is a second meaningful token beyond empty_html_shell. The current free-form strings (http status <n>, http failed: <error_code>) are documented as opaque and agents that pattern-match on them keep working. Revisit when a new escalation case lands that an agent would actually branch on.

HTTP main-body size cap

Real OOM risk on multi-GB direct downloads, but no one has hit it. Stream-and-cap the resp.bytes().await at pipeline.rs:150-153 once there is a concrete report or a use case that needs binary downloads.
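To illustrate the stream-and-cap idea without pulling in dependencies, the sketch below uses std::io::Read as a stand-in for the response body. In the real pipeline the loop would run over reqwest's async chunks (for example via Response::chunk) instead; read_capped and its return shape are hypothetical.

```rust
use std::io::{self, Read};

/// Read at most `cap` bytes from `reader`.
/// Returns the bytes kept and whether the source had more than `cap` bytes.
pub fn read_capped<R: Read>(reader: R, cap: usize) -> io::Result<(Vec<u8>, bool)> {
    let mut buf = Vec::new();
    // Ask for one byte past the cap so overflow can be detected
    // without ever buffering the whole body.
    let mut limited = reader.take((cap as u64).saturating_add(1));
    limited.read_to_end(&mut buf)?;
    let truncated = buf.len() > cap;
    buf.truncate(cap);
    Ok((buf, truncated))
}
```

Whether an over-cap body should be truncated with a warning or rejected with an error is left open here; the boolean gives the caller enough information for either policy.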

Promote cheap Trace fields

render_used: bool and render_mode: "none"|"auto"|"always" are cheap to plumb and useful for agent retry logic. Add opportunistically when the next pipeline change is in the area. phase_ms, http_version, and remote_addr need real reqwest/CDP timing plumbing and belong in their own change.
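A sketch of how the two cheap fields might look and how an agent could use them follows. The Trace struct here contains only the proposed fields, and worth_retrying_with_render is a hypothetical agent-side helper, not part of the SDK.

```rust
/// Mirrors the "none" | "auto" | "always" values named above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RenderMode {
    None,
    Auto,
    Always,
}

impl RenderMode {
    /// Token as it would appear in the serialized trace.
    pub fn as_str(self) -> &'static str {
        match self {
            RenderMode::None => "none",
            RenderMode::Auto => "auto",
            RenderMode::Always => "always",
        }
    }
}

/// Hypothetical slice of Trace holding only the two proposed fields.
#[derive(Debug, Clone, Copy)]
pub struct Trace {
    pub render_used: bool,
    pub render_mode: RenderMode,
}

/// Hypothetical agent-side rule: forcing a render on retry can only help
/// if the browser was not used and rendering was not already forced.
pub fn worth_retrying_with_render(trace: &Trace) -> bool {
    !trace.render_used && trace.render_mode != RenderMode::Always
}
```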

Visibility-aware Wait::Selector

Current existence-only semantics are documented honestly after the Phase-1 truth-up. If frameworks that defer display turn out to be common enough that agents want visibility checks, add a new wait mode selector-visible:<css> rather than changing existing behavior.

Observation artifact

DONE - selector_hint_unique: bool on each node

Implemented: observation nodes now include selector_hint_unique whenever selector_hint is present, and the protocol reference documents the field.

The JS projection in src/sdk/fetch/artifacts/observation.rs builds a best-effort CSS path (tag.class:nth-of-type(n) > …) capped at four ancestors. On deep DOMs the resulting selector may match multiple elements. Today the agent has to verify uniqueness itself.

Add a selector_hint_unique boolean computed in the same JS pass: document.querySelectorAll(hint).length === 1. Skip the field when the hint itself is null. Agents can then decide between “use selector directly” and “fall back to coordinates / accessibility node id”.

DONE - Cross-link iframe nodes to their frames[] entry

Implemented: iframe observation nodes now include frame_ref pointing to the matching frames[].frame_id, assigned by DOM iframe order.

nodes includes entries with role: "iframe" (DOM tag projection) and frames lists every iframe with its own frame_id. They’re not joined, so an agent that wants to “click inside the iframe” must match by src or position.

Add a frame_ref field to iframe nodes pointing at the matching frames[].frame_id. Generation: in the JS pass, after collecting frames, walk iframe nodes and assign the same id-by-index.

DONE - Fix <select> role mapping

Implemented: select role projection now reports default/sized/multiple <select> controls as listbox, and preserves combobox only when aria-haspopup is present.

roleFor() maps <select> to combobox. WAI-ARIA spec only assigns combobox when the element has aria-haspopup or is editable; the default for <select> (single-select) is listbox. The mismatch can confuse LLMs that expect ARIA semantics.

Change to:

DONE - Extract the observation JS into an asset file

Implemented: OBSERVATION_JS now uses include_str! to load assets/observation/snapshot.js, keeping the browser projection in a standalone asset file.

The OBSERVATION_JS const in observation.rs is ~100 lines of inlined JavaScript. It’s readable but loses local linting, formatting, and testing. Move it to assets/observation/snapshot.js and load via include_str!. Optionally add a tiny browser-side smoke test (run the JS against a tiny static HTML page, validate output shape).

DONE - Capture clickable non-semantic elements

Implemented: the observation snapshot appends a bounded set of non-semantic elements whose computed cursor is pointer, and architecture.md §8 documents the scan/budget rule.

Today the selector list is a[href], button, input, textarea, select, summary, iframe, [role], [tabindex], [contenteditable=true]. Pages written in modern frameworks frequently use bare <div onclick> or <span> with cursor styles. These never appear in the observation.

Consider also collecting elements with cursor: pointer computed style, gated by a depth cap or a max-node budget to keep payload size bounded. Document the rule in architecture.md §8 so it stays mechanical.

Pipeline / network

DONE - Surface a “main request finished” debug hint

Implemented: fetch traces now include main_request_observed, true for HTTP-only responses and browser fetches with a collected main document network entry, false when the browser path never saw one.

NetworkCollector::wait_for_main_status now resolves on either status or failure. When it returns None (no main entry observed within the 500ms cap) the only signal is a warning string. Promote this into a dedicated trace.main_request_observed: bool field on FetchResult so agents can distinguish “request never started” (data: URL, blocked by extension) from “request started but slow”.

DONE - Trace escalation_reason for navigate failures

Implemented: HTTP fast-path connection failures now classify into the same fine-grained DNS/TLS/target buckets used for browser navigation, so auto-render trace.escalation_reason can carry the specific code.

When the auto-render path falls back to the browser, Trace.escalation_reason records something like http failed: host_unreachable. With the new fine-grained codes (dns_resolution_failed, tls_error, target_unreachable) the reason should propagate the specific code, not the bucket. Today http_only only knows the reqwest-level error; the mapping in http_only could be tightened to match the browser-side classifier so the trace strings stay parallel across paths.

SDK ergonomics

DONE - Single CDP connection per Client

Implemented: Client now opens one lazy cached CDP connection for fetch and cdp calls, exposes Client::close().await, and detaches one-shot flattened target sessions after each operation. architecture.md §12 documents multi-attach implications.

Each fetch / cdp call opens a fresh Connection to /cdp. For multi-step agent workflows (login → navigate → screenshot) this adds handshake overhead and prevents target reuse without manual --tab plumbing. Investigate a connection cache on Client keyed by endpoint, with explicit Client::close() to tear down.

This is a behavior change; document the implications for ops-panel multi-attach before shipping.
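The originally proposed cache keyed by endpoint can be sketched as below. Everything here is a stand-in: Connection carries no real socket, the id counter exists only to make reuse observable, and close is synchronous for brevity, whereas the shipped Client::close() is async and the shipped client holds a single lazy connection.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a live CDP connection.
#[derive(Debug)]
pub struct Connection {
    pub endpoint: String,
    /// Increments on every fresh open, so reuse is observable.
    pub id: u64,
}

/// Hypothetical client holding a lazily filled cache keyed by endpoint.
#[derive(Debug, Default)]
pub struct Client {
    cache: HashMap<String, Connection>,
    opened: u64,
}

impl Client {
    /// Return the cached connection for `endpoint`, opening one on first use.
    pub fn connection(&mut self, endpoint: &str) -> &Connection {
        if !self.cache.contains_key(endpoint) {
            self.opened += 1;
            let conn = Connection {
                endpoint: endpoint.to_string(),
                id: self.opened,
            };
            self.cache.insert(endpoint.to_string(), conn);
        }
        &self.cache[endpoint]
    }

    /// Explicit teardown: drop every cached connection and report how many.
    pub fn close(&mut self) -> usize {
        let closed = self.cache.len();
        self.cache.clear();
        closed
    }
}
```

With this shape, a login → navigate → screenshot sequence against one endpoint pays for a single handshake, and nothing is torn down until the caller asks for it.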

Documentation

DONE - Reference doc for the new error codes

Implemented: docs/reference.md now enumerates every error_code with a representative detail/errorText and an “Agent should” action.

docs/reference.md should grow a section enumerating every error_code with a one-line example errorText and an “agent should” line. Today the only reference is the table in architecture.md §11, which lists the codes but not the cases that produce them.

DONE - CLI doc for --tab reuse semantics

Implemented: docs/cli.md now documents default target allocation/close versus --tab <id> target reuse, and the clap help text explains that a reused tab is left open.

docs/cli.md (and the --help text) should call out that afhttp fetch --tab <id> reuses the target and leaves it open on exit, contrasted with default (which allocates + closes). Today the behavior is correct but undocumented; agents using --tab for session reuse benefit from explicit confirmation.