TODO
Tracked follow-ups beyond the N8 wave. Each item lists what to change, why an agent benefits, and where the work lives. Order is loose — pick by what the next session needs, not top-down.
SDK ergonomics
Custom headers / cookies / user-agent on FetchBuilder
Today an agent that needs to send Authorization, a custom User-Agent,
or a session cookie has no choice but to spin up afhttp host --profile
and warm cookies through the browser. For one-shot scripts hitting an
API, or for sidestepping UA-based bot rules, that is way too heavy. The
first wall most agents hit.
Add three builder methods (mod.rs): .header(name, value) (repeatable),
.cookie(name, value) (repeatable), .user_agent(s). Plumb through:
- HTTP fast path: build a per-request reqwest::RequestBuilder from the
  shared Client::http() instance and apply headers / COOKIE /
  USER_AGENT there. Reject duplicate Cookie headers via --header to
  avoid double-emit.
- Browser path: right after the existing Network.enable block in
  pipeline.rs::browser_path, conditionally conn.send(...) for
  Network.setExtraHTTPHeaders, Network.setCookies,
  Network.setUserAgentOverride. Raw CDP only, which keeps the SDK build
  chromiumoxide-free.
CLI flags: --header K:V (repeatable), --cookie name=value (repeatable),
--user-agent <string>. Document in architecture.md §5 and cli.md.
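A minimal sketch of the three builder methods, to make the repeatable semantics and the duplicate-Cookie check concrete. The FetchBuilder struct here is a stand-in that models only the new fields; BuildError, resolved_headers, and the reading of "duplicate" as "a raw Cookie header plus at least one .cookie() call" are assumptions, not the existing API.

```rust
/// Hypothetical stand-in for the real FetchBuilder; only the new fields are modeled.
#[derive(Debug, Default)]
pub struct FetchBuilder {
    headers: Vec<(String, String)>,
    cookies: Vec<(String, String)>,
    user_agent: Option<String>,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum BuildError {
    /// A raw Cookie header and .cookie() values were both supplied.
    DuplicateCookieHeader,
}

impl FetchBuilder {
    pub fn new() -> Self {
        Self::default()
    }

    /// Repeatable: each call appends one header.
    pub fn header(mut self, name: &str, value: &str) -> Self {
        self.headers.push((name.to_string(), value.to_string()));
        self
    }

    /// Repeatable: each call appends one name=value cookie pair.
    pub fn cookie(mut self, name: &str, value: &str) -> Self {
        self.cookies.push((name.to_string(), value.to_string()));
        self
    }

    /// Last call wins.
    pub fn user_agent(mut self, s: &str) -> Self {
        self.user_agent = Some(s.to_string());
        self
    }

    /// Flatten everything into the header list a request path could apply.
    pub fn resolved_headers(&self) -> Result<Vec<(String, String)>, BuildError> {
        let raw_cookie = self
            .headers
            .iter()
            .any(|(name, _)| name.eq_ignore_ascii_case("cookie"));
        if raw_cookie && !self.cookies.is_empty() {
            // Emitting both would put two Cookie headers on the wire.
            return Err(BuildError::DuplicateCookieHeader);
        }
        let mut out = self.headers.clone();
        if !self.cookies.is_empty() {
            let joined = self
                .cookies
                .iter()
                .map(|(n, v)| format!("{}={}", n, v))
                .collect::<Vec<_>>()
                .join("; ");
            out.push(("Cookie".to_string(), joined));
        }
        if let Some(ua) = &self.user_agent {
            out.push(("User-Agent".to_string(), ua.clone()));
        }
        Ok(out)
    }
}
```

Keeping the validation in one resolve step means the HTTP fast path and the browser path could both consume the same checked list, so the two paths would not drift apart on what counts as a conflict.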
Pipeline / network
empty_html_shell auto-escalation
RenderMode::Auto currently escalates only on status ≥ 400 or connect
failure. SPAs that return 200 with an empty <div id="root"></div> shell
slip through, and the agent gets useless body content unless it sets
--render always defensively — which negates the whole point of auto.
In pipeline::execute’s Auto arm, after the Ok(r) success branch
with status < 400, inspect the in-memory body bytes already on hand:
if content-type is HTML and the body has a <script> tag plus either no
<body> element or fewer than ~200 visible characters, escalate. Set
Trace.escalation_reason = "empty_html_shell" so the agent sees why.
Keep the threshold conservative — false positives are worse than false
negatives.
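The heuristic above could look roughly like the following. This is a sketch under stated assumptions: the function names are hypothetical, the 200-character threshold is taken from the "~200" in the note, and the tag stripping is deliberately naive (no entity decoding, no comment handling).

```rust
/// Threshold from the note above; the exact value is an assumption.
const MIN_VISIBLE_CHARS: usize = 200;

/// Counts non-whitespace characters outside tags and outside
/// <script>/<style> element bodies. Naive on purpose.
fn visible_char_count(html: &str) -> usize {
    let lower = html.to_ascii_lowercase();
    let mut rest = lower.as_str();
    let mut count = 0;
    while !rest.is_empty() {
        if rest.starts_with("<script") || rest.starts_with("<style") {
            let close = if rest.starts_with("<script") { "</script" } else { "</style" };
            // Jump to the closing tag; an unclosed element swallows the rest.
            rest = match rest.find(close) {
                Some(pos) => &rest[pos..],
                None => "",
            };
        } else if rest.starts_with('<') {
            // Skip a whole tag, including its attributes.
            rest = match rest.find('>') {
                Some(pos) => &rest[pos + 1..],
                None => "",
            };
        } else {
            // rest is non-empty here, so there is always a next char.
            let ch = rest.chars().next().unwrap();
            if !ch.is_whitespace() {
                count += 1;
            }
            rest = &rest[ch.len_utf8()..];
        }
    }
    count
}

/// True when a successful HTML response looks like an unrendered SPA shell:
/// it has a <script> tag and either no <body> or very little visible text.
pub fn looks_like_empty_shell(content_type: &str, body: &[u8]) -> bool {
    if !content_type.to_ascii_lowercase().contains("text/html") {
        return false;
    }
    let lower = String::from_utf8_lossy(body).to_ascii_lowercase();
    if !lower.contains("<script") {
        return false;
    }
    match lower.find("<body") {
        None => true,
        Some(pos) => visible_char_count(&lower[pos..]) < MIN_VISIBLE_CHARS,
    }
}
```

Requiring a script tag before escalating keeps static pages with short bodies on the fast path, which matches the preference for false negatives over false positives.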
Error contract
wait_selector_unmatched error code
ErrorCode::NavigationTimeout is currently overloaded: both “HTTP
overall timeout” and “selector never appeared” share the same code. An
agent that retries a navigation_timeout will retry “slow site” cases
correctly but will also blindly retry “selector typo” cases forever.
Add ErrorCode::WaitSelectorUnmatched. Emit it only from the
Wait::Selector polling branch in pipeline.rs::wait_for_condition
(currently maps to NavigationTimeout via the CdpTimeout translation
near line 258). Wait::Load/Idle/Ms and the overall HTTP fetch
timeout keep emitting NavigationTimeout. Update the error-codes table
in architecture.md §11 and the “Agent should” table in reference.md.
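The split could be expressed as a single mapping from the wait kind to the code emitted on timeout. The enum shapes below are assumptions for illustration (the real ErrorCode and Wait types have more variants and may carry different payloads), and should_retry is a hypothetical agent-side policy, not part of the SDK.

```rust
/// Reduced stand-in for the real ErrorCode enum; only the two codes
/// discussed here are modeled.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ErrorCode {
    NavigationTimeout,
    WaitSelectorUnmatched,
}

impl ErrorCode {
    /// Wire-level token as it would appear in error_code.
    pub fn as_str(self) -> &'static str {
        match self {
            ErrorCode::NavigationTimeout => "navigation_timeout",
            ErrorCode::WaitSelectorUnmatched => "wait_selector_unmatched",
        }
    }
}

/// Stand-in for the Wait modes named above; payload types are assumed.
#[derive(Debug, Clone)]
pub enum Wait {
    Load,
    Idle,
    Ms(u64),
    Selector(String),
}

/// Code to emit when waiting on `wait` times out. Only the selector
/// polling branch gets the new code.
pub fn timeout_code(wait: &Wait) -> ErrorCode {
    match wait {
        Wait::Selector(_) => ErrorCode::WaitSelectorUnmatched,
        Wait::Load | Wait::Idle | Wait::Ms(_) => ErrorCode::NavigationTimeout,
    }
}

/// Hypothetical agent policy: retry slow sites, never a selector that
/// did not match, since a typo will not fix itself.
pub fn should_retry(code: ErrorCode) -> bool {
    matches!(code, ErrorCode::NavigationTimeout)
}
```

With the exhaustive match, adding a new Wait variant later forces a decision about which code it emits instead of silently inheriting NavigationTimeout.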
Deferred
Stable escalation_reason token vocabulary
The agent-facing value is low until there is a second meaningful token
beyond empty_html_shell. The current free-form strings (http status <n>,
http failed: <error_code>) are documented as opaque, and agents
that pattern-match on them keep working. Revisit when a new escalation
case lands that an agent would actually branch on.
HTTP main-body size cap
Real OOM risk on multi-GB direct downloads, but no one has hit it.
Stream-and-cap the resp.bytes().await at pipeline.rs:150-153 once
there is a concrete report or a use case that needs binary downloads.
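For when this gets picked up, the cap itself is small. The sketch below uses blocking std::io::Read as a stand-in for the async response body, so it shows the shape of the check rather than the actual reqwest plumbing; read_capped and CapError are hypothetical names.

```rust
use std::io::{self, Read};

/// Hypothetical error type for this sketch.
#[derive(Debug)]
pub enum CapError {
    /// The body is larger than the configured cap.
    TooLarge { cap: usize },
    Io(io::Error),
}

/// Reads the whole body into memory, but never buffers more than
/// cap + 1 bytes regardless of how large the source is.
pub fn read_capped<R: Read>(reader: R, cap: usize) -> Result<Vec<u8>, CapError> {
    let mut buf = Vec::new();
    // One extra byte distinguishes "exactly cap" from "over cap".
    let limit = (cap as u64).saturating_add(1);
    reader
        .take(limit)
        .read_to_end(&mut buf)
        .map_err(CapError::Io)?;
    if buf.len() > cap {
        return Err(CapError::TooLarge { cap });
    }
    Ok(buf)
}
```

An async version would accumulate chunks and stop as soon as the running total passes the cap; the off-by-one trick carries over unchanged.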
Promote cheap Trace fields
render_used: bool and render_mode: "none"|"auto"|"always" are cheap
to plumb and useful for agent retry logic. Add opportunistically when
the next pipeline change is in the area. phase_ms, http_version,
remote_addr need real reqwest/CDP timing plumbing — own change.
Visibility-aware Wait::Selector
Current existence-only semantics are documented honestly after the
Phase-1 truth-up. If frameworks-with-deferred-display turn out to be
common enough that agents want it, add a new wait mode
selector-visible:<css> rather than changing existing behavior.
Observation artifact
DONE - selector_hint_unique: bool on each node
Implemented: observation nodes now include selector_hint_unique whenever
selector_hint is present, and the protocol reference documents the field.
The JS projection in src/sdk/fetch/artifacts/observation.rs builds a
best-effort CSS path (tag.class:nth-of-type(n) > …) capped at four
ancestors. On deep DOMs the resulting selector may match multiple
elements. Today the agent has to verify uniqueness itself.
Add a selector_hint_unique boolean computed in the same JS pass:
document.querySelectorAll(hint).length === 1. Skip the field when the
hint itself is null. Agents can then decide between “use selector
directly” and “fall back to coordinates / accessibility node id”.
DONE - Cross-link iframe nodes to their frames[] entry
Implemented: iframe observation nodes now include frame_ref pointing to the
matching frames[].frame_id, assigned by DOM iframe order.
nodes includes entries with role: "iframe" (DOM tag projection) and
frames lists every iframe with its own frame_id. They’re not joined,
so an agent that wants to “click inside the iframe” must match by src
or position.
Add a frame_ref field to iframe nodes pointing at the matching
frames[].frame_id. Generation: in the JS pass, after collecting
frames, walk iframe nodes and assign the same id-by-index.
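The real join happens in the JS pass, but the id-by-index rule is easy to state in Rust for reference. Node and Frame below are reduced, hypothetical shapes carrying only the fields involved, and link_iframes is an illustrative name.

```rust
/// Reduced stand-in for a frames[] entry.
#[derive(Debug)]
pub struct Frame {
    pub frame_id: String,
}

/// Reduced stand-in for an observation node.
#[derive(Debug)]
pub struct Node {
    pub role: String,
    pub frame_ref: Option<String>,
}

/// Assigns frame_ref by DOM order: the n-th iframe node gets the n-th
/// frame's id. Non-iframe nodes are left untouched.
pub fn link_iframes(nodes: &mut [Node], frames: &[Frame]) {
    let mut remaining = frames.iter();
    for node in nodes.iter_mut().filter(|n| n.role == "iframe") {
        // If there are more iframe nodes than frames, the extras get None.
        node.frame_ref = remaining.next().map(|f| f.frame_id.clone());
    }
}
```

The index join only holds if nodes and frames are collected in the same DOM order; if either list is filtered or truncated independently, the references would shift.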
DONE - Fix <select> role mapping
Implemented: select role projection now reports default/sized/multiple
<select> controls as listbox, and preserves combobox only when
aria-haspopup is present.
roleFor() maps <select> to combobox. WAI-ARIA spec only assigns
combobox when the element has aria-haspopup or is editable; the
default for <select> (single-select) is listbox. The mismatch can
confuse LLMs that expect ARIA semantics.
Change to:
- <select multiple> → listbox
- <select> with size > 1 → listbox
- <select> → combobox only when aria-haspopup is set; otherwise default
  to listbox
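The mapping rule above, written out as a decision function. The actual roleFor() lives in the browser-side JS; this Rust version only restates the rule, and select_role with its parameters is a hypothetical signature.

```rust
/// Role for a <select> element under the rule listed above.
/// `size` is the parsed size attribute, if present.
pub fn select_role(multiple: bool, size: Option<u32>, aria_haspopup: bool) -> &'static str {
    // Multi-select and sized selects render as an inline list.
    if multiple || size.map_or(false, |s| s > 1) {
        return "listbox";
    }
    // A plain select keeps combobox only with an explicit aria-haspopup.
    if aria_haspopup {
        "combobox"
    } else {
        "listbox"
    }
}
```

Checking multiple and size first means aria-haspopup cannot turn a sized or multi-select control back into a combobox.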
DONE - Extract the observation JS into an asset file
Implemented: OBSERVATION_JS now uses include_str! to load
assets/observation/snapshot.js, keeping the browser projection in a
standalone asset file.
The OBSERVATION_JS const in observation.rs is ~100 lines of inlined
JavaScript. It’s readable but loses local lint/format/test. Move it to
assets/observation/snapshot.js and load via include_str!. Optionally
add a small browser-side smoke test (run the JS against a tiny static
HTML page and validate the output shape).
DONE - Capture clickable non-semantic elements
Implemented: the observation snapshot appends a bounded set of non-semantic
elements whose computed cursor is pointer, and architecture.md §8
documents the scan/budget rule.
Today the selector list is a[href], button, input, textarea, select, summary, iframe, [role], [tabindex], [contenteditable=true]. Pages
written in modern frameworks frequently use bare <div onclick> or
<span> with cursor styles. These never appear in the observation.
Consider also collecting elements with cursor: pointer computed style,
gated by a depth cap or a max-node budget to keep payload size bounded.
Document the rule in architecture.md §8 so it stays mechanical.
Pipeline / network
DONE - Surface a “main request finished” debug hint
Implemented: fetch traces now include main_request_observed, true for
HTTP-only responses and browser fetches with a collected main document
network entry, false when the browser path never saw one.
NetworkCollector::wait_for_main_status now resolves on either status
or failure. When it returns None (no main entry observed within the
500ms cap) the only signal is a warning string. Promote this into a
dedicated trace.main_request_observed: bool field on FetchResult so
agents can distinguish “request never started” (data: URL, blocked by
extension) from “request started but slow”.
DONE - Trace escalation_reason for navigate failures
Implemented: HTTP fast-path connection failures now classify into the same
fine-grained DNS/TLS/target buckets used for browser navigation, so
auto-render trace.escalation_reason can carry the specific code.
When the auto-render path falls back to the browser,
Trace.escalation_reason records something like
http failed: host_unreachable. With the
new fine-grained codes (dns_resolution_failed, tls_error,
target_unreachable) the reason should propagate the specific code, not
the bucket. Today http_only only knows the reqwest-level error; the
mapping in http_only could be tightened to match the browser-side
classifier so the trace strings stay parallel across paths.
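A sketch of how the specific code could flow into the trace string. The ConnectFailure enum is a hypothetical classification result; how a reqwest error gets sorted into these buckets is not shown and would need to mirror the browser-side classifier. The code tokens and the "http failed: " prefix are the ones named above.

```rust
/// Hypothetical result of classifying an HTTP fast-path connect failure.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ConnectFailure {
    Dns,
    Tls,
    Target,
    /// Anything the classifier could not pin down falls back to the bucket.
    Other,
}

/// Fine-grained error code for a failure, with the bucket as fallback.
pub fn error_code(failure: ConnectFailure) -> &'static str {
    match failure {
        ConnectFailure::Dns => "dns_resolution_failed",
        ConnectFailure::Tls => "tls_error",
        ConnectFailure::Target => "target_unreachable",
        ConnectFailure::Other => "host_unreachable",
    }
}

/// Value to record in trace.escalation_reason when auto-render escalates
/// after an HTTP failure.
pub fn escalation_reason(failure: ConnectFailure) -> String {
    format!("http failed: {}", error_code(failure))
}
```

Deriving the reason from the same code table as the error itself keeps the trace string and the error_code from disagreeing.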
SDK ergonomics
DONE - Single CDP connection per Client
Implemented: Client now opens one lazy cached CDP connection for fetch
and cdp calls, exposes Client::close().await, and detaches one-shot
flattened target sessions after each operation. architecture.md §12
documents multi-attach implications.
Each fetch / cdp call opens a fresh Connection to /cdp. For
multi-step agent workflows (login → navigate → screenshot) this adds
handshake overhead and prevents target reuse without manual --tab
plumbing. Investigate a connection cache on Client keyed by endpoint,
with explicit Client::close() to tear down.
This is a behavior change; document the implications for ops-panel multi-attach before shipping.
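The caching idea, reduced to its core: open lazily on first use, hand out the same connection afterwards, and drop it on close. This is a synchronous, std-only sketch with one endpoint per client; the real Client is async, and Connection, handshake_count, and the endpoint URL in any usage are placeholders rather than the actual types.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

/// Placeholder for the real CDP connection type.
#[derive(Debug)]
pub struct Connection {
    pub endpoint: String,
}

/// Sketch of a client that opens at most one live connection at a time.
pub struct Client {
    endpoint: String,
    cached: Mutex<Option<Arc<Connection>>>,
    /// Counts simulated handshakes so the reuse is observable.
    handshakes: AtomicUsize,
}

impl Client {
    pub fn new(endpoint: &str) -> Self {
        Client {
            endpoint: endpoint.to_string(),
            cached: Mutex::new(None),
            handshakes: AtomicUsize::new(0),
        }
    }

    /// Returns the cached connection, opening one on first use.
    pub fn connection(&self) -> Arc<Connection> {
        let mut slot = self.cached.lock().unwrap();
        if let Some(conn) = slot.as_ref() {
            return Arc::clone(conn);
        }
        // A real implementation would perform the WebSocket handshake here.
        self.handshakes.fetch_add(1, Ordering::SeqCst);
        let conn = Arc::new(Connection {
            endpoint: self.endpoint.clone(),
        });
        *slot = Some(Arc::clone(&conn));
        conn
    }

    /// Drops the cached connection; the next call reconnects.
    pub fn close(&self) {
        self.cached.lock().unwrap().take();
    }

    pub fn handshake_count(&self) -> usize {
        self.handshakes.load(Ordering::SeqCst)
    }
}
```

Holding the lock across the open step serializes concurrent first calls, which avoids two handshakes racing for the same slot at the cost of blocking other callers while the connection is established.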
Documentation
DONE - Reference doc for the new error codes
Implemented: docs/reference.md now enumerates every error_code with a
representative detail/errorText and an “Agent should” action.
docs/reference.md should grow a section enumerating every error_code
with a one-line example errorText and an “agent should” line. Today
the only reference is the table in architecture.md §11, which lists
the codes but not the cases that produce them.
DONE - CLI doc for --tab reuse semantics
Implemented: docs/cli.md now documents default target allocation/close
versus --tab <id> target reuse, and the clap help text explains that a
reused tab is left open.
docs/cli.md (and the --help text) should call out that
afhttp fetch --tab <id> reuses the target and leaves it open on exit,
contrasted with default (which allocates + closes). Today the behavior
is correct but undocumented; agents using --tab for session reuse
benefit from explicit confirmation.