Design
Problem
AI agents call HTTP APIs through bash tool calls. With curl, every request spawns a new process, pays a full TCP+TLS handshake, and returns human-readable text that must be parsed. Agents need structured JSON output and — when making multiple calls — connection reuse.
The cost of curl-per-request
Agent                 curl process                Server
│                           │                        │
├─ spawn curl ─────────────→│                        │
│                           ├─ TCP handshake ───────→│
│                           ├─ TLS handshake ───────→│
│                           ├─ HTTP request ────────→│
│                           │←──── HTTP response ────┤
│←── stdout (text) ─────────┤                        │
│                           ╳ (process exits)        │
│                                                    │
├─ spawn curl ─────────────→│                        │  ← another process
│                           ├─ TCP handshake ───────→│  ← another handshake
│                           ├─ TLS handshake ───────→│  ← another TLS
│                           ├─ HTTP request ────────→│
│                           │←──── HTTP response ────┤
│←── stdout (text) ─────────┤                        │
│                           ╳ (process exits)        │
Ten requests to the same host mean 10 TCP handshakes plus 10 TLS negotiations. On a 200 ms RTT link, at one round trip each for TCP and TLS 1.3, that is 4 seconds of pure overhead.
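The 4-second figure can be checked with a short calculation. This sketch assumes one round trip for the TCP handshake and one for a TLS 1.3 handshake (TLS 1.2 would need two); the function name and parameters are illustrative.

```python
def handshake_overhead_s(connections: int, rtt_ms: float, tls_round_trips: int = 1) -> float:
    """Seconds spent on connection setup across the given number of new connections."""
    tcp_round_trips = 1  # SYN / SYN-ACK must complete before data can be sent
    per_connection_ms = (tcp_round_trips + tls_round_trips) * rtt_ms
    return connections * per_connection_ms / 1000.0

# curl-per-request: 10 requests open 10 connections.
fresh_s = handshake_overhead_s(10, 200)
# Connection reuse: 10 requests share 1 connection.
reused_s = handshake_overhead_s(1, 200)
```

Under these assumptions `fresh_s` is 4.0 and `reused_s` is 0.4, so reuse removes about 3.6 seconds of setup time from a ten-request workflow.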
Two Modes
CLI mode (default)
One bash tool call, one request, one JSON response, process exits:
Agent ──→ afhttp GET https://api.example.com/users ──→ JSON stdout ──→ Agent
Default output: response or error — one JSON line, process exits. For streaming: chunk_start → chunk_data... → chunk_end. Use --verbose for diagnostic output (startup, request, progress, retry, redirect).
This is how most agent tool calls work — fire a request, read the result, move on.
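An agent-side reader for CLI-mode output might look like the sketch below. It assumes every stdout line is a JSON object whose `code` field names the event type; the source shows `code: "error"`, and the other values are inferred from the event names listed above. The `status` and `data` fields in the samples are hypothetical.

```python
import json

# Assumption: these "code" values mark the last line the process writes.
TERMINAL_CODES = {"response", "error", "chunk_end"}

def read_cli_events(stdout_text: str) -> list[dict]:
    """Parse JSON lines up to and including the first terminal event."""
    events = []
    for line in stdout_text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        event = json.loads(line)
        events.append(event)
        if event.get("code") in TERMINAL_CODES:
            break
    return events

# Hypothetical sample output for a plain request and a streamed one.
single = '{"code":"response","status":200}\n'
streamed = (
    '{"code":"chunk_start"}\n'
    '{"code":"chunk_data","data":"a"}\n'
    '{"code":"chunk_data","data":"b"}\n'
    '{"code":"chunk_end"}\n'
)

single_codes = [e["code"] for e in read_cli_events(single)]
streamed_codes = [e["code"] for e in read_cli_events(streamed)]
```

With a real process, `stdout_text` would come from something like `subprocess.run(["afhttp", "GET", url], capture_output=True, text=True).stdout`.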
Pipe mode (--mode pipe)
For workflows that benefit from connection reuse, concurrent requests, or WebSocket:
Agent ──→ afhttp --mode pipe (stdin JSONL ←→ stdout JSONL) ──→ Agent
Pipe mode is a long-lived process. The agent sends request/config/send/cancel/close commands as JSONL to stdin and reads responses from stdout. Connections stay open between requests, and multiple requests can be in flight simultaneously. close triggers shutdown by cancelling active work, waiting briefly for terminal events, then emitting a final close acknowledgement.
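A pipe-mode session can be sketched as a sequence of JSONL lines written to stdin. The command names come from the list above, but the wire format is an assumption here: the `code` key for the command type and the `method` and `url` fields are hypothetical, as is the second URL.

```python
import json

def encode_command(code: str, **fields) -> str:
    """Serialize one command as a single JSONL line (json.dumps escapes newlines)."""
    return json.dumps({"code": code, **fields}, separators=(",", ":")) + "\n"

# Hypothetical session: two requests in flight, one cancelled, then shutdown.
session = [
    encode_command("request", id="req-1", method="GET", url="https://api.example.com/users"),
    encode_command("request", id="req-2", method="GET", url="https://api.example.com/orders"),
    encode_command("cancel", id="req-2"),
    encode_command("close"),
]
stdin_payload = "".join(session)
```

In practice the agent would write these lines to the stdin of a process started with `subprocess.Popen(["afhttp", "--mode", "pipe"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)` and read events from its stdout until the close acknowledgement arrives.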
Architecture
CLI mode:                     Pipe mode:
argv ──→ parse_args()         stdin ──────────→ Request Parser (JSONL)
        │                                    │
        ▼                                    ▼
  ┌────────────┐              ┌─────────────────────────────┐
  │ reqwest    │              │ Connection Pool Manager     │
  │ Client     │              │ pool[host1] ─→ conn(h2)     │──→ host1:443
  │ (single    │──→ server    │ pool[host2] ─→ conn(h2)     │──→ host2:443
  │ request)   │              │ pool[host3] ─→ conn(h1)     │──→ host3:80
  └────────────┘              └─────────────────────────────┘
        │                                    │
        ▼                                    ▼
  stdout (JSON)               stdout ←──── Response Writer (JSONL)
All runtime protocol output goes to stdout as JSON. stderr is not a protocol channel.
Shared core
Both modes share the same handler, chunked streaming, and WebSocket code. CLI mode builds a single request from argv, sends it through the same execute_request() path, and collects output via the same mpsc channel — just stripping id/tag fields before writing.
Concurrency (pipe mode)
stdin reader (main task)
│
├─ parse request 1 ──→ spawn tokio task ──→ client.send() ──→ write stdout
├─ parse request 2 ──→ spawn tokio task ──→ client.send() ──→ write stdout
├─ parse request 3 ──→ spawn tokio task ──→ client.send() ──→ write stdout
│
└─ (continues reading stdin without blocking)
Each request is an independent tokio task. The stdin reader never blocks on HTTP I/O. Responses are written to stdout as they complete, identified by id.
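The same shape can be illustrated with Python's asyncio as a stand-in for tokio. This is an analogy, not the Rust implementation: `fake_send` replaces the HTTP call, the `delay_s` field is invented to simulate server latency, and a list stands in for stdout.

```python
import asyncio
import json

async def fake_send(request: dict) -> dict:
    # Stand-in for the HTTP round trip; delay_s simulates server latency.
    await asyncio.sleep(request["delay_s"])
    return {"code": "response", "id": request["id"]}

async def run_pipe(request_lines: list[str]) -> list[dict]:
    written: list[dict] = []  # stands in for stdout

    async def handle(request: dict) -> None:
        written.append(await fake_send(request))

    tasks = []
    for line in request_lines:  # the "stdin reader": spawns a task, never awaits I/O
        tasks.append(asyncio.create_task(handle(json.loads(line))))
    await asyncio.gather(*tasks)
    return written

lines = [
    json.dumps({"id": "req-1", "delay_s": 0.06}),
    json.dumps({"id": "req-2", "delay_s": 0.02}),
    json.dumps({"id": "req-3", "delay_s": 0.04}),
]
completed = asyncio.run(run_pipe(lines))
completion_order = [event["id"] for event in completed]
```

Responses arrive in completion order rather than submission order, which is why each one carries the `id` of the request it answers.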
Design Principles
Server errors are errors
If the server violates the HTTP protocol (e.g. sends non-ASCII bytes in a header), afhttp surfaces this as code: "error" with error_code: "invalid_response". There is no silent patching and no lossy fallback. The agent receives accurate information and decides how to react.
Errors are structured, not human text
Every error carries error_code (machine-readable, stable), error (human-readable detail), and retryable (bool). Agents match on error_code instead of parsing the human-readable error string.
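On the agent side, the three fields allow a decision without any string parsing. A minimal sketch follows; `invalid_response` is the code named above, while `connect_timeout` and both error messages are hypothetical samples.

```python
import json

def next_action(line: str) -> str:
    """Decide what to do with one output line, using only structured fields."""
    event = json.loads(line)
    if event.get("code") != "error":
        return "ok"
    if event.get("error_code") == "invalid_response":
        return "report"  # the server broke the protocol; retrying is unlikely to help
    if event.get("retryable"):
        return "retry"
    return "fail"

protocol_violation = json.dumps({
    "code": "error",
    "error_code": "invalid_response",
    "error": "non-ASCII bytes in response header",
    "retryable": False,
})
transient = json.dumps({
    "code": "error",
    "error_code": "connect_timeout",  # hypothetical code for illustration
    "error": "connection timed out",
    "retryable": True,
})

actions = [next_action(protocol_violation), next_action(transient)]
```

The `error` text is only ever logged or shown to a human; control flow depends on `error_code` and `retryable` alone.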
Secret fields are redacted in config echo
All stdout lines go through agent_first_data::output_json_with() for consistent single-line JSON formatting with explicit redaction policy. Config/log output (startup, config, log) uses full _secret redaction so key_pem_secret never appears in plain text in config echo.
Server response data (response bodies, headers, WebSocket messages) is passed through unmodified. Redaction does not apply to server-originated content.
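The split between redacted config echo and untouched server data can be sketched as follows. This is not the agent_first_data implementation: the `***` mask, the function names, and the sample values are assumptions, and only the `_secret` suffix rule and the startup/config/log event names come from the text above.

```python
def redact_config(value, mask="***"):
    """Return a copy with every value under a *_secret key replaced by the mask."""
    if isinstance(value, dict):
        return {
            key: mask if key.endswith("_secret") else redact_config(item, mask)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact_config(item, mask) for item in value]
    return value

def prepare_output(event: dict) -> dict:
    """Redact config/log events; pass server-originated events through unchanged."""
    if event.get("code") in {"startup", "config", "log"}:
        return redact_config(event)
    return event

config_echo = {"code": "config", "key_pem_secret": "PRIVATE-KEY-MATERIAL", "cacert_file": "ca.pem"}
server_event = {"code": "response", "body": {"api_secret": "value-from-server"}}

redacted = prepare_output(config_echo)
passed_through = prepare_output(server_event)
```

The second event keeps its `api_secret` value because it is server data: the agent asked for it and needs it exactly as sent.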
Header scope safety boundary
defaults.headers_for_any_hosts is global and applies to every outbound host. It is restricted to non-sensitive public headers only (for example User-Agent, Accept).
Any credential material (Authorization, API keys, cookies, bearer tokens) must be scoped with host_defaults[host].headers so secrets cannot be sent to unrelated domains.
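The scoping rule amounts to a per-host merge. In this sketch the config paths `defaults.headers_for_any_hosts` and `host_defaults[host].headers` follow the text above; the function name, the sample values, and the choice to let host-specific headers override global ones are assumptions.

```python
from urllib.parse import urlsplit

def headers_for(url: str, config: dict) -> dict:
    """Merge global headers with the headers scoped to the URL's host."""
    host = urlsplit(url).hostname or ""
    headers = dict(config.get("defaults", {}).get("headers_for_any_hosts", {}))
    scoped = config.get("host_defaults", {}).get(host, {}).get("headers", {})
    headers.update(scoped)  # assumption: host-specific entries win on conflict
    return headers

config = {
    "defaults": {
        "headers_for_any_hosts": {"User-Agent": "agent/1.0", "Accept": "application/json"},
    },
    "host_defaults": {
        "api.example.com": {"headers": {"Authorization": "Bearer example-token"}},
    },
}

trusted = headers_for("https://api.example.com/users", config)
unrelated = headers_for("https://cdn.example.net/logo.png", config)
```

A request to the unrelated host carries only the public headers, so a redirect or a mistyped URL cannot leak the token.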
Agent-First Data naming conventions for fields
Field names carry meaning through suffixes:
| Suffix | Meaning | Example |
|---|---|---|
| _ms | milliseconds | duration_ms, retry_base_delay_ms |
| _s | seconds | timeout_connect_s, timeout_idle_s |
| _bytes | byte count | response_save_above_bytes, received_bytes |
| _file | file path | body_file, cacert_file, key_file |
| _base64 | base64-encoded bytes | body_base64, data_base64 |
| _pem | inline PEM-format text | cacert_pem, cert_pem, key_pem_secret |
| _secret (at end) | sensitive value, auto-redacted in output | key_pem_secret |
Inline and file-path variants are mutually exclusive per slot: setting one clears the other in stored config. Inline takes precedence when both are present in a patch.
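The mutual-exclusion rule can be sketched as a patch function. The slot pairings below (for example `cacert_pem` with `cacert_file`) are assumed from the field names in the table, and the function handles only slot fields, ignoring everything else in the patch.

```python
# Assumed (inline, file) pairs; each pair is one "slot".
SLOTS = [("cacert_pem", "cacert_file"), ("key_pem_secret", "key_file")]

def apply_patch(stored: dict, patch: dict) -> dict:
    """Apply slot fields from a patch, keeping at most one variant per slot."""
    result = dict(stored)
    for inline_key, file_key in SLOTS:
        if inline_key in patch:
            # Inline wins when both are present in the same patch.
            result[inline_key] = patch[inline_key]
            result.pop(file_key, None)
        elif file_key in patch:
            result[file_key] = patch[file_key]
            result.pop(inline_key, None)
    return result

switch_to_inline = apply_patch({"cacert_file": "ca.pem"}, {"cacert_pem": "PEM-TEXT"})
both_in_patch = apply_patch({}, {"cacert_pem": "PEM-TEXT", "cacert_file": "ca.pem"})
switch_to_file = apply_patch({"cacert_pem": "PEM-TEXT"}, {"cacert_file": "new.pem"})
```

After any patch the stored config holds a single source of truth per slot, so there is never a question of which certificate is actually in use.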
CLI flags: long only, no abbreviations
CLI flags use long form only (--header, --body, --timeout-idle-s). No single-letter short flags (-H, -b). This is deliberate — agents read and write flags by name, not by memorized shortcuts. Long flags are self-describing and less error-prone in generated commands.
CLI flag names correspond to JSON field names with hyphens replacing underscores (e.g. JSON timeout_idle_s → CLI --timeout-idle-s, JSON body_base64 → CLI --body-base64).
Boolean flags that default to false are bare flags (--verbose, --chunked, --tls-insecure). Boolean flags that default to true take an explicit value (--response-parse-json false, --response-decompress false).
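Because the mapping is mechanical, an agent can derive flags from field names. A minimal sketch of the two rules above; the helper names are hypothetical, and whether every boolean flag has a matching JSON field is not stated in this document.

```python
def to_cli_flag(field: str) -> str:
    """JSON field name to long CLI flag: underscores become hyphens."""
    return "--" + field.replace("_", "-")

def bool_flag_args(field: str, value: bool, default: bool) -> list[str]:
    """Build argv entries for a boolean option according to its default."""
    flag = to_cli_flag(field)
    if value == default:
        return []  # nothing to pass when the default already applies
    if default is False:
        return [flag]  # bare flag turns the option on
    return [flag, "false"]  # default-true options take an explicit value

timeout_flag = to_cli_flag("timeout_idle_s")
body_flag = to_cli_flag("body_base64")
verbose_args = bool_flag_args("verbose", True, default=False)
parse_args = bool_flag_args("response_parse_json", False, default=True)
```

These yield `--timeout-idle-s`, `--body-base64`, `["--verbose"]`, and `["--response-parse-json", "false"]`, matching the examples in the text.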
Output formats via --output
CLI mode supports three output formats via --output json|yaml|plain:
- json (default): Single-line JSON via agent_first_data::output_json_with(). Config/log fields use full _secret redaction; server response payload fields remain raw.
- yaml: Multi-line YAML via agent_first_data::output_yaml(). Field name suffixes stripped (duration_ms → duration), values formatted (10485760 → "10.0MB").
- plain: Logfmt via agent_first_data::output_plain(). Same suffix stripping and value formatting as YAML but single-line.
Server response body is never modified. Non-string body values (parsed JSON objects/arrays) are serialized to a JSON string before passing to yaml/plain formatters, so the formatters treat them as opaque strings. This ensures the agent receives exact server data regardless of output format.
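The body handling can be illustrated with a small pre-processing step. This is a sketch, not the afhttp or agent_first_data code: the `body` field name, the compact separators, and both function names are assumptions. The byte formatter only reproduces the one example given above and does not claim to match the library's unit selection.

```python
import json

def prepare_for_text_formatter(event: dict) -> dict:
    """Copy an event, turning a parsed JSON body into an opaque JSON string."""
    prepared = dict(event)
    body = prepared.get("body")
    if body is not None and not isinstance(body, str):
        prepared["body"] = json.dumps(body, separators=(",", ":"))
    return prepared

def format_megabytes(byte_count: int) -> str:
    """Format a byte count in units of 1024 * 1024 bytes with one decimal."""
    return f"{byte_count / (1024 * 1024):.1f}MB"

parsed = prepare_for_text_formatter({"code": "response", "body": {"users": [{"id": 1}]}})
text = prepare_for_text_formatter({"code": "response", "body": "plain text"})
size_label = format_megabytes(10485760)
```

Once the body is a string, suffix stripping and value formatting cannot reach into it, so a server field such as `size_bytes` inside the body is left exactly as the server named it.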
No unwrap / expect / panic anywhere in the codebase
#![deny(clippy::unwrap_used, clippy::expect_used, clippy::panic)] is enforced at crate level. Every error case is handled explicitly — either propagated as a structured error output to the agent, or (for truly impossible cases) handled with a hardcoded fallback string rather than a panic.
Dependencies
| Crate | Purpose |
|---|---|
| tokio | Async runtime, stdin reader, task spawning |
| reqwest | HTTP client with connection pooling and HTTP/2 |
| tokio-tungstenite | WebSocket client (upgrade handshake, framed read/write) |
| clap | CLI argument parsing (derive API) |
| agent-first-data | Agent-First Data output serialization with automatic _secret redaction |
| serde_json | JSON parsing and serialization |
| base64 | Body encoding/decoding |
| uuid | Process-unique download directory |
Future
- HTTP/3 (QUIC): eliminates TCP head-of-line blocking, 0-RTT reconnection. Waiting for hyper-h3 stabilization.
- WebSocket TLS config: apply custom TLS settings to WebSocket connections (currently uses system root store only).
- Request pipelines: declare request dependencies ("after": "req-1") for sequential workflows.
- Response caching: optional ETag/Last-Modified caching per URL.