Giving agents hands: a CLI that lets coding agents explore any API

The Problem: APIs at Scale

Large enterprise platforms accumulate APIs the way cities accumulate roads. Start with a few clean endpoints, add microservices and gateway layers over time, and before long you have 2,000+ operations spanning 100+ domains — accounts, tickets, conversations, deployments, permissions, analytics, AI services. Each with its own authentication requirements, pagination patterns, and data shapes.

That surface doesn’t stay still. Endpoints are added, renamed, split behind gateways, or routed to new backing services. Documentation goes stale the moment teams ship new use cases, and those use cases often change request and response shapes in subtle ways — one payload adds nested filters, another returns a new envelope, a third changes how pagination tokens are emitted.

For a human developer, this is manageable. You open the documentation, search for what you need, try a few requests in Postman, read the response, iterate. The process is slow but navigable.

For a coding agent, this API surface is a wall.

Human path

Traditional API exploration:

Open API docs in a browser
Scroll, search, click through pages
Copy endpoint URLs into Postman
Manually construct request bodies
Read response, adjust, retry
Build mental model over hours or days

Slow, but works. The human brain fills in gaps, infers patterns, and adapts.

Agent problem

What a coding agent faces:

Can’t open a browser or click through docs
Can’t scroll or visually scan
Needs exact endpoint paths, methods, and schemas
Must construct requests programmatically
Needs structured, parseable responses
Has no way to discover what’s available

The agent is powerful but blind. It can execute, but it can’t explore.

Why Agents Struggle with APIs

This isn’t just about documentation format. There are structural reasons why traditional API tooling fails coding agents:

Why Traditional API Tools Fail Agents

| Challenge | Why It Breaks Agents |
| --- | --- |
| Interactive interfaces | Postman, Swagger UI, and API consoles require mouse clicks, form fills, and visual navigation. Agents work through text. |
| No incremental discovery | Most API docs are a flat list. There’s no hierarchical path from “what domains exist” to “what can I do in this domain” to “what does this endpoint accept.” |
| Unstructured outputs | Pretty-printed HTML responses, mixed content types, and inconsistent error formats make parsing unreliable. |
| Permission opacity | Agents can’t easily tell which endpoints they’re allowed to call. Trial-and-error against auth barriers wastes tokens and time. |
| Pagination complexity | Cursor-based pagination requires manual state management: follow the cursor, accumulate results, handle edge cases. Most agents give up after the first page. |
| RPC-style patterns | Many enterprise APIs use POST for everything: reads, writes, searches. The REST mental model (GET = read, POST = write) breaks down. |

The Core Insight: CLIs as Agent Interfaces

The answer was hiding in plain sight. CLIs are the original text-in, text-out interface. They’re composable, scriptable, and non-interactive by default. Every coding agent already knows how to run shell commands.

But most CLIs are designed for human ergonomics — colorful output, interactive prompts, confirmation dialogs. The insight wasn’t “build a CLI.” It was: build a CLI where the OpenAPI specification IS the interface.

This means:

Zero hard-coded routes. Every endpoint comes from the spec. Add an endpoint to the spec, and it’s immediately available in the CLI. No code changes.
The CLI is always complete. If the API has 1,000+ endpoints, the CLI has 1,000+ commands. No manual mapping, no forgotten endpoints, no drift between docs and reality.
Discovery is built in. The same spec that defines endpoints also provides their schemas, parameters, and descriptions — all queryable from the terminal.
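
As a rough illustration of what "every endpoint comes from the spec" can mean in practice, the sketch below walks the `paths` object of an OpenAPI document and yields one command descriptor per operation. It is written in Python purely for illustration. The `iter_operations` helper, the descriptor keys, and the two-endpoint `SPEC` are assumptions, not the tool's actual code.

```python
from typing import Iterator

# Operation keys defined for an OpenAPI Path Item Object.
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def iter_operations(spec: dict) -> Iterator[dict]:
    """Yield one command descriptor per operation found in an OpenAPI document."""
    for path, path_item in spec.get("paths", {}).items():
        for method, operation in path_item.items():
            if method.lower() not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters" or "summary"
            yield {
                "command": path,
                "method": method.upper(),
                "summary": operation.get("summary", ""),
            }


# Hypothetical two-endpoint spec, for illustration only.
SPEC = {
    "openapi": "3.0.0",
    "paths": {
        "/accounts.list": {"get": {"summary": "List accounts"}},
        "/works.create": {"post": {"summary": "Create a work item"}},
    },
}

commands = list(iter_operations(SPEC))
```

Because nothing in the loop names a specific route, adding a path to the spec adds a command without touching the code.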

Design Decisions That Enable Agency

Every design choice was evaluated through a dual lens: does this work for a human developer at a terminal, AND does this work for a coding agent running commands non-interactively? Seven decisions stood out:

Design Decisions and Their Agent Impact

| Decision | Human Benefit | Agent Benefit |
| --- | --- | --- |
| Spec-driven commands | Always up-to-date with API changes | Self-discovery: agent can list all available operations |
| Structured output (`--json`) | Clean data for scripting | Reliable parsing via a `{status, body, headers}` envelope |
| TTY detection | Interactive picker when in terminal | Auto non-interactive mode when agent runs commands |
| Dry-run & curl generation | Preview before executing | Safe exploration: understand requests without side effects |
| Auto-pagination | No manual cursor management | Complete result sets in one command |
| Typed field coercion | Less shell escaping hassle | Predictable input handling for booleans, numbers, JSON, file refs |
| Permission-as-config | Project-level safety policies | Bounded exploration: agent sees only allowed endpoints |
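
Typed field coercion is the least self-explanatory row in the table, so here is a minimal Python sketch of one way it could work. The coercion rules, the `coerce_field` and `parse_fields` names, and the treatment of `@path` as "load this file as JSON" are all assumptions for illustration. The real CLI's rules may differ.

```python
import json
from pathlib import Path


def coerce_field(raw: str):
    """Turn one raw command-line value into a typed Python value."""
    if raw.startswith("@"):
        # Assumed convention: "@path" means "read this file as JSON".
        return json.loads(Path(raw[1:]).read_text(encoding="utf-8"))
    try:
        value = json.loads(raw)
    except json.JSONDecodeError:
        return raw  # not valid JSON, so keep it as a plain string
    # Accept booleans, numbers, objects, and arrays; anything else
    # (null, quoted strings) stays a raw string.
    return value if isinstance(value, (bool, int, float, dict, list)) else raw


def parse_fields(pairs: list[str]) -> dict:
    """Parse repeated key=value arguments into a request body."""
    body = {}
    for pair in pairs:
        key, _, value = pair.partition("=")
        body[key] = coerce_field(value)
    return body


body = parse_fields(
    ["limit=5", "archived=false", "title=Fix login bug", 'tags=["auth", "ui"]']
)
```

The point of the design is that an agent can pass `limit=5` and get an integer in the request body without quoting or escaping anything.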

TTY Detection: The Dual-Mode Pattern

TTY detection is the most subtle of these decisions. When run in a terminal (TTY), the CLI offers an interactive fuzzy-search picker: type a few characters, see matching endpoints, select one. When run non-interactively (piped, backgrounded, or by an agent), it silently switches to structured output mode, with no prompts, no confirmations, and no color codes that break parsing.

The agent never needs to know about this. It just works.
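
A minimal sketch of the dual-mode check, in Python for illustration. The `output_mode` function and the mode names are hypothetical. The sketch takes the stream as a parameter because the same test applies whether a tool inspects stdin or stdout.

```python
import io
import sys
from typing import TextIO


def output_mode(stream: TextIO = sys.stdout) -> str:
    """Return "interactive" when attached to a terminal, else "structured"."""
    return "interactive" if stream.isatty() else "structured"


# An in-memory buffer is never a TTY, much like the captured output of a
# subprocess started by an agent.
captured = io.StringIO()
mode = output_mode(captured)
```

With one check at startup, the rest of the program can branch on `mode` instead of asking callers to pass a flag.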

Structured Output Envelope

In non-interactive mode, every response follows the same shape:

{"status": 200, "body": {"works": [...], "next_cursor": "abc"}, "headers": {...}}

Status codes, response bodies, and headers — all in a predictable envelope. The agent can write one parser that works for every endpoint. Contrast this with raw curl, where you’d need to parse status from headers, handle different content types, and deal with error formats that vary by endpoint.
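
The "one parser" idea can be sketched as follows, in Python for illustration. The envelope keys come from the example above. The `unwrap` helper, the `ApiError` class, and the rule that any non-2xx status is an error are assumptions.

```python
import json


class ApiError(Exception):
    """Raised when the envelope reports a non-2xx status."""

    def __init__(self, status: int, body):
        super().__init__(f"request failed with status {status}")
        self.status = status
        self.body = body


def unwrap(raw: str):
    """Parse a {status, body, headers} envelope and return the body on success."""
    envelope = json.loads(raw)
    status = envelope["status"]
    if not 200 <= status < 300:
        raise ApiError(status, envelope.get("body"))
    return envelope["body"]


sample = '{"status": 200, "body": {"works": [], "next_cursor": "abc"}, "headers": {}}'
body = unwrap(sample)
```

The same function handles every endpoint because only the contents of `body` vary, never the wrapper around it.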

Permission Boundaries

A regex-based allow/deny system controls which endpoints are visible and executable. Configuration lives in settings files — project-level or global — not in code. This means:

Safe exploration: Give an agent access to read-only endpoints while blocking mutations
Progressive trust: Start with a narrow allowlist, widen as confidence grows
No code changes: Permissions are config, not logic
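
One plausible shape for such a policy check is sketched below in Python. The settings layout, the `is_allowed` name, and the rule that deny patterns take precedence over allow patterns are assumptions, since the article does not specify them.

```python
import re


def is_allowed(endpoint: str, allow: list[str], deny: list[str]) -> bool:
    """Deny patterns win; otherwise the endpoint must match an allow pattern."""
    if any(re.search(pattern, endpoint) for pattern in deny):
        return False
    return any(re.search(pattern, endpoint) for pattern in allow)


# Hypothetical read-only policy: only .list and .get operations are visible,
# and the permissions domain is blocked entirely.
SETTINGS = {
    "allow": [r"\.(list|get)$"],
    "deny": [r"^/permissions\."],
}

can_list = is_allowed("/works.list", SETTINGS["allow"], SETTINGS["deny"])
can_create = is_allowed("/works.create", SETTINGS["allow"], SETTINGS["deny"])
```

Widening trust then means editing the `allow` list, for example by adding a pattern for `.create`, with no change to the checking code.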

The Observability Surface for APIs

Before this tool, understanding the API landscape meant reading documentation pages one by one. After, the entire surface is queryable from the terminal.

Stage 1: Discover the landscape

`api domains` reveals API domains with endpoint counts, showing which areas are large, specialized, or worth exploring first.

Stage 2: Narrow and filter

`api list --domain works` collapses the search space from thousands of endpoints to a focused domain-level list.

Stage 3: Understand and execute

`--dry-run`, `--generate=curl`, and `--fields` make every request inspectable before and after execution.

Discover: Run `api domains` to map the surface before guessing endpoints
Filter: Use `api list --domain works` to shrink the search space
Preview: Dry-run the command or generate curl before making the request
Execute: Invoke with structured output and optional field selection
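
The first two steps, mapping domains and then filtering to one of them, can be sketched in a few lines of Python. The sketch assumes the domain is the part of the path before the first dot (so `/works.list` belongs to `works`), which matches the examples in this article but is not confirmed as the tool's rule. The helper names and sample paths are hypothetical.

```python
from collections import Counter


def domain_of(path: str) -> str:
    """Extract the assumed domain prefix, e.g. "/works.list" -> "works"."""
    return path.lstrip("/").split(".", 1)[0]


def domain_counts(paths: list[str]) -> Counter:
    """Count endpoints per domain, as a domain overview might."""
    return Counter(domain_of(path) for path in paths)


def list_domain(paths: list[str], domain: str) -> list[str]:
    """Return only the endpoints that belong to one domain."""
    return sorted(path for path in paths if domain_of(path) == domain)


PATHS = ["/works.list", "/works.create", "/accounts.list", "/works.get"]
counts = domain_counts(PATHS)
works = list_domain(PATHS, "works")
```

Each step shrinks what the agent has to read: first a handful of domain names with counts, then only the endpoints inside the chosen domain.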

Before vs. After: API Exploration Workflow

| Step | Before (Manual) | After (CLI) |
| --- | --- | --- |
| Find available endpoints | Browse docs, Ctrl+F, scan pages | `api list --domain accounts --format json` |
| Understand request shape | Read API reference, find examples | `api /accounts.list --dry-run` |
| Test an endpoint | Copy to Postman, fill fields, send | `api /accounts.list -F limit=5` |
| Get paginated results | Write cursor loop in code | `api /accounts.list --paginate` |
| Extract specific fields | Parse full response in code | `api /accounts.list --fields id,display_name` |
| Generate integration code | Manually write fetch/axios calls | `api /accounts.list --generate=curl` |
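
To make the last row concrete, here is a hedged Python sketch of how a request description might be rendered as an equivalent curl command without being sent. The `to_curl` helper and the `https://api.example.com` base URL are hypothetical. Authentication headers are omitted because the article does not describe the auth scheme.

```python
import json
import shlex
from typing import Optional


def to_curl(method: str, url: str, body: Optional[dict] = None) -> str:
    """Render a request as a shell-safe curl command without executing it."""
    parts = ["curl", "-X", method.upper(), url]
    if body is not None:
        parts += ["-H", "Content-Type: application/json", "--data", json.dumps(body)]
    # shlex.join quotes each argument so the result can be pasted into a shell.
    return shlex.join(parts)


request_body = {"title": "Fix login bug", "type": "issue"}
command = to_curl("post", "https://api.example.com/works.create", request_body)
```

Because the function only builds a string, calling it has no side effects, which is what makes a preview mode safe for exploring write endpoints.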

Patterns for Agent-Friendly CLI Design

The techniques that emerged from this project are generalizable. Any CLI that wants to serve both humans and coding agents can apply these patterns:

Agent-Friendly CLI Design Patterns

| Pattern | Implementation | Why It Matters |
| --- | --- | --- |
| TTY-aware dual mode | Detect `stdin.isTTY`; interactive prompts for humans, structured output for agents | One binary serves both audiences without flags |
| Structured envelope | Wrap responses in `{status, body, headers}` | Agents write one parser, not per-endpoint parsing |
| Spec-driven commands | Generate CLI surface from OpenAPI at runtime | CLI can’t drift from API; always complete and accurate |
| Incremental discovery | domains → endpoints → schema → invoke | Agents navigate API surface without documentation |
| Permission-as-config | Regex allow/deny in JSON settings files | Safety boundaries without code changes; progressive trust |
| Typed field coercion | Auto-parse booleans, numbers, JSON, file refs (`@file.json`) | Reduces agent friction with shell escaping and type mismatches |
| Preview-before-execute | `--dry-run` shows request; `--generate=curl` produces equivalent | Agents can plan and verify before committing to actions |
| Auto-pagination | `--paginate` follows cursors, combines pages | Agents get complete data sets without state management |
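
The auto-pagination pattern in the last row reduces to a short loop. The Python sketch below assumes each page carries a `next_cursor` field, as in the envelope example earlier, and that a missing or empty cursor marks the last page. The `paginate` helper and the in-memory `PAGES` backend are hypothetical stand-ins for real requests.

```python
from typing import Callable, Iterator, Optional


def paginate(
    fetch_page: Callable[[Optional[str]], dict], items_key: str
) -> Iterator[dict]:
    """Follow next_cursor until it is absent, yielding items from every page."""
    cursor = None
    while True:
        body = fetch_page(cursor)
        yield from body.get(items_key, [])
        cursor = body.get("next_cursor")
        if not cursor:
            break


# Fake two-page backend keyed by cursor, standing in for real HTTP calls.
PAGES = {
    None: {"works": [{"id": 1}, {"id": 2}], "next_cursor": "abc"},
    "abc": {"works": [{"id": 3}]},
}

all_works = list(paginate(lambda cursor: PAGES[cursor], "works"))
```

Keeping the cursor state inside the tool is what lets an agent treat a multi-page result as a single command.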

What Changes When Agents Can Explore

The most significant impact isn’t speed — it’s autonomy. When a coding agent can discover, understand, and invoke APIs without human guidance, the nature of the collaboration changes.

From “tell me the endpoint” to self-discovery

Without the CLI, a typical agent interaction looks like: “I need to list tickets. What’s the endpoint?” With it, the agent runs `api domains`, finds `works`, runs `api list --domain works`, finds `/works.list`, runs it with `--dry-run` to understand the schema, and executes. No human in the loop.

Integration workflows collapse

Building a feature that touches an unfamiliar API domain used to mean hours of documentation reading and Postman experimentation. Now the discovery-to-execution cycle is four commands. The agent handles the exploration; the developer focuses on the business logic.

The CLI is the documentation

New developers (and agents) don’t need to find and read API docs. The CLI itself is queryable documentation: `api list` is the endpoint catalog, `--dry-run` is the request reference, and `--fields` is the response guide. The API is self-describing through its tooling.

Safe exploration in production

The permission model lets agents explore real environments with bounded risk. Set up a read-only allowlist, point the agent at a staging environment, and let it learn the API surface. When ready, widen permissions for write operations. The progression from read-only exploration to full access is controlled, auditable, and reversible.

Using the CLI as an agent skill

The CLI compounds in value when wired into how Claude Code operates. A companion skill defines exactly when and how to use it, so Claude follows a documented workflow rather than inferring API mechanics from context.

CLAUDE.md as the entry point

CLAUDE.md is read at the start of every Claude Code session. A reference to the companion skill is enough to activate it:

## API access
When the task requires reading from or writing to any API endpoint, use the @cli-for-agents skill.

Claude treats this as a standing instruction. Any task involving API data triggers skill lookup before attempting direct execution.

What the skill defines

The skill is designed around the CLI’s own design principles: non-interactive first, incremental discovery, and fail usefully with correct examples. It teaches Claude three things:

When to use the CLI: any task that requires listing, filtering, reading, or writing API resources, such as “fetch open issues,” “create a work item,” or “list accounts by domain.”

How to discover the right endpoint: the four-step pattern the CLI is built around, matching the discover, filter, preview, execute flow described earlier:

my-cli api domains                         # map the API surface
my-cli api list --domain <name>            # scope to a relevant domain
my-cli api <endpoint> --dry-run            # inspect before executing
my-cli api <endpoint> -F key=value --json  # execute with structured output

How to handle the response: the skill specifies the `--json` envelope shape (`status`, `body`, `headers`), instructs Claude to check `status` before processing `body`, and sets three safe defaults. Use `--dry-run` before executing any write operation, `--fields` to limit response size on large result sets, and `--paginate` when completeness matters.

How Claude completes the task

With the skill active, “create a ticket for the login bug” becomes a four-step uninterrupted sequence. No human guidance needed for API mechanics:

my-cli api domains
# → finds 'works' domain

my-cli api list --domain works
# → finds /works.create — POST

my-cli api /works.create --dry-run
# → confirms required fields: title, type

my-cli api /works.create -F title="Fix login bug" -F type=issue --json
# → status: 201 — id: ISS-123

Claude discovers the surface, verifies the shape, and executes. The skill provides the method; the CLAUDE.md reference makes that method available at session start.

Takeaways

Each iteration of building this tool reinforced a set of principles about designing for the agent era:

Spec-driven design eliminates drift

Generating every command from the OpenAPI spec made the CLI complete, accurate, and trustworthy for autonomous agents.

TTY detection is the simplest dual-mode pattern

A single TTY check lets humans get interactive prompts while agents get structured JSON and silent operation.

Agents need incremental discovery

Domains, filtered endpoint lists, and request previews create a navigable path where a flat endpoint catalog would fail.

Permission-as-config enables progressive trust

Settings-level permissions allow cautious read-only exploration first, then wider access as confidence grows.

Preview modes prevent expensive mistakes

Dry-run and curl generation let agents understand mutation requests before executing them.

The best agent tools are also great human tools

Structured output, auto-pagination, and incremental discovery improved the developer experience as much as agent autonomy.
