Agent skills: progressive disclosure as a context budget solution

Agent Skills solve a fundamental tension in LLM applications: you want your agent to be capable of many things, but every capability you add to the system prompt consumes context window tokens. Skills sidestep this with a progressive disclosure architecture rooted in filesystem access.

The three-level loading system

Skills use three loading levels. Each level costs more tokens than the one before it, but that cost is paid only when the level is actually needed:

Skill loading levels

Level	Content	When loaded	Token cost
Level 1	Metadata (name and description, including when to use the Skill)	Always, in system prompt	~100 tokens per Skill
Level 2	SKILL.md with full instructions	When the skill is triggered by context	Under 5,000 tokens
Level 3+	Bundled resources, scripts, templates	As needed during execution	Effectively unlimited (nothing enters context until accessed)

Why this scales

The key insight: you can install dozens of Skills without a meaningful context penalty. Only the metadata lives in the system prompt. The full instructions load only when the agent determines that a Skill is relevant to the current task, and supporting files and scripts load only when the agent explicitly accesses them.

This means an agent can carry capabilities for PowerPoint generation, Excel analysis, Word document creation, PDF processing, and dozens of custom Skills — all at a baseline cost of roughly 100 tokens each.
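To make the scaling claim concrete, the following sketch works through the arithmetic using the approximate figures from the table above (about 100 tokens of metadata per Skill, and up to 5,000 tokens for a triggered SKILL.md). The function names are illustrative, and the "eager" comparison is a hypothetical baseline in which every Skill's full instructions are inlined into the system prompt.

```python
# Rough context-budget arithmetic using the figures quoted in the table.
# Both constants are approximations, not measured values.
METADATA_TOKENS_PER_SKILL = 100   # Level 1: always in the system prompt
MAX_INSTRUCTION_TOKENS = 5_000    # Level 2: upper bound for one SKILL.md


def baseline_cost(installed: int) -> int:
    """Tokens spent on metadata alone, before any Skill is triggered."""
    return installed * METADATA_TOKENS_PER_SKILL


def worst_case_cost(installed: int, triggered: int) -> int:
    """Upper bound once `triggered` Skills have loaded their instructions."""
    if triggered > installed:
        raise ValueError("cannot trigger more Skills than are installed")
    return baseline_cost(installed) + triggered * MAX_INSTRUCTION_TOKENS


def eager_cost(installed: int) -> int:
    """Hypothetical alternative: every SKILL.md inlined up front."""
    return installed * MAX_INSTRUCTION_TOKENS


n = 30
print(baseline_cost(n))        # 3000
print(worst_case_cost(n, 2))   # 13000
print(eager_cost(n))           # 150000
```

Under these assumptions, thirty installed Skills with two of them triggered stay at or below about 13,000 tokens, against roughly 150,000 if everything were loaded eagerly.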

The filesystem as capability boundary

Skills are fundamentally filesystem-based. Each Skill is a directory containing:

my-skill/
  SKILL.md          # Level 2: instructions loaded when triggered
  script.py         # Level 3: executable code loaded as needed
  template.pptx     # Level 3: reference file loaded as needed
  data/schema.json  # Level 3: supporting data loaded as needed

Skills run in a code execution VM with filesystem access and bash. This means a Skill can execute arbitrary Python, read and write files, and produce artifacts. In the API environment that access is scoped to the Skill's directory; in Claude Code it extends to the full workspace.
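As a rough illustration of how a Level 1 index could be assembled from directories like the one above, the sketch below scans a skills folder and pulls a name and description out of each SKILL.md. It assumes each SKILL.md opens with a frontmatter block delimited by `---` lines and containing flat `name:` and `description:` entries. That layout, the function names, and the fallback to the directory name are assumptions made for this example, not a description of how the actual loader works.

```python
from pathlib import Path


def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse flat `key: value` pairs between the leading '---' delimiters."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no frontmatter block (assumed format not present)
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def collect_metadata(skills_root: Path) -> list[dict[str, str]]:
    """Build a small always-loaded record for each Skill directory."""
    index = []
    for skill_md in sorted(skills_root.glob("*/SKILL.md")):
        meta = parse_frontmatter(skill_md.read_text(encoding="utf-8"))
        index.append({
            "name": meta.get("name", skill_md.parent.name),
            "description": meta.get("description", ""),
            "path": str(skill_md.parent),  # where Level 2 and 3 content lives
        })
    return index
```

Only the short records returned by `collect_metadata` would need to sit in the system prompt; the body of each SKILL.md and the other files in the directory are left on disk until something asks for them.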

API skills vs. Claude Code skills

Skill environment differences

Capability	API Skills	Claude Code Skills
Network access	None	Full
Package installation	Pre-installed only	Runtime `pip install`
Filesystem access	Skill directory only	Full workspace
Bash execution	Yes	Yes
Use case	Controlled, secure production environments	Flexible development and automation

API Skills trade off flexibility for security: no network access prevents data exfiltration, and no runtime package installation keeps the environment predictable. Claude Code Skills assume a developer’s machine where full access is appropriate.
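One practical consequence for scripts bundled with a Skill: where runtime installation is unavailable, a script can verify its dependencies up front and stop with a clear message instead of attempting an install. A minimal sketch, assuming a hypothetical dependency list (`REQUIRED_MODULES`) and using the standard library's `importlib.util.find_spec`:

```python
import importlib.util
import sys

# Hypothetical dependency list for an illustrative Skill script.
# Top-level module names only; dotted names need their parent importable.
REQUIRED_MODULES = ["json", "csv"]


def missing_modules(names: list[str]) -> list[str]:
    """Return the modules that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def check_dependencies(names: list[str]) -> bool:
    """Report missing modules on stderr; return True when all are present."""
    missing = missing_modules(names)
    if missing:
        # No runtime `pip install` is assumed here, so report and stop.
        print(f"Missing pre-installed packages: {', '.join(missing)}",
              file=sys.stderr)
        return False
    return True


if check_dependencies(REQUIRED_MODULES):
    print("All dependencies available")
```

The same script works unchanged in a more permissive environment, where a failed check could instead be followed by an install step.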

Pre-built skills ecosystem

Anthropic ships pre-built Skills for common document formats:

PowerPoint: Create and edit .pptx files
Excel: Analyze and generate spreadsheets
Word: Create and modify .docx documents
PDF: Extract text, fill forms, merge documents

Each comes with domain-specific instructions in its SKILL.md that encode format knowledge — no need to prompt-engineer PowerPoint’s XML schema into your agent.

The progressive disclosure pattern beyond skills

The three-level architecture generalizes to any LLM system dealing with the context budget problem:

Always loaded: The minimum information needed to decide relevance
Conditionally loaded: Full instructions, loaded on trigger
On-demand loaded: Supporting resources, loaded by explicit access

This pattern applies equally to RAG systems (metadata → chunks → full documents), tool catalogs (tool names → schemas → implementation details), and multi-agent systems (agent descriptions → system prompts → context documents).
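The three tiers can be sketched generically, independent of Skills. The example below models a small capability catalog in which summaries are always available, full instructions are fetched on first trigger and cached, and resources are fetched only on explicit request. The `Capability` and `Catalog` classes and the in-memory loader functions are hypothetical names chosen for illustration; a real system would back the loaders with files, a database, or a retrieval index.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Capability:
    name: str
    summary: str                           # tier 1: always in context
    load_instructions: Callable[[], str]   # tier 2: fetched on first trigger
    load_resource: Callable[[str], str]    # tier 3: fetched on explicit access
    _cached: Optional[str] = field(default=None, init=False, repr=False)

    def instructions(self) -> str:
        """Load full instructions once, then reuse the cached copy."""
        if self._cached is None:
            self._cached = self.load_instructions()
        return self._cached


class Catalog:
    def __init__(self, capabilities: list[Capability]) -> None:
        self._by_name = {c.name: c for c in capabilities}

    def always_loaded(self) -> str:
        """Tier 1 text: one short line per capability."""
        return "\n".join(f"{c.name}: {c.summary}" for c in self._by_name.values())

    def trigger(self, name: str) -> str:
        return self._by_name[name].instructions()

    def resource(self, name: str, key: str) -> str:
        return self._by_name[name].load_resource(key)


# Illustrative usage with in-memory loaders that record each load.
loads: list[str] = []


def pdf_instructions() -> str:
    loads.append("instructions")
    return "Step-by-step PDF instructions"


def pdf_resource(key: str) -> str:
    loads.append(f"resource:{key}")
    return f"contents of {key}"


catalog = Catalog([Capability("pdf", "Extract text and fill forms",
                              pdf_instructions, pdf_resource)])
print(catalog.always_loaded())   # pdf: Extract text and fill forms
catalog.trigger("pdf")
catalog.trigger("pdf")           # second call is served from the cache
print(loads)                     # ['instructions']
```

Passing loaders as callables keeps the catalog itself cheap to construct: nothing beyond the summaries is read until `trigger` or `resource` is called.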

Takeaways

Progressive disclosure controls context cost

Metadata loads by default, instructions load on trigger, and resources load only when execution needs them.

Capability count can scale independently

Each installed Skill costs roughly 100 tokens at baseline, so dozens can coexist, because full instructions stay out of context until a Skill is triggered.

Skills are filesystem packages

A Skill directory can contain `SKILL.md`, scripts, templates, and reference data that the agent accesses on demand.

API skills trade flexibility for security

No network access and a fixed set of pre-installed packages make hosted Skill execution more secure and more predictable than execution on a full developer workstation.

The pattern generalizes

RAG systems, tool catalogs, and multi-agent systems can use the same metadata-to-instructions-to-resources loading model.