Claude Code Skills: Building Reusable Knowledge Packages for AI Agents

Suppose a team is building a large project together. The frontend and backend developers need different documentation on coding standards, the risk analyst needs documentation on the specific methodology to use, and so on. Each team member needs documentation for their own role, but may still need access to documentation written for other roles. Together, that is 10,000 lines of domain documentation.

If the team plans to use AI to assist their work, they could dump all of it into CLAUDE.md. Every agent would then load every line at startup, consuming tens of thousands of tokens of context before a single line of code gets written. The CRDT specialist pays the token cost for Kubernetes documentation it will never use. The frontend agent carries risk analysis methodology it has no use for.

Claude Code Skills solve this with a pattern called progressive disclosure. Each skill is a folder containing a SKILL.md file and optional reference materials. At startup, Claude reads only the YAML frontmatter: a name and a description, costing roughly a few dozen tokens per skill. The full content loads only when the skill matches the current task, and deeper reference files load only when the agent needs that specific detail. With progressive disclosure, the project consumes only hundreds of tokens at startup instead of tens of thousands. At any given moment, an agent typically has one skill body and one or two reference files loaded. The knowledge is comprehensive and the context cost is minimal.

This article covers how skills work, how to design them effectively, and the patterns that make the difference between a skill that gathers dust and one that transforms agent productivity.

Figure: Progressive Disclosure

What Are Skills?

Skills are structured knowledge packages that live in your project’s .claude/skills/ directory. Each skill is a folder containing a SKILL.md file and optionally a reference/ directory with additional documentation and utility scripts.

Figure: Skills Directory Structure

The key design principle is that skills are discovered by metadata but loaded by need. At session startup, Claude reads only the YAML frontmatter of each SKILL.md. When the agent encounters a task where the skill is relevant, it loads the SKILL.md body. If it needs deeper detail, it loads specific reference files. This three-tier loading strategy keeps context lean while making comprehensive knowledge available.

How Progressive Disclosure Works

Let’s trace what happens when an agent encounters a task that requires CRDT (conflict-free replicated data type) knowledge.

Tier 1, Always loaded (startup): The agent’s context includes a skills index showing available skills:

```
Available skills:
- crdt-implementation: "Conflict-free replicated data type implementation
  patterns for real-time collaborative editing. Use when implementing
  document sync, conflict resolution, or operational transform logic."
- kubernetes-deployment: "Kubernetes deployment patterns and conventions..."
- portfolio-risk-analysis: "Portfolio risk assessment methodology..."
```

This costs about 200 tokens total for all skill descriptions. The agent sees what is available without loading any of the actual content.
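To make the Tier 1 step concrete, here is a minimal sketch of how an index like the one above could be built by reading only the frontmatter of each SKILL.md. It is not how Claude Code is implemented. The function names `read_frontmatter` and `build_index` are hypothetical, and the tiny parser only handles flat `key: value` pairs and folded (`>`) scalars, so a real loader would use a proper YAML parser.

```python
from pathlib import Path


def read_frontmatter(skill_md: Path) -> dict[str, str]:
    """Read only the frontmatter at the top of a SKILL.md file.

    Assumption: flat `key: value` pairs and folded (`>`) scalars only.
    Nested YAML (for example a hooks section) is not handled correctly.
    """
    fields: dict[str, str] = {}
    key = None
    with skill_md.open(encoding="utf-8") as fh:
        if fh.readline().strip() != "---":
            return fields  # file has no frontmatter
        for line in fh:
            if line.strip() == "---":
                break  # stop here: the body is never read
            if line.startswith((" ", "\t")) and key is not None:
                # Continuation line of a folded scalar: join with a space.
                fields[key] = (fields[key] + " " + line.strip()).strip()
            elif ":" in line:
                key, _, value = line.partition(":")
                key = key.strip()
                value = value.strip()
                fields[key] = "" if value in (">", "|") else value
    return fields


def build_index(skills_dir: Path) -> str:
    """Build the startup listing from every <skill>/SKILL.md under skills_dir."""
    lines = ["Available skills:"]
    for skill_md in sorted(skills_dir.glob("*/SKILL.md")):
        meta = read_frontmatter(skill_md)
        name = meta.get("name", skill_md.parent.name)
        lines.append(f'- {name}: "{meta.get("description", "")}"')
    return "\n".join(lines)
```

Calling `build_index(Path(".claude/skills"))` would produce a listing in the same shape as the one shown above, without ever reading a skill body.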

Tier 2, Loaded on relevance: When the agent’s current task involves implementing real-time sync, it recognizes that the crdt-implementation skill is relevant and reads the SKILL.md body. This might be 200-400 lines covering the overall approach, key decisions, and pointers to deeper references.

Tier 3, Loaded on specific need: When the agent needs to implement a specific conflict resolution strategy, it reads reference/conflict_resolution.md. When it needs Yjs-specific API patterns, it reads reference/yjs_patterns.md. Each reference file is loaded individually, only when needed.
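The three tiers can be pictured as a small lazy loader. The sketch below is an illustration under stated assumptions, not Claude Code's actual mechanism: the `Skill` class and its `loaded` log are hypothetical, and the only point is that each file is read at the moment it is asked for, never earlier.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Skill:
    """One skill folder whose content is read lazily, tier by tier."""

    root: Path
    description: str  # Tier 1: the only part held from startup
    loaded: list[str] = field(default_factory=list)  # files that entered context

    def body(self) -> str:
        """Tier 2: read SKILL.md only once the task matches the skill."""
        self.loaded.append("SKILL.md")
        return (self.root / "SKILL.md").read_text(encoding="utf-8")

    def reference(self, name: str) -> str:
        """Tier 3: read one reference file when that detail is needed."""
        self.loaded.append(f"reference/{name}")
        return (self.root / "reference" / name).read_text(encoding="utf-8")
```

After a session that touched only conflict resolution, `skill.loaded` would contain `SKILL.md` and `reference/conflict_resolution.md`, and nothing else from the skill folder.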

Figure: Three-Tier Progressive Disclosure

Progressive disclosure turns comprehensive documentation into efficient context. You can bundle 10,000 lines of domain knowledge into skills and pay only for the short descriptions until the knowledge is actually needed. The startup cost is proportional to the number of skills (their descriptions), not the total volume of knowledge they contain.
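A rough back-of-the-envelope model shows why the startup cost tracks the number of skills rather than their size. The four-characters-per-token figure is a common rule of thumb and an assumption here, not a real tokenizer, and the skill data is made up for illustration.

```python
CHARS_PER_TOKEN = 4  # assumption: rough rule of thumb, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Very rough token estimate from character count."""
    return len(text) // CHARS_PER_TOKEN


def startup_cost(skills: list[dict[str, str]]) -> int:
    """Tier 1 cost: only each skill's description is in context."""
    return sum(estimate_tokens(s["description"]) for s in skills)


def eager_cost(skills: list[dict[str, str]]) -> int:
    """Cost of loading every description and every body up front."""
    return sum(estimate_tokens(s["description"] + s["body"]) for s in skills)


# Hypothetical skills: identical descriptions, very different body sizes.
description = "Use when deploying services."
small = [{"description": description, "body": "x" * 4_000} for _ in range(3)]
large = [{"description": description, "body": "x" * 400_000} for _ in range(3)]

print(startup_cost(small), startup_cost(large))  # same value: bodies are not counted
print(eager_cost(small), eager_cost(large))      # grows with total volume
```

Adding more skills raises `startup_cost` linearly, while making existing skills larger does not change it at all.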

Executable Scripts: Computation Without Context Cost

Skills can include scripts in their reference/scripts/ directory that Claude runs without loading the source code into context. Only the script’s output consumes tokens. Consider a risk calculation script that is 200 lines of Python:

```python
import numpy as np
import pandas as pd
from scipy import stats


def compute_var(returns, confidence=0.95):
    """Compute Value at Risk using historical simulation."""
    # The (1 - confidence) quantile of returns, e.g. the 5th percentile at 95%.
    return np.percentile(returns, (1 - confidence) * 100)


def compute_cvar(returns, confidence=0.95):
    """Compute Conditional Value at Risk."""
    var = compute_var(returns, confidence)
    # Mean of the returns at or below the VaR threshold.
    return returns[returns <= var].mean()


# ... 180 more lines of computation (compute_beta is defined in this elided part) ...


if __name__ == "__main__":
    data = pd.read_csv("/data/portfolio_returns.csv")
    print(f"VaR (95%): {compute_var(data['returns']):.4f}")
    print(f"CVaR (95%): {compute_cvar(data['returns']):.4f}")
    print(f"Portfolio Beta: {compute_beta(data):.4f}")
```

If the agent loaded this source into context, it would consume approximately 2,000 tokens. Instead, the agent runs the script and receives 3 lines of output:

```
VaR (95%): -0.0234
CVaR (95%): -0.0389
Portfolio Beta: 1.1500
```
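The mechanism is ordinary process execution: the caller sees standard output and never reads the source file. As a minimal sketch, assuming the script is a Python file and using a hypothetical helper name `run_skill_script`, it could look like this:

```python
import subprocess
import sys
from pathlib import Path


def run_skill_script(script: Path, *args: str, timeout: float = 60.0) -> str:
    """Run a bundled script and return only what it printed to stdout.

    The script's source is never read by the caller, so its length has no
    effect on how much text comes back.
    """
    result = subprocess.run(
        [sys.executable, str(script), *args],
        capture_output=True,  # collect stdout and stderr instead of inheriting them
        text=True,            # decode bytes to str
        timeout=timeout,
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

A 200-line script that prints three lines returns three lines, which is the whole token cost of the computation.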

Figure: Script Efficiency

Anatomy of a SKILL.md File

A well-structured SKILL.md has three parts: YAML frontmatter (required), a body section (the main content), and references to bundled files.

The Frontmatter

The frontmatter is the skill’s index card: it is what Claude reads at startup to learn that the skill exists and when to use it. For example:

```yaml
---
name: crdt-implementation
description: >
  Conflict-free replicated data type (CRDT) implementation patterns
  for real-time collaborative editing. Use when implementing document
  sync, conflict resolution, or operational transform logic. Covers
  Yjs library patterns, document model design, and WebSocket sync.
---
```

The frontmatter has constraints:

Name constraints: Maximum 64 characters; lowercase letters, numbers, and hyphens only; no XML tags; no reserved words (“anthropic”, “claude”). The name should be descriptive enough that Claude can match it to relevant tasks. For example, crdt-implementation is clear but utils is not.
Description constraints: Must be non-empty, with a maximum of 1,024 characters and no XML tags. The description is the discovery mechanism: it is how Claude decides whether to load the skill. Write it as if you are telling a colleague when they should consult this reference. Include specific triggers like “Use when implementing…”, “Use when debugging…”, “Use when configuring…”.
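These constraints are easy to check mechanically before a skill is committed. The following is a minimal sketch, not an official validator. The function name is hypothetical, and two details are assumptions drawn from Anthropic's published skill authoring guidance rather than from this article: digits are allowed in names, and the reserved words are `anthropic` and `claude`, checked against the name.

```python
import re

NAME_PATTERN = re.compile(r"[a-z0-9-]+")      # assumption: digits are allowed
XML_TAG = re.compile(r"</?[A-Za-z][^>]*>")    # crude check for anything tag-like
RESERVED_WORDS = ("anthropic", "claude")      # assumption: checked on the name


def validate_frontmatter(name: str, description: str) -> list[str]:
    """Return a list of problems; an empty list means the fields look valid."""
    problems = []
    if len(name) > 64:
        problems.append("name is longer than 64 characters")
    if not NAME_PATTERN.fullmatch(name):
        problems.append("name must use only lowercase letters, digits, and hyphens")
    if any(word in name for word in RESERVED_WORDS):
        problems.append("name contains a reserved word")
    if not description.strip():
        problems.append("description is empty")
    if len(description) > 1024:
        problems.append("description is longer than 1,024 characters")
    if XML_TAG.search(description):
        problems.append("description contains an XML tag")
    return problems


print(validate_frontmatter("crdt-implementation", "Use when implementing document sync."))
print(validate_frontmatter("Utils_Helper", "<b>Helpful</b> patterns"))
```

The first call prints an empty list. The second reports the invalid name and the tag in the description.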

The Body

The SKILL.md body provides the working-level guidance that an agent needs to apply the skill effectively. Keep it under 500 lines. It is not a comprehensive reference manual; it is the information an experienced developer would want before starting implementation.

```markdown
# CRDT Implementation Guide

## Chosen Approach: Yjs

This project uses Yjs as the CRDT library for real-time collaboration.

## Document Model

The document is represented as a Y.Doc with the following structure:
- Y.XmlFragment for rich text content
- Y.Map for document metadata (title, author, last modified)
- Y.Array for version history entries

## Key Decisions

1. **Merge strategy**: Last-writer-wins for metadata, CRDT merge for content
2. **Persistence**: Y.Doc state is serialized to PostgreSQL on every change
3. **Transport**: WebSocket with binary encoding (more efficient than JSON)

## Common Patterns

### Creating a shared document
```

Notice the pointer to the reference file for deeper detail. The body gives enough context to start working and the reference provides the exhaustive detail for specific sub-problems.

Encode project-specific decisions, not general knowledge. Claude already knows what WebSockets are and how CRDTs work. Your skill should capture how this project uses them: the specific library, the connection pattern, the message format, the merge strategy. Project-specific decisions are what make skills valuable.

Bundled Reference Files

Reference files in the reference/ directory contain deep-dive documentation on specific topics. These are loaded individually when the agent needs that specific knowledge.

Keep references one level deep. A reference file should not point to another reference file that points to another. One level of progressive disclosure (SKILL.md to reference file) is the practical limit before agents get lost in a documentation tree.
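The one-level rule can be checked with a simple scan. This is a naive sketch under stated assumptions: the function name is hypothetical, reference files are assumed to be Markdown files directly inside `reference/`, and a "pointer" is detected by plain file-name matching, which can produce false positives when one file name is contained in another.

```python
from pathlib import Path


def nested_references(skill_dir: Path) -> list[tuple[str, str]]:
    """Return (source, target) pairs where one reference file mentions another."""
    files = sorted((skill_dir / "reference").glob("*.md"))
    names = [path.name for path in files]
    hits = []
    for path in files:
        text = path.read_text(encoding="utf-8")
        for other in names:
            # A reference file naming a sibling file suggests a second level.
            if other != path.name and other in text:
                hits.append((path.name, other))
    return hits
```

An empty result means every reference file stands on its own and is reachable directly from SKILL.md.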

Figure: Anatomy of a Skill

Three Types of Skills

Skills serve three distinct purposes, and designing for the right type determines how effective the skill will be.

Domain Knowledge Skills

This is the most common type. These encode specialized knowledge about a specific technology, methodology, or domain.

```yaml
---
name: portfolio-risk-analysis
description: >
  Portfolio risk assessment methodology including VaR, CVaR, beta computation, and hedging strategies. Use when computing risk metrics, assessing portfolio exposure, or recommending hedging actions.
---
```

Domain knowledge skills work best when they capture project-specific decisions rather than general knowledge.

Workflow Pattern Skills

These encode how to perform a multi-step process, covering what to do and in what order.

```yaml
---
name: safe-deployment
description: >
  Production deployment workflow with pre-flight checks, staged rollout, and automated rollback. Use when deploying any service to production or staging environments.
---
```

The body of a workflow skill reads like a playbook: pre-flight checklist, deployment stages with specific thresholds, rollback criteria, and pointers to the detailed rollback playbook in the reference directory. Workflow skills are particularly powerful when combined with embedded hooks. A deployment skill can include hooks that validate each step of the checklist, creating a self-enforcing workflow.

Utility Script Skills

These primarily provide executable tools that agents can run. The SKILL.md file explains when and how to use the scripts but the scripts themselves do the heavy lifting.

```yaml
---
name: data-quality-validation
description: >
  Data quality validation utilities. Use when ingesting data from external sources, after ETL transformations, or before loading data into production databases. Includes schema validation, completeness checks, and anomaly detection scripts.
---
```
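As an illustration of what one bundled script in such a skill might contain, here is a minimal completeness check. The function name, column names, and sample data are all hypothetical. The design point is that the script reduces a whole dataset to a few short output lines.

```python
import csv
import io


def completeness_report(csv_text: str, required: list[str]) -> dict[str, float]:
    """Fraction of rows with a non-empty value for each required column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    report = {}
    for column in required:
        # A missing column yields None for every row and counts as empty.
        filled = sum(1 for row in rows if (row.get(column) or "").strip())
        report[column] = filled / len(rows) if rows else 0.0
    return report


# Hypothetical sample data: one of four rows is missing a price.
sample = "id,price\n1,10\n2,\n3,5\n4,7\n"
for column, fraction in completeness_report(sample, ["id", "price"]).items():
    print(f"{column}: {fraction:.0%} complete")
```

The agent would see only the per-column summary lines, regardless of how many rows the input file holds.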

Figure: Three Skill Types

Skills with Embedded Hooks

The most advanced skill pattern combines knowledge, workflow instructions, and embedded hooks into a single distributable package. The skill tells the agent what to do, how to do it, and automatically validates that it was done correctly.

For example:

```yaml
---
name: api-endpoint-development
description: >
  API endpoint development patterns with automatic validation. Use when creating new REST API endpoints, modifying existing routes, or adding API middleware.
hooks:
  PostToolUse:
    - matcher: "Write|Edit"
      hooks:
        - type: command
          command: "$CLAUDE_PROJECT_DIR/.claude/skills/api-endpoint-development/scripts/validate_endpoint.sh"
  Stop:
    - matcher: "*"
      hooks:
        - type: command
          command: "$CLAUDE_PROJECT_DIR/.claude/skills/api-endpoint-development/scripts/check_api_tests.sh"
---
```

When an agent uses this skill, the embedded hooks activate automatically. Every file write triggers endpoint validation. Session completion requires passing API tests. The skill is self-contained with knowledge, workflow, and quality enforcement in one package.
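To give a feel for what a validation hook might do, here is a hedged sketch written in Python rather than the shell scripts named in the frontmatter. It rests on several assumptions that should be verified against the Claude Code hooks documentation: that the hook receives a JSON payload on stdin with the edited path under `tool_input.file_path`, and that exit code 2 signals a failure whose stderr message is relayed to the agent. The naming rule itself is purely hypothetical.

```python
import json
import re

SNAKE_CASE_FILE = re.compile(r"[a-z0-9_]+\.py")  # hypothetical project convention


def check_tool_use(payload: dict) -> tuple[int, str]:
    """Return (exit_code, message) for one hook payload."""
    file_path = payload.get("tool_input", {}).get("file_path", "")
    parts = file_path.replace("\\", "/").split("/")
    if "api" not in parts[:-1]:
        return 0, ""  # not under an api/ directory, nothing to check
    if SNAKE_CASE_FILE.fullmatch(parts[-1]):
        return 0, ""
    # Assumption: exit code 2 marks a failure reported back to the agent.
    return 2, f"{file_path}: endpoint modules must be snake_case .py files"


def main(stdin_text: str) -> tuple[int, str]:
    """Parse the JSON the hook is assumed to receive on stdin and run the check."""
    return check_tool_use(json.loads(stdin_text))


# Assumed wiring inside the actual hook script:
#   code, message = main(sys.stdin.read())
#   print(message, file=sys.stderr)
#   raise SystemExit(code)
```

Keeping the decision in a pure function such as `check_tool_use` makes the rule testable without starting an agent session.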

Figure: Self-Validating Skill Workflow

Skills with embedded hooks create self-validating workflows with knowledge, process, and quality enforcement bundled into a single distributable package. Install the skill, and you get the expertise and the guardrails.

Architectural Example

To illustrate how skills fit into a real architecture, consider a financial research agent team with specialized analysts. Each agent needs different domain knowledge, but the skills system ensures context stays efficient.

```
.claude/skills/
├── market-regime-detection/
│   ├── SKILL.md                    # What regimes exist, how to detect them
│   └── reference/
│       ├── regime_indicators.md    # VIX thresholds, correlation benchmarks
│       ├── historical_regimes.md   # Past regime changes for calibration
│       └── scripts/
│           └── compute_regime_signals.py
├── portfolio-risk-analysis/
│   ├── SKILL.md                    # Risk assessment methodology
│   └── reference/
│       ├── var_methodology.md      # VaR/CVaR computation approaches
│       ├── hedging_strategies.md   # Common hedging instruments and costs
│       └── scripts/
│           └── compute_risk_metrics.py
├── earnings-analysis/
│   ├── SKILL.md                    # How to analyze earnings reports
│   └── reference/
│       ├── earnings_template.md    # Standard analysis format
│       ├── key_metrics.md          # Revenue, EPS, guidance metrics
│       └── scripts/
│           └── parse_earnings_data.py
└── swarm-orchestration/
    ├── SKILL.md                    # How to reconfigure the swarm
    └── reference/
        ├── team_compositions.md    # Optimal team for each regime
        ├── handoff_protocol.md     # How agents hand off work
        └── reconfiguration_playbook.md
```

Figure: Example Architecture Using a Financial Swarm

At startup, all agents see the descriptions of all four skills (approximately 300 tokens). The risk monitor agent loads portfolio-risk-analysis when computing VaR. The earnings analyst loads earnings-analysis when processing quarterly reports. The team lead loads swarm-orchestration when regime changes require team reconfiguration. No agent ever loads skills it does not need.

The regime detection skill’s bundled script (compute_regime_signals.py) is particularly effective here. The script contains complex statistical computation (VIX analysis, correlation matrix calculation, sector dispersion measurement) that would consume significant context if loaded as source code. Instead, the agent runs the script and receives a compact JSON output of regime signals, consuming perhaps 20 tokens instead of 2,000.

Common Anti-Patterns

Every team building skills hits the same failure modes:

The kitchen-sink skill: a single skill that tries to cover everything from coding standards and deployment to testing, security, and performance. This defeats the purpose of progressive disclosure because the entire body loads whenever any sub-topic is relevant. Split it into focused skills.
The description-less skill: a skill with a vague description like “project utilities” or “helpful patterns.” Claude cannot match this to tasks reliably, so the skill rarely gets loaded when it is actually needed. Descriptions should be specific and action-oriented.
The copy-paste skill: a skill that duplicates content from CLAUDE.md or from another skill. This creates a maintenance burden and risks inconsistency. Each piece of knowledge should live in exactly one place.
The script-in-body skill: a skill that includes long utility scripts directly in the SKILL.md body instead of as bundled scripts. This wastes context tokens because the full script loads whenever the skill is referenced, even if the agent only needs to run it. Put scripts in reference/scripts/ where the agent can execute them without loading the source.
The infinite-depth skill: reference files that point to other reference files that point to more reference files. Agents get lost in deep documentation trees. Keep it to one level: SKILL.md to reference file.

Figure: Common Anti-Patterns of Skills

Best Practices

Write descriptions as discovery triggers. The description is how Claude decides whether to load a skill. Include specific task verbs: “Use when implementing…”, “Use when debugging…”, “Use when deploying…” If the description is vague (“general utilities”), Claude will not reliably match it to relevant tasks.
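A lightweight lint can catch descriptions that lack a trigger before they reach the repository. This is only a heuristic sketch: the function name is hypothetical, the eight-word minimum is an arbitrary assumption, and passing the check does not guarantee that Claude will match the skill well.

```python
import re

TRIGGER = re.compile(r"\buse when\b", re.IGNORECASE)
MIN_WORDS = 8  # hypothetical threshold for "probably too vague"


def description_issues(description: str) -> list[str]:
    """Return heuristic warnings about a skill description."""
    issues = []
    if not TRIGGER.search(description):
        issues.append('no "Use when ..." trigger phrase')
    if len(description.split()) < MIN_WORDS:
        issues.append(f"fewer than {MIN_WORDS} words, likely too vague")
    return issues


print(description_issues("project utilities"))
print(description_issues(
    "Production deployment workflow with pre-flight checks, staged rollout, "
    "and automated rollback. Use when deploying any service to production "
    "or staging environments."
))
```

The first description triggers both warnings and the second triggers none.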

Keep the SKILL.md body under 500 lines. The body should provide working-level guidance, not encyclopedic coverage. Anything deeper belongs in reference files. A 500-line body is enough for the key decisions, common patterns, a few code examples, and pointers to deeper references.
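The 500-line guideline is simple to enforce in a pre-commit check. The sketch below counts only the lines after the closing frontmatter delimiter. The helper names are hypothetical, and it assumes the frontmatter is delimited by `---` lines as in the examples in this article.

```python
MAX_BODY_LINES = 500


def body_line_count(skill_md_text: str) -> int:
    """Count the lines of a SKILL.md that follow the frontmatter block."""
    lines = skill_md_text.splitlines()
    if lines and lines[0].strip() == "---":
        for index, line in enumerate(lines[1:], start=1):
            if line.strip() == "---":
                return len(lines) - (index + 1)  # everything after the delimiter
    return len(lines)  # no frontmatter found: the whole file is body


def body_within_limit(skill_md_text: str) -> bool:
    """True when the body stays at or under the recommended size."""
    return body_line_count(skill_md_text) <= MAX_BODY_LINES
```

A body that fails this check is a candidate for moving detail into reference files.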

Use one level of progressive disclosure. SKILL.md to reference files is the practical limit. Do not create reference files that point to other reference files. If your knowledge structure is that deep, reorganize it into multiple skills or flatten the hierarchy.

Use executable scripts for computation. Any skill involving data processing, metric computation, or complex validation should include scripts. The agent runs the script and receives its output; the script source never enters context.

Match skills to agent roles. Design skills so that each agent typically needs only one or two skills for its role. If a single agent needs five skills loaded simultaneously, either the agent’s role is too broad or the skills are too granular.

Use consistent terminology. If your project calls something a “document” in CLAUDE.md, call it a “document” in your skills too. Inconsistent terminology confuses agents and reduces the reliability of skill discovery.

Avoid time-sensitive information. Skills should contain patterns and knowledge that remain stable. Do not include version numbers that change frequently, links that might break, or information about current market conditions.

Test skills with different models. A skill that works well with Opus might be too ambiguous for Haiku. Test your skills across the models your team uses to ensure the instructions are clear enough for the least capable model that will use them.

Conclusion

Skills represent a careful solution to the fundamental tension in agent systems between comprehensive knowledge and efficient context usage. By encoding domain expertise, workflow patterns, and utility scripts into progressively disclosed packages, skills give agents access to deep knowledge without the context cost of loading everything upfront.

The most effective skills share common characteristics: specific discovery-oriented descriptions, concise bodies under 500 lines that capture project-specific decisions, reference files for deep dives into specific topics, and executable scripts that keep computation out of context. When combined with embedded hooks, skills become self-validating workflows that bundle knowledge, process, and quality enforcement into distributable packages.

Figure: Skills in the Agent Ecosystem

As agent teams grow in size and tackle more complex domains, skills become the knowledge infrastructure that makes specialization practical. Without skills, every agent would need to carry every piece of domain knowledge in its context. With skills, each agent loads exactly the knowledge it needs, exactly when it needs it.

The most valuable skill you can write is the one that captures the decisions your team has already made: the specific choices, conventions, and patterns that no external documentation covers. That is the knowledge that turns a general-purpose AI into a domain specialist.

The Series

This is part 2 of a five-part series on Claude Code:

Claude Autonomous Coding Overview --- The control layer architecture that makes coding reliable
Building Effective Claude Code Agents: From Definition to Production --- Agent definitions, tool restrictions, and least privilege
Claude Code Skills: Building Reusable Knowledge Packages for AI Agents (this article) --- Progressive disclosure and reusable knowledge packages
Claude Code Hooks: The Deterministic Control Layer for AI Agents --- PreToolUse, PostToolUse, and deterministic enforcement
Claude Code Agent Teams: Building Coordinated Swarms of AI Developers --- Defense-in-depth
