TL;DR
Claude Code subagents are isolated Claude instances with their own context window, tools, and system prompt. Use one when a side task would otherwise dump a pile of logs, search results, or file bodies into your main conversation. They shine for parallel exploration, cheap-model routing, and enforcing read-only boundaries, and they stop being worth the setup cost the moment the task is small enough that copying the output back by hand is easier.
Why I bothered learning subagents
I ignored subagents for months. The feature shipped, I read the release notes, I thought “cute” and went back to writing prompts in one big context. Then I hit a codebase audit where Claude burned 70% of the context window reading files I’d never look at again, and the actual recommendations came out thin because there was no room left to think. Second try with a dedicated repo-explorer subagent: same audit, recommendations three times as detailed, main session context barely moved.
Subagents are a context-management tool first and a cost-management tool second. Everything else (parallelism, tool scoping, model routing) follows from those two.
What a subagent actually is
Each subagent is a Markdown file with YAML frontmatter, stored in .claude/agents/ (project) or ~/.claude/agents/ (user). Claude Code loads them at session start. When a request matches a subagent’s description, Claude delegates: the subagent runs in a fresh context window, uses whatever tools you allowed it, and returns only a summary to the main thread.
The minimal file is embarrassingly short:
```markdown
---
name: repo-explorer
description: Explore and summarize an unfamiliar codebase. Use proactively before planning changes.
tools: Read, Grep, Glob, Bash
model: haiku
---
You are a repo-exploration specialist. When given a path or question,
walk the codebase with Read, Grep, and Glob. Return a concise summary:
directory layout, build and test commands, entry points, and any
obvious architectural patterns. Do not edit files.
```
The frontmatter says who it is and what it can touch; the body is the system prompt. Claude Code also ships three built-ins (Explore on Haiku and read-only, Plan for plan mode, and general-purpose with full tools), which is usually where people start before they realize custom ones fit their actual workflow better.
When to spin one up (and when to skip it)
The /agents command makes creation cheap, so the real question is whether a subagent pays back its setup cost. My rough rule:
| Situation | Subagent? |
|---|---|
| Task produces logs or file bodies you won’t reread | Yes |
| You keep copy-pasting the same “find X in the codebase” request | Yes |
| You want read-only enforcement (e.g., a reviewer that can’t edit) | Yes |
| Task runs cheap on Haiku while main runs on Opus | Yes |
| You’ll reuse this in other projects | Yes (user scope) |
| One-off question with a short answer | No, just ask Claude directly |
| The logic depends on state held in the main conversation | No, keep it in-thread |
The second column flips to “no” fast when the task is genuinely small. I’ve seen teams wrap a 4-line grep inside a subagent “for cleanliness” and then spend ten minutes debugging why the YAML won’t parse. The tool wants to help with noise reduction; it isn’t a replacement for being willing to read the file yourself.
Your first custom subagent, step by step
The fastest path is /agents inside Claude Code, which walks you through it interactively. If you’d rather see the raw mechanics, here’s the version I use for a read-only PR reviewer:
```markdown
---
name: pr-reviewer
description: Review diffs for quality, security, and test coverage. Use proactively after code changes.
tools: Read, Grep, Glob, Bash
disallowedTools: Write, Edit
model: sonnet
memory: project
color: blue
---
You are a senior code reviewer. When invoked:

1. Run `git diff --staged` (or the diff the user names) and read the
   full touched files for context, not just the hunks.
2. Check for: obvious bugs, missing error handling on new code paths,
   N+1 queries, secrets, tests for new behavior.
3. Respond with at most 10 findings, each as: file:line, severity
   (blocking / nit / suggestion), and a concrete rewrite.

Update your agent memory with patterns you see more than once.
```
Drop that at `.claude/agents/pr-reviewer.md`, restart the session (or re-run `/agents`), and Claude will route PR reviews there automatically if your message sounds like a review request. To force it: `@pr-reviewer look at the auth changes`. The @-mention guarantees the subagent runs; plain natural language leaves the decision to Claude’s router.
A few of those fields deserve a closer look.
The `tools` + `disallowedTools` combination has subtle ordering. If both are set, `disallowedTools` applies first, then `tools` picks from what’s left. I reach for the denylist when I want to inherit everything except writes, and the allowlist when I want a short, explicit list. Don’t set both unless you’ve thought through the ordering; two teams I’ve worked with shipped subagents that silently had no tools because the allowlist and denylist cancelled each other out.
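To make the two modes concrete, here are the two shapes I’d actually ship, sketched as frontmatter fragments (the agent names are invented for illustration):

```yaml
# Denylist shape: no `tools` line at all, so the subagent inherits
# everything the main session has -- minus the explicit exclusions.
name: safe-inheritor
disallowedTools: Write, Edit
```

```yaml
# Allowlist shape: start from nothing and name exactly what's allowed.
name: tight-reader
tools: Read, Grep, Glob
```

Pick one shape per subagent and the ordering question never comes up.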
Setting memory: project gives the subagent its own directory at .claude/agent-memory/pr-reviewer/ to build up a MEMORY.md over time. Weeks in, this is how a reviewer learns that your team bikesheds on naming conventions every PR and stops flagging them.
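What ends up in that file is whatever the subagent decides is worth keeping. A plausible sketch after a few weeks of reviews (contents invented for illustration):

```markdown
<!-- .claude/agent-memory/pr-reviewer/MEMORY.md (hypothetical) -->
## Recurring patterns
- Naming debates get settled in the PR template; stop flagging naming nits.
- New endpoints repeatedly ship without rate-limit tests; keep flagging those.
```

You can open and edit this file yourself, which is also the escape hatch when the subagent memorizes something wrong.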
color feels cosmetic but earns its place when three subagents run in parallel — the tinted task labels are how you know which transcript is which without reading them all.
Subagents vs skills vs agent teams
People search this comparison constantly, so here’s the quick rundown:
| Feature | Subagent | Skill | Agent team |
|---|---|---|---|
| Runs in own context window | Yes | No (injects into yours) | Yes, each teammate has its own |
| Separate tool set | Yes | No | Yes |
| Can run in parallel | Yes | No | Yes, explicitly designed for it |
| Coordinates across sessions | No | No | Yes |
| Good for | Heavy, self-contained side tasks | Injecting domain knowledge or procedures | Multi-agent workflows that need message passing |
The practical heuristic: if you need knowledge (how we write migrations, where our API conventions live), write a skill. If you need capability (go off and audit the whole repo, come back with a summary), write a subagent. If you need multiple specialists talking to each other across jobs (a researcher handing findings to a writer who hands the draft to a reviewer), that’s agent teams.
You can also combine them. A subagent’s frontmatter accepts a skills: list that preloads skill content into its startup context. I do this for an api-implementer subagent that preloads api-conventions and error-handling-patterns; the subagent gets the knowledge without burning tokens discovering it.
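As a sketch, that api-implementer’s frontmatter might look like this (the skill names are the ones mentioned above; treat the exact file as illustrative):

```yaml
name: api-implementer
description: Implement API endpoints following our conventions. Use for new endpoint work.
tools: Read, Grep, Glob, Edit, Write, Bash
model: sonnet
skills:
  - api-conventions
  - error-handling-patterns
```

The listed skills load once at the subagent’s startup instead of being rediscovered mid-task.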
Tool scoping: less is actually more
The strongest argument for subagents is blast radius, more than context. A subagent with tools: Read, Grep, Glob physically cannot write a file. No prompt injection, no misread instruction, no rogue tool call can turn a reviewer into a deleter, which is a much stronger guarantee than “we told it not to.”
Two patterns I reach for often:
- The read-only researcher inherits nothing and explicitly lists `Read`, `Grep`, `Glob`, and `Bash`. Bash is safe here because the subagent’s prompt says “never write or execute destructive commands,” and, if you’re paranoid, a `PreToolUse` hook can block `rm`, `curl | sh`, and friends before they run.
- The scoped Bash executor has full `Bash`, but a hook script rejects any command not starting with `pytest`, `go test`, or `npm test`. Now the subagent can run tests and only run tests.
Hooks are where this gets serious. A subagent can validate every tool call against a script you control. That’s how you let Claude do useful unattended work without letting it nuke the repo.
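The core of that “tests only” gate is a tiny allowlist check. Here is a sketch of just the policy logic (the surrounding hook wiring — reading the tool-call JSON on stdin and exiting with code 2 plus a stderr message to block, per Claude Code’s hooks docs — is omitted; the function name is hypothetical):

```shell
# Policy for a hypothetical PreToolUse hook on a test-runner subagent:
# return 0 (allow) only for known test commands, nonzero (block) otherwise.
allow_command() {
  case "$1" in
    pytest*|go\ test*|npm\ test*) return 0 ;;  # recognized test runners
    *) return 1 ;;                             # everything else is blocked
  esac
}

# In the real hook: extract tool_input.command from the stdin JSON,
# call allow_command on it, and on rejection `exit 2` with a short
# explanation on stderr so Claude sees why the call was blocked.
```

Because the check runs outside the model, it holds even if the subagent is prompt-injected into trying something else.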
Model selection: the cheapest optimization
The model field defaults to inherit, which is the single most expensive default in Claude Code. If your main session runs on Opus and you spawn four subagents for a codebase audit, all four run on Opus too. That bill adds up fast.
My working split: Haiku handles file walking, grep, and anything that’s “find this, summarize that.” It’s fast, it’s cheap, and the summaries are good enough to make downstream decisions. Sonnet handles code review, refactor proposals, and anything where the quality of reasoning shows up in the output. Opus comes out only when the subagent is doing the hard thinking: architecture analysis, spec synthesis, adversarial review. With Claude Opus 4.7’s pricing drop the gap is smaller than it used to be, but Haiku is still 5× cheaper per million tokens.
Concrete example: the repo-explorer I showed earlier runs on Haiku. It’s faster, roughly 5× cheaper per invocation on Anthropic's API pricing (Opus 4.7 at $5/$25 per MTok vs Haiku 4.5 at $1/$5), and the quality gap on “tell me the directory layout and build commands” is invisible. Save the expensive models for the places where reasoning depth actually changes the answer.
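Plugging the quoted per-MTok prices into a back-of-envelope for a hypothetical audit that reads 2M tokens and writes 100K shows the gap (illustrative arithmetic only, using the $5/$25 and $1/$5 figures above):

```shell
# cost = input_MTok * input_price + output_MTok * output_price
awk 'BEGIN {
  in_m = 2.0; out_m = 0.1                          # millions of tokens
  printf "opus:  $%.2f\n", in_m * 5 + out_m * 25   # $12.50
  printf "haiku: $%.2f\n", in_m * 1 + out_m * 5    # $2.50
}'
```

Five invocations like that per day is the difference between a rounding error and a line item.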
Where parallel execution actually pays off
Claude can run multiple subagents in parallel by spawning them in a single response. This is the feature that turns a 15-minute audit into a 3-minute audit. Ask for a codebase review and Claude can fire off security-reviewer, test-coverage-reviewer, and style-reviewer at once, each with its own context, its own tools, its own transcript.
```mermaid
flowchart LR
    Main[Main session] --> S1[security-reviewer<br/>Sonnet, read-only]
    Main --> S2[test-coverage-reviewer<br/>Haiku, read-only]
    Main --> S3[style-reviewer<br/>Haiku, read-only]
    S1 --> Sum[Summary back to main]
    S2 --> Sum
    S3 --> Sum
```
Two rules I’ve learned the hard way:
- Subagents can’t spawn other subagents. This is deliberate (it prevents infinite nesting), but it means parallelism has to be orchestrated from the main thread. If your plan needs two layers, restructure it as a pipeline, not a tree.
- Parallel subagents don’t talk to each other. They return to the main thread, which merges their outputs. If you need cross-talk (e.g., the reviewer needs to ask the researcher a follow-up), that’s an agent teams problem, not a subagents problem.
Common mistakes I see people make
Things that show up in other people’s configs (and, early on, mine):
- Writing vague descriptions. `description: "helps with code"` won’t get delegated to, because Claude has nothing to match against. Be specific about the trigger: “Review staged diffs for security and test coverage. Use proactively after code changes.”
- Inheriting all tools out of habit. If you don’t list `tools`, the subagent gets everything the main conversation has, including MCP servers you didn’t intend to expose. Start from the denylist or a tight allowlist.
- Giving it write access “just in case.” Most subagents should never edit, and limiting to read-only makes the thing safer and easier to trust the first time it runs unattended.
- Skipping the memory flag on reviewers. A `pr-reviewer` with `memory: project` gets better every week; without it, you re-teach the same conventions every session.
- Defaulting to Opus everywhere. The `inherit` default is expensive. Set `model: haiku` or `model: sonnet` explicitly unless the subagent really needs frontier reasoning.
- One giant subagent. A `do-everything` subagent is just the main thread with extra YAML. Split by job shape: `test-runner`, `docs-researcher`, `migration-writer`. Narrow is what makes them delegatable.
FAQ
What are Claude Code subagents?
Subagents are specialized Claude instances that Claude Code can delegate tasks to. Each one runs in its own context window with its own system prompt, tools, and optional model, then returns a summary to the main session. You define them as Markdown files with YAML frontmatter in .claude/agents/ (project-scoped) or ~/.claude/agents/ (user-scoped).
Claude Code subagents vs skills: which should I use?
Use a skill when you want to inject knowledge or procedures into Claude’s context (e.g., “how we write migrations”). Use a subagent when you want a side task handled in an isolated context with its own tools (e.g., “audit the codebase and summarize”). Skills don’t get their own context window; subagents do. You can also list skills in a subagent’s frontmatter to preload them at startup.
How do I invoke a Claude Code subagent?
Three ways, from implicit to explicit: (1) phrase your request so it matches the subagent’s description and let Claude delegate automatically, (2) name it in natural language (“use the pr-reviewer to look at the auth changes”), or (3) @-mention it, which guarantees that subagent runs. For session-wide use, launch Claude with --agent pr-reviewer and the whole session uses that subagent’s prompt and tools.
Can Claude Code subagents run in parallel?
Yes. Claude spawns multiple subagents in a single response when the tasks are independent. That’s the main time-saving feature: three read-only reviewers running in parallel instead of sequentially can cut a review from minutes to seconds. Subagents themselves cannot spawn other subagents, so parallelism has to be orchestrated from the main thread.
Do Claude Code subagents cost extra tokens?
Each subagent runs its own conversation, so yes, they consume tokens, but often fewer than doing the same work in the main thread, because the main thread doesn’t have to hold all the intermediate search results and file bodies. Setting model: haiku for exploration-heavy subagents is where most of the cost savings come from in practice.
Bottom line
Subagents earn their keep in exactly three places: context preservation on tasks that would otherwise flood the main thread, cost control by routing noisy work to cheaper models, and blast-radius control via tool scoping. Everything else the docs list is downstream of those three.
If you’ve tried Claude Code and felt your main context filling up with stuff you’ll never reread, write two subagents this week: a read-only repo-explorer on Haiku, and a pr-reviewer on Sonnet with memory: project. You’ll know by Friday whether they pay back. In my case they did, and I haven’t opened a codebase audit without subagents since.
