How-to Guide

How to build a multi-agent system with Claude Code and sub-agents.

Learn how to architect and deploy a hierarchical multi-agent system using Claude Code's Task System, sub-agents, and dynamic workflow delegation patterns.

Daniel Fleuren | 2026-01-15 | 12 min read | Developers and technical teams | Updated 2026-06-19

Written by

Daniel Fleuren

Founder, AI Kick Start. 20+ years enterprise IT


Decision: Start narrow. Use the article to decide the smallest useful workflow worth testing before expanding the system.
Risk to watch: Hype drift. Avoid turning a practical adoption step into a broad transformation promise nobody can verify.
Proof to collect: Business signal. Write down the owner, data boundary, review point, and measurable outcome before the first build.

TL;DR

This guide walks you through building a hierarchical multi-agent system using Claude Code's Task System and sub-agents. You'll learn the orchestrator-worker pattern, task delegation, result aggregation, and error recovery, all within Claude Code's native framework. By the end, you'll have a blueprint for a central orchestrator coordinating four specialised workers, which you can extend with more.

Key takeaways

Pattern: Use an orchestrator + N workers with defined contracts
Communication: JSON task envelopes with schema validation
Error handling: Each worker has retry logic; orchestrator handles cascade failures
Context limits: Claude Sonnet 4.6's 1M context beta handles large codebases in a single pass
Cost control: Set token budgets per sub-agent to avoid runaway spend

Analysis

The idea is simple enough to explain over coffee. Instead of asking one AI to do everything, you give it a team. One agent runs the show. The others do the legwork: one digs through your code, one writes new code, one checks the work, one runs the tests. The boss agent hands out the jobs and stitches the answers back together.

That's a multi-agent system, and Claude Code can run one out of the box. It ships with a Task tool that spins up sub-agents, each with its own fresh context and a clear job to do, and it can run several of them side by side. The orchestrator-worker shape this guide describes maps straight onto that.

A fair warning before we start. The code samples below are written as if there were a TypeScript SDK, with calls like new Claude() and claude.useSkill(). Treat them as illustrative pseudocode for the architecture, not a copy-paste API. In real Claude Code, sub-agents are Markdown files with YAML frontmatter in .claude/agents/, and skills are SKILL.md files in folders under .claude/skills/. The pattern is real and worth building. The exact function names here are a teaching device.

Prerequisites

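Claude Code installed and authenticated (an earlier version of this guide cited claude --version >= 0.35, but that exact threshold is unconfirmed; any recent build with sub-agent support is fine)
A project directory initialised (written here as claude init; in current Claude Code builds, project setup is typically done with the /init command inside a session)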
Basic TypeScript or Python knowledge
Understanding of JSON schema

Step-by-Step Framework

Step 1: Define Your Agent Topology

Most multi-agent setups fall into a handful of shapes. For day-to-day project work, the orchestrator-workers shape earns its keep:

Orchestrator Agent
├── Research Worker (finds relevant files/context)
├── Code Writer Worker (generates/modifies code)
├── Review Worker (checks quality and style)
└── Test Runner Worker (executes and validates)

Work out the topology before you write a line. Two questions sort it: what does each agent need to be good at, and what has to pass from one agent to the next?
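One way to answer those two questions before building is to write the topology down as data and check that every input a worker expects is actually produced somewhere. The sketch below is only an illustration: the AgentSpec shape, the findUnmetInputs helper, and the artefact names (fileReport, patch, and so on) are all hypothetical and not part of Claude Code.

```typescript
// Hypothetical description of one agent: what it consumes and what it produces.
interface AgentSpec {
  name: string;
  consumes: string[]; // artefacts this agent needs as input
  produces: string[]; // artefacts this agent hands on
}

// The four-worker topology from the diagram, with made-up artefact names.
const topology: AgentSpec[] = [
  { name: 'research-agent', consumes: ['task'], produces: ['fileReport'] },
  { name: 'code-writer', consumes: ['task', 'fileReport'], produces: ['patch'] },
  { name: 'review-agent', consumes: ['patch'], produces: ['reviewNotes'] },
  { name: 'test-runner', consumes: ['patch'], produces: ['testReport'] },
];

// Returns one message per input that nothing supplies.
// `initial` lists the artefacts the orchestrator provides itself.
function findUnmetInputs(agents: AgentSpec[], initial: string[]): string[] {
  const available = new Set<string>(initial);
  for (const agent of agents) {
    for (const artefact of agent.produces) available.add(artefact);
  }
  const problems: string[] = [];
  for (const agent of agents) {
    for (const artefact of agent.consumes) {
      if (!available.has(artefact)) {
        problems.push(`${agent.name} needs "${artefact}" but no agent produces it`);
      }
    }
  }
  return problems;
}
```

Calling findUnmetInputs(topology, ['task']) returns an empty list for the topology above. Removing the research agent would surface the code writer's missing fileReport before any agent runs.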

Step 2: Create the Orchestrator Skill

In Claude Code, skills live in .claude/skills/. One correction to the code below: each skill is a folder containing a SKILL.md Markdown file, not a .ts file; the TypeScript here is shorthand for the logic. Create the orchestrator:

// .claude/skills/orchestrator.ts
// Illustrative pseudocode: Claude, generate() and useSkill() stand in for the
// architecture and are not a published SDK.
import { Claude, TaskEnvelope } from './types';

// The only skills the planner is allowed to assign work to
const WORKER_SKILLS = [
  'research-agent',
  'code-writer',
  'review-agent',
  'test-runner'
];

export async function orchestrate(task: string): Promise<string> {
  const claude = new Claude();

  // Phase 1: Decompose the task, restricted to the known worker skills
  const plan = await claude.generate({
    prompt: `Break this task into sub-tasks for a multi-agent system: ${task}
Assign each sub-task to one of these skills: ${WORKER_SKILLS.join(', ')}`,
    outputSchema: {
      subtasks: 'array of {id, skill, description, dependencies}'
    }
  });

  // Phase 2: Execute in dependency order
  // topologicalSort and gatherContext are helpers you supply yourself
  const results: Record<string, unknown> = {};

  for (const subtask of topologicalSort(plan.subtasks)) {
    // Reject plans that name a skill outside the allow-list
    if (!WORKER_SKILLS.includes(subtask.skill)) {
      throw new Error(`Plan references unknown skill: ${subtask.skill}`);
    }
    const worker = await claude.useSkill(subtask.skill);
    const envelope: TaskEnvelope = {
      taskId: subtask.id,
      skill: subtask.skill,
      input: subtask.description,
      context: gatherContext(subtask.dependencies, results),
      budget: { maxTokens: 100000, maxCost: 5.00 }
    };

    results[subtask.id] = await worker.execute(envelope);
  }

  // Phase 3: Synthesise results
  return claude.generate({
    prompt: `Synthesise these results into a coherent output: ${JSON.stringify(results)}`
  });
}

The three phases are the whole story: break the job apart, run the pieces in the right order, then pull the answers back together.
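The orchestrator calls topologicalSort and gatherContext without defining them. A minimal sketch of both follows, assuming each sub-task carries the id, skill, description, and dependencies fields named in the output schema. The Subtask interface is an assumption made for this example, and Kahn's algorithm is one reasonable choice for the ordering rather than anything the guide prescribes.

```typescript
// Shape of a planned sub-task, mirroring the outputSchema in the orchestrator.
interface Subtask {
  id: string;
  skill: string;
  description: string;
  dependencies: string[]; // ids of sub-tasks that must finish first
}

// Kahn's algorithm: repeatedly emit sub-tasks whose dependencies are all done.
function topologicalSort(subtasks: Subtask[]): Subtask[] {
  const byId = new Map<string, Subtask>();
  for (const t of subtasks) byId.set(t.id, t);

  const remaining = new Map<string, number>(); // id -> unmet dependency count
  const dependents = new Map<string, string[]>(); // id -> ids waiting on it
  for (const t of subtasks) {
    remaining.set(t.id, t.dependencies.length);
    for (const dep of t.dependencies) {
      if (!byId.has(dep)) throw new Error(`Unknown dependency "${dep}" in "${t.id}"`);
      dependents.set(dep, [...(dependents.get(dep) ?? []), t.id]);
    }
  }

  const ready = subtasks.filter((t) => t.dependencies.length === 0).map((t) => t.id);
  const ordered: Subtask[] = [];
  while (ready.length > 0) {
    const id = ready.shift() as string;
    ordered.push(byId.get(id) as Subtask);
    for (const next of dependents.get(id) ?? []) {
      const left = (remaining.get(next) as number) - 1;
      remaining.set(next, left);
      if (left === 0) ready.push(next);
    }
  }
  // Anything left unemitted is part of a cycle and can never become ready.
  if (ordered.length !== subtasks.length) {
    throw new Error('Dependency cycle detected in plan');
  }
  return ordered;
}

// Collects only the upstream results a sub-task declared as dependencies.
function gatherContext(
  dependencies: string[],
  results: Record<string, unknown>
): Record<string, unknown> {
  const context: Record<string, unknown> = {};
  for (const dep of dependencies) context[dep] = results[dep];
  return context;
}
```

Failing loudly on an unknown dependency or a cycle is deliberate. A plan generated by a model can be malformed, and it is cheaper to reject it here than to discover the problem halfway through execution.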

Step 3: Build Individual Worker Skills

Each worker is a Claude Code skill with a tight, focused system prompt:

// .claude/skills/research-agent.ts
export const researchAgentConfig = {
  name: 'research-agent',
  systemPrompt: `You are a research specialist. Your job is to:
1. Search the codebase for relevant files using ripgrep and find
2. Read and summarise file contents
3. Return a structured report with file paths, relevant line ranges, and summaries
4. NEVER modify files, only read and report

Return your findings as JSON matching the ResearchOutput schema.`,

  tools: ['ripgrep', 'file_read', 'git_log'],

  outputSchema: {
    files: 'array of {path, relevanceScore, relevantLines, summary}',
    confidence: 'number 0-1'
  }
};

// .claude/skills/code-writer.ts
export const codeWriterConfig = {
  name: 'code-writer',
  systemPrompt: `You are a senior TypeScript developer. Your job is to:
1. Write clean, typed, well-documented code
2. Follow the project's existing patterns and conventions
3. Generate unit tests alongside implementation
4. Return the full file content, not diffs

Always include error handling and input validation.`,

  tools: ['file_write', 'file_read', 'shell_exec'],

  constraints: {
    maxFileSize: '500 lines',
    requireTests: true,
    requireTypes: true
  }
};

Notice the research agent is read-only by design. Giving each worker the narrowest set of tools it needs keeps it in its lane and stops a stray write from doing damage.
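Since real Claude Code sub-agents are Markdown files with YAML frontmatter in .claude/agents/, the config object above would end up as a file rather than a TypeScript export. The sketch below renders that file text from a typed definition. It assumes the frontmatter fields name, description, and tools, and the built-in read-only tool names Read, Grep, and Glob; check both against the current Claude Code documentation. SubAgentDefinition and renderAgentFile are hypothetical helpers invented for this example.

```typescript
// Fields of a sub-agent file as this sketch understands them (an assumption to
// verify against the current Claude Code sub-agent documentation).
interface SubAgentDefinition {
  name: string;        // identifier, e.g. "research-agent"
  description: string; // tells the orchestrator when to delegate here
  tools: string[];     // tool allow-list; the narrower, the safer
  systemPrompt: string;
}

// Renders the Markdown-with-frontmatter text for .claude/agents/<name>.md.
// Values are written as plain YAML scalars, so avoid colons in the description.
function renderAgentFile(agent: SubAgentDefinition): string {
  return [
    '---',
    `name: ${agent.name}`,
    `description: ${agent.description}`,
    `tools: ${agent.tools.join(', ')}`,
    '---',
    '',
    agent.systemPrompt.trim(),
    '',
  ].join('\n');
}

// The read-only research worker: no write or shell tools in its allow-list.
const researchAgent: SubAgentDefinition = {
  name: 'research-agent',
  description: 'Read-only codebase research. Use to locate relevant files before code is written.',
  tools: ['Read', 'Grep', 'Glob'],
  systemPrompt: `
You are a research specialist. Search the codebase, read and summarise files,
and report file paths with relevant line ranges. Never modify files.
Return your findings as JSON matching the ResearchOutput schema.
`,
};

const researchAgentFile = renderAgentFile(researchAgent);
```

Writing researchAgentFile to .claude/agents/research-agent.md is left to the caller. Keeping the tool list in one typed place makes it easy to review which workers can write and which cannot.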

Step 4: Wire Up Task Delegation

The orchestrator hands work to the workers through Claude Code's built-in task tool:

// In your orchestrator skill
async function delegateToWorker(envelope: TaskEnvelope) {
  const result = await claude.task({
    description: `${envelope.skill}: ${envelope.input}`,
    prompt: `You are the ${envelope.skill} agent.

TASK: ${envelope.input}

CONTEXT FROM OTHER AGENTS:
${JSON.stringify(envelope.context, null, 2)}

BUDGET: ${envelope.budget.maxTokens} tokens max.

Follow your skill definition precisely. Return structured JSON output.`,
    skills: [envelope.skill],
    timeout: 300000 // 5 minutes
  });

  return validateOutput(result, envelope.skill);
}

The envelope carries everything the worker needs: the task, what the other agents already found, and a token ceiling. Because that ceiling is only stated in the prompt, it guides the worker rather than enforcing a hard limit. The validateOutput call at the end matters more than it looks: it's where you catch a worker that wandered off-schema before its answer poisons the next step.
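The guide never spells out validateOutput, so here is a minimal hand-rolled sketch for the research worker's schema. It assumes the worker replies with a JSON string and that relevantLines is a string such as "10-42"; the source names the field but not its type. A schema library would do the same job with less code, and this version simply avoids a dependency.

```typescript
// Assumed shape of the research worker's output; relevantLines as a string
// (e.g. "10-42") is a guess, since the source does not specify its type.
interface ResearchFile {
  path: string;
  relevanceScore: number;
  relevantLines: string;
  summary: string;
}

interface ResearchOutput {
  files: ResearchFile[];
  confidence: number; // 0 to 1
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

function isResearchFile(value: unknown): value is ResearchFile {
  return (
    isRecord(value) &&
    typeof value.path === 'string' &&
    typeof value.relevanceScore === 'number' &&
    typeof value.relevantLines === 'string' &&
    typeof value.summary === 'string'
  );
}

// Parses a worker's raw reply and throws a descriptive error if it is off-schema.
function validateResearchOutput(raw: string): ResearchOutput {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    throw new Error('Worker reply is not valid JSON');
  }
  if (!isRecord(parsed)) throw new Error('Worker reply must be a JSON object');

  const { files, confidence } = parsed;
  if (!Array.isArray(files) || !files.every(isResearchFile)) {
    throw new Error('"files" must be an array of {path, relevanceScore, relevantLines, summary}');
  }
  if (typeof confidence !== 'number' || confidence < 0 || confidence > 1) {
    throw new Error('"confidence" must be a number between 0 and 1');
  }
  return { files, confidence };
}
```

Throwing here, rather than returning a partial object, lets whatever retry logic wraps the delegation treat an off-schema reply the same way as any other worker failure.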

Step 5: Implement Error Recovery

Workers fail. Plan for it:

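// Small promise-based delay used for backoff between attempts
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function executeWithRetry(
  envelope: TaskEnvelope,
  maxRetries = 2
): Promise<WorkerResult> {
  // Work on a copy so retries never mutate the caller's envelope
  let current: TaskEnvelope = { ...envelope };

  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      const result = await delegateToWorker(current);
      if (result.confidence < 0.7 && attempt < maxRetries) {
        console.warn(`Low confidence (${result.confidence}), retrying...`);
        // Rebuild from the original input so the nudge is added once, not stacked
        current = {
          ...envelope,
          input: `${envelope.input}\n\n[Previous attempt had low confidence. Please be more thorough.]`
        };
        continue;
      }
      return result;
    } catch (error) {
      if (attempt === maxRetries) throw error;
      await sleep(1000 * (attempt + 1)); // linear backoff: 1s, then 2s
    }
  }
  // Reached only if maxRetries is negative; also satisfies the return type
  throw new Error('Max retries exceeded');
}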

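Two safety nets here. A worker that comes back unsure of itself gets asked to try again with a nudge to dig deeper. A worker that throws an outright error gets a backoff before the next attempt. After the retries run dry, a persistent error is rethrown so the orchestrator can give up cleanly rather than pretending the work is done. A result that is still low-confidence on the final attempt is returned as-is, so check its confidence before trusting it.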
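To exercise this retry policy without a live worker, it helps to pass the worker and the delay in as parameters. The sketch below restates the same two safety nets in that form. The names runWithRetry, Worker, and RetryOptions are hypothetical, and WorkerResult is reduced to the two fields this example needs.

```typescript
interface WorkerResult {
  confidence: number; // 0 to 1, self-reported by the worker
  output: string;
}

type Worker = (input: string) => Promise<WorkerResult>;

interface RetryOptions {
  maxRetries: number;
  minConfidence: number;
  sleep: (ms: number) => Promise<void>; // injected so tests need not wait
}

async function runWithRetry(
  worker: Worker,
  input: string,
  options: RetryOptions
): Promise<WorkerResult> {
  let lastError: unknown = new Error('Worker was never called');
  let nudge = ''; // set once after a low-confidence reply, never stacked

  for (let attempt = 0; attempt <= options.maxRetries; attempt++) {
    const isLastAttempt = attempt === options.maxRetries;
    try {
      const result = await worker(input + nudge);
      // Accept a confident result, or whatever we have on the final attempt.
      if (result.confidence >= options.minConfidence || isLastAttempt) return result;
      nudge = '\n\n[Previous attempt had low confidence. Please be more thorough.]';
    } catch (error) {
      lastError = error;
      if (isLastAttempt) break;
      await options.sleep(1000 * (attempt + 1)); // linear backoff: 1s, 2s, ...
    }
  }
  throw lastError;
}
```

With a scripted worker that throws once, returns low confidence once, and then succeeds, the function makes three calls, sleeps once for 1000 ms, and returns the confident result.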

Step 6: Add Human-in-the-Loop Gates

For anything pricey or destructive, put a human in front of the button:

// .claude/skills/gatekeeper.ts
export async function requireApproval(
  action: string,
  estimatedCost: number
): Promise<boolean> {
  if (estimatedCost < 1.00) return true; // Auto-approve cheap ops

  const approval = await claude.prompt({
    type: 'confirm',
    message: `Agent requests: ${action}\nEstimated cost: $${estimatedCost}\nApprove?`
  });

  return approval;
}

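Cheap operations wave straight through. Anything costing a dollar or more stops and asks. Tune that threshold to whatever number makes you nervous. Note that the gate as written checks cost only, so a cheap but destructive action would still pass unless you add a check for it.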
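A gate that also covers the destructive case could look like the sketch below. The ProposedAction shape, the destructive flag, and the injected askHuman callback are assumptions made for illustration; how the question actually reaches a person depends on your setup.

```typescript
interface ProposedAction {
  description: string;
  estimatedCostUsd: number;
  destructive: boolean; // e.g. deletes files, force-pushes, drops tables
}

// However you reach a human: a terminal prompt, a chat message, a ticket.
type AskHuman = (message: string) => Promise<boolean>;

// Auto-approves only actions that are both cheap and non-destructive.
async function shouldProceed(
  action: ProposedAction,
  askHuman: AskHuman,
  autoApproveBelowUsd = 1.0
): Promise<boolean> {
  const cheap = action.estimatedCostUsd < autoApproveBelowUsd;
  if (cheap && !action.destructive) return true;

  const reason = action.destructive ? 'destructive' : 'over cost threshold';
  return askHuman(
    `Agent requests: ${action.description}\n` +
      `Estimated cost: $${action.estimatedCostUsd.toFixed(2)} (${reason})\n` +
      'Approve?'
  );
}
```

Passing askHuman in as a parameter keeps the policy separate from the user interface, so the same rule can be checked in tests with a stub that always answers no.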

Do/Don't

Do	Don't
Define clear output schemas for every worker	Let agents return free-form text
Set token budgets per sub-task	Let a single agent consume your whole context window
Use topological sort for dependency ordering	Fire all agents simultaneously without planning
Log every delegation decision	Run agents as black boxes with no observability
Implement graceful degradation	Fail the entire workflow if one worker errors
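The last two rows of the table can be combined in one small routine: record an outcome for every step, and when a step fails, skip only the steps that depended on it. This is a minimal sketch under assumed names (Step, Outcome, runWithDegradation); it expects the steps to arrive already sorted in dependency order.

```typescript
type Outcome =
  | { id: string; status: 'ok'; value: string }
  | { id: string; status: 'failed'; reason: string }
  | { id: string; status: 'skipped'; reason: string };

interface Step {
  id: string;
  dependencies: string[];
  run: () => Promise<string>;
}

// Runs steps in the given (already dependency-sorted) order. A failure marks
// that step as failed and skips only its dependents; unrelated steps still run.
async function runWithDegradation(steps: Step[]): Promise<Outcome[]> {
  const outcomes = new Map<string, Outcome>();

  for (const step of steps) {
    const blocker = step.dependencies.find((dep) => outcomes.get(dep)?.status !== 'ok');
    if (blocker !== undefined) {
      outcomes.set(step.id, {
        id: step.id,
        status: 'skipped',
        reason: `dependency "${blocker}" did not succeed`,
      });
      continue;
    }
    try {
      outcomes.set(step.id, { id: step.id, status: 'ok', value: await step.run() });
    } catch (error) {
      const reason = error instanceof Error ? error.message : String(error);
      outcomes.set(step.id, { id: step.id, status: 'failed', reason });
    }
  }
  // One entry per step, in execution order: this doubles as the delegation log.
  return steps.map((step) => outcomes.get(step.id) as Outcome);
}
```

The returned list gives the synthesis phase an honest picture of what ran, what failed, and what was never attempted, which is more useful than a single thrown error.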

Testing Your Multi-Agent System

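Build yourself a test harness. The claude run skill commands below are shorthand in the same spirit as the SDK calls above, not a documented CLI; with the real tool, the closest equivalent is a non-interactive run such as claude -p with a prompt that names the sub-agent: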

# Test the full pipeline
claude run skill orchestrator --input "Refactor the auth module to use JWT tokens"

# Test individual workers
claude run skill research-agent --input "Find all API endpoint definitions"
claude run skill code-writer --input "Write a rate-limiting middleware"
claude run skill review-agent --input "Review src/auth.ts for security issues"

Run each worker on its own first. If a single agent misbehaves in isolation, you don't want to be untangling that from inside the full pipeline.

Advanced: Dynamic Agent Creation

A word of caution before this section: the code below describes a runtime that generates and loads new agents on the fly via calls like claude.createSkill() and claude.loadAgent(). As far as Claude Code's documented framework goes, no such runtime API exists. Sub-agents and skills are authored as static Markdown files, not conjured mid-run. Read what follows as a sketch of where the pattern could head, not a feature you can wire up today:

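// Speculative: createSkill() and loadAgent() are hypothetical calls
async function spawnSpecialist(domain: string): Promise<Agent> {
  const skillDefinition = await claude.generate({
    prompt: `Create a Claude Code skill definition for a specialist agent in: ${domain}`,
    // Describe each field, matching the schema style used by the other skills
    outputSchema: {
      name: 'string',
      systemPrompt: 'string',
      tools: 'array of string',
      constraints: 'object'
    }
  });

  await claude.createSkill(skillDefinition);
  return claude.loadAgent(skillDefinition.name);
}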

The pitch is a system that grows new abilities when it meets a problem it hasn't seen: the rough shape of a self-improving agent. Worth understanding as a direction of travel; not something to depend on in a build today.

Conclusion

A multi-agent system in Claude Code rests on three real things: the native Task tool, skill definitions, and the ability to delegate to sub-agents, including several at once. Start small with one orchestrator and a few workers, draw firm contracts between them, then add retries and approval gates. And there's a genuine reason the coordination gets easier on newer models: Claude Sonnet 4.6 ships with a 1M-token context window in beta, at the same $3/$15 per million tokens as 4.5, so the orchestrator can hold a large codebase and the whole workflow's state in one pass instead of leaning on an external message queue.


What to do next

Pick the smallest useful workflow that proves the pattern.
Write down the owner, data boundary, review point, and success measure.
Review the result after the first real run and decide whether to scale, change, or stop.

AI Kick Start is an Illawarra-based AI studio in Figtree, helping businesses across Wollongong, Shellharbour and Kiama and right across Australia put AI to work.
