
Subagent Orchestration Patterns

Subagents are isolated LLM instances that a parent skill spawns to handle specific tasks. They have their own context window, receive instructions, and return results. Used well, they keep the parent context small while enabling massive parallel work.

All patterns share one principle: the filesystem is the single source of truth. Parent context stays tiny (file pointers + high-level plan). Subagents are stateless black boxes — instructions in, response out, isolated context.

Every pattern below builds on this infrastructure. The filesystem acts as a shared database so the parent never bloats its context.

```
/output/
├── status.json        ← task states, completion flags
├── knowledge.md       ← accumulated findings (append-only)
└── task-queue.json    ← pending work items
/tasks/{id}/
├── input.md           ← instructions for this subagent
└── output/
    ├── result.json    ← structured output (strict schema)
    └── summary.md     ← compact summary (≤200 tokens)
/artifacts/            ← final deliverables
```

One technique is to end every subagent prompt the same way: “You are stateless. Read ONLY the files listed. Write ONLY result.json + summary.md. Do not echo data back.”
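If prompts are assembled programmatically, that footer can be enforced in one place. A minimal sketch, assuming a helper like this exists in your skill (the function name is illustrative):

```python
STATELESS_FOOTER = (
    "You are stateless. Read ONLY the files listed. "
    "Write ONLY result.json + summary.md. Do not echo data back."
)

def build_subagent_prompt(task: str, input_files: list[str]) -> str:
    """Assemble a subagent prompt that always ends with the stateless footer."""
    file_list = "\n".join(f"- {p}" for p in input_files)
    return f"{task}\n\nFiles you may read:\n{file_list}\n\n{STATELESS_FOOTER}"
```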

Pattern 1: Delegated Data Access

The simplest pattern. Subagents read sources and return only distilled summaries. The parent never touches raw data.

| Aspect | Detail |
| --- | --- |
| How it works | Parent spawns readers in parallel; each reads a source and returns a compact summary; parent synthesizes from summaries only |
| Critical rule | Parent must delegate before touching any source material — if it reads first, the tokens are already spent |
| When to use | 5+ documents, web research, large codebase exploration |
| Not worth it for | 1–2 files where the overhead exceeds the savings |
| Token savings | ~99%. Five docs at 15K tokens each = 75K raw vs ~350 tokens in summaries |

Pattern 2: Temp File Assembly

For large-scale operations where relevant data is spread across many sources. Subagents write results to temp files; a separate assembler subagent combines them into a cohesive deliverable.

| Aspect | Detail |
| --- | --- |
| How it works | Parent spawns N worker subagents writing to tmp/{n}.md; after all complete, spawns an assembler subagent that reads all temp files and creates the final artifact |
| When to use | When summaries are still too large to return inline, or when assembly needs a dedicated agent with fresh context |
| Example | The BMad quality optimizer uses this — 5 parallel scanner subagents write temp JSON, then a report-creator subagent synthesizes them |

Pattern 3: Shared-File Orchestration

Multiple subagents communicate through shared files, building on each other’s work. The parent controls turn order.

| Aspect | Detail |
| --- | --- |
| How it works | Agent A writes to shared.md; Agent B reads it and adds; Agent A can be resumed to continue; the shared file grows incrementally |
| Variants | Shared file (multiple agents read/write a common file) or session resumption (reawaken a previous subagent to continue with its full context) |
| When to use | Pipeline stages where later work depends on earlier work, but each agent’s context stays small |

Pattern 4: Hierarchical Lead-Worker

A lead subagent analyzes the task once and writes a breakdown. The parent spawns workers from that plan. Mid-level sub-orchestrators can handle complex subtasks.

| Aspect | Detail |
| --- | --- |
| How it works | Lead agent writes plan.json with task breakdown; parent reads plan and spawns workers in parallel; complex subtasks get their own sub-orchestrator |
| When to use | Tasks that need analysis before decomposition, or where the parent cannot predict the work structure upfront |
| Variant | Master-clone — spawn near-identical agents with slight persona tweaks exploring different branches of the same problem |

Pattern 5: Persona-Driven Parallel Reasoning


The most powerful pattern for quality. Spawn diverse specialists in parallel — genuinely independent thinking from isolated contexts.

| Aspect | Detail |
| --- | --- |
| How it works | Parent spawns 3–6 agents with distinct personas (Architect, Red Teamer, Pragmatist, Innovator); each writes findings independently; an evaluator subagent scores and merges the best elements |
| When to use | Design decisions, code review, strategy, any task where diverse perspectives improve quality |
| Key | Heavy persona injection gives genuinely different outputs, not just paraphrases of the same analysis |

Useful diversity packs:

| Persona | Perspective |
| --- | --- |
| Architect | Scale and elegance above all |
| Red Teamer | Break this — what fails? |
| Pragmatist | Ship it Friday — what is the minimum? |
| Innovator | What if we approached this entirely differently? |
| User Advocate | How does the end user actually experience this? |
| Future-Self | With 5 years of hindsight, what would you change? |

Sub-patterns:

| Sub-Pattern | How It Works |
| --- | --- |
| Multi-Path Exploration | Same task, different personas. Each writes to /explorations/path_N/. Parent prunes or merges best paths |
| Debate & Critique | Round 1: parallel proposals. Round 2: critics attack proposals. Round 3: refinement |
| Ensemble Voting | Same subtask K times with persona variations. Evaluator scores. Weighted merge of winners |
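Ensemble voting, reduced to its skeleton. `run_agent` and `score` are stand-ins for the persona subagents and the evaluator subagent; this simplified sketch picks a single winner rather than performing a weighted merge:

```python
def ensemble_vote(task, personas, run_agent, score):
    """Run the same task under each persona, keep the highest-scored output.
    `run_agent(persona, task)` and `score(output)` stand in for subagent calls."""
    outputs = [run_agent(p, task) for p in personas]  # in practice: in parallel
    return max(outputs, key=score)
```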

Pattern 6: Evolutionary & Emergent Systems


These turn stateless subagents into something that feels alive. All use the filesystem blackboard as connective tissue.

| Variant | How It Works | Best For |
| --- | --- | --- |
| Evolutionary Optimization | Spawn 8–20 agents as a “generation”; evaluator scores; “breeder” creates next-gen instructions from winners; run 5–10 generations | Optimizing algorithms, UI designs, strategies |
| Stakeholder Simulation | Agents are characters (customer, competitor, regulator) acting on shared “world state” files in turns | Product strategy, risk analysis |
| Swarm Intelligence | Dozens of lightweight agents explore solution space, depositing “pheromone” scores; later agents bias toward high-scoring paths | Broad coverage with minimal planning |
| Recursive Meta-Improvement | “Evolver” agents analyze past logs and propose improved system prompts, new roles, or better orchestration heuristics | System self-improvement across sessions |
The Most Common Mistake: Parent Reads First


The single most important thing to get right with subagent patterns is preventing the parent from reading the data it is delegating. If the parent reads all the files before spawning subagents, the entire pattern is defeated — you have already spent the tokens, bloated the context, and lost the isolation benefit.

This happens more often than you might expect. You write a skill that should spawn subagents to each read a document and return findings. You run it. The parent agent helpfully reads every document first, then passes them to subagents, then collects distilled summaries. The subagents still provide fresh perspectives (a real benefit), but the context savings — the primary reason for the pattern — are gone.

The fix is defensive language in your skill. You need to explicitly tell the parent agent what it should and should not do. The key is being specific without being verbose.

Practical tips for getting this right:

| Tip | Example Language |
| --- | --- |
| Tell the parent what to discover, not read | “List all files in resources/ by name to determine how many subagents to spawn — do not read their contents” |
| Tell subagents what to return | “Return only findings relevant to [topic]. Output as JSON to {output-path}. Do not echo raw content” |
| Use pre-pass scripts | Run a lightweight script that extracts metadata (file names, sizes, structure) so the parent can plan without reading |
| Be explicit about the boundary | “Your role is ORCHESTRATION. Scripts and subagents do all analysis” |

Testing is how you catch this. Run your skill and watch what actually happens. If you see the parent reading files it should be delegating, tighten the language. This is normal iteration — the builders are tuned with these patterns, but different models and tools may need more explicit guidance. Review the existing BMad quality optimizer prompts (prompts/quality-optimizer.md) and scanner agents (agents/quality-scan-*.md) for working examples of this defensive language in practice.

| Need | Pattern |
| --- | --- |
| Read multiple sources without bloating context | 1 — Delegated Data Access |
| Combine many outputs into one deliverable | 2 — Temp File Assembly |
| Pipeline where stages depend on each other | 3 — Shared-File Orchestration |
| Task needs analysis before work can be decomposed | 4 — Hierarchical Lead-Worker |
| Quality through diverse perspectives | 5 — Persona-Driven Parallel Reasoning |
| Iterative optimization or simulation | 6 — Evolutionary & Emergent |
- Force strict JSON schemas on every subagent output for reliable parent parsing
- Use git worktrees or per-agent directories to prevent crosstalk
- Start small — one orchestrator that reads plan.md and spawns the first wave
- Patterns compose: use Delegated Access for data gathering, Persona-Driven for analysis, Temp File Assembly for the final report
- Always include graceful degradation — if subagents are unavailable, the main agent performs the work sequentially
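The graceful-degradation point might be realized like this sketch. The `RuntimeError` type is an assumption about how your runtime signals that subagents are unavailable; substitute whatever your agent framework actually raises:

```python
def analyze_with_fallback(sources, spawn_subagent, inline_analyze):
    """Prefer delegation; fall back to sequential inline work in the
    main agent if spawning fails. Both callables are stand-ins."""
    try:
        return [spawn_subagent(s) for s in sources]
    except RuntimeError:  # assumed "subagents unavailable" signal
        return [inline_analyze(s) for s in sources]
```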