The plan.md file is the brief, the contract, and the source of truth for every project. The human’s role is to write it (or approve a drafted version); the agent team’s role is to execute against it.
Why a single file
ZO follows Karpathy’s hard-oracle discipline: autonomy scales through rigorous specification, not natural-language ambiguity. A single Markdown file enforces three things:
- Reviewable as a unit. The human can read the full brief in 5–10 minutes before approving.
- Diff-able. Plan edits mid-project produce clean Git diffs that agents can detect and re-plan against.
- One source of truth. No ambiguity about which document represents the current intent.
Eight required sections
Frontmatter
YAML metadata: project name, version, owner, status. Used for routing and registry.
Two optional frontmatter fields control the cost-saving preset:
- low_token: true activates low-token mode for this project (Sonnet lead, 2 max iterations, no headlines, full-auto gates, earlier auto-compaction). Equivalent to passing --low-token on every zo build for this plan.
- lead_model: sonnet overrides the lead orchestrator model. Accepts opus, sonnet, or haiku. Composes with low_token: true (lead model wins).
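The precedence rule between the two fields can be illustrated with a small sketch. This is not ZO's own parser: the `parse_frontmatter` and `resolve_lead_model` helpers, the `project` key, and the `opus` fallback for plans that set neither field are all assumptions made for illustration. Only the Sonnet lead in low-token mode, the three accepted model names, and "lead model wins" come from the text above.

```python
import re

# Hypothetical default for plans that set neither field; not confirmed by the docs.
DEFAULT_LEAD_MODEL = "opus"
# Low-token mode uses a Sonnet lead, per the preset description above.
LOW_TOKEN_LEAD_MODEL = "sonnet"
ALLOWED_MODELS = {"opus", "sonnet", "haiku"}


def parse_frontmatter(plan_text: str) -> dict:
    """Read flat `key: value` pairs from a leading `---` delimited block."""
    match = re.match(r"\A---\n(.*?)\n---\n", plan_text, flags=re.DOTALL)
    if match is None:
        return {}
    fields = {}
    for line in match.group(1).splitlines():
        if ":" in line and not line.lstrip().startswith("#"):
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return fields


def resolve_lead_model(fields: dict) -> str:
    """An explicit lead_model wins; otherwise low_token selects Sonnet."""
    explicit = fields.get("lead_model")
    if explicit is not None:
        if explicit not in ALLOWED_MODELS:
            raise ValueError(f"lead_model must be one of {sorted(ALLOWED_MODELS)}")
        return explicit
    if fields.get("low_token", "false").lower() == "true":
        return LOW_TOKEN_LEAD_MODEL
    return DEFAULT_LEAD_MODEL


# Illustrative plan text; the `project` key name is an assumption.
plan = """---
project: mnist-digit-classifier
low_token: true
lead_model: haiku
---
# Objective
"""
fields = parse_frontmatter(plan)
print(resolve_lead_model(fields))  # haiku: the explicit lead_model overrides the preset
```

With `lead_model` removed from the frontmatter, the same call would return `sonnet`, since the low-token preset would then decide the lead.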
Objective
What you’re building, in business or research terms. Two paragraphs maximum. No technical solution language — that comes later.
Oracle definition
The verifiable success metric. Includes primary metric, ground-truth source, evaluation method, target threshold (must/should/could tiers), evaluation frequency. See the oracle.
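As a rough illustration of how must/should/could tiers might be checked against a score, here is a minimal sketch. The `OracleTiers` class and its method are hypothetical, not part of ZO, and the sketch assumes a higher-is-better metric. The thresholds are the ones quoted for the MNIST reference plan.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OracleTiers:
    """Target thresholds for a higher-is-better primary metric (hypothetical helper)."""
    must: float
    should: float
    could: float

    def __post_init__(self):
        # Tiers only make sense if each one is at least as strict as the previous.
        if not (self.must <= self.should <= self.could):
            raise ValueError("expected must <= should <= could")

    def highest_tier_met(self, score: float) -> Optional[str]:
        """Return the strictest tier the score satisfies, or None if it misses `must`."""
        if score >= self.could:
            return "could"
        if score >= self.should:
            return "should"
        if score >= self.must:
            return "must"
        return None


mnist_tiers = OracleTiers(must=0.95, should=0.98, could=0.99)
print(mnist_tiers.highest_tier_met(0.985))  # should
print(mnist_tiers.highest_tier_met(0.93))   # None
```

Returning `None` for a score below the must tier keeps "failed the oracle" distinct from any passing tier, which is one way a gate could decide whether to iterate again.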
Workflow configuration
Mode (classical_ml, deep_learning, or research), gate behavior per phase, iteration budget, human checkpoints.
Data sources
Where the data lives, format, access method, known issues. Each source gets its own subsection.
Domain priors
What you already know — ML knowledge, expected relationships, known risks, edge cases. This is where domain expertise lands.
Agents
Active agents (executed every phase), phase-in agents (activated for specific phases), inactive agents (explicitly skipped). Optional **Custom agents:** for project-specific specialists, and **Agent adaptations:** for per-project tuning of generic agents.
- Milestones: phase-level deadlines tied to gates.
- Delivery specification: target repo path, branch, output structure.
- Environment: auto-populated from zo.environment.detect_environment() at init time. Captures host platform, Python version, GPU/CUDA details, base image, data layout.
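To give a sense of the kind of host facts an Environment section could record, here is a standard-library-only sketch. It is not the implementation of `zo.environment.detect_environment()`, whose return shape is not documented here. The `sketch_environment` function and its keys are hypothetical, and it covers only platform, Python version, and a coarse GPU hint.

```python
import platform
import shutil


def sketch_environment() -> dict:
    """Collect a few host facts using only the standard library (illustrative only)."""
    return {
        "platform": platform.system(),
        "machine": platform.machine(),
        "python_version": platform.python_version(),
        # nvidia-smi on PATH is only a rough hint that an NVIDIA GPU and driver exist.
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
    }


env = sketch_environment()
for key, value in env.items():
    print(f"{key}: {value}")
```

Details such as CUDA version, base image, and data layout would need project-specific probing and are left out of this sketch.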
Authoring options
- Hand-written
- Agent-drafted
Read the schema in specs/plan.md, copy a template from plans/mnist-digit-classifier.md, and fill in your project. Validate with:
Editing a plan mid-project
Plans aren’t immutable. Edit plan.md at any time, then re-run zo build:
- Smart mode detection — ZO auto-detects whether to start fresh, continue, or re-decompose due to plan edits.
- Gate confirmation — before re-execution, the orchestrator surfaces a diff and asks for human approval if the change crosses a phase boundary.
- Memory continuity — completed phase artifacts are preserved; the orchestrator only re-runs what the diff requires.
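The idea of re-running only what a plan edit touches can be sketched by comparing two versions of a plan section by section. This is an illustration, not ZO's detection logic: it assumes sections are introduced by `## ` headings, and the helper names and sample plan text are hypothetical. How ZO maps a changed section to the phases it re-runs is not covered.

```python
import re


def split_sections(plan_text: str) -> dict:
    """Map each `## ` heading to the body text that follows it."""
    sections = {}
    current = None
    for line in plan_text.splitlines():
        heading = re.match(r"^##\s+(.*\S)\s*$", line)
        if heading:
            current = heading.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(body).strip() for name, body in sections.items()}


def changed_sections(old_plan: str, new_plan: str) -> list:
    """List sections that were added, removed, or edited, in sorted order."""
    old, new = split_sections(old_plan), split_sections(new_plan)
    return sorted(name for name in old.keys() | new.keys() if old.get(name) != new.get(name))


old_plan = """## Objective
Classify handwritten digits.

## Oracle definition
Primary metric: accuracy, must >= 0.95
"""
# Edit one threshold and append a new section.
new_plan = old_plan.replace("0.95", "0.97") + "\n## Milestones\nPhase 1 gate by Friday.\n"

print(changed_sections(old_plan, new_plan))  # ['Milestones', 'Oracle definition']
```

The untouched Objective section does not appear in the result, which is the property that would let completed work tied to it be preserved.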
What a good plan looks like
The reference example is plans/mnist-digit-classifier.md:
- ~80 lines total
- Oracle with explicit must/should/could tiers (0.95 / 0.98 / 0.99)
- Domain priors that capture why (e.g. “MNIST is a solved benchmark — simple CNNs achieve >99%”)
- Constraints that explicitly limit complexity (“max 2 conv layers, no pre-trained models”)
plans/cifar10-classifier.md is a longer example covering a harder problem with richer domain priors (cat-dog confusion, augmentation criticality, animal-vs-vehicle pose variation).
Next
- The oracle: the hard verifiable metric every plan must define.
- Plan schema: full reference for every section.