This quickstart uses the MNIST digit classifier as the target. ZO’s v1 ships with this as a known-good demo: it converges in under two minutes on Apple Silicon and exercises every phase, gate, agent, and memory operation.
Prerequisites
Claude Code
The Claude CLI is required. Install it via the official curl method; see installation.
Python 3.11+
Plus uv for dependency management. ZO is built on PyTorch.
tmux
Used for the conversational init/draft/build flow. Optional with --no-tmux.
1. Clone and install
setup.sh validates prerequisites and offers to install missing ones (Claude CLI, uv, tmux). It also symlinks the zo binary onto your PATH.
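The kind of prerequisite validation described above can be approximated in a few lines of Python. This is a minimal sketch, not the contents of setup.sh: the binary names (`claude`, `uv`, `tmux`) and the `check_prerequisites` helper are assumptions made for illustration, and the sketch only reports what is missing rather than installing anything.

```python
import shutil
import sys

# Assumed binary names; the quickstart does not say exactly what setup.sh probes.
REQUIRED_TOOLS = ("claude", "uv")
OPTIONAL_TOOLS = ("tmux",)  # the quickstart says tmux is optional with --no-tmux
MIN_PYTHON = (3, 11)        # "Python 3.11+" from the prerequisites


def check_prerequisites(which=shutil.which, python_version=None):
    """Return (missing_required, missing_optional, python_ok).

    `which` and `python_version` are injectable so the check can be
    exercised without touching the real PATH or interpreter.
    """
    version = tuple(python_version or sys.version_info[:2])
    missing_required = [t for t in REQUIRED_TOOLS if which(t) is None]
    missing_optional = [t for t in OPTIONAL_TOOLS if which(t) is None]
    return missing_required, missing_optional, version >= MIN_PYTHON


if __name__ == "__main__":
    required, optional, python_ok = check_prerequisites()
    print("missing required tools:", required or "none")
    print("missing optional tools:", optional or "none")
    print("python version ok:", python_ok)
```

Running the script before setup.sh gives a quick read on which tools the installer is likely to offer to install.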
After setup.sh succeeds:
2. Initialise a project
- memory/mnist-demo/{STATE,DECISION_LOG,PRIORS}.md: project memory
- targets/mnist-demo.target.md: delivery repo path
- plans/mnist-demo.md: auto-populated ## Environment section
- <delivery-path>/.zo/: portable project state in the delivery repo
- <delivery-path>/{src,configs,experiments,reports,docker}/: scaffolded structure
Use zo init mnist-demo --no-tmux --branch main --base-image pytorch/pytorch:2.4.0-cpu for headless one-shot init.
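To confirm that init produced the files and directories listed in this step, a small check like the following can help. It is a sketch under stated assumptions: the memory/, targets/, and plans/ paths are taken to be relative to the ZO checkout (the quickstart does not say so explicitly), and `expected_paths` and `missing_paths` are hypothetical helpers, not part of ZO.

```python
from pathlib import Path

MEMORY_FILES = ("STATE.md", "DECISION_LOG.md", "PRIORS.md")
DELIVERY_DIRS = (".zo", "src", "configs", "experiments", "reports", "docker")


def expected_paths(zo_root, delivery_path, project="mnist-demo"):
    """List every path this quickstart step says init creates.

    Assumption: memory/, targets/ and plans/ live under the ZO checkout.
    """
    zo_root, delivery_path = Path(zo_root), Path(delivery_path)
    paths = [zo_root / "memory" / project / name for name in MEMORY_FILES]
    paths.append(zo_root / "targets" / f"{project}.target.md")
    paths.append(zo_root / "plans" / f"{project}.md")
    paths.extend(delivery_path / name for name in DELIVERY_DIRS)
    return paths


def missing_paths(zo_root, delivery_path, project="mnist-demo"):
    """Return the expected paths that do not exist on disk."""
    return [p for p in expected_paths(zo_root, delivery_path, project)
            if not p.exists()]
```

For example, `missing_paths(".", "../mnist-delivery")` returns an empty list when everything is in place; both arguments here are placeholder locations.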
3. Draft the plan
plan.md against the v1 schema: objective, oracle, workflow mode, data sources, domain priors, agents, constraints, milestones, delivery.
You confirm each section before it lands. The agent runs --dry-run first so you see exactly what will be written.
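The nine schema sections named above lend themselves to a quick completeness check on a drafted plan. The sketch below assumes each section appears as a level-2 Markdown heading whose title matches the section name case-insensitively; that heading convention is a guess based on the ## Environment section mentioned earlier, and `missing_sections` is a hypothetical helper rather than ZO's own validator.

```python
import re

# Section names as listed in the quickstart for the v1 plan schema.
V1_SECTIONS = (
    "objective", "oracle", "workflow mode", "data sources", "domain priors",
    "agents", "constraints", "milestones", "delivery",
)

# Assumption: sections are "## Title" headings (level 2 only).
HEADING = re.compile(r"^##[ \t]+(.+?)[ \t]*$", re.MULTILINE)


def missing_sections(plan_text):
    """Return the v1 sections that have no matching level-2 heading."""
    found = {m.group(1).lower() for m in HEADING.finditer(plan_text)}
    return [s for s in V1_SECTIONS if s not in found]
```

Called on the text of a plan file, it returns the section names still to be drafted, in schema order.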
4. Preflight
A WARN on nvidia-smi is expected on macOS because Docker Desktop has no GPU passthrough; ZO emits a CPU-only compose template automatically.
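The distinction between an expected warning and a real problem can be illustrated with a small GPU probe. This is only a sketch of the idea, not ZO's preflight: `gpu_check`, its return shape, and the message for non-macOS hosts are all hypothetical.

```python
import platform
import shutil


def gpu_check(system=None, which=shutil.which):
    """Return (level, message) for a hypothetical nvidia-smi probe.

    `system` defaults to platform.system(); "Darwin" means macOS.
    """
    system = system or platform.system()
    if which("nvidia-smi") is not None:
        return "OK", "nvidia-smi found"
    if system == "Darwin":
        # Matches the quickstart: no GPU passthrough in Docker Desktop on macOS.
        return "WARN", "nvidia-smi missing: expected on macOS, CPU-only template applies"
    # Assumed behaviour for other platforms; the quickstart does not cover them.
    return "WARN", "nvidia-smi missing: check the NVIDIA driver installation"
```

On a Mac, `gpu_check()` yields the WARN level with the "expected on macOS" message, which is the case this step says is safe to ignore.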
5. Build
- A tmux pane with the orchestrator’s live session
- Phase progress + gate status streamed to your terminal
- Live task list + recent agent events from the comms log
- A live training dashboard (split-pane) once Phase 4 starts
- At the end of the run, a short Haiku-generated bullet summary of what the team accomplished
On a Pro plan, student account, or otherwise watching your token budget? Add --low-token: Sonnet lead instead of Opus, Haiku for code-reviewer / test-engineer / oracle-qa, 2 Phase-4 iterations instead of 10, full-auto gates, no end-of-session Haiku summary, and earlier auto-compaction. Measured savings on the MNIST bench are ~30% ($7.75 vs ~$11 default); 50-60% is targeted with the Haiku routing + per-phase trims (pending a second bench), and 70-80% is on the roadmap via an SDK refactor (prompt caching, Batch API, Files API). There is a slight quality trade at the lead step. See low-token mode for the full preset and trade-offs.
6. Verify
When the build completes, the delivery repo contains a complete project:
What just happened
You ran a complete six-phase ML project (data review, feature engineering, model design, training, analysis, packaging), driven by a coordinated team of AI agents, with you involved only at plan approval and two gates.
Next steps
Read the concepts
The plan, the oracle, agents, phases, memory, the mental model.
Try CIFAR-10
Same pattern, different dataset, harder problem. The v1 reference run hit 91.62%.
CLI reference
Every command, flag, and example.
Architecture
How orchestration, contracts, and memory actually work.