--low-token cost-saving preset. For motivation, trade-offs, and FAQ, see the low-token mode concept page.
Measured savings: ~30% on the first MNIST bench ($7.75 vs. ~$11 default, 2026-04-27). The preset has since added two structural levers: Haiku routing for code-reviewer / test-engineer / oracle-qa, and per-phase agent drops (Phase 1 and Phase 5). Together these target the ~50-60% ceiling. A second bench post-PR-C is required to confirm the new measured number; this page will be updated when that bench lands. Full breakdown: Cost benchmark.
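
As a quick check of the figures quoted above, the arithmetic can be reproduced in a few lines. This is only a sketch: the default cost is the approximate "~$11" figure, so the result is approximate too, and the variable names are illustrative.

```python
# Figures quoted above for the first MNIST bench (2026-04-27).
default_cost_usd = 11.00    # approximate default-mode cost ("~$11")
low_token_cost_usd = 7.75   # measured low-token cost

# Fractional savings relative to the default run.
savings = (default_cost_usd - low_token_cost_usd) / default_cost_usd
print(f"measured savings: {savings:.1%}")

# What the same run would have to cost to reach the targeted 50-60% range.
for target in (0.50, 0.60):
    target_cost = default_cost_usd * (1 - target)
    print(f"{target:.0%} savings -> ${target_cost:.2f}")
```
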
Activation
| Method | Where | Persistence |
|---|---|---|
| --low-token flag | zo build, zo continue | Per-invocation |
| low_token: true | Plan YAML frontmatter | Per project |
The preset
See src/zo/cli.py for _LOW_TOKEN_PRESET and src/zo/_orchestrator_phases.py for the routing tables.
Knob reference
| Knob | Default | Low-token | Where in code |
|---|---|---|---|
| Lead model | opus | sonnet | _resolve_lead_model in cli.py |
| Sub-agent routing | per .md (mostly Sonnet) | two-tier (Haiku for review/test/oracle, Sonnet for reasoning) | _prompt_low_token_overrides in orchestrator.py |
| Phase 1 trim | full reviewer set | data-engineer only (reviewers/tests deferred to Gate 5) | LOW_TOKEN_PHASE_DROPS["phase_1"] |
| Phase 5 trim | model-builder + oracle-qa + xai-agent + domain-evaluator | model-builder + oracle-qa only (lead writes single-shot summary) | LOW_TOKEN_PHASE_DROPS["phase_5"] |
| Phase-4 max_iterations | 10 | 2 | _LOW_TOKEN_LOOP_CLAMPS in experiment_loop.py |
| Phase-4 stop_on_tier | must_pass | could_pass | _LOW_TOKEN_LOOP_CLAMPS in experiment_loop.py |
| Cross-cutting research-scout | enabled | disabled | _agents_for_phase in orchestrator.py |
| Cross-cutting code-reviewer | enabled | dropped from Phase 1; on Haiku elsewhere | LOW_TOKEN_PHASE_DROPS + LOW_TOKEN_HAIKU_AGENTS |
| End-of-session Haiku summary | generated | skipped | _generate_session_summary in cli.py. (The previous per-60-second _maybe_print_headline ticker was removed unconditionally; only the one-shot wrap-up call remains, and --low-token skips even that.) |
| Default gate mode | supervised | full-auto | _resolve_gate_mode in cli.py |
| Lead prompt: dedicated adaptations section | included | omitted | build_lead_prompt in orchestrator.py |
| Lead prompt: roster | full descriptive | compact comma-list | _prompt_roster in orchestrator.py |
| CLAUDE_AUTOCOMPACT_PCT_OVERRIDE env | unset | 60 | extra_env in cli.py → wrapper |
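
The routing and phase-trim rows above can be pictured as two lookups applied to a phase's agent list. The following is a minimal sketch under stated assumptions, not the code in orchestrator.py: the names HAIKU_AGENTS, PHASE_DROPS, CROSS_CUTTING_DISABLED, and plan_phase are hypothetical, and the real LOW_TOKEN_PHASE_DROPS and LOW_TOKEN_HAIKU_AGENTS tables may be shaped differently.

```python
# Hypothetical stand-ins for the tables named in the knob reference.
# Their real shape and contents are assumptions made for illustration.
HAIKU_AGENTS = {"code-reviewer", "test-engineer", "oracle-qa"}

PHASE_DROPS = {
    # Only the drop the table names explicitly; the real entry is likely longer.
    "phase_1": {"code-reviewer"},
    # Phase 5 keeps model-builder + oracle-qa only.
    "phase_5": {"xai-agent", "domain-evaluator"},
}

# research-scout is disabled across all phases in low-token mode.
CROSS_CUTTING_DISABLED = {"research-scout"}


def plan_phase(phase: str, default_agents: list[str], low_token: bool) -> dict[str, str]:
    """Return a mapping of agent name to model tier for one phase."""
    if not low_token:
        # Default mode: each agent keeps whatever its own .md file specifies.
        return {agent: "per-agent default" for agent in default_agents}

    dropped = PHASE_DROPS.get(phase, set()) | CROSS_CUTTING_DISABLED
    return {
        agent: ("haiku" if agent in HAIKU_AGENTS else "sonnet")
        for agent in default_agents
        if agent not in dropped
    }


phase_5_default = ["model-builder", "oracle-qa", "xai-agent", "domain-evaluator", "research-scout"]
print(plan_phase("phase_5", phase_5_default, low_token=True))
```

With the Phase 5 roster from the table, the low-token result keeps model-builder on Sonnet and oracle-qa on Haiku, and drops the rest.
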
Override flags
These compose with --low-token to fine-tune individual knobs:
| Flag | Type | Effect |
|---|---|---|
| --lead-model {opus,sonnet,haiku} | choice | Override the lead model |
| --max-iterations N | int | Hard cap on Phase-4 iterations |
| --no-headlines | bool | Skip the end-of-session Haiku bullet summary (independent of low-token) |
| --gate-mode {supervised,auto,full-auto} | choice | Override the gate mode |
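
To make the flag types concrete, here is a sketch of how the flags in the table could be declared with Python's standard argparse module. Whether zo actually uses argparse is an assumption, as are the build_parser helper and the positional plan argument. Each override defaults to None so that "not passed" can be told apart from an explicit value.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags documented above."""
    parser = argparse.ArgumentParser(prog="zo build")
    parser.add_argument("plan", help="path to the plan file (assumed positional)")
    parser.add_argument("--low-token", action="store_true",
                        help="enable the cost-saving preset")
    # None means "not given", so lower-precedence layers can fill the value in.
    parser.add_argument("--lead-model", choices=["opus", "sonnet", "haiku"], default=None)
    parser.add_argument("--max-iterations", type=int, default=None)
    parser.add_argument("--no-headlines", action="store_true")
    parser.add_argument("--gate-mode", choices=["supervised", "auto", "full-auto"], default=None)
    return parser


args = build_parser().parse_args(["x.md", "--low-token", "--max-iterations", "4"])
print(args.low_token, args.lead_model, args.max_iterations, args.gate_mode)
```
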
Plan-level fields
Two YAML frontmatter fields complement the CLI flags: lead_model and low_token. The plan's ## Experiment Loop section overrides individual loop fields with full granularity.
When low_token is on and a plan ## Experiment Loop field is set, the plan field wins (the preset is a “sensible defaults” layer, not a hard clamp).
Precedence (highest first)
- CLI flag: --lead-model, --max-iterations, --gate-mode, --no-headlines
- Plan YAML frontmatter: lead_model, low_token
- Plan body section: ## Experiment Loop for loop knobs
- Low-token preset: applied when --low-token or low_token: true is set
- Base default: Opus, 10 iterations, supervised, etc.
| Invocation | Plan has | Effective lead | Effective max-iterations |
|---|---|---|---|
| zo build x.md | (nothing relevant) | opus | 10 |
| zo build x.md --low-token | (nothing relevant) | sonnet | 2 |
| zo build x.md --low-token | lead_model: opus | opus (plan wins over preset) | 2 |
| zo build x.md --low-token --lead-model haiku | lead_model: opus | haiku (CLI wins over plan) | 2 |
| zo build x.md --low-token | ## Experiment Loop max_iterations: 8 | sonnet | 8 (plan wins over preset clamp) |
| zo build x.md --low-token --max-iterations 4 | ## Experiment Loop max_iterations: 8 | sonnet | 4 (CLI wins over plan) |
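
The precedence order and the table above amount to "first value that is set wins", checked from the CLI layer down to the base default. The sketch below reproduces the six table rows under that reading. The resolve and first_set helpers, the parameter names, and the two dictionaries are hypothetical and are not taken from cli.py. The sketch treats --max-iterations as a plain override, which matches the rows shown; if the real flag behaves as a cap, a CLI value above the plan value could resolve differently.

```python
# Values taken from the knob reference; the dict layout is an assumption.
BASE = {"lead_model": "opus", "max_iterations": 10}
PRESET = {"lead_model": "sonnet", "max_iterations": 2}


def first_set(*values):
    """Return the first value that is not None (highest precedence first)."""
    return next(v for v in values if v is not None)


def resolve(cli_low_token=False, cli_lead_model=None, cli_max_iterations=None,
            plan_low_token=False, plan_lead_model=None, plan_loop_max_iterations=None):
    """Resolve (lead model, max iterations) using the documented precedence."""
    # The preset layer only exists when low-token is active via flag or plan.
    preset = PRESET if (cli_low_token or plan_low_token) else {}
    lead = first_set(cli_lead_model, plan_lead_model,
                     preset.get("lead_model"), BASE["lead_model"])
    max_iterations = first_set(cli_max_iterations, plan_loop_max_iterations,
                               preset.get("max_iterations"), BASE["max_iterations"])
    return lead, max_iterations


# Last row of the table: CLI --max-iterations 4 beats the plan's 8.
print(resolve(cli_low_token=True, cli_max_iterations=4, plan_loop_max_iterations=8))
```
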
Visual confirmation
When low-token is active, the ZO banner shows a [low-token] badge.
See also
- Low-token mode concept: motivation, trade-offs, FAQ
- Cost benchmark: measured savings on the MNIST reference run
- zo build CLI reference
- The plan: frontmatter fields