Low-token preset - Zero Operators

A one-page reference for the --low-token cost-saving preset. For motivation, trade-offs, and FAQ, see the low-token mode concept page.

Measured savings: ~30% on the first MNIST bench ($7.75 vs. ~$11 default, 2026-04-27). The preset since added two structural levers, Haiku routing for code-reviewer / test-engineer / oracle-qa and per-phase agent drops (Phase 1 and Phase 5), which target the ~50-60% ceiling. A second bench post-PR-C is required to confirm the new measured number; this page updates when the next bench lands. Full breakdown: Cost benchmark.

Activation

Method	Where	Persistence
`--low-token` flag	`zo build`, `zo continue`	Per-invocation
`low_token: true`	Plan YAML frontmatter	Per project

When either is on, the preset below applies. CLI flag wins over plan field if both are set.

The preset

_LOW_TOKEN_PRESET = {
    "lead_model": "sonnet",
    "max_iterations": 2,
    "stop_on_tier": "could_pass",
    "drop_research_scout": True,
    "headlines_disabled": True,
    "gate_mode": "full-auto",
    "compact_threshold": "60",
}

# Plus two routing tables in src/zo/_orchestrator_phases.py:

LOW_TOKEN_HAIKU_AGENTS = frozenset({
    "code-reviewer", "test-engineer", "oracle-qa",
})

LOW_TOKEN_PHASE_DROPS = {
    "phase_1": frozenset({"code-reviewer", "test-engineer", "domain-evaluator"}),
    "phase_5": frozenset({"xai-agent", "domain-evaluator"}),
}

Authoritative locations: src/zo/cli.py for _LOW_TOKEN_PRESET, src/zo/_orchestrator_phases.py for the routing tables.

Knob reference

Knob	Default	Low-token	Where in code
Lead model	`opus`	`sonnet`	`_resolve_lead_model` in `cli.py`
Sub-agent routing	per `.md` (mostly Sonnet)	two-tier (Haiku for review/test/oracle, Sonnet for reasoning)	`_prompt_low_token_overrides` in `orchestrator.py`
Phase 1 trim	full reviewer set	data-engineer only (reviewers/tests deferred to Gate 5)	`LOW_TOKEN_PHASE_DROPS["phase_1"]`
Phase 5 trim	model-builder + oracle-qa + xai-agent + domain-evaluator	model-builder + oracle-qa only (lead writes single-shot summary)	`LOW_TOKEN_PHASE_DROPS["phase_5"]`
Phase-4 `max_iterations`	`10`	`2`	`_LOW_TOKEN_LOOP_CLAMPS` in `experiment_loop.py`
Phase-4 `stop_on_tier`	`must_pass`	`could_pass`	`_LOW_TOKEN_LOOP_CLAMPS` in `experiment_loop.py`
Cross-cutting `research-scout`	enabled	disabled	`_agents_for_phase` in `orchestrator.py`
Cross-cutting `code-reviewer`	enabled	dropped from Phase 1; on Haiku elsewhere	`LOW_TOKEN_PHASE_DROPS` + `LOW_TOKEN_HAIKU_AGENTS`
End-of-session Haiku summary	generated	skipped	`_generate_session_summary` in `cli.py`. (The previous per-60-second `_maybe_print_headline` ticker was removed unconditionally; only the one-shot wrap-up call remains, and `--low-token` skips even that.)
Default gate mode	`supervised`	`full-auto`	`_resolve_gate_mode` in `cli.py`
Lead prompt: dedicated adaptations section	included	omitted	`build_lead_prompt` in `orchestrator.py`
Lead prompt: roster	full descriptive	compact comma-list	`_prompt_roster` in `orchestrator.py`
`CLAUDE_AUTOCOMPACT_PCT_OVERRIDE` env	unset	`60`	`extra_env` in `cli.py` → wrapper

Override flags

These compose with --low-token to fine-tune individual knobs:

Flag	Type	Effect
`--lead-model {opus,sonnet,haiku}`	choice	Override the lead model
`--max-iterations N`	int	Hard cap on Phase-4 iterations
`--no-headlines`	bool	Skip the end-of-session Haiku bullet summary (independent of low-token)
`--gate-mode {supervised,auto,full-auto}`	choice	Override the gate mode

Plan-level fields

Two YAML frontmatter fields complement the CLI flags:

---
project_name: "..."
# ... required fields ...
low_token: true        # optional, default false
lead_model: sonnet     # optional, default unset
---

Plan-level ## Experiment Loop section overrides individual loop fields with full granularity:

## Experiment Loop

max_iterations: 5
plateau_epsilon: 0.005
stop_on_tier: should_pass

When low_token is on AND a plan ## Experiment Loop field is set, the plan field wins (the preset is a “sensible defaults” layer, not a hard clamp).

Precedence (highest first)

CLI flag: --lead-model, --max-iterations, --gate-mode, --no-headlines
Plan YAML frontmatter: lead_model, low_token
Plan body section: ## Experiment Loop for loop knobs
Low-token preset: applied when --low-token or low_token: true is set
Base default: Opus, 10 iterations, supervised, etc.

Concrete examples:

Invocation	Plan has	Effective lead	Effective max-iterations
`zo build x.md`	(nothing relevant)	`opus`	`10`
`zo build x.md --low-token`	(nothing relevant)	`sonnet`	`2`
`zo build x.md --low-token`	`lead_model: opus`	`opus` (plan wins over preset)	`2`
`zo build x.md --low-token --lead-model haiku`	`lead_model: opus`	`haiku` (CLI wins over plan)	`2`
`zo build x.md --low-token`	`## Experiment Loop` `max_iterations: 8`	`sonnet`	`8` (plan wins over preset clamp)
`zo build x.md --low-token --max-iterations 4`	`## Experiment Loop` `max_iterations: 8`	`sonnet`	`4` (CLI wins over plan)

Visual confirmation

When low-token is active, the ZO banner shows a [low-token] badge:

╭──────────────────────────────────────────────────────────────╮
│   ◎ Zero Operators  v1.0.2  [low-token]                     │
│      Autonomous AI Research & Engineering Teams              │
│                                                              │
│   Project:   mnist-demo                                      │
│   Mode:      build                                           │
│   Phase:     starting                                        │
│   Gates:     full-auto                                       │
╰──────────────────────────────────────────────────────────────╯

​Activation

​The preset

​Knob reference

​Override flags

​Plan-level fields

​Precedence (highest first)

​Visual confirmation

​See also

Activation

The preset

Knob reference

Override flags

Plan-level fields

Precedence (highest first)

Visual confirmation

See also