AgentSwitch Recipes

Run a Task

ax "run go test ./... and fix any failures"

This runs in your current directory. ax tracks it in a background window that closes when the task finishes. The command prints the session id. Watch it with ax read <id> --follow.

Use this for a task that needs no supervision. The task runs to completion on its own. It stays visible in the picker while it works. Afterward its transcript is indexed like any other session.

Run an Accept Check on a Task (CI)

ax "make 'go vet ./...' pass" --wait --unattended --accept "go vet ./..." --timeout 30m

This blocks and exits 0 only if the accept command passes when the run concludes.

Use this in CI. --wait turns the run into an ordinary blocking command with a meaningful exit code. --unattended makes ax ask return a default instead of waiting on a human, so nothing deadlocks in CI. --accept means success requires the check to pass, not just the model claiming it is done. --timeout caps the wall-clock time. A session sitting idle never counts as done.

Organize Sessions with Tags

In the picker: i to filter, Enter to keep the filter, v + j/k to select rows, l to edit their labels, b to pivot the list by any tag key. From the shell:

ax tag <id> --add-label project=blog
ax move --tag project=blog

The second command moves every window tagged project=blog into its own tmux session. Tags are key=value labels stored in ax's sidecar metadata, never in the harness's transcript, so they are safe to add, change, and remove. A TAGS column appears in the picker once anything is tagged.

Watch a Run from a Shell

ax list --run myrun
ax read --run myrun --follow
ax runs

In order, these print a snapshot of the run, a live event stream (one JSON line per turn, waiting, exit, or crash event), and the list of concluded runs with their outcomes.

These three commands are the same view a coordinator uses, so anything it can see, you can see from any terminal. list shows who is alive and what each session costs. read --follow shows every turn boundary as it happens. runs shows what past runs concluded.

Get Notified When a Run Needs You

Set notify in ~/.config/ax/config.toml to get a message when a run blocks on you, or finishes and wants review, without having to watch the picker:

notify = "tmux"     # status-line message on every attached client
# notify = "bell"   # rings the session's tmux window

For anything richer, point it at a command. Write each placeholder bare, with no quotes of your own around it, because ax quotes every value into a single shell word. {state} is either needs-you or done-review, so one command can handle both cases:

notify = "notify-send ax {state}:{summary}"       # desktop notification
# notify = "curl -s --data-urlencode text={state}:{summary} https://hooks.slack.com/..."

The notify command fires from ax ask, and from claude's own permission prompts once you run ax hook install claude. It does not fire on a crash. Catching a crash would need a daemon, and ax does not run one.
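
The command form can also point at a small script of your own. The sketch below is one way to handle both states in one place. It assumes a hypothetical script path and a config line of the form notify = "/path/to/ax-notify.sh {state} {summary}". The function name and message wording are illustrative. Only the two state values come from the text above.

```shell
#!/usr/bin/env bash
# Hypothetical notify handler. Assumed config line:
#   notify = "/path/to/ax-notify.sh {state} {summary}"
# ax quotes each placeholder into one shell word, so the script
# receives the state as $1 and the whole summary as $2.

format_notification() {
    local state="$1" summary="$2"
    case "$state" in
        needs-you)   printf 'ax is blocked on you: %s\n' "$summary" ;;
        done-review) printf 'ax finished, review wanted: %s\n' "$summary" ;;
        *)           printf 'ax (%s): %s\n' "$state" "$summary" ;;
    esac
}

# When run as a script with both arguments, print the formatted message.
# Pipe it into notify-send, mail, or curl as needed.
if [ "$#" -ge 2 ]; then
    format_notification "$1" "$2"
fi
```

Keeping the formatting in a function makes it easy to test the script by hand before pointing the config at it.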

Coordinate a Goal with Multiple Agents

ax "Add dark mode to this app. Done when: the toggle works and 'npm test' passes." --behavior coordinator --model opus --max-cost 25 --max-workers 3

--behavior coordinator resolves the coordinator behavior by preset name. The behavior file is behaviors/coordinator.md in the repository. The quick start covers getting the checkout.

One session drives the run. It splits the work, launches other sessions, judges their results, and concludes when the criteria pass. The fences put a hard cap on cost and fan-out. Watch the run in the picker (f filters to the run, T shows the run tree). Answer its questions with r when a row shows "needs you".

State the done-criteria in the task, and make them concrete enough to check. The coordinator behavior concludes on evidence. "Done when: 'npm test' passes" gives it a command to run. "Make it nice" gives it nothing to check. The coordinated task tutorial walks through one of these runs end to end.

Run a Project with a Coordinator

The previous recipe concludes once the goal is met. To run an ongoing project instead, launch a coordinator that keeps running and give it a file to manage the project from:

ax "Manage the blog project: triage requests into backlog.md, delegate work, verify, report." --behavior coordinator --interactive

--interactive keeps the coordinator running so you can steer it. Leave --max-cost and --max-workers off for an open-ended project: a tripped fence cascade-kills every session in the run. Add them only when you want that hard stop. There is no project feature to turn on. The coordinator keeps a plain backlog.md that it reads and rewrites. Each item carries a state and an owner, and the file also holds the decisions waiting on you and the coordinator's own follow-ups.

# backlog.md: blog

## In flight
- [~] add RSS feed  (owner: worker/rss)  -- generating feed.xml, verifying links

## Ready
- [ ] dark-mode toggle  (owner: unassigned)  -- done when the toggle persists across reloads

## Decisions for you
- [?] drop IE11 support? blocks the CSS cleanup item

## Done
- [x] fix broken image paths  (verified: all <img> resolve, site build is green)

The coordinator delegates each Ready item to a new session (ax "task"). The child session inherits the run and labels, so the whole run stays one tree in the picker. Before moving a row to Done, the coordinator verifies the result against something real: the built site, the passing test, the file on disk. It never marks work done on trust alone.

Send the coordinator a message and it sorts each item into the backlog. A new request becomes a Ready item. A question becomes a Decisions row. An answer unblocks whatever was waiting on it.

ax send <id> "ship the dark-mode toggle next, and yes, drop IE11"

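When several sessions edit files at the same time, give each one its own git worktree so their edits cannot clobber each other. This is a plain git convention, not an ax feature. Pass each session its worktree path in the task text. Tell the coordinator to integrate a session's work only after that session's verification passes. The commands below assume the branches feature/rss and feature/darkmode already exist. To create a branch together with its worktree, use git worktree add -b <branch> <path>.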

git worktree add ../worker-rss feature/rss
git worktree add ../worker-darkmode feature/darkmode

Watch the run in the picker (f filters to the run, T shows the run tree). Answer with r when a row shows "needs you". The backlog file is the one place the project's state lives, so you or any session can see where things stand by reading a single file.
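
Because the backlog is a plain file, a few lines of shell can summarize it. This is a minimal sketch, assuming the four row markers from the sample backlog.md above ([ ], [~], [?], [x]). The function name is illustrative and not part of ax.

```shell
#!/usr/bin/env bash
# Count backlog rows per state marker. The markers follow the sample
# backlog.md in this recipe; adjust them if your coordinator uses others.

backlog_summary() {
    local file="$1"
    # grep -c prints the number of matching lines (0 when none match);
    # -F matches the marker literally, -- protects the leading dash.
    printf 'ready=%s in_flight=%s decisions=%s done=%s\n' \
        "$(grep -cF -- '- [ ]' "$file")" \
        "$(grep -cF -- '- [~]' "$file")" \
        "$(grep -cF -- '- [?]' "$file")" \
        "$(grep -cF -- '- [x]' "$file")"
}
```

Run backlog_summary backlog.md from any terminal for a one-line status.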

Supervise an Existing Session Toward a Goal

Use this for a session whose context is too valuable to restart, driven by a model that works in bursts and stops early. Copy the session's id from the picker preview into the first line, then paste the rest as is:

W=the-session-id
ax tag "$W" --run myrun --name worker
ax "Supervise worker $W, an already-running session. Do not restart it. Goal: finish its current task. Done when: it demonstrates the working result. It stalls after bursts. Keep it moving." --behavior coordinator --model sonnet --run myrun --name supervisor --interactive

The ax tag line adopts the existing session into a named run. The launch then starts a supervisor in that same run, kept running and steerable by --interactive. The supervisor never does the work itself. It watches the session's turns, sends the next concrete instruction whenever the session stalls, verifies "done" against evidence, and asks you only when the decision is yours. Leave --max-cost off here. If a fence trips, ax cascade-kills the whole run, including the supervised session.

Change the Goalposts Mid-Run

The accept check (--accept) usually points at a script. Editing that script is how you update the criteria, because the next success claim runs the new version. After editing it, tell the root session:

ax send <root-id> "the acceptance criteria changed: re-read ./check.sh and continue until it passes"

If the run already concluded, relaunch with the same run id. The new run picks up the same history:

ax "Continue the previous run. Read your state file first. Updated criteria: ..." --behavior coordinator --run myrun

The same reply channel works at review time. When a finished run shows the green ✓ review state, answer its question with new criteria instead of accepting, and it goes back to work.

Route a Task Up Through Model Tiers

Run a task on the cheapest model first. If the accept check fails, move to a stronger tier. Stop as soon as one tier passes.

for MODEL in haiku sonnet opus; do
    ax "$TASK" --model "$MODEL" --max-tokens 100000 \
        --wait --accept ./check.sh && exit 0
done
exit 1

Write check.sh to verify the concrete deliverable rather than trust the model's claim. It runs automatically when the session tags itself done. An ax exit status of 0 means the tier passed. Any other status means the run gave up, the fence tripped, or the check rejected the result, and the loop moves on to the next tier.
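
An accept check can be very small. The sketch below assumes a hypothetical deliverable: a report.md that must exist, be non-empty, and contain a "## Summary" heading. The file name, heading, and function name are placeholders for your own criteria.

```shell
#!/usr/bin/env bash
# Sketch of an accept check. The deliverable (report.md with a
# "## Summary" heading) is hypothetical; replace it with your own criteria.

check_deliverable() {
    local dir="${1:-.}"
    local report="$dir/report.md"

    # The file must exist and be non-empty.
    [ -s "$report" ] || { echo "missing or empty: $report" >&2; return 1; }

    # It must contain the required heading on a line of its own.
    grep -qx '## Summary' "$report" \
        || { echo "no Summary heading in $report" >&2; return 1; }

    return 0
}

# Saved as check.sh, the script would end with:
#   check_deliverable "$@"
```

The check inspects files on disk and never reads the model's report, so a session cannot pass by claiming success.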

Fence each tier on --max-tokens to prevent runaway spend. On subscription auth (the default), --max-cost is inert. On API auth (--api), use --max-cost in USD instead. After any --wait run, the record at ~/.local/state/ax/runs/<run-id>.json carries cost, token counts, and the outcome field (success, gave_up, budget_hit, crashed):

for f in ~/.local/state/ax/runs/*.json; do
    jq -r '[.group, .outcome, .cost] | @tsv' "$f"
done | column -t

Starting from zero: copy the directory recipes/cost-routing/. It ships the escalation wrapper (escalate.sh, with the tier list inline) and a sample accept check (check.sh). Replace the check with your task's real acceptance criteria and run ./escalate.sh "your task" (it defaults to ./check.sh). No prior sessions or data required.

Share State Between Agents with a Blackboard

Two agents coordinate through a shared JSON file on disk. A producer writes its conclusions to the file. A critic reads those conclusions, evaluates them, and appends its verdicts. Neither agent has access to the other's context window, only to what was written.

BB=/tmp/blackboard.json
echo '{"items":[],"verdicts":[]}' > "$BB"

PROD=$(ax \
    "Blackboard: $BB. Read it. Append one item to 'items'. Write it back." \
    --wait --json | jq -r .id)
ax wait "$PROD" --timeout 5m
ax result "$PROD"

CRIT=$(ax \
    "Blackboard: $BB. Read it. For each item, append a verdict to 'verdicts'. Write it back." \
    --wait --json | jq -r .id)
ax wait "$CRIT" --timeout 5m
ax result "$CRIT"

cat "$BB" | jq .

The primitives: --json on the launch prints the session id before --wait blocks, so you can capture both. ax wait is a clean block point between pipeline stages. ax result reads the final report after the session exits. The blackboard is append-only by convention: each agent writes only to its own array key and never modifies what another agent wrote. That makes it a tamper-evident audit log you can inspect at any point with cat "$BB" | jq .

The same pattern extends to N agents and M rounds. Add a judge worker that reads both items and verdicts and writes a judgment array. Run the critic loop twice for a second-pass refinement. Fan out N critics in parallel (each with a distinct key) and join on all keys being populated. The blackboard is the queue.

Starting from zero: create the blackboard file, write a task prompt that tells each agent its file path and its section to write to, and run the two-stage sequence above. No prior sessions or shared infrastructure required.
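
If you prefer to do the write-back in shell rather than leave it to each agent, the helpers below are one possible shape. This is a sketch that assumes jq is installed. The function names are illustrative and not part of ax.

```shell
#!/usr/bin/env bash
# Blackboard helpers, assuming jq is available. Names are illustrative.

# Append one JSON value to the array stored under KEY, then replace the
# file via a temp file so readers never see a half-written blackboard.
bb_append() {
    local file="$1" key="$2" value_json="$3" tmp
    tmp=$(mktemp "${file}.XXXXXX") || return 1
    if jq --arg k "$key" --argjson v "$value_json" \
            '.[$k] = ((.[$k] // []) + [$v])' "$file" > "$tmp"; then
        mv "$tmp" "$file"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Succeed only when every named key holds a non-empty array.
# This is the join condition for a parallel fan-out.
bb_all_populated() {
    local file="$1" key
    shift
    for key in "$@"; do
        jq -e --arg k "$key" '(.[$k] // []) | length > 0' "$file" \
            > /dev/null || return 1
    done
}
```

The temp-file rename guards against torn reads, not against lost updates. Two writers that append at the same moment can still overwrite each other, so parallel agents should either take turns or write to separate files that a later stage merges.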

Fan Out Workers Over a File Collection

Split a corpus into chunks, launch one worker per chunk in parallel, and join on a single verify pass. Each worker reads its chunk and writes structured notes to a shared vault directory.

VAULT=./vault
RUN_ID="vault-$(date +%s)"
mkdir -p "$VAULT"

PIDS=()

for CHUNK in fixture/chunks/chunk-*.txt; do
    SLUG="${CHUNK##*/chunk-}"
    SLUG="${SLUG%.txt}"
    ax "Read $CHUNK. Extract key topics, commitments, and dates. Write notes to $VAULT/notes-${SLUG}.md." \
        --run "$RUN_ID" --label "role=worker" --label "correspondent=$SLUG" \
        --max-workers 5 --max-tokens 500000 --timeout 10m \
        --wait --json > "/tmp/launch-${SLUG}.json" &
    PIDS+=($!)
done

wait "${PIDS[@]}"

ax "Check each file in $VAULT/notes-*.md for required sections.
    Report PASS or FAIL per file. Exit 0 only if all pass." \
    --run "$RUN_ID" --label "role=verifier" \
    --wait --accept ./check-vault.sh

Each worker is launched with & before any is awaited, so all chunks run in parallel. wait "${PIDS[@]}" is the join point. The --run flag ties all sessions into one tree in the picker (T to expand it). The verify worker reads every notes file and the --accept check enforces a hard pass before the run can conclude.
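
One caveat about the join: when bash's wait is given several PIDs, its exit status is that of the last one only, so a failed worker earlier in the list goes unnoticed. The sketch below waits on each PID in turn and fails if any worker failed. The function name is illustrative.

```shell
#!/usr/bin/env bash
# Wait on each background PID and count failures.

wait_all() {
    local pid failed=0
    for pid in "$@"; do
        # wait PID returns that job's own exit status.
        wait "$pid" || failed=$((failed + 1))
    done
    [ "$failed" -eq 0 ] || { echo "$failed worker(s) failed" >&2; return 1; }
}
```

In the fan-out script, wait_all "${PIDS[@]}" || exit 1 would replace the plain wait, so the verify pass only starts when every extract worker exited 0.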

For large corpora, add --model haiku to the extract workers (mechanical note-taking) and keep the default model for the verify worker. This composes directly with the cost-routing pattern above.

Starting from zero: write a chunking script that splits your source files into per-item text files, define the fields each notes file must contain, write a check-vault.sh that verifies those fields are present, and run the fan-out. No prior sessions or vault contents required.
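
A check-vault.sh can be a short grep loop. The sketch below assumes each notes file must carry three headings, "## Topics", "## Commitments", and "## Dates". Those headings are an assumption drawn from the worker prompt above, not a format ax defines.

```shell
#!/usr/bin/env bash
# Sketch of check-vault.sh. The required headings are an assumption based
# on the worker prompt (topics, commitments, dates); adjust to your format.

check_vault() {
    local vault="${1:-./vault}" file heading rc=0 found=0
    for file in "$vault"/notes-*.md; do
        [ -e "$file" ] || continue          # glob matched nothing
        found=1
        for heading in '## Topics' '## Commitments' '## Dates'; do
            if ! grep -qxF -- "$heading" "$file"; then
                echo "FAIL $file: missing '$heading'" >&2
                rc=1
            fi
        done
    done
    # An empty vault is a failure too: no notes means no work was done.
    [ "$found" -eq 1 ] || { echo "FAIL: no notes files in $vault" >&2; return 1; }
    return "$rc"
}

# Saved as check-vault.sh, the script would end with:
#   check_vault "$@"
```

Treating an empty vault as a failure keeps a run where every worker crashed from passing the check by default.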

Audit Session History for Behavior Patterns

Search your session history for a topic, pull transcripts, and run a headless analysis worker that produces a human-reviewable document with recurring patterns, failure modes, and proposed behavior file edits.

QUERY="coordinator behavior"
CORPUS=$(mktemp /tmp/audit-corpus.XXXXXX)
PROMPT=$(mktemp /tmp/audit-prompt.XXXXXX)

ax search "$QUERY" --json \
    | jq -r '.ids[:8][]' \
    | while read -r id; do
        echo "--- session: $id ---"
        # </dev/null is defensive: it keeps a command inside the loop from consuming the id list on stdin
        ax read "$id" --format text </dev/null | tail -80
    done >> "$CORPUS"

cat > "$PROMPT" <<PROMPT_EOF
You are a behavior-audit worker. Corpus:
$(cat "$CORPUS")

Write: recurring successful patterns (with session citations), recurring
failure modes (with citations), curated lessons, and proposed behavior file
edits as [ADD to section X] / [REVISE Y to] blocks. Mark single-instance
observations as 'watch' not 'pattern'.
PROMPT_EOF

ID=$(ax claude - --wait --unattended --max-tokens 400000 --timeout 15m --json \
    < "$PROMPT" | jq -r .id)
ax result "$ID" > audit-output.md

The primitives: ax search --json returns a ranked list of session IDs matching the query. ax read --format text extracts the normalized transcript. ax claude - reads the task from stdin, so the whole corpus rides inside the prompt file. With --wait the launch blocks until the worker finishes and ax result reads its report. The output goes to a file you review and apply manually.

This is not persistent learning. The worker re-derives the summary from transcripts at call time. The behavior file is version-controlled. A human decides what edits to apply. Run it again after a batch of sessions to see what changed.

The default depth fence (--max-depth 1) allows a root session plus one level of workers, so this launch works from a plain shell and from inside a root session. It is refused only when the script runs inside a session that is already a worker (depth 1), because the analysis worker would land at depth 2. In that case, run it from a shell outside any ax session, or launch the root with --max-depth 2.

Starting from zero: the script writes a blank template if ax search returns no sessions. Run it again once you have sessions that match your query.

Triage Email: Act on Everything, Notify on Exceptions

A cron job reads new mail, dispatches archive and unsubscribe actions automatically, and fires a notification only when something requires human attention. The default is silence. The notification is the exception.

# ~/.config/ax/config.toml
notify = "notify-send ax {state}:{summary}"

# cron: every 4 hours
0 */4 * * *  /path/to/triage-wrapper.sh ~/Maildir >> ~/.local/state/ax/log/email-triage.log 2>&1

Inside the wrapper:

TASK="Process these emails. For each: emit ACTION|||ARCHIVE|||filename,
    ACTION|||UNSUBSCRIBE|||filename, or ACTION|||DRAFT_REPLY|||filename|||text.
    If all were noise, emit SILENT on its own line and stop.
    If any need human attention, emit IMPORTANT: summary."

SESSION=$(ax "$TASK" --behavior behaviors/email-triage.md --model haiku \
    --wait --unattended --timeout 3m --max-cost 0.25 --max-tokens 200000 --json | jq -r .id)
OUTPUT=$(ax result "$SESSION")

# dispatch action lines; ||| separator preserves maildir filenames containing ":"
while IFS= read -r line; do
    if [[ "$line" == "ACTION|||ARCHIVE|||"* ]]; then
        archive "${line#ACTION|||ARCHIVE|||}"
    elif [[ "$line" == "ACTION|||UNSUBSCRIBE|||"* ]]; then
        unsubscribe "${line#ACTION|||UNSUBSCRIBE|||}"
    elif [[ "$line" == "ACTION|||DRAFT_REPLY|||"*"|||"* ]]; then
        rest="${line#ACTION|||DRAFT_REPLY|||}"
        draft_reply "${rest%%|||*}" "${rest#*|||}"
    fi
done <<< "$OUTPUT"

# sentinel gate: silence is the normal case
echo "$OUTPUT" | grep -q '^SILENT$' && exit 0

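# something needs attention: notify
# NOTIFY_CMD comes from the environment; fall back to echo so the wrapper still runs without it
"${NOTIFY_CMD:-echo}" "$(echo "$OUTPUT" | grep '^IMPORTANT:' | head -1)"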

The sentinel gate inverts the usual pattern. Most recipes notify on completion. This one notifies only when SILENT is absent. The worker decides what each email is and what action to take. The wrapper owns the notification decision with a single grep.

--unattended means ax ask returns a default instead of waiting on a human, so nothing deadlocks in cron. --timeout 3m caps the wall-clock time so a hung session never blocks the cron slot. On the default subscription auth --max-cost is inert, so --max-tokens 200000 is the fence that binds.

Starting from zero: create a maildir with a few test messages (mix of newsletters and one real item), write a behavior file that tells the worker how to classify mail and what format to emit, run the wrapper once manually, and verify the sentinel fires correctly before wiring cron.
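
A test maildir takes a few lines to build. This sketch creates the standard cur, new, and tmp directories and drops three messages into new. The addresses, subjects, and file names are invented for the fixture.

```shell
#!/usr/bin/env bash
# Build a throwaway Maildir: two newsletters and one message that needs a human.
# Addresses, subjects, and file names are invented for the fixture.

make_test_maildir() {
    local root="$1"
    mkdir -p "$root/cur" "$root/new" "$root/tmp" || return 1

    printf 'From: news@example.com\nSubject: Weekly widget digest\n\nTop stories this week.\n' \
        > "$root/new/fixture-1"
    printf 'From: deals@example.com\nSubject: 50%% off everything\n\nSale ends Friday.\n' \
        > "$root/new/fixture-2"
    printf 'From: landlord@example.com\nSubject: Lease renewal needs your signature\n\nPlease reply by the 30th.\n' \
        > "$root/new/fixture-3"
}
```

With this fixture, a correct run should archive or unsubscribe the first two messages and emit an IMPORTANT line for the third, so the sentinel gate must not suppress the notification.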

Draft Listings and Hold for Human Approval Before Publishing

A read-only coordinator fans out one worker per niche in parallel. The coordinator itself writes nothing. Each worker produces a design brief, generates artwork via an image API adapter, writes listing copy, and creates a draft product via your platform's API. An accept check verifies every draft mechanically. Then ax ask holds the pipeline for explicit human approval before anything publishes.

ax "Read niches.json. For each niche, launch a worker that:
    1. Writes a design brief to drops/NICHE/brief.md
    2. Calls scripts/image-api.sh to generate drops/NICHE/design.png
    3. Calls scripts/listing-copy.sh to write drops/NICHE/listing.json
       (title <= 140 chars, exactly 13 tags)
    4. Calls scripts/create-draft.sh to write drops/NICHE/draft.json
       (state=draft, published=false)
    Run all workers in parallel. Done when all niches have passing drafts." \
    --behavior behaviors/pod.md \
    --read-only \
    --max-workers 5 --max-tokens 2000000 \
    --accept "bash scripts/check-drop.sh drops" \
    --interactive

The accept check runs before the coordinator can declare success. It verifies that every niche has a design file, a listing with a title under the character limit and the right tag count, and a draft marked as unpublished. No niche passes on the coordinator's word alone.

After the checks pass, the coordinator calls ax ask before the publish step:

ans=$(ax ask "3 drafts ready. Approve to publish?")
[ "$ans" = "yes" ] || exit 0    # stop at drafts if not approved

In attended mode, ax ask shows the session as "needs you" in the picker and fires the notify hook. Answer with r in the picker or ax reply <id> yes from a shell. In unattended mode (--unattended), ax ask returns immediately with no reply and the coordinator stops at drafts. Nothing publishes unattended. This is the structural guarantee for any cron or CI run.

Starting from zero: create a niches.json with two or three entries (each with a slug and keywords), write stub adapter scripts for each platform API call, run the coordinator once with --unattended, and verify that drafts are produced but nothing publishes.
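
A stub adapter only has to write the file the accept check looks for. The sketch below stands in for scripts/create-draft.sh. It writes a local draft record and calls no platform API. The JSON fields mirror the task text (state=draft, published=false). The function names are hypothetical, and the stub assumes niche slugs are plain words with no quotes or backslashes.

```shell
#!/usr/bin/env bash
# Stub for scripts/create-draft.sh: writes a local draft record instead of
# calling a real platform API. Assumes the niche slug needs no JSON escaping.

create_draft_stub() {
    local niche="$1" out_dir="${2:-drops}"
    mkdir -p "$out_dir/$niche" || return 1
    printf '{"niche": "%s", "state": "draft", "published": false}\n' "$niche" \
        > "$out_dir/$niche/draft.json"
}

# Check that a draft record is still unpublished. Plain grep is enough
# here because the stub controls the exact formatting of the file.
draft_is_unpublished() {
    grep -q '"published": false' "$1" && grep -q '"state": "draft"' "$1"
}
```

Once the real adapter replaces the stub, the check should parse the JSON properly (for example with jq) rather than rely on exact spacing.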

Collect Data in a Script, Format It with an Agent

A bash script collects ground-truth data (service health, dependency audit, metrics). Its output is injected into an ax task. A headless formatter session produces a human-readable report or emits SILENT if there is nothing to report. The agent never fetches data itself.

# Stage 1: collect data deterministically
SCRIPT_OUTPUT=$(./check-services.sh)

# Stage 2: inject into task, let the agent format only
ID=$(ax "A health check ran. Its output:

== OUTPUT START ==
${SCRIPT_OUTPUT}
== OUTPUT END ==

If the output contains NO_ISSUES: respond with exactly the word SILENT and stop.
If it contains OUTAGE_DETECTED: write a concise incident summary and stop.
Do not run any commands. Do not fetch additional data." \
    --behavior recipes/prescript-formatter/behaviors/lm-formatter.md \
    --wait --unattended --timeout 5m --max-tokens 200000 \
    --json 2>/dev/null | head -1 | jq -r .id)

# Stage 3: sentinel gate
OUTPUT=$(ax result "$ID" --json | jq -r .result)
echo "$OUTPUT" | grep -q '^SILENT$' && exit 0

# Stage 4: deliver (NOTIFY_CMD comes from the environment; falls back to echo)
"${NOTIFY_CMD:-echo}" "$OUTPUT"

The behavior file constrains the formatter: read the task, format only, no tool calls. This matters because a curl call inside an agent session is billed and subject to permission prompts. A curl call in bash is free and deterministic. The script handles data collection. The agent handles formatting.

The --json | head -1 | jq -r .id pattern captures the session ID from the JSON line printed before --wait blocks. After --wait returns, ax result --json reads the formatter's output cleanly via the .result field.

Adapt to any monitoring job by replacing check-services.sh with your real check tool and adjusting the sentinel tokens (NO_ISSUES / OUTAGE_DETECTED) to match what your script emits.

Starting from zero: write a check script that outputs a fixed sentinel word for the clean case, run the wrapper once in each fixture mode, and confirm the sentinel gate suppresses delivery in the clean case before wiring cron.
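
A check script with fixed sentinels might look like the sketch below. Each probe is a name=command pair, and a probe passes when its command exits 0. The probe format and function name are assumptions for illustration. The sentinel words match the task text above.

```shell
#!/usr/bin/env bash
# Sketch of a check script with fixed sentinels. Each argument is a
# "name=command" probe; the probes themselves are placeholders.

run_probes() {
    local probe name cmd failed=()
    for probe in "$@"; do
        name="${probe%%=*}"     # text before the first "="
        cmd="${probe#*=}"       # text after the first "="
        # A probe passes when its command exits 0.
        if ! sh -c "$cmd" > /dev/null 2>&1; then
            failed+=("$name")
        fi
    done
    if [ "${#failed[@]}" -eq 0 ]; then
        echo "NO_ISSUES"
    else
        echo "OUTAGE_DETECTED"
        printf 'failed: %s\n' "${failed[@]}"
    fi
}
```

A call such as run_probes "disk=test -w /var/tmp" "config=test -r /etc/hosts" prints NO_ISSUES when both probes pass, which gives the formatter a fixed word to key on.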

Schedule Any Recurring Workflow

One wrapper, one behavior file, any task string on any schedule. A cron entry calls the wrapper, which runs the task headless, reads the output, and delivers a report only when the output does not contain the suppression sentinel.

#!/usr/bin/env bash
# ax-scheduled-chain.sh: parametrized cron wrapper
TASK="${1:?usage: ax-scheduled-chain.sh TASK}"
NOTIFY_CMD="${NOTIFY_CMD:-echo}"
SENTINEL="${SENTINEL:-SILENT}"
RUN_ID="chain-$(date +%Y%m%d-%H%M%S)"

SESSION=$(ax "$TASK" \
    --behavior recipes/scheduled-chain.md \
    --wait --unattended --timeout 10m --max-cost 2.00 --max-tokens 400000 \
    --run "$RUN_ID" --json | jq -r .id)

OUTPUT=$(ax result "$SESSION")

printf '%s\n' "$OUTPUT" | grep -q "^${SENTINEL}$" \
    && exit 0

"$NOTIFY_CMD" "$OUTPUT"

# crontab: daily briefing at 07:00 (a crontab entry must stay on one line; cron does not support backslash continuation)
0 7 * * *  NOTIFY_CMD=/usr/local/bin/ax-slack /path/to/ax-scheduled-chain.sh "Search the web for the top 5 AI agent developments from yesterday. Summarize each in 2-3 sentences." >> ~/.local/state/ax/log/briefing.log 2>&1

Only the task string changes between workflows. The same wrapper handles a daily briefing, a nightly dependency audit, a weekly digest, or a competitive repo watch. Wire the sentinel: the worker emits SILENT when there is nothing to report (no new vulnerabilities, empty inbox, no commits), and the wrapper suppresses delivery.

For the dependency audit variant, run the deterministic scanner in bash first (the pre-script pattern above), then inject its output into the task string so the agent only interprets, not fetches:

VULN=$(govulncheck ./... 2>&1)

/path/to/ax-scheduled-chain.sh \
    "A Go vulnerability scan ran. Output:

$VULN

If it says 'No vulnerabilities found', output SILENT. Otherwise summarize each
finding: ID, affected module, version found, fixed version, upgrade action." \
    >> ~/.local/state/ax/log/audit.log 2>&1

ax has no built-in scheduler. Wire it with cron, systemd.timer, or launchd. The notify hook in ~/.config/ax/config.toml is an alternative to the NOTIFY_CMD variable for delivery targets you already have configured.

Starting from zero: copy the wrapper, set the task string, run it manually once with NOTIFY_CMD=echo, confirm the sentinel fires on a clean run, then add the cron entry.
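
The sentinel gate can be factored into a function and tested without launching a session. This is a sketch with an illustrative function name. It differs from the wrapper's grep in one respect: it matches the sentinel as a literal string, so a sentinel containing regex characters such as a dot behaves as expected.

```shell
#!/usr/bin/env bash
# Decide whether output should be delivered. Returns 0 (deliver) unless
# some line consists of exactly the sentinel.

should_deliver() {
    local output="$1" sentinel="${2:-SILENT}"
    # -x matches whole lines only, -F treats the sentinel as a literal string.
    if printf '%s\n' "$output" | grep -qxF -- "$sentinel"; then
        return 1
    fi
    return 0
}
```

In the wrapper this reads as should_deliver "$OUTPUT" "$SENTINEL" || exit 0, followed by the delivery command.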

Run a Produce-Judge-Iterate Loop Against a Rubric

A producer writes or revises a deliverable. A fresh reviewer scores it against a rubric file that encodes your taste. If the score clears the threshold, the loop stops and emits the deliverable. If not, the reviewer's specific violations are folded into the next producer iteration. At the iteration cap, if still failing, the loop emits the best attempt marked NOT PASSING. No human is involved until the loop concludes.

# Improve an existing draft (seed mode): score the draft on iteration 1,
# then revise it against the violations on each subsequent iteration.
bash taste-loop.sh \
    --task-file goal.md \
    --seed draft.md \
    --rubric rubrics/prose-flat-register.md \
    --threshold 85 \
    --max-iter 5

# Produce from scratch: the producer writes the first draft,
# then iterates against the reviewer's violations.
bash taste-loop.sh \
    --task "Write the README intro in flat developer-doc register." \
    --rubric rubrics/prose-flat-register.md \
    --threshold 85 \
    --max-iter 4

Exit 0 means the deliverable passed. Exit 2 means the cap was hit without passing and the output carries the best attempt plus its remaining violations. evidence/trajectory.tsv records the score at each iteration so you can see whether the loop is converging.

The loop ships two behavior files. behaviors/producer.md constrains the producer to emit only the deliverable text. No preamble, no explanation. behaviors/reviewer.md constrains the reviewer to return one JSON object with a score, a violations list with exact quoted spans, and a pass flag. Both workers run --read-only. The reviewer is launched fresh each iteration with no memory of prior rounds. The rubric is where the taste lives: copy rubrics/TEMPLATE.md to write your own. The flagship rubric (rubrics/prose-flat-register.md) penalizes rhetorical openers, marketing verbs, em-dashes, semicolons, and rule-of-three padding.

The coordinator behavior uses this recipe autonomously. Given a goal that requires prose at a specific register, a coordinator session runs taste-loop.sh and passes the reviewer behaviors/reviewer.md as its behavior file. The human sees only the passing result or the cap report, not the intermediate drafts.

Starting from zero: copy recipes/taste-gate/ from the repository, write a rubric by editing rubrics/TEMPLATE.md, and run the loop against a draft or a task prompt. Add --model sonnet (or any model name) to set the model for both workers. Bind each worker's spend with --max-tokens N.