All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
0.8.7 - 2026-03-10
- `src/runtimes/cursor.ts`: new runtime adapter for Cursor CLI (`agent` binary), implementing the `AgentRuntime` interface with TUI spawning via tmux, `.cursor/rules/overstory.md` instruction delivery, `--yolo` permission bypass, and headless one-shot mode; thanks to @XavierChevalier (#104, #66)
- `src/runtimes/cursor.test.ts`: comprehensive test suite (497 lines) covering spawn command building, overlay generation, readiness detection, and transcript parsing
- `stability` field on `AgentRuntime`: new `"stable" | "beta" | "experimental"` field on the runtime interface; Claude and Sapling marked `stable`, Pi and Codex as `beta`, Copilot/Gemini/OpenCode/Cursor as `experimental`
- Stability surfaced in `ov agents` and runtime documentation
- Per-coordinator session tracking: `SessionStore` now tracks `coordinator_name` per session with auto-migration for existing databases, enabling isolated run tracking when multiple coordinators operate in the same project
- `OVERSTORY_TASK_ID` env var: slung agents now receive their task ID as an environment variable; tracker `close` commands are guarded to prevent agents from closing issues outside their assigned scope
- Runtime column in dashboard agent panel: the live TUI dashboard now shows which runtime each agent is using (e.g., `claude`, `cursor`, `sapling`); thanks to @mustafamagdy (#99)
- Dashboard crash on SQLite lock contention: `ov dashboard` no longer crashes when concurrent agents cause `SQLITE_BUSY`; database reads are wrapped with retry logic
- Silent content loss in merge auto-resolve: merge resolver Tier 2 (hunk-level) no longer silently drops non-conflicting content when resolving conflicts; the entire file is now preserved correctly
- `ov init` ENOENT on spawner calls: `spawner()` calls for ecosystem tool detection are now wrapped in try/catch to prevent crashes when `mulch`/`sd`/`cn` CLIs are not installed
- Shift+tab false positive in `detectReady`: the `hasStatusBar` check no longer matches shift+tab escape sequences as a status bar indicator, preventing premature ready detection
- Claude bypass dialog and Codex shared state: Claude runtime's `detectReady()` now recognizes the "bypass" dialog phase; Codex runtime correctly handles `sharedWritableDirs` spawn option; thanks to @Ilanbux (#101)
- Tmux pane retry for WSL2 race condition: `capturePaneContent()` and `sendKeys()` now retry on transient tmux failures caused by WSL2 timing issues; thanks to @arosstale (#78)
- Fish shell tmux spawn: tmux session commands are now wrapped in `/bin/bash -c` to prevent failures when the user's default shell is fish
- `coordinator_name` column migration: `createSessionStore()` now auto-migrates existing `sessions` tables to add the `coordinator_name` column without data loss
- 3364 tests across 100 files (7924 `expect()` calls)
- New: `src/runtimes/cursor.test.ts`, `src/commands/ecosystem.test.ts`
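
The `SQLITE_BUSY` retry fix in this release can be pictured roughly as follows. This is a minimal sketch, not the project's implementation: the helper name `withBusyRetry`, the attempt count, and the assumption that the SQLite driver reports lock contention through an error `code` property are all hypothetical.

```typescript
// Hypothetical helper: name, attempt count, and error shape are assumptions.
function withBusyRetry<T>(read: () => T, maxAttempts = 5): T {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return read();
    } catch (err) {
      // Assumed: lock contention surfaces as an error whose code is "SQLITE_BUSY".
      const code =
        typeof err === "object" && err !== null
          ? (err as { code?: unknown }).code
          : undefined;
      // Anything other than lock contention is rethrown immediately.
      if (code !== "SQLITE_BUSY") throw err;
      lastError = err;
    }
  }
  // Every attempt hit lock contention; surface the last error.
  throw lastError;
}
```

The sketch retries immediately; a real reader would probably wait briefly between attempts so the competing writer can finish.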
0.8.6 - 2026-03-06
- `ov coordinator check-complete`: new subcommand that evaluates configured exit triggers (`allAgentsDone`, `taskTrackerEmpty`, `onShutdownSignal`) and returns per-trigger status; `complete = true` only when ALL enabled triggers are met
- `coordinator.exitTriggers` config: new `coordinator` section in `config.yaml` with three boolean triggers controlling automatic coordinator shutdown (all default to `false`)
- Exit-trigger evaluation integrated into coordinator completion protocol: the coordinator can now self-terminate when configured conditions are met
- `allAgentsDone` trigger also checks the merge queue to prevent premature shutdown while branches are still pending merge
- `rollbackWorktree()`: new helper in `src/worktree/manager.ts` that removes a worktree and deletes its branch (best-effort, errors swallowed)
- `ov sling` rollback on spawn failure: if agent spawn fails after worktree creation, the worktree and branch are automatically rolled back to avoid orphaned resources
- `ov clean --agent <name>`: targeted cleanup of a single agent (kills tmux session or process tree, removes worktree, deletes branch, clears agent and log directories, logs synthetic session-end event, and marks session as completed)
- `ov stop --clean-worktree` on completed agents: previously threw an error for completed agents; now skips the kill step and proceeds directly to worktree+branch cleanup
- Auto-commit os-eco state files before merge: runtime state files (`.seeds/`, `.overstory/`, `.mulch/`, `.canopy/`, `.greenhouse/`, `.claude/`, `CLAUDE.md`) are automatically committed with `chore: sync os-eco runtime state` to prevent dirty-tree merge errors
- Stash/pop dirty files during merge: uncommitted changes are stashed before merge and popped afterward, with proper cleanup on failure
- `onMergeSuccess` callback: `createMergeResolver()` now accepts an optional `onMergeSuccess` hook called after successful merge of each entry
- Untracked file handling in merge resolver improved to prevent conflicts between tracked and untracked files
- Auto-commit scaffold files at end of `ov init`: ecosystem directories (`.overstory/`, `.seeds/`, `.mulch/`, `.canopy/`, `.gitattributes`, `CLAUDE.md`) are committed so agent branches don't cause untracked-vs-tracked conflicts during merge
- Headless agent kill blast radius: `killSession("")` with tmux prefix matching could kill ALL tmux sessions; watchdog now uses `killAgent()` helper that routes headless agents through PID-based `killProcessTree()` and TUI agents through named tmux sessions
- Stale headless agent detection: watchdog now checks `isProcessAlive(pid)` for headless agents instead of only checking tmux session liveness
- Coordinator state file commit: completion protocols now commit os-eco state files before final steps to prevent dirty-tree errors downstream
- Coordinator premature issue closure: coordinator no longer closes seeds issues before the lead agent merges its branch; `allAgentsDone` trigger checks merge queue for pending branches
- Coordinator auto-complete on session-end: `ov run complete` is no longer called automatically from the per-turn Stop hook, preventing premature run completion
- Self-exiting coordinator: session-end hook now handles coordinators that exit themselves (e.g., via exit triggers) without throwing errors
- `--json` flag stolen by parent Commander: `.enablePositionalOptions()` added to the root program so subcommand `--json` flags are not consumed by the parent parser
- Pi runtime transcript parsing: Pi v3 JSONL format stores token usage inside `message` events at `message.usage.{input, output, cacheRead}`, not in `message_end` events; parser now handles both formats with `cacheRead` counted toward input tokens (#82)
- Pi `getTranscriptDir()`: now returns `~/.pi/agent/sessions/{encoded-project-path}/` instead of `null`, enabling `ov costs` for Pi agents (#82)
- CLI command count: 34 → 35 (new `check-complete` subcommand under `ov coordinator`)
- 3248 tests across 98 files (7677 `expect()` calls)
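
The exit-trigger rule in this release ("complete only when ALL enabled triggers are met") could be sketched as below. The type names, the `checkComplete` function, and the result shape are hypothetical. In particular, treating "no triggers enabled" as never complete is an assumption; the changelog does not say how that case is handled.

```typescript
type TriggerName = "allAgentsDone" | "taskTrackerEmpty" | "onShutdownSignal";
type TriggerFlags = Record<TriggerName, boolean>;

interface CheckResult {
  triggers: Record<TriggerName, { enabled: boolean; met: boolean }>;
  complete: boolean;
}

const TRIGGER_NAMES: TriggerName[] = [
  "allAgentsDone",
  "taskTrackerEmpty",
  "onShutdownSignal",
];

// Hypothetical evaluation: `enabled` comes from config, `observed` from live state.
function checkComplete(enabled: TriggerFlags, observed: TriggerFlags): CheckResult {
  const triggers = {} as CheckResult["triggers"];
  for (const name of TRIGGER_NAMES) {
    triggers[name] = { enabled: enabled[name], met: observed[name] };
  }
  const active = TRIGGER_NAMES.filter((name) => enabled[name]);
  // Assumption: with every trigger disabled, the coordinator never auto-completes.
  const complete = active.length > 0 && active.every((name) => observed[name]);
  return { triggers, complete };
}
```

Disabled triggers are still reported per trigger but do not influence `complete`.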
0.8.5 - 2026-03-05
- `src/runtimes/opencode.ts`: new runtime adapter for SST OpenCode (`opencode` CLI), implementing the `AgentRuntime` interface with model flag support, `AGENTS.md` instruction file, and headless subprocess spawning
- `src/runtimes/opencode.test.ts`: test suite (325 lines) covering spawn command building, overlay generation, guard rules, and environment setup
- `src/events/tailer.ts`: background NDJSON event tailer that polls `stdout.log` files from headless agents (e.g. Sapling, OpenCode), parses new lines, and writes them into `events.db` via EventStore, enabling `ov status`, `ov dashboard`, and `ov feed` to show live progress for headless agents
- `src/events/tailer.test.ts`: test suite (461 lines) covering line parsing, file tailing, stop/cleanup, and edge cases
- Watchdog integration: `runDaemonTick()` now automatically starts/stops event tailers for active headless agents, with module-level tailer registry persisting across ticks
- `ov inspect` stdout.log fallback: when `--no-tmux` or tmux capture fails, inspect now falls back to reading the agent's `stdout.log` NDJSON file, parsing recent events to display tool activity and progress for headless agents
- Sapling `buildDirectSpawn()` crash: model resolution logic now guards against `undefined` model parameter instead of unconditionally calling `.toUpperCase()` on it; `--model` flag is only appended when a model is actually specified
- Sapling API key leak: `ANTHROPIC_API_KEY` is now explicitly cleared in the child process environment to prevent the parent session's key from leaking into sapling subprocesses; gateway providers re-set it as needed
- 3201 tests across 98 files (7551 `expect()` calls)
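
The line-parsing step of an NDJSON tailer like the one added in this release might look like the sketch below. It is an illustration under stated assumptions: `parseNdjsonChunk` and `ParsedChunk` are hypothetical names, and skipping malformed lines silently is an assumed policy, not documented behavior of `src/events/tailer.ts`.

```typescript
interface ParsedChunk {
  events: Record<string, unknown>[];
  // Trailing partial line, to be prepended to the next chunk read from the log.
  remainder: string;
}

// Hypothetical helper: combines leftover text with newly read text and
// parses every complete line as one JSON event.
function parseNdjsonChunk(buffered: string, chunk: string): ParsedChunk {
  const lines = (buffered + chunk).split("\n");
  // The last element is either "" (chunk ended on a newline) or a partial line.
  const remainder = lines.pop() ?? "";
  const events: Record<string, unknown>[] = [];
  for (const line of lines) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    try {
      const value: unknown = JSON.parse(trimmed);
      if (typeof value === "object" && value !== null && !Array.isArray(value)) {
        events.push(value as Record<string, unknown>);
      }
    } catch {
      // Assumption: malformed lines are skipped rather than aborting the tail.
    }
  }
  return { events, remainder };
}
```

Carrying the remainder forward matters because a poll can observe a line that the agent has only partially written.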
0.8.4 - 2026-03-04
- `runtime.capabilities` config field: maps capability names (e.g. `builder`, `scout`, `coordinator`) to runtime adapter names, enabling heterogeneous fleets where different agent roles use different runtimes
- `getRuntime()` now accepts a `capability` parameter; lookup chain: explicit `--runtime` flag > `capabilities[cap]` > `default` > `"claude"`
- 4 tests covering capability routing, fallback, explicit override, and undefined capabilities
- `getTranscriptDir()` method added to `AgentRuntime` interface: each runtime adapter now owns its transcript directory resolution instead of hardcoding Claude Code paths in the costs command
- All 6 runtime adapters implement `getTranscriptDir()` (Claude returns project-specific path; others return `null`)
- `getKnownInstructionPaths()` in `agents.ts` now queries all registered runtimes via `getAllRuntimes()` instead of maintaining a hardcoded list, so new runtimes are automatically discovered
- Dirty working tree merge guard: `ov merge` now detects uncommitted changes to tracked files before attempting a merge and throws a clear error, preventing cascading failures through all 4 tiers with misleading empty conflict lists
- 5 tests covering the dirty-tree detection in `resolver.test.ts`
- Decoupled Claude Code specifics from costs, transcript, and agent discovery modules: `estimateCost` re-export removed from `transcript.ts` (import directly from `pricing.ts`), transcript dir resolution moved from costs command into runtime adapters, instruction path list derived from runtime registry
- 3137 tests across 96 files (7420 `expect()` calls)
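
The runtime lookup chain described in this release (explicit `--runtime` flag, then `capabilities[cap]`, then `default`, then `"claude"`) can be written out as a small function. This is a sketch only: `resolveRuntimeName` and `RuntimeConfig` are hypothetical names, and the real `getRuntime()` returns an adapter rather than a name.

```typescript
interface RuntimeConfig {
  default?: string;
  capabilities?: Record<string, string>;
}

// Hypothetical name resolution mirroring the documented precedence order.
function resolveRuntimeName(
  config: RuntimeConfig,
  explicit?: string,
  capability?: string,
): string {
  // 1. An explicit --runtime flag always wins.
  if (explicit) return explicit;
  // 2. Otherwise use the per-capability mapping, if one exists.
  const mapped = capability ? config.capabilities?.[capability] : undefined;
  // 3. Then the configured default, and finally the hardcoded "claude" fallback.
  return mapped ?? config.default ?? "claude";
}
```

For example, with `capabilities: { scout: "gemini" }` and `default: "codex"`, a scout resolves to `gemini` while a builder resolves to `codex`.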
0.8.3 - 2026-03-04
- `ov sling` no longer requires `--name`: when omitted, generates a unique name from `{capability}-{taskId}`, with `-2`, `-3` suffixes to avoid collisions against active sessions
- `generateAgentName()` helper exported from `src/commands/sling.ts` with collision-avoidance logic
- Coordinator can now spawn scouts and builders directly: previously only `lead` was allowed without `--parent`; scouts and builders are now also permitted for lightweight tasks that don't need a lead intermediary
- `{{INSTRUCTION_PATH}}` placeholder in agent definitions: all agent `.md` files now use a runtime-resolved placeholder instead of hardcoded `.claude/CLAUDE.md`, enabling Codex (`AGENTS.md`), Sapling (`SAPLING.md`), and other runtimes to place overlays at their native instruction path
- `instructionPath` field added to `OverlayConfig` type and `generateOverlay()` function
- Codex runtime startup: `buildSpawnCommand()` now uses interactive `codex` (not `codex exec`) so sessions stay alive in tmux; omits `--model` for Anthropic aliases that Codex CLI doesn't accept (thanks @vidhatanand)
- Zombie agent cleanup: `ov stop` now cleans up zombie agents (marks them completed) instead of erroring with "already zombie"
- Headless stdout redirect: `ov sling` always redirects headless agent stdout to file, preventing backpressure-induced zombie processes
- Config warning deduplication: non-Anthropic model warnings in `validateConfig` now emit once per process instead of on every `loadConfig()` call
- Codex bare model refs: `validateConfig` now accepts bare model references (e.g., `gpt-5.3-codex`) when the default runtime is `codex`, instead of requiring provider-prefixed format
- Agent definition `.md` files updated to use `{{INSTRUCTION_PATH}}` placeholder (builder, lead, merger, reviewer, scout, supervisor, orchestrator)
- 3130 tests across 96 files (7406 `expect()` calls)
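
The automatic naming rule for `ov sling` in this release (`{capability}-{taskId}` with `-2`, `-3` suffixes on collision) could be sketched like this. The function name `nextAgentName` and its signature are hypothetical; the exported `generateAgentName()` may take different arguments and obtain active session names differently.

```typescript
// Hypothetical sketch of collision-avoiding name generation.
function nextAgentName(
  capability: string,
  taskId: string,
  activeNames: ReadonlySet<string>,
): string {
  const base = `${capability}-${taskId}`;
  // The unsuffixed name is preferred when it is free.
  if (!activeNames.has(base)) return base;
  // Otherwise try -2, -3, ... until a free name is found.
  let suffix = 2;
  while (activeNames.has(`${base}-${suffix}`)) suffix++;
  return `${base}-${suffix}`;
}
```

Only active sessions are consulted, so a name can be reused once the earlier agent is no longer active.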
0.8.2 - 2026-03-04
- `src/runtimes/connections.ts`: module-level connection registry for active `RuntimeConnection` instances, tracking RPC connections to headless agent processes (e.g., Sapling) keyed by agent name
- `getConnection()`, `setConnection()`, `removeConnection()` for lifecycle management with automatic `close()` on removal
- 6 tests in `src/runtimes/connections.test.ts`
- RuntimeConnection for SaplingRuntime: full RPC support enabling direct stdin/stdout communication with Sapling agent processes
- Model alias resolution in `buildEnv()` and `buildDirectSpawn()`: expands `sonnet`/`opus`/`haiku` aliases correctly
- Headless backpressure zombie: `ov sling` now redirects headless agent stdout/stderr to log files to prevent backpressure from causing zombie processes
- `deployConfig` guard write: always writes `guards.json` even when overlay is undefined, preventing missing guard files for headless runtimes
- Sapling model alias resolution: correct alias expansion in both `buildEnv()` and `buildDirectSpawn()` paths
- 3116 tests across 96 files (7373 `expect()` calls)
0.8.1 - 2026-03-04
- Sapling (`sp`) runtime adapter: full `AgentRuntime` implementation for the Sapling headless coding agent
- Headless: runs as a Bun subprocess (no tmux TUI), communicates via NDJSON event stream on stdout (`--json`)
- Instruction file: `SAPLING.md` written to worktree root (agent overlay content)
- Guard deployment: `.sapling/guards.json` written from `guard-rules.ts` constants
- Model alias resolution: expands `sonnet`/`opus`/`haiku` aliases via `ANTHROPIC_DEFAULT_*_MODEL` env vars
- `buildEnv()` configures `ANTHROPIC_BASE_URL`, `ANTHROPIC_AUTH_TOKEN`, provider routing
- Registered in runtime registry as `"sapling"`, available via `ov sling --runtime sapling`
- Sapling v0.1.5 event types added to `EventType` union and theme labels
- 972 lines of test coverage in `src/runtimes/sapling.test.ts`
- Headless spawn in `ov sling`: when `runtime.headless === true`, bypasses tmux entirely and spawns agents as direct Bun subprocesses
- New `src/worktree/process.ts` module: `spawnHeadlessAgent()` for direct `Bun.spawn()` invocation, `HeadlessProcess` interface for PID/stdin/stdout management
- `DirectSpawnOpts` and `AgentEvent` types added to `src/runtimes/types.ts`
- Headless fields added to `AgentRuntime` interface
- `ov status`, `ov dashboard`, `ov inspect` updated to handle tmux-less (headless) agents gracefully
- `ov stop` updated with headless process termination via PID-based `killProcessTree()`
- Health evaluation in `src/watchdog/health.ts` supports headless agent lifecycle (PID liveness instead of tmux session checks)
- CLAUDECODE env clearing: clear `CLAUDECODE` env var in tmux sessions for Claude Code >=2.1.66 compatibility
- Stale comment: update `--mode rpc` comment to `--json` in `process.ts`
- Runtime adapters grew from 5 to 6 (added Sapling)
- 3089 tests across 95 files (7324 `expect()` calls)
- New test files: `src/runtimes/sapling.test.ts`, `src/agents/guard-rules.test.ts`, `src/worktree/process.test.ts`, `src/commands/stop.test.ts`, `src/commands/status.test.ts`, `src/commands/dashboard.test.ts`, `src/watchdog/health.test.ts`
0.8.0 - 2026-03-03
- `ov coordinator send`: fire-and-forget message to the running coordinator via mail + auto-nudge, replacing the two-step `ov mail send` + `ov nudge` pattern
- `ov coordinator ask`: synchronous request/response to the coordinator; sends a dispatch mail with a `correlationId`, auto-nudges, polls for a reply in the same thread, and exits with the reply body (configurable `--timeout`, default 120s)
- `ov coordinator output`: show recent coordinator output via tmux `capture-pane` (configurable `--lines`, default 100)
- 334 lines of new test coverage in `src/commands/coordinator.test.ts`
- `agents/orchestrator.md`: new base agent definition for multi-repo coordination above the coordinator level
- Defines the orchestrator role: dispatches coordinators per sub-repo via `ov coordinator start --project`, monitors via mail, never modifies code directly
- Named failure modes: `DIRECT_SLING`, `CODE_MODIFICATION`, `SPEC_WRITING`, `OVERLAPPING_REPO_SCOPE`, `OVERLAPPING_FILE_SCOPE`, `DIRECT_MERGE`, `PREMATURE_COMPLETION`, `SILENT_FAILURE`, `POLLING_LOOP`
- 239 lines of agent definition
- `operator-messages` section added to `agents/coordinator.md`: defines how coordinators handle synchronous human requests from the CLI
- Reply format: always reply via `ov mail reply` with `correlationId` echo
- Status request format: structured `Active leads` / `Completed` / `Blockers` / `Next actions`
- Dispatch, stop, merge, and unrecognized request handling rules
- `ov --project <path>`: target a different project root for any command, overriding auto-detection
- Validates that the target path contains `.overstory/config.yaml`; throws `ConfigError` if missing
- `setProjectRootOverride()` / `getProjectRootOverride()` / `clearProjectRootOverride()` in `src/config.ts`
- 66 lines of new test coverage in `src/config.test.ts`
- `ov update`: refresh `.overstory/` managed files from the installed npm package without requiring a full `ov init`
- Refreshes: agent definitions (`agent-defs/*.md`), `agent-manifest.json`, `hooks.json`, `.gitignore`, `README.md`
- Does NOT touch: `config.yaml`, `config.local.yaml`, SQLite databases, agent state, worktrees, specs, logs, or `.claude/settings.local.json`
- Flags: `--agents`, `--manifest`, `--hooks`, `--dry-run`, `--json`
- Excludes deprecated agent defs (`supervisor.md`)
- 464 lines of test coverage in `src/commands/update.test.ts`
- Agent types grew from 7 to 8 (added orchestrator)
- CLI commands grew from 32 to 34 (added `update`, `coordinator send`, `coordinator ask`, `coordinator output`)
- 2923 tests across 92 files (6852 `expect()` calls)
0.7.9 - 2026-03-03
- Gemini CLI (`gemini`) runtime adapter: full `AgentRuntime` implementation for Google's Gemini coding agent
- TUI-based interactive mode via tmux (Ink-based TUI, similar to Copilot adapter)
- Instruction file: `GEMINI.md` written to worktree root (agent overlay content)
- Sandbox support via `--sandbox` flag, `--approval-mode yolo` for auto-approval
- Headless mode: `gemini -p "prompt"` for one-shot calls
- Transcript parsing from `--output-format stream-json` NDJSON events
- Registered in runtime registry as `"gemini"`, available via `ov sling --runtime gemini`
- 537 lines of test coverage in `src/runtimes/gemini.test.ts`
- `ANTHROPIC_DEFAULT_{ALIAS}_MODEL` env vars: expand model aliases (`sonnet`, `opus`, `haiku`) to specific model IDs at runtime
- `expandAliasFromEnv()` in `src/agents/manifest.ts` checks `ANTHROPIC_DEFAULT_SONNET_MODEL`, `ANTHROPIC_DEFAULT_OPUS_MODEL`, `ANTHROPIC_DEFAULT_HAIKU_MODEL`
- Applied during `resolveModel()`: env var values override default alias resolution
- 169 lines of new test coverage in `src/agents/manifest.test.ts`
- `.overstory/.gitignore`: un-ignore `agent-defs/` contents so custom agent definitions are tracked by git
- CI lint: fix import sort order in `sling.test.ts`
- 2888 tests across 91 files (6768 `expect()` calls)
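
The alias expansion added in this release can be illustrated with a short sketch. It is written under assumptions: the signature below (taking the environment as a parameter and returning the input unchanged when no override is set) is hypothetical and may not match the real `expandAliasFromEnv()` in `src/agents/manifest.ts`.

```typescript
type Env = Record<string, string | undefined>;

const MODEL_ALIASES = ["sonnet", "opus", "haiku"] as const;

// Hypothetical sketch: look up ANTHROPIC_DEFAULT_{ALIAS}_MODEL for known aliases.
function expandAliasFromEnv(model: string, env: Env): string {
  const isAlias = MODEL_ALIASES.some((alias) => alias === model);
  // Anything that is not a known alias is passed through untouched.
  if (!isAlias) return model;
  const override = env[`ANTHROPIC_DEFAULT_${model.toUpperCase()}_MODEL`];
  // Assumption: an unset or blank variable leaves default alias resolution in place.
  return override !== undefined && override.trim() !== "" ? override : model;
}
```

Passing the environment in explicitly keeps the sketch testable; a caller would typically hand it `process.env`.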
0.7.8 - 2026-03-02
- `runtime.shellInitDelayMs` config option: configurable delay between tmux session creation and TUI readiness polling, giving slow shells (oh-my-zsh, nvm, starship, etc.) time to initialize before the agent command starts
- Applied to both `ov sling` and `ov coordinator start` spawn paths
- Validation: must be non-negative number; values above 30s trigger a warning
- `ov sling --base-branch <branch>`: override the base branch for worktree creation instead of using the canonical branch
- Resolution order: `--base-branch` flag > current HEAD > `config.project.canonicalBranch`
- New `getCurrentBranch()` helper in `src/commands/sling.ts`
- `run_id` column added to `token_snapshots` table: snapshots are now tagged with the active run ID when recorded
- `getLatestSnapshots()` accepts optional `runId` parameter to filter snapshots by run
- `ov costs --live` now scopes to current run when `--run` is provided
- Migration `migrateSnapshotRunIdColumn()` safely adds the column to existing databases
- `checkSessionState()` in `src/worktree/tmux.ts`: detailed session state reporting that distinguishes `"alive"`, `"dead"`, and `"no_server"` states (vs the boolean `isSessionAlive()`)
- Used by coordinator to provide targeted error messages and clean up stale sessions
- `src/commands/coordinator.ts`: `ov coordinator start` now detects zombie coordinator sessions (tmux pane exists but agent process has exited) and automatically reclaims them instead of blocking with "already running"
- Stale sessions where tmux is dead or server is not running are now cleaned up before re-spawning
- Handles pid-null edge case (sessions from older schema) conservatively
- `src/config.ts`: validates `shellInitDelayMs` is a non-negative finite number; warns on values above 30s; falls back to default (0) on invalid input
- 2830 tests across 90 files (6689 `expect()` calls)
- `src/metrics/pricing.test.ts`: new test suite covering `getPricingForModel()` and `estimateCost()`
- `src/metrics/store.test.ts`: snapshot run_id recording and filtering tests
- `src/commands/coordinator.test.ts`: zombie detection, stale session cleanup, and pid-null edge case tests
- `src/commands/sling.test.ts`: `--base-branch` flag and `getCurrentBranch()` tests
- `src/config.test.ts`: `shellInitDelayMs` validation tests
- `src/worktree/tmux.test.ts`: `checkSessionState()` tests
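
The `shellInitDelayMs` validation rules listed for this release (non-negative finite number, warning above 30s, fallback to 0 on invalid input) might be expressed as in the sketch below. The function name, the injected `warn` callback, and the warning text are hypothetical; whether the real validator also warns on invalid input is not stated in the changelog.

```typescript
const WARN_ABOVE_MS = 30000; // 30s threshold from the release notes
const DEFAULT_DELAY_MS = 0;

// Hypothetical sketch of the validation described for src/config.ts.
function normalizeShellInitDelay(value: unknown, warn: (message: string) => void): number {
  const valid = typeof value === "number" && Number.isFinite(value) && value >= 0;
  // Invalid input (wrong type, NaN, Infinity, negative) falls back to the default.
  if (!valid) return DEFAULT_DELAY_MS;
  const delay = value as number;
  if (delay > WARN_ABOVE_MS) {
    // Large values are accepted but flagged, since they slow every spawn.
    warn(`shellInitDelayMs is ${delay}ms, which exceeds ${WARN_ABOVE_MS}ms`);
  }
  return delay;
}
```

Accepting large values with a warning, rather than rejecting them, matches the release note that only invalid input triggers the fallback.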
0.7.7 - 2026-02-27
- `src/runtimes/codex.ts`: new `CodexRuntime` adapter implementing the `AgentRuntime` interface for OpenAI's `codex` CLI, with headless `codex exec` mode, OS-level sandbox security (Seatbelt/Landlock), `AGENTS.md` instruction path, and NDJSON event stream parsing for token usage
- `src/runtimes/codex.test.ts`: comprehensive test suite (741 lines) covering spawn command building, config deployment, readiness detection, and transcript parsing
- Runtime registry now includes `codex` alongside `claude`, `pi`, and `copilot`
- `docs/runtime-adapters.md`: contributor guide (991 lines) covering the `AgentRuntime` interface, all four built-in adapters, the registry pattern, and a step-by-step walkthrough for adding new runtimes
- `src/commands/dashboard.ts`: rewritten with rolling event buffer, compact panels, and new multi-panel layout (Agents 60% + Tasks/Feed 40%, Mail + Merge Queue row, Metrics row)
- `src/commands/init.test.ts`: use no-op spawner in init tests to avoid CI failures from tmux/subprocess side effects
- 2779 tests across 89 files (6591 `expect()` calls)
0.7.6 - 2026-02-27
- `src/runtimes/copilot.ts`: new `CopilotRuntime` adapter implementing the `AgentRuntime` interface for GitHub Copilot's `copilot` CLI, with `--allow-all-tools` permission mode, `.github/copilot-instructions.md` instruction path, and transcript parsing support
- `src/runtimes/copilot.test.ts`: comprehensive test suite (507 lines) covering spawn command building, config deployment, readiness detection, and transcript parsing
- Runtime registry now includes `copilot` alongside `claude` and `pi`
- `ov init` now bootstraps sibling os-eco tools: automatically runs `mulch init`, `sd init`, and `cn init` when the respective CLIs are available; adds CLAUDE.md onboarding sections for each tool
- New flags: `--tools <list>` (comma-separated tool selection), `--skip-mulch`, `--skip-seeds`, `--skip-canopy`, `--skip-onboard`, `--json`
- `src/commands/init.test.ts`: expanded with ecosystem bootstrap tests (335 lines total)
- `src/doctor/providers.ts`: new `providers` check category (11th category) validating gateway provider reachability, auth token environment variables, and tool-use compatibility for multi-runtime configurations
- `src/doctor/providers.test.ts`: test suite (373 lines) covering provider validation scenarios
- `src/metrics/pricing.ts`: extended with OpenAI (GPT-4o, GPT-4o-mini, GPT-5, o1, o3) and Google Gemini (Flash, Pro) pricing alongside existing Claude tiers
- `--bead <id>` flag for `ov costs`: filter cost breakdown by task/bead ID via new `MetricsStore.getSessionsByTask()` method
- Runtime-aware transcript discovery: `ov costs --self` now resolves transcript paths through the runtime adapter instead of hardcoding Claude Code paths
- Runtime-aware instruction path in `ov agents discover`: `extractFileScope()` now tries the configured runtime's `instructionPath` before falling back to `KNOWN_INSTRUCTION_PATHS`
- CI: CHANGELOG-based GitHub release notes. The publish workflow now extracts the version's CHANGELOG.md section for release notes instead of auto-generating from commits; falls back to `--generate-notes` if no entry found
- Pi coding agent URL updated in README to correct repository path
- 2714 tests across 88 files (6481 `expect()` calls)
0.7.5 - 2026-02-26
- tmux "command too long" error: coordinator, monitor, and supervisor commands now pass agent definition file paths instead of inlining content via `--append-system-prompt`; the shell inside the tmux pane reads the file via `$(cat ...)` at runtime, keeping the tmux IPC message small regardless of agent definition size (fixes #45)
- Biome formatting in seeds tracker test (`src/tracker/seeds.test.ts`)
- `SpawnOpts.appendSystemPromptFile`: new option in `AgentRuntime` interface (`src/runtimes/types.ts`) for file-based system prompt injection; both Claude and Pi runtime adapters support it with fallback to inline `appendSystemPrompt`
- README and package description updated to be runtime-agnostic, reflecting the `AgentRuntime` abstraction
- 2612 tests across 86 files (6277 `expect()` calls)
0.7.4 - 2026-02-26
- `src/metrics/pricing.ts`: extracted pricing logic from `transcript.ts` into a standalone module with `TokenUsage`, `ModelPricing`, `getPricingForModel()`, and `estimateCost()` exports, enabling any runtime (not just Claude Code) to use cost estimation without pulling in JSONL-specific parsing logic
- `KNOWN_INSTRUCTION_PATHS` in `agents.ts`: `extractFileScope()` now tries `.claude/CLAUDE.md` then `AGENTS.md` (future Codex support) instead of hardcoding Claude Code's overlay path
- `--classification` guidance in all 8 agent definitions: builder, coordinator, lead, merger, monitor, reviewer, and scout definitions updated with `--classification <foundational|tactical|observational>` guidance for `ml record` commands, with inline descriptions of when to use each classification level
- `agent_end` handler in Pi guard extensions: Pi agents now log `session-end` when the agentic loop completes (via `agent_end` event), preventing watchdog false-positive zombie escalation; `session_shutdown` handler kept as a safety net for crashes and force-kills
- `--tool-name` forwarding in Pi guard extensions: `ov log tool-start` and `ov log tool-end` calls now correctly forward the tool name
- Tracker adapter test suites: comprehensive tests for beads (`src/tracker/beads.test.ts`, 454 lines) and seeds (`src/tracker/seeds.test.ts`, 469 lines) backends covering CLI invocation, JSON parsing, error handling, and edge cases
- Test suite grew from 2550 to 2607 tests across 86 files (6269 `expect()` calls)
- `OVERSTORY_GITIGNORE` import in `prime.ts`: removed duplicate constant definition, now imports from `init.ts` where the canonical constant lives
- Pi agent zombie-state bug: without the `agent_end` handler, completed Pi agents were never marked "completed" in the SessionStore, causing the watchdog to escalate them through stalled → nudge → triage → terminate
- Shell completions for `sling`: added missing `--runtime` flag to shell completion definitions (PR #39, thanks @lucabarak)
- `cleanupTempDir` ENOENT/EBUSY handling: tightened catch block for ENOENT errors and added retry logic for EBUSY from SQLite WAL handles on Windows (#41)
0.7.3 - 2026-02-26
- Mulch outcome tracking: sling now captures applied mulch record IDs at spawn time (saved to `.overstory/agents/{name}/applied-records.json`) and `ov log session-end` appends "success" outcomes back to those records, closing the expertise feedback loop
- `MulchClient.appendOutcome()` method for programmatic outcome recording with status, duration, agent, notes, and test results fields
- `--classification` filter for mulch search (foundational, tactical, observational)
- `--outcome-status` filter for mulch search (success, failure)
- `--sort-by-score` support in mulch prime for relevance-ranked expertise injection
- Tasks panel: upper-right quadrant displays tracker issues with priority colors
- Feed panel: lower-right quadrant shows recent events from the last 5 minutes
- `dimBox`: dimmed box-drawing characters for less aggressive panel borders
- `computeAgentPanelHeight()`: dynamic agent panel sizing (min 8, max 50% screen, scales with agent count)
- Tracker caching with 10s TTL to reduce repeated CLI calls
- Layout restructured to 60/40 split (agents left, tasks+feed right) with 50/50 mail/merge at bottom
- `formatEventLine()`: centralized compact event formatting with agent colors and event labels (used by both feed and dashboard)
- `numericPriorityColor()`: maps numeric priorities (1–4) to semantic colors
- `buildAgentColorMap()` and `extendAgentColorMap()`: stable color assignment for agents by appearance order
- `--no-scout-check` flag to suppress scout-before-build warning
- `shouldShowScoutWarning()`: testable logic for when to warn about missing scouts
- 2550 tests across 84 files (6167 `expect()` calls), up from 2476/83/6044
- New `src/logging/format.test.ts`: coverage for event line formatting and color utilities
- EventStore visibility: removed stdin-only gate on EventStore writes so Pi agents get full event tracking without stdin payload (`ov log tool-start` / `tool-end`)
- Tool name forwarding: Pi guard extensions now pass `--tool-name` to `ov log` calls, fixing missing tool names in event timelines
- Added missing `--runtime` flag to sling completions
- Synced all shell completion scripts (bash/zsh/fish) with current CLI commands and flags
- Added `--no-scout-check` and `--all` (dashboard) to completions
- Restored `formatEventLine()` usage lost during dashboard-builder merge conflict
- Retry temp dir cleanup on EBUSY from SQLite WAL handles (exponential backoff, 5 retries); fixes flaky cleanup on Windows
- Tightened `cleanupTempDir()` ENOENT handling
- Dashboard layout restructured from single-column to multi-panel grid with dynamic sizing
- Feed and dashboard now share centralized event formatting via `formatEventLine()`
- Brand color lightened for better terminal contrast
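
The sizing rule quoted for `computeAgentPanelHeight()` in this release (min 8, max 50% screen, scales with agent count) could be sketched as follows. Only those three constraints come from the changelog. The function name, the `CHROME_ROWS` overhead, the one-row-per-agent scaling, and the choice to let the minimum win on very small terminals are all assumptions.

```typescript
const MIN_PANEL_ROWS = 8; // from the release notes
const CHROME_ROWS = 3; // assumption: rows used by borders and the header

// Hypothetical sketch of dynamic agent panel sizing.
function sketchAgentPanelHeight(agentCount: number, screenRows: number): number {
  // At most half the screen, per the release notes.
  const maxRows = Math.floor(screenRows / 2);
  // Assumption: one row per agent plus fixed chrome.
  const wantedRows = agentCount + CHROME_ROWS;
  // Assumption: the minimum takes priority when half the screen is under 8 rows.
  return Math.max(MIN_PANEL_ROWS, Math.min(maxRows, wantedRows));
}
```

With a 40-row terminal this gives 8 rows for 2 agents, 15 rows for 12 agents, and caps at 20 rows for larger fleets.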
0.7.2 - 2026-02-26
- Configurable model alias expansion: `PiRuntimeConfig` type with `provider` + `modelMap` fields so bare aliases like "opus" are correctly expanded to provider-qualified model IDs (e.g., "anthropic/claude-opus-4-6"), configurable via `config.yaml` runtime.pi section
- `requiresBeaconVerification?()`: optional method on `AgentRuntime` interface; Pi returns `false` to skip the beacon resend loop that spams duplicate startup messages (Pi's idle/processing states are indistinguishable via pane content)
- Config validation for `runtime.pi.provider` and `runtime.pi.modelMap` entries
- Zombie-state bug: Pi agents were stuck in zombie state because pi-guards.ts used the old `() => Extension` object-style API instead of the correct `(pi: ExtensionAPI) => void` factory style; guards were never firing. Rewritten to ExtensionAPI factory format with proper `event.toolName` and `{ block, reason }` returns
- Activity tracking: added `pi.on(tool_call/tool_execution_end/session_shutdown)` handlers so `lastActivity` updates and the watchdog no longer misclassifies active Pi agents as zombies
- Beacon verification loop: `sling.ts` now skips the beacon resend loop when `runtime.requiresBeaconVerification()` returns `false`, preventing duplicate startup messages for Pi agents
- `detectReady()`: fixed to check for Pi TUI header (`pi v`) + token-usage status bar regex instead of `model:` which Pi never emits
- Pi guard extension tests updated for ExtensionAPI format (8 fixes + 7 new tests)
- Replaced 54 hardcoded "bead" references in agent base definitions with tracker-agnostic terminology (task/issue); `{{TRACKER_CLI}}` and `{{TRACKER_NAME}}` placeholders remain for CLI commands
- Fixed overlay fallback default from "bd" to "sd" (seeds is the preferred tracker)
- Supervisor agent soft-deprecated: `ov supervisor` commands marked `[DEPRECATED]` with stderr warning on `start`; supervisor removed from default agent manifest and `ov init` agent-defs copy; `supervisor.md` retains deprecation notice but code is preserved for backward compatibility
- `biome.json` excludes `.pi/` directory from linting (generated extension files)
- 2476 tests across 83 files (6044 `expect()` calls)
0.7.1 - 2026-02-26
- `src/runtimes/pi.ts` — `PiRuntime` adapter implementing `AgentRuntime` for Mario Zechner's Pi coding agent — `buildSpawnCommand()` maps to `pi --model`, `deployConfig()` writes `.pi/extensions/overstory-guard.ts` + `.pi/settings.json`, `detectReady()` looks for Pi TUI header, `parseTranscript()` handles Pi's top-level `message_end`/`model_change` JSONL format
- `src/runtimes/pi-guards.ts` — Pi guard extension generator (`generatePiGuardExtension()`) — produces self-contained TypeScript files for `.pi/extensions/` that enforce the same security policies as Claude Code's `settings.local.json` PreToolUse hooks (team tool blocking, write tool blocking, path boundary enforcement, dangerous bash pattern detection)
- `src/runtimes/types.ts` — `RuntimeConnection` interface for RPC lifecycle: `sendPrompt()`, `followUp()`, `abort()`, `getState()`, `close()` — enables direct stdin/stdout communication with runtimes that support it (Pi JSON-RPC), bypassing tmux for mail delivery, shutdown, and health checks
- `src/runtimes/types.ts` — `RpcProcessHandle` and `ConnectionState` supporting types for the RPC connection interface
- `AgentRuntime.connect?()` — optional method on the runtime interface for establishing direct RPC connections; orchestrator checks `if (runtime.connect)` before calling, falls back to tmux when absent
- Pi runtime registered in `src/runtimes/registry.ts`
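The optional-method check described for `AgentRuntime.connect?()` can be sketched as follows. This is a minimal illustration rather than the project's code: the interface shapes are simplified, calls are synchronous for brevity, and `deliverMail` and `sendViaTmux` are hypothetical names.

```typescript
// Simplified, hypothetical shapes; the real types in src/runtimes/types.ts have more members.
interface RuntimeConnection {
  sendPrompt(text: string): void;
  close(): void;
}

interface AgentRuntime {
  id: string;
  // Optional: only runtimes with a direct RPC channel implement this.
  connect?(): RuntimeConnection;
}

type Delivery = "rpc" | "tmux";

// Hypothetical helper: prefer the RPC connection, fall back to tmux when connect is absent.
function deliverMail(
  runtime: AgentRuntime,
  text: string,
  sendViaTmux: (text: string) => void,
): Delivery {
  if (runtime.connect) {
    const conn = runtime.connect();
    try {
      conn.sendPrompt(text);
    } finally {
      conn.close(); // always release the connection, even if sendPrompt throws
    }
    return "rpc";
  }
  sendViaTmux(text);
  return "tmux";
}
```

Because `connect` is optional on the interface, a runtime that omits it needs no stub implementation; the caller's truthiness check is the only branch point.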
- `src/agents/guard-rules.ts` — extracted shared guard constants (`NATIVE_TEAM_TOOLS`, `INTERACTIVE_TOOLS`, `WRITE_TOOLS`, `DANGEROUS_BASH_PATTERNS`, `SAFE_BASH_PREFIXES`) from `hooks-deployer.ts` into a pure data module — single source of truth consumed by both Claude Code hooks and Pi guard extensions
- `transcriptPath` field on `AgentSession` — new nullable column in sessions.db, populated by runtimes that report their transcript location directly instead of relying on `~/.claude/projects/` path inference
- `SessionStore.updateTranscriptPath()` — new method to set transcript path per agent
- `ov log` transcript resolution — now checks `session.transcriptPath` first before falling back to legacy `~/.claude/projects/` heuristic; discovered paths are also written back to the session store for future lookups
- SQLite migration (`migrateAddTranscriptPath`) adds the column to existing databases safely
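The transcript resolution order above (stored path first, legacy heuristic second, write-back on discovery) can be sketched like this. It is an illustration under assumptions: `resolveTranscriptPath` and the two injected callbacks are hypothetical names, and `AgentSession` is reduced to the two fields the sketch needs.

```typescript
// Reduced, hypothetical view of a session row.
interface AgentSession {
  agentName: string;
  transcriptPath: string | null;
}

// Hypothetical helper: the lookup and the store update are injected so the logic stays pure.
function resolveTranscriptPath(
  session: AgentSession,
  legacyLookup: (agentName: string) => string | null,
  updateTranscriptPath: (agentName: string, path: string) => void,
): string | null {
  // 1. Prefer the path the runtime reported directly.
  if (session.transcriptPath !== null) return session.transcriptPath;
  // 2. Fall back to the legacy path heuristic.
  const discovered = legacyLookup(session.agentName);
  // 3. Persist a discovered path so later lookups skip the heuristic.
  if (discovered !== null) updateTranscriptPath(session.agentName, discovered);
  return discovered;
}
```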
- `OverstoryConfig.runtime.printCommand` — new optional config field for routing headless one-shot AI calls (merge resolver, watchdog triage) through a specific runtime adapter, independent of the default interactive runtime
- `src/runtimes/pi.test.ts` — 526-line test suite covering all 7 `AgentRuntime` methods for the Pi adapter
- `src/runtimes/pi-guards.test.ts` — 389-line test suite for Pi guard extension generation across capabilities, path boundaries, and edge cases
- Test suite: 2458 tests across 83 files (6026 `expect()` calls)
- Watchdog completion nudges clarified as informational — `buildCompletionMessage()` now says "Awaiting lead verification" instead of "Ready for merge/cleanup", preventing coordinators from prematurely merging based on watchdog nudges
- Coordinator `PREMATURE_MERGE` anti-pattern strengthened — coordinator.md now explicitly states that watchdog nudges are informational only and that only a typed `merge_ready` mail from the owning lead authorizes a merge
- `transcriptPath: null` added to all `AgentSession` constructions — fixes schema consistency across coordinator, supervisor, monitor, and sling agent creation paths
- `deployHooks()` replaced by `runtime.deployConfig()` — coordinator, supervisor, monitor, and sling now use the runtime abstraction for deploying hooks/guards instead of calling `deployHooks()` directly, enabling Pi (and future runtimes) to deploy their native guard mechanisms
- `merge/resolver.ts` wired through `runtime.buildPrintCommand()` — AI-assisted merge resolution (Tier 3 and Tier 4) now uses the configured runtime for headless calls instead of hardcoding `claude --print`
- `watchdog/triage.ts` wired through `runtime.buildPrintCommand()` — AI-assisted failure triage now uses the configured runtime for headless calls instead of hardcoding `claude --print`
- `writeOverlay()` receives `runtime.instructionPath` — sling now threads the runtime's instruction file path through overlay generation, so beacon and auto-dispatch messages reference the correct file (e.g. `.claude/CLAUDE.md` for Claude, same for Pi)
0.7.0 - 2026-02-25
- `src/runtimes/types.ts` — `AgentRuntime` interface defining the contract for multi-provider agent support: `buildSpawnCommand()`, `buildPrintCommand()`, `deployConfig()`, `detectReady()`, `parseTranscript()`, `buildEnv()`, plus supporting types (`SpawnOpts`, `ReadyState`, `OverlayContent`, `HooksDef`, `TranscriptSummary`)
- `src/runtimes/claude.ts` — `ClaudeRuntime` adapter implementing `AgentRuntime` for Claude Code CLI — delegates to existing subsystems (hooks-deployer, transcript parser) without new behavior
- `src/runtimes/registry.ts` — Runtime registry with `getRuntime()` factory — lookup by name, config default, or hardcoded "claude" fallback
- `docs/runtime-abstraction.md` — Design document covering coupling inventory, phased migration plan, and adapter contract rationale
- `--runtime <name>` flag on `ov sling` — allows per-agent runtime override (defaults to config or "claude")
- `runtime.default` config field — new optional `OverstoryConfig.runtime.default` property for setting the default runtime adapter
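The lookup order described for the runtime registry (explicit name, then config default, then "claude") can be sketched as below. This is an assumption-laden illustration: `lookupRuntime`, the `registry` map, and the `"example"` entry are hypothetical, and the real `getRuntime()` signature may differ.

```typescript
// Hypothetical minimal shapes for illustration only.
interface AgentRuntime {
  id: string;
}

interface RuntimeConfig {
  runtime?: { default?: string };
}

// Hypothetical registry; "example" stands in for any second adapter.
const registry = new Map<string, AgentRuntime>([
  ["claude", { id: "claude" }],
  ["example", { id: "example" }],
]);

function lookupRuntime(name?: string, config?: RuntimeConfig): AgentRuntime {
  // Precedence: explicit name, then config default, then the hardcoded fallback.
  const resolved = name ?? config?.runtime?.default ?? "claude";
  const runtime = registry.get(resolved);
  if (!runtime) throw new Error(`Unknown runtime: ${resolved}`);
  return runtime;
}
```

Under this sketch, a per-agent `--runtime <name>` value would be passed as `name`, so it overrides the config default without mutating the config.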
- `src/runtimes/claude.test.ts` — 616-line test suite for ClaudeRuntime adapter covering all 7 interface methods
- `src/runtimes/registry.test.ts` — Registry tests for name lookup, config default fallback, and unknown runtime errors
- `src/commands/sling.test.ts` — Additional sling tests for runtime integration
- `src/agents/overlay.test.ts` — Tests for parameterized `instructionPath` in `writeOverlay()`
- 2357 tests across 81 files (5857 `expect()` calls)
- `src/commands/sling.ts` — Rewired to use `AgentRuntime.buildSpawnCommand()` and `detectReady()` instead of hardcoded `claude` CLI construction and TUI heuristics
- `src/commands/coordinator.ts` — Rewired to use `AgentRuntime` for spawn command building, env construction, and TUI readiness detection
- `src/commands/supervisor.ts` — Rewired to use `AgentRuntime` for spawn command building and TUI readiness detection
- `src/commands/monitor.ts` — Rewired to use `AgentRuntime` for spawn command building and env construction
- `src/worktree/tmux.ts` — `waitForTuiReady()` now accepts a `detectReady` callback instead of hardcoded Claude Code TUI heuristics, making it runtime-agnostic
- `src/agents/overlay.ts` — `writeOverlay()` now accepts an optional `instructionPath` parameter (default: `.claude/CLAUDE.md`), enabling runtime-specific instruction file paths
- README.md: replaced ASCII ecosystem diagram with os-eco logo image
0.6.12 - 2026-02-25
- `src/logging/theme.ts` — canonical visual theme for CLI output: agent state colors/icons, event type labels (compact + full), agent color palette for multi-agent displays, separator characters, and header/sub-header rendering helpers
- `src/logging/format.ts` — shared formatting utilities: duration formatting (`formatDuration`), absolute/relative/date timestamp formatting, event detail builder (`buildEventDetail`), agent color mapping (`buildAgentColorMap`/`extendAgentColorMap`), status color helpers for merge/priority/log-level
- Dashboard, status, inspect, metrics, run, and costs commands refactored to use shared theme/format primitives — eliminates duplicated color maps, duration formatters, and separator rendering across 6 commands
- Errors, feed, logs, replay, and trace commands refactored to use shared theme/format primitives — eliminates duplicated event label rendering, timestamp formatting, and agent color assignment across 5 commands
- Net code reduction: ~826 lines removed, replaced by ~214+132 lines of shared primitives
- `MulchClient.record()`, `search()`, and `query()` migrated from `Bun.spawn` CLI wrappers to `@os-eco/mulch-cli` programmatic API — eliminates subprocess overhead for high-frequency expertise operations
- `@os-eco/mulch-cli` added as runtime dependency (^0.6.2) — first programmatic API dependency in the ecosystem
- Variable-based dynamic import pattern (`const MULCH_PKG = "..."; import(MULCH_PKG)`) prevents tsc from statically resolving into mulch's raw `.ts` source files
- Local `MulchExpertiseRecord` and `MulchProgrammaticApi` type definitions avoid cross-project `noUncheckedIndexedAccess` conflicts
- `countSessions()` method — returns total session count without the `LIMIT` cap that `getRecentSessions()` applies, fixing accurate session count reporting in metrics views
- `WORKTREE_ISSUE_CREATE` failure mode — prevents leads from running `{{TRACKER_CLI}} create` in worktrees, where issues are lost on cleanup
- Lead workflow updated to mail coordinator for issue creation instead of direct tracker CLI calls — coordinator creates issues on main branch
- Scout/builder/reviewer spawning simplified with `--skip-task-check` — removes the pattern of creating separate tracker issues for each sub-agent
- `{{TRACKER_CLI}} create` removed from lead capabilities list
- Test suite grew from 2283 to 2288 tests across 79 files (5744 expect() calls)
- 12 observability commands consolidated onto shared `theme.ts` + `format.ts` primitives — reduces per-command boilerplate and ensures visual consistency across all CLI output
- `@types/js-yaml` added as dev dependency (^4.0.9)
- Static imports of `theme.ts`/`format.ts` replaced with variable-based dynamic pattern to fix typecheck errors when tsc follows into mulch's raw `.ts` source files
- `getRecentSessions()` limit cap no longer affects session count reporting — dedicated `countSessions()` method provides uncapped counts
0.6.11 - 2026-02-25
- `agents.maxAgentsPerLead` config (default: 5) — limits how many active children a single lead agent can spawn; set to 0 for unlimited
- `--max-agents <n>` flag on `ov sling` — CLI override for the per-lead ceiling when spawning under a parent
- `checkParentAgentLimit()` — pure-function guard that counts active children per parent and blocks spawns at the limit
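A pure-function guard of the kind described above might look like the following sketch. The names `parentLimitCheck` and `ChildSession`, the set of states, and the choice of which states count as "active" (`booting` and `working`) are assumptions made for illustration; only the "count active children, block at the limit, 0 means unlimited" behavior comes from the entries above.

```typescript
// Hypothetical, reduced session shape.
type AgentState = "booting" | "working" | "completed" | "zombie";

interface ChildSession {
  parentAgent: string | null;
  state: AgentState;
}

// Hypothetical guard: no I/O, so it can be unit-tested with plain arrays.
function parentLimitCheck(
  sessions: ChildSession[],
  parent: string,
  maxAgentsPerLead: number,
): { allowed: boolean; activeChildren: number } {
  // Assumption: only booting and working children count against the ceiling.
  const activeChildren = sessions.filter(
    (s) => s.parentAgent === parent && (s.state === "booting" || s.state === "working"),
  ).length;
  // A limit of 0 disables the ceiling entirely.
  const allowed = maxAgentsPerLead === 0 || activeChildren < maxAgentsPerLead;
  return { allowed, activeChildren };
}
```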
- `--skip-review` flag on `ov sling` — instructs a lead agent to skip Phase 3 review and self-verify instead (reads builder diff + runs quality gates)
- `--dispatch-max-agents <n>` flag on `ov sling` — per-lead agent ceiling override injected into the overlay so the lead knows its budget
- `formatDispatchOverrides()` in overlay system — generates a `## Dispatch Overrides` section in lead overlays when `skipReview` or `maxAgentsOverride` are set
- `dispatch-overrides` section in `agents/lead.md` — documents the override protocol so leads know to check their overlay before following the default three-phase workflow
- `DispatchPayload` extended with `skipScouts`, `skipReview`, and `maxAgents` optional fields
- `checkDuplicateLead()` — prevents two lead agents from concurrently working the same task ID, avoiding the duplicate work stream anti-pattern (overstory-gktc postmortem)
- `shouldAutoNudge()` and `isDispatchNudge()` exported from mail.ts for testability — previously inlined logic now unit-testable
- `AUTO_NUDGE_TYPES` exported as `ReadonlySet` for direct test assertions
- `sling.test.ts` — expanded (201 lines added) covering `checkDuplicateLead`, `checkParentAgentLimit`, per-lead budget ceiling enforcement, and dispatch override validation
- `overlay.test.ts` — expanded (236 lines added) covering `formatDispatchOverrides`, skip-review overlay, max-agents overlay, and combined overrides
- `mail.test.ts` — expanded (64 lines added) covering `shouldAutoNudge`, `isDispatchNudge`, and dispatch nudge behavior
- `hooks-deployer.test.ts` — new test file (105 lines) covering hooks deployment and configurable safe prefix extraction
- `config.test.ts` — expanded (22 lines added) covering `maxAgentsPerLead` validation
- Terminology normalization — replaced "beads" with "task" throughout CLI copy and generic code: `checkBeadLock` → `checkTaskLock`, `{{BEAD_ID}}` → `{{TASK_ID}}` in overlay template, error messages updated ("Bead is already being worked" → "Task is already being worked")
- README unified to canonical os-eco template — shortened, restructured with table-based CLI reference, consistent badge style
- `agents/lead.md` — added `dispatch-overrides` section documenting SKIP REVIEW and MAX AGENTS override protocol
- Default tracker name changed from `"beads"` to `"seeds"` in overlay fallback
- `ov trace` description — changed from "agent/bead" to "agent or task" for consistency with terminology normalization
- 2283 tests across 79 files (5749 `expect()` calls)
0.6.10 - 2026-02-25
- `ov ecosystem` — dashboard showing all installed os-eco tools (overstory, mulch, seeds, canopy) with version info, update status (current vs latest from npm), and overstory doctor health summary; supports `--json` output
- `ov upgrade` — upgrade overstory (or all ecosystem tools with `--all`) to their latest npm versions via `bun install -g`; `--check` flag compares versions without installing; supports `--json` output
- `--fix` flag — auto-fix capability for doctor checks; fixable checks now include repair closures that are executed when `--fix` is passed, with human-readable action summaries
- Fix closures added to all check modules — structure, databases, merge-queue, and ecosystem checks now return fix functions that can recreate missing directories, reinitialize databases, and reinstall tools
- `ecosystem` check category — new 10th doctor category validating that os-eco CLI tools (ml, sd, cn) are on PATH and report valid semver versions; fix closures reinstall via `bun install -g`
- `--timing` flag — prints command execution time to stderr after any command completes (e.g., `Done in 42ms`)
- Quality gate placeholders in agent prompts — agent base definitions (builder, merger, reviewer, lead) now use `{{QUALITY_GATE_*}}` placeholders instead of hardcoded `bun test`/`bun run lint`/`bun run typecheck` commands, driven by `project.qualityGates` config
- 4 quality gate formatter functions — `formatQualityGatesInline`, `formatQualityGateSteps`, `formatQualityGateBash`, `formatQualityGateCapabilities` added to overlay system for flexible placeholder resolution
- Configurable safe command prefixes — `SAFE_BASH_PREFIXES` in hooks-deployer now dynamically extracted from quality gate config via `extractQualityGatePrefixes()`, replacing hardcoded `bun test`/`bun run lint`/`bun run typecheck` entries
- Config-driven hooks deployment — `sling.ts` now passes `config.project.qualityGates` through to `deployHooks()` so non-implementation agents can run project-specific quality gate commands
- `ecosystem.test.ts` — new test file (307 lines) covering ecosystem command output, JSON mode, and tool detection
- `upgrade.test.ts` — new test file (46 lines) covering upgrade command registration and option parsing
- `databases.test.ts` — new test file (38 lines) covering database health check fix closures
- `merge-queue.test.ts` — new test file (98 lines) covering merge queue health check and fix closures
- `structure.test.ts` — expanded (131 lines added) covering structure check fix closures for missing directories
- `overlay.test.ts` — expanded (157 lines added) covering quality gate formatters and placeholder resolution
- `hooks-deployer.test.ts` — expanded (52 lines added) covering configurable safe prefix extraction
- Agent base definitions updated — builder, merger, reviewer, and lead `.md` files now use `{{QUALITY_GATE_*}}` template placeholders instead of hardcoded bun commands
- `DEFAULT_QUALITY_GATES` consolidated — removed duplicate definition from `overlay.ts`, now imported from `config.ts` as single source of truth
- `DoctorCheck.fix` return type — changed from `void` to `string[]` so fix closures can report what actions were taken
- Feed follow-mode `--json` output — now uses `jsonOutput` envelope instead of raw `JSON.stringify`
- `--timing` preAction — correctly reads `opts.timing` from global options instead of hardcoded check
- `process.exit(1)` in completions.ts — replaced with `process.exitCode = 1; return` to avoid abrupt process termination
- 2241 tests across 79 files (5694 `expect()` calls)
0.6.9 - 2026-02-25
- `--yes` / `-y` flag — skip interactive confirmation prompts for scripted/automated initialization (contributed by @lucabarak via PR #37)
- `--name <name>` flag — explicitly set the project name instead of auto-detecting from git remote or directory name
- JSON envelope applied to all remaining commands — four batches (A, B, C, D) migrated every `--json` code path to use the `jsonOutput()`/`jsonError()` envelope format (`{ success, command, ...data }`), completing the ecosystem-wide standardization started in 0.6.8
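For readers unfamiliar with the envelope shape, here is a minimal sketch of builders that produce `{ success, command, ...data }`. The function names `buildEnvelope` and `buildErrorEnvelope` and the `error` field on the failure case are assumptions for illustration; the actual `jsonOutput()`/`jsonError()` helpers may print directly and carry different error fields.

```typescript
type Envelope = { success: boolean; command: string } & Record<string, unknown>;

// Hypothetical builder for the success case: command-specific data is spread after the fixed keys.
function buildEnvelope(command: string, data: Record<string, unknown>): Envelope {
  return { success: true, command, ...data };
}

// Hypothetical builder for the failure case; the `error` key is an assumed field name.
function buildErrorEnvelope(command: string, message: string): Envelope {
  return { success: false, command, error: message };
}

// Example: serialize a status payload the way a --json code path might.
const statusJson = JSON.stringify(buildEnvelope("status", { agents: 2 }));
```

A consumer can then branch on `success` and `command` before touching any command-specific fields.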
- `accent()` applied to IDs in human-readable output — agent names, mail IDs, group IDs, run IDs, and task IDs now render with accent color formatting across status, dashboard, inspect, agents, mail, merge, group, run, trace, and errors commands
- `hooks-deployer.test.ts` — new test file (180 lines) covering hooks deployment to worktrees
- `init.test.ts` — new test file (104 lines) covering `--yes` and `--name` flag behavior
- Completions, prime, and watch commands migrated to print helpers — remaining commands that used raw `console.log`/`console.error` now use `printSuccess`/`printWarning`/`printError`/`printHint` for consistent output formatting
- PATH prefix for hook commands — deployed hooks now include `~/.bun/bin` in the PATH prefix, fixing resolution failures when bun-installed CLIs (like `ov` itself) weren't found by hook subprocesses
- Reinit messaging for `--yes` flag — corrected output messages when re-initializing an existing `.overstory/` directory with the `--yes` flag
- 2186 tests across 77 files (5535 `expect()` calls)
0.6.8 - 2026-02-25
- `jsonOutput()` / `jsonError()` helpers (`src/json.ts`) — standard JSON envelope format (`{ success, command, ...data }`) matching the ecosystem convention used by mulch, seeds, and canopy
- `printSuccess()` / `printWarning()` / `printError()` / `printHint()` helpers (`src/logging/color.ts`) — branded message formatters with consistent color/icon treatment (brand checkmark, yellow `!`, red cross, dim indent)
- Custom branded help screen — `ov --help` now shows a styled layout with colored command names, dim arguments, and version header instead of Commander.js defaults
- `--version --json` flag — `ov -v --json` outputs machine-readable JSON (`{ name, version, runtime, platform }`)
- Unknown command fuzzy matching — typos like `ov stauts` now suggest the closest match via Levenshtein edit distance ("Did you mean 'status'?")
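The fuzzy matching above relies on Levenshtein edit distance. A compact sketch of the standard dynamic-programming algorithm and a suggestion helper follows; `suggestCommand` and its `maxDistance` cutoff of 2 are assumptions for illustration, not the CLI's actual threshold.

```typescript
// Classic two-row Levenshtein: insertions, deletions, and substitutions each cost 1.
function levenshtein(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr.push(Math.min(prev[j]! + 1, curr[j - 1]! + 1, prev[j - 1]! + cost));
    }
    prev = curr;
  }
  return prev[b.length]!;
}

// Hypothetical helper: return the closest known command, or null if nothing is close enough.
function suggestCommand(input: string, commands: string[], maxDistance = 2): string | null {
  let best: string | null = null;
  let bestDistance = maxDistance + 1;
  for (const candidate of commands) {
    const d = levenshtein(input, candidate);
    if (d < bestDistance) {
      best = candidate;
      bestDistance = d;
    }
  }
  return best;
}
```

Note that a swapped pair such as `stauts` vs `status` costs 2 under plain Levenshtein (two substitutions), so any cutoff below 2 would miss the example typo.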
- Auto-confirm workspace trust dialog — `waitForTuiReady` now detects "trust this folder" prompts and sends Enter automatically, preventing agents from stalling on first-time workspace access
- All 30 commands migrated to message helpers — three batches (A, B, C) updated every command to use `printSuccess`/`printWarning`/`printError`/`printHint` instead of ad-hoc `console.log`/`console.error` calls, ensuring uniform output style
- Global error handler uses `jsonError()` — top-level catch in `index.ts` now outputs structured JSON envelopes when `--json` is passed, instead of raw `console.error`
- Two-phase readiness check — `waitForTuiReady` now requires both a prompt indicator (`❯` or `Try "`) AND status bar text (`bypass permissions` or `shift+tab`) before declaring the TUI ready, preventing premature beacon submission
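The two-phase condition can be expressed as a small predicate over captured pane text. This is a sketch: `isTuiReady` is a hypothetical name, and plain substring matching is an assumption about how the markers are detected.

```typescript
// Hypothetical predicate: both phases must match before the TUI counts as ready.
function isTuiReady(paneContent: string): boolean {
  // Phase 1: a prompt indicator is visible.
  const hasPrompt = paneContent.includes("❯") || paneContent.includes('Try "');
  // Phase 2: the status bar has rendered.
  const hasStatusBar =
    paneContent.includes("bypass permissions") || paneContent.includes("shift+tab");
  return hasPrompt && hasStatusBar;
}
```

Requiring both markers means a pane that has drawn the prompt but not yet the status bar is still treated as booting.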
- Slash-command prompts moved to `.claude/commands/` — `issue-reviews.md`, `pr-reviews.md`, `prioritize.md`, and `release.md` removed from `agents/` directory (they are skill definitions, not agent base definitions)
- Agent definition wording updates — minor reference fixes across coordinator, lead, merger, reviewer, scout, and supervisor base definitions
- `color.test.ts` mocking — tests now mock `process.stdout.write`/`process.stderr.write` instead of `console.log`/`console.error` to match actual implementation
- mulch client test updated for auto-create domain behavior
- `mulch` → `ml` alias in tests — test files migrated to use the `ml` short alias consistently
- 2167 tests across 77 files (5465 `expect()` calls)
0.6.7 - 2026-02-25
- Replace `--dangerously-skip-permissions` with `--permission-mode bypassPermissions` across all agent spawn paths (coordinator, supervisor, sling, monitor) — adapts to updated Claude Code CLI flag naming
- Remove remaining emoji from `ov status` output — section headers (Agents, Worktrees, Mail, Merge queue, Sessions recorded) and deprecation warning now use plain text; alive markers use colored `>`/`x` instead of `●`/`○`
- Increase TUI readiness timeout from 15s to 30s — `waitForTuiReady` now waits longer for Claude Code TUI to initialize, reducing false-negative timeouts on slower machines
- Smarter TUI readiness detection — `waitForTuiReady` now checks for actual TUI markers (`❯` prompt or `Try "` text) instead of any pane content, preventing premature readiness signals
- Extend follow-up Enter delays — beacon submission retries expanded from `[1s, 2s]` to `[1s, 2s, 3s, 5s]` in sling, coordinator, and supervisor, improving reliability when Claude Code TUI initializes slowly
- 2151 tests across 76 files (5424 `expect()` calls)
0.6.6 - 2026-02-24
- `overstory` → `ov` across all CLI-facing text — every user-facing string, error message, help text, and command comment across all `src/commands/*.ts` files now references `ov` instead of `overstory`
- `mulch` → `ml` in agent definitions and overlay — all 8 base agent definitions (`agents/*.md`), overlay template (`templates/overlay.md.tmpl`), and overlay generator (`src/agents/overlay.ts`) updated to use the `ml` short alias
- Templates and hooks updated — `templates/CLAUDE.md.tmpl`, `templates/hooks.json.tmpl`, and deployed agent defs all reference `ov`/`ml` aliases
- Canopy prompts re-emitted — all canopy-managed prompts regenerated with alias-aware content
- Status icons replaced with ASCII Set D — dashboard, status, and sling output now use `>` (working), `-` (booting), `!` (stalled), `x` (zombie/completed), `?` (unknown) instead of Unicode circles and checkmarks
- All emoji removed from CLI output — warning prefixes, launch messages, and status indicators no longer use emoji characters, improving compatibility with terminals that lack Unicode support
- Auto-dispatch mail before tmux session — `buildAutoDispatch()` sends dispatch mail to the agent's mailbox before creating the tmux session, eliminating the race where coordinator dispatch arrives after the agent boots and sits idle
- Beacon verification loop — after beacon send, sling polls the tmux pane up to 5 times (2s intervals) to detect if the agent is still on the welcome screen; if so, resends the beacon automatically (fixes overstory-3271)
- `capturePaneContent()` exported from tmux.ts — new helper for reading tmux pane text, used by beacon verification
- `detectOverstoryBinDir()` tries both `ov` and `overstory` — loops through both command names when resolving the binary directory, ensuring compatibility regardless of how the tool was installed
- `/release` skill — prepares releases by analyzing changes, bumping versions, updating CHANGELOG/README/CLAUDE.md
- `/issue-reviews` skill — reviews GitHub issues from within Claude Code
- `/pr-reviews` skill — reviews GitHub pull requests from within Claude Code
- Test suite: 2151 tests across 76 files (5424 expect() calls)
- Mail dispatch race for newly slung agents — dispatch mail is now written to SQLite before tmux session creation, ensuring it exists when the agent's SessionStart hook fires
- `ov mail check` `process.exit(1)` replaced with `process.exitCode = 1` — CLI entry point no longer calls `process.exit()` directly, allowing Bun to clean up gracefully (async handlers, open file descriptors)
- Remaining `beadId` → `taskId` references — completed rename in `trace.ts`, `trace.test.ts`, `spec.ts`, `worktree.test.ts`, and canopy prompts for coordinator/supervisor
- Post-merge quality gate failures — fixed lint and type errors introduced during multi-agent merge sessions
- Mail test assertions — updated to match lowercase Warning/Note output after emoji removal
0.6.5 - 2026-02-24
- `preserveSeedsChanges()` in worktree manager — extracts `.seeds/` diffs from lead agent branches and applies them to the canonical branch via patch before worktree cleanup, preventing loss of issue files created by leads whose branches are never merged through the normal merge pipeline
- Integrated into `overstory worktree clean` — automatically preserves seeds changes before removing completed worktrees
- `resolveConflictsUnion()` in merge resolver — new auto-resolve strategy for files with `merge=union` gitattribute that keeps all lines from both sides (canonical + incoming), relying on dedup-on-read to handle duplicates
- `checkMergeUnion()` helper — queries `git check-attr merge` to detect union merge strategy per file
- Auto-resolve tier now checks gitattributes before choosing between keep-incoming and union resolution strategies
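To illustrate what "keeps all lines from both sides" means for a conflicted file, here is a minimal sketch that strips conflict markers and retains both hunks in order. It is not the project's `resolveConflictsUnion()`: the name `resolveUnionSketch` is hypothetical, and the code assumes standard two-way conflict markers (no diff3 `|||||||` base section).

```typescript
// Hypothetical union resolver: drop conflict markers, keep canonical lines then incoming lines.
function resolveUnionSketch(conflicted: string): string {
  const out: string[] = [];
  let inConflict = false;
  for (const line of conflicted.split("\n")) {
    if (!inConflict && line.startsWith("<<<<<<< ")) {
      inConflict = true; // start of the canonical side
      continue;
    }
    if (inConflict && line === "=======") {
      continue; // separator between canonical and incoming sides
    }
    if (inConflict && line.startsWith(">>>>>>> ")) {
      inConflict = false; // end of the incoming side
      continue;
    }
    out.push(line); // content from either side, or unconflicted context
  }
  return out.join("\n");
}
```

Because both sides are kept verbatim, a record added on both branches appears twice, which is why the entry above relies on dedup-on-read.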
- `ensureTmuxAvailable()` preflight in sling command — verifies tmux is available before attempting session creation, providing a clear error instead of cryptic spawn failures
- Test suite: 2145 tests across 76 files (5410 expect() calls)
- `beadId` → `taskId` rename across all TypeScript source — comprehensive rename of the `beadId` field to `taskId` in all source files, types, interfaces, and tests, completing the tracker abstraction naming migration started in v0.6.0
- `gatherStatus()` uses `evaluateHealth()` — status command now applies the full health evaluation from the watchdog module for agent state reconciliation, matching dashboard and watchdog behavior (handles tmux-dead→zombie, persistent capability booting→working, and time-based stale/zombie detection)
- Single quote escaping in blockGuard shell commands — fixed shell escaping in blockGuard patterns that could cause guard failures when arguments contained single quotes
- Dashboard version from package.json — dashboard now reads version dynamically from `package.json` instead of a hardcoded value
- Seeds config project name — renamed project from "seeds" to "overstory" in `.seeds/config.yaml` and fixed 71 misnamed issue IDs
0.6.4 - 2026-02-24
- Full CLI migration to Commander.js — all 30+ commands migrated from custom `args` array parsing to Commander.js with typed options, subcommand hierarchy, and auto-generated `--help`; migration completed in 6 incremental commits covering core workflow, nudge, mail, observability, infrastructure, and final cleanup
- Shell completions via Commander — `createCompletionsCommand()` now uses Commander's built-in completion infrastructure
- Chalk-based color module — `src/logging/color.ts` rewritten from custom ANSI escape code strings to Chalk v5 wrapper functions with native `NO_COLOR`/`FORCE_COLOR`/`TERM=dumb` support
- Brand palette — three named brand colors exported: `brand` (forest green), `accent` (amber), `muted` (stone gray) via `chalk.rgb()`
- Chainable color API — `color.bold`, `color.dim`, `color.red`, etc. now delegate to Chalk for composable styling
- Merge queue SQL schema consistency tests added
- Test suite: 2128 tests across 76 files (5360 expect() calls)
- Runtime dependencies — chalk v5 added as first runtime dependency (previously zero runtime deps); chalk is ESM-only and handles color detection natively
- CLI parsing — all commands converted from manual `args` array indexing to Commander.js `.option()`/`.argument()` declarations with automatic type coercion and validation
- Color module API — `color` export changed from a record of ANSI string constants to a record of Chalk wrapper functions; consumers call `color.red("text")` (function) instead of `${color.red}text${color.reset}` (string interpolation)
- `noColor` identity function — replaces the old `color.white` default for cases where no coloring is needed
- Merge queue migration — added missing `bead_id` → `task_id` column migration for `merge-queue.db`, aligning with the schema migration already applied to sessions.db, events.db, and metrics.db in v0.6.0
- npm publish auth — fixed authentication issues in publish workflow and cleaned up post-merge artifacts from Commander migration
- Commander direct parse — fixed 6 command wrapper functions that incorrectly delegated to Commander instead of using direct `.action()` pattern (metrics, replay, status, trace, supervisor, and others)
0.6.3 - 2026-02-24
- PreToolUse guards block interactive tools — `AskUserQuestion`, `EnterPlanMode`, and `EnterWorktree` are now blocked for all overstory agents via hooks-deployer, preventing indefinite hangs in non-interactive tmux sessions; agents must use `overstory mail --type question` to escalate instead
- Expanded `overstory doctor` dependency checks — now validates all ecosystem CLIs (overstory, mulch, seeds, canopy) with alias availability checks (`ov`, `ml`) and install hints (`npm install -g @os-eco/<pkg>`)
- Short alias detection: when a primary tool passes, doctor also checks if its short alias (e.g., `ov` for `overstory`, `ml` for `mulch`) is available, with actionable fix hints
- `ov` short alias — `overstory` CLI is now also available as `ov` via `package.json` bin entry
- `/prioritize` skill — new Claude Code command that analyzes open GitHub Issues and Seeds issues, cross-references with codebase health, and recommends the top ~5 issues to tackle next
- Skill headers — all Claude Code slash commands now include descriptive headers for better discoverability
- Publish workflow — replaced `auto-tag.yml` with `publish.yml` that runs quality gates, checks version against npm, publishes with provenance, creates git tags and GitHub releases automatically
- `SessionStore.count()` — lightweight `SELECT COUNT(*)` method replacing `getAll().length` pattern in `openSessionStore()` existence checks
- Test suite grew from 2090 to 2137 tests across 76 files (5370 expect() calls)
- SQL schema consistency tests for all four SQLite stores (sessions.db, mail.db, events.db, metrics.db)
- Provider config and model resolution edge case tests
- Sling provider environment variable injection building block tests
- Tmux dead session detection in `waitForTuiReady()` — now checks `isSessionAlive()` on each poll iteration and returns early if the session died, preventing 15-second timeout waits on already-dead sessions
- `ensureTmuxAvailable()` guard — new pre-flight check throws a clear `AgentError` when tmux is not installed, replacing cryptic spawn failures
- `package.json` files array — reformatted for Biome compatibility
- CI workflow: `auto-tag.yml` replaced by `publish.yml` with npm publish, provenance, and GitHub release creation
- Config field references updated: `beads` → `taskTracker` in remaining locations
0.6.2 - 2026-02-24
- `--skip-task-check` flag for `overstory sling` — skips task existence validation and issue claiming, designed for leads spawning builders with worktree-created issues that don't exist in the canonical tracker yet
- Bead lock parent bypass — parent agent can now delegate its own task ID to a child without triggering the concurrent-work lock (sling allows spawn when the lock holder matches `--parent`)
- Lead agent `--skip-task-check` added to default sling template in `agents/lead.md`
- Leads now use `overstory spec write <id> --body "..." --agent $OVERSTORY_AGENT_NAME` instead of Write/Edit tools for creating spec files — enforces read-only tool posture while still enabling spec creation
- Test suite grew from 2087 to 2090 tests across 75 files (5137 expect() calls)
- Dashboard health evaluation — dashboard now applies the full `evaluateHealth()` function from the watchdog module instead of only checking tmux liveness; correctly transitions persistent capabilities (coordinator, monitor) from `booting` → `working` when tmux is alive, and detects stale/zombie states using configured thresholds
- Default tracker resolution to seeds — `resolveBackend()` now falls back to `"seeds"` when no tracker directory exists (previously defaulted to `"beads"`)
- Coordinator beacon uses `resolveBackend()` — properly resolves `"auto"` backend instead of a simple conditional that didn't handle auto-detection
- Doctor dependency checks use `resolveBackend()` — properly resolves `"auto"` backend for tracker CLI availability checks instead of assuming beads
- Hardcoded 'orchestrator' replaced with 'coordinator' — overlay template default parent address, agent definitions (builder, merger, monitor, scout), and test assertions all updated to use `coordinator` as the default parent/mail recipient
- Lead agent definition: Write/Edit tools removed from capabilities, replaced with `overstory spec write` CLI command
- Agent definitions (builder, merger, monitor, scout) updated to reference "coordinator" instead of "orchestrator" in mail examples and constraints
0.6.1 - 2026-02-23
- All 8 agent definitions (`agents/*.md`) restructured for Canopy prompt composition — behavioral sections (`propulsion-principle`, `cost-awareness`, `failure-modes`, `overlay`, `constraints`, `communication-protocol`, `completion-protocol`) moved to the top of each file with kebab-case headers, core content sections (`intro`, `role`, `capabilities`, `workflow`) placed after
- Section headers converted from Title Case (`## Role`) to kebab-case (`## role`) across all agent definitions for Canopy schema compatibility
- `deployHooks()` now preserves existing `settings.local.json` content when deploying hooks — merges with non-hooks keys (permissions, env, `$schema`, etc.) instead of overwriting the entire file
- `isOverstoryHookEntry()` exported for detecting overstory-managed hook entries — enables stripping stale overstory hooks while preserving user-defined hooks
- Overstory hooks placed before user hooks per event type so security guards always run first
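The merge behavior described above (keep non-hooks keys, strip stale managed entries, put managed hooks first per event type) can be sketched as follows. Everything here is illustrative: the `HookEntry` shape, the `mergeSettingsSketch` and `isManagedEntry` names, and the rule that a managed entry is one whose command starts with `overstory ` are assumptions, not the detection logic of the real `isOverstoryHookEntry()`.

```typescript
// Hypothetical, simplified settings shapes.
interface HookEntry {
  command: string;
}
type Hooks = Record<string, HookEntry[]>;
interface Settings {
  hooks?: Hooks;
  [key: string]: unknown;
}

// Assumed marker for managed entries; the real detection rule is not shown in this changelog.
function isManagedEntry(entry: HookEntry): boolean {
  return entry.command.startsWith("overstory ");
}

function mergeSettingsSketch(existing: Settings, managedHooks: Hooks): Settings {
  const userHooks: Hooks = existing.hooks ?? {};
  const events = Array.from(
    new Set(Object.keys(userHooks).concat(Object.keys(managedHooks))),
  );
  const merged: Hooks = {};
  for (const event of events) {
    // Drop stale managed entries, keep user-defined ones.
    const kept = (userHooks[event] ?? []).filter((e) => !isManagedEntry(e));
    // Managed hooks go first so guards run before user hooks.
    merged[event] = (managedHooks[event] ?? []).concat(kept);
  }
  // Spread the existing settings so permissions, env, $schema, etc. survive.
  return { ...existing, hooks: merged };
}
```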
- Test suite grew from 2075 to 2087 tests across 75 files (5150 expect() calls)
- Dogfooding tracker migrated from beads to seeds — `.beads/` directory removed, `.seeds/` directory added with all issues migrated
- Biome ignore pattern updated: `.beads/` → `.seeds/`
- `deployHooks()` no longer overwrites existing `settings.local.json` — previously deploying hooks for coordinator/supervisor/monitor agents at the project root would destroy any existing settings (permissions, user hooks, env vars)
0.6.0 - 2026-02-23
- `src/tracker/` module — pluggable task tracker backend system replacing the hardcoded beads dependency
- `TrackerClient` interface with unified API: `ready()`, `show()`, `create()`, `claim()`, `close()`, `list()`, `sync()`
- `TrackerIssue` type for backend-agnostic issue representation
- `createTrackerClient()` factory function dispatching to concrete backends
- `resolveBackend()` auto-detection — probes `.seeds/` then `.beads/` directories when configured as `"auto"`
- `trackerCliName()` helper returning `"sd"` or `"bd"` based on resolved backend
- Beads adapter (`src/tracker/beads.ts`) — wraps `bd` CLI with `--json` parsing
- Seeds adapter (`src/tracker/seeds.ts`) — wraps `sd` CLI with `--json` parsing
- Factory tests (`src/tracker/factory.test.ts`) — 80 lines covering resolution and client creation
QualityGatetype ({ name, command, description }) intypes.ts— replaces hardcodedbun test && bun run lint && bun run typecheckproject.qualityGatesconfig field — projects can now define custom quality gate commands inconfig.yamlDEFAULT_QUALITY_GATESconstant inconfig.ts— preserves the default 3-gate pipeline (Tests, Lint, Typecheck)- Quality gate validation in
validateConfig()— ensures each gate has non-emptyname,command, anddescription - Overlay template renders configured gates dynamically instead of hardcoded commands
OverlayConfig.qualityGatesfield threads gates from config through to agent overlays
taskTracker: { backend, enabled }config field replaces legacybeads:andseeds:sections- Automatic migration:
beads: { enabled: true }→taskTracker: { backend: "beads", enabled: true }(and same forseeds:) TaskTrackerBackendtype:"auto" | "beads" | "seeds"with"auto"as default- Deprecation warnings emitted when legacy config keys are detected
TRACKER_CLIandTRACKER_NAMEtemplate variables in overlay.ts — agent defs no longer hardcodebd/beads- All 8 agent definitions (
agents/*.md) updated:bd→TRACKER_CLI,beads→TRACKER_NAME - Coordinator beacon updated with tracker-aware context
- Hooks-deployer safe prefixes updated for tracker CLI commands
mergeHooksByEventType()—overstory hooks install --forcenow merges hooks per event type with deduplication instead of wholesale replacement, preserving user-added hooks
- Test suite grew from 2026 to 2075 tests across 75 files (5128 expect() calls)
- beads → taskTracker config: `config.beads` renamed to `config.taskTracker` with backward-compatible migration
- bead_id → task_id: Column renamed across all SQLite schemas (metrics.db, merge-queue.db, sessions.db, events.db) with automatic migration for existing databases
- `group.ts` and `supervisor.ts` now use tracker abstraction instead of direct beads client calls
- `sling.ts` uses `resolveBackend()` and `trackerCliName()` from factory module
- Doctor dependency checks updated to detect the active tracker CLI (`bd` or `sd`)
- `overstory hooks install --force` now merges hooks by event type instead of replacing the entire settings file — preserves non-overstory hooks
- `detectCanonicalBranch()` now accepts any branch name (removed restrictive regex)
- `bead_id` → `task_id` SQLite column migration for existing databases (metrics, merge-queue, sessions, events)
- `config.seeds` → `config.taskTracker` bootstrap path in `sling.ts`
- `group.ts` and `supervisor.ts` now use `resolveBackend()` for proper tracker resolution instead of hardcoded backend
- Seeds adapter validates envelope `success` field before unwrapping response data
- Hooks tests use literal keys instead of string indexing for `noUncheckedIndexedAccess` compliance
- Removed old `src/beads/` directory (replaced by `src/tracker/`)
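The legacy-config migration noted in this release (`beads:` / `seeds:` → `taskTracker:`) might look something like the sketch below. The `RawConfig` shape and the `migrateTracker` helper are hypothetical, and several behaviors are assumptions of the sketch rather than documented facts: an explicit `taskTracker` section wins over legacy keys, `beads` is checked before `seeds`, and a missing section yields `enabled: true`.

```typescript
type TaskTrackerBackend = "auto" | "beads" | "seeds";
type TaskTracker = { backend: TaskTrackerBackend; enabled: boolean };

// Hypothetical raw config shape: may still contain legacy sections.
type RawConfig = {
  taskTracker?: TaskTracker;
  beads?: { enabled: boolean };
  seeds?: { enabled: boolean };
};

// Returns the effective tracker setting plus any deprecation warnings.
function migrateTracker(raw: RawConfig): { taskTracker: TaskTracker; warnings: string[] } {
  const warnings: string[] = [];
  if (raw.beads) warnings.push("config key 'beads' is deprecated; use 'taskTracker'");
  if (raw.seeds) warnings.push("config key 'seeds' is deprecated; use 'taskTracker'");
  // Assumption: an explicit taskTracker section takes precedence over legacy keys.
  if (raw.taskTracker) return { taskTracker: raw.taskTracker, warnings };
  if (raw.beads) return { taskTracker: { backend: "beads", enabled: raw.beads.enabled }, warnings };
  if (raw.seeds) return { taskTracker: { backend: "seeds", enabled: raw.seeds.enabled }, warnings };
  // Nothing configured: use "auto", the documented default backend.
  return { taskTracker: { backend: "auto", enabled: true }, warnings };
}
```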
0.5.9 - 2026-02-21
- `overstory stop <agent-name>` — explicitly terminate a running agent by killing its tmux session, marking the session as completed in SessionStore, with optional `--clean-worktree` to remove the agent's worktree (17 tests, DI pattern via `StopDeps`)
- Bead lock — `checkBeadLock()` pure function prevents concurrent agents from working the same bead ID, enforced in `slingCommand` before spawning
- Run session cap — `checkRunSessionLimit()` pure function with `maxSessionsPerRun` config field (default 0 = unlimited), enforced in `slingCommand` to limit concurrent agents per run
- `--skip-scout` flag — passes through to overlay via `OverlayConfig.skipScout`, renders `SKIP_SCOUT_SECTION` in template for lead agents that want to skip scout phase
- Complexity-tiered pipeline in lead agent definition — leads now assess task complexity (simple/moderate/complex) before deciding whether to spawn scouts, builders, and reviewers
- Scouts made optional for simple/moderate tasks (SHOULD vs MUST)
- Reviewers made optional with self-verification path for simple/moderate tasks
- `SCOUT_SKIP` and `REVIEW_SKIP` failure modes softened to warnings
- Scout and reviewer agents simplified: replaced `INSIGHT:` protocol with plain notable findings
- Test suite grew from 1996 to 2026 tests across 74 files (5023 expect() calls)
- Lead agent role reframed to reflect that leads can be doers for simple tasks, not just delegators
- Lead propulsion principle updated to assess complexity before acting
- Lead cost awareness section no longer mandates reviewers
- Biome formatting in `stop.test.ts` (pre-existing lint issue)
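As an illustration of the run session cap described above, a pure limit check could be written along these lines. The function name `isUnderSessionCap` and its parameters are hypothetical; the only details taken from the changelog are that the check is a pure function and that a `maxSessionsPerRun` of 0 means unlimited.

```typescript
// Illustrative pure check: no I/O, so it can be unit-tested without a session store.
function isUnderSessionCap(
  activeSessionsInRun: number,
  maxSessionsPerRun: number,
): { allowed: boolean; reason?: string } {
  // 0 means "no cap".
  if (maxSessionsPerRun === 0) return { allowed: true };
  if (activeSessionsInRun >= maxSessionsPerRun) {
    return {
      allowed: false,
      reason: `run already has ${activeSessionsInRun} of ${maxSessionsPerRun} allowed sessions`,
    };
  }
  return { allowed: true };
}
```

Keeping the decision separate from the code that counts sessions is what makes this kind of guard easy to test in isolation.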
0.5.8 - 2026-02-20
- `ResolvedModel` type and provider gateway support in `resolveModel()` — resolves `ModelRef` strings (e.g., `openrouter/openai/gpt-5.3`) through configured provider gateways with `baseUrl` and `authTokenEnv`
- Provider and model validation in `validateConfig()` — validates provider types (`native`/`gateway`), required gateway fields (`baseUrl`), and model reference format at config load time
- Provider environment variables now threaded through all agent spawn commands (`sling`, `coordinator`, `supervisor`, `monitor`) — gateway `authTokenEnv` values are passed to spawned agent processes
- Auto-infer mulch domains from file scope in `overstory sling` — `inferDomainsFromFiles()` maps file paths to domains (e.g., `src/commands/*.ts` → `cli`, `src/agents/*.ts` → `agents`) instead of always using configured defaults
- Outcome flags for `MulchClient.record()` — `--outcome-status`, `--outcome-duration`, `--outcome-test-results`, `--outcome-agent` for structured outcome tracking
- File-scoped search in `MulchClient.search()` — `--file` and `--sort-by-score` options for targeted expertise queries
- PostToolUse Bash hook in hooks template and init — runs `mulch diff` after git commits to auto-detect expertise changes
- Builder completion protocol includes outcome data flags (`--outcome-status success --outcome-agent $OVERSTORY_AGENT_NAME`)
- Lead and supervisor agents get file-scoped mulch search capability (`mulch search <query> --file <path>`)
- Overlay quality gates include outcome flags for mulch recording
- `limit` option added to `MailStore.getAll()` — dashboard now fetches only the most recent messages instead of the full mailbox
- Persistent DB connections across dashboard poll ticks — `SessionStore`, `EventStore`, `MailStore`, and `MetricsStore` connections are now opened once and reused, eliminating per-tick open/close overhead
- Test suite grew from 1916 to 1996 tests across 73 files (4960 expect() calls)
- Zombie agent recovery — `updateLastActivity` now recovers agents from "zombie" state when hooks prove they're alive (previously only recovered from "booting")
- Dashboard `.repeat()` crash when negative values were passed — now clamps repeat count to minimum of 0
- Set-based tmux session lookup in `status.ts` replacing O(n) array scans with O(1) Set membership checks
- Subprocess cache in `status.ts` preventing redundant `tmux list-sessions` calls during a single status gather
- Null-runId sessions (coordinator) now included in run-scoped status and dashboard views — previously filtered out when `--all` was not specified
- Sparse file used in logs doctor test to prevent timeout on large log directory scans
- Beacon submission reliability — replaced fixed sleep with poll-based TUI readiness check (PR #19, thanks @dmfaux!)
- Biome formatting in hooks-deployer test and sling
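The `.repeat()` fix in this release relates to a general JavaScript behavior: `String.prototype.repeat` throws a `RangeError` when given a negative count. A minimal sketch of the clamping idea follows; the `renderBar` helper and its bar characters are hypothetical and not taken from the dashboard code.

```typescript
// Render a fixed-width bar. Both values are clamped so repeat() never receives a negative count.
function renderBar(filled: number, width: number): string {
  const safeWidth = Math.max(0, Math.floor(width));
  const safeFilled = Math.min(safeWidth, Math.max(0, Math.floor(filled)));
  return "#".repeat(safeFilled) + "-".repeat(safeWidth - safeFilled);
}
```

Clamping at the call site, rather than trusting upstream arithmetic, keeps a narrow terminal or an out-of-range metric from crashing the render loop.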
0.5.7 - 2026-02-19
- `ModelAlias`, `ModelRef`, and `ProviderConfig` types in `types.ts` — foundation for multi-provider model routing (`native` and `gateway` provider types with `baseUrl` and `authTokenEnv` configuration)
- `providers` field in `OverstoryConfig` — `Record<string, ProviderConfig>` for configuring model providers per project
- `resolveModel()` signature updated to accept `ModelRef` (provider-qualified strings like `openrouter/openai/gpt-5.3`) alongside simple `ModelAlias` values
- `--self` flag for `overstory costs` — parse the current orchestrator session's Claude Code transcript directly, bypassing metrics.db, useful for real-time cost visibility without agent infrastructure
- `run_id` column added to `metrics.db` sessions table — enables `overstory costs --run <id>` filtering to work correctly; includes automatic migration for existing databases
- Phase-aware `buildCompletionMessage()` in watchdog daemon — generates targeted completion nudge messages based on worker capability composition (single-capability batches get phase-specific messages like "Ready for next phase", mixed batches get a summary with breakdown)
- Test suite grew from 1892 to 1916 tests across 73 files (4866 expect() calls)
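To make the provider-qualified `ModelRef` format concrete, here is one way such a string could be split. This is a sketch under an assumption: that the provider is the segment before the first `/` and everything after it is the model identifier. The `parseModelRef` helper and `ParsedModelRef` type are hypothetical names.

```typescript
type ParsedModelRef = { provider: string; model: string };

// Split on the first "/" only, so the model part may itself contain slashes.
// Returns undefined for plain aliases such as "sonnet".
function parseModelRef(ref: string): ParsedModelRef | undefined {
  const slash = ref.indexOf("/");
  if (slash <= 0 || slash === ref.length - 1) return undefined;
  return { provider: ref.slice(0, slash), model: ref.slice(slash + 1) };
}
```

With this rule, `openrouter/openai/gpt-5.3` yields provider `openrouter` and model `openai/gpt-5.3`.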
0.5.6 - 2026-02-18
- Root-user pre-flight guard on all agent spawn commands (`sling`, `coordinator start`, `supervisor start`, `monitor start`) — blocks spawning when running as UID 0, since the `claude` CLI rejects `--dangerously-skip-permissions` as root causing tmux sessions to die immediately
- Unmerged branch safety check in `overstory worktree clean` — skips worktrees with unmerged branches by default, warns about skipped branches, and requires `--force` to delete them
- `.overstory/README.md` generation during `overstory init` — explains the directory to contributors who encounter `.overstory/` in a project, whitelisted in `.gitignore`
- `overstory monitor start` now gates on `watchdog.tier2Enabled` config flag — throws a clear error when Tier 2 is disabled instead of silently proceeding
- `overstory coordinator start --monitor` respects `tier2Enabled` — skips monitor auto-start with a message when disabled
- `sendKeys` now distinguishes "tmux server not running" from "session not found" — provides actionable error messages for each case (e.g., root-user hint for server-not-running)
- Lead agent definition (`agents/lead.md`) reframed as coordinator-not-doer — emphasizes the lead's role as a delegation specialist rather than an implementer
- Test suite grew from 1868 to 1892 tests across 73 files (4807 expect() calls)
- Biome formatting in merged builder code
0.5.5 - 2026-02-18
- `overstory status` now scopes to the current run by default with `--all` flag to show all runs — `gatherStatus()` filters sessions by `runId` when present
- `overstory dashboard` now scopes all panels to the current run by default with `--all` flag to show data across all runs
- `config.local.yaml` support for machine-specific configuration overrides — values in `config.local.yaml` are deep-merged over `config.yaml`, allowing per-machine settings (model overrides, paths, watchdog intervals) without modifying the tracked config file (PR #9)
- PreToolUse hooks template now includes a universal `git push` guard — blocks all `git push` commands for all agents (previously only blocked push to canonical branches)
- Watchdog daemon tick now detects when all agents in the current run have completed and auto-reports run completion
- Lead agents now stream `merge_ready` messages per-builder as each completes, instead of batching all merge signals — enables earlier merge pipeline starts
- Added `issue-reviews` and `pr-reviews` skills for reviewing GitHub issues and pull requests from within Claude Code
- Test suite grew from 1848 to 1868 tests across 73 files (4771 expect() calls)
- `overstory sling` now uses `resolveModel()` for config-level model overrides — previously ignored `models:` config section when spawning agents
- `overstory doctor` dependency check now detects `bd` CGO/Dolt backend failures — catches cases where `bd` binary exists but crashes due to missing CGO dependencies (PR #11)
- Biome line width formatting in `src/doctor/consistency.ts`
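The deep-merge semantics mentioned for `config.local.yaml` can be sketched as below, operating on already-parsed config objects. The `deepMerge` helper is hypothetical, and one behavior is an assumption of the sketch: arrays and scalars in the override replace the base value wholesale rather than being combined.

```typescript
type Plain = { [key: string]: unknown };

function isPlainObject(value: unknown): value is Plain {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Recursively overlay `override` on `base`. Nested objects merge key by key;
// anything else in the override replaces the base value.
function deepMerge(base: Plain, override: Plain): Plain {
  const result: Plain = { ...base };
  for (const [key, value] of Object.entries(override)) {
    const current = result[key];
    result[key] = isPlainObject(current) && isPlainObject(value) ? deepMerge(current, value) : value;
  }
  return result;
}
```

The practical effect is that a local file only needs to list the keys it changes; sibling keys in the tracked config survive the merge.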
0.5.4 - 2026-02-17
- Reviewer-coverage doctor check in `overstory doctor` — warns when leads spawn builders without corresponding reviewers, reports partial coverage ratios per lead
- `merge_ready` reviewer validation in `overstory mail send` — advisory warning when sending `merge_ready` without reviewer sessions for the sender's builders
- Scout-before-builder warning in `overstory sling` — warns when a lead spawns a builder without having spawned any scouts first
- `parentHasScouts()` helper exported from sling for testability
- `overstory coordinator stop` now auto-completes the active run (reads `current-run.txt`, marks run completed, cleans up)
- `overstory log session-end` auto-completes the run when the coordinator exits (handles tmux window close without explicit stop)
- `.overstory/.gitignore` flipped from explicit blocklist to wildcard `*` + whitelist pattern — ignore everything, whitelist only tracked files (`config.yaml`, `agent-manifest.json`, `hooks.json`, `groups.json`, `agent-defs/`)
- `overstory prime` auto-heals `.overstory/.gitignore` on each session start — ensures existing projects get the updated gitignore
- `OVERSTORY_GITIGNORE` constant and `writeOverstoryGitignore()` exported from init.ts for reuse
- Test suite grew from 1812 to 1848 tests across 73 files (4726 expect() calls)
- Lead agent definition (`agents/lead.md`) — scouts made mandatory (not optional), Phase 3 review made MANDATORY with stronger language, added `SCOUT_SKIP` failure mode, expanded cost awareness section explaining why scouts and reviewers are investments not overhead
- `overstory init` .gitignore now always overwrites (supports `--force` reinit and auto-healing)
- Hooks template (`templates/hooks.json.tmpl`) — removed fragile `read -r INPUT; echo "$INPUT" |` stdin relay pattern; `overstory log` now reads stdin directly via `--stdin` flag
- `readStdinJson()` in log command — reads all stdin chunks for large payloads instead of only the first line
- Doctor gitignore structure check updated for wildcard+whitelist model
0.5.3 - 2026-02-17
- `models:` section in `config.yaml` — override the default model (`sonnet`, `opus`, `haiku`) for any agent role (coordinator, supervisor, monitor, etc.)
- `resolveModel()` helper in agent manifest — resolution chain: config override > manifest default > fallback
- Supervisor and monitor entries added to `agent-manifest.json` with model and capability metadata
- `overstory init` now seeds the default `models:` section in generated `config.yaml`
- Test suite grew from 1805 to 1812 tests across 73 files (4638 expect() calls)
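The resolution chain stated for `resolveModel()` (config override > manifest default > fallback) amounts to a simple precedence lookup. The sketch below uses hypothetical shapes and a hypothetical `pickModel` helper; only the precedence order is taken from the changelog.

```typescript
type ModelName = string;
// Hypothetical shapes for the config `models:` section and the agent manifest.
type ModelsConfig = Record<string, ModelName | undefined>;
type Manifest = Record<string, { model?: ModelName } | undefined>;

function pickModel(
  role: string,
  models: ModelsConfig,
  manifest: Manifest,
  fallback: ModelName,
): ModelName {
  // 1. config override, 2. manifest default, 3. fallback
  return models[role] ?? manifest[role]?.model ?? fallback;
}
```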
0.5.2 - 2026-02-17
- `--into <branch>` flag for `overstory merge` — target a specific branch instead of always merging to canonicalBranch
- `overstory prime` now records the orchestrator's starting branch to `.overstory/session-branch.txt` at session start
- `overstory merge` reads `session-branch.txt` as the default merge target when `--into` is not specified — resolution chain: `--into` flag > `session-branch.txt` > config `canonicalBranch`
- Test suite grew from 1793 to 1805 tests across 73 files (4615 expect() calls)
- Git push blocking for agents now blocks ALL `git push` commands (previously only blocked push to canonical branches) — agents should use `overstory merge` instead
- Init-deployed hooks now include a PreToolUse Bash guard that blocks `git push` for the orchestrator's project
- Test cwd pollution in agents test afterEach — restored cwd to prevent cross-file pollution
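The merge-target resolution chain introduced in this release (`--into` flag > `session-branch.txt` > config `canonicalBranch`) could be expressed as a small pure function. The parameter names are illustrative, file reading is left to the caller, and treating blank values as "not set" is an assumption of this sketch.

```typescript
// Pick the merge target from the three sources, in precedence order.
function resolveMergeTarget(
  intoFlag: string | undefined,
  sessionBranchFile: string | undefined, // raw file contents, if the file exists
  canonicalBranch: string,
): string {
  for (const candidate of [intoFlag, sessionBranchFile]) {
    // Trim so a trailing newline in the file does not leak into the branch name.
    const trimmed = candidate?.trim();
    if (trimmed) return trimmed;
  }
  return canonicalBranch;
}
```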
0.5.1 - 2026-02-16
- `overstory agents discover` — discover and query agents by capability, state, file scope, and parent with `--capability`, `--state`, `--parent` filters and `--json` output
- Session insight analyzer (`src/insights/analyzer.ts`) — analyzes EventStore data from completed sessions to extract structured patterns about tool usage, file edits, and errors for automatic mulch expertise recording
- Conflict history intelligence in merge resolver — tracks past conflict resolution patterns per file to skip historically-failing tiers and enrich AI resolution prompts with successful strategies
- INSIGHT recording protocol for agent definitions — read-only agents (scout, reviewer) use INSIGHT prefix for structured expertise observations; parent agents (lead, supervisor) record insights to mulch automatically
- Test suite grew from 1749 to 1793 tests across 73 files (4587 expect() calls)
- `session-end` hook now calls `mulch record` directly instead of sending `mulch_learn` mail messages — removes mail indirection for expertise recording
- Coordinator tests now always inject fake monitor/watchdog for proper isolation
0.5.0 - 2026-02-16
- `overstory feed` — unified real-time event stream across all agents with `--follow` mode for continuous polling, agent/run filtering, and JSON output
- `overstory logs` — query NDJSON log files across agents with level filtering (`--level`), time range queries (`--since`/`--until`), and `--follow` tail mode
- `overstory costs --live` — real-time token usage display for active agents
- `--monitor` flag for `coordinator start/stop/status` — manage the Tier 2 monitor agent alongside the coordinator
- Mulch recording as required completion gate for all agent types — agents must record learnings before session close
- Mulch learn extraction added to Stop hooks for orchestrator and all agents
- Scout-spawning made default in lead.md Phase 1 with parallel support
- Reviewer spawning made mandatory in lead.md
- Real-time token tracking infrastructure (`src/metrics/store.ts`, `src/commands/costs.ts`) — live session cost monitoring via transcript JSONL parsing
- Test suite grew from 1673 to 1749 tests across 71 files (4460 expect() calls)
- Duplicate `feed` entry in CLI command router and help text
0.4.1 - 2026-02-16
- `overstory --completions <shell>` — shell completion generation for bash, zsh, and fish
- `--quiet`/`-q` global flag — suppress non-error output across all commands
- `overstory mail send --to @all` — broadcast messaging with group addresses (`@all`, `@builders`, `@scouts`, `@reviewers`, `@leads`, `@mergers`, etc.)
- Central `NO_COLOR` convention support (`src/logging/color.ts`) — respects `NO_COLOR`, `FORCE_COLOR`, and `TERM=dumb` environment variables per https://no-color.org
- All ANSI color output now goes through centralized color module instead of inline escape codes
- Merge queue migrated from JSON file to SQLite (`merge-queue.db`) for durability and concurrent access
- Test suite grew from 1612 to 1673 tests across 69 files (4267 expect() calls)
- Freeze duration counter for completed/zombie agents in status and dashboard displays
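For readers unfamiliar with the `NO_COLOR` convention referenced above, a color-detection helper might look like the following. This is a sketch, not the contents of `src/logging/color.ts`: the `colorEnabled` function is hypothetical, and the precedence it applies (`NO_COLOR`, then `FORCE_COLOR`, then `TERM=dumb`, then TTY detection) is an assumption, since the changelog does not specify one.

```typescript
// Decide whether to emit ANSI colors from an environment snapshot.
// Taking env and isTTY as parameters keeps the function pure and testable.
function colorEnabled(env: Record<string, string | undefined>, isTTY: boolean): boolean {
  const noColor = env["NO_COLOR"];
  // Per no-color.org, NO_COLOR counts when present and not an empty string.
  if (noColor !== undefined && noColor !== "") return false;
  const force = env["FORCE_COLOR"];
  // Assumption: any non-empty value other than "0" forces color on.
  if (force !== undefined && force !== "" && force !== "0") return true;
  if (env["TERM"] === "dumb") return false;
  return isTTY;
}
```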
0.4.0 - 2026-02-15
- `overstory doctor` — comprehensive health check system with 9 check modules (dependencies, config, structure, databases, consistency, agents, merge-queue, version, logs) and formatted output with pass/warn/fail status
- `overstory inspect <agent>` — deep per-agent inspection aggregating session data, metrics, events, and live tmux capture with `--follow` polling mode
- `--watchdog` flag for `coordinator start` — auto-starts the watchdog daemon alongside the coordinator
- `--debounce <ms>` flag for `mail check` — prevents excessive mail checking by skipping if called within the debounce window
- PostToolUse hook entry for debounced mail checking
- Automated failure recording in watchdog via mulch — records failure patterns for future reference
- Mulch learn extraction in `log session-end` — captures session insights automatically
- Mulch health checks in `overstory clean` — validates mulch installation and domain health during cleanup
- Test suite grew from 1435 to 1612 tests across 66 files (3958 expect() calls)
- Wire doctor command into CLI router and update command groups
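The debounce behavior described for `mail check --debounce <ms>` reduces to a timestamp comparison. The sketch below is illustrative only: the `shouldSkipCheck` name and its parameters are hypothetical, and how the last-check timestamp is persisted between invocations is not covered here.

```typescript
// Decide whether a mail check should be skipped because the previous one was too recent.
// `lastCheckMs` is undefined when no earlier check has been recorded.
function shouldSkipCheck(
  lastCheckMs: number | undefined,
  nowMs: number,
  debounceMs: number,
): boolean {
  if (lastCheckMs === undefined || debounceMs <= 0) return false;
  return nowMs - lastCheckMs < debounceMs;
}
```

Passing the current time in as an argument, rather than calling `Date.now()` inside, keeps the check deterministic in tests.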
0.3.0 - 2026-02-13
- `overstory run` command — orchestration run lifecycle management (`list`, `show`, `complete` subcommands) with RunStore backed by sessions.db
- `overstory trace` command — agent/bead timeline viewing for debugging and post-mortem observability
- `overstory clean` command — cleanup worktrees, sessions, and artifacts with auto-cleanup on agent teardown
- Run tracking via `run_id` integrated into sling and clean commands
- `RunStore` in sessions.db for durable run state
- `SessionStore` (SQLite) — migrated from sessions.json for concurrent access and crash safety
- Phase 2 CLI query commands and Phase 3 event persistence for the observability pipeline
- Project-scoped tmux naming (`overstory-{projectName}-{agentName}`) to prevent cross-project session collisions
- `ENV_GUARD` on all hooks — prevents hooks from firing outside overstory-managed worktrees
- Mulch-informed lead decomposition — leader agents use mulch expertise when breaking down tasks
- Mulch conflict pattern recording — merge resolver records conflict patterns to mulch for future reference
- New commands and flags for the mulch CLI wrapper
- `--json` parsing support with corrected types and flag spread
- `STEELMAN.md` — comprehensive risk analysis for agent swarm deployments
- Community files: CONTRIBUTING.md, CODE_OF_CONDUCT.md, SECURITY.md
- Package metadata (keywords, repository, homepage) for npm/GitHub presence
- Test suite grew from 912 to 1435 tests across 55 files (3416 expect() calls)
- Fix `isCanonicalRoot` guard blocking all worktree overlays when dogfooding overstory on itself
- Fix auto-nudge tmux corruption and deploy coordinator hooks correctly
- Fix 4 P1 issues: orchestrator nudge routing, bash guard bypass, hook capture isolation, overlay guard
- Fix 4 P1/P2 issues: ENV_GUARD enforcement, persistent agent state, project-scoped tmux kills, auto-nudge coordinator
- Strengthen agent orchestration with additional P1 bug fixes
- CLI commands grew from 17 to 20 (added run, trace, clean)
0.2.0 - 2026-02-13
- `overstory coordinator` command — persistent orchestrator that runs at project root, decomposes objectives into subtasks, dispatches agents via sling, and tracks batches via task groups
- `start`/`stop`/`status` subcommands
- `--attach`/`--no-attach` with TTY-aware auto-detection for tmux sessions
- Scout-delegated spec generation for complex tasks
- Supervisor agent definition — per-project team lead (depth 1) that receives dispatch mail from coordinator, decomposes into worker-sized subtasks, manages worker lifecycle, and escalates unresolvable issues
- 7 base agent types (added coordinator + supervisor to existing scout, builder, reviewer, lead, merger)
- `overstory group` command — batch coordination (create/status/add/remove/list) with auto-close when all member beads issues complete, mail notification to coordinator on auto-close
- Session checkpoint save/restore for compaction survivability (`prime --compact` restores from checkpoint)
- Handoff orchestration (initiate/resume/complete) for crash recovery
- 8 protocol message types: `worker_done`, `merge_ready`, `merged`, `merge_failed`, `escalation`, `health_check`, `dispatch`, `assign`
- Type-safe `sendProtocol<T>()` and `parsePayload<T>()` for structured agent coordination
- JSON payload column with schema migration handling 3 upgrade paths
- `overstory nudge` command with retry (3x), debounce (500ms), and `--force` to skip debounce
- Auto-nudge on urgent/high priority mail send
- PreToolUse hooks mechanically block file-modifying tools (Write/Edit/NotebookEdit) for non-implementation agents (scout, reviewer, coordinator, supervisor)
- PreToolUse Bash guards block dangerous git operations (`push`, `reset --hard`, `clean -f`, etc.) for all agents
- Whitelist git add/commit for coordinator/supervisor capabilities while keeping git push blocked
- Block Claude Code native team/task tools (Task, TeamCreate, etc.) for all overstory agents — enforces overstory sling delegation
- ZFC principle: tmux liveness as primary signal, pid check as secondary, sessions.json as tertiary
- Descendant tree walking for process cleanup — `getPanePid()`, `getDescendantPids()`, `killProcessTree()` with SIGTERM → grace → SIGKILL
- Re-check zombies on every tick, handle investigate action
- Stalled state added to zombie reconciliation
- Builder agents send `worker_done` mail on task completion
- Overlay quality gates include worker_done signal step
- Prime activation context injection for bound tasks
- `MISSING_WORKER_DONE` failure mode in builder definition
- Switch sling from headless (`claude -p`) to interactive mode with tmux sendKeys beacon — hooks now fire, enabling mail, metrics, logs, and lastActivity updates
- Structured `buildBeacon()` with identity context and startup protocol
- Fix beacon sendKeys multiline bug (increase initial sleep, follow-up Enter after 500ms)
- `--verbose` flag for `overstory status`
- `--json` flag for `overstory sling`
- `--background` flag for `overstory watch`
- Help text for unknown subcommands
- `SUPPORTED_CAPABILITIES` constant and `Capability` type
- `overstory init` now deploys agent definitions (copies `agents/*.md` to `.overstory/agent-defs/`) via `import.meta.dir` resolution
- E2E lifecycle test validates full init → config → manifest → overlay pipeline on throwaway external projects
- Colocated tests with source files (moved from `__tests__/` to `src/`)
- Shared test harness: `createTempGitRepo()`, `cleanupTempDir()`, `commitFile()` in `src/test-helpers.ts`
- Replaced `Bun.spawn` mocks with real implementations in 3 test files
- Optimized test harness: 38.1s → 11.7s (-69%)
- Comprehensive metrics command test coverage
- E2E init-sling lifecycle test
- Test suite grew from initial release to 515 tests across 24 files (1286 expect() calls)
- 60+ bugs resolved across 8 dedicated fix sessions, covering P1 criticals through P4 backlog items:
- Hooks enforcement: tool guard sed patterns now handle optional space after JSON colons
- Status display: filter completed sessions from active agent count
- Session lifecycle: move session recording before beacon send to fix booting → working race condition
- Stagger delay (`staggerDelayMs`) now actually enforced between agent spawns
- Hardcoded `main` branch replaced with dynamic branch detection in worktree/manager and merge/resolver
- Sling headless mode fixes for E2E validation
- Input validation, environment variable handling, init improvements, cleanup lifecycle
- `.gitignore` patterns for `.overstory/` artifacts
- Mail, merge, and worktree subsystem edge cases
- Agent propulsion principle: failure modes, cost awareness, and completion protocol added to all agent definitions
- Agent quality gates updated across all base definitions
- Test file paths updated from `__tests__/` convention to colocated `src/**/*.test.ts`
0.1.0 - 2026-02-12
- CLI entry point with command router (`overstory <command>`)
- `overstory init` — initialize `.overstory/` in a target project
- `overstory sling` — spawn worker agents in git worktrees via tmux
- `overstory prime` — load context for orchestrator or agent sessions
- `overstory status` — show active agents, worktrees, and project state
- `overstory mail` — SQLite-based inter-agent messaging (send/check/list/read/reply)
- `overstory merge` — merge agent branches with 4-tier conflict resolution
- `overstory worktree` — manage git worktrees (list/clean)
- `overstory log` — hook event logging (NDJSON + human-readable)
- `overstory watch` — watchdog daemon with health monitoring and AI-assisted triage
- `overstory metrics` — session metrics storage and reporting
- Agent manifest system with 5 base agent types (scout, builder, reviewer, lead, merger)
- Two-layer agent definition: base `.md` files (HOW) + dynamic overlays (WHAT)
- Persistent agent identity and CV system
- Hooks deployer for automatic worktree configuration
- beads (`bd`) CLI wrapper for issue tracking integration
- mulch CLI wrapper for structured expertise management
- Multi-format logging with secret redaction
- SQLite metrics storage for session analytics
- Full test suite using `bun test`
- Biome configuration for formatting and linting
- TypeScript strict mode with `noUncheckedIndexedAccess`