Multi-LLM orchestration plugin for Claude Code — coordinates Codex, Gemini, Perplexity, OpenRouter, Copilot, Qwen, and Ollama with consensus gates. Eight tentacles, zero blind spots.
🐙 Research, build, review, and ship — with eight AI providers checking each other's work. Say what you need, and the right workflow runs. A 75% consensus gate catches disagreements before they reach production. No single model's blind spots slip through.
🧠 Remembers across sessions. Integrates with claude-mem for persistent memory — past decisions, research, and context survive session boundaries.
⚡ Spec in, software out. Dark Factory mode takes a spec and autonomously runs the full pipeline — research, define, develop, deliver. You review the output, not every step.
🔄 Four-phase methodology, not just tools. Every task moves through Discover → Define → Develop → Deliver, with quality gates between phases. Other orchestrators give you infrastructure. Octopus gives you the workflows.
🐙 32 specialized personas (role-specific AI agents like security-auditor, backend-architect), 47 commands (slash commands you type), 50 skills (reusable workflow modules). Say "audit my API" and the right expert activates. Don't know the command? The smart router figures it out.
🐙 Works with just Claude. Scales to eight. Zero providers needed to start. Add them one at a time — each activates automatically when detected.
💰 Five providers cost nothing extra. Codex and Gemini use OAuth (included with subscriptions). Qwen has 1,000-2,000 free requests/day. Copilot uses your GitHub subscription. Ollama runs locally for free.
| Version | Best Features |
|---|---|
| v9 (current) | 8 providers (Codex, Gemini, Copilot, Qwen, Ollama, Perplexity, OpenRouter, Claude). Four-way AI debates. Smart router — just say what you need. 92% faster execution. Context-aware warnings. Session handoff across conversations. Adversarial review in every workflow. |
| v8 | Multi-LLM code review with inline PR comments. Parallel workstreams in isolated git worktrees. Reaction engine — auto-responds to CI failures. 32 specialized personas. Dark Factory autonomous pipeline. |
| v7 | Double Diamond workflow. Multi-provider dispatch. Quality gates and consensus scoring. Configurable sandbox modes. |
```bash
# Terminal (not inside a Claude Code session):
claude plugin marketplace add https://github.com/nyldn/claude-octopus.git
claude plugin install octo@nyldn-plugins

# Then inside Claude Code:
/octo:setup
```

That's it. Setup detects installed providers, shows what's missing, and walks you through configuration. You need zero external providers to start — Claude is built in.
Alternative install methods
From the Claude Code UI: Type /plugin in a session → Marketplace tab → install octo.
Factory AI (Droid):

```bash
droid plugin marketplace add https://github.com/nyldn/claude-octopus
droid plugin install octo@claude-octopus
```

Update / Troubleshooting
```bash
# Update
claude plugin update octo

# Clean reinstall (if update fails)
claude plugin uninstall claude-octopus 2>/dev/null
claude plugin uninstall octo 2>/dev/null
rm -rf ~/.claude/plugins/cache/nyldn-plugins/claude-octopus
claude plugin marketplace add https://github.com/nyldn/claude-octopus.git
claude plugin install octo@nyldn-plugins
```

🐙 Eight commands — one per arm. A real octopus has eight arms, each with its own neurons that can act independently. These eight tentacles work the same way: each orchestrates up to three AI providers, applies quality gates, and produces a deliverable.
```bash
/octo:embrace build stripe integration                 # Full lifecycle: research → define → develop → deliver
/octo:factory "build a CLI that converts CSV to JSON"  # Autonomous pipeline — spec in, software out
/octo:debate monorepo vs microservices                 # Structured four-way AI debate with consensus
/octo:research htmx vs react in 2026                   # Multi-source synthesis from three AI providers
/octo:design mobile checkout redesign                  # UI/UX design with BM25 style intelligence
/octo:tdd create user auth                             # Red-green-refactor with test discipline
/octo:security                                         # OWASP vulnerability scan + remediation
/octo:prd mobile checkout redesign                     # AI-optimized PRD with 100-point scoring
```

Plus 30 more: review, debug, extract, deck, docs, schedule, parallel, sentinel, brainstorm, claw, doctor, and the full set.
Don't remember the command name? Just describe what you need:
```bash
/octo:auto research microservices patterns   # -> routes to discover phase
/octo:auto build user authentication         # -> routes to develop phase
/octo:auto compare Redis vs DynamoDB         # -> routes to debate
```
The smart router parses your intent and selects the right workflow.
Not sure which command to use? Pick by goal:
| I want to... | Use |
|---|---|
| Research a topic thoroughly | /octo:research or /octo:discover |
| Debate two approaches | /octo:debate |
| Build a feature end-to-end | /octo:embrace |
| Design a UI or style system | /octo:design |
| Review existing code | /octo:review |
| Write tests first, then code | /octo:tdd |
| Scan for vulnerabilities | /octo:security |
| Write a product spec | /octo:prd |
| Go from spec to shipping code | /octo:factory |
| Debug a tricky issue | /octo:debug |
| Just run something quick | /octo:quick |
Or skip the table — type /octo:auto <what you want> or just say octo <what you want>, and the smart router picks for you. 🔍
Claude Octopus coordinates up to eight AI providers — one per tentacle:
| Provider | Role |
|---|---|
| 🔴 Codex (OpenAI) | Implementation depth — code patterns, technical analysis, architecture |
| 🟡 Gemini (Google) | Ecosystem breadth — alternatives, security review, research synthesis |
| 🟣 Perplexity | Live web search — CVE lookups, dependency research, current docs |
| 🌐 OpenRouter | Alternative model routing — access 100+ models via single API |
| 🟢 Copilot (GitHub) | Zero-cost research — uses existing GitHub Copilot subscription |
| 🟤 Qwen (Alibaba) | Free-tier research — 1,000-2,000 requests/day via Qwen OAuth |
| ⚫ Ollama (Local) | Zero-cost local LLM — offline, privacy-sensitive, fallback |
| 🔵 Claude (Anthropic) | Orchestration — quality gates, consensus building, final synthesis |
Providers run in parallel for research, sequentially for problem scoping, and adversarially for review. A 75% consensus quality gate prevents questionable work from shipping. Only Claude is required — all others are optional and auto-detected.
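The 75% gate itself is easy to reason about. As an illustrative sketch only (not the plugin's actual scoring logic), a gate over per-provider approval votes could look like this:

```shell
# Illustrative sketch — NOT the plugin's real implementation.
# Each provider casts an "approve" or "reject" vote; the gate passes
# when approvals reach 75% of all votes cast.
consensus_gate() {
  approved=0
  total=0
  for vote in "$@"; do
    total=$((total + 1))
    [ "$vote" = "approve" ] && approved=$((approved + 1))
  done
  # integer math: approved/total >= 0.75  <=>  approved*100 >= total*75
  if [ $((approved * 100)) -ge $((total * 75)) ]; then
    echo PASS
  else
    echo FAIL
  fi
}

consensus_gate approve approve approve reject   # 3/4 votes = 75% -> PASS
```

With three of four providers approving, the gate passes exactly at the threshold; one approval out of four fails it.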
Four structured phases adapted from the UK Design Council's methodology:
| Phase | Command | What happens |
|---|---|---|
| Discover | /octo:discover | Multi-AI research and broad exploration |
| Define | /octo:define | Requirements clarification with consensus |
| Develop | /octo:develop | Implementation with quality gates |
| Deliver | /octo:deliver | Adversarial review and go/no-go scoring |
Run phases individually or all four with /octo:embrace. Configure autonomy: supervised (approve each phase), semi-autonomous (intervene on failures), or autonomous (run all four).
Specialized agents that activate automatically based on your request. When you say "audit my API for vulnerabilities," security-auditor activates. When you say "design a dashboard," ui-ux-designer takes over.
Categories: Software Engineering (11), Specialized Development (6), Documentation & Communication (5), Research & Strategy (3), Business & Compliance (3), Creative & Design (4).
Full persona reference | All 50 skills
When agents create PRs, the reaction engine monitors what happens next — CI failures, review comments, stale agents — and responds automatically. No new commands to learn. It fires transparently inside workflows you already use:
| Integration Point | When It Fires |
|---|---|
| /octo:parallel | Between poll cycles while monitoring work packages |
| /octo:sentinel | After triage scan completes |
| agent-registry.sh health --react | On-demand health check |
What it auto-handles:
| Event | Reaction | Limits |
|---|---|---|
| CI failure | Collects failure logs into agent inbox | 3 retries, escalates after 30m |
| Changes requested | Collects review comments into agent inbox | 2 retries, escalates after 60m |
| Agent stuck | Escalates to human | After 15m with no progress |
| PR approved + CI green | Notifies you it's ready to merge | — |
| PR merged | Marks agent complete | — |
Override defaults per project by creating .octo/reactions.conf:
```
# EVENT|ACTION|MAX_RETRIES|ESCALATE_AFTER_MIN|ENABLED
ci_failed|forward_logs|5|45|true
changes_requested|forward_comments|3|90|true
stuck|escalate|0|10|true
```
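The pipe-delimited format can be read with plain shell. This is a hypothetical reader sketch — the field order comes from the header comment above, but the plugin's own parser may differ:

```shell
# Hypothetical sketch of reading .octo/reactions.conf lines.
# Field order follows the documented header:
# EVENT|ACTION|MAX_RETRIES|ESCALATE_AFTER_MIN|ENABLED
parse_reactions() {
  while IFS='|' read -r event action max_retries escalate_min enabled; do
    case "$event" in \#*|'') continue ;; esac   # skip comments and blank lines
    printf '%s -> %s (retry %s, escalate after %sm, enabled=%s)\n' \
      "$event" "$action" "$max_retries" "$escalate_min" "$enabled"
  done
}

parse_reactions <<'EOF'
# EVENT|ACTION|MAX_RETRIES|ESCALATE_AFTER_MIN|ENABLED
ci_failed|forward_logs|5|45|true
stuck|escalate|0|10|true
EOF
```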
Reactions track 13 agent lifecycle states: running → pr_open → ci_pending → ci_failed / review_pending → changes_requested / approved → mergeable → merged → done.
| Method | Codex | Gemini | Claude |
|---|---|---|---|
| OAuth (recommended) | codex login — included in ChatGPT subscription | Google account — included in AI subscription | Built into Claude Code |
| API key | OPENAI_API_KEY — per-token billing | GEMINI_API_KEY — per-token billing | Built into Claude Code |
OAuth users pay nothing beyond their existing subscriptions.
Everything except multi-AI features. You get all 32 personas, structured workflows, smart routing, context detection, and every skill. Multi-AI orchestration (parallel analysis, debate, consensus) activates when external providers are configured.
Namespace isolation — Only /octo:* commands and octo natural language prefix activate the plugin. Your existing Claude Code setup is untouched.
Data locations — Results in ~/.claude-octopus/results/, logs in ~/.claude-octopus/logs/, project state in .octo/. Nothing hidden.
Telemetry — Anonymous usage analytics help us improve the plugin. We collect session counts, workflow types used, and error rates — never prompts, file paths, or personal data. Disable anytime with OCTOPUS_TELEMETRY_OPT_OUT=1.
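For example, to opt out (the variable name is the one documented above; the profile file depends on your shell):

```shell
# Disable Claude Octopus telemetry for the current shell session.
# Add this line to your shell profile (~/.bashrc, ~/.zshrc) to persist it.
export OCTOPUS_TELEMETRY_OPT_OUT=1
```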
Provider transparency — Every command shows a 🐙 activation indicator on launch. Colored dots (🔴 🟡 🟣 🔵) show exactly which providers are running and when external APIs are called. You always know what's happening.
Clean uninstall — Run claude plugin uninstall octo from your terminal. If you see a scope error, add --scope project. No residual config changes.
Claude Octopus ships with a compatibility layer for OpenClaw, the open-source AI assistant framework. This lets you expose Octopus workflows to messaging platforms (Telegram, Discord, Signal, WhatsApp) without modifying the Claude Code plugin.
```
Claude Code Plugin (unchanged)
  └── .mcp.json ─── MCP Server ─── orchestrate.sh
                         ↑
OpenClaw Extension ──────┘
```
Three components, zero changes to the core plugin:
| Component | Location | Purpose |
|---|---|---|
| MCP Server | mcp-server/ | Exposes 10 Octopus tools via Model Context Protocol |
| OpenClaw Extension | openclaw/ | Wraps workflows for OpenClaw's extension API |
| Skill Schema | mcp-server/src/schema/skill-schema.json | Universal skill metadata format |
The MCP server auto-starts when the plugin is enabled (via .mcp.json). It exposes:
- `octopus_discover`, `octopus_define`, `octopus_develop`, `octopus_deliver` — Individual phases
- `octopus_embrace` — Full Double Diamond workflow
- `octopus_debate`, `octopus_review`, `octopus_security` — Specialized workflows
- `octopus_list_skills`, `octopus_status` — Introspection
Any MCP-compatible client can connect to the server.
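A typical stdio-based MCP client registration might look like the entry below. The `command` and path are illustrative assumptions — check mcp-server/ for the actual entry point:

```json
{
  "mcpServers": {
    "octopus": {
      "command": "node",
      "args": ["mcp-server/dist/index.js"]
    }
  }
}
```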
Install in an OpenClaw instance from git:

```bash
npm install github:nyldn/claude-octopus#main --prefix openclaw
```

Or clone and link locally:

```bash
cd openclaw && npm install && npm run build
```

The extension registers as an OpenClaw plugin with configurable workflows, autonomy modes, and Claude Code path resolution.
```bash
./scripts/build-openclaw.sh          # Regenerate skill registry from frontmatter
./scripts/build-openclaw.sh --check  # CI mode — exits non-zero if out of sync
./tests/validate-openclaw.sh         # 13-check validation suite
```

Do I need all the external AI providers? No. One external provider plus Claude gives you multi-AI features. No external providers still gives you personas, workflows, and skills.
Will this break my existing Claude Code setup?
No. Activates only with the octo prefix. Results stored separately. Uninstalls cleanly.
What happens if a provider times out? The workflow continues with available providers. You'll see the status in the visual indicators.
Why "octopus"? 🐙 Fun fact: a real octopus has three hearts, blue blood, and 500 million neurons — two-thirds of which live in its eight arms. Each arm can taste, touch, and act independently. Claude Octopus works the same way: each tentacle (command) operates autonomously with its own squeeze of logic, then ink flows back as the final deliverable. The crossfire review? That's the squeeze — adversarial pressure that untangles everything before it ships.
- Documentation Guide — Start here
- Command Reference — Commands, triggers, and provider indicators
- Feature Gap Analysis — Claude Code feature adoption tracker
- Architecture — Provider flow and execution model
- Plugin Architecture — Internal plugin structure
- Agents & Personas — All 32 personas
- CLI Reference — Direct CLI usage, debug mode, async, and tmux
- Changelog
- wolverin0/claude-skills — AI Debate Hub. MIT License.
- obra/superpowers — Discipline skills patterns. MIT License.
- nextlevelbuilder/ui-ux-pro-max-skill — BM25 design intelligence databases. MIT License.
- UK Design Council — Double Diamond methodology.
Join r/ClaudeOctopus for help, workflow tips, showcases, and updates.
- Report issues
- Submit PRs following existing code style
```bash
git clone https://github.com/nyldn/claude-octopus.git && make test
```
See CONTRIBUTING.md for details.
MIT — see LICENSE
nyldn | MIT License | r/ClaudeOctopus | Report Issues
