paperwave/claude-octopus

Claude Octopus

Multi-LLM orchestration plugin for Claude Code — coordinates Codex, Gemini, Perplexity, OpenRouter, Copilot, Qwen, and Ollama with consensus gates. Eight tentacles, zero blind spots.

Version 9.15.2 | Requires Claude Code v2.1.83+ | Factory AI Compatible | MIT License

🐙 Research, build, review, and ship — with eight AI providers checking each other's work. Say what you need, and the right workflow runs. A 75% consensus gate catches disagreements before they reach production. No single model's blind spots slip through.

🧠 Remembers across sessions. Integrates with claude-mem for persistent memory — past decisions, research, and context survive session boundaries.

Spec in, software out. Dark Factory mode takes a spec and autonomously runs the full pipeline — research, define, develop, deliver. You review the output, not every step.

🔄 Four-phase methodology, not just tools. Every task moves through Discover → Define → Develop → Deliver, with quality gates between phases. Other orchestrators give you infrastructure. Octopus gives you the workflows.

🐙 32 specialized personas (role-specific AI agents like security-auditor, backend-architect), 47 commands (slash commands you type), 50 skills (reusable workflow modules). Say "audit my API" and the right expert activates. Don't know the command? The smart router figures it out.

🐙 Works with just Claude. Scales to eight. Zero providers needed to start. Add them one at a time — each activates automatically when detected.

💰 Five providers cost nothing extra. Codex and Gemini use OAuth (included with subscriptions). Qwen has 1,000-2,000 free requests/day. Copilot uses your GitHub subscription. Ollama runs locally for free.


What's New

| Version | Best Features |
| --- | --- |
| v9 (current) | 8 providers (Codex, Gemini, Copilot, Qwen, Ollama, Perplexity, OpenRouter, Claude). Four-way AI debates. Smart router: just say what you need. 92% faster execution. Context-aware warnings. Session handoff across conversations. Adversarial review in every workflow. |
| v8 | Multi-LLM code review with inline PR comments. Parallel workstreams in isolated git worktrees. Reaction engine that auto-responds to CI failures. 32 specialized personas. Dark Factory autonomous pipeline. |
| v7 | Double Diamond workflow. Multi-provider dispatch. Quality gates and consensus scoring. Configurable sandbox modes. |

Full changelog →


Quickstart

# Terminal (not inside a Claude Code session):
claude plugin marketplace add https://github.com/nyldn/claude-octopus.git
claude plugin install octo@nyldn-plugins

# Then inside Claude Code:
/octo:setup

That's it. Setup detects installed providers, shows what's missing, and walks you through configuration. You need zero external providers to start — Claude is built in.

Alternative install methods

From the Claude Code UI: Type /plugin in a session → Marketplace tab → install octo.

Factory AI (Droid):

droid plugin marketplace add https://github.com/nyldn/claude-octopus
droid plugin install octo@claude-octopus

Update / Troubleshooting

# Update
claude plugin update octo

# Clean reinstall (if update fails)
claude plugin uninstall claude-octopus 2>/dev/null
claude plugin uninstall octo 2>/dev/null
rm -rf ~/.claude/plugins/cache/nyldn-plugins/claude-octopus
claude plugin marketplace add https://github.com/nyldn/claude-octopus.git
claude plugin install octo@nyldn-plugins

Top 8 Tentacles

🐙 Eight commands — one per arm. A real octopus has eight arms, each with its own neurons that can act independently. These eight tentacles work the same way: each orchestrates up to three AI providers, applies quality gates, and produces a deliverable.

/octo:embrace build stripe integration     # Full lifecycle: research → define → develop → deliver
/octo:factory "build a CLI that converts CSV to JSON"  # Autonomous pipeline — spec in, software out
/octo:debate monorepo vs microservices     # Structured four-way AI debate with consensus
/octo:research htmx vs react in 2026       # Multi-source synthesis from three AI providers
/octo:design mobile checkout redesign       # UI/UX design with BM25 style intelligence
/octo:tdd create user auth                 # Red-green-refactor with test discipline
/octo:security                              # OWASP vulnerability scan + remediation
/octo:prd mobile checkout redesign          # AI-optimized PRD with 100-point scoring

Plus 30 more: review, debug, extract, deck, docs, schedule, parallel, sentinel, brainstorm, claw, doctor, and the full set.

Don't remember the command name? Just describe what you need:

/octo:auto research microservices patterns    -> routes to discover phase
/octo:auto build user authentication          -> routes to develop phase
/octo:auto compare Redis vs DynamoDB          -> routes to debate

The smart router parses your intent and selects the right workflow.
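
The routing logic itself is not shown in this README. As a rough illustration only, a keyword-based router that agrees with the three examples above could look like the sketch below. The function name `route_intent` and the keyword lists are assumptions, not the plugin's real implementation.

```shell
#!/bin/sh
# Hypothetical sketch: map a free-text request to a workflow name.
# The keywords are illustrative guesses based on the examples above.
route_intent() (
  # Lowercase the request so matching is case-insensitive.
  request=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$request" in
    *compare*|*" vs "*|*versus*) echo "debate" ;;
    *research*|*explore*|*investigate*) echo "discover" ;;
    *build*|*implement*|*create*) echo "develop" ;;
    *) echo "unknown" ;;
  esac
)

route_intent "research microservices patterns"   # prints: discover
route_intent "build user authentication"         # prints: develop
route_intent "compare Redis vs DynamoDB"         # prints: debate
```

Checking the comparison keywords first is a deliberate choice in this sketch: a request such as "research X vs Y" then lands in a debate rather than in plain discovery.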


Which Tentacle?

Not sure which command to use? Pick by goal:

| I want to... | Use |
| --- | --- |
| Research a topic thoroughly | /octo:research or /octo:discover |
| Debate two approaches | /octo:debate |
| Build a feature end-to-end | /octo:embrace |
| Design a UI or style system | /octo:design |
| Review existing code | /octo:review |
| Write tests first, then code | /octo:tdd |
| Scan for vulnerabilities | /octo:security |
| Write a product spec | /octo:prd |
| Go from spec to shipping code | /octo:factory |
| Debug a tricky issue | /octo:debug |
| Just run something quick | /octo:quick |

Or skip the table — type /octo:auto <what you want> or just say octo <what you want>, and the smart router picks for you. 🔍


How It Works

Eight Tentacles, One Workflow

Claude Octopus coordinates up to eight AI providers — one per tentacle:

| Provider | Role |
| --- | --- |
| 🔴 Codex (OpenAI) | Implementation depth: code patterns, technical analysis, architecture |
| 🟡 Gemini (Google) | Ecosystem breadth: alternatives, security review, research synthesis |
| 🟣 Perplexity | Live web search: CVE lookups, dependency research, current docs |
| 🌐 OpenRouter | Alternative model routing: access 100+ models via single API |
| 🟢 Copilot (GitHub) | Zero-cost research: uses existing GitHub Copilot subscription |
| 🟤 Qwen (Alibaba) | Free-tier research: 1,000-2,000 requests/day via Qwen OAuth |
| ⚫ Ollama (Local) | Zero-cost local LLM: offline, privacy-sensitive, fallback |
| 🔵 Claude (Anthropic) | Orchestration: quality gates, consensus building, final synthesis |

Providers run in parallel for research, sequentially for problem scoping, and adversarially for review. A 75% consensus quality gate prevents questionable work from shipping. Only Claude is required — all others are optional and auto-detected.
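
How the 75% figure is computed is not specified here. One plausible reading is the share of participating providers that approve a result, and the sketch below assumes that reading. `consensus_gate` and its vote format are hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch: pass the gate when at least 75 percent of the
# votes are "approve". Each vote is passed as a separate argument.
consensus_gate() (
  threshold=75
  total=$#
  approvals=0
  if [ "$total" -eq 0 ]; then
    echo "fail (no votes)"
    exit 1
  fi
  for vote in "$@"; do
    if [ "$vote" = "approve" ]; then
      approvals=$((approvals + 1))
    fi
  done
  # Integer comparison avoids floating point:
  # approvals / total >= threshold / 100
  if [ $((approvals * 100)) -ge $((threshold * total)) ]; then
    echo "pass ($approvals/$total)"
  else
    echo "fail ($approvals/$total)"
    exit 1
  fi
)

consensus_gate approve approve approve reject   # 3 of 4 is exactly 75%: passes
consensus_gate approve approve reject || true   # 2 of 3 is below 75%: fails
```

Under this reading, three providers must be unanimous to pass, while four providers tolerate one dissent.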

Double Diamond Phases

Four structured phases adapted from the UK Design Council's methodology:

| Phase | Command | What happens |
| --- | --- | --- |
| Discover | /octo:discover | Multi-AI research and broad exploration |
| Define | /octo:define | Requirements clarification with consensus |
| Develop | /octo:develop | Implementation with quality gates |
| Deliver | /octo:deliver | Adversarial review and go/no-go scoring |

Run phases individually or all four with /octo:embrace. Configure autonomy: supervised (approve each phase), semi-autonomous (intervene on failures), or autonomous (run all four).

32 Personas

Specialized agents that activate automatically based on your request. When you say "audit my API for vulnerabilities," security-auditor activates. When you say "design a dashboard," ui-ux-designer takes over.

Categories: Software Engineering (11), Specialized Development (6), Documentation & Communication (5), Research & Strategy (3), Business & Compliance (3), Creative & Design (4).

Full persona reference | All 50 skills

Reaction Engine

When agents create PRs, the reaction engine monitors what happens next — CI failures, review comments, stale agents — and responds automatically. No new commands to learn. It fires transparently inside workflows you already use:

| Integration Point | When It Fires |
| --- | --- |
| /octo:parallel | Between poll cycles while monitoring work packages |
| /octo:sentinel | After triage scan completes |
| agent-registry.sh health --react | On-demand health check |

What it auto-handles:

| Event | Reaction | Limits |
| --- | --- | --- |
| CI failure | Collects failure logs into agent inbox | 3 retries, escalates after 30m |
| Changes requested | Collects review comments into agent inbox | 2 retries, escalates after 60m |
| Agent stuck | Escalates to human | After 15m with no progress |
| PR approved + CI green | Notifies you it's ready to merge | |
| PR merged | Marks agent complete | |
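
The limits above can be read as "keep reacting until either the retry budget or the time budget runs out, then hand over to a human." That interpretation is an assumption, as is the helper name `next_step`. A minimal sketch of the decision:

```shell
#!/bin/sh
# Hypothetical sketch: decide whether to react again or escalate to a human.
# Usage: next_step <retries_so_far> <max_retries> <minutes_elapsed> <escalate_after_min>
next_step() (
  # Escalate when either the retry budget or the time budget is exhausted.
  if [ "$1" -ge "$2" ] || [ "$3" -ge "$4" ]; then
    echo "escalate"
  else
    echo "retry"
  fi
)

next_step 1 3 10 30   # prints: retry    (CI failure, within both limits)
next_step 3 3 10 30   # prints: escalate (retry budget used up)
next_step 1 2 75 60   # prints: escalate (changes requested, past 60 minutes)
```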

Override defaults per project by creating .octo/reactions.conf:

# EVENT|ACTION|MAX_RETRIES|ESCALATE_AFTER_MIN|ENABLED
ci_failed|forward_logs|5|45|true
changes_requested|forward_comments|3|90|true
stuck|escalate|0|10|true
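
Assuming the file contains exactly the five pipe-separated fields named in the header comment, a rule can be looked up with standard tools. The following is a sketch only: `reaction_rule` is a hypothetical helper, not part of the plugin, and the sample file is written to a temporary location rather than to .octo/.

```shell
#!/bin/sh
# Hypothetical helper: print the rule for one event from a reactions file.
# Usage: reaction_rule <file> <event>
reaction_rule() (
  awk -F'|' -v event="$2" '
    /^[[:space:]]*#/ || NF == 0 { next }   # skip comments and blank lines
    $1 == event {
      printf "action=%s max_retries=%s escalate_after_min=%s enabled=%s\n", $2, $3, $4, $5
      found = 1
      exit                                 # jump to END after the first match
    }
    END { exit (found ? 0 : 1) }           # non-zero when the event has no rule
  ' "$1"
)

# Write the sample configuration to a temporary file.
conf=$(mktemp)
cat > "$conf" <<'EOF'
# EVENT|ACTION|MAX_RETRIES|ESCALATE_AFTER_MIN|ENABLED
ci_failed|forward_logs|5|45|true
changes_requested|forward_comments|3|90|true
stuck|escalate|0|10|true
EOF

reaction_rule "$conf" ci_failed
# prints: action=forward_logs max_retries=5 escalate_after_min=45 enabled=true

rm -f "$conf"
```

The non-zero exit status for a missing event lets a caller fall back to the built-in defaults.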

Reactions track 13 agent lifecycle states, including: running → pr_open → ci_pending → ci_failed / review_pending → changes_requested / approved → mergeable → merged → done.


Providers and Cost

Authentication

| Method | Codex | Gemini | Claude |
| --- | --- | --- | --- |
| OAuth (recommended) | codex login, included in ChatGPT subscription | Google account, included in AI subscription | Built into Claude Code |
| API key | OPENAI_API_KEY, per-token billing | GEMINI_API_KEY, per-token billing | Built into Claude Code |

OAuth users pay nothing beyond their existing subscriptions.

What Works Without External Providers

Everything except multi-AI features. You get all 32 personas, structured workflows, smart routing, context detection, and every skill. Multi-AI orchestration (parallel analysis, debate, consensus) activates when external providers are configured.


Trust and Safety

Namespace isolation — Only /octo:* commands and octo natural language prefix activate the plugin. Your existing Claude Code setup is untouched.

Data locations — Results in ~/.claude-octopus/results/, logs in ~/.claude-octopus/logs/, project state in .octo/. Nothing hidden.

Telemetry — Anonymous usage analytics help us improve the plugin. We collect session counts, workflow types used, and error rates — never prompts, file paths, or personal data. Disable anytime with OCTOPUS_TELEMETRY_OPT_OUT=1.

Provider transparency — Every command shows a 🐙 activation indicator on launch. Colored dots (🔴 🟡 🟣 🔵) show exactly which providers are running and when external APIs are called. You always know what's happening.

Clean uninstall — Run claude plugin uninstall octo from your terminal. If you see a scope error, add --scope project. No residual config changes.


OpenClaw Compatibility

Claude Octopus ships with a compatibility layer for OpenClaw, the open-source AI assistant framework. This lets you expose Octopus workflows to messaging platforms (Telegram, Discord, Signal, WhatsApp) without modifying the Claude Code plugin.

Architecture

Claude Code Plugin (unchanged)
  └── .mcp.json ─── MCP Server ─── orchestrate.sh
                                        ↑
OpenClaw Extension ─────────────────────┘

Three components, zero changes to the core plugin:

| Component | Location | Purpose |
| --- | --- | --- |
| MCP Server | mcp-server/ | Exposes 10 Octopus tools via Model Context Protocol |
| OpenClaw Extension | openclaw/ | Wraps workflows for OpenClaw's extension API |
| Skill Schema | mcp-server/src/schema/skill-schema.json | Universal skill metadata format |

MCP Server

The MCP server auto-starts when the plugin is enabled (via .mcp.json). It exposes:

  • octopus_discover, octopus_define, octopus_develop, octopus_deliver — Individual phases
  • octopus_embrace — Full Double Diamond workflow
  • octopus_debate, octopus_review, octopus_security — Specialized workflows
  • octopus_list_skills, octopus_status — Introspection

Any MCP-compatible client can connect to the server.
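
MCP messages are JSON-RPC 2.0, and `tools/list` and `tools/call` are standard MCP methods. As a sketch of what a client would send once the initialize handshake has completed, the helper below builds both requests. The argument schema of the Octopus tools is not documented in this README, so the empty `arguments` object is an assumption, and `mcp_request` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: build one JSON-RPC 2.0 request line for an MCP server.
# Usage: mcp_request <id> <method> [params-json]
mcp_request() (
  params=$3
  # Default to an empty params object when none is given.
  if [ -z "$params" ]; then
    params='{}'
  fi
  printf '{"jsonrpc":"2.0","id":%s,"method":"%s","params":%s}\n' "$1" "$2" "$params"
)

# Ask the server which tools it exposes.
mcp_request 1 tools/list
# Call one tool; the empty argument object is an assumption.
mcp_request 2 tools/call '{"name":"octopus_status","arguments":{}}'
```

How these lines reach the server depends on the transport configured in .mcp.json, which is not shown here.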

OpenClaw Extension

Install in an OpenClaw instance from git:

npm install github:nyldn/claude-octopus#main --prefix openclaw

Or clone and link locally:

cd openclaw && npm install && npm run build

The extension registers as an OpenClaw plugin with configurable workflows, autonomy modes, and Claude Code path resolution.

Build & Validate

./scripts/build-openclaw.sh          # Regenerate skill registry from frontmatter
./scripts/build-openclaw.sh --check  # CI mode — exits non-zero if out of sync
./tests/validate-openclaw.sh         # 13-check validation suite

FAQ

Do I need all eight AI providers? No. Claude plus one external provider gives you multi-AI features. With no external providers you still get personas, workflows, and skills.

Will this break my existing Claude Code setup? No. Activates only with the octo prefix. Results stored separately. Uninstalls cleanly.

What happens if a provider times out? The workflow continues with available providers. You'll see the status in the visual indicators.

Why "octopus"? 🐙 Fun fact: a real octopus has three hearts, blue blood, and 500 million neurons — two-thirds of which live in its eight arms. Each arm can taste, touch, and act independently. Claude Octopus works the same way: each tentacle (command) operates autonomously with its own squeeze of logic, then ink flows back as the final deliverable. The crossfire review? That's the squeeze — adversarial pressure that untangles everything before it ships.


Documentation


Attribution


Community

Join r/ClaudeOctopus for help, workflow tips, showcases, and updates.

Contributing

  1. Report issues
  2. Submit PRs following existing code style
  3. git clone https://github.com/nyldn/claude-octopus.git && make test

See CONTRIBUTING.md for details.


License

MIT — see LICENSE

nyldn | MIT License | r/ClaudeOctopus | Report Issues
