Claude Feature parity for Open WebUI - streaming, prompt caching, tool use, extended thinking, code execution with file support, programmatic tool calling, agent skills, and more.
## Overview
A comprehensive Anthropic API integration for Open WebUI, built on the Anthropic Python SDK. I tried to include as many Claude API features as possible while staying compatible with all Open WebUI features. It enables Claude to orchestrate complex multi-tool workflows:

- "Grab my Jira tasks and send a summary on Slack" → works token-efficiently and in a single request with programmatic tool calling and parallel execution.
- "Check out this finance report - extract the important information and distill it into a nice PowerPoint presentation" → code_execution with Files API upload/download support and the pptx skill will do the trick.
- "What's the meaning of life?" → Extended Thinking can use up to 64k tokens, and web_search can ask the internet before giving a good answer. Interleaved Thinking lets Claude reflect on the path it took even during the response.
## Key Highlights

| Feature | Description |
|---|---|
| Programmatic Tool Calling | Claude orchestrates tools through code execution → multi-tool workflows in one go |
| Parallel Execution | Independent tools execute simultaneously |
| Prompt Caching | 4-level cache for system prompts, tools, and messages; compatible with RAG & Memory |
| Extended Thinking | Classic budget, adaptive, and interleaved thinking with live streaming |
| Code Execution | Sandboxed Python with persistent container state, file upload/download |
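As a rough illustration of how programmatic tool calling pairs the server-side code execution tool with a client-side tool, here is a hedged request sketch. The versioned tool type and beta header strings follow Anthropic's documentation at the time of writing and may change; `get_jira_tasks` is a hypothetical tool invented for this example.

```python
# Sketch: code execution plus a client-side tool in one request.
payload = {
    "model": "claude-sonnet-4-5",
    "max_tokens": 8192,
    "tools": [
        # Server-side sandboxed Python (runs in Anthropic's container).
        # The type string is a versioned identifier - verify against current docs.
        {"type": "code_execution_20250522", "name": "code_execution"},
        # An ordinary client-side tool the pipe executes locally; with
        # programmatic tool calling Claude can invoke it from inside the sandbox.
        {
            "name": "get_jira_tasks",  # hypothetical, for illustration only
            "description": "Fetch the current user's open Jira tasks.",
            "input_schema": {"type": "object", "properties": {}},
        },
    ],
    "messages": [
        {"role": "user", "content": "Grab my Jira tasks and summarize them."}
    ],
}

# Sent with the matching beta header, e.g.:
#   client.beta.messages.create(betas=["code-execution-2025-05-22"], **payload)
```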
Default location for web searches (city, region, country, timezone)
### Cache Control Options

| Option | Description |
|---|---|
| cache disabled | No caching |
| cache tools array only | Cache tool definitions |
| cache tools array and system prompt | Cache tools + system prompt |
| cache tools array, system prompt and messages | Full caching (recommended) |
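The "full caching" option boils down to placing `cache_control` markers at the end of each stable prefix, so the API caches everything up to that breakpoint. A minimal sketch (the pipe uses up to four breakpoints; the content here is placeholder text):

```python
# Breakpoint 1: last tool definition -> caches the whole tools array.
tools = [
    {"name": "web_search_stub", "description": "Example tool.",
     "input_schema": {"type": "object"}},
]
tools[-1]["cache_control"] = {"type": "ephemeral"}

# Breakpoint 2: system prompt.
system = [
    {"type": "text", "text": "You are a helpful assistant.",
     "cache_control": {"type": "ephemeral"}},
]

# Breakpoint 3: last *stable* message - the final user turn always changes,
# so the marker goes on the turn before it to avoid constant cache misses.
messages = [
    {"role": "user", "content": [
        {"type": "text", "text": "Earlier question."}]},
    {"role": "assistant", "content": [
        {"type": "text", "text": "Earlier answer.",
         "cache_control": {"type": "ephemeral"}}]},
    {"role": "user", "content": "New question - never cached."},
]
```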
💡 RAG & Memory: The pipe is aware of your settings and your intention. For example, if you attach a PDF document in full context mode with NATIVE_PDF_UPLOAD active, it removes the RAG prompt and sources entirely. If additional knowledge is added, it strips the PDF sources from RAG and moves the caching point to the previous messages, since the last message is now always changing. When the Memory system is active, it also extracts memories from the system prompt and adds them to the last user message to avoid cache misses. If you're encountering problems, feel free to open an issue!
### UserValves (Per-User Settings)

| Valve | Default | Range | Description |
|---|---|---|---|
| ENABLE_THINKING | false | – | Enable Extended Thinking |
| THINKING_BUDGET_TOKENS | 8192 | 1024–64000 | Token budget for thinking |
| EFFORT | high | low/medium/high/max | Effort level (also controllable via OpenWebUI's reasoning_effort) |
| USE_PDF_NATIVE_UPLOAD | true | – | Visual PDF analysis instead of RAG extraction |
| SHOW_TOKEN_COUNT | false | – | Show context window progress bar |
| WEB_SEARCH_MAX_USES | 5 | 1–20 | Max web searches per turn |
| WEB_FETCH_MAX_USES | 5 | 1–20 | Max web fetch requests per turn |
| WEB_SEARCH_USER_* | – | – | Override global location settings |
| SKILLS | [] | – | Skills to activate (e.g., pptx, xlsx, docx, pdf, or custom IDs) |
| DEBUG_MODE | false | – | Logs some internal and external parameters as a citation to send me for debugging ;) |
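To illustrate what USE_PDF_NATIVE_UPLOAD changes under the hood: instead of RAG text extraction, the PDF bytes are sent to Claude as a native document block for visual analysis. A self-contained sketch (the placeholder bytes stand in for a real file):

```python
import base64

pdf_bytes = b"%PDF-1.4 placeholder"  # stand-in for the real file contents

# Native PDF document block, as accepted by the Anthropic Messages API.
content = [
    {
        "type": "document",
        "source": {
            "type": "base64",
            "media_type": "application/pdf",
            "data": base64.b64encode(pdf_bytes).decode("ascii"),
        },
    },
    {"type": "text", "text": "Extract the key figures from this report."},
]
message = {"role": "user", "content": content}
```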
### Toggle Filters & Companion

| Filter | Purpose |
|---|---|
| Thinking Toggle | Enable thinking for the next message |
| Companion Filter | Intercepts OpenWebUI's built-in web_search / code_interpreter buttons and routes them to native Anthropic tools |
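For intuition, here is a hedged sketch of what a companion filter's `inlet()` might do. The `body["features"]` field follows Open WebUI filter conventions but may vary by version, and the `metadata` flag name is invented for this example; the real filter's logic is more involved.

```python
class Filter:
    def inlet(self, body: dict) -> dict:
        """Intercept Open WebUI's built-in web_search toggle (sketch)."""
        features = body.get("features") or {}
        if features.get("web_search"):
            # Disable Open WebUI's own search pipeline...
            features["web_search"] = False
            # ...and flag the pipe to add Anthropic's native web_search tool
            # instead (flag name is hypothetical).
            body.setdefault("metadata", {})["anthropic_web_search"] = True
        body["features"] = features
        return body
```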
## Changelog
### v0.8.1

- Added experimental Files API support for uploading files to the container. Feedback welcome!
- Added a valve to control whether Opus/Sonnet 4.6 should use the new dynamic web_fetching and web_searching (at least I have issues with that)
### v0.8.0

- Major streaming refactor: rebuilt on Anthropic SDK message accumulation
- Programmatic Tool Calling → Claude orchestrates tools from within code execution
- Web Fetch tool → Claude can fetch and analyze URL content
- Fine-grained tool streaming with eager input streaming (GA)
- Unified code execution display (code + tool calls + output in one block)
- Updated web_search to web_search_20260209 with dynamic filtering
- Citations now correctly appear after cited text
- Tool search status shows actual search query
- Model capabilities updated for Sonnet 4.5/4.6 and Opus 4.6
- Stop reason debug logging for tool loop diagnostics
### v0.7.1

- Removed deprecated Sonnet 3.7 and Haiku 3 models

### v0.7.0

- Sonnet 4.6 model support
- Fast Mode for Opus 4.6 (speed: "fast")
- Web fetch tool (URL content retrieval)
- Memory tool integration with OpenWebUI memory system
- Fixed task model bug (_run_task_model_request() extra argument)

### v0.6.3

- Opus 4.6 model support
- Effort: max support
- Data Residency (inference_geo) support
- Messages for stop_reason (refusal, stop_sequence, context exceeded)
- ENABLE_INTERLEAVED_THINKING valve
- Homogenized thinking and tool call/result streaming to match built-in UX

### v0.6.2

- Reordered payload for better caching

### v0.6.1

- Full Skills support (pptx, xlsx, docx, pdf, custom) with API validation and caching

### v0.6.0

- Live thinking streaming with collapsible blocks
- Companion Filter for routing OpenWebUI web_search/code_interpreter to Anthropic tools
- Files API upload for code execution file access
- Built-in OpenWebUI tools support (0.7.x)
- Native PDF markers for multi-turn file persistence
- Container ID persistence for code execution state
- Fixed RAG + Native PDF Upload interaction
### v0.5.x

#### v0.5.12

- Thinking is now streamed in the UI and folded when the thought process has ended

#### v0.5.11

- Compatibility with built-in tools from OpenWebUI 0.7.x

#### v0.5.10

- Pre-compiled regex patterns at module level
- Debug logging guards for expensive JSON serialization
- Comprehensive docstring and section comments

#### v0.5.9

- Native PDF upload via USE_PDF_NATIVE_UPLOAD UserValve