| 1 | +# mcp.proxy |
| 2 | + |
| 3 | +Aggregates multiple upstream MCP (Model Context Protocol) servers behind a single |
| 4 | +Streamable HTTP endpoint on port `7114`, with a shared in-memory cache for |
| 5 | +`tools` / `prompts` / `resources` listings. |
| 6 | + |
| 7 | +```text |
| 8 | +              ┌──────────────────────────── Zilla ─────────────────────────────┐ |
| 9 | +              │  tcp(7114) → http → mcp(server) → mcp(proxy)                   │ |
| 10 | +client ──────►│                                       │                        │ |
| 11 | +              │                           ┌───────────┴───────────┐            │ |
| 12 | +              │                           ▼                       ▼            │ |
| 13 | +              │                mcp(client) everything   mcp(client) urlelicit  │ |
| 14 | +              │                           │                       │            │ |
| 15 | +              │                           ▼                       ▼            │ |
| 16 | +              │                         http →                  http →         │ |
| 17 | +              │                          tcp                     tcp           │ |
| 18 | +              └───────────────────────────┼───────────────────────┼────────────┘ |
| 19 | +                                          ▼                       ▼ |
| 20 | +                                   everything:3001         urlelicit:3003 |
| 21 | +                                 (reference server)    (url-mode elicitation) |
| 22 | +``` |
| 23 | + |
| 24 | +The proxy demonstrates, in one configuration: |
| 25 | + |
| 26 | +- **Multi-toolkit aggregation** — `routes[].when.toolkit` fans one Streamable HTTP |
| 27 | + endpoint into multiple upstream MCP servers. |
| 28 | +- **Shared cache** — `options.cache` backs `tools` / `prompts` / `resources` |
| 29 | + listings with an in-memory store and a five-minute TTL. |
| 30 | +- **Protocol version negotiation** — Zilla offers MCP protocol `2025-11-25` and |
| 31 | + negotiates down per peer; the negotiated version is echoed on the `initialize` |
| 32 | + response and stamped on every upstream request. |
| 33 | +- **Form elicitation pass-through** — Zilla forwards MCP `elicitation/create` |
| 34 | + messages in both directions, so any upstream that elicits structured user input |
| 35 | + drives the flow through the gateway with no extra configuration. The |
| 36 | + `everything` reference server's `trigger-elicitation-request-async` tool |
| 37 | + exercises this directly. |
| 38 | +- **URL-mode elicitation pass-through (SEP-1036)** — when the client advertises |
| 39 | + the `elicitation.url` capability, Zilla advertises it to the upstream, so a |
| 40 | + url-mode-capable upstream can ask the user to complete a secure out-of-band |
| 41 | + interaction in their browser. Zilla relays the `mode:"url"` `elicitation/create` |
| 42 | + request and the `notifications/elicitation/complete` notification end-to-end. |
| 43 | + The `urlelicit` server's `authorize` tool exercises this directly. |
| 44 | +- **MCP metrics** — the `mcp(server)` binding records per-method counters and |
| 45 | + duration histograms (`mcp.initialize`, `mcp.tools.list`, `mcp.tools.call`, |
| 46 | + `mcp.prompts.*`, `mcp.resources.*`), dimensioned by `method`, `tool`, and |
| 47 | + `outcome`, exported over Prometheus on port `7190`. |
| 48 | + |
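The tool names used later in this README (`everything__echo`, `urlelicit__authorize`) suggest that the gateway exposes each upstream tool as `<toolkit>__<tool>`. The helper below is a minimal sketch of splitting such a name on the client side. The `__` separator is inferred from those examples rather than taken from the Zilla documentation, and `toolkit_of` and `tool_of` are hypothetical names.

```bash
#!/usr/bin/env bash
# Split a gateway-qualified tool name into toolkit prefix and upstream tool.
# ASSUMPTION: the "__" separator is inferred from the names in this README.

toolkit_of() {
  # Keep everything before the first "__".
  printf '%s\n' "${1%%__*}"
}

tool_of() {
  # Keep everything after the first "__".
  printf '%s\n' "${1#*__}"
}
```

For example, `toolkit_of everything__echo` prints `everything` and `tool_of everything__echo` prints `echo`. A name without `__` is returned unchanged by both helpers.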
| 49 | +## Requirements |
| 50 | + |
| 51 | +- docker compose |
| 52 | + |
| 53 | +## Setup |
| 54 | + |
| 55 | +```bash |
| 56 | +docker compose up -d |
| 57 | +``` |
| 58 | + |
| 59 | +This starts Zilla plus two locally reachable upstream MCP servers: a Node |
| 60 | +`everything` reference server on `:3001`, and a minimal `urlelicit` server on |
| 61 | +`:3003` that demonstrates url-mode elicitation. |
| 62 | + |
| 63 | +## Verify |
| 64 | + |
| 65 | +Run the automated smoke test that the build workflow uses: |
| 66 | + |
| 67 | +```bash |
| 68 | +./.github/test.sh |
| 69 | +``` |
| 70 | + |
| 71 | +Or drive the gateway interactively with the MCP Inspector: |
| 72 | + |
| 73 | +```bash |
| 74 | +npx @modelcontextprotocol/inspector http://localhost:7114/mcp |
| 75 | +``` |
| 76 | + |
| 77 | +### Initialize the MCP session |
| 78 | + |
| 79 | +A client that supports url-mode elicitation advertises the `elicitation.url` |
| 80 | +capability and offers protocol `2025-11-25`; Zilla echoes the negotiated version: |
| 81 | + |
| 82 | +```bash |
| 83 | +curl -N http://localhost:7114/mcp \ |
| 84 | + -H "Content-Type: application/json" \ |
| 85 | + -H "Accept: application/json, text/event-stream" \ |
| 86 | + -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-11-25","capabilities":{"elicitation":{"url":{}}},"clientInfo":{"name":"curl","version":"0"}}}' |
| 87 | +``` |
| 88 | + |
| 89 | +The `result.protocolVersion` in the response is `2025-11-25`. Because the client |
| 90 | +advertised `elicitation.url`, Zilla advertises the same capability on its |
| 91 | +`initialize` request to each upstream. |
| 92 | + |
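Follow-up requests on a Streamable HTTP session normally need values taken from the `initialize` response. The sketch below pulls out the negotiated `protocolVersion` and, if the gateway issues one, the `Mcp-Session-Id` header defined by the MCP Streamable HTTP transport. Whether this example's gateway returns a session id is an assumption; the function names and the `headers.txt` and `body.txt` file names are hypothetical.

```bash
#!/usr/bin/env bash
# Helpers for reading an MCP initialize response.
# ASSUMPTION: the gateway may return an Mcp-Session-Id response header.

extract_session_id() {
  # Read raw HTTP response headers on stdin and print the session id, if any.
  # Header names are matched case-insensitively; CRs are stripped first.
  tr -d '\r' | awk -F': *' 'tolower($1) == "mcp-session-id" { print $2; exit }'
}

extract_protocol_version() {
  # Read a JSON or SSE ("data: {...}") body on stdin and print the first
  # protocolVersion value found. This is pattern matching, not a JSON parser.
  sed -n 's/.*"protocolVersion"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Usage sketch (not run here): save headers and body from the initialize call,
# then feed them to the helpers.
#   curl -s -D headers.txt -o body.txt http://localhost:7114/mcp ... -d '<initialize request>'
#   SESSION_ID="$(extract_session_id < headers.txt)"
#   VERSION="$(extract_protocol_version < body.txt)"
```

If a session id comes back, later requests would carry it as an `Mcp-Session-Id` request header.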
| 93 | +### Trigger a form elicitation round-trip |
| 94 | + |
| 95 | +Call the `everything` server's elicitation-demo tool through the gateway: |
| 96 | + |
| 97 | +```bash |
| 98 | +curl -N http://localhost:7114/mcp \ |
| 99 | + -H "Content-Type: application/json" \ |
| 100 | + -H "Accept: application/json, text/event-stream" \ |
| 101 | + -d '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"everything__trigger-elicitation-request-async"}}' |
| 102 | +``` |
| 103 | + |
| 104 | +`@modelcontextprotocol/server-everything` responds with an `elicitation/create` |
| 105 | +JSON-RPC request bound for the client. Zilla forwards it back through |
| 106 | +`mcp(client) → mcp(proxy) → mcp(server) → http(server)` without unwrapping. |
| 107 | + |
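To finish the round-trip, the client answers the `elicitation/create` request with a JSON-RPC response whose `result.action` is `accept`, `decline`, or `cancel`, with `content` present only on `accept`. The helper below is a minimal sketch of building that response body. `elicitation_reply` is a hypothetical name, and the request id and content are placeholders you would take from the request actually received.

```bash
#!/usr/bin/env bash
# Build the JSON-RPC response a client sends back for elicitation/create.
# $1 = request id as raw JSON (e.g. 7 or "\"abc\"")
# $2 = action: accept | decline | cancel
# $3 = JSON object with the user's answers (accept only; defaults to {})

elicitation_reply() {
  local id="$1" action="$2" content="${3-}"
  [ -n "$content" ] || content='{}'
  case "$action" in
    accept)
      printf '{"jsonrpc":"2.0","id":%s,"result":{"action":"accept","content":%s}}\n' "$id" "$content"
      ;;
    decline|cancel)
      # Declined or cancelled elicitations carry no content.
      printf '{"jsonrpc":"2.0","id":%s,"result":{"action":"%s"}}\n' "$id" "$action"
      ;;
    *)
      echo "unknown action: $action" >&2
      return 1
      ;;
  esac
}
```

The output can be posted to `http://localhost:7114/mcp` with the same `Content-Type` and `Accept` headers as the requests above.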
| 108 | +### Trigger a url-mode elicitation round-trip |
| 109 | + |
| 110 | +Use MCP Inspector (which knows how to handle `elicitation/create`) to call the |
| 111 | +`urlelicit` server's `authorize` tool: |
| 112 | + |
| 113 | +```text |
| 114 | +urlelicit__authorize { "resource": "demo" } |
| 115 | +``` |
| 116 | + |
| 117 | +The `urlelicit` server replies with a `mode:"url"` `elicitation/create` request |
| 118 | +carrying a link for the user to open in their browser. Zilla relays it back to |
| 119 | +the client unchanged; once the out-of-band interaction completes, the server |
| 120 | +sends `notifications/elicitation/complete`, which Zilla also relays. URL-mode |
| 121 | +elicitation only flows when the client advertised `elicitation.url` at |
| 122 | +`initialize` — a form-only or older client never sees the url request. |
| 123 | + |
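For clients scripted without the Inspector, the link has to be pulled out of the relayed request. The sketch below assumes, based on a reading of SEP-1036, that a url-mode request carries `mode`, `elicitationId`, `url`, and `message` in `params`. `elicitation_url` is a hypothetical helper and uses pattern matching rather than a JSON parser.

```bash
#!/usr/bin/env bash
# Print the link from a url-mode elicitation/create request read on stdin.
# ASSUMPTION: params contains "mode":"url" and a "url" field (per SEP-1036).

elicitation_url() {
  local msg
  msg="$(cat)"
  # Only url-mode requests carry a link; anything else is rejected.
  printf '%s\n' "$msg" | grep -Eq '"mode"[[:space:]]*:[[:space:]]*"url"' || return 1
  # Match the "url" key (followed by a colon), not the "url" mode value.
  printf '%s\n' "$msg" |
    sed -n 's/.*"url"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}
```

The function returns a non-zero status for form-mode requests, so a caller can branch on the elicitation mode before opening a browser.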
| 124 | +### Observe the cache |
| 125 | + |
| 126 | +Repeat a `tools/list` request within five minutes and tail Zilla's logs: |
| 127 | + |
| 128 | +```bash |
| 129 | +docker compose logs -f zilla | grep mcp.proxy.cache |
| 130 | +``` |
| 131 | + |
| 132 | +The first call shows a cache miss; subsequent ones within `ttl` are served from |
| 133 | +memory. |
| 134 | + |
| 135 | +### Observe MCP metrics |
| 136 | + |
| 137 | +The `mcp(server)` binding is configured with `telemetry.metrics: [mcp.*]` and |
| 138 | +records each request as a counter plus a duration histogram, attributed by |
| 139 | +`method`, `tool`, and `outcome`. Scrape them from the Prometheus endpoint: |
| 140 | + |
| 141 | +```bash |
| 142 | +curl -s http://localhost:7190/metrics | grep '^mcp_' |
| 143 | +``` |
| 144 | + |
| 145 | +After an `everything__echo` tool call, for example, you will see: |
| 146 | + |
| 147 | +```text |
| 148 | +mcp_tools_call_total{method="tools.call",outcome="ok",tool="everything__echo"} 1 |
| 149 | +``` |
| 150 | + |
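When several tools have been called, the scrape contains one series per label combination. The helper below is a small sketch for totalling a counter across series, optionally filtered by a label substring. `sum_metric` is a hypothetical name, and it assumes sample lines end with the value (no trailing timestamp).

```bash
#!/usr/bin/env bash
# Sum a Prometheus counter read from stdin in text exposition format.
# $1 = metric name, $2 = optional label substring such as 'outcome="ok"'
# ASSUMPTION: each sample line ends with its value (no timestamp column).

sum_metric() {
  awk -v name="$1" -v match_str="${2-}" '
    /^#/ { next }                                   # skip HELP/TYPE comments
    index($0, name "{") == 1 || index($0, name " ") == 1 {
      if (match_str == "" || index($0, match_str) > 0) total += $NF
    }
    END { print total + 0 }
  '
}

# Usage sketch (not run here):
#   curl -s http://localhost:7190/metrics | sum_metric mcp_tools_call_total 'outcome="ok"'
```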
| 151 | +## Teardown |
| 152 | + |
| 153 | +```bash |
| 154 | +docker compose down -v |
| 155 | +``` |
| 156 | + |
| 157 | +## References |
| 158 | + |
| 159 | +- [Zilla docs — `binding-mcp`](https://docs.aklivity.io/zilla/latest/reference/config/bindings/binding-mcp.html) |
| 160 | +- [MCP — Streamable HTTP transport](https://modelcontextprotocol.io/docs/concepts/transports) |
| 161 | +- [MCP — elicitation](https://modelcontextprotocol.io/specification/2025-11-25/client/elicitation) |
| 162 | +- [SEP-1036 — URL mode elicitation](https://modelcontextprotocol.io/seps/1036-url-mode-elicitation-for-secure-out-of-band-intera) |