fix: Fix non-proxy host pattern matching to treat dots as literals#6664
Conversation
Co-authored-by: sbuettner <1607241+sbuettner@users.noreply.github.com>
| return Arrays.stream(nonProxyHosts.split("\\|", -1)) | ||
| .map( | ||
| token -> | ||
| Arrays.stream(token.split("\\*", -1)) | ||
| .map(Pattern::quote) | ||
| .collect(Collectors.joining(".*"))) | ||
| .collect(Collectors.joining("|")); |
There was a problem hiding this comment.
Note that this is called for every request, so we might want to consider not constantly re-evaluating this logic.
When refactoring into this class I kept the previous behavior of reading the env var/property for every request, but we probably could create and cache the patterns only once if we're ok with not being able to update the non proxy patterns at runtime (which then would be aligned with the proxy host configuration).
There was a problem hiding this comment.
We could also cache the toRegEx result under the nonProxyHosts key so it would even work when changing the non-proxy hosts no?
There was a problem hiding this comment.
Added a ConcurrentHashMap-based REGEX_CACHE in NonProxyHosts that caches toRegex results keyed by the raw pattern string (commit 1beeb51). This avoids recomputing the regex on every request while still supporting runtime changes to the non-proxy host patterns.
|
@maff Can we open this PR? |
|
@maff How shall we proceed with this? |
|
@ztefanie @sbuettner can you take that over in the team? I think the general solution looks good, but it needs to be defined when these patterns need to be evaluated. Historically this is done per request, but probably this can be done once as for the other configurations? Does anybody need to change them at runtime? |
|
@maff Can you please create an issue what should be done by the team to follow our workflows so we can review and groom this? |
|
There is already an issue #6419 which I think @sbuettner assigned to Copilot to create this PR. |
|
@copilot resolve the merge conflicts in this pull request |
|
@ztefanie Thanks for the ping. Can you give this a quick review after copilot fixed the conflicts? |
There was a problem hiding this comment.
Pull request overview
This PR fixes non-proxy host pattern matching in the shared HTTP client proxy support so that . and other regex metacharacters are treated literally, aligning behavior with the documented “* wildcard only” semantics and preventing unintended host matches.
Changes:
- Update
NonProxyHosts.toRegexto escape non-wildcard characters viaPattern.quotewhile still supporting*wildcards and|-separated alternatives. - Extend
NonProxyHostsTestwith edge cases proving thatexample.dedoes not matchexampleXdeorexample-de.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| connector-commons/http-client/src/main/java/io/camunda/connector/http/client/proxy/NonProxyHosts.java | Fixes pattern→regex conversion to avoid treating . as a regex wildcard. |
| connector-commons/http-client/src/test/java/io/camunda/connector/http/client/proxy/NonProxyHostsTest.java | Adds regression assertions for dot-literal behavior. |
| static String toRegex(String nonProxyHosts) { | ||
| return nonProxyHosts.replace("*", ".*"); | ||
| return Arrays.stream(nonProxyHosts.split("\\|", -1)) | ||
| .map( | ||
| token -> | ||
| Arrays.stream(token.split("\\*", -1)) | ||
| .map(Pattern::quote) | ||
| .collect(Collectors.joining(".*"))) | ||
| .collect(Collectors.joining("|")); |
There was a problem hiding this comment.
toRegex() now performs multiple split() operations and Pattern.quote() allocations on every isNonProxyHost() call, and String#matches() recompiles the regex each time. Since this runs on every request when proxying, consider caching precompiled Pattern instances keyed by the current system property / env-var value (or precomputing once per request configuration) to avoid repeated regex building/compilation in hot paths.
There was a problem hiding this comment.
Done in commit ee9c977. Changed the ConcurrentHashMap to cache precompiled Pattern instances (keyed by the raw pattern string) and updated isNonProxyHost to use toPattern(nonProxyHostsPattern).matcher(hostname).matches(), avoiding both per-request regex compilation and per-request string-to-regex conversion.
Co-authored-by: sbuettner <1607241+sbuettner@users.noreply.github.com>
Merge conflicts resolved in commit Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
…egex recompilation per request Agent-Logs-Url: https://github.com/camunda/connectors/sessions/f812f7be-a429-4bea-9d98-2a6874d85aa9 Co-authored-by: sbuettner <1607241+sbuettner@users.noreply.github.com>
|
Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details. Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
| System.setProperty("http.nonProxyHosts", "localhost|*.example.com"); | ||
| assertThat(NonProxyHosts.getNonProxyHostRegexPatterns()) | ||
| .containsExactly("localhost|.*.example.com"); | ||
| .containsExactly("\\Qlocalhost\\E|\\Q\\E.*\\Q.example.com\\E"); |
There was a problem hiding this comment.
Fixed in commit 74a9978. Replaced the raw "\\Qlocalhost\\E" / "\\Q\\E.*\\Q.example.com\\E" string literals with Pattern.quote("localhost") / Pattern.quote("") + ".*" + Pattern.quote(".example.com") expressions.
…stsTest assertions Agent-Logs-Url: https://github.com/camunda/connectors/sessions/91486d4d-40cd-4954-a536-a1c38898007d Co-authored-by: sbuettner <1607241+sbuettner@users.noreply.github.com>
|
Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details. Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
|
🎉 This pull request has been included in release Thank you for your contribution! 🚀 |
NonProxyHosts.toRegexwas naively replacing*with.*without escaping other regex metacharacters, causing.in patterns to match any character. For example,example.dewould incorrectly matchexampleXde.Changes
NonProxyHosts.toRegex: Split each|-delimited token on*, escape each segment withPattern.quote, then rejoin with.*. This makes all characters except*literal, matching the documented wildcard-only semantics.NonProxyHostspattern caching: Replaced string-based regex caching with aConcurrentHashMap<String, Pattern>that stores precompiledPatterninstances keyed by the raw pattern string.isNonProxyHostnow usestoPattern(nonProxyHostsPattern).matcher(hostname).matches()instead ofString#matches(), avoiding both per-request regex string recomputation and regex recompilation. Runtime changes to non-proxy host patterns are still handled correctly — a new pattern string produces a new cache entry on first use.NonProxyHostsTest: Added edge-case assertions thatexample.dedoes not matchexampleXdeorexample-de. UpdatedgetNonProxyHostRegexPatternstest assertions to reflect the newPattern.quote-escaped regex format.Original prompt
💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.