fix(instrumentation-http)!: drop url.parse in favor of URL constructor #5091
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## main #5091 +/- ##
==========================================
- Coverage 93.18% 93.17% -0.02%
==========================================
Files 315 315
Lines 8086 8086
Branches 1617 1617
==========================================
- Hits 7535 7534 -1
- Misses 551 552 +1
experimental/packages/opentelemetry-instrumentation-http/src/utils.ts
Outdated
requestUrl?.hostname ||
host?.replace(/^(.*)(:[0-9]{1,5})/, '$1') ||
'localhost';
const host = headers.host;
Is the host header guaranteed to be there? What about the previous check that got the host from the URL? It is no longer there for the host and hostname calculation. Maybe I'm missing something; I'm still going through the changes.
AFAIK, IncomingMessage.url never includes the host or protocol (see https://nodejs.org/en/learn/modules/anatomy-of-an-http-transaction#method-url-and-headers; the actual API docs don't seem to state that, though).
Under the assumption that this is true, I concluded that in the old implementation, when parsing the URL with url.parse on L679, a hostname can never be parsed from the URL. On L680 of the old code, we'll therefore always fall back to IncomingMessage.headers.host because requestUrl.hostname is always null.
For the new implementation that means: we don't try to parse the URL to get the host because it'll never be there; our only chance to get it is the host header. If it's not there, we can fall back to localhost as the old code did.
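To illustrate the reasoning above, here is a minimal sketch. It assumes an origin-form request target such as '/users?id=42' (the kind of value the comment expects in IncomingMessage.url). The helper name hostnameFromHostHeader and the sample values are hypothetical and only mirror the fallback quoted in the diff; this is not the instrumentation's actual code.

```typescript
import { parse } from 'node:url';

// Hypothetical helper mirroring the fallback from the diff:
// strip a trailing port from the Host header, else use 'localhost'.
function hostnameFromHostHeader(host: string | undefined): string {
  return host?.replace(/^(.*)(:[0-9]{1,5})/, '$1') || 'localhost';
}

// Assumed shape of IncomingMessage.url: path and query only, no host.
const requestTarget = '/users?id=42';

// The legacy parser accepts the relative target but finds no hostname in it.
const legacy = parse(requestTarget);
console.log(legacy.hostname); // null
console.log(legacy.pathname); // '/users'

// So the hostname can only come from the Host header (or the default).
console.log(hostnameFromHostHeader('example.com:8080')); // 'example.com'
console.log(hostnameFromHostHeader(undefined)); // 'localhost'
```

If a request target ever did carry a host, this sketch would ignore it, which matches the assumption stated in the comment rather than proving it.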
… cannot be parsed
experimental/CHANGELOG.md
Outdated
### :boom: Breaking Change

* fix(instrumentation-http): drop url.parse in favor of URL constructor [#5091](https://github.com/open-telemetry/opentelemetry-js/pull/5091) @pichlermarc
  * (user-facing): signature of `getRequestInfo()` now requires a `DiagLogger` to be passed at the first position
Note for reviewers: we're exporting lots of utils. I don't think this is intentional, but it's still technically a breaking change.
hectorhdzg
left a comment
LGTM
host?.replace(/^(.*)(:[0-9]{1,5})/, '$1') ||
'localhost';
const host = headers.host;
const hostname = host?.replace(/^(.*)(:[0-9]{1,5})/, '$1') || 'localhost';
const hostname = host?.replace(/^(.*)(:[0-9]{1,5})/, '$1') || 'localhost';
const hostname = host?.replace(/^(.*)(:[0-9]{1,5})$/, '$1') || 'localhost';
Perhaps anchor at the end of the string? I'm not sure if an IPv6 address (with :[0-9]+ segments) could get in here as host. If not, ignore this comment.
Ah, I see that this is just re-using code that was already there. I think you can ignore this comment.
> Ah, I see that this is just re-using code that was already there. I think you can ignore this comment.
Yep, I just re-used it 🙂 Let's keep it for now and adjust it later if necessary.
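For anyone revisiting the anchoring question later, here is a small sketch comparing the existing pattern with the suggested end-anchored one. It assumes a Host header can carry a bracketed IPv6 literal without a port (e.g. '[::1]'); the thread leaves open whether that actually reaches this code. The stripPort helper is hypothetical.

```typescript
// Pattern currently used in the diff (no end anchor).
const unanchored = /^(.*)(:[0-9]{1,5})/;
// Pattern from the review suggestion (anchored at the end of the string).
const anchored = /^(.*)(:[0-9]{1,5})$/;

// Hypothetical helper: apply one of the patterns to a Host header value.
function stripPort(host: string, pattern: RegExp): string {
  return host.replace(pattern, '$1');
}

// Both patterns agree when a port is present.
console.log(stripPort('example.com:8080', unanchored)); // 'example.com'
console.log(stripPort('example.com:8080', anchored)); // 'example.com'
console.log(stripPort('[::1]:8080', unanchored)); // '[::1]'
console.log(stripPort('[::1]:8080', anchored)); // '[::1]'

// Without a port, the unanchored pattern treats ':1' inside the
// IPv6 literal as a port and mangles the address.
console.log(stripPort('[::1]', unanchored)); // '[:]'
console.log(stripPort('[::1]', anchored)); // '[::1]'
```

Under that assumption the two patterns only differ for port-less IPv6 literals, so the end anchor would be a low-risk follow-up.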
Which problem is this PR solving?
Important
Note for reviewers: I noticed that there are quite a few edge cases with this. I tried to address all of them, but please review this in-depth. Try things out if possible.
See #5060 - some characters cause an error in url.parse, which will cause requests to fail. This PR replaces usages of url.parse with the URL constructor.
#5085 did some preparation for this by removing one occurrence; this PR removes the rest of them. Please review this PR in-depth. URL and url.parse have very different behavior, and we rely quite a bit on that old behavior. The URL constructor also throws more often than url.parse did. Therefore we need to add a bit more defensive code and some quite ugly workarounds - please let me know if you know of more elegant solutions to accomplish this.
Fixes #5060
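As a rough illustration of the kind of defensive code meant here, the sketch below wraps the URL constructor so that a failed parse yields undefined instead of an exception. The helper name tryParseUrl and the 'http://localhost' base are hypothetical and not necessarily what this PR does.

```typescript
// Hypothetical helper: return undefined instead of throwing when the
// input cannot be parsed by the WHATWG URL constructor.
function tryParseUrl(input: string, base?: string): URL | undefined {
  try {
    return new URL(input, base);
  } catch {
    // The URL constructor throws a TypeError on invalid input.
    return undefined;
  }
}

// A relative target without a base cannot be parsed by the constructor.
console.log(tryParseUrl('/users?id=42')); // undefined

// With an (assumed) placeholder base, the same target parses fine.
const withBase = tryParseUrl('/users?id=42', 'http://localhost');
console.log(withBase?.pathname); // '/users'
console.log(withBase?.search); // '?id=42'

// Absolute URLs parse without a base.
const absolute = tryParseUrl('http://example.com:8080/a');
console.log(absolute?.hostname); // 'example.com'
console.log(absolute?.port); // '8080'
```

Callers then have to handle the undefined case explicitly, which is where fallbacks like the host header or 'localhost' would come in.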
Type of change
How Has This Been Tested?