-
Notifications
You must be signed in to change notification settings - Fork 4.9k
Overhaul System.Net library logging #13134
Conversation
6ed2c53 to
2e26f29
Compare
| { | ||
| GlobalLog.Print("NegotiationInfoClass::.ctor() packageInfo:" + packageInfo.ToString("x8") + " negotiationState:" + negotiationState.ToString("x8") + " name:" + LoggingHash.ObjectToString(name)); | ||
| } | ||
| if (NetEventSource.IsEnabled) NetEventSource.Info(this, $"packageInfo:{packageInfo} negotiationState:{negotiationState} name:{name}"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You use :X to format negotiationState above, is that needed here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I'll fix that... I missed it when rewriting the line.
| /// <param name="buffer">The buffer to be logged.</param> | ||
| /// <param name="memberName">The calling member.</param> | ||
| [NonEvent] | ||
| public static void DumpArray(object thisOrContextObject, byte[] buffer, [CallerMemberName] string memberName = null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is specifically for dumping a buffer, can we call it "DumpBuffer"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure
| { | ||
| DebugValidateArg(first); | ||
| DebugValidateArg(second); | ||
| if (IsEnabled) Log.Associate(IdOf(first), memberName, IdOf(first), IdOf(second)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this correct? Seems like if this is the desired behavior, callers could use the overload that takes an explicit thisOrContextObject argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's correct, or at least it's the behavior I intended. The most common use case was just calling Associate(this, somethingElse), and I wanted to avoid the caller having to do Associate(this, this, somethingElse), so I added this overload that it does it for them.
| { | ||
| SecurityEventSource.Log.SecurityContextInputBuffers(nameof(InitializeSecurityContext), (inputBuffers == null ? 0 : inputBuffers.Length), outputBuffer.size, (Interop.SECURITY_STATUS)errorCode); | ||
| } | ||
| NetEventSource.Log.SecurityContextInputBuffers(nameof(InitializeSecurityContext), inputBuffers?.Length ?? 0, outputBuffer.size, (Interop.SECURITY_STATUS)errorCode); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is minor, but shouldn't there be an IsEnabled test here? (Similarly above)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's very little work happening in the call site; didn't seem necessary to put it behind a guard. But I can if you'd like.
| } | ||
|
|
||
| if (errorCode != 0 && NetEventSource.Log.IsEnabled()) | ||
| if (errorCode != 0 && NetEventSource.IsEnabled) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The IsEnabled test should come first
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose it doesn't really matter.
I guess I'd like to adhere to a principle that says: Any code that's only needed for tracing should be inside an IsEnabled check, implicitly or explicitly. And here, the errorCode != 0 test is only needed for tracing. I realize that's being rather nitty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've not measured, but I'd expect swapping the order to actually be worse for perf, which is why I left it this way. If the IsEnabled check comes first, even in the success case we have to access the Log field and then access its enabled field. That said, I doubt either order would be at all measurable. I can swap it if it bothers you.
| _maxDataSize = checked(streamSizes.MaximumMessage - (_headerSize + _trailerSize)); | ||
| } | ||
| catch (Exception e) | ||
| catch (Exception e) when (!ExceptionCheck.IsFatal(e)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like it could potentially have a perf impact in release, since we've added a try/catch. Not sure what to do about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since we've added a try/catch
I didn't add a try/catch. Are you saying we should think about removing the try/catch? We could, but I one of my goals for this pass was to not remove any tracing... the only tracing I removed was because it was obviously duplicative (or in a few cases where it was in a TRACE_VERBOSE region, had gotten stale, and referred to things that no longer existed).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, the try/catch was already there.
What I should say is, we should consider removing the try/catch since it's only needed for tracing. That's beyond the scope of this change though.
| SecurityBuffer outSecBuffer, | ||
| ref Interop.SspiCli.ContextFlags outFlags) | ||
| { | ||
| #if TRACE_VERBOSE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any reason to keep TRACE_VERBOSE now, assuming it's all protected by an IsEnabled test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know. I didn't want to remove any of these. As I noted in the PR description, there is likely some tracing being done that's no longer necessary, and there's probably some missing tracing. My goal here wasn't to address that, but rather to get tracing into a working form. Subsequently someone can review all of what's actually being trace holistically.
| #if TRACE_VERBOSE | ||
| GlobalLog.Print(" inSecBuffers[] = length:" + inSecBuffers.Length); | ||
| #endif | ||
| NetEventSource.Enter(null, "SafeDeleteContext::CompleteAuthToken"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be better handled in a single Enter event instead of Enter plus two Infos. (That may be beyond the scope of this change, but can we at least file general issues for obviously bad stuff like this?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That may be beyond the scope of this change
Right. See my previous comments.
| if (s_loggingEnabled) | ||
| SocketsEventSource.Accepted(_acceptSocket, _acceptSocket.RemoteEndPoint, _acceptSocket.LocalEndPoint); | ||
| NetEventSource.Accepted(_acceptSocket, _acceptSocket.RemoteEndPoint, _acceptSocket.LocalEndPoint); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there be an IsEnabled check here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It wouldn't buy much. I tried to only use IsEnabled guards where there was real work happening in the call site.
| { | ||
| if (s_loggingEnabled) | ||
| SocketsEventSource.Connected(_currentSocket, _currentSocket.LocalEndPoint, _currentSocket.RemoteEndPoint); | ||
| NetEventSource.Connected(_currentSocket, _currentSocket.LocalEndPoint, _currentSocket.RemoteEndPoint); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IsEnabled check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto. Is there something about these particular calls that suggests to you IsEnabled would make a measurable difference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, not really. As above, it's more of a "what's the general principle?" sort of issue. I'll try to elaborate better in a subsequent comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the general principle?
For me the general principle was: is there non-trivial work happening at the call site... if yes, it must be guarded, if no, it doesn't need to be.
We could say that every single call site to any tracing method in NetEventSource needs to have an IsEnabled check before it. It'll lead to clunkier code, but it'll be easy to explain. Is that what you'd prefer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could say that every single call site to any tracing method in NetEventSource needs to have an IsEnabled check before it. It'll lead to clunkier code, but it'll be easy to explain. Is that what you'd prefer?
I think I'm leaning that way, yeah.
I'm used to using macros to do this in native code, which is pretty easy to get right, and not very crufty. Unfortunately this isn't possible in managed code.
The problem is the "non-trivial work happening" part. This seems fairly easy to get wrong. The "no value types" rule in particular made me pause a few times when reviewing the code, since you need to think about the types of the arguments you're passing.
I realize that the value types mistake will get caught at your checks, but it seems like it would be better to have the code be correct at a glance rather than having to think about it.
And other potential issues, like "NetEventSource.Info(this, SomeExpensiveFunction())" can't be caught by your checks.
I realize that that makes the code cruftier, and I don't like that, but overall I think I prefer the more verbose code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One additional thought:
There's little potential confusion in trivial cases like Enter(this) or Error(this, e). So perhaps a middle path would be to keep the convenience methods for these cases and not require wrapping them in if(enabled) tests, while requiring anything more involved than this to do an explicit if(enabled) test.
For clarity, though, I think I'd name these convenience methods something other than just "Enter" and "Exit". Like"CheckEnabledAndEnter". That's a terrible name of course, but you get the idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. While I was hoping to avoid all those checks, I get that having them makes it easier to always do the right thing and avoid mistakes. I'll add a commit that adds guards everywhere.
| GlobalLog.Print("UdpClient::Dispose()"); | ||
| } | ||
|
|
||
| NetEventSource.Info(this); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't there be some info text here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There wasn't in the original, so I didn't add any. The original only traced out the name of the caller, which happens implicitly in this revised version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. I probably wouldn't want to do this in new code, but I understand that's beyond the scope here.
| "UnmanagedCertificateContext::GetRemoteCertificatesFromStoreContext " + | ||
| "adding remote certificate:" + cert.Subject + cert.Thumbprint); | ||
| } | ||
| if (NetEventSource.IsEnabled) NetEventSource.Info(certContext, $"Adding remote certificate:{cert}"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IsEnabled is checked in NetEventSource.Info no? There at other places as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but the call site is using string interpolation, and will end up allocating a FormattableString along with an object[] to hold the arguments (cert). The IsEnabled check guards those costs.
|
|
||
| Debug.Fail("Find out why we are getting an unexpected exception."); | ||
| } | ||
| NetEventSource.Error(this, exception); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like this should be a Fail call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is one place where I didn't strictly follow what was there previously, because what was there previously was bogus. This is a valid exception path... in fact, that Ftp test that an issue got opened for around Dns resolution repros for me locally and causes the Debug.Fail that was here to fire. We should not have asserts fire for anything that's legitimately hittable, so I changed it to the Error that should have been.
| { | ||
| MailEventSource.Log.Send(nameof(pickupDirectory), pickupDirectory); | ||
| } | ||
| if (NetEventSource.IsEnabled) NetEventSource.Info(this, $"pickupDirectornameof={pickupDirectory}"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
$"{nameof(pickupDirectory)}={pickupDirectory}" ..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Will fix.
|
Looked at the Mail changes, LGTM ! |
|
@dotnet-bot Test Innerloop Windows_NT Debug Build and Test please (#13025) |
Thanks, @Priya91 😄 |
|
This stuff looks fantastic. Thanks much for tackling this! |
davidsh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thank you for cleaning this up. I like this overall, clean, design.
2e26f29 to
6e4a0a0
Compare
This PR overhauls how logging is accomplished in the System.Net libraries.
There are a few problems with the existing solution:
1. In the original sources, logging was split across a GlobalLog class and a Logging class. Global-related tracing went to GlobalLog, and library-specific tracing went to Logging. When the sources were moved to corefx, this evolved into multiple EventSource-based types, most of which were in Common and were then src-included into each System.Net assembly. This has a significant problem, though, which is that the name/Guid identity of an EventSource can only exist once in a process, and subsequent EventSource instances with the same identity fail. This means that whichever assembly ends up logging something first wins, and all other tracing from other System.Net assemblies ends up being lost.
2. The split also may have made sense when all components shared the same GlobalLog, but now that GlobalLog is included as a unique EventSource instance in each assembly, it's no longer global. This means that we're not only loading multiple EventSources unnecessarily (each has its own costs), but we're also duplicating logging, as some logging is done to one or the other, but some logging is done to both.
3. Due to sharing the Logging class, which evolved into an EventSource-based type, many logging calls had to include which component they came from as an enum value, leading to bloating at the call sites.
4. Additionally, many logging operations included the name of the method in which the calling was being done, which meant both longer call sites as well as stale values as members got renamed over time.
5. Call sites ended up getting fairly long due to lots of string concatenation operations and other manipulations in the logging call. This also then led to most call sites needing to be guarded behind checks for IsEnabled, leading to lots of code at every call site and obscuring the actual information being traced.
6. An additional requirement that got added with the move to corefx was that certain failure conditions should both Debug.Fail and do tracing, leading to some tracing operations taking upwards of 10 lines of code.
All of these issues are addressed by this PR.
- Each System.Net assembly that needs to log now loads a single NetEventSource type for doing logging, each with its own identity.
- That implementation is split across a common, partial NetEventSource.Common.cs file used by all such assemblies, and then a per-assembly partial NetEventSource.AssemblyName.cs file used by the individual assembly. The former file contains most of the logging implementation. The latter files typically contain just the partial class declaration along with the [EventSource] attribute that gives that assembly's NetEventSource its unique identity / name.
- The logging operations in NetEventSource.Common.cs use a variety of relatively-recent language enhancements to improve the logging. Each operation uses a [CallerMemberName] so that the call sites don't need to specify the caller. FormattableString is used so that call sites can use string interpolation while still providing the logging implementation the flexibility to format the individual components in a desired fashion (e.g. IntPtrs are formatted as hex, SafeHandles are rendered in a way that includes their name+hashcode+handle, etc.), along with a mechanism that allows individual libraries to add additional formatting rules. Each operation allows the this reference (or another this-like context object) to be passed in, to simply identify the source of the call (in concert with the caller member name). A debug-only mechanism is included in the tracing to help find potential perf issues due to misuse of the logger. Etc.
- A Fail logging operation is included that both traces and does a Debug.Fail.
With these changes, most call sites are greatly simplified. Some examples:
Before:
```C#
if (GlobalLog.IsEnabled)
{
GlobalLog.AssertFormat("ContextAwareResult#{0}::ContextCopy|Called on completed result.", LoggingHash.HashString(this));
}
Debug.Fail("ContextAwareResult#" + LoggingHash.HashString(this) + "::ContextCopy |Called on completed result.");
```
After:
```C#
NetEventSource.Fail(this, "Called on completed result.");
```
Before:
```C#
if (NetEventSource.Log.IsEnabled()) NetEventSource.Enter(NetEventSource.ComponentType.Http, this, ".ctor", handler);
```
After:
```C#
NetEventSource.Enter(this, handler);
```
Etc.
In addition to fixing up all of the event sources, I of course also fixed up all usage in order to clean up the call sites. However, this was a rather mechanical change: I did not remove existing logging (other than duplication) nor did I add new logging. At some point, we should review the actual logging being done to determine what's missing and what's not needed. I also did not revise the additional existing events on the sockets and security event sources, but we may want to remove those (or some of those) in favor of just using the shared events.
- Fix a few copy-and-paste errors - Rename DumpArray to DumpBuffer - Prefix every NetEventSource usage with IsEnabled check (except for Fail, which represents a critical programming error)
390af1e to
e486c35
Compare
- Add tests that actually exercise the event sources, making sure that no event source errors are produced and that events are actually generated. - Fix a few src issues found from these tests.
e486c35 to
2284f73
Compare
|
@dotnet-bot Test Outerloop Windows_NT Debug please |
There are a few problems with the existing solution:
All of these issues are addressed by this PR.
With these changes, most call sites are greatly simplified. Some examples:
Before:
After:
Before:
After:
Etc.
In addition to fixing up all of the event sources, I of course also fixed up all usage in order to clean up the call sites. However, this was a rather mechanical change: I did not remove existing logging (other than duplication) nor did I add new logging. At some point, we should review the actual logging being done to determine what’s missing and what’s not needed. I also did not revise the additional existing events on the sockets and security event sources, but we may want to remove those (or some of those) in favor of just using the shared events.
Fixes https://github.com/dotnet/corefx/issues/5265
Fixes https://github.com/dotnet/corefx/issues/5755
Fixes https://github.com/dotnet/corefx/issues/12685
Fixes https://github.com/dotnet/corefx/issues/12808
cc: @CIPop, @davidsh, @ericeil, @geoffkizer, @Priya91, @Petermarcu