Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Conversation

@stephentoub
Copy link
Member

There are a few problems with the existing solution:

  1. In the original sources, logging was split across a GlobalLog class and a Logging class. Global-related tracing went to GlobalLog, and library-specific tracing went to Logging. When the sources were moved to corefx, this evolved into multiple EventSource-based types, most of which were in Common and were then src-included into each System.Net assembly. This has a significant problem, though, which is that the name/Guid identity of an EventSource can only exist once in a process, and subsequent EventSource instances with the same identity fail. This means that whichever assembly ends up logging something first wins, and all other tracing from other System.Net assemblies ends up being lost.
  2. The split also may have made sense when all components shared the same GlobalLog, but now that GlobalLog is included as a unique EventSource instance in each assembly, it’s no longer global. This means that we’re not only loading multiple EventSources unnecessarily (each has its own costs), but we’re also duplicating logging, as some logging is done to one or the other, but some logging is done to both.
  3. Due to sharing the Logging class, which evolved into an EventSource-based type, many logging calls had to include which component they came from as an enum value, leading to bloating at the call sites.
  4. Additionally, many logging operations included the name of the method in which the calling was being done, which meant both longer call sites as well as stale values as members got renamed over time.
  5. Call sites ended up getting fairly long due to lots of string concatenation operations and other manipulations in the logging call. This also then led to most call sites needing to be guarded behind checks for IsEnabled, leading to lots of code at every call site and obscuring the actual information being traced.
  6. An additional requirement that got added with the move to corefx was that certain failure conditions should both Debug.Fail and do tracing, leading to some tracing operations taking upwards of 10 lines of code.

All of these issues are addressed by this PR.

  • Each System.Net assembly that needs to log now loads a single NetEventSource type for doing logging, each with its own identity.
  • That implementation is split across a common, partial NetEventSource.Common.cs file used by all such assemblies, and then a per-assembly partial NetEventSource.AssemblyName.cs file used by the individual assembly. The former file contains most of the logging implementation. The latter files typically contain just the partial class declaration along with the [EventSource] attribute that gives that assembly’s NetEventSource its unique identity / name.
  • The logging operations in NetEventSource.Common.cs use a variety of relatively-recent language enhancements to improve the logging. Each operation uses a [CallerMemberName] so that the call sites don’t need to specify the caller. FormattableString is used so that call sites can use string interpolation while still providing the logging implementation the flexibility to format the individual components in a desired fashion (e.g. IntPtrs are formatted as hex, SafeHandles are rendered in a way that includes their name+hashcode+handle, etc.), along with a mechanism that allows individual libraries to add additional formatting rules. Each operation allows the this reference (or another this-like context object) to be passed in, to simply identify the source of the call (in concert with the caller member name). A debug-only mechanism is included in the tracing to help find potential perf issues due to misuse of the logger. Etc.
  • A Fail logging operation is included that both traces and does a Debug.Fail.

With these changes, most call sites are greatly simplified. Some examples:

Before:

if (GlobalLog.IsEnabled)
{
    GlobalLog.AssertFormat("ContextAwareResult#{0}::ContextCopy|Called on completed result.", LoggingHash.HashString(this));
}
Debug.Fail("ContextAwareResult#" + LoggingHash.HashString(this) + "::ContextCopy |Called on completed result.");

After:

NetEventSource.Fail(this, "Called on completed result.");

Before:

if (NetEventSource.Log.IsEnabled()) NetEventSource.Enter(NetEventSource.ComponentType.Http, this, ".ctor", handler);

After:

NetEventSource.Enter(this, handler);

Etc.

In addition to fixing up all of the event sources, I of course also fixed up all usage in order to clean up the call sites. However, this was a rather mechanical change: I did not remove existing logging (other than duplication) nor did I add new logging. At some point, we should review the actual logging being done to determine what’s missing and what’s not needed. I also did not revise the additional existing events on the sockets and security event sources, but we may want to remove those (or some of those) in favor of just using the shared events.

Fixes https://github.com/dotnet/corefx/issues/5265
Fixes https://github.com/dotnet/corefx/issues/5755
Fixes https://github.com/dotnet/corefx/issues/12685
Fixes https://github.com/dotnet/corefx/issues/12808

cc: @CIPop, @davidsh, @ericeil, @geoffkizer, @Priya91, @Petermarcu

@stephentoub stephentoub changed the title This PR overhauls how logging is accomplished in the System.Net libraries Overhaul System.Net library logging Oct 28, 2016
@davidsh davidsh added this to the 1.2.0 milestone Oct 28, 2016
{
GlobalLog.Print("NegotiationInfoClass::.ctor() packageInfo:" + packageInfo.ToString("x8") + " negotiationState:" + negotiationState.ToString("x8") + " name:" + LoggingHash.ObjectToString(name));
}
if (NetEventSource.IsEnabled) NetEventSource.Info(this, $"packageInfo:{packageInfo} negotiationState:{negotiationState} name:{name}");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You use :X to format negotiationState above, is that needed here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'll fix that... I missed it when rewriting the line.

/// <param name="buffer">The buffer to be logged.</param>
/// <param name="memberName">The calling member.</param>
[NonEvent]
public static void DumpArray(object thisOrContextObject, byte[] buffer, [CallerMemberName] string memberName = null)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is specifically for dumping a buffer, can we call it "DumpBuffer"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

{
DebugValidateArg(first);
DebugValidateArg(second);
if (IsEnabled) Log.Associate(IdOf(first), memberName, IdOf(first), IdOf(second));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this correct? Seems like if this is the desired behavior, callers could use the overload that takes an explicit thisOrContextObject argument.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's correct, or at least it's the behavior I intended. The most common use case was just calling Associate(this, somethingElse), and I wanted to avoid the caller having to do Associate(this, this, somethingElse), so I added this overload that it does it for them.

{
SecurityEventSource.Log.SecurityContextInputBuffers(nameof(InitializeSecurityContext), (inputBuffers == null ? 0 : inputBuffers.Length), outputBuffer.size, (Interop.SECURITY_STATUS)errorCode);
}
NetEventSource.Log.SecurityContextInputBuffers(nameof(InitializeSecurityContext), inputBuffers?.Length ?? 0, outputBuffer.size, (Interop.SECURITY_STATUS)errorCode);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is minor, but shouldn't there be an IsEnabled test here? (Similarly above)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's very little work happening in the call site; didn't seem necessary to put it behind a guard. But I can if you'd like.

}

if (errorCode != 0 && NetEventSource.Log.IsEnabled())
if (errorCode != 0 && NetEventSource.IsEnabled)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The IsEnabled test should come first

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?

Copy link

@geoffkizer geoffkizer Oct 28, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose it doesn't really matter.

I guess I'd like to adhere to a principle that says: Any code that's only needed for tracing should be inside an IsEnabled check, implicitly or explicitly. And here, the errorCode != 0 test is only needed for tracing. I realize that's being rather nitty.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've not measured, but I'd expect swapping the order to actually be worse for perf, which is why I left it this way. If the IsEnabled check comes first, even in the success case we have to access the Log field and then access its enabled field. That said, I doubt either order would be at all measurable. I can swap it if it bothers you.

_maxDataSize = checked(streamSizes.MaximumMessage - (_headerSize + _trailerSize));
}
catch (Exception e)
catch (Exception e) when (!ExceptionCheck.IsFatal(e))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like it could potentially have a perf impact in release, since we've added a try/catch. Not sure what to do about it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we've added a try/catch

I didn't add a try/catch. Are you saying we should think about removing the try/catch? We could, but I one of my goals for this pass was to not remove any tracing... the only tracing I removed was because it was obviously duplicative (or in a few cases where it was in a TRACE_VERBOSE region, had gotten stale, and referred to things that no longer existed).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, the try/catch was already there.

What I should say is, we should consider removing the try/catch since it's only needed for tracing. That's beyond the scope of this change though.

SecurityBuffer outSecBuffer,
ref Interop.SspiCli.ContextFlags outFlags)
{
#if TRACE_VERBOSE

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason to keep TRACE_VERBOSE now, assuming it's all protected by an IsEnabled test?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know. I didn't want to remove any of these. As I noted in the PR description, there is likely some tracing being done that's no longer necessary, and there's probably some missing tracing. My goal here wasn't to address that, but rather to get tracing into a working form. Subsequently someone can review all of what's actually being trace holistically.

#if TRACE_VERBOSE
GlobalLog.Print(" inSecBuffers[] = length:" + inSecBuffers.Length);
#endif
NetEventSource.Enter(null, "SafeDeleteContext::CompleteAuthToken");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be better handled in a single Enter event instead of Enter plus two Infos. (That may be beyond the scope of this change, but can we at least file general issues for obviously bad stuff like this?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That may be beyond the scope of this change

Right. See my previous comments.

if (s_loggingEnabled)
SocketsEventSource.Accepted(_acceptSocket, _acceptSocket.RemoteEndPoint, _acceptSocket.LocalEndPoint);
NetEventSource.Accepted(_acceptSocket, _acceptSocket.RemoteEndPoint, _acceptSocket.LocalEndPoint);
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be an IsEnabled check here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wouldn't buy much. I tried to only use IsEnabled guards where there was real work happening in the call site.

{
if (s_loggingEnabled)
SocketsEventSource.Connected(_currentSocket, _currentSocket.LocalEndPoint, _currentSocket.RemoteEndPoint);
NetEventSource.Connected(_currentSocket, _currentSocket.LocalEndPoint, _currentSocket.RemoteEndPoint);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IsEnabled check?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto. Is there something about these particular calls that suggests to you IsEnabled would make a measurable difference?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, not really. As above, it's more of a "what's the general principle?" sort of issue. I'll try to elaborate better in a subsequent comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the general principle?

For me the general principle was: is there non-trivial work happening at the call site... if yes, it must be guarded, if no, it doesn't need to be.

We could say that every single call site to any tracing method in NetEventSource needs to have an IsEnabled check before it. It'll lead to clunkier code, but it'll be easy to explain. Is that what you'd prefer?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could say that every single call site to any tracing method in NetEventSource needs to have an IsEnabled check before it. It'll lead to clunkier code, but it'll be easy to explain. Is that what you'd prefer?

I think I'm leaning that way, yeah.

I'm used to using macros to do this in native code, which is pretty easy to get right, and not very crufty. Unfortunately this isn't possible in managed code.

The problem is the "non-trivial work happening" part. This seems fairly easy to get wrong. The "no value types" rule in particular made me pause a few times when reviewing the code, since you need to think about the types of the arguments you're passing.

I realize that the value types mistake will get caught at your checks, but it seems like it would be better to have the code be correct at a glance rather than having to think about it.

And other potential issues, like "NetEventSource.Info(this, SomeExpensiveFunction())" can't be caught by your checks.

I realize that that makes the code cruftier, and I don't like that, but overall I think I prefer the more verbose code.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One additional thought:

There's little potential confusion in trivial cases like Enter(this) or Error(this, e). So perhaps a middle path would be to keep the convenience methods for these cases and not require wrapping them in if(enabled) tests, while requiring anything more involved than this to do an explicit if(enabled) test.

For clarity, though, I think I'd name these convenience methods something other than just "Enter" and "Exit". Like"CheckEnabledAndEnter". That's a terrible name of course, but you get the idea.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. While I was hoping to avoid all those checks, I get that having them makes it easier to always do the right thing and avoid mistakes. I'll add a commit that adds guards everywhere.

GlobalLog.Print("UdpClient::Dispose()");
}

NetEventSource.Info(this);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't there be some info text here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There wasn't in the original, so I didn't add any. The original only traced out the name of the caller, which happens implicitly in this revised version.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I probably wouldn't want to do this in new code, but I understand that's beyond the scope here.

"UnmanagedCertificateContext::GetRemoteCertificatesFromStoreContext " +
"adding remote certificate:" + cert.Subject + cert.Thumbprint);
}
if (NetEventSource.IsEnabled) NetEventSource.Info(certContext, $"Adding remote certificate:{cert}");
Copy link
Contributor

@Priya91 Priya91 Oct 28, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IsEnabled is checked in NetEventSource.Info no? There at other places as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but the call site is using string interpolation, and will end up allocating a FormattableString along with an object[] to hold the arguments (cert). The IsEnabled check guards those costs.


Debug.Fail("Find out why we are getting an unexpected exception.");
}
NetEventSource.Error(this, exception);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like this should be a Fail call.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is one place where I didn't strictly follow what was there previously, because what was there previously was bogus. This is a valid exception path... in fact, that Ftp test that an issue got opened for around Dns resolution repros for me locally and causes the Debug.Fail that was here to fire. We should not have asserts fire for anything that's legitimately hittable, so I changed it to the Error that should have been.

{
MailEventSource.Log.Send(nameof(pickupDirectory), pickupDirectory);
}
if (NetEventSource.IsEnabled) NetEventSource.Info(this, $"pickupDirectornameof={pickupDirectory}");
Copy link
Contributor

@Priya91 Priya91 Oct 28, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$"{nameof(pickupDirectory)}={pickupDirectory}" ..

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Will fix.

@Priya91
Copy link
Contributor

Priya91 commented Oct 28, 2016

Looked at the Mail changes, LGTM !

@stephentoub
Copy link
Member Author

@dotnet-bot Test Innerloop Windows_NT Debug Build and Test please (#13025)

@stephentoub
Copy link
Member Author

Looked at the Mail changes, LGTM !

Thanks, @Priya91 😄

@geoffkizer
Copy link

This stuff looks fantastic. Thanks much for tackling this!

Copy link
Contributor

@davidsh davidsh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Thank you for cleaning this up. I like this overall, clean, design.

This PR overhauls how logging is accomplished in the System.Net libraries.

There are a few problems with the existing solution:
1. In the original sources, logging was split across a GlobalLog class and a Logging class.  Global-related tracing went to GlobalLog, and library-specific tracing went to Logging.  When the sources were moved to corefx, this evolved into multiple EventSource-based types, most of which were in Common and were then src-included into each System.Net assembly.  This has a significant problem, though, which is that the name/Guid identity of an EventSource can only exist once in a process, and subsequent EventSource instances with the same identity fail.  This means that whichever assembly ends up logging something first wins, and all other tracing from other System.Net assemblies ends up being lost.
2. The split also may have made sense when all components shared the same GlobalLog, but now that GlobalLog is included as a unique EventSource instance in each assembly, it's no longer global.  This means that we're not only loading multiple EventSources unnecessarily (each has its own costs), but we're also duplicating logging, as some logging is done to one or the other, but some logging is done to both.
3. Due to sharing the Logging class, which evolved into an EventSource-based type, many logging calls had to include which component they came from as an enum value, leading to bloating at the call sites.
4. Additionally, many logging operations included the name of the method in which the calling was being done, which meant both longer call sites as well as stale values as members got renamed over time.
5. Call sites ended up getting fairly long due to lots of string concatenation operations and other manipulations in the logging call.  This also then led to most call sites needing to be guarded behind checks for IsEnabled, leading to lots of code at every call site and obscuring the actual information being traced.
6. An additional requirement that got added with the move to corefx was that certain failure conditions should both Debug.Fail and do tracing, leading to some tracing operations taking upwards of 10 lines of code.

All of these issues are addressed by this PR.

- Each System.Net assembly that needs to log now loads a single NetEventSource type for doing logging, each with its own identity.
- That implementation is split across a common, partial NetEventSource.Common.cs file used by all such assemblies, and then a per-assembly partial NetEventSource.AssemblyName.cs file used by the individual assembly.  The former file contains most of the logging implementation.  The latter files typically contain just the partial class declaration along with the [EventSource] attribute that gives that assembly's NetEventSource its unique identity / name.
- The logging operations in NetEventSource.Common.cs use a variety of relatively-recent language enhancements to improve the logging.  Each operation uses a [CallerMemberName] so that the call sites don't need to specify the caller.  FormattableString is used so that call sites can use string interpolation while still providing the logging implementation the flexibility to format the individual components in a desired fashion (e.g. IntPtrs are formatted as hex, SafeHandles are rendered in a way that includes their name+hashcode+handle, etc.), along with a mechanism that allows individual libraries to add additional formatting rules.  Each operation allows the this reference (or another this-like context object) to be passed in, to simply identify the source of the call (in concert with the caller member name).  A debug-only mechanism is included in the tracing to help find potential perf issues due to misuse of the logger.  Etc.
- A Fail logging operation is included that both traces and does a Debug.Fail.

With these changes, most call sites are greatly simplified.  Some examples:
Before:
```C#
if (GlobalLog.IsEnabled)
{
    GlobalLog.AssertFormat("ContextAwareResult#{0}::ContextCopy|Called on completed result.", LoggingHash.HashString(this));
}
Debug.Fail("ContextAwareResult#" + LoggingHash.HashString(this) + "::ContextCopy |Called on completed result.");
```
After:
```C#
NetEventSource.Fail(this, "Called on completed result.");
```

Before:
```C#
if (NetEventSource.Log.IsEnabled()) NetEventSource.Enter(NetEventSource.ComponentType.Http, this, ".ctor", handler);
```
After:
```C#
NetEventSource.Enter(this, handler);
```

Etc.

In addition to fixing up all of the event sources, I of course also fixed up all usage in order to clean up the call sites.  However, this was a rather mechanical change: I did not remove existing logging (other than duplication) nor did I add new logging.  At some point, we should review the actual logging being done to determine what's missing and what's not needed.  I also did not revise the additional existing events on the sockets and security event sources, but we may want to remove those (or some of those) in favor of just using the shared events.
- Fix a few copy-and-paste errors
- Rename DumpArray to DumpBuffer
- Prefix every NetEventSource usage with IsEnabled check (except for Fail, which represents a critical programming error)
@stephentoub stephentoub force-pushed the overhaul_net_logging branch 2 times, most recently from 390af1e to e486c35 Compare October 29, 2016 15:28
- Add tests that actually exercise the event sources, making sure that no event source errors are produced and that events are actually generated.
- Fix a few src issues found from these tests.
@stephentoub
Copy link
Member Author

@dotnet-bot Test Outerloop Windows_NT Debug please
@dotnet-bot Test Outerloop Ubuntu14.04 Debug please

@stephentoub stephentoub merged commit 6b5ef12 into dotnet:master Oct 29, 2016
@stephentoub stephentoub deleted the overhaul_net_logging branch October 29, 2016 18:48
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

5 participants