-
Notifications
You must be signed in to change notification settings - Fork 4.9k
Exclude extra allocations in OrderedDictionary #32001
Conversation
src/System.Collections.Specialized/src/System/Collections/Specialized/OrderedDictionary.cs
Show resolved
Hide resolved
src/System.Collections.Specialized/src/System/Collections/Specialized/OrderedDictionary.cs
Show resolved
Hide resolved
src/System.Collections.Specialized/src/System/Collections/Specialized/OrderedDictionary.cs
Show resolved
Hide resolved
| } | ||
| objectsTable.Clear(); | ||
| objectsArray.Clear(); | ||
| EnsureObjectsTable(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If Count == 0 then you can return without initializing anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Internally Clear checks Count/Size on first step so I only replaced EnsureObjectsTable/EnsureObjectsArray calls with null checks.
| object key = ((DictionaryEntry)objectsArray[index]).Key; | ||
| objectsArray.RemoveAt(index); | ||
| objectsTable.Remove(key); | ||
| EnsureObjectsTable(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If these were not initialized, then the ArgumentOutOfRangeException would have been thrown, so you do not need to initialize?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks as a breaking change.
Docs
https://docs.microsoft.com/en-us/dotnet/api/system.collections.specialized.ordereddictionary.removeat?view=netframework-4.7.2#System_Collections_Specialized_OrderedDictionary_RemoveAt_System_Int32_
https://docs.microsoft.com/en-us/dotnet/api/system.collections.hashtable.remove?view=netcore-2.1
don't describe the exception. And I don't see the exception in code
https://source.dot.net/#System.Runtime.Extensions/System/Collections/ArrayList.cs,ff69c0194c6935a4
https://source.dot.net/#System.Private.CoreLib/shared/System/Collections/Hashtable.cs,1031
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it already had this behavior previously but the docs hasn't been updated.
What @danmosemsft meant here is that we don't need to call EnsureObjects apis since you're calling Count and count returns 0 when _objectArrays == null https://github.com/dotnet/corefx/pull/32001/files/551ba9e70c3e5ace47c8ccb36a0815cfc074fb3a#diff-56df9454f3e73140498b3984d3fd8ab5R84
So that means that if these were not initialized, Count will be 0, therefore index >= Count if index > 0 so it will hit: https://github.com/dotnet/corefx/pull/32001/files/551ba9e70c3e5ace47c8ccb36a0815cfc074fb3a#diff-56df9454f3e73140498b3984d3fd8ab5R361
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for clarify. I see the same in Insert() method too. Fix both?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I think it makes sense to remove it there as well.
src/System.Collections.Specialized/src/System/Collections/Specialized/OrderedDictionary.cs
Show resolved
Hide resolved
src/System.Collections.Specialized/src/System/Collections/Specialized/OrderedDictionary.cs
Show resolved
Hide resolved
| { | ||
| return null; | ||
| } | ||
| EnsureObjectsTable(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point you know _objectsTable is not null.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
src/System.Collections.Specialized/src/System/Collections/Specialized/OrderedDictionary.cs
Show resolved
Hide resolved
| public void CopyTo(Array array, int index) | ||
| { | ||
| objectsTable.CopyTo(array, index); | ||
| EnsureObjectsTable(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could just return if _objectsTable is null. No big deal.
|
Thanks @iSazonov ! |
safern
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
|
I think #32001 (comment) was pending to be resolved, but no big deal it is just an extra check. |
|
Thanks for useful comments! |
|
Oh my bad @safern |
|
I can open new PR to continue. |
If you feel like it, you're welcome to do so 😄 |
| if (_objectsArray == null) | ||
| { | ||
| throw new ArgumentOutOfRangeException(nameof(index)); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this beneficial? Seems like it should just be:
EnsureObjectsArray();
return ((DictionaryEntry)_objectsArray[index]).Value;And let the ArrayList's indexer throw the appropriate exception. There doesn't seem to be much value in trying to avoid an ArrayList allocation when the program is busted due to out-of-range accesses, anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This problem is described in #31972. In PowerShell Core repository, we found that we make tons of unnecessary allocations. Perhaps there are other applications that are prone to this problem. This fix does not require resources. So fix all the properties of the class makes sense.
I was very surprised when PerfView showed that the "Get:" operation produces memory allocations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the general concern. I was asking about this case specifically. Going out of our way, including trying to mimic error handling of ArrayList, in order to prevent an allocation when we're about to throw an exception (which is way more expensive) seems to me like overkill.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I remember correctly, it was just the indexing that caused the problem in PowerShell Core.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That doesn't make sense to me. For both the getter and the setter, either the operation is successful and requires that the objects be allocated (as they were previous to this change and still are after this change), or the operation fails, allocates an exception, and throws that exception. I expect the extra ArrayList allocation in such a failure case is a drop in the bucket when compared to the exception allocation and throw, and even if it weren't, the right fix for PowerShell would be to avoid incorrect usage of the collection in the first place, so that it could avoid the exception.
So, from my perspective, adding this code is trying to microoptimize an already super expensive failure path, and in exchange is increasing the size of the binary and duplicating code from ArrayList that now needs to be maintained.
I'm not objecting to the overall PR, just to these specific pieces of it.
cc: @danmosemsft
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This argument doesn't make sense to me.
My main objection is that it's very surprising to get an allocation in a read operation
Even after your change, there is still an allocation on this[int] if the object isn't yet initialized, just as there was before.
I believe this is undesirable behavior.
For success paths, if it can be avoided, I agree. That's why your change to Count and this[object] were good. But that's not relevant to this[int].
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I understand that I don't understand something :-)
Even after your change, there is still an allocation on this[int] if the object isn't yet initialized, just as there was before.
Before we allocate ArrayList and Exception. After we allocate only Exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before we allocate ArrayList and Exception. After we allocate only Exception.
That's true, but it also throws the exception. The cost of allocating and throwing the exception is orders of magnitude more expensive than allocating the ArrayList. Not only is the ArrayList 40 bytes and ArgumentOutOfRangeException 232 bytes, but try { throw new ArgumentOutOfRangeException("index"); } catch {} is, on my machine, ~3,833x more expensive than the ArrayList allocation in terms of throughput. The ArrayList allocation is completely inconsequential. Not only that, but this kind of optimization is really only important when it's on a hot path, and if you're throwing exceptions on such a hot path, you have way, way, way bigger problems; the extra ArrayList allocation doesn't matter in such a case. Now, any improvement, regardless of size, can be worthwhile if there's no negatives associated with it. But that's also rarely the case. Here, more code is needed, and thus needs to be maintained, and increases binary size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not pay attention to that this indexer Item[int] is so special. (I mistakenly relied Item[object]) Thank you, it's useful for my education.
Could you please clarify how do you measure the size of objects?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please clarify how do you measure the size of objects?
A variety of ways:
-
You can look at the object layout in a debugger like windbg + SOS.
-
You can use a tool like sharplab that does that for you, e.g. https://sharplab.io/#v2:C4LgTgrgdgPgAgBgARwIwG4CwAoOAmBHOAZhVQDYU8yB2JAbxyWZVLUrgBYkBZAQwCWUABQBKBkxZSAklADOABwCmAY2AA6ABJK+C4VCUB3JAEEwAcwgBbJVGAB5CA4BmAJT5RzSgKIAPFUoKwAIA9iIARLIAJkq+4aKiWNhSAL44KUA
-
You can use a tool like Benchmark.NET and its memory diagnoser, e.g. https://adamsitnik.com/the-new-Memory-Diagnoser/
-
You can write code that uses https://apisof.net/catalog/System.GC.GetAllocatedBytesForCurrentThread(), checking the value, allocating the object (ideally a bunch of times and then later dividing), and checking the value again.
-
You can do a back-of-the-hand calculation based on looking at the object's fields.
Etc.
…onary Exclude extra allocations in OrderedDictionary Commit migrated from dotnet/corefx@92cc266
Address #31972.
Add checks in properties to exclude allocations internal HashTable and ArrayList if they are not needed.