Conversation

@EgorBo
Member

@EgorBo EgorBo commented Oct 27, 2022

GetId(false);

static int GetId(bool condition)
{
    if (condition)
        return GetCurrentThreadId();
    return 0;
}

[DllImport("kernel32.dll")]
static extern int GetCurrentThreadId();

Current Tier0 codegen for GetId:

; Assembly listing for method Program:GetId(bool):int
; Tier-0 compilation
; MinOpts code
G_M000_IG01:                ;; offset=0000H
       55                   push     rbp
       4157                 push     r15
       4156                 push     r14
       4155                 push     r13
       4154                 push     r12
       57                   push     rdi
       56                   push     rsi
       53                   push     rbx
       4883EC78             sub      rsp, 120
       488DAC24B0000000     lea      rbp, [rsp+B0H]
       894D10               mov      dword ptr [rbp+10H], ecx
G_M000_IG02:                ;; offset=001BH
       488D8D78FFFFFF       lea      rcx, [rbp-88H]
       498BD2               mov      rdx, r10
       E81628A55F           call     CORINFO_HELP_INIT_PINVOKE_FRAME
       488945B8             mov      qword ptr [rbp-48H], rax
       488BC4               mov      rax, rsp
       48894598             mov      qword ptr [rbp-68H], rax
       488BC5               mov      rax, rbp
       488945A8             mov      qword ptr [rbp-58H], rax
       8B4510               mov      eax, dword ptr [rbp+10H]
       0FB6C0               movzx    rax, al
       85C0                 test     eax, eax
       745E                 je       SHORT G_M000_IG06
       48B8F0ECDC5BF87F0000 mov      rax, 0x7FF85BDCECF0
       48894588             mov      qword ptr [rbp-78H], rax
       488D0521000000       lea      rax, G_M000_IG04
       488945A0             mov      qword ptr [rbp-60H], rax
       488B45B8             mov      rax, qword ptr [rbp-48H]
       488D9578FFFFFF       lea      rdx, bword ptr [rbp-88H]
       48895010             mov      qword ptr [rax+10H], rdx
       488B45B8             mov      rax, qword ptr [rbp-48H]
       C6400C00             mov      byte  ptr [rax+0CH], 0
G_M000_IG03:                ;; offset=0076H
       FF15E4E50B00         call     [Program:GetCurrentThreadId():int]
G_M000_IG04:                ;; offset=007CH
       488B55B8             mov      rdx, qword ptr [rbp-48H]
       C6420C01             mov      byte  ptr [rdx+0CH], 1
       833D4903E35F00       cmp      dword ptr [(reloc 0x7ff8bbb40ab4)], 0
       7406                 je       SHORT G_M000_IG05
       FF15052CE25F         call     [CORINFO_HELP_STOP_FOR_GC]
G_M000_IG05:                ;; offset=0093H
       488B55B8             mov      rdx, qword ptr [rbp-48H]
       488B4D80             mov      rcx, bword ptr [rbp-80H]
       48894A10             mov      qword ptr [rdx+10H], rcx
       8945C4               mov      dword ptr [rbp-3CH], eax
       EB05                 jmp      SHORT G_M000_IG07
G_M000_IG06:                ;; offset=00A4H
       33C0                 xor      eax, eax
       8945C4               mov      dword ptr [rbp-3CH], eax
G_M000_IG07:                ;; offset=00A9H
       8B45C4               mov      eax, dword ptr [rbp-3CH]
G_M000_IG08:                ;; offset=00ACH
       4883C478             add      rsp, 120
       5B                   pop      rbx
       5E                   pop      rsi
       5F                   pop      rdi
       415C                 pop      r12
       415D                 pop      r13
       415E                 pop      r14
       415F                 pop      r15
       5D                   pop      rbp
       C3                   ret      
; Total bytes of code 189

New Tier0 codegen for GetId:

; Assembly listing for method Program:GetId(bool):int
; Tier-0 compilation
; MinOpts code
G_M31913_IG01:              ;; offset=0000H
       55                   push     rbp
       4883EC20             sub      rsp, 32
       488D6C2420           lea      rbp, [rsp+20H]
       894D10               mov      dword ptr [rbp+10H], ecx
G_M31913_IG02:              ;; offset=000DH
       8B4510               mov      eax, dword ptr [rbp+10H]
       0FB6C0               movzx    rax, al
       85C0                 test     eax, eax
       740C                 je       SHORT G_M31913_IG04
       E8D47BE7FF           call     Program:GetCurrentThreadId():int
       90                   nop      
G_M31913_IG03:              ;; offset=001DH
       4883C420             add      rsp, 32
       5D                   pop      rbp
       C3                   ret      
G_M31913_IG04:              ;; offset=0023H
       33C0                 xor      eax, eax
G_M31913_IG05:              ;; offset=0025H
       4883C420             add      rsp, 32
       5D                   pop      rbp
       C3                   ret      
; Total bytes of code 43

That is -146 bytes of codegen (189 down to 43) for an unused code path. If the path turns out to be used, an IL_STUB_PInvoke will be compiled on demand.

Pros:

  • We spend less time in Tier0 if the P/Invoke is not used
  • We don't pay the common prologue overhead if the code path with the P/Invoke is not used
  • IL_STUB_PInvoke may already have been compiled by someone else, so we won't have to JIT it again if it's used

Cons:

  • If the code path is used, we pay FullOpts prices for compiling the stub (if it's not compiled yet).

Overall, I think it's a reasonable tradeoff.
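As a purely conceptual illustration of the "compiled on demand" and "already compiled by someone else" points above, here is a minimal sketch. It is an analogy in ordinary C#, not how the runtime implements IL stubs: `StubCache`, `GetOrCompile`, and `CompileCount` are hypothetical names, and the "native call" is replaced by `Environment.CurrentManagedThreadId` so the sample runs on any OS.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Conceptual model only: this is NOT the runtime's IL stub machinery.
// StubCache, GetOrCompile and CompileCount are hypothetical names.
public static class StubCache
{
    private static readonly ConcurrentDictionary<string, Lazy<Func<int>>> s_stubs =
        new ConcurrentDictionary<string, Lazy<Func<int>>>();
    private static int s_compileCount;

    // Number of times the simulated (expensive) stub compilation actually ran.
    public static int CompileCount => s_compileCount;

    // Returns the stub for a signature, "compiling" it only on first use.
    public static Func<int> GetOrCompile(string signature, Func<int> target)
    {
        return s_stubs.GetOrAdd(signature, _ => new Lazy<Func<int>>(() =>
        {
            // Stands in for the one-time FullOpts compilation cost.
            Interlocked.Increment(ref s_compileCount);
            return target;
        })).Value;
    }
}

public static class Demo
{
    // Mirrors GetId from the PR description: the cold path never touches the cache.
    public static int GetId(bool condition)
    {
        if (condition)
            return StubCache.GetOrCompile(
                "GetCurrentThreadId():int",
                () => Environment.CurrentManagedThreadId)();
        return 0;
    }

    public static void Main()
    {
        Console.WriteLine(GetId(false));            // cold path: nothing is compiled
        Console.WriteLine(StubCache.CompileCount);
        GetId(true);                                // first use pays the one-time cost
        GetId(true);                                // later uses reuse the cached stub
        Console.WriteLine(StubCache.CompileCount);
    }
}
```

In this model, a method that never takes the P/Invoke branch pays nothing, the first caller that does take it pays the full compilation cost, and every later caller (from any method) shares the cached result. That corresponds to the pros and the single con listed above.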

PTAL @dotnet/jit-contrib

@ghost ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Oct 27, 2022
@ghost ghost assigned EgorBo Oct 27, 2022
@ghost

ghost commented Oct 27, 2022

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@BruceForstall
Contributor

cc @AaronRobinsonMSFT

@AaronRobinsonMSFT
Member

@EgorBo This could have an impact on startup paths for users with many single-use, blittable signatures. I assume that when we hit tier 1 the inlining will occur, which is good. However, blittable P/Invokes have been inlined in Release builds for a long time, so changing that is something to be leery of.

Can you check if there is a behavioral change when the export of the library isn't found?

@jkotas
Member

jkotas commented Oct 27, 2022

Do you have some numbers for the cost of the pros and cons in some real apps?

@davidwrighton
Member

If we're going to measure things, I'd like to see 3 measurements:

  1. No pinvoke expansion
  2. What we do today
  3. Tier-0 expands pinvokes, but uses the helper methods instead of the inline expansion of everything. (This technique does not work for IL stubs, but works for normal pinvoke expansion.) This may be a really nice middle ground here.

I'm concerned about this change, and in particular about the startup time impact on things like WinForms, because:

  1. As I understand it, IL stubs do not currently participate in tiering, so they are always compiled as tier 1.
  2. WinForms and other partners have spent extensive amounts of time making their P/Invoke APIs blittable in order to improve startup performance. This change risks regressing all of that work.
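For readers less familiar with the term, the following sketch contrasts a blittable signature with one that needs marshalling. It only illustrates the distinction being discussed. The `NativeMethods` class name is an assumption, and whether the JIT actually inlines a given P/Invoke can depend on more than the parameter types shown here.

```csharp
using System;
using System.Runtime.InteropServices;

public static class NativeMethods
{
    // Blittable: an int return value and no parameters have the same
    // representation in managed and native code, so no marshalling is needed.
    [DllImport("kernel32.dll")]
    public static extern int GetCurrentThreadId();

    // Not blittable: the string parameters must be marshalled to native
    // strings, so the call goes through a marshalling stub.
    [DllImport("user32.dll", CharSet = CharSet.Unicode)]
    public static extern int MessageBoxW(IntPtr hWnd, string text, string caption, uint type);

    // A blittable variant of the same export: the caller is responsible for
    // passing pointers to null-terminated UTF-16 data (for example via
    // Marshal.StringToHGlobalUni) and for freeing them afterwards.
    [DllImport("user32.dll", EntryPoint = "MessageBoxW")]
    public static extern int MessageBoxWBlittable(IntPtr hWnd, IntPtr text, IntPtr caption, uint type);
}

public static class BlittableDemo
{
    public static void Main()
    {
        // Only call into kernel32 where it exists.
        if (RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
            Console.WriteLine(NativeMethods.GetCurrentThreadId());
        else
            Console.WriteLine("kernel32.dll is only available on Windows.");
    }
}
```

Moving from the second declaration to the third is the kind of rewrite referred to above: the marshalling work moves out of the stub and into the caller.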

@EgorBo
Member Author

EgorBo commented Oct 27, 2022

> Can you check if there is a behavioral change when the export of the library isn't found?

It looks like we don't throw DllNotFoundException from the JIT when we inline a P/Invoke, do we? I've just tested this sample, and it runs without exceptions (with and without the PR):

using System.Runtime.InteropServices;

public class Program
{
    public static void Main(string[] args)
    {
        if (args.Length == 42)
            Test();
    }

    [DllImport("non-existing-lib.dll")]
    static extern int Test();
}
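A complementary sketch, under the assumption that the library is resolved only when the P/Invoke is actually invoked: the variant below calls the import and catches the resulting exception. `MissingLibraryDemo` and `CallThrowsDllNotFound` are hypothetical names introduced for illustration. This was not part of the test reported above.

```csharp
using System;
using System.Runtime.InteropServices;

public static class MissingLibraryDemo
{
    // Same shape as the sample above: the library does not exist.
    [DllImport("non-existing-lib.dll")]
    private static extern int Test();

    // Returns true if invoking the P/Invoke throws DllNotFoundException.
    public static bool CallThrowsDllNotFound()
    {
        try
        {
            Test();
            return false;
        }
        catch (DllNotFoundException)
        {
            return true;
        }
    }

    public static void Main(string[] args)
    {
        // Getting this far means that compiling Main did not need the library.
        Console.WriteLine("Reached Main without an exception.");
        Console.WriteLine("Calling Test() throws DllNotFoundException: " + CallThrowsDllNotFound());
    }
}
```

If the exception surfaces only at the call, as this sketch expects, then a method that merely contains a P/Invoke call site on a path that is not taken should behave the same with or without the PR, which matches the result reported above.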

> I'm concerned about this change

> Do you have some numbers for the cost of the pros and cons in some real apps?

OK, I'm closing this for now, and I'll reopen it if I have data. It's something I found in #77465 and decided to push separately to start a discussion. I'll work around it in #77465 to keep the current behavior.

@EgorBo EgorBo closed this Oct 27, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Nov 27, 2022