[TRT RTX EP] Fixing the stream parameter in CopyTensors API and passing cudastream by value in createNotification API #25937
@microsoft-github-policy-service agree company="Nvidia"
/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline
Azure Pipelines successfully started running 5 pipeline(s).
… by value in createNotification API
Fix the stream parameter in the CopyTensors API so that it passes the application-provided stream instead of nullptr. Pass the CUDA stream by value in the createNotification API, because passing a pointer was leading to dangling reference issues.
- Without this change, CopyTensors always copies synchronously, even when the application specifies a different stream for it.
- Passing a pointer to the CUDA stream in the EP API leads to dangling reference issues, so the stream is now passed by value.
CreateNotification API in CUDA EP as well. Passing by pointer seems to lead to dangling reference errors.
2008daa to f6380cc
…ng cudastream by value in createNotification API (#25937)
Fixing the stream parameter in CopyTensors API and passing cudastream by value in createNotification API
### Description
Fix the stream parameter in the CopyTensors API so that it passes the application-provided stream instead of nullptr. Pass the CUDA stream by value in the createNotification API, because passing a pointer was leading to dangling reference issues.
Can you please make sure that this goes into 1.23? @chilo-ms
### Motivation and Context
- Without this change, CopyTensors always copies synchronously, even when the application specifies a different stream for it.
- Passing a pointer to the CUDA stream in the EP API leads to dangling reference issues, so the stream is now passed by value.
### Description
Cherry-pick the following PRs: #25943 #25937 #25917 #25909 #25898 #25897 #25888 #25881 #25830 #25619 #25575 #25572 #25558 #25530 #25474 #25455 #25110
Also two dependent PRs for qMoE cpu: #25877 #25822
---------
Co-authored-by: xiaomsft
Co-authored-by: Xiaoyan Hu
Co-authored-by: Akshay Sonawane
Co-authored-by: Kunal Vaishnavi
Co-authored-by: Pradeep Sakhamoori
Co-authored-by: mingyue
Co-authored-by: Maximilian Müller
Co-authored-by: Adrian Lizarraga
Co-authored-by: Dmitri Smirnov
Co-authored-by: Emmanuel
Co-authored-by: Emmanuel Assumang
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: praneshgo
Co-authored-by: Hariharan Seshadri
Co-authored-by: Jing Fang
Co-authored-by: Ishwar Raut
Description
Fix the stream parameter in the CopyTensors API so that it passes the application-provided stream instead of nullptr. Pass the CUDA stream by value in the createNotification API, because passing a pointer was leading to dangling reference issues.
Can you please make sure that this goes into 1.23? @chilo-ms
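For readers without the diff at hand, the two changes described above can be sketched roughly as follows. This is an illustration only and not the code from this PR: `FakeStream`, `CopyBytes`, `CopyTensorsIgnoringStream`, `CopyTensorsForwardingStream`, `Notification` and `CreateNotification` are hypothetical stand-ins, and no CUDA or ONNX Runtime headers are used. The sketch assumes the stream is an opaque, pointer-sized handle (as `cudaStream_t` is), which is why copying it by value is cheap and avoids holding the address of a caller's local variable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for an opaque stream handle such as cudaStream_t.
struct FakeStreamImpl { int copies_enqueued = 0; };
using FakeStream = FakeStreamImpl*;

// Low-level copy. Returns true if the copy was recorded on the caller's
// stream, false if it fell back to the default (nullptr) path.
bool CopyBytes(void* dst, const void* src, std::size_t n, FakeStream stream) {
  std::memcpy(dst, src, n);
  if (stream == nullptr) return false;  // default path, caller's stream unused
  ++stream->copies_enqueued;
  return true;
}

// Behaviour described as the bug: the stream argument is accepted but
// nullptr is passed down, so the caller's stream is never used.
bool CopyTensorsIgnoringStream(std::vector<float>& dst,
                               const std::vector<float>& src,
                               FakeStream /*stream*/) {
  dst.resize(src.size());
  return CopyBytes(dst.data(), src.data(), src.size() * sizeof(float), nullptr);
}

// Behaviour described as the fix: forward the caller's stream.
bool CopyTensorsForwardingStream(std::vector<float>& dst,
                                 const std::vector<float>& src,
                                 FakeStream stream) {
  dst.resize(src.size());
  return CopyBytes(dst.data(), src.data(), src.size() * sizeof(float), stream);
}

// The notification stores a copy of the handle itself. Storing a
// `const FakeStream*` instead would keep the address of the caller's handle
// variable, which dangles once that variable goes out of scope.
struct Notification {
  explicit Notification(FakeStream s) : stream(s) {}
  FakeStream stream;
};

Notification CreateNotification(FakeStreamImpl& impl) {
  FakeStream local_handle = &impl;   // handle variable lives only in this frame
  return Notification(local_handle); // safe: the handle value is copied
}
```

With these stand-ins, `CopyTensorsIgnoringStream` leaves `copies_enqueued` at 0 regardless of the stream passed in, while `CopyTensorsForwardingStream` increments it, and the `Notification` returned by `CreateNotification` still refers to a valid stream after the local handle variable is gone.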
Motivation and Context