Skip to content

Conversation

@skottmckay
Copy link
Contributor

@skottmckay skottmckay commented Jul 9, 2025

Description

Add ability to get shared allocator from env.
Add ability to create a MemCpyFunc using the IDataTransfer from the environment.

Motivation and Context

Add ability to create a MemCpyFunc using the IDataTransfer from the environment.

As only the CUDA EP has a shared allocator and data transfer implemented OrtValueFromShapeAndType was only updated for CUDA devices. The fallback in other places is more generic.
@skottmckay
Copy link
Contributor Author

@gedoensmax will this provide what you need? not sure if there are any other places needing updates.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can commit the suggested changes from lintrunner.

@skottmckay skottmckay marked this pull request as ready for review July 21, 2025 09:12
@skottmckay skottmckay requested a review from Copilot July 21, 2025 09:13
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds the ability for Python bindings to use shared allocators and IDataTransfer instances registered by plugin execution providers in the Environment. The main purpose is to enable data transfer operations between CPU and plugin-defined device memory for OrtValue objects.

Key changes include:

  • Added shared allocator lookup from the Environment for plugin EPs
  • Added ability to create MemCpyFunc using IDataTransfer from the Environment for device-to-CPU transfers
  • Enhanced CUDA EP support to use shared allocators from plugin EPs in non-CUDA builds

Reviewed Changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
onnxruntime/test/python/onnxruntime_test_python_autoep.py Added test for shared data transfer and allocator usage, fixed typo
onnxruntime/test/autoep/library/ep_factory.cc Renamed variable from cpu_allocator to device_allocator
onnxruntime/python/onnxruntime_pybind_state.cc Enhanced tensor-to-numpy conversion to use plugin EP data transfer as fallback
onnxruntime/python/onnxruntime_pybind_ortvalue.cc Added plugin EP allocator/data transfer fallback for device memory allocation
onnxruntime/python/onnxruntime_pybind_mlvalue.h Changed MemCpyFunc from function pointer to std::function, added new helper function declarations
onnxruntime/python/onnxruntime_pybind_mlvalue.cc Implemented helper functions for getting shared allocators and creating data transfer functions
onnxruntime/python/onnxruntime_inference_collection.py Enhanced OrtDevice creation to support vendor_id parameter and simplified device creation logic
onnxruntime/core/session/environment.cc Added shared allocator lookup with alignment-ignoring comparison and cleanup improvements
include/onnxruntime/core/session/environment.h Added GetRegisteredSharedAllocator method and made mutex mutable
include/onnxruntime/core/framework/ortdevice.h Added EqualIgnoringAlignment method for device comparison

Clarify/expand some comments.
Fix handling of CANN to set vendor id in backwards compatibility path.
@adrianlizarraga adrianlizarraga merged commit c65de9f into main Aug 6, 2025
111 of 120 checks passed
@adrianlizarraga adrianlizarraga deleted the skottmckay/AddSharedAllocatorAndDataTransferSupportToPythonBindings branch August 6, 2025 16:19
apwojcik added a commit to ROCm/onnxruntime that referenced this pull request Aug 7, 2025
@TedThemistokleous
Copy link
Contributor

@snnn this broke Mainline for AMD, not sure why this got in without anyone from out end even looking at this or CI checks right now.

@skottmckay
Copy link
Contributor Author

@snnn this broke Mainline for AMD, not sure why this got in without anyone from out end even looking at this or CI checks right now.

I'm not sure why the conflict detection didn't work. Typically if main has changed a PR that has conflicts with that change should be blocked from being checked in. The signature of GetCudaToHostMemCpyFunction for example was changed in different ways (return value in one PR, arguments in the other) by the two PRs and that should be trivial to detect as a conflict.

@snnn is something not setup or working correctly in the ORT repo?

@TedThemistokleous
Copy link
Contributor

@skottmckay - Relevant read #25532

You guys are flying blind on all MIGraphX/ROCm/AMD stuff. Feel free to reach out if you need further support and resources from AMD.

@skottmckay
Copy link
Contributor Author

@skottmckay - Relevant read #25532

You guys are flying blind on all MIGraphX/ROCm/AMD stuff. Feel free to reach out if you need further support and resources from AMD.

I'm slightly confused as to what the breakage was though. Was the breakage due to the checkin of this PR with the pending MIGraphX PR or did it break checked in code? If the latter, the conflict detection should have caught it, but based on the sequence of changes it's not clear to me that that was the case.

Original main from June 13
const std::unordered_map<OrtDevice::DeviceType, MemCpyFunc>* GetCudaToHostMemCpyFunction() {

Change from this PR updated the return value but the function still took no args.
const std::unordered_map<OrtDevice, MemCpyFunc>* GetCudaToHostMemCpyFunction() {

Change in new PR adds a new argument which matches what's in #25583 but as that's not checked in yet wouldn't have blocked any other changes going into main.
const std::unordered_map<OrtDevice, MemCpyFunc>* GetCudaToHostMemCpyFunction(const OrtDevice& device) {

What is 'mainline for AMD' building against?

adrianlizarraga pushed a commit that referenced this pull request Aug 8, 2025
…ataTransfer registered by a plugin EP in the Environment (#25346)

### Description
<!-- Describe your changes. -->
Add ability to get shared allocator from env.
Add ability to create a MemCpyFunc using the IDataTransfer from the
environment.

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
adrianlizarraga added a commit that referenced this pull request Aug 8, 2025
…5, 25652 (#25701)

### Description
Cherry-pick the following PRs into the `rel-1.23.0` branch:

- #25391
- #25611
- #25656
- #25346
- #25374
- #25664
- #25675
- #25652


### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->

---------

Co-authored-by: Yulong Wang <[email protected]>
Co-authored-by: Ishwar Raut <[email protected]>
Co-authored-by: Maximilian Müller <[email protected]>
Co-authored-by: Gaurav Garg <[email protected]>
Co-authored-by: Scott McKay <[email protected]>
Co-authored-by: Chi Lo <[email protected]>
Co-authored-by: Abhishek Jindal <[email protected]>
Co-authored-by: Dmitri Smirnov <[email protected]>
sanketkaleoss pushed a commit to sanketkaleoss/onnxruntime that referenced this pull request Aug 11, 2025
…ataTransfer registered by a plugin EP in the Environment (microsoft#25346)

### Description
<!-- Describe your changes. -->
Add ability to get shared allocator from env.
Add ability to create a MemCpyFunc using the IDataTransfer from the
environment.

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
gedoensmax pushed a commit to gedoensmax/onnxruntime that referenced this pull request Sep 2, 2025
…ataTransfer registered by a plugin EP in the Environment (microsoft#25346)

### Description
<!-- Describe your changes. -->
Add ability to get shared allocator from env.
Add ability to create a MemCpyFunc using the IDataTransfer from the
environment.

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants