1.23.0 cherry-pick prs 25391, 25611, 25656, 25346, 25374, 25664, 25675, 25652 #25701
Merged
adrianlizarraga
merged 8 commits into
rel-1.23.0
from
adrianl/rel-1.23.0/cherrypick_080825
Aug 8, 2025
### Description
For f16 uniform variables, use u32 to represent them bit-wise.
### Motivation and Context
Some devices support f16 in shader/storage buffers but not in uniform buffers. Dawn sets `f16_support` to false for them. However, uniforms do not necessarily have to use f16. This change, together with #25349, will enable using f16 models on some Android devices.
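As a rough illustration of the bit-wise representation described above, here is a minimal Python sketch of storing a half-precision value in a 32-bit unsigned word and reading it back. The helper names `f16_bits_as_u32` and `u32_to_f16` are hypothetical, and placing the 16 bits in the low half of the word is an assumption; the actual layout used by the change is not stated here.

```python
import struct


def f16_bits_as_u32(value: float) -> int:
    """Return the IEEE 754 half-precision bit pattern of `value`,
    zero-extended into a 32-bit unsigned integer (assumed layout)."""
    # "<e" packs a half-precision float; "<H" reinterprets the same 2 bytes as u16.
    (bits,) = struct.unpack("<H", struct.pack("<e", value))
    return bits


def u32_to_f16(word: int) -> float:
    """Recover the half-precision value stored in the low 16 bits of `word`."""
    (value,) = struct.unpack("<e", struct.pack("<H", word & 0xFFFF))
    return value


if __name__ == "__main__":
    word = f16_bits_as_u32(1.0)
    print(hex(word), u32_to_f16(word))
```

The point of the sketch is only that the uniform buffer never needs to declare an f16 type: the host writes an integer and the consumer reinterprets the bits.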
### Description
1. Disable Turing GPU EP devices
### Motivation and Context
Turing will not be supported in this release. @chilo-ms @jywu-msft
This currently holds 2 major improvements:
- Dynamic shape models should have much lower memory usage, and memory management is moved to ORT allocators.
- The overhead for shape binding and address updates is reduced per inference.
---------
Co-authored-by: Gaurav Garg <[email protected]>
…ataTransfer registered by a plugin EP in the Environment (#25346)
### Description
Add the ability to get a shared allocator from the env. Add the ability to create a MemCpyFunc using the IDataTransfer from the environment.
### Description
Add support for onnxruntime_perf_test to register plugin EP DLLs and run
plugin EPs.
As support for plugin execution providers (EPs) requires additional
options and most single-character options have already been used,
multi-character options are now necessary to ensure clarity and
readability. Therefore, support for `Abseil flags` is added, which
enables multi-character options and provides cross-platform
compatibility.
**New options:**
- `--plugin_ep_libs [registration names and libraries]` Specifies a list
of plugin execution provider (EP) registration names and their
corresponding shared libraries to register.
[Usage]: `--plugin_ep_libs "plugin_ep_name_1|plugin_ep_1.dll
plugin_ep_name_2|plugin_ep_2.dll ... "`
- `--plugin_eps [Plugin EPs]` Specifies a semicolon-separated list of
plugin execution providers (EPs) to use.
[Usage]: `--plugin_eps "plugin_ep_1;plugin_ep_2;... "`
- `--plugin_ep_options [EP options]` Specifies provider options for each
EP listed in `--plugin_eps`. Options (key-value pairs) for each EP are
separated by spaces, and EPs are separated by semicolons.
[Usage]:
`--plugin_ep_options "ep_1_option_1_key|ep_1_option_1_value
...;ep_2_option_1_key|ep_2_option_1_value ...;..."` or
`--plugin_ep_options ";ep_2_option_1_key|ep_2_option_1_value ...;..."`
or
`--plugin_ep_options "ep_1_option_1_key|ep_1_option_1_value
...;;ep_3_option_1_key|ep_3_option_1_value ...;..."`
- `--list_ep_devices` Prints all available device indices and their
properties (including metadata). This option makes the program exit
early without performing inference.
- `--select_ep_devices [list of device indices]` A semicolon-separated
list of device indices to add to the session and run with.
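The argument formats listed above can be illustrated with a small Python sketch. This is not the parser used by onnxruntime_perf_test; the function names `parse_plugin_ep_libs` and `parse_plugin_ep_options` are hypothetical, and the sketch assumes that names, paths, keys, and values contain no spaces and that an empty segment between semicolons means "no options for that EP".

```python
def parse_plugin_ep_libs(arg: str) -> dict:
    """Parse "name_1|lib_1 name_2|lib_2" into {name: library_path}."""
    libs = {}
    for item in arg.split():
        # Split on the first "|" only, so a path such as C:\ep.dll stays intact.
        name, _, path = item.partition("|")
        libs[name] = path
    return libs


def parse_plugin_ep_options(arg: str) -> list:
    """Parse "k|v k|v;;k|v" into one options dict per EP, in EP order.

    EPs are separated by ";" and options within an EP by spaces.
    An empty segment yields an empty dict (assumed meaning: no options).
    """
    per_ep = []
    for segment in arg.split(";"):
        options = {}
        for item in segment.split():
            key, _, value = item.partition("|")
            options[key] = value
        per_ep.append(options)
    return per_ep


if __name__ == "__main__":
    print(parse_plugin_ep_libs(r"TensorRTEp|C:\TensorRTEp.dll example_ep|C:\example_plugin_ep.dll"))
    print(parse_plugin_ep_options("k1|v1 k2|v2;;k3|v3"))
```

Keeping one dict per position preserves the pairing between the entries in `--plugin_eps` and their options, which is why the empty segments in the usage strings matter.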
**Usage:**
1. Use `--plugin_ep_libs` and `--list_ep_devices` to list all the
devices.
````sh
--list_ep_devices --plugin_ep_libs "TensorRTEp|C:\TensorRTEp.dll example_ep|C:\example_plugin_ep.dll"
````
This prints the device info:
````
===== EP device id 0 ======
name: CPUExecutionProvider
vendor: Microsoft
metadata:
version: 1.23.0
===== EP device id 1 ======
name: example_ep
vendor: Contoso
metadata:
supported_devices: CrackGriffin 7+
version: 0.1.0
===== EP device id 2 ======
name: TensorRTEp
vendor: Nvidia
metadata:
gpu_type: data center
version: 0.1.0
````
2. Use `--select_ep_devices` to select the device by index, and add
`--plugin_eps` to specify the EP name. The EP name should match the name
passed to the EP library to create the EP factory.
````sh
--plugin_ep_libs "TensorRTEp|C:\TensorRTEp.dll" --select_ep_devices 2 --plugin_eps TensorRTEp -r 1 C:\mul_op\mul_1.onnx
````
3. Or simply use `-e` to specify the EP name. ORT perf test will add all
the devices created by the plugin EP.
The EP name should match the name passed to the EP library to create
the EP factory.
````sh
--plugin_ep_libs "TensorRTEp|C:\TensorRTEp.dll" --plugin_eps TensorRTEp -r 1 C:\mul_op\mul_1.onnx
````
### Description
Relax restriction on DML EP so other CPU-based EPs can be used.
### Motivation and Context
#25504
### Description
Update the number of mel bins for the Whisper model, as it differs based on the Whisper model version. Without this change, Whisper v3 models cannot run, because their num_mel_bins is 128.
### Motivation and Context
Whisper v3 models currently cannot run: their num_mel_bins is 128, but the value is fixed to 80, which causes an issue during preprocessing.
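The idea of reading the value from the model instead of hard-coding it can be sketched in Python as follows. This is an illustration, not the code from the PR: `resolve_num_mel_bins` is a hypothetical helper, and it assumes the model configuration is available as a dictionary with an optional `num_mel_bins` entry, falling back to the previously fixed value of 80.

```python
# Previously fixed value mentioned in the description.
DEFAULT_NUM_MEL_BINS = 80


def resolve_num_mel_bins(model_config: dict) -> int:
    """Return the number of mel bins for a Whisper model.

    Uses the config's "num_mel_bins" entry when present (for example 128
    for Whisper v3) and falls back to 80 otherwise.
    """
    return int(model_config.get("num_mel_bins", DEFAULT_NUM_MEL_BINS))


if __name__ == "__main__":
    print(resolve_num_mel_bins({"num_mel_bins": 128}))
    print(resolve_num_mel_bins({}))
```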
Make Node::ToProto() const call Graph::ToGraphProto() const so that it processes all of the subgraphs recursively and removes all the in-memory references.

Node::ToProto() const does not respect the constness of subgraph attributes and calls the non-const version of Graph::ToGraphProto(), which does not process subgraph initializers and does not remove in-memory references.
yuslepukhin
approved these changes
Aug 8, 2025
fs-eire
approved these changes
Aug 8, 2025
chilo-ms
approved these changes
Aug 8, 2025
skottmckay
approved these changes
Aug 8, 2025
Description
Cherry-pick the following PRs into the rel-1.23.0 branch: #25391, #25611, #25656, #25346, #25374, #25664, #25675, #25652
Motivation and Context