Compile API: disable optimizations by default #25474
Pull Request Overview
This PR disables graph optimizations by default in the explicit compiling API, preferring to let execution providers handle optimizations instead. It adds a new method to allow users to manually enable optimizations when needed.
- Changes the default graph optimization level to `TransformerLevel::Default` (minimum optimizations) instead of the previous behavior
- Adds the `ModelCompilationOptions_SetGraphOptimizationLevel` API to allow users to explicitly set optimization levels
- Updates test expectations to account for different EPContext node input counts based on optimization settings
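The default-off, opt-in behavior described above can be illustrated with a small stand-alone sketch. This is not ONNX Runtime code: `OptLevelSketch`, `CompileOptionsSketch`, and `OrtOptimizesGraph` are hypothetical names, and mapping "optimizations disabled" to level 0 is an assumption based on the "sets default to L0" note in the file summary.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a graph optimization level (not the ORT enum).
enum class OptLevelSketch : std::uint32_t {
  DisableAll = 0,  // assumed to correspond to the minimum (L0) level
  Basic = 1,
  Extended = 2,
  All = 99,
};

// Simplified model of compilation options: graph optimizations are off
// unless the caller explicitly asks for them.
class CompileOptionsSketch {
 public:
  // Explicit opt-in, playing the role the PR assigns to
  // ModelCompilationOptions_SetGraphOptimizationLevel.
  void SetGraphOptimizationLevel(OptLevelSketch level) { level_ = level; }

  OptLevelSketch GraphOptimizationLevel() const { return level_; }

 private:
  // Default chosen so that the execution provider, not the runtime,
  // is left to optimize the graph.
  OptLevelSketch level_ = OptLevelSketch::DisableAll;
};

// Returns true when the runtime itself would run graph optimizations.
inline bool OrtOptimizesGraph(const CompileOptionsSketch& options) {
  return options.GraphOptimizationLevel() != OptLevelSketch::DisableAll;
}
```

In this sketch a default-constructed `CompileOptionsSketch` reports `DisableAll`, and only a call to `SetGraphOptimizationLevel` changes that, which is the shape of behavior the PR describes for the compile API.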
Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| onnxruntime/core/session/model_compilation_options.h | Declares new SetGraphOptimizationLevel method |
| onnxruntime/core/session/model_compilation_options.cc | Implements optimization level setting and sets default to L0 |
| onnxruntime/core/session/compile_api.h | Declares C API wrapper for graph optimization level setting |
| onnxruntime/core/session/compile_api.cc | Implements C API wrapper function |
| include/onnxruntime/core/session/onnxruntime_c_api.h | Adds C API function declaration and documentation |
| include/onnxruntime/core/session/onnxruntime_cxx_api.h | Adds C++ wrapper method declaration |
| include/onnxruntime/core/session/onnxruntime_cxx_inline.h | Implements C++ wrapper method |
| onnxruntime/test/providers/qnn/qnn_ep_context_test.cc | Updates tests to handle different optimization behaviors and add explicit optimization setting |
### Description
- Disables graph optimizations by default when using the explicit compiling API.
- Adds `ModelCompilationOptions_SetGraphOptimizationLevel` to allow the user to set an optimization level.
- Adds C++, Python, and C# bindings for the new API function.
- Updates `ModelCompilationOptions_SetFlags` to take in a `uint32_t flags` parameter instead of `size_t flags` to ensure the same size across platforms. This API is not yet in a public ORT release, so safe to modify.

### Motivation and Context
When compiling, prefer allowing the EP to do the optimizations instead of ORT.
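The reasoning behind the `size_t` to `uint32_t` change can be shown with a short sketch. `std::size_t` is typically 32 bits wide on 32-bit targets and 64 bits on 64-bit targets, while `std::uint32_t` is exactly 32 bits wherever it exists. The flag names and the `HasFlag` helper below are hypothetical and are not the actual ORT compile flags.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical flag bits for illustration only (not the real ORT flags).
enum SketchCompileFlags : std::uint32_t {
  kSketchFlagNone = 0,
  kSketchFlagA = 1u << 0,
  kSketchFlagB = 1u << 1,
};

// uint32_t has exactly 32 value bits on every platform that provides it.
static_assert(std::numeric_limits<std::uint32_t>::digits == 32,
              "uint32_t is always 32 bits wide");

// size_t has no such guarantee; its width depends on the target platform.
constexpr int kSizeTBits = std::numeric_limits<std::size_t>::digits;

// A fixed-width parameter keeps the function's ABI identical across
// platforms and across language bindings that must mirror it.
constexpr bool HasFlag(std::uint32_t flags, SketchCompileFlags flag) {
  return (flags & static_cast<std::uint32_t>(flag)) != 0;
}
```

With a fixed-width parameter, bindings in other languages can declare one integer type for the flags argument instead of selecting one per platform.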
### Description
Cherry-pick the following PRs: #25943 #25937 #25917 #25909 #25898 #25897 #25888 #25881 #25830 #25619 #25575 #25572 #25558 #25530 #25474 #25455 #25110

Also two dependent PRs for qMoE cpu: #25877 #25822

---------

Co-authored-by: xiaomsft <[email protected]>
Co-authored-by: Xiaoyan Hu <[email protected]>
Co-authored-by: Akshay Sonawane <[email protected]>
Co-authored-by: Kunal Vaishnavi <[email protected]>
Co-authored-by: Pradeep Sakhamoori <[email protected]>
Co-authored-by: mingyue <[email protected]>
Co-authored-by: Maximilian Müller <[email protected]>
Co-authored-by: Adrian Lizarraga <[email protected]>
Co-authored-by: Dmitri Smirnov <[email protected]>
Co-authored-by: Emmanuel <[email protected]>
Co-authored-by: Emmanuel Assumang <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: praneshgo <[email protected]>
Co-authored-by: Hariharan Seshadri <[email protected]>
Co-authored-by: Jing Fang <[email protected]>
Co-authored-by: Ishwar Raut <[email protected]>