@adrianlizarraga commented Jul 21, 2025

Description

  • Disables graph optimizations by default when using the explicit compiling API.
  • Adds ModelCompilationOptions_SetGraphOptimizationLevel to allow the user to set an optimization level.
  • Adds C++, Python, and C# bindings for the new API function.
  • Updates ModelCompilationOptions_SetFlags to take a uint32_t flags parameter instead of size_t flags, so the parameter has the same size on every platform. This API is not yet in a public ORT release, so it is safe to modify.
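The point about parameter width in the last bullet can be illustrated with a small standalone sketch. The flag names (`kExampleFlagA`, `kExampleFlagB`) and the helpers `CombineFlags` and `HasFlag` are hypothetical stand-ins and are not ORT's enum or API. The sketch relies only on the fact that `std::uint32_t` is exactly 32 bits wherever it is defined, whereas the width of `std::size_t` depends on the target (commonly 32 bits on 32-bit targets and 64 bits on 64-bit targets).

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>

// Hypothetical flag bits for illustration only; not ORT's real flag enum.
enum ExampleCompileFlags : std::uint32_t {
  kExampleFlagNone = 0,
  kExampleFlagA = 1u << 0,
  kExampleFlagB = 1u << 1,
};

// A fixed-width type has the same size on every platform that provides it.
static_assert(sizeof(std::uint32_t) * CHAR_BIT == 32,
              "uint32_t is exactly 32 bits");

// Combine two flag values into one 32-bit mask.
std::uint32_t CombineFlags(std::uint32_t a, std::uint32_t b) { return a | b; }

// Return true if any bit of `flag` is set in `flags`.
bool HasFlag(std::uint32_t flags, std::uint32_t flag) {
  return (flags & flag) != 0;
}

// Width of size_t in bits on the current target; this value is
// platform-dependent, which is what a fixed-width parameter avoids.
constexpr std::size_t SizeTBits() { return sizeof(std::size_t) * CHAR_BIT; }

// Small self-check showing the helpers in use.
inline void DemoFlags() {
  const std::uint32_t flags = CombineFlags(kExampleFlagA, kExampleFlagB);
  assert(HasFlag(flags, kExampleFlagA));
  assert(HasFlag(flags, kExampleFlagB));
}
```

A fixed-width flags parameter keeps the function signature identical across 32-bit and 64-bit builds, which also simplifies language bindings that must declare the parameter type explicitly.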

Motivation and Context

When compiling, prefer allowing the EP to do the optimizations instead of ORT.

@yuslepukhin requested a review from Copilot August 29, 2025 22:08

Copilot AI left a comment

Pull Request Overview

This PR disables graph optimizations by default in the explicit compiling API, preferring to let execution providers handle optimizations instead. It adds a new method to allow users to manually enable optimizations when needed.

  • Changes the default graph optimization level for compilation to TransformerLevel::Default (level 0, the minimum set of optimizations); previously, graph optimizations were applied by default
  • Adds ModelCompilationOptions_SetGraphOptimizationLevel API to allow users to explicitly set optimization levels
  • Updates test expectations to account for different EPContext node input counts based on optimization settings
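The "off by default, opt in explicitly" behavior summarized above can be modeled with a minimal sketch. Everything here is hypothetical: `ExampleOptLevel` and `ExampleCompilationOptions` are illustrative stand-ins and do not reproduce ORT's actual types, enum values, or method signatures. The sketch only shows the pattern of defaulting to the lowest level and letting the caller raise it.

```cpp
#include <cassert>

// Hypothetical optimization levels; names are illustrative, not ORT's enum.
enum class ExampleOptLevel { kDisableAll, kBasic, kExtended, kAll };

// Hypothetical options object modeling the behavior described in this PR:
// optimizations stay disabled unless the caller sets a level explicitly.
class ExampleCompilationOptions {
 public:
  // Explicit opt-in to a higher optimization level.
  void SetGraphOptimizationLevel(ExampleOptLevel level) { level_ = level; }

  ExampleOptLevel GetGraphOptimizationLevel() const { return level_; }

  bool OptimizationsEnabled() const {
    return level_ != ExampleOptLevel::kDisableAll;
  }

 private:
  // Default: leave optimization to the execution provider.
  ExampleOptLevel level_ = ExampleOptLevel::kDisableAll;
};

// Small self-check: default first, then explicit opt-in.
inline void DemoDefaultThenOptIn() {
  ExampleCompilationOptions options;
  assert(!options.OptimizationsEnabled());

  options.SetGraphOptimizationLevel(ExampleOptLevel::kBasic);
  assert(options.OptimizationsEnabled());
}
```

Keeping the default in a single member initializer means callers who never touch the setting get the new behavior, while existing workflows that depend on optimizations need only one extra call.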

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| onnxruntime/core/session/model_compilation_options.h | Declares new SetGraphOptimizationLevel method |
| onnxruntime/core/session/model_compilation_options.cc | Implements optimization level setting and sets default to L0 |
| onnxruntime/core/session/compile_api.h | Declares C API wrapper for graph optimization level setting |
| onnxruntime/core/session/compile_api.cc | Implements C API wrapper function |
| include/onnxruntime/core/session/onnxruntime_c_api.h | Adds C API function declaration and documentation |
| include/onnxruntime/core/session/onnxruntime_cxx_api.h | Adds C++ wrapper method declaration |
| include/onnxruntime/core/session/onnxruntime_cxx_inline.h | Implements C++ wrapper method |
| onnxruntime/test/providers/qnn/qnn_ep_context_test.cc | Updates tests to handle different optimization behaviors and adds explicit optimization setting |

@adrianlizarraga merged commit 45ffd99 into main Sep 3, 2025 (92 checks passed)
@adrianlizarraga deleted the adrianl/compile-api-disable-optimizations-default branch September 3, 2025 15:48
tianleiwu pushed a commit that referenced this pull request Sep 4, 2025
@tianleiwu added the cherry-picked label (cherry-picked for a cherrypicks branch) and removed the release:1.23.0 label Sep 4, 2025
jywu-msft pushed a commit that referenced this pull request Sep 5, 2025
### Description
Cherry-pick the following PRs:
#25943
#25937 
#25917
#25909
#25898
#25897
#25888
#25881
#25830
#25619
#25575
#25572
#25558
#25530
#25474
#25455
#25110

Also two dependent PRs for qMoE cpu: 
#25877
#25822

---------

Co-authored-by: xiaomsft <[email protected]>
Co-authored-by: Xiaoyan Hu <[email protected]>
Co-authored-by: Akshay Sonawane <[email protected]>
Co-authored-by: Kunal Vaishnavi <[email protected]>
Co-authored-by: Pradeep Sakhamoori <[email protected]>
Co-authored-by: mingyue <[email protected]>
Co-authored-by: Maximilian Müller <[email protected]>
Co-authored-by: Adrian Lizarraga <[email protected]>
Co-authored-by: Dmitri Smirnov <[email protected]>
Co-authored-by: Emmanuel <[email protected]>
Co-authored-by: Emmanuel Assumang <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: praneshgo <[email protected]>
Co-authored-by: Hariharan Seshadri <[email protected]>
Co-authored-by: Jing Fang <[email protected]>
Co-authored-by: Ishwar Raut <[email protected]>