@snnn snnn commented Sep 16, 2025

fs-eire and others added 18 commits July 26, 2025 21:36
- DynamicQuantizeMatMul - handle case where B zero point input is
provided but not constant. (#25544)
- Refactor plugin EP support (#25541)
- Remove the python installation steps from
win-qnn-arm64-ci-pipeline.yml (#25552)
- [EP ABI] Node_GetAttrByName returns ORT_NOT_FOUND with non-existing
attr name (#25565)
- Fix C/C++ documentation generation (#25569)
- [build] fix multi-config for VCPKG (#25585)
…1.23.0 release branch (#25606)

### Description
Cherry-pick #25566 for ORT 1.23.
This PR cherry-picks some pipeline changes from the main branch to the
1.23.0 release branch.


- **[build] disable CodeQL for NPM Packaging Pipeline (#25614)**
- **Refactor Java Test Pipeline (#25608)**
- **[build] upgrade Node.js for NPM packaging pipeline (#25568)**

And a WebGPU change:

- **[webgpu] Apply Flash Attention if sliding window exceeds KV cache
length (#25594)**
### Description
Move the step that moves weights to memory to the end of `Graph::Resolve()`.
Modify Inject so that it copies data into the TensorProto according to the
C API docs.

### Motivation and Context
TypeAndShape inference runs as a part of `Resolve()`, and it is unable to
inspect and load the initializers that point to OrtValues at that time.
We chose to move the TensorProto to OrtValue conversion to the end of
`Resolve()`.

References: #25579

Co-authored-by: Dmitri Smirnov <[email protected]>
Cherry-pick MiGraphX EP fixes from upstream for rel-1.23.0

This PR cherry-picks three critical fixes for the MiGraphX Execution
Provider:

1. Fix compilation after cherry-picking from win-onnxruntime (#25516)
   - Adds ORT_UNUSED_PARAMETER(num_devices) to fix unused parameter warning
   - Corrects struct usage in CreateIExecutionProvider method

2. Fix CreateExecutionProviderFactory with correct struct and change
   vendor_id (#25625)
   - Updates vendor_id from 0x1002 to 0x9999 to allow DML EP to be default
   - Ensures proper device ordering in provider_policy_context.cc

3. Update OrtEpFactory in MiGraphX EP (#25567)
   - Adds complete OrtEpFactory infrastructure for auto EP selection
   - Implements all required factory methods with noexcept specifiers
   - Sets ort_version_supported to ORT_API_VERSION
   - Enables MiGraphX/AMDGPU EP integration with hardware device detection

These fixes ensure MiGraphX EP builds correctly and integrates properly
with the ORT execution provider selection framework in the 1.23.0 release.

Cherry-picked commits:
- 87f1499
- 14ca6df  
- 131cf40

---------

Co-authored-by: Artur Wojcik <[email protected]>
Co-authored-by: Owen Zhang <[email protected]>
Co-authored-by: ozhang <[email protected]>
…5, 25652 (#25701)

### Description
Cherry-pick the following PRs into the `rel-1.23.0` branch:

- #25391
- #25611
- #25656
- #25346
- #25374
- #25664
- #25675
- #25652



---------

Co-authored-by: Yulong Wang <[email protected]>
Co-authored-by: Ishwar Raut <[email protected]>
Co-authored-by: Maximilian Müller <[email protected]>
Co-authored-by: Gaurav Garg <[email protected]>
Co-authored-by: Scott McKay <[email protected]>
Co-authored-by: Chi Lo <[email protected]>
Co-authored-by: Abhishek Jindal <[email protected]>
Co-authored-by: Dmitri Smirnov <[email protected]>
### Description
Cherry-pick the following PRs into `rel-1.23.0`:
- #25629
- #25583




---------

Co-authored-by: Chunye Wang@AMD <[email protected]>
Co-authored-by: mingyue <[email protected]>
Co-authored-by: Artur Wojcik <[email protected]>
Co-authored-by: urpetkov-amd <[email protected]>
Co-authored-by: Ted Themistokleous <[email protected]>
Co-authored-by: Ted Themistokleous <[email protected]>
Co-authored-by: Scott McKay <[email protected]>
### Description
Cherry-picks #25725 into
the `rel-1.23.0` branch.




Co-authored-by: Ankit Maheshkar <[email protected]>
Co-authored-by: jatinwadhwa921 <[email protected]>
### Description
Cherry-pick the following PRs into the `rel-1.23.0` branch:
- #25592
- #25622
- #25688
- #25729
- #25743
- #25769
- #25745
- #25761
- #25751
- #25716
- #25228
- #25768
- #25788
- #25747
- #25800
- #25818
- #25762
- #25749
- #25831



---------

Co-authored-by: quic-tirupath <[email protected]>
Co-authored-by: quic-calvnguy <[email protected]>
Co-authored-by: qti-kromero <[email protected]>
Co-authored-by: Jeff Kilpatrick <[email protected]>
Co-authored-by: Scott McKay <[email protected]>
Co-authored-by: David Fan <[email protected]>
Co-authored-by: kuanyul-qti <[email protected]>
Co-authored-by: Dmitri Smirnov <[email protected]>
Co-authored-by: Chi Lo <[email protected]>
Co-authored-by: Edward Chen <[email protected]>
Co-authored-by: Chunye Wang@AMD <[email protected]>
Co-authored-by: minfhong-qti <[email protected]>
Co-authored-by: Vishal Agarwal <[email protected]>
Co-authored-by: Maximilian Müller <[email protected]>
Co-authored-by: Maximilian Müller <[email protected]>
Co-authored-by: Changming Sun <[email protected]>
Co-authored-by: adrastogi <[email protected]>
Co-authored-by: Aditya Rastogi <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
- **Relax WeightBiasQuantization constraint for larger QDQ node group
(#25673)**
- **Add cuda graph implementation for NV TRT RTX EP (#25787)**
- **python GPU IO Bindings for NVIDIA  (#25776)**
- **Fixes for DynamicQuantizeMatMul and Attention3D tests (#25814)**
- **Fix a long standing bug on file memory mapping on windows.
(#25833)**
- **Add API for precompiled model compatibility check using just the
compat info (#25841)**
- **Enable ABSL_FLAGS flag registration for onnxruntime_perf_test for
mobile build (#25849)**
- **Add default constructor to Ort::Status. (#25860)**
- #25871
- #25878
- #25884
- #25886
- #25866
### Description
Cherry-pick the following PRs:
#25943
#25937 
#25917
#25909
#25898
#25897
#25888
#25881
#25830
#25619
#25575
#25572
#25558
#25530
#25474
#25455
#25110

Also two dependent PRs for qMoE cpu: 
#25877
#25822

---------

Co-authored-by: xiaomsft <[email protected]>
Co-authored-by: Xiaoyan Hu <[email protected]>
Co-authored-by: Akshay Sonawane <[email protected]>
Co-authored-by: Kunal Vaishnavi <[email protected]>
Co-authored-by: Pradeep Sakhamoori <[email protected]>
Co-authored-by: mingyue <[email protected]>
Co-authored-by: Maximilian Müller <[email protected]>
Co-authored-by: Adrian Lizarraga <[email protected]>
Co-authored-by: Dmitri Smirnov <[email protected]>
Co-authored-by: Emmanuel <[email protected]>
Co-authored-by: Emmanuel Assumang <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: praneshgo <[email protected]>
Co-authored-by: Hariharan Seshadri <[email protected]>
Co-authored-by: Jing Fang <[email protected]>
Co-authored-by: Ishwar Raut <[email protected]>
This PR cherry-picks the following PRs to the rel-1.23.0 branch:

* #25938
* #25957
* #25960
* #25968
* #25971

---------

Co-authored-by: Dmitri Smirnov <[email protected]>
Co-authored-by: Adrian Lizarraga <[email protected]>
Co-authored-by: Hariharan Seshadri <[email protected]>
This PR cherry-picks the following PRs to the release branch:
- #25988
- #25991

---------

Co-authored-by: Dmitri Smirnov <[email protected]>
Co-authored-by: umangb-09 <[email protected]>
This PR cherry-picks several commits from the main branch to the
rel-1.23.0 release branch as part of the release process.

### Changes included:

*   **Major Refactoring of Azure DevOps Pipelines (#26008)**
    *   Commit: `2e6d7ccfdff55aaf7b0799d7e28b041e607dce2b`
*   **Disables failing test to unblock Python DML Pipeline (#26043)**
    *   Commit: `64c8f40d01bf14b3cf7ac4cf8606ad9e0e56feb0`
*   **Pin cmake version in macOS github Actions (#25998)**
    *   Commit: `148f13cc6b44cae156226cd4e0dcfc154691c5b4`
*   **Bump actions/setup-python from 5 to 6 (#25979)**
    *   Commit: `97a8d332595c974ad24be133df216565493ffb95`
*   **Remove CACHE_URL settings from Github Actions (#25989)**
    *   Commit: `e2a0999ba4b224ab90ef7a8768dd4941fcc19b17`
*   **Bump actions/checkout from 4 to 5 (#25771)**
    *   Commit: `f19215db21f8e1a8fc93090748e455f41076f456`
*   **Bump ruff from 0.12.8 to 0.12.9 (#25772)**
    *   Commit: `78df404871fa2f3fbbb7f1902f9623787ba8dc86`
*   **Bump ruff from 0.12.4 to 0.12.8 (#25713)**
    *   Commit: `7204746e709005d2c7294e7a24d63a2df4a1aee8`
*   **Update macOS target version from 13.3 to 13.4 (#25616)**
    *   Commit: `65bd82564cd31e0acf9139cdd826d08193212c6e`

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Prathik Rao <[email protected]>
This pull request introduces a significant refactoring of the Azure
Pipelines CI/CD infrastructure. The primary goals of these changes are
to:
1. Prevent the vcpkg/cmake version from changing when the CI build
machine image changes, which can suddenly break the pipelines and
interrupt our release process.
2. Reduce the running time of the `Zip-Nuget-Java-Nodejs Packaging
Pipeline` by changing how macOS universal2 binaries are built.
3. Add support for RC releases of Java packages.

**1. Standardized Build Tool Setup (`setup-build-tools.yml`)**

A new reusable template, `setup-build-tools.yml`, has been created to
centralize the setup of essential build tools.

* **Pinned Dependencies:** This new template allows us to pin specific
versions of `cmake` and `vcpkg`, which was not previously possible. This
ensures a more stable and predictable build environment across all
pipelines.
* **Reduced Redundancy:** By consolidating the setup logic into a single
template, we have significantly reduced code duplication across numerous
pipeline files, making them cleaner and easier to maintain.

Currently, this file is used only in the macOS and Windows pipelines,
since most Linux pipelines use Docker to manage their environment.
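
As a rough illustration of what pinning buys us, the sketch below checks that the `cmake` found on a build machine matches a pinned version. It is not the content of `setup-build-tools.yml`; the function names and the pinned version number are hypothetical, and the only assumption about the tool is that `cmake --version` prints a first line of the form `cmake version X.Y.Z`.

```python
import re
import subprocess

# Hypothetical pin for illustration only; not taken from the PR.
PINNED_CMAKE_VERSION = "3.31.6"


def parse_cmake_version(version_output: str) -> str:
    """Extract X.Y.Z from the text printed by `cmake --version`."""
    match = re.search(r"cmake version (\d+\.\d+\.\d+)", version_output)
    if match is None:
        raise ValueError("could not find a cmake version in the output")
    return match.group(1)


def check_pinned_cmake(expected: str = PINNED_CMAKE_VERSION) -> None:
    """Fail fast if the machine image ships a different cmake version."""
    output = subprocess.run(
        ["cmake", "--version"], check=True, capture_output=True, text=True
    ).stdout
    actual = parse_cmake_version(output)
    if actual != expected:
        raise RuntimeError(f"expected cmake {expected}, found {actual}")
```

A check like this turns a silent tool upgrade in the machine image into an explicit, early failure.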

**2. Reworked macOS Universal Binary Build Process**

The methodology for building macOS `universal2` binaries has been
fundamentally changed to improve reliability and flexibility.

* **Python Packaging:** The Python packaging pipeline will no longer
produce `universal2` wheels. Instead, it will generate separate wheels
for `x86_64` and `arm64` architectures.
* **NuGet C-API Packaging:** The NuGet C-API pipeline has been updated
to first build the `x86_64` and `arm64` binaries independently. These
single-architecture binaries are then combined to create the final
universal package, rather than building a `universal2` package in a
single pass.

The change was made mainly because:
- Building for both architectures in a single pass is too slow: it can
take about 5 hours in the ADO machine pool.
- Many MLAS features are disabled when ORT is built that way.
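
The "build separately, then combine" step can be sketched as follows. The description above does not say which tool does the combining; this sketch assumes the standard macOS `lipo` tool, and the function names and library paths are hypothetical.

```python
import subprocess
from pathlib import Path


def lipo_create_command(x86_64_lib, arm64_lib, output) -> list:
    """Build the `lipo` command that merges two single-arch binaries."""
    return [
        "lipo",
        "-create",
        str(x86_64_lib),
        str(arm64_lib),
        "-output",
        str(output),
    ]


def make_universal(x86_64_lib: Path, arm64_lib: Path, output: Path) -> None:
    """Combine the two slices into one universal binary (macOS only)."""
    output.parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(lipo_create_command(x86_64_lib, arm64_lib, output), check=True)
```

Because each slice is built on its own, the two builds can run in parallel and each one keeps its architecture-specific optimizations.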

**3. Java Packaging and Testing Overhaul**

The Download_Java_Tools stage in the "Zip-Nuget-Java-Nodejs Packaging
Pipeline" has been deleted because it is no longer used. It was
originally added to avoid downloading the same Java packages again and
again, which reduced download errors. Now we have set up a private ADO
feed for this.

In addition, the pipeline has the following major changes:

1. MD5 and SHA1 checksum files are now provided along with the Java
package files instead of SHA256. This is because Sonatype's official docs
say that MD5/SHA1 checksums are required while the others are optional.
See:
https://central.sonatype.org/publish/requirements/#supply-javadoc-and-sources
Publishing now fails if the MD5/SHA1 checksum files are missing.
2. The format of the checksum files has changed. Previously we used
Linux's sha256sum command to generate these files, so each checksum file
contained a hash value followed by a filename. That was not the expected
format: Sonatype expects the file to contain only a hash value. This PR
fixes the issue.
3. A few PowerShell scripts were rewritten in Python to improve error
checking and robustness.
4. Added support for generating RC packages. Previously we had to
manually modify the version numbers and manually do the GPG signing.
5. Two new files, `jar-packaging.yml` and `setup-maven.yml`, were added.
We will use Maven to fetch dependent packages (instead of fetching them
directly over HTTP) to improve supply chain security, because Maven
allows us to use a private feed to do so.
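
To make the checksum format in items 1 and 2 concrete, here is a minimal sketch of writing hash-only MD5 and SHA1 files next to an artifact. It is not the script used in the pipeline; the function name and the `<artifact>.md5` / `<artifact>.sha1` naming are assumptions for illustration.

```python
import hashlib
from pathlib import Path


def write_checksum_files(artifact: Path) -> dict:
    """Write <artifact>.md5 and <artifact>.sha1 containing only the hex digest."""
    data = artifact.read_bytes()
    written = {}
    for algo in ("md5", "sha1"):
        digest = hashlib.new(algo, data).hexdigest()
        out = artifact.with_name(artifact.name + "." + algo)
        # Only the hash value is written: no filename column, unlike the
        # "<hash>  <filename>" lines that `sha256sum` prints.
        out.write_text(digest, encoding="ascii")
        written[algo] = out
    return written
```

Calling `write_checksum_files(Path("example.jar"))` (a hypothetical file) would produce `example.jar.md5` and `example.jar.sha1` beside it.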

**4. Dockerfile Enhancements**

The Dockerfiles used in the CI have been updated to use a `BASEIMAGE`
argument. This makes them more flexible, allowing the base image to be
specified at build time, which simplifies maintenance and updates. It
also allows us to use different base image repos in different CI
environments. In the future we will change the GitHub Actions workflows
to fetch base images only from public Docker repos. Meanwhile, ADO
packaging pipelines will continue to use private repos.
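
A small sketch of how a CI script might select the base image at build time, using Docker's standard `--build-arg` option. The helper name, Dockerfile path, image names, and tag are all hypothetical; only the `BASEIMAGE` argument name comes from the description above.

```python
import subprocess


def docker_build_command(dockerfile: str, base_image: str, tag: str, context: str = ".") -> list:
    """Build a `docker build` command that overrides the BASEIMAGE argument."""
    return [
        "docker",
        "build",
        "--build-arg",
        f"BASEIMAGE={base_image}",
        "-f",
        dockerfile,
        "-t",
        tag,
        context,
    ]


def build_image(dockerfile: str, base_image: str, tag: str, context: str = ".") -> None:
    """Run the build; requires a working Docker installation."""
    subprocess.run(docker_build_command(dockerfile, base_image, tag, context), check=True)
```

With this shape, a public CI environment and a private one can share the same Dockerfile and differ only in the `base_image` value they pass.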

**5. Improved Release Management**

The run_packaging_pipelines.py script has been updated to provide more
robust and explicit control over the package versioning for different
build scenarios. This clarifies the process for generating nightly,
release candidate (RC), and final release packages.

The script now handles three distinct cases for package versioning:

* Nightly Packages: For regular CI builds (e.g., on the main branch),
the script triggers the packaging pipelines in "nightly" mode. This sets
the IsReleaseBuild parameter to false and results in packages with a
development version suffix (e.g., 1.2.3-dev-20250909-abcdef).

* Release Candidate (RC) Builds: To create a pre-release or RC build,
the script is run with the --build-mode release flag, along with the
--pre-release-suffix-string (e.g., rc) and --pre-release-suffix-number
(e.g., 1) arguments. This sets the IsReleaseBuild parameter to true and
passes the suffix information to the pipelines, resulting in a
semantically versioned pre-release package (e.g., 1.2.3-rc.1).

* Final Release Builds: For a final release, the script is run with
--build-mode release without any pre-release suffix arguments. This sets
IsReleaseBuild to true, and the resulting package will have a clean,
final version number (e.g., 1.2.3) based on the VERSION_NUMBER file.

Please note:
 - Java packages still support only the second and third modes (RC and
final release).
 - Python packages support only the first and last modes (nightly and
final release).
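
The three versioning cases can be summarized with a small sketch. This is not the code in run_packaging_pipelines.py; the function and parameter names are hypothetical, and the version formats are taken only from the examples given above.

```python
from typing import Optional, Tuple


def package_version(
    base: str,
    build_mode: str,
    date: str = "",
    commit: str = "",
    suffix_string: Optional[str] = None,
    suffix_number: Optional[int] = None,
) -> Tuple[bool, str]:
    """Return (is_release_build, version) for the three cases described above."""
    if build_mode == "nightly":
        # Development suffix, e.g. 1.2.3-dev-20250909-abcdef
        return False, f"{base}-dev-{date}-{commit}"
    if build_mode == "release":
        if suffix_string is not None and suffix_number is not None:
            # Pre-release, e.g. 1.2.3-rc.1
            return True, f"{base}-{suffix_string}.{suffix_number}"
        # Final release, e.g. 1.2.3
        return True, base
    raise ValueError(f"unknown build mode: {build_mode}")
```

For example, `package_version("1.2.3", "release", suffix_string="rc", suffix_number=1)` yields `(True, "1.2.3-rc.1")`.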
Convert the QNN x64 CI pipeline from an ADO pipeline to GitHub Actions.

We shall move all PR pipelines to GitHub Actions.
This PR upgrades the com.diffplug.spotless Gradle plugin to version
7.2.1 in the java/build.gradle file. This brings in the latest features
and bug fixes from the Spotless code formatter.
@snnn snnn closed this Sep 16, 2025