@thevishalagarwal
Contributor

  • Implements GetEPContextNodes()
  • Enables AddExternalInitializersFromFilesInMemory for models that must be passed as a byte stream but are larger than 2GB
  • Adds EP context unit tests covering file input, byte streams, and both embed modes

NOTE: For models larger than 2GB, embed_mode=0 must be used. embed_mode=1 fails because of protobuf's 2GB message size limit.
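
To illustrate the note above, here is a minimal sketch of how a caller might pick the embed mode from an estimated model size. `choose_embed_mode` and `ep_context_config` are hypothetical helpers, not part of ONNX Runtime or this PR. The session config keys (`ep.context_enable`, `ep.context_file_path`, `ep.context_embed_mode`) are assumed to be the EP context session options, and the exact size threshold is an assumption based on protobuf's limit of roughly 2 GiB per serialized message.

```python
# Sketch only: these helpers are hypothetical and not part of ONNX Runtime.

# Assumption: protobuf cannot serialize a single message above 2 GiB - 1 bytes.
PROTOBUF_MAX_BYTES = 2**31 - 1


def choose_embed_mode(estimated_model_bytes: int) -> int:
    """Return 1 (embed the EP context in the model) only when it can fit.

    Return 0 (keep the EP context in a separate file) otherwise.
    """
    if estimated_model_bytes < 0:
        raise ValueError("size must be non-negative")
    return 1 if estimated_model_bytes < PROTOBUF_MAX_BYTES else 0


def ep_context_config(estimated_model_bytes: int, context_path: str) -> dict[str, str]:
    """Build session config entries for EP context generation."""
    return {
        "ep.context_enable": "1",
        "ep.context_file_path": context_path,
        "ep.context_embed_mode": str(choose_embed_mode(estimated_model_bytes)),
    }


if __name__ == "__main__":
    three_gib = 3 * 2**30
    # A 3 GiB model exceeds the limit, so the sketch selects embed_mode=0.
    print(ep_context_config(three_gib, "model_ctx.onnx"))
```

The resulting entries could then be applied one by one to a session options object. The size estimate here stands in for the whole serialized model, so a real caller may want a safety margin below the limit.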

@gedoensmax
Contributor

@chilo-ms we will need a review of this :)
We disabled the option to refit for now (no weightless engines), since we want to make the interaction cleaner.

@thevishalagarwal
Contributor Author

@gedoensmax
Contributor

@jywu-msft We are adding more unit tests that I believe will also help test the compile API etc. in ORT. Can we resurface the topic of running the NV EP in the official ORT CI?

@chilo-ms
Contributor

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 5 pipeline(s).

yuslepukhin added a commit that referenced this pull request Aug 18, 2025
### Description
See the title


### Motivation and Context
Allow traditional (non-plug-in) EPs to access OrtValue initializers.

Re: #25747
@jywu-msft added the ep:NvRTX (NV RTX execution provider) label Aug 19, 2025
@chilo-ms
Contributor

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 5 pipeline(s).

adrianlizarraga pushed a commit that referenced this pull request Aug 21, 2025
@gedoensmax
Contributor

@chilo-ms Since I resolved the cases where weights were unnecessarily copied, based on Dmitri's comments, this should be ready to merge.

@chilo-ms
Contributor

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 5 pipeline(s).

@chilo-ms merged commit 8d8a82b into microsoft:main on Aug 21, 2025
98 of 100 checks passed
adrianlizarraga pushed a commit that referenced this pull request Aug 22, 2025
* Implements `GetEPContextNodes()`
* Enables usage of `AddExternalInitializersFromFilesInMemory` for models
that have to be communicated as byte stream but are larger than 2GB
* Add EP context unit tests for file, bytestreams and both embed modes

NOTE: For large models > 2GB, `embed_mode=0` must be used.
`embed_mode=1` fails due to protobuf limitations

---------

Co-authored-by: Maximilian Müller <[email protected]>
adrianlizarraga added a commit that referenced this pull request Aug 25, 2025
### Description
Cherry-pick the following PRs into the `rel-1.23.0` branch:
- #25592
- #25622
- #25688
- #25729
- #25743
- #25769
- #25745
- #25761
- #25751
- #25716
- #25228
- #25768
- #25788
- #25747
- #25800
- #25818
- #25762
- #25749
- #25831


### Motivation and Context

---------

Co-authored-by: quic-tirupath <[email protected]>
Co-authored-by: quic-calvnguy <[email protected]>
Co-authored-by: qti-kromero <[email protected]>
Co-authored-by: Jeff Kilpatrick <[email protected]>
Co-authored-by: Scott McKay <[email protected]>
Co-authored-by: David Fan <[email protected]>
Co-authored-by: kuanyul-qti <[email protected]>
Co-authored-by: Dmitri Smirnov <[email protected]>
Co-authored-by: Chi Lo <[email protected]>
Co-authored-by: Edward Chen <[email protected]>
Co-authored-by: Chunye Wang@AMD <[email protected]>
Co-authored-by: minfhong-qti <[email protected]>
Co-authored-by: Vishal Agarwal <[email protected]>
Co-authored-by: Maximilian Müller <[email protected]>
Co-authored-by: Changming Sun <[email protected]>
Co-authored-by: adrastogi <[email protected]>
Co-authored-by: Aditya Rastogi <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
gedoensmax pushed a commit to gedoensmax/onnxruntime that referenced this pull request Sep 2, 2025
gedoensmax added a commit to gedoensmax/onnxruntime that referenced this pull request Sep 2, 2025
Labels

ep:NvRTX NV RTX execution provider

5 participants