@adrianlizarraga adrianlizarraga commented Jul 21, 2025

Description

This is a follow-up to #25455.

  • Adds ModelCompilationOptions_SetEpContextDataWriteFunc to the Compile API so that the user can specify a write function that writes out the binary data of each EPContext node.

Example

struct EpContextDataWriterState {
  std::filesystem::path bin_dir_path;
  // Owns every returned path so the pointer handed back via `location` stays valid.
  std::vector<std::unique_ptr<std::filesystem::path>> bin_locations;
};

static OrtStatus* ORT_API_CALL TestWriteEpContextData(_In_ void* state,
                                                      _In_ const char* ep_context_node_name,
                                                      _In_ const char* ep_name,
                                                      _In_ const void* buffer,
                                                      _In_ size_t buffer_num_bytes,
                                                      _Out_ const ORTCHAR_T** location) {
  EpContextDataWriterState* custom_state = reinterpret_cast<EpContextDataWriterState*>(state);

  std::ostringstream bin_name_builder;
  bin_name_builder << ep_name << "_" << ep_context_node_name << ".bin";
  auto bin_path = std::make_unique<std::filesystem::path>(custom_state->bin_dir_path / bin_name_builder.str());

  std::ofstream out_file_stream(bin_path->c_str(), std::ios::binary);
  out_file_stream.write(static_cast<const char*>(buffer), buffer_num_bytes);

  // Transfer ownership to the state before returning the pointer; otherwise the path
  // is destroyed when this function returns and `location` dangles.
  custom_state->bin_locations.push_back(std::move(bin_path));
  *location = custom_state->bin_locations.back()->c_str();  // EP will save this path into EPContext node's 'ep_cache_context' attribute.
  return nullptr;
}

// Generate an EPContext model with a plugin EP.
// Tests user-provided function for writing out the EPContext binary data.
TEST(OrtEpLibrary, PluginEp_GenEpContextModel_OutStream) {
  RegisteredEpDeviceUniquePtr example_ep;
  Utils::RegisterAndGetExampleEp(*ort_env, example_ep);
  const OrtEpDevice* plugin_ep_device = example_ep.get();

  {
    const ORTCHAR_T* input_model_file = ORT_TSTR("testdata/mul_1.onnx");
    const ORTCHAR_T* output_model_file = ORT_TSTR("plugin_ep_mul_1_out_stream_ctx.onnx");
    std::filesystem::remove(output_model_file);

    // Create session with example plugin EP
    Ort::SessionOptions session_options;
    std::unordered_map<std::string, std::string> ep_options;
    session_options.AppendExecutionProvider_V2(*ort_env, {Ort::ConstEpDevice(plugin_ep_device)}, ep_options);

    EpContextDataWriterState ep_ctx_write_state = {};
    ep_ctx_write_state.bin_dir_path = ORT_TSTR("");

    // Create model compilation options from the session options.
    Ort::ModelCompilationOptions compile_options(*ort_env, session_options);
    compile_options.SetInputModelPath(input_model_file);
    compile_options.SetOutputModelPath(output_model_file);
    compile_options.SetEpContextEmbedMode(false);
    compile_options.SetEpContextDataWriteFunc(TestWriteEpContextData, &ep_ctx_write_state);

    // Compile the model.
    Ort::Status status = Ort::CompileModel(*ort_env, compile_options);
    ASSERT_TRUE(status.IsOK()) << status.GetErrorMessage();

    // Make sure the compiled model was generated.
    ASSERT_TRUE(std::filesystem::exists(output_model_file));

    // Make sure every binary file reported by the write function exists, then clean up.
    for (const auto& bin_path : ep_ctx_write_state.bin_locations) {
      EXPECT_TRUE(std::filesystem::exists(*bin_path));
      std::filesystem::remove(*bin_path);
    }
  }
}
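
The write function hands a string back through the `location` out-parameter, and the caller reads it only after the function returns, so the string's storage has to outlive the call. The following is a minimal, standalone sketch of that ownership pattern, not code from this PR: it writes to memory instead of to files, and `SketchStatus`, `InMemoryWriterState`, and `WriteEpContextDataToMemory` are hypothetical names, with plain `char` standing in for `ORTCHAR_T`. Only the parameter order is taken from the example above.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for OrtStatus. As in the example above, nullptr means success.
struct SketchStatus;

// State owned by the caller. It outlives every invocation of the write function.
struct InMemoryWriterState {
  std::map<std::string, std::vector<char>> blobs;       // location -> bytes written
  std::vector<std::unique_ptr<std::string>> locations;  // keeps returned strings alive
};

// Mirrors the parameter order of TestWriteEpContextData (assumption: char replaces ORTCHAR_T).
static SketchStatus* WriteEpContextDataToMemory(void* state,
                                                const char* ep_context_node_name,
                                                const char* ep_name,
                                                const void* buffer,
                                                size_t buffer_num_bytes,
                                                const char** location) {
  auto* writer_state = static_cast<InMemoryWriterState*>(state);

  // Build "<ep_name>_<node_name>.bin" on the heap so its address never changes,
  // even when the owning vector reallocates.
  auto name = std::make_unique<std::string>(std::string(ep_name) + "_" + ep_context_node_name + ".bin");

  const char* bytes = static_cast<const char*>(buffer);
  writer_state->blobs[*name].assign(bytes, bytes + buffer_num_bytes);

  // Hand ownership to the state first, then return a pointer into the owned string.
  writer_state->locations.push_back(std::move(name));
  *location = writer_state->locations.back()->c_str();
  return nullptr;
}
```

Storing each string behind a `std::unique_ptr` is one way to keep the returned pointer stable; returning a pointer into a local variable, or into a `std::vector<std::string>` element that may later be moved, would not be safe.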

Motivation and Context

adrianlizarraga added a commit that referenced this pull request Sep 4, 2025

### Description
- Adds `ModelCompilationOptions_SetOutputModelWriteFunc` to the compile
API to allow writing the output model ONNX bytes to a user-provided
write function (e.g., for streaming).
- Adds `ModelCompilationOptions_SetOutputModelHandleInitializerFunc` to
the compile API to allow the user to write individual initializers to
some destination. Also allows specifying if an initializer should be
embedded within the ONNX model or written to a custom file.
- Adds C++, Python, and C# bindings for the new APIs.
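
As a rough illustration of the streaming idea in the first bullet, the sketch below shows a write function that appends each chunk of model bytes to a caller-owned sink. It is standalone and does not use the ONNX Runtime headers: `SketchStatus`, `ModelByteSink`, and `AppendModelBytes` are hypothetical names, and the `(state, buffer, buffer_num_bytes)` callback shape is an assumption modeled on the EPContext write function; the linked tests show the actual usage.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for OrtStatus; nullptr is taken to mean success.
struct SketchStatus;

// Caller-owned destination for the serialized model.
struct ModelByteSink {
  std::vector<char> bytes;   // all bytes received so far, in arrival order
  size_t num_calls = 0;      // number of chunks received
};

// Assumed callback shape: opaque state pointer, chunk pointer, chunk size.
// The function may be invoked several times, so it appends rather than overwrites.
static SketchStatus* AppendModelBytes(void* state, const void* buffer, size_t buffer_num_bytes) {
  auto* sink = static_cast<ModelByteSink*>(state);
  const char* chunk = static_cast<const char*>(buffer);
  sink->bytes.insert(sink->bytes.end(), chunk, chunk + buffer_num_bytes);
  ++sink->num_calls;
  return nullptr;
}
```

The same pattern would apply if the sink were a socket or a file handle instead of a vector: the state pointer carries the destination, and each call forwards one chunk.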

A follow-up PR adds a write function for EPContext node binary data:
#25471

### Example
`ModelCompilationOptions_SetOutputModelWriteFunc`:
https://github.com/microsoft/onnxruntime/blob/c62ed23c328cbbfefd3083c1f7a6ced604772c19/onnxruntime/test/providers/qnn/qnn_ep_context_test.cc#L2075-L2131

`ModelCompilationOptions_SetOutputModelHandleInitializerFunc`:

https://github.com/microsoft/onnxruntime/blob/c62ed23c328cbbfefd3083c1f7a6ced604772c19/onnxruntime/test/providers/qnn/qnn_ep_context_test.cc#L2160-L2292

### Motivation and Context
Add output streaming capabilities when saving compiled models.
Base automatically changed from adrianl/compile-api-output-stream to main September 4, 2025 20:10
tianleiwu pushed a commit that referenced this pull request Sep 4, 2025