
Conversation

@adrianlizarraga (Contributor) commented Jul 19, 2025

Description

  • Adds `ModelCompilationOptions_SetOutputModelWriteFunc` to the compile API to allow writing the output model's ONNX bytes to a user-provided write function (e.g., for streaming).
  • Adds `ModelCompilationOptions_SetOutputModelHandleInitializerFunc` to the compile API to allow the user to write individual initializers to a destination of their choosing. Also allows specifying whether an initializer should be embedded within the ONNX model or written to a custom file.
  • Adds C++, Python, and C# bindings for the new APIs.

A follow-up PR adds a write function for EPContext node binary data: #25471

Example

`ModelCompilationOptions_SetOutputModelWriteFunc`:

```cpp
static OrtStatus* ORT_API_CALL TestWriteToStream(void* stream_state, const void* buffer, size_t buffer_num_bytes) {
  std::ofstream* outfile = reinterpret_cast<std::ofstream*>(stream_state);
  outfile->write(reinterpret_cast<const char*>(buffer), buffer_num_bytes);
  return nullptr;  // No error
}

// Implementation of OrtOutStreamWriteFunc that directly returns an OrtStatus indicating an error.
static OrtStatus* ORT_API_CALL ReturnStatusFromStream(void* stream_state, const void* buffer, size_t buffer_num_bytes) {
  ORT_UNUSED_PARAMETER(stream_state);
  ORT_UNUSED_PARAMETER(buffer);
  ORT_UNUSED_PARAMETER(buffer_num_bytes);
  return Ort::GetApi().CreateStatus(ORT_FAIL, "Error from OrtOutStreamWriteFunc callback");
}

// Test using the CompileModel() API with settings:
// - input model comes from a file
// - write output model to custom write stream
TEST_F(QnnHTPBackendTests, CompileApi_InputFile_WriteOutputModelBytes) {
  const ORTCHAR_T* input_model_file = ORT_TSTR("./compileapi_inputfile_writeoutputmodelbytes.onnx");
  std::filesystem::remove(input_model_file);

  // Create a test model and save it to a file.
  TestModel test_model;
  CreateTestModel(BuildGraphWithQAndNonQ(false), 21, logging::Severity::kERROR, test_model);
  ASSERT_STATUS_OK(test_model.Save(input_model_file));

  // Initialize session options with QNN EP.
  Ort::SessionOptions so;
  ProviderOptions provider_options;
  provider_options["backend_type"] = "htp";
  provider_options["offload_graph_io_quantization"] = "0";
  so.AppendExecutionProvider("QNN", provider_options);

  const ORTCHAR_T* output_model_file = ORT_TSTR("compileapi_inputfile_writeoutputmodelbytes_ctx.onnx");
  std::filesystem::remove(output_model_file);

  // Open an output file. Test will incrementally write the output model to file
  // via calls to our OrtOutStreamWriteFunc callback.
  ASSERT_FALSE(std::filesystem::exists(output_model_file));
  std::ofstream outfile(output_model_file, std::ios::binary);

  // Create model compilation options from the session options.
  Ort::ModelCompilationOptions compile_options(*ort_env, so);
  compile_options.SetInputModelPath(input_model_file);
  compile_options.SetOutputModelWriteFunc(TestWriteToStream, reinterpret_cast<void*>(&outfile));
  compile_options.SetEpContextEmbedMode(true);

  // Compile the model.
  Ort::Status status = Ort::CompileModel(*ort_env, compile_options);
  ASSERT_TRUE(status.IsOK()) << status.GetErrorMessage();
  outfile.flush();
  outfile.close();

  // Check that the compiled model has the expected number of EPContext nodes.
  ASSERT_TRUE(std::filesystem::exists(output_model_file));
  CheckEpContextNodeCounts(output_model_file, 2, 2);
}
```

`ModelCompilationOptions_SetOutputModelHandleInitializerFunc`:

```cpp
struct CustomInitializerHandlerState {
  const ORTCHAR_T* external_file_path = nullptr;
  std::ofstream* outfile = nullptr;
};

static OrtStatus* ORT_API_CALL TestHandleInitializerDataFunc(void* state,
                                                             const char* initializer_name,
                                                             const OrtValue* initializer_value,
                                                             const OrtExternalInitializerInfo* /*external_info*/,
                                                             OrtExternalInitializerInfo** new_external_info) {
  const OrtApi& ort_api = Ort::GetApi();
  CustomInitializerHandlerState* custom_state = reinterpret_cast<CustomInitializerHandlerState*>(state);

  if (std::string("constant") == initializer_name) {
    // Keep a specific initializer in the model just to test both scenarios.
    // A real implementation may check the byte size and keep small initializers in the model.
    *new_external_info = nullptr;
    return nullptr;
  }

  //
  // Store other initializers in an external file.
  //

  // Get initializer's byte size.
  size_t byte_size = 0;
  if (OrtStatus* status = ort_api.GetTensorSizeInBytes(initializer_value, &byte_size); status != nullptr) {
    return status;
  }

  // Get initializer's data.
  const void* initializer_data = nullptr;
  if (OrtStatus* status = ort_api.GetTensorData(initializer_value, &initializer_data); status != nullptr) {
    return status;
  }

  // Write initializer data to some file.
  int64_t offset = custom_state->outfile->tellp();
  const ORTCHAR_T* location = custom_state->external_file_path;
  custom_state->outfile->write(static_cast<const char*>(initializer_data), byte_size);
  custom_state->outfile->flush();

  // Provide caller (ORT) with the new external info.
  if (OrtStatus* status = ort_api.CreateExternalInitializerInfo(location, offset, byte_size, new_external_info);
      status != nullptr) {
    return status;
  }

  return nullptr;
}

// Test using the CompileModel() API with settings:
// - input model comes from a file
// - write output model to a file
// - use callback to specify where each initializer is stored (i.e., external file or within model)
TEST_F(QnnHTPBackendTests, CompileApi_InputFile_OutputFile_InitializerHandler) {
  const ORTCHAR_T* input_model_file = ORT_TSTR("./compileapi_inputfile_outputfile_initializerhandler.onnx");
  const ORTCHAR_T* output_model_file = ORT_TSTR("./compileapi_inputfile_outputfile_initializerhandler_ctx.onnx");
  const ORTCHAR_T* initializer_file = ORT_TSTR("./compileapi_inputfile_outputfile_initializerhandler.bin");
  std::filesystem::remove(input_model_file);
  std::filesystem::remove(output_model_file);
  std::filesystem::remove(initializer_file);

  // Create a test model and save it to a file.
  TestModel test_model;
  CreateTestModel(BuildGraphWithQAndNonQ(false), 21, logging::Severity::kERROR, test_model);
  ASSERT_STATUS_OK(test_model.Save(input_model_file));

  // Initialize session options with QNN EP.
  Ort::SessionOptions so;
  ProviderOptions provider_options;
  provider_options["backend_type"] = "htp";
  provider_options["offload_graph_io_quantization"] = "0";
  so.AppendExecutionProvider("QNN", provider_options);

  // Open a file to store external initializers. ORT will call our handler function for every initializer.
  ASSERT_FALSE(std::filesystem::exists(initializer_file));
  std::ofstream outfile(initializer_file, std::ios::binary);
  CustomInitializerHandlerState custom_state = {initializer_file, &outfile};

  // Create model compilation options from the session options.
  Ort::ModelCompilationOptions compile_options(*ort_env, so);
  compile_options.SetInputModelPath(input_model_file);
  compile_options.SetOutputModelPath(output_model_file);
  compile_options.SetOutputModelHandleInitializerFunc(TestHandleInitializerDataFunc,
                                                      reinterpret_cast<void*>(&custom_state));
  compile_options.SetEpContextEmbedMode(true);

  // Compile the model.
  Ort::Status status = Ort::CompileModel(*ort_env, compile_options);
  ASSERT_TRUE(status.IsOK()) << status.GetErrorMessage();
  outfile.flush();
  outfile.close();

  ASSERT_TRUE(std::filesystem::exists(initializer_file));
  ASSERT_TRUE(std::filesystem::exists(output_model_file));
  CheckEpContextNodeCounts(output_model_file, 2, 2);
}

static OrtStatus* ORT_API_CALL ReuseExternalInitializers(void* state,
                                                         const char* /*initializer_name*/,
                                                         const OrtValue* /*initializer_value*/,
                                                         const OrtExternalInitializerInfo* external_info,
                                                         OrtExternalInitializerInfo** new_external_info) {
  // If the original initializer was stored in an external file, keep it there (just for testing).
  if (external_info != nullptr) {
    Ort::ConstExternalInitializerInfo info(external_info);
    auto location = info.GetFilePath();
    int64_t offset = info.GetFileOffset();
    size_t byte_size = info.GetByteSize();

    Ort::ExternalInitializerInfo new_info(nullptr);
    Ort::Status status = Ort::ExternalInitializerInfo::Create(location.c_str(), offset, byte_size, new_info);
    if (!status.IsOK()) {
      return status.release();
    }
    *new_external_info = new_info.release();

    // Keep track of number of reused external initializers so that we can assert
    // that we reused the expected number of initializers.
    // THIS IS TEST CODE. An application would not do this.
    size_t* num_reused_ext_initializers = reinterpret_cast<size_t*>(state);
    *num_reused_ext_initializers += 1;
    return nullptr;
  }

  // If not originally external, save it within the generated compiled model.
  *new_external_info = nullptr;
  return nullptr;
}
```

Motivation and Context

Add output streaming capabilities when saving compiled models.

@adrianlizarraga adrianlizarraga merged commit 8705c68 into main Sep 4, 2025
90 of 93 checks passed
@adrianlizarraga adrianlizarraga deleted the adrianl/compile-api-output-stream branch September 4, 2025 20:10
tianleiwu pushed a commit that referenced this pull request Sep 4, 2025
@tianleiwu tianleiwu added cherry-picked Cherry-picked for a cherrypicks branch and removed release:1.23.0 labels Sep 4, 2025
jywu-msft pushed a commit that referenced this pull request Sep 5, 2025
### Description
Cherry-pick the following PRs:
#25943
#25937 
#25917
#25909
#25898
#25897
#25888
#25881
#25830
#25619
#25575
#25572
#25558
#25530
#25474
#25455
#25110

Also two dependent PRs for qMoE cpu: 
#25877
#25822

---------

Co-authored-by: xiaomsft <[email protected]>
Co-authored-by: Xiaoyan Hu <[email protected]>
Co-authored-by: Akshay Sonawane <[email protected]>
Co-authored-by: Kunal Vaishnavi <[email protected]>
Co-authored-by: Pradeep Sakhamoori <[email protected]>
Co-authored-by: mingyue <[email protected]>
Co-authored-by: Maximilian Müller <[email protected]>
Co-authored-by: Adrian Lizarraga <[email protected]>
Co-authored-by: Dmitri Smirnov <[email protected]>
Co-authored-by: Emmanuel <[email protected]>
Co-authored-by: Emmanuel Assumang <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: praneshgo <[email protected]>
Co-authored-by: Hariharan Seshadri <[email protected]>
Co-authored-by: Jing Fang <[email protected]>
Co-authored-by: Ishwar Raut <[email protected]>
adrianlizarraga added a commit that referenced this pull request Oct 31, 2025
…#26439)

### Description
Fixes #26294

When using the old model compilation approach (session option configuration), ORT should verify that the generated output model does not already exist. Importantly, this check should be done _before_ calling an EP's compile() method. This PR fixes that check, which was unintentionally disabled by a previous PR (#25455).

Note that this check also (correctly) happens _after_ calling the EP's
compile() method, but it is better to catch it early if we can.



### Motivation and Context
Fixes a regression in the older compilation workflow.