Merged
43 commits
033a887
Add func typedefs
adrianlizarraga Jul 9, 2025
fcdb5cf
Merge branch 'main' into adrianl/compile-api-output-stream
adrianlizarraga Jul 9, 2025
03eb5fa
stub apis
adrianlizarraga Jul 15, 2025
3310968
merge main
adrianlizarraga Jul 17, 2025
c3693de
new branch. add 2 streams first
adrianlizarraga Jul 19, 2025
a69d5f9
Move away from using Graph's graph_proto_ member
adrianlizarraga Jul 19, 2025
5743dcd
fix deref assignment
adrianlizarraga Jul 20, 2025
fd87e0c
Clean up
adrianlizarraga Jul 21, 2025
a40f463
Merge branch 'main' into adrianl/compile-api-output-stream
adrianlizarraga Jul 21, 2025
0dadf4d
Use std::filesystem::path in ModelCompilationOptions; fix memleak in …
adrianlizarraga Jul 21, 2025
d94cf44
fix unused variable warning (as error)
adrianlizarraga Jul 21, 2025
5bfbddb
Merge main and fix conflicts
adrianlizarraga Aug 28, 2025
69a4338
Update handler function signature to take in the ExternalDataInfo for…
adrianlizarraga Aug 28, 2025
90ade82
Add test that reuses external initializers from original model
adrianlizarraga Aug 29, 2025
c36afe5
Define new ExternalDataInfo constructor only for non-minimal builds
adrianlizarraga Aug 29, 2025
c07dc11
Merge branch 'main' into adrianl/compile-api-output-stream
adrianlizarraga Aug 29, 2025
4b83a2b
Fix unused variable warning (as error)
adrianlizarraga Aug 29, 2025
91acc8f
another unused variable
adrianlizarraga Aug 29, 2025
6e5629a
Merge branch 'main' into adrianl/compile-api-output-stream
adrianlizarraga Aug 29, 2025
9b092bf
clean up
adrianlizarraga Aug 29, 2025
049b9ad
Start adding csharp api funcs
adrianlizarraga Aug 29, 2025
8e00a06
Remove qnn_factory memleak fix (address in different PR)
adrianlizarraga Aug 29, 2025
11a6c74
Add ExternalInitializerInfo to C++ api
adrianlizarraga Aug 29, 2025
9ca882f
Add compile_to_stream py api
adrianlizarraga Aug 29, 2025
6d522d8
Python bindings and tests
adrianlizarraga Aug 30, 2025
af996bb
C# API for WriteBuffer delegate
adrianlizarraga Aug 31, 2025
9b27b31
c# api handle initializers
adrianlizarraga Aug 31, 2025
9607193
missing documentation in c#
adrianlizarraga Aug 31, 2025
e65710a
Add ExternalInitializerInfo C# class
adrianlizarraga Aug 31, 2025
c16b327
Full C# API for delegate that handles initializers
adrianlizarraga Sep 1, 2025
0b2f0e6
Update comment
adrianlizarraga Sep 2, 2025
83758d1
Merge branch 'main' into adrianl/compile-api-output-stream
adrianlizarraga Sep 2, 2025
c62ed23
Address review comments
adrianlizarraga Sep 2, 2025
a35e7b6
Address review comments
adrianlizarraga Sep 3, 2025
d906855
Remove unused variable
adrianlizarraga Sep 3, 2025
255c2df
Merge branch 'main' into adrianl/compile-api-output-stream
adrianlizarraga Sep 3, 2025
3db3117
Merge main conflicts
adrianlizarraga Sep 3, 2025
c7f98de
Merge main again
adrianlizarraga Sep 3, 2025
9031635
Address review comments for C#
adrianlizarraga Sep 3, 2025
abd0297
Rename functions in C and python
adrianlizarraga Sep 3, 2025
d5012fb
Merge branch 'main' into adrianl/compile-api-output-stream
adrianlizarraga Sep 3, 2025
0e0497a
Address comments
adrianlizarraga Sep 4, 2025
0a61f1f
Merge branch 'main' into adrianl/compile-api-output-stream
adrianlizarraga Sep 4, 2025
stub apis
adrianlizarraga committed Jul 15, 2025
commit 03eb5fa8df157afec5928858af1c7e1f56e5eebd
90 changes: 85 additions & 5 deletions include/onnxruntime/core/session/onnxruntime_c_api.h
@@ -507,18 +507,28 @@ typedef OrtStatus*(ORT_API_CALL* OrtWriteBufferFunc)(_In_ void* state,
_In_ const void* buffer,
_In_ size_t buffer_num_bytes);

- /** \brief Function called by ORT to allow user to specify how an initializer should be saved, either
- * written to an external file or stored within the model.
+ /** \brief Function called by ORT to allow user to specify how an initializer should be saved, that is, either
+ * written to an external file or stored within the model. ORT calls this function for every initializer when
+ * generating a model.
*
* If the function sets the `is_external` output parameter to false, ORT stores initializer data within the model.
*
* Otherwise, if `is_external` is set to true, ORT assumes that this function stores the initializer data to a file.
* In this case, ORT configures the model's initializer to point to the location and offset returned from this function
* via the `location` and `offset` output parameters.
*
- * \param state Opaque pointer holding the user's state.
- * \param buffer The buffer to write.
- * \param buffer_num_bytes The size of the buffer in bytes.
+ * \param[in] state Opaque pointer holding the user's state.
+ * \param[in] initializer_name The initializer's name as a null-terminated string.
+ * \param[in] initializer_data Pointer to the initializer's raw data (contiguous).
+ * \param[in] initializer_num_bytes The size in bytes of the initializer's data.
+ * \param[in] initializer_type The type and shape information for the initializer.
+ * \param[out] is_external Output parameter set to true if the initializer data is to be stored externally.
+ * The function implementer is responsible for writing the initializer data to file.
+ * If set to false, ORT stores the initializer data within the model.
+ * \param[out] location Output parameter set to the location (i.e., file path) into which the initializer data is stored
+ * by the function implementer. Ignored if `is_external` is set to false.
+ * \param[out] offset Output parameter set to the location offset into which the initializer data is stored
+ * by the function implementer. Ignored if `is_external` is set to false.
*
* \return OrtStatus* Write status. Return nullptr on success.
* Use CreateStatus to provide error info. Use ORT_FAIL as the error code.
@@ -532,6 +542,23 @@ typedef OrtStatus*(ORT_API_CALL* OrtHandleInitializerDataFunc)(_In_ void* state,
_Out_ bool* is_external,
_Out_ const ORTCHAR_T** location, _Out_ int64_t* offset);

/** \brief Function called by ORT to write an EPContext node's binary data to a custom destination (e.g., file, stream, etc.).
*
* \param state Opaque pointer holding the user's state.
* \param buffer The buffer to write.
* \param buffer_num_bytes The size of the buffer in bytes.
* \param[out] location Output parameter set to the location (i.e., file path) into which the data is stored
* by the function implementer.
*
* \return OrtStatus* Write status. Return nullptr on success.
* Use CreateStatus to provide error info. Use ORT_FAIL as the error code.
* ORT will release the OrtStatus* if not null.
*/
typedef OrtStatus*(ORT_API_CALL* OrtWriteEpContextDataFunc)(_In_ void* state,
_In_ const void* buffer,
_In_ size_t buffer_num_bytes,
_Out_ const ORTCHAR_T** location);

/** \brief Algorithm to use for cuDNN Convolution Op
*/
typedef enum OrtCudnnConvAlgoSearch {
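The OrtHandleInitializerDataFunc contract documented in the hunk above (set `is_external`, and when it is true also report `location` and `offset`) can be sketched as follows. This is a minimal sketch under stated assumptions, not the PR's implementation: `SketchStatus` and `SketchPathChar` are hypothetical stand-ins for `OrtStatus` and `ORTCHAR_T`, the type/shape parameter is omitted because its declaration is not visible in this diff, and an in-memory byte vector stands in for the external file.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins so the sketch compiles without ONNX Runtime headers.
struct SketchStatus { std::string message; };  // stands in for OrtStatus
using SketchPathChar = char;                   // stands in for ORTCHAR_T

// User state: an in-memory "external file" plus the policy for what goes into it.
struct InitializerSink {
  std::string location = "weights.bin";  // hypothetical external file name
  std::vector<char> external_bytes;      // stands in for the file contents
  size_t min_external_size = 16;         // smaller initializers stay in the model
};

// Follows the documented parameter order; the type/shape parameter is omitted here.
SketchStatus* HandleInitializer(void* state, const char* /*initializer_name*/,
                                const void* initializer_data, size_t initializer_num_bytes,
                                bool* is_external, const SketchPathChar** location,
                                int64_t* offset) {
  auto* sink = static_cast<InitializerSink*>(state);

  if (initializer_num_bytes < sink->min_external_size) {
    *is_external = false;  // caller keeps the data inside the model
    return nullptr;        // nullptr signals success
  }

  // Append the bytes to the "file" and report where they landed.
  const char* bytes = static_cast<const char*>(initializer_data);
  *offset = static_cast<int64_t>(sink->external_bytes.size());
  sink->external_bytes.insert(sink->external_bytes.end(), bytes, bytes + initializer_num_bytes);
  *is_external = true;
  *location = sink->location.c_str();  // owned by the state, so it outlives this call
  return nullptr;
}
```

Keeping the location string inside the user state is one way to make sure the pointer returned through `location` stays valid after the callback returns; the diff does not say how long ORT expects it to live.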
@@ -6895,6 +6922,59 @@ struct OrtCompileApi {
*/
ORT_API2_STATUS(ModelCompilationOptions_SetFlags, _In_ OrtModelCompilationOptions* model_compile_options,
size_t flags);

/** \brief Sets an OrtWriteBufferFunc function that is called by ORT to write out the output model's serialized
* ONNX bytes.
*
* The provided write function may be called repeatedly until the entire output model has been written out. Each call
* to the write function is expected to consume the entire input buffer.
*
* The output model's destination (e.g., file path, memory buffer, or stream) can be set with any of the functions
* that begin with ModelCompilationOptions_SetOutputModel____.
*
* \param[in] model_compile_options The OrtModelCompilationOptions instance.
* \param[in] write_func The OrtWriteBufferFunc function called by ORT when writing out the model.
* \param[in] state Opaque state passed as the first argument to OrtWriteBufferFunc. Can be NULL.
*
* \snippet{doc} snippets.dox OrtStatus Return Value
*
* \since Version 1.23.
*/
ORT_API2_STATUS(ModelCompilationOptions_SetOutputModelWriteFunc,
_In_ OrtModelCompilationOptions* model_compile_options,
_In_ OrtWriteBufferFunc write_func, _In_ void* state);

/** \brief Sets an OrtHandleInitializerDataFunc function that is called by ORT for every initializer in the generated
* model. Allows the implementer to specify whether initializers should be stored within the model or externally.
*
* \param[in] model_compile_options The OrtModelCompilationOptions instance.
* \param[in] handle_initializer_func The OrtHandleInitializerDataFunc function called by ORT when writing out an initializer.
* \param[in] state Opaque state passed as the first argument to OrtHandleInitializerDataFunc. Can be NULL.
*
* \snippet{doc} snippets.dox OrtStatus Return Value
*
* \since Version 1.23.
*/
ORT_API2_STATUS(ModelCompilationOptions_SetOutputModelHandleInitializerFunc,
_In_ OrtModelCompilationOptions* model_compile_options,
_In_ OrtHandleInitializerDataFunc handle_initializer_func, _In_ void* state);

/** \brief Sets an OrtWriteEpContextDataFunc function that is called by an execution provider to write out
* an EPContext node's binary data, which is usually stored in the attribute named `ep_cache_context`.
*
* \note Not compatible with embed mode set to true via ModelCompilationOptions_SetEpContextEmbedMode.
*
* \param[in] model_compile_options The OrtModelCompilationOptions instance.
* \param[in] write_func The OrtWriteEpContextDataFunc function called by an EP to write out an EPContext node's data.
* \param[in] state Opaque state passed as the first argument to OrtWriteEpContextDataFunc. Can be NULL.
*
* \snippet{doc} snippets.dox OrtStatus Return Value
*
* \since Version 1.23.
*/
ORT_API2_STATUS(ModelCompilationOptions_SetEpContextDataWriteFunc,
_In_ OrtModelCompilationOptions* model_compile_options,
_In_ OrtWriteEpContextDataFunc write_func, _In_ void* state);
};

/*
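The ModelCompilationOptions_SetOutputModelWriteFunc documentation above says the write function may be called repeatedly and must consume the whole buffer on each call. A minimal sketch of a callback that satisfies that contract is shown here. `SketchStatus`, `SketchWriteBufferFunc`, and `WriteInChunks` are hypothetical names introduced only so the example compiles on its own; the callback has the same parameter shape as OrtWriteBufferFunc, and `WriteInChunks` merely imitates a caller that delivers the model in pieces.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

struct SketchStatus;  // hypothetical stand-in for OrtStatus; nullptr means success

// Same parameter shape as OrtWriteBufferFunc in the diff above.
using SketchWriteBufferFunc = SketchStatus* (*)(void* state, const void* buffer,
                                                size_t buffer_num_bytes);

// Appends each chunk to a std::string owned by the caller (passed as the opaque state).
SketchStatus* AppendToString(void* state, const void* buffer, size_t buffer_num_bytes) {
  auto* out = static_cast<std::string*>(state);
  out->append(static_cast<const char*>(buffer), buffer_num_bytes);  // consume the whole chunk
  return nullptr;
}

// Imitates the caller side: delivers a payload in fixed-size chunks.
// Returns false if chunk_size is zero or the callback reports an error.
bool WriteInChunks(SketchWriteBufferFunc write_func, void* state,
                   const std::string& payload, size_t chunk_size) {
  if (chunk_size == 0) return false;
  for (size_t pos = 0; pos < payload.size(); pos += chunk_size) {
    const size_t n = std::min(chunk_size, payload.size() - pos);
    if (write_func(state, payload.data() + pos, n) != nullptr) return false;
  }
  return true;
}
```

Because the state pointer is opaque, the same callback shape works for a file handle, a socket, or a growing buffer; only the cast inside the callback changes.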
114 changes: 114 additions & 0 deletions onnxruntime/core/framework/ep_context_options.cc
@@ -0,0 +1,114 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT License.

#include <cassert>
#include <limits>
#include <string>
#include <utility>
#include "core/common/common.h"
#include "core/framework/ep_context_options.h"
#include "core/framework/error_code_helper.h"
#include "core/session/onnxruntime_session_options_config_keys.h"

namespace onnxruntime {
namespace epctx {
// class ModelGenOptions

ModelGenOptions::ModelGenOptions(const ConfigOptions& config_options) {
enable = config_options.GetConfigOrDefault(kOrtSessionOptionEpContextEnable, "0") == "1";

std::string output_model_path = config_options.GetConfigOrDefault(kOrtSessionOptionEpContextFilePath, "");
if (!output_model_path.empty()) {
output_model_location = std::move(output_model_path);
} else {
output_model_location = std::monostate{};
}

output_external_initializers_file_path = config_options.GetConfigOrDefault(
kOrtSessionOptionsEpContextModelExternalInitializersFileName, "");
output_external_initializer_size_threshold = 0;
embed_ep_context_in_model = config_options.GetConfigOrDefault(kOrtSessionOptionEpContextEmbedMode, "0") == "1";
}

bool ModelGenOptions::HasOutputModelLocation() const {
return !std::holds_alternative<std::monostate>(output_model_location);
}

const std::string* ModelGenOptions::TryGetOutputModelPath() const {
return std::get_if<std::string>(&output_model_location);
}

const BufferHolder* ModelGenOptions::TryGetOutputModelBuffer() const {
return std::get_if<BufferHolder>(&output_model_location);
}

const OutStreamHolder* ModelGenOptions::TryGetOutputModelOutStream() const {
return std::get_if<OutStreamHolder>(&output_model_location);
}

// class OutStreamBuf

OutStreamBuf::OutStreamBuf(OutStreamHolder out_stream_holder) : out_stream_holder_(out_stream_holder) {
setp(buffer_.data(), buffer_.data() + buffer_.size());
}

OutStreamBuf::~OutStreamBuf() {
sync();
}

// Called when the buffer_ is full. Flushes the buffer_ (via sync()) and then writes the overflow character to buffer_.
std::streambuf::int_type OutStreamBuf::overflow(std::streambuf::int_type ch) {
if (sync() == -1) {
return traits_type::eof();
}

if (ch != traits_type::eof()) {
*pptr() = static_cast<char>(ch);
pbump(1);
}

return traits_type::not_eof(ch);  // Never report eof (failure) after a successful flush.
}

// Flushes the entire buffer_ to the user's write function.
int OutStreamBuf::sync() {
if (!last_status_.IsOK()) {
return -1;
}

std::ptrdiff_t num_bytes = pptr() - pbase();
if (num_bytes == 0) {
return 0;
}

// Can only call pbump() with an int, so can only write at most (2^31 - 1) bytes.
if (num_bytes > std::numeric_limits<int>::max()) {
num_bytes = std::numeric_limits<int>::max();
}

char* ptr = pbase();

Status status = Status::OK();

ORT_TRY {
status = ToStatus(out_stream_holder_.write_func(out_stream_holder_.stream_state,
ptr, num_bytes));
}
ORT_CATCH(const std::exception& e) {
ORT_HANDLE_EXCEPTION([&]() {
status = ORT_MAKE_STATUS(ONNXRUNTIME, FAIL,
"Caught exception while calling user's OrtOutStreamWriteFunc callback: ", e.what());
});
}

if (!status.IsOK()) {
last_status_ = std::move(status);
return -1;
}

pbump(-static_cast<int>(num_bytes)); // Reset internal pointer to point to the beginning of the buffer_
return 0;
}

} // namespace epctx
} // namespace onnxruntime
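The buffering pattern used by OutStreamBuf (a fixed put area, `overflow()` flushing through `sync()`, and a sticky error state) can be tried in isolation with the standard library alone. The sketch below is a simplified analogue, not the PR's class: `CallbackStreamBuf`, its `std::function` callback type, and the 8-byte buffer are illustrative assumptions chosen so that flushing is easy to observe.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <ostream>
#include <streambuf>
#include <string>
#include <utility>

// Buffers characters and hands each full (or explicitly flushed) chunk to a callback.
class CallbackStreamBuf : public std::streambuf {
 public:
  // Returns true on success; a false return puts the buffer into a failed state.
  using WriteFunc = std::function<bool(const char* data, size_t num_bytes)>;

  explicit CallbackStreamBuf(WriteFunc write_func) : write_func_(std::move(write_func)) {
    setp(buffer_.data(), buffer_.data() + buffer_.size());
  }
  ~CallbackStreamBuf() override { sync(); }  // flush whatever is still pending

 protected:
  // Called when the put area is full: flush, then store the overflow character.
  int_type overflow(int_type ch) override {
    if (sync() == -1) return traits_type::eof();
    if (!traits_type::eq_int_type(ch, traits_type::eof())) {
      *pptr() = traits_type::to_char_type(ch);
      pbump(1);
    }
    return traits_type::not_eof(ch);
  }

  // Sends the pending bytes to the callback and resets the put area.
  int sync() override {
    if (failed_) return -1;
    const std::ptrdiff_t num_bytes = pptr() - pbase();
    if (num_bytes == 0) return 0;
    if (!write_func_(pbase(), static_cast<size_t>(num_bytes))) {
      failed_ = true;
      return -1;
    }
    setp(buffer_.data(), buffer_.data() + buffer_.size());
    return 0;
  }

 private:
  WriteFunc write_func_;
  std::array<char, 8> buffer_{};
  bool failed_ = false;
};
```

Wrapping an instance in `std::ostream os(&buf);` and writing more than eight characters triggers several callback invocations; calling `os.flush()` delivers the remainder. Resetting the put area with `setp()` is used here instead of a negative `pbump()`, which sidesteps the `int` range concern noted in the original `sync()`.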
87 changes: 87 additions & 0 deletions onnxruntime/core/framework/ep_context_options.h
@@ -0,0 +1,87 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT License.

#pragma once

#include <array>
#include <streambuf>
#include <string>
#include <variant>
#include "core/framework/allocator.h"
#include "core/framework/config_options.h"

namespace onnxruntime {
namespace epctx {
/// <summary>
/// Holds the buffer that will store the output model and the allocator used to allocate the memory.
/// </summary>
struct BufferHolder {
void** buffer_ptr = nullptr;
size_t* buffer_size_ptr = nullptr;
AllocatorPtr buffer_allocator = nullptr;
};

/// <summary>
/// Holds the opaque stream state and the write function that ORT calls to write out the output model.
/// </summary>
struct OutStreamHolder {
OrtOutStreamWriteFunc write_func = nullptr;
void* stream_state = nullptr; // Opaque pointer to user's stream state. Passed as first argument to write_func.
};

/// <summary>
/// Stores EPContext model generation options. Used in SessionOptions.
/// </summary>
struct ModelGenOptions {
ModelGenOptions() = default;

// Initializes from string key/value pairs in session config options.
explicit ModelGenOptions(const ConfigOptions& config_options);

bool enable = false;
bool overwrite_existing_output_file = false;
bool error_if_no_compiled_nodes = false;
bool embed_ep_context_in_model = false;

std::variant<std::monostate, // Initial state (no output model location)
std::string, // output model path
BufferHolder, // buffer to save output model
OutStreamHolder> // Function to write the output model to a user's stream.
output_model_location{};

std::string output_external_initializers_file_path;
size_t output_external_initializer_size_threshold = 0;

bool HasOutputModelLocation() const;
const std::string* TryGetOutputModelPath() const;
const BufferHolder* TryGetOutputModelBuffer() const;
const OutStreamHolder* TryGetOutputModelOutStream() const;
};

// Class that wraps the user's OrtOutStreamWriteFunc function to enable use with
// C++'s std::ostream.
// Example:
// OutStreamHolder stream_holder{write_func, stream_state};
// std::unique_ptr<OutStreamBuf> out_stream_buf = std::make_unique<OutStreamBuf>(stream_holder);
// std::ostream out_stream(out_stream_buf.get());
class OutStreamBuf : public std::streambuf {
public:
explicit OutStreamBuf(OutStreamHolder out_stream_holder);
~OutStreamBuf();

const Status& GetStatus() const {
return last_status_;
}

protected:
int_type overflow(int_type ch) override;
int sync() override;

private:
OutStreamHolder out_stream_holder_{};
std::array<char, 4096> buffer_{};
Status last_status_{};
};

} // namespace epctx
} // namespace onnxruntime
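ModelGenOptions stores the output destination in a `std::variant` whose first alternative, `std::monostate`, means "no location set", and exposes it through `TryGet...` accessors built on `std::get_if`. The following sketch shows that pattern with standard C++17 only. `BufferTarget`, `StreamTarget`, `OutputLocation`, `HasLocation`, and `Describe` are hypothetical, trimmed-down names; the PR's BufferHolder and OutStreamHolder carry more fields.

```cpp
#include <string>
#include <variant>

// Hypothetical, reduced versions of the holder structs in the header above.
struct BufferTarget { void** buffer_ptr = nullptr; };
struct StreamTarget { void* stream_state = nullptr; };

// std::monostate is the default alternative and represents "unset".
using OutputLocation = std::variant<std::monostate, std::string, BufferTarget, StreamTarget>;

// True once any real destination has been assigned.
bool HasLocation(const OutputLocation& loc) {
  return !std::holds_alternative<std::monostate>(loc);
}

// Returns a short label for the active alternative; std::get_if yields nullptr
// when the requested alternative is not the active one.
std::string Describe(const OutputLocation& loc) {
  if (const std::string* path = std::get_if<std::string>(&loc)) return "path:" + *path;
  if (std::get_if<BufferTarget>(&loc) != nullptr) return "buffer";
  if (std::get_if<StreamTarget>(&loc) != nullptr) return "stream";
  return "unset";
}
```

A variant makes the destinations mutually exclusive by construction: assigning a stream target replaces a previously assigned path, so no separate bookkeeping is needed to decide which destination wins.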