@ludfjig ludfjig commented Aug 11, 2025

Closes #789

This PR optimizes serialization of FunctionCalls by:

  1. Making the serialization call return a &[u8] instead of a Vec<u8>, which saves a memory allocation when writing the result to memory.
  2. Avoiding an unnecessary step in which VecBytes(Vec<u8>) was converted to a Vec through a needless iterator.
  3. Preallocating a FlatBufferBuilder with a specific capacity to avoid reallocations as it grows. This comes at a small runtime cost for estimating the required capacity. In practice the estimate is very fast and almost always worth it.
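
Points 1 and 2 can be illustrated without the flatbuffers crate. The sketch below is only an illustration under stated assumptions: `Encoder`, `write_frame`, and the toy wire format are hypothetical names invented here, not Hyperlight's or flatbuffers' API. It contrasts returning an owned Vec<u8> per call with lending out a slice of a buffer the encoder already owns, and an iterator-based copy with a single `to_vec()`.

```rust
/// Hypothetical stand-in for a serializer that owns its output buffer.
/// This is NOT Hyperlight's or flatbuffers' API; it only shows the shape.
pub struct Encoder {
    buf: Vec<u8>,
}

impl Encoder {
    /// Allocates the backing buffer once, with room for `capacity` bytes.
    pub fn with_capacity(capacity: usize) -> Self {
        Encoder { buf: Vec::with_capacity(capacity) }
    }

    /// Old shape: every call hands back a freshly allocated Vec.
    pub fn encode_owned(name: &str, payload: &[u8]) -> Vec<u8> {
        let mut out = Vec::new();
        write_frame(&mut out, name, payload);
        out
    }

    /// New shape: the bytes stay inside the encoder and the caller borrows them.
    pub fn encode<'a>(&'a mut self, name: &str, payload: &[u8]) -> &'a [u8] {
        self.buf.clear(); // drops old contents but keeps the allocation
        write_frame(&mut self.buf, name, payload);
        &self.buf
    }
}

/// Toy wire format (an assumption for this sketch):
/// 4-byte little-endian name length, the name, then the payload.
fn write_frame(out: &mut Vec<u8>, name: &str, payload: &[u8]) {
    out.extend_from_slice(&(name.len() as u32).to_le_bytes());
    out.extend_from_slice(name.as_bytes());
    out.extend_from_slice(payload);
}

/// Point 2, before: building the Vec by walking an iterator.
pub fn copy_via_iterator(bytes: &[u8]) -> Vec<u8> {
    bytes.iter().map(|b| *b).collect()
}

/// Point 2, after: one bulk copy of the whole slice.
pub fn copy_via_to_vec(bytes: &[u8]) -> Vec<u8> {
    bytes.to_vec()
}
```

With the borrowed form, a caller that only needs to copy the bytes into shared memory can do so straight from the returned slice, and a second `encode` call on the same `Encoder` reuses the existing allocation.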

Future Todos:

  • This PR only affects the host side, and many improvements remain possible on the guest side: guest functions should not need to return a Vec<u8>. Instead, they should be passed some kind of Writer to which they write return values.
  • FunctionCall should contain a borrowed &str/&[u8] instead of String and Vec<u8>.
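
As a rough sketch of what these two todos might look like, assuming names that do not exist in the PR: `BorrowedFunctionCall` and `echo` are hypothetical, and `std::io::Write` is used only for illustration (a guest built without std would need its own writer trait).

```rust
use std::io::{self, Write};

/// Hypothetical borrowed variant of a function call: it points at the
/// caller's name and payload instead of owning a String and a Vec<u8>.
pub struct BorrowedFunctionCall<'a> {
    pub function_name: &'a str,
    pub payload: &'a [u8],
}

/// Hypothetical guest function that writes its return value into a
/// caller-supplied writer instead of allocating and returning a Vec<u8>.
pub fn echo<W: Write>(call: &BorrowedFunctionCall<'_>, out: &mut W) -> io::Result<()> {
    out.write_all(call.function_name.as_bytes())?;
    out.write_all(b":")?;
    out.write_all(call.payload)
}
```

The caller decides where the output lands (a preallocated Vec<u8>, a fixed buffer, or a shared-memory region wrapped in a writer), so the function itself performs no allocation.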

Relevant benchmark results compared to the main branch (with c2e6cdd applied, which adds the first two benchmarks):

guest_functions_with_large_parameters/guest_call_with_large_parameters
                        time:   [723.24 ms 756.56 ms 793.92 ms]
                        change: [−15.950% −10.778% −5.4238%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 15 outliers among 100 measurements (15.00%)
  5 (5.00%) high mild
  10 (10.00%) high severe

function_call_serialization/serialize_function_call
                        time:   [5.5803 ms 5.6351 ms 5.7012 ms]
                        change: [−80.248% −79.529% −78.899%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) high mild
  1 (1.00%) high severe
function_call_serialization/deserialize_function_call
                        time:   [8.2955 ms 8.3794 ms 8.4724 ms]
                        change: [−58.447% −57.639% −56.842%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  4 (4.00%) high mild
  8 (8.00%) high severe

sample_workloads/24K_in_8K_out_c
                        time:   [28.390 µs 28.675 µs 29.009 µs]
                        change: [−45.020% −42.425% −39.887%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe
sample_workloads/24K_in_8K_out_rust
                        time:   [27.788 µs 27.997 µs 28.237 µs]
                        change: [−38.325% −36.829% −35.245%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  4 (4.00%) high mild
  3 (3.00%) high severe

@ludfjig ludfjig force-pushed the big_param_opt branch 2 times, most recently from bf0445e to 4196a2f Compare August 11, 2025 22:45
@ludfjig ludfjig added kind/enhancement For PRs adding features, improving functionality, docs, tests, etc. area/performance Addresses performance labels Aug 11, 2025
@ludfjig ludfjig force-pushed the big_param_opt branch 7 times, most recently from 1ffa58d to 0c83035 Compare August 14, 2025 19:26

@jsturtevant jsturtevant left a comment

LGTM, thanks for the extensive tests!


@Copilot Copilot AI left a comment

Pull Request Overview

This PR optimizes FunctionCall serialization to improve performance by reducing memory allocations and improving buffer capacity estimation. The changes focus on making serialization more efficient by returning borrowed slices instead of owned vectors and pre-allocating FlatBufferBuilder capacity.

Key changes:

  • Refactored FunctionCall serialization to use encode() method returning &[u8] instead of TryFrom<FunctionCall> for Vec<u8>
  • Added capacity estimation function to pre-allocate FlatBufferBuilder capacity and avoid reallocations
  • Updated function signatures to accept &[u8] instead of Vec<u8> where possible
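
The capacity-estimation idea can be sketched as follows. This is not the function added in util.rs: `estimate_capacity`, `serialize_preallocated`, both overhead constants, and the power-of-two rounding are assumptions made for illustration, and a plain Vec<u8> stands in for the FlatBufferBuilder.

```rust
/// Assumed fixed overheads in bytes; the real values depend on the schema.
const MESSAGE_OVERHEAD: usize = 64;
const PER_PARAM_OVERHEAD: usize = 16;

/// Hypothetical estimator: sums the known payload sizes, adds the assumed
/// overheads, and rounds up to a power of two (an arbitrary choice here).
pub fn estimate_capacity(function_name: &str, params: &[&[u8]]) -> usize {
    let payload: usize = params.iter().map(|p| p.len() + PER_PARAM_OVERHEAD).sum();
    (MESSAGE_OVERHEAD + function_name.len() + payload).next_power_of_two()
}

/// Serializes into a buffer sized up front, so the appends below do not
/// reallocate as long as the estimate is not too small.
pub fn serialize_preallocated(function_name: &str, params: &[&[u8]]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(estimate_capacity(function_name, params));
    buf.extend_from_slice(function_name.as_bytes());
    for p in params {
        // Toy framing: 4-byte little-endian length, then the bytes.
        buf.extend_from_slice(&(p.len() as u32).to_le_bytes());
        buf.extend_from_slice(p);
    }
    buf
}
```

The estimate only walks the parameter list and reads lengths, which is why it is cheap compared with repeatedly growing and copying a large buffer.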

Reviewed Changes

Copilot reviewed 11 out of 16 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/hyperlight_common/src/flatbuffer_wrappers/function_call.rs | Refactored serialization from TryFrom trait to encode() method returning borrowed slice |
| src/hyperlight_common/src/flatbuffer_wrappers/util.rs | Added capacity estimation function with comprehensive tests |
| src/hyperlight_common/src/flatbuffer_wrappers/function_types.rs | Fixed VecBytes deserialization to use .bytes().to_vec() instead of iterator |
| src/hyperlight_host/src/sandbox/initialized_multi_use.rs | Updated to use new encode method with capacity estimation |
| src/hyperlight_guest/src/guest_handle/io.rs | Changed parameter from Vec<u8> to &[u8] |
| src/hyperlight_guest/src/guest_handle/host_comm.rs | Updated to use new encode method and pass references |
| src/hyperlight_guest_bin/src/guest_function/call.rs | Updated function call to pass reference instead of owned vector |
| src/hyperlight_host/benches/benchmarks.rs | Added benchmarks for serialization performance and sample workloads |
| src/tests/rust_guests/simpleguest/src/main.rs | Added benchmark test function for 24K input/8K output scenario |
| src/tests/c_guests/c_simpleguest/main.c | Added C version of benchmark test function |
| src/hyperlight_guest/Cargo.toml | Added flatbuffers dependency |

danbugs previously approved these changes Aug 21, 2025

@danbugs danbugs left a comment

Nice optimization! Mostly LGTM. Just a couple of questions/nits here and there 👍

jsturtevant previously approved these changes Aug 22, 2025

@jsturtevant jsturtevant left a comment

Feel free to address the remaining comments but this LGTM

@ludfjig ludfjig dismissed stale reviews from jsturtevant and danbugs via fb898f7 August 25, 2025 18:55
and a c+rust sample workload benchmark

Signed-off-by: Ludvig Liljenberg <[email protected]>
This should save memory allocations, but requires that a FlatBufferBuilder is passed in.

Signed-off-by: Ludvig Liljenberg <[email protected]>
Signed-off-by: Ludvig Liljenberg <[email protected]>
@jsturtevant jsturtevant enabled auto-merge (squash) August 25, 2025 19:00
@jsturtevant jsturtevant merged commit 85b4510 into hyperlight-dev:main Aug 25, 2025
33 checks passed
Successfully merging this pull request may close these issues.

FunctionCall should serialize to &[u8] instead of Vec<u8>