Skip to content

Conversation

@silvanshade
Copy link
Contributor

@silvanshade silvanshade commented Feb 12, 2025

This PR is a follow up to #445 and implements llfio-based memory-mapped IO.

A new example using this functionality has been added as example-mmap.c.

The example can be compiled, installed, and run with the following commands:

# Also add -DBLAKE3_FETCH_TBB=1 if compiling on Windows
$ cmake -S c -B c/build -G Ninja -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=c/build/install -DBLAKE3_USE_LLFIO=1 -DBLAKE3_USE_TBB=1 -DBLAKE3_EXAMPLES=1
$ cmake --build c/build --target install
$ c/build/install/bin/blake3-example-mmap <path>

@silvanshade
Copy link
Contributor Author

There is no option to request reviewers and nobody is assigned so I'll just ask:

@BurningEnlightenment and @oconnor663 can you review this? Thanks.

Copy link
Collaborator

@BurningEnlightenment BurningEnlightenment left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for submitting the PR!

I also noticed a few missing includes / reliance on transitive includes like std::move => <utility>, OUTCOME_TRY => <outcome-basic.hpp>, etc.

I haven't looked at the CMake files, yet.

I'll allocate some time next weekend to fix some of the issues.

if (result.has_error()) {
// Explicitly set errno on error since this doesn't always happen
// automatically on some platforms such as Windows.
errno = static_cast<int>(result.error().value().sc);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

llfio leverages NT APIs on Windows, i.e. the status code might be a NTSTATUS and not a Win32 code. This is problematic due to the fact that NTSTATUS and Win32 codes have overlapping value ranges.
One solution would be to convert to HRESULT, but maybe there is a more elegant one. Needs additional thought.

Furthermore llfio might be built by the user with std::error_code, i.e. .sc will produce a compiler error.

And last but not least errno might be dirty, i.e. we should clear it on success or switch the return type from void to bool to indicate success.

Copy link
Contributor Author

@silvanshade silvanshade Mar 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See here in the CMake configuration: https://github.com/silvanshade/BLAKE3/blob/fac71c9e6748e61e0c280e4e905f8f808b3c2ae8/c/dependencies/llfio/CMakeLists.txt#L27-L42

I'm specifically configuring the build with LLFIO_EXPERIMENTAL_STATUS_CODE in order to allow compiling llfio as a header-only library. The documentation states the following:

LLFIO has your choice of header-only, static library, and shared library build modes. Note that on Microsoft Windows, the default header only configuration is unsafe to use outside of toy projects. You will get warnings of the form:

[...]

The cause is that <system_error> has a design flaw where custom error code categories are unsafe when shared libraries are in use.

You have one of three choices here: (i) Use experimental SG14 status_code which doesn't have this problem (define LLFIO_EXPERIMENTAL_STATUS_CODE=1) (ii) Use the NT kernel error category as a shared library (see its documentation) (iii) Don't use header only LLFIO on Windows (see below).

Of the three choices available, the header-only with LLFIO_EXPERIMENTAL_STATUS_CODE seems the cleanest and most portable.

I couldn't find any vendor-provided packages available for llfio for any OS, so relying on it being available pre-compiled doesn't seem like a viable option.

It also doesn't appear that the repository-provided binaries are set up to be able to easily detect them via CMake through the usual means.

Finally, I'd also prefer to avoid compiling it in-tree as a shared lib on Windows if we don't need that.

Furthermore llfio might be built by the user with std::error_code, i.e. .sc will produce a compiler error.

Since llfio is configured as a private dependency and header-only library, I don't see how this can be a problem. As far as I understand, our use of llfio should not interfere with other already existing uses of it elsewhere in the user's code base.

And last but not least errno might be dirty, i.e. we should clear it on success or switch the return type from void to bool to indicate success.

Good point. I would prefer to return a bool rather than set-on-success since it is cleaner and more informative. I will change that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since llfio is configured as a private dependency and header-only library, I don't see how this can be a problem. As far as I understand, our use of llfio should not interfere with other already existing uses of it elsewhere in the user's code base.

Sadly, this is not the case. Formally speaking: Using llfio with different configurations in the same project results in One Definition Rule (ODR) violations for basically every entity. ODR violations make the whole program "ill-formed no diagnostic required" (IFNDR), it basically stops being a C++ program from the C++ standard POV.

The practical consequences can be illustrated with the following example: We build blake3 as a static library and llfio in its header variant is used in two configurations like std::error_code and status_code (progam's vs blake3's). All inline functions will be emitted by the compiler into the object file and the linker is allowed to deduplicate them by picking any of them. This means that either our program or the blake3 library will not get ABI incompatible return values.

I couldn't find any vendor-provided packages available for llfio for any OS, so relying on it being available pre-compiled doesn't seem like a viable option.

Indeed, OS packagers are a bad source for C++ libraries (though llfio has been added to the FreeBSD ports tree IIRC). Typically you would use a C++ package manager like vcpkg or conan.

btw. I also maintain llfio and its buildsystem and I'm therefore vaguely familiar with it 😉

Copy link
Contributor Author

@silvanshade silvanshade Mar 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The practical consequences can be illustrated with the following example: We build blake3 as a static library and llfio in its header variant is used in two configurations like std::error_code and status_code (progam's vs blake3's). All inline functions will be emitted by the compiler into the object file and the linker is allowed to deduplicate them by picking any of them. This means that either our program or the blake3 library will not get ABI incompatible return values.

Okay, I see how this can be a problem.

Indeed, OS packagers are a bad source for C++ libraries (though llfio has been added to the FreeBSD ports tree IIRC). Typically you would use a C++ package manager like vcpkg or conan.

btw. I also maintain llfio and its buildsystem and I'm therefore vaguely familiar with it 😉

I have some familiarity with vcpkg (not conan though), but to be more specific, I wasn't sure how we should detect local availability of llfio regardless of the source.

If I remember correctly, llfio didn't seem to be set up with a CMake find module definition so we could just use find_package. This is why I checked for the llfio::hl target specifically (after loading from FetchContent).

How would you propose we attempt to detect and use a locally available llfio?

One strategy I am imagining is that if BLAKE3_USE_LLFIO is set, we check for a (user-defined) value for LLFIO_LIBRARY and if defined we attempt to load the value as subdirectory with CMake (which we expect to define the llfio::* targets), otherwise if BLAKE3_FETCH_LLFIO is set, use FetchContent and configure like we have it now, otherwise print an error message about missing llfio.

Furthermore llfio might be built by the user with std::error_code, i.e. .sc will produce a compiler error.

Given the above, how can we detect which configuration is in use in order to handle it appropriately in the different cases? Can you give an example?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Furthermore llfio might be built by the user with std::error_code, i.e. .sc will produce a compiler error.

Given the above, how can we detect which configuration is in use in order to handle it appropriately in the different cases? Can you give an example?

I tried to generalize the implementation to work with either std::error_code or status_code and also to allow specifying whether to use an existing llfio or to fetch and whether to build as header-only (the default) or static or shared.

Unfortunately, I can't seem to get most of the non-header-only configurations to even compile on Windows. I haven't tested that on other platforms yet.

You can try yourself by specifying, for example, -DBLAKE3_USE_LLFIO=1 -DBLAKE3_USE_LLFIO_TARGET=llfio_dl -DBLAKE3_USE_LLFIO_EXPERIMENTAL_STATUS_CODE=OFF -DBLAKE3_FETCH_LLFIO=1.

For non-header-only builds on Windows, the compile fails with several unresolved symbols unless exceptions are enabled, but as far as I understand this should be supported according to ned14/llfio#127.

Another problem is that if EXCLUDE_FROM_ALL is specified in:

FetchContent_Declare(
  llfio
  GIT_REPOSITORY https://github.com/ned14/llfio
  GIT_TAG 20250206
  GIT_SHALLOW TRUE
  EXCLUDE_FROM_ALL
)

or:

add_subdirectory(${BLAKE3_USE_LLFIO_DIR} EXCLUDE_FROM_ALL)

then the llfio libs won't be compiled so the example-mmap will just fail when run because the libraries are missing. I would have thought that adding the llfio_dl to the target would force these to compile.

@oconnor663
Copy link
Member

Could you give me the high level tradeoffs for LLFIO vs mio vs Boost mapped_file? I guess another option we could consider is providing only the TBB side of the equation (since that necessarily touches BLAKE3 internaly) but leaving mmap'ing to the caller?

I was playing around with adding support for this feature to the build.rs file in the blake3_c_rust_bindings test harness, and looks like pulling in these headers means cloning three different git repos plus their submodules. I'm willing to do that (our Rust code has a gajillion transitive dependencies if you count dev/test deps), but if one of the alternatives is lighter weight that might be nice? I acknowledge that using a Rust test harness to run C/C++ code is a wacky choice :-D

@silvanshade
Copy link
Contributor Author

Could you give me the high level tradeoffs for LLFIO vs mio vs Boost mapped_file? I guess another option we could consider is providing only the TBB side of the equation (since that necessarily touches BLAKE3 internaly) but leaving mmap'ing to the caller?

@oconnor663 Sorry I missed this earlier but I see you merged the TBB PR. I think that's fine; it's still an improvement to have to multi-threading for C even if the caller has to figure out the memory-mapping for now.

I don't have much familiarity with mio but there was a reddit thread about it here.

The llfio author (ned14) commented:

From a quick scan through his sources:

LLFIO is far more future standards focused, which makes sense as the reference implementation for http://wg21.link/P1031 Low level file i/o. So specifically, he's got lots of useful iterator support and so on. LLFIO goes no further than giving you a span, it assumes you'll be using Ranges on those. LLFIO also uses Outcome as a standin for deterministic exceptions with the proposed std::error. He's using C++ 11 error_code, which is reasonable.

LLFIO does a ton load more stuff, and is correspondingly more complex. You get non-mmapped file i/o, async i/o, ability to polymorphically interchange i/o backends, various generic algorithms for excluding other users from parts of files, or many files, and so on.

LLFIO implements a full fat virtual memory control layer, so you can kick in or out individual pages, use large pages, remap pages, and so on.

LLFIO implements a full filesystem layer, so you get path views (proposed for standardisation), ability to enumerate directories correctly rather than the abomination which is directory_iterator, symlinks and so on.

Finally LLFIO handles are stackable, not that that is in the current reference library. I'm currently working on the first stackable handle type, it implements crypto random XOR.

His library looks well implemented, though he could do with more comprehensive unit testing. I wouldn't have any qualms in recommending its use.

The mio library seems to be less actively developed now.

Regarding LLFIO vs Boost (mapped_file and other features), it's my impression that LLFIO was developed partly in response to needing more complete or more advanced or more performant functionality than what was available in Boost.

I believe the intention was also that parts of LLFIO were to be proposed for addition to the C++ standard (more on this later) whereas Boost has often kind of evolved its own way of doing this for various reasons. I would expect LLFIO to be more forward looking in its design.

The main reason I chose LLFIO is because it was actively developed and it looked like parts of it were likely to be added to the standard. But I recently saw this comment and it sounds like the author is stepping away from the C++ WG stuff and deprioritizing C++ projects.

I'm less sure about whether it's worth the effort to get LLFIO-based memory-mapping to work for the C lib now.

Maybe @BurningEnlightenment has some thoughts about all of this.

I was playing around with adding support for this feature to the build.rs file in the blake3_c_rust_bindings test harness, and looks like pulling in these headers means cloning three different git repos plus their submodules. I'm willing to do that (our Rust code has a gajillion transitive dependencies if you count dev/test deps), but if one of the alternatives is lighter weight that might be nice? I acknowledge that using a Rust test harness to run C/C++ code is a wacky choice :-D

Yeah, I don't know really. This sounds like maybe more trouble than its worth.

If you do everything through CMake and use FetchContent and build LLFIO as header-only, it's pretty smooth. But I'm finding that almost any other configuration (especially on Windows) is a lot more complex than I anticipated, even for the C++ side.

I have resolved most of the feedback and plan to update the PR still but maybe there's a better option or maybe it's fine to just not have memory-mapping.

Perhaps a much simpler approach would be to split the memory-mapping code from Rust into a separate crate which could be statically linked to the C library.

@oconnor663
Copy link
Member

I've added example_tbb.c (2374b56) as an example of using mmap directly in C (so not supporting Windows). Maybe that's sufficient? Let me know what you think, if you get a chance to look at it.

@sneves
Copy link
Collaborator

sneves commented Mar 15, 2025

The handles created by open are never closed in the example.

@silvanshade
Copy link
Contributor Author

silvanshade commented Mar 16, 2025

I've added example_tbb.c (2374b56) as an example of using mmap directly in C (so not supporting Windows). Maybe that's sufficient? Let me know what you think, if you get a chance to look at it.

Cool, that seems sufficient I think.

The motivating use-case for this PR was to enable memory-mapped IO for the BLAKE3 hashing in nix. But nix is already depending on Boost locally so it's probably simpler to just use boost::mapped_file there.

I'll go ahead and close this then. Thanks for the feedback.

@oconnor663
Copy link
Member

Cool, thanks for all the work you've done to get this feature in.

The handles created by open are never closed in the example.

Woops, fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants