Fix .args loading order for program mode detection #840
Note: This PR targets release v0.9.3 (commit ff0c02e) to make it easy for users relying on the last stable release to apply this fix to their local builds. Upstream main has since restructured to use submodules and patches.
Summary
- Move cosmo_args() before determine_program() so embedded .args flags like --server --v2 are visible when determining program mode
- Rename lf::server::main() to lf::server::run() with an args_already_loaded parameter to avoid double-loading .args
Problem
When a llamafile has an embedded .args file containing --server --v2, the determine_program() call happens before cosmo_args() loads the embedded args. The flags are therefore not seen, so the program falls through to chatbot mode instead of launching llamafiler.
Issue with PR #788
PR #788 correctly identifies this bug and moves cosmo_args() before determine_program(). However, it introduces a double-loading issue: when the program dispatches to llamafiler mode, lf::server::main() in prog.cpp calls cosmo_args() again. This causes .args to be loaded twice, which is problematic for accumulator flags like --header that append values rather than overwrite them.
Solution
This PR:
- Moves the cosmo_args() call before determine_program() in llama.cpp/main/main.cpp
- Renames lf::server::main() to lf::server::run() with an args_already_loaded parameter
- The caller that has already loaded .args passes true to skip the redundant .args load
- The llamafiler binary passes false to load its own args
This provides the same fix as PR #788 while avoiding the double-loading issue.
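The ordering and the args_already_loaded guard can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: load_embedded_args(), determine_program(), collect_headers() and server_run() are hypothetical stand-ins (for cosmo_args(), the real determine_program(), the --header handling, and lf::server::run() respectively), and the embedded flag list is made up for the example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for cosmo_args(): appends the flags from an
// embedded .args file to the argument vector. The flag list is invented.
static void load_embedded_args(std::vector<std::string>& argv) {
    const std::vector<std::string> embedded = {
        "--server", "--v2", "--header", "X-Test: 1"};
    argv.insert(argv.end(), embedded.begin(), embedded.end());
}

enum class Program { Chatbot, Server };

// Hypothetical stand-in for determine_program(): selects server mode only
// when both --server and --v2 are present, otherwise falls back to chatbot.
static Program determine_program(const std::vector<std::string>& argv) {
    bool server = false;
    bool v2 = false;
    for (const auto& arg : argv) {
        if (arg == "--server") server = true;
        if (arg == "--v2") v2 = true;
    }
    return (server && v2) ? Program::Server : Program::Chatbot;
}

// An accumulator flag: every --header occurrence appends a value, so
// loading .args twice yields duplicate entries.
static std::vector<std::string> collect_headers(
        const std::vector<std::string>& argv) {
    std::vector<std::string> headers;
    for (std::size_t i = 0; i + 1 < argv.size(); ++i) {
        if (argv[i] == "--header") headers.push_back(argv[i + 1]);
    }
    return headers;
}

// Sketch of the server entry point: it loads .args only when the caller
// has not done so already, then returns the accumulated headers.
static std::vector<std::string> server_run(std::vector<std::string> argv,
                                           bool args_already_loaded) {
    if (!args_already_loaded) load_embedded_args(argv);
    return collect_headers(argv);
}
```

In this sketch, a dispatcher that calls load_embedded_args() before determine_program() sees the server flags and then calls server_run(argv, true), which yields one header. Calling server_run(argv, false) on the same already-loaded vector reproduces the double-loading problem and yields two.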
Test plan
- Embed an .args file containing --server --v2 in a llamafile using zipalign
- Verify that the llamafiler binary still works correctly
Enhances the .args timing fix from PR #788.