@jeffbolznv
Collaborator

MSBuild performance has been poor recently, particularly since the model implementations were split into separate source files. This change enables MultiToolTask (MTT), which provides parallelism within a project without the problems of /MP. See https://devblogs.microsoft.com/cppblog/improved-parallelism-in-msbuild/.

I tested both the Vulkan and CUDA backends. My CPU is a 32-core Threadripper.

vulkan before
========== Rebuild started at 2:50 PM and took 03:38.550 minutes ==========
vulkan after
========== Rebuild started at 3:36 PM and took 01:57.272 minutes ==========

cuda before
========== Rebuild started at 3:27 PM and took 07:59.340 minutes ==========
cuda after
========== Rebuild started at 3:38 PM and took 06:17.516 minutes ==========

This is supported in VS 2019 and later. I think unknown options are ignored, so this shouldn't be harmful on older versions.

Member

@ggerganov ggerganov left a comment

I don't have an environment to test this. Feel free to merge if it works on your end.

@Acly
Collaborator

Acly commented Dec 3, 2025

Maybe guard it with `if(LLAMA_STANDALONE)`?

@jeffbolznv
Collaborator Author

> Maybe guard it with `if(LLAMA_STANDALONE)`?

Sure, done.
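
For reference, a change like this can be expressed in CMake through the `CMAKE_VS_GLOBALS` variable, which sets MSBuild global properties on targets for Visual Studio generators. The diff itself is not shown in this thread, so the following is only a minimal sketch under that assumption. `UseMultiToolTask` and `EnforceProcessCountAcrossBuilds` are the MSBuild properties described in the linked blog post; their exact placement in the top-level CMakeLists.txt is assumed here.

```cmake
# Sketch only: assumes the top-level CMakeLists.txt and a Visual Studio generator.
# LLAMA_STANDALONE is the guard suggested above; with it, the properties are
# not set when llama.cpp is consumed as a subproject of another build.
if (LLAMA_STANDALONE)
    # Let MSBuild run compiler invocations in parallel within a single project.
    list(APPEND CMAKE_VS_GLOBALS UseMultiToolTask=true)
    # Limit the total number of tool processes across projects built in parallel.
    list(APPEND CMAKE_VS_GLOBALS EnforceProcessCountAcrossBuilds=true)
endif()
```

Generators other than Visual Studio do not read `CMAKE_VS_GLOBALS`, so a block like this should have no effect on Ninja or Makefile builds.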

@jeffbolznv jeffbolznv merged commit d8b5cdc into ggml-org:master Dec 4, 2025
76 of 80 checks passed
khemchand-zetta pushed a commit to khemchand-zetta/llama.cpp that referenced this pull request Dec 4, 2025
* build: enable parallel builds in msbuild using MTT

* check LLAMA_STANDALONE
Labels

build Compilation issues
