Skip to content

Conversation

thevar1able
Copy link
Member

@thevar1able thevar1able commented Mar 9, 2025

Resolves: #78765

Changelog category (leave one):

  • Performance Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Use clang from llvm-20

@thevar1able thevar1able changed the title temp-commit Use clang-20 Mar 9, 2025
Copy link

clickhouse-gh bot commented Mar 9, 2025

Workflow [PR], commit [227c67f]

@clickhouse-gh clickhouse-gh bot added the pr-build Pull request with build/testing/packaging improvement label Mar 9, 2025
@azat azat self-assigned this Mar 11, 2025
@thevar1able thevar1able force-pushed the use-clang-20 branch 4 times, most recently from 8ea88c9 to a32abfa Compare March 31, 2025 13:32
@azat azat mentioned this pull request Apr 7, 2025
1 task
@azat
Copy link
Member

azat commented Apr 15, 2025

Build (amd_tidy) — Failed: Build ClickHouse

$ grep 'error:' build_clickhouse.log | grep -o '\[.*\]' | sort | uniq -cd
     10 [bugprone-nondeterministic-pointer-iteration-order,-warnings-as-errors]
     16 [cert-arr39-c,-warnings-as-errors]
      3 [clang-diagnostic-error]
      3 [clang-diagnostic-uninitialized,-warnings-as-errors]
   2595 [modernize-use-integer-sign-comparison,-warnings-as-errors]
 153682 [portability-template-virtual-member-function,-warnings-as-errors]
    403 [readability-container-contains,-warnings-as-errors]

@azat azat added the pr-performance Pull request with some performance improvements label Apr 15, 2025
@clickhouse-gh clickhouse-gh bot removed the pr-performance Pull request with some performance improvements label Apr 15, 2025
@clickhouse-gh clickhouse-gh bot added pr-performance Pull request with some performance improvements and removed pr-build Pull request with build/testing/packaging improvement labels Apr 15, 2025
CMakeLists.txt Outdated
# set CPU time limit to 1000 seconds
set (RLIMIT_CPU 1000)
# set stack size to 1024M
set (RLIMIT_STACK 1024000000)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No wonder it does not work - clang simply assume that the stack is 8MB - https://github.com/llvm/llvm-project/blob/788b50a4384985f2c221cfd8d290cabc6f59e646/clang/include/clang/Basic/Stack.h#L26

As for why it happens only for arm64 it is likely due to it has more registers

And here is an example of a stacktrace - https://pastila.nl/?0000e83f/bcc54268487047ba960a50304e756557#7k0S6lGYXncNor/j2uTBqw==

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Introduced in c978f0f7

$ git tag --contains c978f0f7
llvmorg-21-init
llvmorg-20.1.3
llvmorg-20.1.2
llvmorg-20.1.1
llvmorg-20.1.0-rc3
llvmorg-20.1.0-rc2
llvmorg-20.1.0-rc1
llvmorg-20.1.0

@azat
Copy link
Member

azat commented Apr 21, 2025

This is how our recursion for Settings.cpp looks like :)

Clang AST:

settings

P.S. Yes, it is better to show as screenshot since I have to zoom out as much as possible

UPD: in text - https://pastila.nl/?0010852e/d728efbefd772d270ccc6d00a5c8685a#M7N24U/ZDFhfF0/ZzJko1A==

Copy link
Member

@azat azat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except for comments LGMT, also worth checking perf tests (though they are unstable right now)

@azat
Copy link
Member

azat commented Apr 22, 2025

column_array_filter query 7 looks related

image

@thevar1able thevar1able force-pushed the use-clang-20 branch 2 times, most recently from 0c1160e to d9131f3 Compare September 4, 2025 03:38
# [9] https://developer.arm.com/documentation/dui0801/g/A64-Data-Transfer-Instructions/LDAPR?lang=en
# [10] https://github.com/aws/aws-graviton-getting-started/blob/main/README.md
set (COMPILER_FLAGS "${COMPILER_FLAGS} -march=armv8.2-a+simd+crypto+dotprod+ssbs+rcpc+bf16")
set (COMPILER_FLAGS "${COMPILER_FLAGS} -march=armv8.2-a+simd+crypto+dotprod+ssbs+rcpc+bf16+lse")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

L. 63-64 say that LSE is part of ARM v.1, why do we add it again here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something failed without it, but I don't remember what exactly. Let's try removing them when other stuff is fixed, and see if it is really necessary.

@thevar1able thevar1able removed the ci-performance performance only label Sep 7, 2025
@rschu1ze rschu1ze self-assigned this Sep 8, 2025
```sh
export CC=clang-19
export CXX=clang++-19
export CC=clang-20
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

21

@thevar1able thevar1able added the ci-build CI with just build jobs, style check and fast test label Sep 9, 2025
@thevar1able thevar1able added ci-performance performance only and removed ci-build CI with just build jobs, style check and fast test labels Sep 11, 2025
@thevar1able
Copy link
Member Author

Yay, AVX-related regression is gone. But there are new ones, like sum() and if().

I'm going to close this PR and reopen a new one for clang-21.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci-performance performance only pr-performance Pull request with some performance improvements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Build problem: private field 'packet_in_progress' is not used
4 participants