Skip to content

Conversation

@vgvassilev
Copy link
Member

@vgvassilev vgvassilev commented Dec 20, 2023

The llvm9 JIT issued callbacks when a symbol was missing and we reacted on it by loading the relevant library. In 9b2041e3 we have kept the logic but now the JIT started querying more often even for symbols which are okay to be missing. In turn that leads to scanning all libraries causing performance issues.

This patch tries to limit this functionality only in contexts where automatic loading is allowed. In addition when computing the offsets of a constant variable declaration we compute the initializers instead of searching in the shared objects.

@phsft-bot
Copy link

Starting build on ROOT-performance-centos8-multicore/soversion, ROOT-ubuntu2204/nortcxxmod, ROOT-ubuntu2004/python3, mac12arm/cxx20, windows10/default
How to customize builds

@pcanal pcanal requested review from dpiparo and hahnjo December 20, 2023 17:54
@pcanal
Copy link
Member

pcanal commented Dec 20, 2023

This patch seems to be a good performance enhancement (skipping some work that will not be used) but I am confused on how it helps with the increase in autoloading as I would have expected that in the case where this patch is active (i.e. autoloading is disabled) the net effect would have been the same (i.e. autoloading not activated). What am I missing?

i.e. this patch looks good to me as-is but does not seem to fully address the problem.

@github-actions
Copy link

github-actions bot commented Dec 20, 2023

Test Results

     9 files       9 suites   1d 16h 8m 44s ⏱️
 2 488 tests  2 487 ✅ 0 💤 1 ❌
21 360 runs  21 359 ✅ 0 💤 1 ❌

For more details on these failures, see this check.

Results for commit 55145f0.

♻️ This comment has been updated with latest results.

@vepadulano
Copy link
Member

vepadulano commented Dec 21, 2023

Reporting some numbers from my machine with this patch, building ROOT with the following cmake invocation:

cmake -DCMAKE_BUILD_TYPE=Release -S rootsrc -B rootbuild/BUILD_NAME -DCMAKE_INSTALL_DIR=rootinstall/INSTALL_NAME

TLDR: The situation w.r.t. #14277 is seemingly unchanged.

EDIT: Probably we also want to compare the changes of this patch against ROOT master before the two hotfixes regarding the missing symbols were introduced. I will do that and provide a followup comment.

Baseline: master@fbd6abb

Import time

Taken after a few runs, hot caches

$: time python -c "import ROOT"

real	0m0.791s
user	0m0.592s
sys	0m0.203s

strace file stats

strace -e file python -c 'import ROOT' 2>&1 | awk -v FS='"' '{ print $1 }' | sort | uniq -c
    206 access(
      2 chdir(
      1 execve(
      1 +++ exited with 0 +++
    181 getcwd(
      2 newfstatat(0, 
      2 newfstatat(1, 
      2 newfstatat(2, 
    532 newfstatat(3, 
     29 newfstatat(4, 
  10123 newfstatat(AT_FDCWD, 
    546 openat(AT_FDCWD, 
    843 readlink(
      1 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=130118, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
      1 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=130120, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
      1 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=130124, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---

openat stats with the two symbols related to #14277

strace -z -f -o openat.log -e trace=open,openat python3 -c 'import ROOT;print(ROOT.gErrorIgnoreLevel);print(ROOT.kError)'
-1
3000
grep openat openat.log | sed 's|.*, "||;s|".*||' | sort | uniq | wc -l
338

After this patch

Import time

Taken after a few runs, hot caches

time python -c "import ROOT"

real	0m0.792s
user	0m0.600s
sys	0m0.197s

strace file stats

strace -e file python -c 'import ROOT' 2>&1 | awk -v FS='"' '{ print $1 }' | sort | uniq -c
    206 access(
      2 chdir(
      1 execve(
      1 +++ exited with 0 +++
    181 getcwd(
      2 newfstatat(0, 
      2 newfstatat(1, 
      2 newfstatat(2, 
    532 newfstatat(3, 
     29 newfstatat(4, 
  10123 newfstatat(AT_FDCWD, 
    546 openat(AT_FDCWD, 
    843 readlink(
      1 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=130864, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
      1 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=130866, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
      1 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=130870, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---

openat stats with the two symbols related to #14277

strace -z -f -o openat.log -e trace=open,openat python3 -c 'import ROOT;print(ROOT.gErrorIgnoreLevel);print(ROOT.kError)'
-1
3000
grep openat openat.log | sed 's|.*, "||;s|".*||' | sort | uniq | wc -l
338

@vgvassilev
Copy link
Member Author

Don’t spend time on this patch just yet. I have more local changes that I am going to upload soon-ish.

@phsft-bot
Copy link

Starting build on ROOT-performance-centos8-multicore/soversion, ROOT-ubuntu2204/nortcxxmod, ROOT-ubuntu2004/python3, mac12arm/cxx20, windows10/default
How to customize builds

@vgvassilev
Copy link
Member Author

@vepadulano, can you run your tests again. With the current changes we support constexpr and it should be fast as we do not even try to look for the symbols but we evaluate the initializer instead.

@phsft-bot
Copy link

Starting build on ROOT-performance-centos8-multicore/soversion, ROOT-ubuntu2204/nortcxxmod, ROOT-ubuntu2004/python3, mac12arm/cxx20, windows10/default
How to customize builds

@phsft-bot
Copy link

Build failed on ROOT-performance-centos8-multicore/soversion.
Running on olbdw-01.cern.ch:/data/sftnight/workspace/root-pullrequests-build
See console output.

Failing tests:

@phsft-bot
Copy link

Build failed on ROOT-ubuntu2204/nortcxxmod.
Running on root-ubuntu-2204-2.cern.ch:/home/sftnight/build/workspace/root-pullrequests-build
See console output.

Failing tests:

@phsft-bot
Copy link

Build failed on mac12arm/cxx20.
Running on 194.12.161.128:/Users/sftnight/build/workspace/root-pullrequests-build
See console output.

Failing tests:

@phsft-bot
Copy link

Build failed on ROOT-ubuntu2004/python3.
Running on root-ubuntu-2004-1.cern.ch:/home/sftnight/build/workspace/root-pullrequests-build
See console output.

Failing tests:

@phsft-bot
Copy link

Build failed on windows10/default.
Running on null:C:\build\workspace\root-pullrequests-build
See console output.

Failing tests:

@phsft-bot
Copy link

Starting build on ROOT-performance-centos8-multicore/soversion, ROOT-ubuntu2204/nortcxxmod, ROOT-ubuntu2004/python3, mac12arm/cxx20, windows10/default
How to customize builds

@phsft-bot
Copy link

Build failed on ROOT-ubuntu2204/nortcxxmod.
Running on root-ubuntu-2204-2.cern.ch:/home/sftnight/build/workspace/root-pullrequests-build
See console output.

Failing tests:

@phsft-bot
Copy link

Starting build on ROOT-performance-centos8-multicore/soversion, ROOT-ubuntu2204/nortcxxmod, ROOT-ubuntu2004/python3, mac12arm/cxx20, windows10/default
How to customize builds

@phsft-bot
Copy link

Build failed on ROOT-ubuntu2204/nortcxxmod.
Running on root-ubuntu-2204-2.cern.ch:/home/sftnight/build/workspace/root-pullrequests-build
See console output.

Failing tests:

@vepadulano
Copy link
Member

Updates

Baseline: master before #14261 and #14276

time python -c 'import ROOT;print(ROOT.gErrorIgnoreLevel);print(ROOT.kError)'
-1
3000

real	0m1.198s
user	0m0.706s
sys	0m0.475s
nm -C $ROOTSYS/lib/*.so | grep TString::kNPOS | wc -l
0
nm -C $ROOTSYS/lib/*.so | grep kError | wc -l
27
strace -e file python -c 'import ROOT;print(ROOT.gErrorIgnoreLevel);print(ROOT.kError)' 2>&1 | awk -v FS='"' '{ print $1 }' | sort | uniq -c
      1 -1
      1 3000
    219 access(
      2 chdir(
      1 execve(
      1 +++ exited with 0 +++
    184 getcwd(
      2 newfstatat(0, 
      2 newfstatat(1, 
      2 newfstatat(2, 
  10455 newfstatat(3, 
   3997 newfstatat(4, 
  31461 newfstatat(AT_FDCWD, 
  19903 openat(AT_FDCWD, 
  28769 readlink(
      1 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=11626, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
      1 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=11628, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
      1 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=11632, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
strace -z -f -o openat.log -e trace=open,openat python3 -c 'import ROOT;print(ROOT.gErrorIgnoreLevel);print(ROOT.kError)'
grep openat openat.log | wc -l
19759

Master after the two hotfixes

time python -c "import ROOT;print(ROOT.gErrorIgnoreLevel);print(ROOT.kError)"
-1
3000

real	0m0.807s
user	0m0.620s
sys	0m0.183s
nm -C $ROOTSYS/lib/*.so | grep TString::kNPOS | wc -l
1
nm -C $ROOTSYS/lib/*.so | grep kError | wc -l
34
strace -e file python -c 'import ROOT;print(ROOT.gErrorIgnoreLevel);print(ROOT.kError)' 2>&1 | awk -v FS='"' '{ print $1 }' | sort | uniq -c
      1 -1
      1 3000
    219 access(
      2 chdir(
      1 execve(
      1 +++ exited with 0 +++
    184 getcwd(
      2 newfstatat(0, 
      2 newfstatat(1, 
      2 newfstatat(2, 
    532 newfstatat(3, 
     29 newfstatat(4, 
  10141 newfstatat(AT_FDCWD, 
    546 openat(AT_FDCWD, 
    843 readlink(
      1 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=12038, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
      1 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=12040, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
      1 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=12044, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
 strace -z -f -o openat.log -e trace=open,openat python3 -c 'import ROOT;print(ROOT.gErrorIgnoreLevel);print(ROOT.kError)'
-1
3000
grep openat openat.log | wc -l
402

This PR

time python -c "import ROOT;print(ROOT.gErrorIgnoreLevel);print(ROOT.kError)"
-1
3000

real	0m0.805s
user	0m0.612s
sys	0m0.191s
nm -C $ROOTSYS/lib/*.so | grep TString::kNPOS | wc -l
1
nm -C $ROOTSYS/lib/*.so | grep kError | wc -l
27
strace -e file python -c 'import ROOT;print(ROOT.gErrorIgnoreLevel);print(ROOT.kError)' 2>&1 | awk -v FS='"' '{ print $1 }' | sort | uniq -c
      1 -1
      1 3000
    219 access(
      2 chdir(
      1 execve(
      1 +++ exited with 0 +++
    184 getcwd(
      2 newfstatat(0, 
      2 newfstatat(1, 
      2 newfstatat(2, 
    532 newfstatat(3, 
     29 newfstatat(4, 
  10141 newfstatat(AT_FDCWD, 
    546 openat(AT_FDCWD, 
    843 readlink(
      1 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=12243, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
      1 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=12245, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
      1 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=12249, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
strace -z -f -o openat.log -e trace=open,openat python3 -c 'import ROOT;print(ROOT.gErrorIgnoreLevel);print(ROOT.kError)'
-1
3000
grep openat openat.log | wc -l
402

@vepadulano
Copy link
Member

@vgvassilev It seems that the changes in this PR provide the same performance as the current master (after the two hotfixes), while supporting leaving the k* constants in TError.h as constexpr so without producing a symbol in libCore.so. For TString::kNPOS I guess we want to leave the commit there since it was anyway a real fix to a bug.

Another problem which is still present in both the current master and even after this PR is that the number of file operations is still quite high, i.e.

    532 newfstatat(3, 
     29 newfstatat(4, 
  10141 newfstatat(AT_FDCWD, 
    546 openat(AT_FDCWD, 
    843 readlink(

which are exactly the same whether current master or this patch. Consider that many of those newfstatat calls are spurious and I'm trying to understand where they come from. Take the following as an example

newfstatat(AT_FDCWD, "/home/vpadulan/programs/rootproject/rootbuild/reduce-symbol-search-release/gui/sessionviewer/input_line_11", 0x7ffde442f050, 0) = -1 ENOENT (No such file or directory)

Note the presence of the rootbuild directory, which is the path where I store the ROOT build artifacts, even though I am using ROOT from the install directory. I thought it was something in the machinery of (T)cling but so far my tests haven't brought me to any conclusion. Another suspicion is that the path to the build directory is hardcoded somehow in the .pcm or in the AST and doesn't get converted to the install directory.

@vgvassilev
Copy link
Member Author

@vgvassilev It seems that the changes in this PR provide the same performance as the current master (after the two hotfixes), while supporting leaving the k* constants in TError.h as constexpr so without producing a symbol in libCore.so.

Thanks for doing this. That sounds good

For TString::kNPOS I guess we want to leave the commit there since it was anyway a real fix to a bug.

I agree, we should also land this PR since it supports now constexpr constructs which can be anywhere in external codes.

Consider that many of those newfstatat calls are spurious and I'm trying to understand where they come from.

These are likely garbage that's put by rootcling into the pcm files. We can verify if that theory is right if we make a build with -Druntime_cxxmodules=Off and see if we still get them at such high numbers...

@smuzaffar
Copy link
Contributor

smuzaffar commented Dec 24, 2023

sure, started PR tests at cms-sw#193

@dpiparo
Copy link
Member

dpiparo commented Dec 27, 2023

Hi @smuzaffar , looking at the CMSSW tests, it seems this patch worked for CMS: could you confirm that?

Copy link
Member

@dpiparo dpiparo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @vgvassilev for this fix, delivered so quickly. I propose to merge the code if other reviewers approve the PR and if 2 conditions are met:

  1. We get the confirmation from CMS that this is all what they need.
  2. We understand why roottest-python-basic-basic fails on fedora38 and, if possible, fix the problem.

DeclId_t d;
TClingClassInfo *cl = (TClingClassInfo*)opaque_cl;

// Could trigger deserialization of decls.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not understand exactly why this had to move up in the cxx file: could you perhaps clarify, please?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the tests make GetDataMember now deserialize.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems I can just remove this and tests still pass on my system. Can you assess if this is still needed in the latest version of the PR? Shall we try reverting it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it was needed on my system.

@dpiparo
Copy link
Member

dpiparo commented Dec 27, 2023

(btw, since we are looking for ways to speedup ROOT startup, I think this could also be an avenue #7123)

@smuzaffar
Copy link
Contributor

smuzaffar commented Dec 27, 2023

Hi @smuzaffar , looking at the CMSSW tests, it seems this patch worked for CMS: could you confirm that?

Yes @dpiparo , this worked for CMS too. Local tests also show that with this update CMSSW/ROOT symbols mentioned in cms-sw/cmssw#43077 (comment) also do not trigger loading of all shared libs.

@hahnjo
Copy link
Member

hahnjo commented Dec 27, 2023

We understand why roottest-python-basic-basic fails on fedora38 and, if possible, fix the problem.

I believe this is not related to this PR, some of the Python tests fail from time to time on Fedora 38 and Fedora 39. I suspect there are still GC-related issues with Python 3.11 and 3.12 that were never fully understood...

@vgvassilev vgvassilev force-pushed the reduce-symbol-search branch from e2af2f9 to 8f4c26d Compare January 8, 2024 10:26
@phsft-bot
Copy link

Starting build on ROOT-performance-centos8-multicore/soversion, ROOT-ubuntu2204/nortcxxmod, ROOT-ubuntu2004/python3, mac12arm/cxx20, windows10/default
How to customize builds

@dpiparo
Copy link
Member

dpiparo commented Jan 9, 2024

Hi @vgvassilev , thanks once again for this great development, which was delivered and tested on such a short timescale, basically during the end of the year break. On January 23rd, the last open pre-release of CMSSW_14_0_0 will close: after that date, no ROOT version updates will be possible (see cms-sw/cmssw#43077 (comment)).
Currently, the CMS issue has been worked around, therefore ROOT 6.30 can be picked up by CMS. However, it would be much better to have your fix in master and the 6.30 branch for CMS to pick up. That leaves us with a short window of opportunity, which is basically this week.

Do you think you can factor and succinctly document the changes in the way @hahnjo proposed in order to be able to merge the PR to make your fixes available to CMS? We see LHCb experiencing similar problems - they would also likely benefit from your code.

@vgvassilev
Copy link
Member Author

Hi @vgvassilev , thanks once again for this great development, which was delivered and tested on such a short timescale, basically during the end of the year break. On January 23rd, the last open pre-release of CMSSW_14_0_0 will close: after that date, no ROOT version updates will be possible (see cms-sw/cmssw#43077 (comment)). Currently, the CMS issue has been worked around, therefore ROOT 6.30 can be picked up by CMS. However, it would be much better to have your fix in master and the 6.30 branch for CMS to pick up. That leaves us with a short window of opportunity, which is basically this week.

Do you think you can factor and succinctly document the changes in the way @hahnjo proposed in order to be able to merge the PR to make your fixes available to CMS? We see LHCb experiencing similar problems - they would also likely benefit from your code.

@dpiparo, I have made the changes that I agree with and responded to the rest of the comments. This PR seems to be ready to go from what concerns me.

vgvassilev and others added 4 commits January 11, 2024 10:27
The llvm9 JIT issued callbacks when a symbol was missing and we reacted on it
by loading the relevant library. In root-project/root@9b2041e3 we have kept the
logic but now the JIT started querying more often even for symbols which are
okay to be missing. In turn that leads to scanning all libraries causing
performance issues.

This patch tries to limit this functionality only in contexts where automatic
loading is allowed.
Symbol lookup is a quite expensive operation and might result in JIT
compilation and library loading.

Co-authored-by: Jonas Hahnfeld <[email protected]>
@dpiparo
Copy link
Member

dpiparo commented Jan 11, 2024

Hi @hahnjo , @vgvassilev : thanks a lot for the exchanges and work on this topic. I was wondering if you can estimate an ETA for the merging of this nice work, key for addressing CMS (and LHCbs) current issues in a sustainable way and for the adoption of ROOT 6.30 by CMS in 2024.

@hahnjo hahnjo force-pushed the reduce-symbol-search branch from 8f4c26d to 55145f0 Compare January 11, 2024 11:43
@phsft-bot
Copy link

Starting build on ROOT-performance-centos8-multicore/soversion, ROOT-ubuntu2204/nortcxxmod, ROOT-ubuntu2004/python3, mac12arm/cxx20, windows10/default
How to customize builds

Copy link
Member

@hahnjo hahnjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In any case, the organization is now good from my point of view.

DeclId_t d;
TClingClassInfo *cl = (TClingClassInfo*)opaque_cl;

// Could trigger deserialization of decls.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems I can just remove this and tests still pass on my system. Can you assess if this is still needed in the latest version of the PR? Shall we try reverting it?

@dpiparo dpiparo self-requested a review January 15, 2024 15:44
Copy link
Member

@dpiparo dpiparo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dear @vgvassilev , Thanks for these changes. Can you merge them asap? We can then make sure @smuzaffar uses this commit for the CMSSW builds in order for CMS to fully benefit from the improvements.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants