Skip to content

Conversation

@hahnjo
Copy link
Member

@hahnjo hahnjo commented Oct 7, 2022

Avoid loading ROOT modules while Clang is instantiating a template class from STL, leading to ODR checks with an incomplete class.

@hahnjo hahnjo added the in:Cling label Oct 7, 2022
@hahnjo hahnjo requested a review from vgvassilev October 7, 2022 12:58
@hahnjo hahnjo self-assigned this Oct 7, 2022
@phsft-bot
Copy link

Starting build on ROOT-debian10-i386/soversion, ROOT-performance-centos8-multicore/cxx17, ROOT-ubuntu18.04/nortcxxmod, ROOT-ubuntu2004/python3, mac1015/cxx17, mac11/cxx14, windows10/cxx14
How to customize builds

@vgvassilev
Copy link
Member

@junaire, can you test if this solves also your problems?

auto *D = dyn_cast<Decl>(DC);
if (D) {
SourceLocation Loc = D->getLocation();
if (Loc.isValid() && SemaR.getSourceManager().isInSystemHeader(Loc)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could ask if Decls OwningModule is a system one. This way we could "annotate" modules which we do not need to mess with. We consider modules starting with ROOT_ system modules because a pure [system] tag also changes a little the semantics of the whole thing but I think we are fine to do that check here.

I think we probably can explore a little more the heuristics you came up with as it looks promising. Namely, can a declaration context of a module be 'extended with' declarations from a module that is not listed among the module dependencies? I suspect the only cases where this can happen is friend declarations and ADL lookups. I don't think we need to support that... Eg. what happens if we have a module that adds another std::swap with MyClass parameter - are we expected to autoload the module exporting MyClass?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like asking the OwningModule and changed both the existing and the new check 😃

I think we probably can explore a little more the heuristics you came up with as it looks promising.

Yes, we likely need to understand the other failure modes and then abstract that to a heuristic / check that we can add here. For now I just want to make both LookupObject methods bail out on the same conditions, namely a Decl coming from a system module.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm well, I guess this doesn't work for the nortcxxmod case? 😬

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we care for the nortcxxmod as the PCH is preloaded and we don't react to entities that come from an ASTFile. I don't think we have "system" headers which are not in the PCH.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reverted back to asking isInSystemHeader which did better in the tests...

@junaire
Copy link
Contributor

junaire commented Oct 8, 2022

@junaire, can you test if this solves also your problems?

I haven't tested this, but @hahnjo said this almost fixes my issue in #10910, only roottest-root-html-runMakeIndex needs some tweaking for the generated header file. But I think this may not be a full "workaround" as it doesn't handle the root issue. In fact, Jonas did mention this doesn't help with the problems about TSeq, which means we still have to preload ROOTDataFrame.

I also made my own fix (4757b60) and I feel like it may be a better direction to work on (even though it also failed to fix TSeq...)

@junaire
Copy link
Contributor

junaire commented Oct 8, 2022

@junaire, can you test if this solves also your problems?

I haven't tested this, but @hahnjo said this almost fixes my issue in #10910, only roottest-root-html-runMakeIndex needs some tweaking for the generated header file. But I think this may not be a full "workaround" as it doesn't handle the root issue. In fact, Jonas did mention this doesn't help with the problems about TSeq, which means we still have to preload ROOTDataFrame.

I also made my own fix (4757b60) and I feel like it may be a better direction to work on (even though it also failed to fix TSeq...)

FWIW I pushed a new commit to #10910 and it looks like surprisingly fixes all issues even without preloading ROOTDataFrame! But now let's see if Jenkins is happy or not. Fingers crossed :)

@junaire
Copy link
Contributor

junaire commented Oct 9, 2022

@junaire, can you test if this solves also your problems?

I haven't tested this, but @hahnjo said this almost fixes my issue in #10910, only roottest-root-html-runMakeIndex needs some tweaking for the generated header file. But I think this may not be a full "workaround" as it doesn't handle the root issue. In fact, Jonas did mention this doesn't help with the problems about TSeq, which means we still have to preload ROOTDataFrame.
I also made my own fix (4757b60) and I feel like it may be a better direction to work on (even though it also failed to fix TSeq...)

FWIW I pushed a new commit to #10910 and it looks like surprisingly fixes all issues even without preloading ROOTDataFrame! But now let's see if Jenkins is happy or not. Fingers crossed :)

OK, so we still got some failures but I can't see it locally. I'll try to reproduce them, but I feel like that's the right direction. CC @hahnjo @vgvassilev

@hahnjo
Copy link
Member Author

hahnjo commented Oct 10, 2022

@vgvassilev as I wrote on Mattermost, this change was particularly written to address Jun's problem. roottest-root-html-runMakeIndex is a separate thing that needs addressing anyhow; it's only now visible because all the other failures are gone.

FWIW I don't agree that the workarounds (hacks) introduced in #10910 are a good way to go, as shown by the many failing tests. The scope of the changes is too big and there are valid reasons to do something with Decls currently being defined, as evidenced by the many failing tests. Adding more conditions will eventually only lead to internally inconsistent states...

@junaire
Copy link
Contributor

junaire commented Oct 10, 2022

@vgvassilev as I wrote on Mattermost, this change was particularly written to address Jun's problem. roottest-root-html-runMakeIndex is a separate thing that needs addressing anyhow; it's only now visible because all the other failures are gone.

FWIW I don't agree that the workarounds (hacks) introduced in #10910 are a good way to go, as shown by the many failing tests. The scope of the changes is too big and there are valid reasons to do something with Decls currently being defined, as evidenced by the many failing tests. Adding more conditions will eventually only lead to internally inconsistent states...

The "workaround" introduced in #10910 is not only intended to address my problem but also serves as a more generic fix. It makes more sense right? disable the callback when we're instantiating templates.

as shown by the many failing tests.

Ahh... yes and no, actually we're almost getting there. Currently, it is just some strange failures in some specific build bots (ROOT-debian10-i386 and mac builders) We also haven't tested that separately so who knows what triggered the failure.

I'm not really sure which way is the best to go, we still need time to discover, can you check if that patch fixes your issue? Also, I would appreciate it if you can take a look at the failures to see if you have a clue or not :)

@hahnjo
Copy link
Member Author

hahnjo commented Oct 10, 2022

@junaire tests should be clean in CI builds, as you can see for this PR. Any failures are introduced by the code changes, possibly due to exposing other issues (as was the case here, leading to this fix). And it's not only some obscure platforms, but both macOS and two of the three Linux platforms. But we should discuss this on the PR itself, not here...

@junaire
Copy link
Contributor

junaire commented Oct 10, 2022

@junaire tests should be clean in CI builds, as you can see for this PR. Any failures are introduced by the code changes, possibly due to exposing other issues (as was the case here, leading to this fix). And it's not only some obscure platforms, but both macOS and two of the three Linux platforms. But we should discuss this on the PR itself, not here...

Sorry about discussing unrelated content here, I sent you a message in mattermost. Anyway, thanks for your help!

@phsft-bot
Copy link

Starting build on ROOT-debian10-i386/soversion, ROOT-performance-centos8-multicore/cxx17, ROOT-ubuntu18.04/nortcxxmod, ROOT-ubuntu2004/python3, mac1015/cxx17, mac11/cxx14, windows10/cxx14
How to customize builds

@phsft-bot
Copy link

Build failed on ROOT-ubuntu2004/python3.
Running on root-ubuntu-2004-1.cern.ch:/home/sftnight/build/workspace/root-pullrequests-build
See console output.

Errors:

  • [2022-10-10T10:06:28.432Z] FAILED: tmva/sofie/test/CMakeFiles/SofieCompileModels_ROOT.util

@phsft-bot
Copy link

Build failed on ROOT-ubuntu18.04/nortcxxmod.
Running on sft-ubuntu-1804-3.cern.ch:/build/workspace/root-pullrequests-build
See console output.

Failing tests:

@phsft-bot
Copy link

Build failed on windows10/cxx14.
Running on null:C:\build\workspace\root-pullrequests-build
See console output.

Failing tests:

@vgvassilev
Copy link
Member

This change looks good to me, do we have a plan what to do with the current failures?

@hahnjo
Copy link
Member Author

hahnjo commented Oct 10, 2022

@vgvassilev the failure is in a nortcxxmod, that's why I wondered about this case. The previous implementation looking at the SourceLocations worked, so there must be some differences...

@vgvassilev
Copy link
Member

vgvassilev commented Oct 10, 2022

@vgvassilev the failure is in a nortcxxmod, that's why I wondered about this case. The previous implementation looking at the SourceLocations worked, so there must be some differences...

I suspect that std::pair<edm::Value> returned true which we can classify more accurately now. Template instantiations are a weird beast that can contain both content from a system header and non-system headers. I suspect that hasOwningModule is false. For the PCH usually it is enough to use isFromASTFile.

PS: On a second thought that would probably fail elsewhere..

Avoid loading ROOT modules while Clang is instantiating a template
class from STL, leading to ODR checks with an incomplete class.
@phsft-bot
Copy link

Starting build on ROOT-debian10-i386/soversion, ROOT-performance-centos8-multicore/cxx17, ROOT-ubuntu18.04/nortcxxmod, ROOT-ubuntu2004/python3, mac1015/cxx17, mac11/cxx14, windows10/cxx14
How to customize builds

@phsft-bot
Copy link

Build failed on ROOT-ubuntu2004/python3.
Running on root-ubuntu-2004-3.cern.ch:/home/sftnight/build/workspace/root-pullrequests-build
See console output.

Failing tests:

@phsft-bot
Copy link

Build failed on ROOT-ubuntu18.04/nortcxxmod.
Running on sft-ubuntu-1804-3.cern.ch:/build/workspace/root-pullrequests-build
See console output.

Errors:

  • [2022-10-11T12:46:21.868Z] fatal error: PCH file '/mnt/build/workspace/root-pullrequests-build/build/etc//allDict.cxx.pch' not found: module file not found
  • [2022-10-11T12:46:24.399Z] FAILED: tmva/sofie/test/CMakeFiles/SofieCompileModels_ROOT.util

@hahnjo
Copy link
Member Author

hahnjo commented Oct 19, 2022

ping...

@pcanal
Copy link
Member

pcanal commented Oct 19, 2022

disable the callback when we're instantiating templates.

As long as we do before hand the side effects. ie seeing templateclass<MyClass> we need to load the module for MyClass first.

Copy link
Member

@vgvassilev vgvassilev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me assuming @pcanal does not have objections.

Copy link
Member

@pcanal pcanal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

@hahnjo hahnjo merged commit 40aa68d into root-project:master Oct 21, 2022
@hahnjo hahnjo deleted the lookup-stl branch October 21, 2022 06:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants