@zzxuanyuan
Contributor

@bbockelm I am submitting a new pull request for signal handling:

  1. I moved signal handling from TSystem to a new class called TSigHandling, and I use gSigHandling as the global signal-handling object in ROOT.
  2. To stay compatible with existing calls such as "gSystem->ResetSignals()", I moved the old ResetSignals() to the new class, so TUnixSystem::ResetSignals() just calls gSigHandling->ResetSignals(). I could also go over all gSystem->(functions) calls and replace them with gSigHandling->(functions) if necessary.
  3. I replaced the old unsafe functions in the signal handlers with thread-safe ones.
  4. I only implemented StackTrace functions for SIGBUS, SIGSEGV and SIGILL. Other signals are still using default StackTrace functions. kSigAlarm and kSigChild are ignored for my current implementation. Do we need to change other signal handlers?
  5. @pcanal I have some problems running roottest. I asked a question here:
    Fix issue ROOT-7588. #84
    Could you take a look at it? I will also write a test case for this patch.
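
For readers following point 2, the forwarding arrangement can be sketched roughly as below. This is a minimal illustration, not the code in the pull request: `SigHandlingSketch`, `GlobalSigHandling()` and `UnixSystemFacadeSketch` are hypothetical stand-ins for TSigHandling, gSigHandling and TUnixSystem, and the reset logic is reduced to restoring two default dispositions.

```cpp
#include <csignal>

// Hypothetical stand-in for the new signal-handling class (not ROOT code).
class SigHandlingSketch {
public:
   virtual ~SigHandlingSketch() = default;

   // Restore the default disposition for the signals this sketch manages.
   virtual void ResetSignals() {
      std::signal(SIGSEGV, SIG_DFL);
      std::signal(SIGILL, SIG_DFL);
      ++fResetCount;
   }

   int ResetCount() const { return fResetCount; }

private:
   int fResetCount = 0;
};

// Plays the role of the global gSigHandling object (assumed name).
inline SigHandlingSketch &GlobalSigHandling() {
   static SigHandlingSketch instance;
   return instance;
}

// Hypothetical stand-in for TUnixSystem: the old entry point remains,
// but it only forwards to the global signal-handling object.
class UnixSystemFacadeSketch {
public:
   void ResetSignals() { GlobalSigHandling().ResetSignals(); }
};
```

Callers that still go through the system object keep working unchanged, which is the compatibility argument made in point 2.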

@pcanal
Member

pcanal commented Feb 1, 2016

@zzxuanyuan 5. Note that for the test you made for #84, we had to apply a few fixes to the CMakeLists.txt to make it more general.

Member

Update to your name and current month

@pcanal
Member

pcanal commented Feb 1, 2016

@zzxuanyuan 2. I think the new Signal class should focus on the signal handling itself and delegate the auxiliary functions (getenv, etc.) to gSystem.

@pcanal
Member

pcanal commented Feb 1, 2016

@zzxuanyuan Other signals are still using default StackTrace functions.

What made you make this choice?

@zzxuanyuan kSigAlarm and kSigChild are ignored for my current implementation.

What made you make this choice?

Thanks.

@zzxuanyuan
Contributor Author

@pcanal I ignore kSigAlarm because handling it triggers the function DispatchTimers(). This function is implemented in TUnixSystem but not declared in TSystem, so I can't call it through gSystem. The same issue applies to kSigChild, which tries to call CheckChilds(). I could either duplicate these two functions in TSigHandling or add them to TSystem, but I don't think it makes sense to define them in TSystem.

Update:
I tried to move DispatchTimers() into TSystem as an abstract function. But it will cause a segmentation fault once I open CINT (command: root -l). It seems CINT sets a timer when it is initialized, and that causes the problem. Should I duplicate DispatchTimers() and all related functions into TSigHandling.cxx?
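
One way to reach a method that exists only on the platform subclass, without declaring it in the base class, is a checked downcast. The sketch below illustrates that idea under assumptions: `BaseSystemSketch` and `UnixOnlySystemSketch` are hypothetical stand-ins for TSystem and TUnixSystem, and the thread does not say which kind of cast was eventually used.

```cpp
// Hypothetical stand-in for TSystem: knows nothing about timers.
class BaseSystemSketch {
public:
   virtual ~BaseSystemSketch() = default;
};

// Hypothetical stand-in for TUnixSystem: DispatchTimers() exists only here.
class UnixOnlySystemSketch : public BaseSystemSketch {
public:
   bool DispatchTimers() {
      ++fDispatched;
      return true;
   }
   int Dispatched() const { return fDispatched; }

private:
   int fDispatched = 0;
};

// Returns true if timers were dispatched, false if `sys` is null or is not
// the Unix implementation. dynamic_cast yields nullptr in both cases.
bool DispatchTimersIfUnix(BaseSystemSketch *sys) {
   if (auto *unixSys = dynamic_cast<UnixOnlySystemSketch *>(sys))
      return unixSys->DispatchTimers();
   return false;
}
```

The trade-off is that the signal-handling code now depends on the concrete Unix class, which is what the alternatives discussed above (duplicating the functions or widening TSystem) were trying to avoid.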

@pcanal
Member

pcanal commented Mar 11, 2016

But it will cause a segmentation fault once I open CINT (command: root -l).

What is the stack trace?

@zzxuanyuan
Contributor Author

Here is the stack trace of the error. For now I simply cast gSystem to TUnixSystem to avoid declaring DispatchTimers() in TSystem.

[zhan0915@hcc-zhan root]$ root -l

A fatal system signal has occurred: segmentation violation error
Thread 1 (Thread 0x7f8b05aa0920 (LWP 10908)):
#0  0x0000003fa220e82d in read () from /lib64/libpthread.so.0
#1  0x00007f8b05fa3c64 in SignalSafeRead (fd=3, inbuf=inbuf@entry=0x7ffd70966680 "\001", timeout=timeout@entry=-1, len=1) at /home/bockelman/zhan0915/root/core/unix/src/TUnixSigHandling.cxx:195
#2  0x00007f8b05fa4435 in TUnixSigHandling::StackTraceTriggerThread () at /home/bockelman/zhan0915/root/core/unix/src/TUnixSigHandling.cxx:816
#3  0x00007f8b05fa450c in TUnixSigHandling::DispatchSignals (this=0x1534f70, sig=kSigSegmentationViolation) at /home/bockelman/zhan0915/root/core/unix/src/TUnixSigHandling.cxx:552
#4  <signal handler called>
#5  0x0000003fa1e811a1 in __strlen_sse2 () from /lib64/libc.so.6
#6  0x00007f8b0255f728 in length (__s=0x0) at /opt/rh/devtoolset-2/root/usr/include/c++/4.8.2/bits/char_traits.h:259
#7  assign (__s=0x0, this=0x7ffd70966c40) at /opt/rh/devtoolset-2/root/usr/include/c++/4.8.2/bits/basic_string.h:1131
#8  operator= (__s=0x0, this=0x7ffd70966c40) at /opt/rh/devtoolset-2/root/usr/include/c++/4.8.2/bits/basic_string.h:555
#9  TCling::TCling (this=0x1594620, name=<optimized out>, title=<optimized out>) at /home/bockelman/zhan0915/root/core/meta/src/TCling.cxx:1068
#10 0x00007f8b025602ae in CreateInterpreter (interpLibHandle=<optimized out>) at /home/bockelman/zhan0915/root/core/meta/src/TCling.cxx:578
#11 0x00007f8b05e2daa8 in TROOT::InitInterpreter (this=0x7f8b062a5520 <ROOT::Internal::GetROOT1()::alloc>) at /home/bockelman/zhan0915/root/core/base/src/TROOT.cxx:1821
#12 0x00007f8b05e2de26 in ROOT::Internal::GetROOT2 () at /home/bockelman/zhan0915/root/core/base/src/TROOT.cxx:363
#13 0x00007f8b05f03845 in TApplication::TApplication (this=0x1582e40, appClassName=0x401350 "Rint", argc=0x7ffd70968f4c, argv=0x7ffd70969048, numOptions=0) at /home/bockelman/zhan0915/root/core/base/src/TApplication.cxx:144
#14 0x00007f8b05ad1b71 in TRint::TRint (this=0x1582e40, appClassName=<optimized out>, argc=<optimized out>, argv=<optimized out>, options=<optimized out>, numOptions=<optimized out>, noLogo=false) at /home/bockelman/zhan0915/root/core/rint/src/TRint.cxx:144
#15 0x0000000000400fee in main (argc=3, argv=0x7ffd70969048) at /home/bockelman/zhan0915/root/main/src/rmain.cxx:27

@Axel-Naumann
Member

No idea how you managed to get there :-) but that crash should now be protected against in the master by f0e5295. You forgot to set ROOTSYS (and somehow ROOT didn't manage to determine it itself...)
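
For context, frames #5 to #9 of the trace show a std::string being assigned from a null pointer (`__s=0x0`), which is what happens when an unset environment variable is used without a check. The thread does not show what commit f0e5295 changed, so the following is only a generic sketch of the usual guard; `GetEnvOrEmpty` is a hypothetical helper name.

```cpp
#include <cstdlib>
#include <string>

// Returns the value of the environment variable `name`, or an empty string
// if it is not set. std::getenv returns a null pointer for unset variables,
// and constructing or assigning a std::string from null is undefined.
std::string GetEnvOrEmpty(const char *name) {
   const char *value = std::getenv(name);
   return value ? std::string(value) : std::string();
}
```

A caller can then test the result with `empty()` and fall back to another way of locating the installation.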

@bbockelm
Contributor

Hi Zhe,

Just sent a pull request to your branch that merges master into this branch (makes things a bit easier to review) and does a small touchup (add a timeout when waiting for GDB to protect against potential issues).

Everything looks good to me - this is probably ready to go forward (do please continue to work on the unit test in parallel).

Brian
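
The timeout mentioned above (waiting for GDB without blocking forever) can be illustrated with a bounded read on a file descriptor. This is a sketch under assumptions, not the code from the touch-up commit: `ReadWithTimeout` is a hypothetical helper, loosely modelled on the `SignalSafeRead(fd, inbuf, timeout, len)` signature visible in the stack trace, and it reports a timeout as `-1` with `errno` set to `ETIMEDOUT`.

```cpp
#include <cerrno>
#include <cstddef>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

// Reads up to `len` bytes from `fd`, waiting at most `timeoutMs` milliseconds
// for data to arrive. Returns the number of bytes read, 0 on end of file, or
// -1 on error; a timeout is reported as -1 with errno == ETIMEDOUT.
// Only poll() and read() are used, both of which POSIX lists as
// async-signal-safe.
ssize_t ReadWithTimeout(int fd, char *buf, std::size_t len, int timeoutMs) {
   struct pollfd pfd;
   pfd.fd = fd;
   pfd.events = POLLIN;
   pfd.revents = 0;

   int ready;
   do {
      // Note: retrying after EINTR restarts the full timeout in this sketch.
      ready = poll(&pfd, 1, timeoutMs);
   } while (ready < 0 && errno == EINTR);

   if (ready < 0)
      return -1;
   if (ready == 0) {
      errno = ETIMEDOUT;
      return -1;
   }

   ssize_t n;
   do {
      n = read(fd, buf, len);
   } while (n < 0 && errno == EINTR);
   return n;
}
```

Passing `-1` as `timeoutMs` makes poll() wait indefinitely, which matches the `timeout=-1` call seen in the earlier trace.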

@bbockelm
Contributor

@pcanal - I'm starting to poke at this one now. I'd like to get this closed out.

@bbockelm
Contributor

@zzxuanyuan - I only found a small issue, from using newer versions of CMake. Can you cherry-pick this patch:

bbockelm@8841ee6

Merges cleanly back into master; everything seems to work well.

@pcanal
Member

pcanal commented Sep 12, 2016

@bbockelm I had started working on the merge but got distracted. The branch as it was then did not build on MacOS (I have a fix for that).

I am also still uncomfortable with

  1. I think the new Signal class should focus on the signal handling itself and delegate the auxiliary functions (getenv, etc.) to gSystem.

which zzxuanyuan explained to some extent; he found an odd failure when attempting to improve the situation, so I wanted to take a closer look (to see what the problem was and/or whether there was a better way).

@bbockelm
Contributor

Can you send the fix to @zzxuanyuan so he can also cherry-pick it along with the comment I left above?

@pcanal
Member

pcanal commented Sep 13, 2016

See pcanal@0108ec6

@zzxuanyuan
Contributor Author

@pcanal @bbockelm I merged your commits and updated this branch with upstream.

#pragma link C++ class TStringToken;
#pragma link C++ class TSubString;
#pragma link C++ class TSysEvtHandler;
#pragma link C++ class TSigHandling+;
Member

Is there a reason you preferred TSigHandling over TSigHandler?

Contributor Author

There exists TSignalHandler (https://github.com/root-project/root/blob/8f66431755137822f726c4594b680997324439b5/core/base/inc/TSysEvtHandler.h). I feel TSigHandler is so close to TSignalHandler that it could be difficult to distinguish.

Member

I personally find this very confusing. What is the difference between the two?

Member

What led you to create a dictionary for the class(es)?

@bbockelm
Contributor

@zzxuanyuan - can you resolve conflicts noted above?

@phsft-bot

Can one of the admins verify this patch?

@pcanal
Member

pcanal commented Mar 22, 2017

Don't worry about this. For better or worse, this pull request is on my desk in a half-merged, half-reworked state and needs a bit more work on my side. I am keeping this pull request open until I am done.

@pcanal
Member

pcanal commented Mar 28, 2017

Closing while this is being reworked.

@pcanal closed this Mar 28, 2017

@bellenot
Member

Sorry for the naive question, but what would be the difference between TSignalHandler and TSigHandling? Aren't they the same? Why a new class (and a new global)?

@zzxuanyuan
Contributor Author

TSignalHandler is the name of the original implementation in ROOT. TSigHandling is the new implementation, which supports thread-safe functions.

Zhe
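
To make the distinction more concrete, here is a minimal sketch of what a handler restricted to safe operations usually looks like. It is an illustration under assumptions, not code from TSigHandling: `SketchHandler`, `InstallSketchHandler` and `gLastSignal` are hypothetical names, and the sketch shows the general POSIX pattern of touching only a `sig_atomic_t` flag and calling `write()` inside the handler.

```cpp
#include <csignal>
#include <signal.h>
#include <unistd.h>

// Written only from the handler; read by normal code afterwards.
volatile std::sig_atomic_t gLastSignal = 0;

// The handler avoids printf(), iostreams and memory allocation, none of
// which are safe to call from a signal handler. write() is safe.
extern "C" void SketchHandler(int sig) {
   gLastSignal = sig;
   static const char msg[] = "signal received\n";
   ssize_t ignored = write(STDERR_FILENO, msg, sizeof(msg) - 1);
   (void)ignored;
}

// Installs SketchHandler for `sig`; returns true on success.
bool InstallSketchHandler(int sig) {
   struct sigaction sa = {};
   sigemptyset(&sa.sa_mask);
   sa.sa_flags = 0;
   sa.sa_handler = SketchHandler;
   return sigaction(sig, &sa, nullptr) == 0;
}
```

Anything heavier, such as producing a stack trace, is then done outside the handler proper, for example by a helper thread or process that the handler wakes up.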

@pcanal
Member

pcanal commented Sep 25, 2017

See #1053
