@xvallspl
Contributor

@xvallspl xvallspl commented Apr 7, 2017

No description provided.

@pcanal
Member

pcanal commented Apr 7, 2017

The commit seems to mix whitespace changes and functional changes, making it hard to identify the functional changes.

@xvallspl
Contributor Author

xvallspl commented Apr 7, 2017

Oh, no.

Sorry, my editor wasn't showing the space changes when in diff mode.

I'll fix it over the weekend.

@phsft-bot

Starting build on centos7/gcc49, mac1011/native, slc6/gcc49, slc6/gcc62, ubuntu14/native and CMake flags -Dccache=ON -Dimt=OFF

}
if (request == 1) {
request = strtol(argv[a+1], 0, 10);
if (request < kMaxLong && request >= 0) {
Member

Given that request is a long, isn't the first part always true? If I remember correctly, one has to check errno to figure out whether something went wrong in strtol.

Contributor Author

@xvallspl xvallspl Apr 11, 2017

Not necessarily.

From cpp reference:

  • If successful, an integer value corresponding to the contents of str is returned.
  • If the converted value falls out of range of corresponding return type, a range error occurs (setting errno to ERANGE) and LONG_MAX, LONG_MIN, LLONG_MAX or LLONG_MIN is returned.
  • If no conversion can be performed, 0 is returned.

Which means that this may be incorrect when no conversion can be performed. There are a couple more places where hadd was already doing that. I'll check.
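
A minimal sketch of the check discussed in this thread, assuming C++17 for std::optional. ParseNonNegativeLong is a hypothetical helper and not code from this PR. It uses the end pointer to catch the "no conversion" case, where strtol returns 0, and errno to catch the out-of-range case.

```cpp
#include <cerrno>
#include <cstdlib>
#include <optional>

// Hypothetical helper (not part of hadd): parse a non-negative long from a
// command-line argument and return std::nullopt on any failure.
std::optional<long> ParseNonNegativeLong(const char *text)
{
   if (text == nullptr) return std::nullopt;

   char *end = nullptr;
   errno = 0; // strtol sets errno on a range error but never clears it
   const long value = std::strtol(text, &end, 10);

   if (end == text) return std::nullopt;     // nothing consumed: the "no conversion" case
   if (*end != '\0') return std::nullopt;    // trailing characters, e.g. "12abc"
   if (errno == ERANGE) return std::nullopt; // out of range: LONG_MAX or LONG_MIN was returned
   if (value < 0) return std::nullopt;       // negative counts are rejected here

   return value;
}
```

With this shape, a caller can distinguish an explicit "0" from an argument that is not a number at all, which the bare return value of strtol cannot do.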

std::cout<<"hadd failed at the parallel stage"<<std::endl;
}
for(auto pf:partialFiles){
gSystem->Unlink(pf.c_str());
Member

It might be helpful for debugging to have a way to disable the unlinking of the intermediary files.

Contributor Author

It could be useful, yes.
How? Adding another option? This will only be used in the multiprocess case, so maybe a combination of the two? Something like -jdbg?

if(multiproc){
for(auto i = 0; (i*step)<filesToProcess; i++) {
std::stringstream buffer;
buffer <<"partial"<<i<<".root";
Member

If I read this correctly, the intermediary file names are "partial1.root", "partial2.root", and so on. If so, we may want to enhance the names so that two hadd processes can run in the same directory at the same time (at the moment this is not possible).

Contributor Author

Makes sense. Will look into that.
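
One way the naming could be made collision-resistant, sketched under assumptions: MakeRunId and PartialFileName are hypothetical names, and the random token is only an illustration of "one identifier per hadd execution", not necessarily what the PR ends up doing. A random token makes clashes unlikely but does not strictly guarantee uniqueness.

```cpp
#include <random>
#include <sstream>
#include <string>

// Hypothetical: build one identifier per hadd execution.
// Two draws from std::random_device are concatenated as hex digits.
std::string MakeRunId()
{
   std::random_device rd;
   std::ostringstream id;
   id << std::hex << rd() << rd();
   return id.str();
}

// Hypothetical: name of the index-th partial file for a given execution.
// Every partial file of one run shares the same runId suffix.
std::string PartialFileName(const std::string &runId, int index)
{
   std::ostringstream name;
   name << "partial" << index << "_" << runId << ".root";
   return name.str();
}
```

The identifier would be generated once at start-up and passed to every call, so that two concurrent runs in the same directory write to different files.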

@xvallspl xvallspl changed the title Allow hadd to run in multiprocess mode [WIP} Allow hadd to run in multiprocess mode Apr 11, 2017
@xvallspl xvallspl changed the title [WIP} Allow hadd to run in multiprocess mode [WIP] Allow hadd to run in multiprocess mode Apr 11, 2017
Appends the same unique identifier to the name of the partial files
within the same hadd execution.
@dpiparo
Member

dpiparo commented Apr 11, 2017

Hi @xvallspl, nice. Are the partial files now in /tmp or an equivalent directory? Do we have a switch to skip unlinking them after merging and to print their names for debugging purposes?

@xvallspl
Contributor Author

xvallspl commented Apr 11, 2017 via email

@pcanal
Member

pcanal commented Apr 11, 2017

I would not overload -v with a change in behavior. I agree that -jdbg is a good option. -dbg might be even better (i.e. even if at the moment it only changes the parallel behavior, it might also change behavior in the scalar case in the future).

@pcanal
Member

pcanal commented Apr 11, 2017

Are now the partial files in the /tmp or equivalent directory?

Because the intermediary files could be large or numerous, if we rely on a shared directory we ought to create them in a user-specific subdirectory.

However, using a shared directory may cause problems in itself. On some systems /tmp is small and /var/tmp should be used (or maybe simply use whatever TMPDIR says).

All in all, it might even be better to use a (subdirectory of the) output directory, which is guaranteed to be writable by the user (otherwise the output could not be written at all). However, whether it has enough space for twice the final output size is a (small) concern.
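
The preference order described here could be sketched as follows. ChooseWorkDir and its requested parameter are hypothetical, the "/tmp" fallback is an assumption for POSIX-style systems, and the code uses only the standard library. Inside ROOT, TSystem::TempDirectory() would be the natural place to get a platform-appropriate default instead.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical selection logic, kept independent of the environment so it
// is easy to test: an explicit request wins, then TMPDIR, then a fallback.
std::string ChooseWorkDir(const std::string &requested, const char *tmpdirEnv)
{
   if (!requested.empty()) return requested;
   if (tmpdirEnv != nullptr && *tmpdirEnv != '\0') return tmpdirEnv;
   return "/tmp"; // assumed last-resort default
}

// Thin wrapper that reads TMPDIR from the real environment.
std::string ChooseWorkDirFromEnvironment(const std::string &requested)
{
   return ChooseWorkDir(requested, std::getenv("TMPDIR"));
}
```

A user-specific subdirectory, as suggested above, would then be created underneath whatever directory this returns.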

@martinmine
Contributor

@phsft-bot build!

@dpiparo
Member

dpiparo commented Apr 11, 2017

@pcanal: In a previous incarnation of this PR I suggested using TSystem to get the right tmp dir. With EOS, AFS, or other shared file systems, using the local dir can incur a huge performance penalty.

@xvallspl
Contributor Author

Should I add a -d option for specifying the work directory and make it $TMP by default?

@pcanal
Member

pcanal commented Apr 11, 2017

Should I add a -d option for specifying the work directory and make it $TMP by default?

Good idea.

In a previous incarnation of this pr I suggested to use TSystem to get the right tmp dir. The local dir, in case of eos or afs or other shared file systems can become a huge performance penalty.

Good point indeed. I wonder if we have a means of knowing whether the destination directory is local (aka 'fast') or not. We can find out whether the file URL is on the local node or not (via TFile::GetType) but this does not tell us whether it is on afs or not.

This avoids the deletion of the intermediate files.
@xvallspl
Contributor Author

@phsft-bot build!

@phsft-bot

Starting build on centos7/gcc49, mac1011/native, slc6/gcc49, slc6/gcc62, ubuntu14/native and CMake flags -Dccache=ON -Dimt=OFF

@dpiparo dpiparo merged commit 09ed3f3 into root-project:master Apr 26, 2017
@dpiparo
Member

dpiparo commented Apr 28, 2017

@xvallspl: I reverted the PR. It breaks classic builds, as TProcessExecutor is not available there. Perhaps we can take this opportunity to add a special case for the single-threaded mode, also in the CMake builds, that is identical to the previous version of hadd?

5 participants