-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Extend templation in GenVector to better support vectorised Vc types. #394
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extend templation in GenVector to better support vectorised Vc types. #394
Conversation
… on their scalar type
…ix support for Vc types
…ests. Start with a limited scope and just update Plane3D, Translation3D and Transformation3D
…ort vector Scalar types
| Scalar Mag2() const { return fX*fX + fY*fY; } | ||
| Scalar R() const { return std::sqrt( Mag2());} | ||
| Scalar Phi() const { return (fX==0 && fY==0) ? 0.0 : atan2(fY,fX);} | ||
| Scalar R() const { using namespace std; return sqrt( Mag2()); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, what is the advantage of the using namespace std; pattern, instead of using return std::sqrt...?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is required to support Vc types. For Vc types you need to use the versions of sqrt etc. shipped as part of that library. Using std::sqrt does not allow this. The namespace trick means you use the std:: versions when appropriate, but also allows other implementations when not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh, thanks for explaining! Neat and subtle at the same time. Btw, there is an inherited extra space after the open paren.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw, it's @lmoneta's call but I'd prefer to have a short comment saying what the pattern does, or maybe a #define R__enable_vc_types using namespace std; This would make this easier to read by an unarmed eye.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dislike preprocessor directives unless absolutely required, and I don't think here adding one really helps. To my eye the pattern as is is clear, but as you say its up to @lmoneta . I would be OK with adding some sort of comment somewhere ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a comment would be as clear and more concise than a ifdef
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi,
I don't understand this. Are you sure it is needed for Vc ? In the past I have seen Vc replacing automatically its vectorised function implementations in the std namespace
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am sure I ran into some issue that lead me to doing this... Also note the templation is in principle there for other types than Vc. This change in principle supports types that do not extend the math functions under std::, as well as those that do.
| *begin++ = a; | ||
| *begin++ = b; | ||
| *begin = c; | ||
| Scalar a,b,c = Scalar(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't that leave a and b uninitialized?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quite right.... Will fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix pushed.
|
enable_if implementations updated. |
| */ | ||
| template <typename SCALAR = Scalar> | ||
| typename std::enable_if<std::is_arithmetic<SCALAR>::value, DisplacementVector3D>::type Unit() const | ||
| template <typename SCALAR = Scalar, typename std::enable_if<std::is_arithmetic<SCALAR>::value>::type * = nullptr> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you transform this:
template <typename SCALAR = Scalar, typename std::enable_if<std::is_arithmetic<SCALAR>::value>::type * = nullptr>
into
template <typename SCALAR = Scalar, typename = std::enable_if<std::is_arithmetic<SCALAR>::value>::type>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Forget my last comment... having a bad comment day).
You are missing one typename, but yes, I can make that change..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, it fails to compile when there are two methods. i.e.
/**
return unit vector parallel to this (scalar)
*/
template <typename SCALAR = Scalar, typename = typename std::enable_if<std::is_arithmetic<SCALAR>::value>::type>
DisplacementVector3D Unit() const
{
const auto tot = R();
return tot == 0 ? *this : DisplacementVector3D(*this) / tot;
}
/**
return unit vector parallel to this (vector)
*/
template <typename SCALAR = Scalar, typename = typename std::enable_if<!std::is_arithmetic<SCALAR>::value>::type>
DisplacementVector3D Unit() const
{
SCALAR tot = R();
tot(tot == SCALAR(0)) = SCALAR(1);
return DisplacementVector3D(*this) / tot;
}
gives
In file included from /Users/chris/Projects/ROOT/source/math/genvector/src/Boost.cxx:16:
In file included from /Users/chris/Projects/ROOT/build/include/Math/GenVector/Boost.h:20:
In file included from /Users/chris/Projects/ROOT/build/include/Math/GenVector/LorentzVector.h:23:
/Users/chris/Projects/ROOT/build/include/Math/GenVector/DisplacementVector3D.h:351:28: error: class member cannot be redeclared
DisplacementVector3D Unit() const
^
/Users/chris/Projects/ROOT/build/include/Math/GenVector/DisplacementVector3D.h:341:28: note: previous declaration is here
DisplacementVector3D Unit() const
^
I'll stick with what is currently there and works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, indeed.
… expect speed up based on the SIMD register size is observed (with 10% safety factor)
|
I have extended the Vc GenVector test as requested to include a speed test. The test asserts that, for an equal number of 'photons' the processing time for the vectorised test should be a factor of So indeed for the default (SSE4) vectorisation level with double a factor of 2 is observed. Chris |
|
@phsft-bot build! Thanks for the updated test Chris! |
|
running the test a few times it seems 10% is too tight a margin. The fluctuations from just rerunning the test a few times can be quite sizeable. To avoid false positives I have extended the margin to 25%. |
|
Hm..., now I am confused, according to the test results, the vectorisation on slc6 and centos7 has a negative effect... |
|
@phsft-bot build! |
|
why do you say that ? |
|
I know for a fact the results are as expected on SLC6. The results in the talk I posted where run on this platform. |
|
I am puzzled why this happens, do you have an idea (the invariant is the gcc49). You say results on SLC6 are expected: you mean the ones we observed in the jenkins build? |
|
which jenkins build ? Note that to get good SIMD results requires the test to run on a machine that properly supports SIMD instructions. As I mention in my talk if the machine you use is a VM, the results could be way off. |
|
The only Vc results I really trust are those run on a 'real' CPU, not a VM. |
|
Ok, then we should not assert failure from in test case. Let's just print out the information for reference. |
|
What sort of a machine do the tests run on ? My bet is this is not a real CPU that properly supports SIMD instructions (SSE4 in this case) or a VM machine where the timings cannot be trusted. Sorry, but if you want a timing test for the Vc types you have to run this on a machine where the SIMD types are properly supported, and looking at the results above I would say this is currently not the case. Chris |
|
I agree then. If you run the jenkins tests on VM then the timing tests cannot be trusted. I will disable the failure mode from them. |
|
Thanks! With that I think we are ready to land this nice piece of work given we get green light from jenkins. |
…on a VM with poort SIMD support
|
I've turned off the failure return value from the timing tests, as if they are run from VMs the results cannot really be trusted. |
|
@phsft-bot build! |
|
Finally merged! Thanks for the contribution. I am looking forward to reviewing more PR going in this direction! |
|
Many thanks for your help as well. I may well return to some of the other genvector classes at some point, the rotations for instance, as I think I will eventually have a use for them. That will be more work though... Hopefully next time things will be smoother as I now know a lot more about what is expected in a ROOT PR... ;) |
vgvassilev
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some post-commit review comments. Could you address them please?
|
|
||
| template <typename POINT, typename VECTOR, typename FTYPE> | ||
| inline | ||
| typename std::enable_if<std::is_arithmetic<typename POINT::Scalar>::value && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cjones051073 it seems that we omitted transforming this code here as a default template parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, I over looked these instances of enable_if. will change them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I guess I should open a new PR for this small fix, as this one is closed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. see PR #427.
|
|
||
| template <typename POINT, typename VECTOR, typename FTYPE> | ||
| inline | ||
| typename std::enable_if<!std::is_arithmetic<typename POINT::Scalar>::value && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And here.
| } | ||
|
|
||
| template <typename POINT, typename VECTOR, typename PLANE> | ||
| inline typename std::enable_if<std::is_arithmetic<typename POINT::Scalar>::value && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And here.
| } | ||
|
|
||
| template <typename POINT, typename VECTOR, typename PLANE, typename FTYPE = typename POINT::Scalar> | ||
| inline typename std::enable_if<!std::is_arithmetic<typename POINT::Scalar>::value && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And here.
This PR extends the type templation that was already present in some classes in the GenVector library to improve support for using vectorised (Vc) types. Specifically it :-
This PR is not complete, in that there are still some classes in GenVector that still do not support Vc types, as they are still not templated, such as the Rotation like transformations. It would be nice to add this at some point, but the code associated to these is more extensive (3DConversions.cxx for instance) and that will require some work.
My changes pass the built in ROOT tests. In addition I have prepared a simple test case for the Vc types (attached) that I have used to check the scalar and vector types give equivalent results. Tested on OS X with the compilation command
clang++ -O3 -mavx2 -mfma
root-config --cflags-I/Users/chris/Projects/Vc/install/include main.cpproot-config --libs-lGenVector /Users/chris/Projects/Vc/install/lib/libVc.amain-cpp.txt