Mark functions that take the address of a __global__ function as host… #835

jlebar · 2016-09-15T20:36:44Z

…-only.

clang is about to get pickier about disallowing references to things
from host+device code when it won't work on either host or device.

clang doesn't currently support launching kernels from the device side,
so these __host__ __device__ functions that take a function pointer to
__global__ functions when __CUDA_RDC__ is not defined are no good.

Originally landed as 98b4e16, reverted in 884d199 because the
condition was wrong. See
#831 (review)


3gx · 2016-09-15T20:42:47Z

LGTM, but let me run a unit tester with -rdc=true to verify the correctness. Will report here.

3gx · 2016-09-15T21:21:34Z

All tests pass.

jaredhoberock · 2016-09-15T21:33:18Z

Great, thanks!

jlebar · 2016-09-15T21:39:28Z

Thank you, folks!

3gx · 2016-09-16T14:42:53Z

Bugger. This change must be reverted. I only tested it with -rdc=true -arch=sm_35 or higher. It will fail otherwise because kernel invocation, or address taking, must not be guarded by __CUDA_ARCH__. I overlooked this aspect.

3gx · 2016-09-16T14:44:19Z

If it is guarded by __CUDA_ARCH__, device compilation will never specialize the kernel. Kaboom!
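
To make the failure mode concrete, here is a minimal sketch of the pattern being described. It assumes a hypothetical kernel template `launch_kernel` and functor `noop`; neither name comes from Thrust. The CUDA attributes are stubbed out so the snippet also builds as plain C++, in which case only the host branch is compiled.

```cpp
// Stub the CUDA attributes so this sketch also builds as plain C++.
#if !defined(__CUDACC__) && !defined(__CUDA__)
#define __global__
#define __host__
#define __device__
#endif

// Hypothetical functor and kernel template, named for illustration only.
struct noop {
  __host__ __device__ void operator()() const {}
};

template <typename F>
__global__ void launch_kernel(F f) { f(); }

template <typename F>
using kernel_ptr = void (*)(F);

// Problematic pattern: the only reference to launch_kernel<F> is hidden
// from the device pass, which is the pass that defines __CUDA_ARCH__.
template <typename F>
__host__ __device__ constexpr kernel_ptr<F> guarded_kernel_address()
{
#ifndef __CUDA_ARCH__
  return &launch_kernel<F>;  // host pass: launch_kernel<F> is instantiated
#else
  return nullptr;            // device pass: launch_kernel<F> is never named
#endif
}
```

With a guard like this, the device pass contains no reference to `launch_kernel<noop>`, so the kernel template is not specialized there and the host has nothing to launch. That would be consistent with the "invalid device function" report in #837.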

jaredhoberock · 2016-09-16T15:27:36Z

Thanks for the analysis. It may be that this guard needs to be made clang-specific.

jlebar · 2016-09-16T17:17:45Z

This may be easier if you guys write the patch? I am happy to test it.

3gx · 2016-09-17T16:13:55Z

Wouldn't it be easier if clang did not disallow taking the address of a kernel in device code?

jlebar · 2016-09-17T16:51:05Z

> Wouldn't it be easier if clang did not disallow taking the address of a kernel in device code?

Possibly, but clang emphasizes being a sound compiler. :)

Taking the address of a function you cannot call isn't allowed in C++. For example, you can't take the address of a private function you don't have access to.

Indeed, taking addresses of functions from device code should probably be disallowed entirely, because indirect calls are not supported on the GPU. We're not there yet, but this is a step in that direction.
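
The access-control analogy above can be checked with a small sketch in standard C++. The class `widget` and both trait names are made up for illustration. The traits rely on access checking taking part in template argument substitution, which holds from C++11 onward.

```cpp
#include <type_traits>

// Hypothetical class with one private and one public static function.
class widget {
  static int secret() { return 42; }
public:
  static int open() { return 7; }
};

// Detects whether &T::secret is well-formed from an unrelated context.
template <typename T, typename = void>
struct can_take_secret_address : std::false_type {};

template <typename T>
struct can_take_secret_address<T, decltype(void(&T::secret))>
    : std::true_type {};

// Same detection for the public function, for contrast.
template <typename T, typename = void>
struct can_take_open_address : std::false_type {};

template <typename T>
struct can_take_open_address<T, decltype(void(&T::open))>
    : std::true_type {};
```

Writing `&widget::secret` directly outside the class is a compile error; the traits only turn that error into a value that can be inspected. The point of the analogy is that forming the address is rejected even though nothing is called.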

gnzlbg · 2016-10-09T10:37:41Z

Any progress on this?

jlebar · 2016-10-09T17:03:36Z

I am happy to write another patch, but at this point I'm pretty confused about what the guard should be.

We checked in the code that makes this fail in clang a few days ago.

andrewcorrigan · 2016-10-10T00:27:13Z

Can we please get this fixed as soon as possible? Would changing the guard to only disable the code in question for clang, while leaving it alone for nvcc, be acceptable to everyone?

#if !(defined(__clang__) && defined(__CUDA__)) && (!defined(__CUDA_ARCH__) || (defined(__CUDACC_RDC__) && __CUDA_ARCH__ >= 350))

3gx · 2016-10-10T00:47:00Z

Since clang doesn't support Dynamic Parallelism, a simple guard should suffice:

#if !(defined(__clang__) && defined(__CUDA__))

@jlebar Please submit PR and I will test it.
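
For comparison, the two guards proposed above can be placed side by side. This is only a sketch: the `SKETCH_*` macros and `guards_agree` are hypothetical names introduced here, and the conditions are copied from the two comments unchanged.

```cpp
// Guard proposed by andrewcorrigan: clang-specific, but under nvcc it still
// depends on __CUDA_ARCH__ and __CUDACC_RDC__.
#if !(defined(__clang__) && defined(__CUDA__)) && \
    (!defined(__CUDA_ARCH__) || (defined(__CUDACC_RDC__) && __CUDA_ARCH__ >= 350))
#define SKETCH_COMBINED_GUARD 1
#else
#define SKETCH_COMBINED_GUARD 0
#endif

// Guard proposed by 3gx: depends only on whether clang is compiling CUDA.
#if !(defined(__clang__) && defined(__CUDA__))
#define SKETCH_CLANG_ONLY_GUARD 1
#else
#define SKETCH_CLANG_ONLY_GUARD 0
#endif

// True when both guards enable (or both disable) the code in this pass.
constexpr bool guards_agree()
{
  return SKETCH_COMBINED_GUARD == SKETCH_CLANG_ONLY_GUARD;
}
```

Reading the conditions: in a plain host compile both guards are 1, and when clang compiles CUDA both are 0. They differ only in an nvcc device pass where `__CUDACC_RDC__` is undefined or `__CUDA_ARCH__` is below 350. There the combined guard is 0 while the clang-only guard stays 1, so only the second keeps the kernel reference visible regardless of `__CUDA_ARCH__`.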

jlebar mentioned this pull request Sep 15, 2016

Mark functions that take the address of a __global__ function as host… #831

Merged

jaredhoberock merged commit 62df72e into NVIDIA:master Sep 15, 2016

schiller-manuel mentioned this pull request Sep 16, 2016

thrust/master crashing with "invalid device function" when compiled with CUDA 7.5 #837

Closed

jaredhoberock mentioned this pull request Sep 16, 2016

Revert "Mark functions that take the address of a __global__ function as host…" #838

Merged

andrewcorrigan mentioned this pull request Oct 10, 2016

attempt to resolve #835 #840

Closed
