The test runtime_call with CUDA client (device↔host copies) at exla/test/exla/defn/runtime_call_test.exs:34 fails with:
** (RuntimeError) EXLA was not compiled with CUDA support.
This error means your EXLA compilation is out of sync with your libexla.so NIF.
The failure persists even after a clean rebuild (make clean && mix compile) with nvcc available and -DCUDA_ENABLED correctly set in CFLAGS.
Analysis
The .o file is compiled correctly: _impl is present and _stub is absent. The final libexla.so, however, contains both symbols, and the stub wins the XLA FFI handler registration:
nm cache/libexla.so | grep cuda_stub
00000000000648d0 t ...exla_runtime_callback_cuda_stub # should not exist
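The symbol check above can be made repeatable with a small helper. This is a minimal sketch, not part of EXLA: the function name classify_cuda_symbols is hypothetical, and the cuda_impl pattern is an assumption about how the real handler symbol is named (only the cuda_stub suffix appears in the output above). It reads nm output on stdin and reports which variants are present.

```shell
# Hypothetical helper (not part of EXLA): reads `nm` output on stdin and
# reports which variants of the CUDA runtime-callback handler it contains.
# The `cuda_impl` pattern is an assumption; adjust it to the real symbol name.
classify_cuda_symbols() {
  symbols=$(cat)   # slurp all of stdin
  impl=no
  stub=no
  # `case` glob patterns match across newlines, so one test per variant suffices.
  case "$symbols" in *cuda_impl*) impl=yes ;; esac
  case "$symbols" in *cuda_stub*) stub=yes ;; esac
  echo "impl=$impl stub=$stub"
}

# Usage against the real artifact:
#   nm cache/libexla.so | classify_cuda_symbols
```

Given the analysis above, a correct CUDA build would presumably report impl=yes stub=no. Running the same helper on the individual .o files may help narrow down which object contributes the stub to the final link.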
Reproduction
cd exla
make clean && mix compile
mix test test/exla/defn/runtime_call_test.exs:34
Environment
- GPU: NVIDIA GeForce RTX 5090 (SM 12.0a)
- CUDA 12.8 / cuDNN 9.13.0
- XLA 0.10.0 / EXLA 0.11.0
Introduced in PR #1677. Fails consistently on main.