CUDA runtime_call test fails: stub handler invoked despite CUDA_ENABLED build #1687

The test `runtime_call with CUDA client (device↔host copies)` at `exla/test/exla/defn/runtime_call_test.exs:34` fails with:

```
** (RuntimeError) EXLA was not compiled with CUDA support.
This error means your EXLA compilation is out of sync with your libexla.so NIF.
```

The failure persists even after a clean rebuild (`make clean && mix compile`) with `nvcc` available and `-DCUDA_ENABLED` correctly set in CFLAGS.

## Analysis

The `.o` file is compiled correctly: the `_impl` symbol is present and the `_stub` symbol is absent. The final `libexla.so`, however, contains both symbols, and the stub wins the XLA FFI handler registration.

```
nm cache/libexla.so | grep cuda_stub
00000000000648d0 t ...exla_runtime_callback_cuda_stub  # should not exist
```
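To make the same check repeatable across the object file and the final shared library, a small helper along these lines may be useful. This is only a sketch: `report_cuda_handlers` is a hypothetical name, and the `cuda_impl` pattern is an assumption made by analogy with the `cuda_stub` symbol shown above, so adjust both patterns to the real symbol names.

```bash
# Hypothetical helper: reads `nm` output on stdin and reports which
# runtime-callback CUDA handlers appear in it.
# The 'cuda_impl' pattern is an assumption; 'cuda_stub' comes from the
# nm output in this issue.
report_cuda_handlers() {
  symbols=$(cat)
  has_impl=no
  has_stub=no
  if printf '%s\n' "$symbols" | grep -q 'cuda_impl'; then has_impl=yes; fi
  if printf '%s\n' "$symbols" | grep -q 'cuda_stub'; then has_stub=yes; fi
  echo "impl=$has_impl stub=$has_stub"
}

# Example invocation (run from the exla directory):
#   nm cache/libexla.so | report_cuda_handlers
```

On a correctly linked CUDA build the expected report would be `impl=yes stub=no`. A report with `stub=yes` for `libexla.so` but not for the object file would point at the link step rather than the compile step.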

## Reproduction

```bash
cd exla
make clean && mix compile
mix test test/exla/defn/runtime_call_test.exs:34
```

## Environment

- GPU: NVIDIA GeForce RTX 5090 (SM 12.0a)
- CUDA 12.8 / cuDNN 9.13.0
- XLA 0.10.0 / EXLA 0.11.0

Introduced in PR #1677. Fails consistently on `main`.

