Hi,
I’d like clarification on using OpenBLAS on a 2×192-core AmpereOne (384 cores total), AArch64 Linux.
Planned build:
make TARGET=ARMV8 USE_OPENMP=1 NUM_THREADS=384 BIGNUMA=1
The docs mention a default limit of 256 cores, with BIGNUMA=1 raising it to 1024 cores, but this seems to be described mainly for x86_64.
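To make the intent concrete, here is a minimal sketch of how I would pick the flags per machine. The helper name openblas_make_args is hypothetical, and the 256-core threshold is only my reading of the docs, not something I have confirmed for AArch64.

```shell
#!/bin/sh
# Hypothetical helper: print the make arguments for a given core count.
# Assumption: BIGNUMA=1 is only needed above 256 cores (my reading of the
# docs; whether this holds on AArch64 is exactly what I am asking about).
openblas_make_args() {
    cores="$1"
    args="TARGET=ARMV8 USE_OPENMP=1 NUM_THREADS=${cores}"
    if [ "$cores" -gt 256 ]; then
        args="${args} BIGNUMA=1"
    fi
    printf '%s\n' "$args"
}

# Intended usage on the build host (not executed here):
#   make $(openblas_make_args "$(nproc)")
```

On the 384-core machine this yields the same command line as the planned build above.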
Questions:
- On AmpereOne (AArch64), is BIGNUMA=1 NUM_THREADS=384 a supported/validated configuration for a single process?
- Or should AArch64 builds be treated as limited to 256 threads per process (keep NUM_THREADS <= 256 and use multiple MPI ranks)?
- Is BIGNUMA intended to be x86_64-only, or is it expected to work on Arm as well?