@arielzn
Contributor

@arielzn arielzn commented Sep 23, 2025

(created using eb --new-pr)

With the modification of the bundle.py easyblock to include the tests of components, introduced in eb v5.1.1 (easyblocks PR#3748), we started hitting errors when testing the install of FlexiBLAS 3.3.1 on zen4/zen5.

It segfaults as:

 Running: /dev/shm/alozano/ebbuild/FlexiBLAS/3.3.1/GCC-12.3.0-test/easybuild_obj/test/lapack-3.11.0/LIN/xlintstz
ARGS= OUTPUT_FILE;/dev/shm/alozano/ebbuild/FlexiBLAS/3.3.1/GCC-12.3.0-test/easybuild_obj/test/lapack-3.11.0/ztest.out;ERROR_FILE;/dev/shm/alozano/ebbuild/FlexiBLAS/3.3.1/GCC-12.3.0-test/easybuild_obj/test/lapack-3.11.0/ztest.out.err;INPUT_FILE;/dev/shm/alozano/ebbuild/FlexiBLAS/3.3.1/GCC-12.3.0-test/flexiblas-3.3.1/test/lapack-3.11.0/ztest.in
Test OUTPUT:

Test ERROR:

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0x14e1021fe5af in ???
#1  0x14e101e53bcc in zgbtrf_
        at /dev/shm/alozano/ebbuild/FlexiBLAS/3.3.1/GCC-12.3.0-test/flexiblas-3.3.1/contributed/lapack-3.11.0/SRC/zgbtrf.f:429
#2  0x4225a1 in zchkgb_
        at /dev/shm/alozano/ebbuild/FlexiBLAS/3.3.1/GCC-12.3.0-test/flexiblas-3.3.1/test/lapack-3.11.0/LIN/zchkgb.f:466
#3  0x41c26b in zchkaa
        at /dev/shm/alozano/ebbuild/FlexiBLAS/3.3.1/GCC-12.3.0-test/flexiblas-3.3.1/test/lapack-3.11.0/LIN/zchkaa.F:488
#4  0x40844e in main
        at /dev/shm/alozano/ebbuild/FlexiBLAS/3.3.1/GCC-12.3.0-test/flexiblas-3.3.1/test/lapack-3.11.0/LIN/zchkaa.F:1271

CMake Error at /dev/shm/alozano/ebbuild/FlexiBLAS/3.3.1/GCC-12.3.0-test/flexiblas-3.3.1/test/lapack-3.11.0/runtest.cmake:51 (message):
  Test
  /dev/shm/alozano/ebbuild/FlexiBLAS/3.3.1/GCC-12.3.0-test/easybuild_obj/test/lapack-3.11.0/LIN/xlintstz
  returned Segmentation fault

        Start  93: LAPACK-xlintstrfz_ztest_rfp_in

Applying the -fno-tree-vectorize option for the netlib LAPACK build as proposed in this PR fixes the issue.

We've hit this kind of issue in the past for LAPACK code, as in #19280.

I don't know whether we should disable tree-vectorize fully, as proposed with the patch, or go for a more targeted fix by patching the LAPACK code, as was done in #20745 and #16406.

@arielzn
Contributor Author

arielzn commented Sep 23, 2025

I guess @bartoldeman's input on this would be nice.

For more context, I tried backporting FlexiBLAS 3.4.4 (which builds and tests fine with tree-vectorize on 2024a) to 2023a, and it also fails the tests with the same kind of errors.

So the issue does not seem to be the FlexiBLAS/LAPACK version, but that the GCC versions in 2023a and 2023b are not producing proper vectorized builds of LAPACK code for zen4/zen5.

@arielzn arielzn changed the title add patches for FlexiBLAS v3.3.1 to disable -ftree-vectorize for netlib LAPACK add patch for FlexiBLAS v3.3.1 to disable -ftree-vectorize for netlib LAPACK Sep 23, 2025
@boegel boegel changed the title add patch for FlexiBLAS v3.3.1 to disable -ftree-vectorize for netlib LAPACK add patch for FlexiBLAS v3.3.1 to disable -ftree-vectorize for netlib LAPACK Sep 23, 2025
@boegel boegel added this to the next release (5.1.2) milestone Sep 23, 2025
@boegel boegel added bug fix and removed change labels Sep 23, 2025
@boegel boegel requested a review from bartoldeman September 23, 2025 17:49
@bartoldeman
Contributor

I'll have to look at this carefully. I'd prefer a patch, but using -ffp-contract=off may do the trick more cheaply: FMA is where the differences often come from, not vectorization, which (modulo compiler bugs) gives the same result. GCC does not do reductions using vectors by default, which would also affect the result; for that you would need -fassociative-math, which is off by default.

So please try -ffp-contract=off for now.
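
To illustrate why FMA contraction can change floating-point results, here is a minimal Python sketch. It is not taken from LAPACK or FlexiBLAS: the helper `fused_multiply_add` is a hypothetical stand-in that emulates a hardware FMA by computing `a*b + c` exactly in rational arithmetic and rounding once, and the inputs are chosen only to make the effect visible.

```python
from fractions import Fraction


def fused_multiply_add(a, b, c):
    """Emulate an FMA: compute a*b + c exactly, then round once to a double.

    Hypothetical helper for illustration only; real FMA instructions are
    emitted by the compiler when floating-point contraction is allowed.
    """
    return float(Fraction(a) * Fraction(b) + Fraction(c))


a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

# Separate multiply and add: the exact product 1 - 2**-60 is rounded to 1.0
# before the addition, so the tiny term is lost.
unfused = a * b + c

# Fused: the exact product is kept until the single final rounding,
# so the tiny term survives.
fused = fused_multiply_add(a, b, c)

print(unfused)
print(fused)
```

The two results differ only in the last rounding step, which is the kind of difference that can push a LAPACK test over its tolerance. It would not by itself explain a segfault.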

@bartoldeman
Contributor

Reference-LAPACK/lapack#1033 disabled FMAs in one place using strategic parentheses.
But it's not in any released FlexiBLAS yet.

@boegel
Member

boegel commented Sep 23, 2025

@bartoldeman I tried changing the patch to use -ffp-contract=off instead of -fno-tree-vectorize, and that doesn't fix the segfault. LAPACK-xlintstz_ztest_in still fails with:

Program received signal SIGSEGV: Segmentation fault - invalid memory reference

@boegel
Member

boegel commented Sep 24, 2025

@boegelbot please test @ jsc-zen3
CORE_CNT=16
EB_ARGS="--installpath /tmp/$USER/pr23974"

@boegelbot
Collaborator

@boegel: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de

PR test command 'if [[ develop != 'develop' ]]; then EB_BRANCH=develop ./easybuild_develop.sh 2> /dev/null 1>&2; EB_PREFIX=/home/boegelbot/easybuild/develop source init_env_easybuild_develop.sh; fi; EB_PR=23974 EB_ARGS="--installpath /tmp/$USER/pr23974" EB_CONTAINER= EB_REPO=easybuild-easyconfigs EB_BRANCH=develop /opt/software/slurm/bin/sbatch --job-name test_PR_23974 --ntasks="16" ~/boegelbot/eb_from_pr_upload_jsc-zen3.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 8049

Test results coming soon (I hope)...

- notification for comment with ID 3326640773 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegelbot
Collaborator

Test report by @boegelbot
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
jsczen3c2.int.jsc-zen3.fz-juelich.de - Linux Rocky Linux 9.6, x86_64, AMD EPYC-Milan Processor (zen3), Python 3.9.21
See https://gist.github.com/boegelbot/0e1361d553bbd457232aac2af3686bb0 for a full test report.

@boegel
Member

boegel commented Sep 24, 2025

Test report by @boegel
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
node4210.shinx.os - Linux RHEL 9.6, x86_64, AMD EPYC 9654 96-Core Processor (zen4), Python 3.9.21
See https://gist.github.com/boegel/167a1893ff1563970ca19ebb5156d924 for a full test report.

@akesandgren
Contributor

I reiterate my standpoint from way back: LAPACK's test code and matgen MUST be built without optimization of any kind, i.e. with -O0, since that code isn't written to handle compiler optimizations.
That's where most of the problems with the LAPACK tests come from.
If you do that, you can most likely keep the vectorization flag.

@bartoldeman
Contributor

A little bit more digging:
the SIGSEGV was caused by an unaligned memory access with an aligned move (avx2 vmovapd used on 16-byte, not 32-byte aligned memory). It looks like an issue with GCC 12.3 and 13.2, it's not present with 13.3.

diff -ur flexiblas-3.3.1.orig/contributed/lapack-3.11.0/SRC/zgbtrf.f flexiblas-3.3.1/contributed/lapack-3.11.0/SRC/zgbtrf.f
--- flexiblas-3.3.1.orig/contributed/lapack-3.11.0/SRC/zgbtrf.f	2023-04-03 14:07:31.000000000 +0000
+++ flexiblas-3.3.1/contributed/lapack-3.11.0/SRC/zgbtrf.f	2025-09-24 15:29:42.138091646 +0000
@@ -169,8 +169,8 @@
       COMPLEX*16         TEMP
 *     ..
 *     .. Local Arrays ..
-      COMPLEX*16         WORK13( LDWORK, NBMAX ),
-     $                   WORK31( LDWORK, NBMAX )
+      COMPLEX*16, ALLOCATABLE :: WORK13( :, : ),
+     $                   WORK31( :, : )
 *     ..
 *     .. External Functions ..
       INTEGER            ILAENV, IZAMAX
@@ -232,6 +232,8 @@
 *
 *        Use blocked code
 *
+         ALLOCATE(WORK13(LDWORK, NBMAX), WORK31(LDWORK, NBMAX))
+*
 *        Zero the superdiagonal elements of the work array WORK13
 *
          DO 20 J = 1, NB

This patch allocates the relevant matrices dynamically, which bypasses the miscompiled code here, so that's a source-code-based workaround.
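
The alignment condition behind the crash described in this comment can be sketched in a few lines of Python. This is an illustration only: `is_aligned` is a hypothetical helper and the address is made up. The 256-bit form of `vmovapd` requires its memory operand to be 32-byte aligned, so an address that is only 16-byte aligned makes it fault.

```python
def is_aligned(address, boundary):
    """Return True if address is a multiple of boundary (a power of two)."""
    return address & (boundary - 1) == 0


# Hypothetical address of a local work array:
# a multiple of 16, but not a multiple of 32.
address = 0x7FFD1230

print(is_aligned(address, 16))  # True: acceptable for 16-byte aligned moves
print(is_aligned(address, 32))  # False: a 32-byte aligned move would fault here
```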

@arielzn
Contributor Author

arielzn commented Sep 24, 2025

> A little bit more digging: the SIGSEGV was caused by an unaligned memory access with an aligned move (avx2 vmovapd used on 16-byte, not 32-byte aligned memory). It looks like an issue with GCC 12.3 and 13.2, it's not present with 13.3.
>
> ...
> allocates the relevant matrices which bypasses the miscompiled code here, so that's a source code based workaround.

OK, I was just testing your other proposed patch. So we prefer this source code modification then, right?

@github-actions

Updated software FlexiBLAS-3.3.1-GCC-12.3.0-test.eb

Diff against FlexiBLAS-3.4.5-GCC-14.3.0.eb

easybuild/easyconfigs/f/FlexiBLAS/FlexiBLAS-3.4.5-GCC-14.3.0.eb

diff --git a/easybuild/easyconfigs/f/FlexiBLAS/FlexiBLAS-3.4.5-GCC-14.3.0.eb b/easybuild/easyconfigs/f/FlexiBLAS/FlexiBLAS-3.3.1-GCC-12.3.0-test.eb
index 834061dbcb..a7e26dd6c1 100644
--- a/easybuild/easyconfigs/f/FlexiBLAS/FlexiBLAS-3.4.5-GCC-14.3.0.eb
+++ b/easybuild/easyconfigs/f/FlexiBLAS/FlexiBLAS-3.3.1-GCC-12.3.0-test.eb
@@ -1,28 +1,25 @@
 easyblock = 'Bundle'
 
 name = 'FlexiBLAS'
-version = '3.4.5'
+version = '3.3.1'
+versionsuffix = '-test'
 
 homepage = 'https://gitlab.mpi-magdeburg.mpg.de/software/flexiblas-release'
 description = """FlexiBLAS is a wrapper library that enables the exchange of the BLAS and LAPACK implementation
 used by a program without recompiling or relinking it."""
 
-toolchain = {'name': 'GCC', 'version': '14.3.0'}
+toolchain = {'name': 'GCC', 'version': '12.3.0'}
 local_extra_flags = "-fstack-protector-strong -fstack-clash-protection"
 toolchainopts = {'pic': True, 'extra_cflags': local_extra_flags, 'extra_fflags': local_extra_flags}
 
 builddependencies = [
-    ('CMake', '4.0.3'),
-    ('Python', '3.13.5'),  # required for running the tests
-    ('BLIS', '2.0'),
+    ('CMake', '3.26.3'),
+    ('Python', '3.11.3'),  # required for running the tests
+    ('BLIS', '0.9.0'),
 ]
-if ARCH == 'x86_64':
-    builddependencies.append(('AOCL-BLAS', '5.1'))
-if ARCH == 'aarch64':
-    builddependencies.append(('NVPL', '25.5', '', SYSTEM))
 
 dependencies = [
-    ('OpenBLAS', '0.3.30'),
+    ('OpenBLAS', '0.3.23'),
 ]
 
 # note: first listed backend will be used as default by FlexiBLAS,
@@ -31,9 +28,7 @@ local_backends = ['OpenBLAS', 'BLIS']
 
 # imkl supplies its backend via the imkl module, not as a dependency
 if ARCH == 'x86_64':
-    local_backends.extend(['AOCL-BLAS', 'imkl'])
-if ARCH == 'aarch64':
-    local_backends.extend(['NVPL'])
+    local_backends.append('imkl')
 
 default_component_specs = {'start_dir': '%(namelower)s-%(version)s'}
 sanity_check_all_components = True
@@ -42,16 +37,16 @@ sanity_check_all_components = True
 components = [
     (name, version, {
         'source_urls':
-        ['https://gitlab.mpi-magdeburg.mpg.de/api/v4/projects/386/packages/generic/flexiblas-source/v%(version)s/'],
+        ['https://gitlab.mpi-magdeburg.mpg.de/api/v4/projects/386/packages/generic/flexiblas-source/v3.3.1/'],
         'sources': [SOURCELOWER_TAR_GZ],
-        'checksums': ['e819949c614c4968919b0ea4e873ab916d95cdc6943e9d091a78d209b7d6ed07'],
+        'checksums': ['bbeebf5e5a006924558fec43f49affbe1aaa4cbacfc472a9ff6066ffda142e18'],
         'backends': local_backends,
     }),
-    ('LAPACK', '3.12.1', {
+    ('LAPACK', '3.11.0', {
         'easyblock': 'CMakeMake',
         'source_urls': ['https://github.com/Reference-LAPACK/lapack/archive/'],
         'sources': ['v%(version)s.tar.gz'],
-        'checksums': ['2ca6407a001a474d4d4d35f3a61550156050c48016d949f0da0529c0aa052422'],
+        'checksums': ['4b9ba79bfd4921ca820e83979db76ab3363155709444a787979e81c22285ffa9'],
         'configopts': ('-DBUILD_SHARED_LIBS=ON -DUSE_OPTIMIZED_BLAS=ON -DLAPACKE=ON '
                        '-DUSE_OPTIMIZED_LAPACK=ON -DBUILD_DEPRECATED=ON '
                        '-DCMAKE_INSTALL_INCLUDEDIR=%(installdir)s/include/flexiblas'),
@@ -62,4 +57,9 @@ components = [
     }),
 ]
 
+# add patch to fix issue with tree vectorizer builds
+if ARCH == 'x86_64':
+    components[0][2]['patches'] = ['FlexiBLAS-3.3.1_lapack-prefer-vector-width-256.patch']
+    components[0][2]['checksums'].append('dc60dee5663987f4d23ae3a368217e0dc410993d18c12cc9aaa057567f762e63')
+
 moduleclass = 'lib'
Diff against FlexiBLAS-3.4.5-GCC-14.2.0.eb

easybuild/easyconfigs/f/FlexiBLAS/FlexiBLAS-3.4.5-GCC-14.2.0.eb

diff --git a/easybuild/easyconfigs/f/FlexiBLAS/FlexiBLAS-3.4.5-GCC-14.2.0.eb b/easybuild/easyconfigs/f/FlexiBLAS/FlexiBLAS-3.3.1-GCC-12.3.0-test.eb
index 1e8869fdce..a7e26dd6c1 100644
--- a/easybuild/easyconfigs/f/FlexiBLAS/FlexiBLAS-3.4.5-GCC-14.2.0.eb
+++ b/easybuild/easyconfigs/f/FlexiBLAS/FlexiBLAS-3.3.1-GCC-12.3.0-test.eb
@@ -1,26 +1,25 @@
 easyblock = 'Bundle'
 
 name = 'FlexiBLAS'
-version = '3.4.5'
+version = '3.3.1'
+versionsuffix = '-test'
 
 homepage = 'https://gitlab.mpi-magdeburg.mpg.de/software/flexiblas-release'
 description = """FlexiBLAS is a wrapper library that enables the exchange of the BLAS and LAPACK implementation
 used by a program without recompiling or relinking it."""
 
-toolchain = {'name': 'GCC', 'version': '14.2.0'}
+toolchain = {'name': 'GCC', 'version': '12.3.0'}
 local_extra_flags = "-fstack-protector-strong -fstack-clash-protection"
 toolchainopts = {'pic': True, 'extra_cflags': local_extra_flags, 'extra_fflags': local_extra_flags}
 
 builddependencies = [
-    ('CMake', '3.31.3'),
-    ('Python', '3.13.1'),  # required for running the tests
-    ('BLIS', '1.1'),
+    ('CMake', '3.26.3'),
+    ('Python', '3.11.3'),  # required for running the tests
+    ('BLIS', '0.9.0'),
 ]
-if ARCH == 'x86_64':
-    builddependencies.append(('AOCL-BLAS', '5.0'))
 
 dependencies = [
-    ('OpenBLAS', '0.3.29'),
+    ('OpenBLAS', '0.3.23'),
 ]
 
 # note: first listed backend will be used as default by FlexiBLAS,
@@ -29,7 +28,7 @@ local_backends = ['OpenBLAS', 'BLIS']
 
 # imkl supplies its backend via the imkl module, not as a dependency
 if ARCH == 'x86_64':
-    local_backends.extend(['AOCL-BLAS', 'imkl'])
+    local_backends.append('imkl')
 
 default_component_specs = {'start_dir': '%(namelower)s-%(version)s'}
 sanity_check_all_components = True
@@ -38,16 +37,16 @@ sanity_check_all_components = True
 components = [
     (name, version, {
         'source_urls':
-        ['https://gitlab.mpi-magdeburg.mpg.de/api/v4/projects/386/packages/generic/flexiblas-source/v%(version)s/'],
+        ['https://gitlab.mpi-magdeburg.mpg.de/api/v4/projects/386/packages/generic/flexiblas-source/v3.3.1/'],
         'sources': [SOURCELOWER_TAR_GZ],
-        'checksums': ['e819949c614c4968919b0ea4e873ab916d95cdc6943e9d091a78d209b7d6ed07'],
+        'checksums': ['bbeebf5e5a006924558fec43f49affbe1aaa4cbacfc472a9ff6066ffda142e18'],
         'backends': local_backends,
     }),
-    ('LAPACK', '3.12.0', {
+    ('LAPACK', '3.11.0', {
         'easyblock': 'CMakeMake',
         'source_urls': ['https://github.com/Reference-LAPACK/lapack/archive/'],
         'sources': ['v%(version)s.tar.gz'],
-        'checksums': ['eac9570f8e0ad6f30ce4b963f4f033f0f643e7c3912fc9ee6cd99120675ad48b'],
+        'checksums': ['4b9ba79bfd4921ca820e83979db76ab3363155709444a787979e81c22285ffa9'],
         'configopts': ('-DBUILD_SHARED_LIBS=ON -DUSE_OPTIMIZED_BLAS=ON -DLAPACKE=ON '
                        '-DUSE_OPTIMIZED_LAPACK=ON -DBUILD_DEPRECATED=ON '
                        '-DCMAKE_INSTALL_INCLUDEDIR=%(installdir)s/include/flexiblas'),
@@ -58,4 +57,9 @@ components = [
     }),
 ]
 
+# add patch to fix issue with tree vectorizer builds
+if ARCH == 'x86_64':
+    components[0][2]['patches'] = ['FlexiBLAS-3.3.1_lapack-prefer-vector-width-256.patch']
+    components[0][2]['checksums'].append('dc60dee5663987f4d23ae3a368217e0dc410993d18c12cc9aaa057567f762e63')
+
 moduleclass = 'lib'
Diff against FlexiBLAS-3.4.4-GCC-13.3.0.eb

easybuild/easyconfigs/f/FlexiBLAS/FlexiBLAS-3.4.4-GCC-13.3.0.eb

diff --git a/easybuild/easyconfigs/f/FlexiBLAS/FlexiBLAS-3.4.4-GCC-13.3.0.eb b/easybuild/easyconfigs/f/FlexiBLAS/FlexiBLAS-3.3.1-GCC-12.3.0-test.eb
index c8d1df6e85..a7e26dd6c1 100644
--- a/easybuild/easyconfigs/f/FlexiBLAS/FlexiBLAS-3.4.4-GCC-13.3.0.eb
+++ b/easybuild/easyconfigs/f/FlexiBLAS/FlexiBLAS-3.3.1-GCC-12.3.0-test.eb
@@ -1,31 +1,30 @@
 easyblock = 'Bundle'
 
 name = 'FlexiBLAS'
-version = '3.4.4'
+version = '3.3.1'
+versionsuffix = '-test'
 
 homepage = 'https://gitlab.mpi-magdeburg.mpg.de/software/flexiblas-release'
 description = """FlexiBLAS is a wrapper library that enables the exchange of the BLAS and LAPACK implementation
 used by a program without recompiling or relinking it."""
 
-toolchain = {'name': 'GCC', 'version': '13.3.0'}
+toolchain = {'name': 'GCC', 'version': '12.3.0'}
 local_extra_flags = "-fstack-protector-strong -fstack-clash-protection"
 toolchainopts = {'pic': True, 'extra_cflags': local_extra_flags, 'extra_fflags': local_extra_flags}
 
 builddependencies = [
-    ('CMake', '3.29.3'),
-    ('Python', '3.12.3'),  # required for running the tests
-    ('BLIS', '1.0'),
-    #   ('AOCL-BLAS', '5.0'),  # Uncomment for support for AOCL-BLAS
+    ('CMake', '3.26.3'),
+    ('Python', '3.11.3'),  # required for running the tests
+    ('BLIS', '0.9.0'),
 ]
 
 dependencies = [
-    ('OpenBLAS', '0.3.27'),
+    ('OpenBLAS', '0.3.23'),
 ]
 
 # note: first listed backend will be used as default by FlexiBLAS,
 # unless otherwise specified via easyconfig parameter flexiblas_default
 local_backends = ['OpenBLAS', 'BLIS']
-# local_backends =+ ['AOCL-BLAS']  # Uncomment for support for AOCL-BLAS
 
 # imkl supplies its backend via the imkl module, not as a dependency
 if ARCH == 'x86_64':
@@ -38,16 +37,16 @@ sanity_check_all_components = True
 components = [
     (name, version, {
         'source_urls':
-        ['https://gitlab.mpi-magdeburg.mpg.de/api/v4/projects/386/packages/generic/flexiblas-source/v3.4.4/'],
+        ['https://gitlab.mpi-magdeburg.mpg.de/api/v4/projects/386/packages/generic/flexiblas-source/v3.3.1/'],
         'sources': [SOURCELOWER_TAR_GZ],
-        'checksums': ['05040ae092142dd0bf38d1bb9ce33f6b475d9f9bb455e33be997932ae855c22b'],
+        'checksums': ['bbeebf5e5a006924558fec43f49affbe1aaa4cbacfc472a9ff6066ffda142e18'],
         'backends': local_backends,
     }),
-    ('LAPACK', '3.12.0', {
+    ('LAPACK', '3.11.0', {
         'easyblock': 'CMakeMake',
         'source_urls': ['https://github.com/Reference-LAPACK/lapack/archive/'],
         'sources': ['v%(version)s.tar.gz'],
-        'checksums': ['eac9570f8e0ad6f30ce4b963f4f033f0f643e7c3912fc9ee6cd99120675ad48b'],
+        'checksums': ['4b9ba79bfd4921ca820e83979db76ab3363155709444a787979e81c22285ffa9'],
         'configopts': ('-DBUILD_SHARED_LIBS=ON -DUSE_OPTIMIZED_BLAS=ON -DLAPACKE=ON '
                        '-DUSE_OPTIMIZED_LAPACK=ON -DBUILD_DEPRECATED=ON '
                        '-DCMAKE_INSTALL_INCLUDEDIR=%(installdir)s/include/flexiblas'),
@@ -58,4 +57,9 @@ components = [
     }),
 ]
 
+# add patch to fix issue with tree vectorizer builds
+if ARCH == 'x86_64':
+    components[0][2]['patches'] = ['FlexiBLAS-3.3.1_lapack-prefer-vector-width-256.patch']
+    components[0][2]['checksums'].append('dc60dee5663987f4d23ae3a368217e0dc410993d18c12cc9aaa057567f762e63')
+
 moduleclass = 'lib'
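
The conditional block at the end of each diff above relies on the layout of the `components` list: each entry is a `(name, version, options)` tuple, so `components[0][2]` is the options dict of the FlexiBLAS component, and the patch checksum is appended after the source checksum. A minimal standalone sketch of that pattern follows. The functions `build_components` and `add_x86_64_patch` and the `arch` parameter are hypothetical stand-ins for illustration (the easyconfig uses EasyBuild's `ARCH` constant directly), and the component dicts are trimmed to the relevant keys.

```python
def build_components():
    """Return a trimmed-down copy of the components list from the easyconfig."""
    return [
        ('FlexiBLAS', '3.3.1', {
            'checksums': ['bbeebf5e5a006924558fec43f49affbe1aaa4cbacfc472a9ff6066ffda142e18'],
        }),
        ('LAPACK', '3.11.0', {
            'checksums': ['4b9ba79bfd4921ca820e83979db76ab3363155709444a787979e81c22285ffa9'],
        }),
    ]


def add_x86_64_patch(components, arch):
    """Attach the patch and its checksum to the first component on x86_64 only."""
    if arch == 'x86_64':
        opts = components[0][2]  # options dict of the first (FlexiBLAS) component
        opts['patches'] = ['FlexiBLAS-3.3.1_lapack-prefer-vector-width-256.patch']
        # the checksum of the patch goes after the checksum of the source tarball
        opts['checksums'].append('dc60dee5663987f4d23ae3a368217e0dc410993d18c12cc9aaa057567f762e63')
    return components


patched = add_x86_64_patch(build_components(), 'x86_64')
untouched = add_x86_64_patch(build_components(), 'aarch64')

print(patched[0][2].get('patches'))
print(untouched[0][2].get('patches'))
```

Because the options dict is modified in place, the LAPACK component and all other architectures are left unchanged.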

@github-actions github-actions bot removed the update label Sep 24, 2025
@bartoldeman
Contributor

> > A little bit more digging: the SIGSEGV was caused by an unaligned memory access with an aligned move (avx2 vmovapd used on 16-byte, not 32-byte aligned memory). It looks like an issue with GCC 12.3 and 13.2, it's not present with 13.3.
> > ...
> > allocates the relevant matrices which bypasses the miscompiled code here, so that's a source code based workaround.
>
> OK, I was just testing your other proposed patch. So we prefer this source code modification then, right?

Yes, I'd prefer the source code modification as it is more targeted.

@boegel boegel removed this from the next release (5.1.2) milestone Sep 24, 2025
@boegel boegel added this to the release after 5.1.2 (5.2.0?) milestone Sep 24, 2025
Contributor

@bartoldeman bartoldeman left a comment

lgtm

@arielzn
Contributor Author

arielzn commented Sep 24, 2025

I changed the fix to use the option of patching the LAPACK code.

@bartoldeman for the FlexiBLAS-3.3.1-GCC-12.3.0.eb case your patch was enough, but when applying the same patch to FlexiBLAS-3.3.1-GCC-13.2.0.eb I still got a segfault. In this case the trace pointed to another LAPACK subroutine, ZPBTRF. It seems GCC 12.3 and 13.2 produce slightly different miscompilations.

I applied the same logic of making the work array dynamic to this other routine and it worked, so we have different patches for each toolchain.

@arielzn arielzn changed the title add patch for FlexiBLAS v3.3.1 to disable -ftree-vectorize for netlib LAPACK add patch for FlexiBLAS v3.3.1 to fix LAPACK errors triggered by -ftree-vectorize on zen4/zen5 microarchs Sep 24, 2025
@arielzn
Contributor Author

arielzn commented Oct 3, 2025

@arielzn arielzn closed this Oct 3, 2025
Micket pushed a commit that referenced this pull request Oct 3, 2025
This fixes a segmentation fault in FlexiBLAS LAPACK tests when
compiling for Zen4/Zen5

See also:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114566 and #23974