src/scalar_4x64_impl.h:361:5: error: ‘asm’ operand has impossible constraints #1623

tersec opened this issue on October 24, 2024
  1. tersec commented at 3:23 am on October 24, 2024: none
    ~/secp256k1 % gcc -c -march=native -O1 -DENABLE_MODULE_EXTRAKEYS=1 -DUSE_ASM_X86_64 src/secp256k1.c
    In file included from src/scalar_impl.h:20,
                     from src/secp256k1.c:28:
    src/scalar_4x64_impl.h: In function secp256k1_scalar_mul:
    src/scalar_4x64_impl.h:361:5: error: asm operand has impossible constraints
      361 |     __asm__ __volatile__(
          |     ^~~~~~~
    src/scalar_4x64_impl.h:475:5: error: asm operand has impossible constraints
      475 |     __asm__ __volatile__(
          |     ^~~~~~~
    ~/secp256k1 % gcc -c -march=native -O1 -fomit-frame-pointer -DENABLE_MODULE_EXTRAKEYS=1 -DUSE_ASM_X86_64 src/secp256k1.c
    In file included from src/scalar_impl.h:20,
                     from src/secp256k1.c:28:
    src/scalar_4x64_impl.h: In function secp256k1_scalar_mul:
    src/scalar_4x64_impl.h:361:5: error: asm operand has impossible constraints
      361 |     __asm__ __volatile__(
          |     ^~~~~~~
    src/scalar_4x64_impl.h:475:5: error: asm operand has impossible constraints
      475 |     __asm__ __volatile__(
          |     ^~~~~~~
    ~/secp256k1 % gcc -c -march=native -O2 -fomit-frame-pointer -DENABLE_MODULE_EXTRAKEYS=1 -DUSE_ASM_X86_64 src/secp256k1.c
    In file included from src/scalar_impl.h:20,
                     from src/secp256k1.c:28:
    src/scalar_4x64_impl.h: In function secp256k1_scalar_reduce_512:
    src/scalar_4x64_impl.h:361:5: error: asm operand has impossible constraints
      361 |     __asm__ __volatile__(
          |     ^~~~~~~
    src/scalar_4x64_impl.h:475:5: error: asm operand has impossible constraints
      475 |     __asm__ __volatile__(
          |     ^~~~~~~
    ~/secp256k1 % gcc -c -march=native -O2 -DENABLE_MODULE_EXTRAKEYS=1 -DUSE_ASM_X86_64 src/secp256k1.c
    In file included from src/scalar_impl.h:20,
                     from src/secp256k1.c:28:
    src/scalar_4x64_impl.h: In function secp256k1_scalar_reduce_512:
    src/scalar_4x64_impl.h:361:5: error: asm operand has impossible constraints
      361 |     __asm__ __volatile__(
          |     ^~~~~~~
    src/scalar_4x64_impl.h:475:5: error: asm operand has impossible constraints
      475 |     __asm__ __volatile__(
          |     ^~~~~~~
    ~/secp256k1 % gcc -c -march=native -O3 -DENABLE_MODULE_EXTRAKEYS=1 -DUSE_ASM_X86_64 src/secp256k1.c
    In file included from src/scalar_impl.h:20,
                     from src/secp256k1.c:28:
    src/scalar_4x64_impl.h: In function secp256k1_scalar_reduce_512:
    src/scalar_4x64_impl.h:361:5: error: asm operand has impossible constraints
      361 |     __asm__ __volatile__(
          |     ^~~~~~~
    src/scalar_4x64_impl.h:475:5: error: asm operand has impossible constraints
      475 |     __asm__ __volatile__(
          |     ^~~~~~~
    ~/secp256k1 % gcc -c -march=native -O3 -fomit-frame-pointer -DENABLE_MODULE_EXTRAKEYS=1 -DUSE_ASM_X86_64 src/secp256k1.c
    In file included from src/scalar_impl.h:20,
                     from src/secp256k1.c:28:
    src/scalar_4x64_impl.h: In function secp256k1_scalar_reduce_512:
    src/scalar_4x64_impl.h:361:5: error: asm operand has impossible constraints
      361 |     __asm__ __volatile__(
          |     ^~~~~~~
    src/scalar_4x64_impl.h:475:5: error: asm operand has impossible constraints
      475 |     __asm__ __volatile__(
          |     ^~~~~~~
    
    gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
    Copyright (C) 2021 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

    % lscpu
    Architecture:             x86_64
      CPU op-mode(s):         32-bit, 64-bit
      Address sizes:          48 bits physical, 48 bits virtual
      Byte Order:             Little Endian
    CPU(s):                   16
      On-line CPU(s) list:    0-15
    Vendor ID:                AuthenticAMD
      Model name:             AMD Ryzen 7 PRO 8700GE w/ Radeon 780M Graphics
        CPU family:           25
        Model:                117
        Thread(s) per core:   2
        Core(s) per socket:   8
        Socket(s):            1
        Stepping:             2
        Frequency boost:      enabled

    Linux version 5.15.0-118-generic (buildd@lcy02-amd64-080) (gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, GNU ld (GNU Binutils for Ubuntu) 2.38) #128-Ubuntu SMP Fri Jul 5 09:28:59 UTC 2024
    

    Commit 68b55209f1ba3e6c0417789598f5f75649e9c14c

  2. bitcoin-core deleted a comment on Oct 24, 2024
  3. real-or-random commented at 8:51 am on October 24, 2024: contributor

    I’m unable to reproduce this on my gcc 14.2.1. Can you provide some more context, please?

    • What does native resolve to? Can you provide a reproduction command with the specific arch?
    • Does this really happen with -fomit-frame-pointer (which is the default on -O1)?
    • Have you tried more recent gcc versions? Is this the original gcc in Ubuntu? If yes, can you give us instructions on how to reproduce this with a docker image?
    • Is this a regression in our code?

    The referenced issues seem to have some partial answers to these questions, but those also appear to contradict your report here because -fomit-frame-pointer seems to have resolved your issue. So I’m really not sure about the details of the report.

    The error message usually means that there are not enough registers, but I don’t see how narrowing to a specific arch can make a (correct) gcc assume that there are fewer registers.

  4. real-or-random added the label build on Oct 24, 2024
  5. tersec commented at 9:17 am on October 24, 2024: none
    • native resolves to znver3. However, specifying -march=znver3, or -march=znver3 -mtune=znver3, does not result in this compiler error. For more detail, see the output of gcc -march=native -Q --help=target. To excerpt from that:
    gcc -march=native -Q --help=target | grep -E '(march|mcpu|mtune)='
      -march=                     		znver3
      -mcpu=                      		
      -mtune=                     		znver3
    
    • Yes, it really does happen with -fomit-frame-pointer. That’s why I’ve pointedly included that variation, because I know that in past build issues it has been a common question/suggestion/recommendation. Yes, yes it does. See what I already posted.
    • I have not tried with more recent gcc versions. This is, yes, the default version of gcc in Ubuntu 22.04.

    No, -fomit-frame-pointer did not resolve the issue. Again, from what’s already posted above:

    ~/secp256k1 % gcc -c -march=native -O1 -fomit-frame-pointer -DENABLE_MODULE_EXTRAKEYS=1 -DUSE_ASM_X86_64 src/secp256k1.c
    In file included from src/scalar_impl.h:20,
                     from src/secp256k1.c:28:
    src/scalar_4x64_impl.h: In function secp256k1_scalar_mul:
    src/scalar_4x64_impl.h:361:5: error: asm operand has impossible constraints
      361 |     __asm__ __volatile__(
          |     ^~~~~~~
    src/scalar_4x64_impl.h:475:5: error: asm operand has impossible constraints
      475 |     __asm__ __volatile__(
          |     ^~~~~~~
    

    -fomit-frame-pointer is specified. Explicitly. Also for -O3:

    ~/secp256k1 % gcc -c -march=native -O3 -fomit-frame-pointer -DENABLE_MODULE_EXTRAKEYS=1 -DUSE_ASM_X86_64 src/secp256k1.c
    In file included from src/scalar_impl.h:20,
                     from src/secp256k1.c:28:
    src/scalar_4x64_impl.h: In function secp256k1_scalar_reduce_512:
    src/scalar_4x64_impl.h:361:5: error: asm operand has impossible constraints
      361 |     __asm__ __volatile__(
          |     ^~~~~~~
    src/scalar_4x64_impl.h:475:5: error: asm operand has impossible constraints
      475 |     __asm__ __volatile__(
          |     ^~~~~~~
    

    -fomit-frame-pointer is specified and very definitely does not resolve this issue. Nor does it for

    ~/secp256k1 % gcc -c -march=native -O2 -fomit-frame-pointer -DENABLE_MODULE_EXTRAKEYS=1 -DUSE_ASM_X86_64 src/secp256k1.c
    In file included from src/scalar_impl.h:20,
                     from src/secp256k1.c:28:
    src/scalar_4x64_impl.h: In function secp256k1_scalar_reduce_512:
    src/scalar_4x64_impl.h:361:5: error: asm operand has impossible constraints
      361 |     __asm__ __volatile__(
          |     ^~~~~~~
    src/scalar_4x64_impl.h:475:5: error: asm operand has impossible constraints
      475 |     __asm__ __volatile__(
          |     ^~~~~~~
    

    Which also explicitly and specifically specifies -fomit-frame-pointer, this time for -O2.

    • Regarding whether it’s a regression, we’ve only ever seen it on znver3 targets, and only seen it reproduced so far on Ubuntu 20.04, 22.04, and 24.04 (though the machine I’m testing this on now is 22.04). My suspicion is that it’s not a regression for previous targets, up to and including znver2, but that it never worked with -march=native on a znver3 target. But this is speculation.

    Regarding reproducing this in a Docker image, the tricky thing is that:

    % gcc -c -march=native -O1 -fomit-frame-pointer -DENABLE_MODULE_EXTRAKEYS=1 -DUSE_ASM_X86_64 src/secp256k1.c
    In file included from src/scalar_impl.h:20,
                     from src/secp256k1.c:28:
    src/scalar_4x64_impl.h: In function secp256k1_scalar_mul:
    src/scalar_4x64_impl.h:361:5: error: asm operand has impossible constraints
      361 |     __asm__ __volatile__(
          |     ^~~~~~~
    src/scalar_4x64_impl.h:475:5: error: asm operand has impossible constraints
      475 |     __asm__ __volatile__(
          |     ^~~~~~~
    % gcc -c -march=znver3 -mtune=znver3 -O1 -fomit-frame-pointer -DENABLE_MODULE_EXTRAKEYS=1 -DUSE_ASM_X86_64 src/secp256k1.c
    %
    

    That is, gcc on this machine claims that native resolves to -march=znver3 -mtune=znver3, but those options, which would allow a machine-independent, Docker-based reproduction, do not trigger the error under otherwise identical conditions. -march=native is triggering something else salient too, but I have not yet identified what. It’s 100% deterministic and consistent.

    But the main point is, no, -fomit-frame-pointer does not resolve this. I was already aware of that point.

    Edit to add another output of what -march=native does:

    echo | gcc -### -E - -march=native
     /usr/lib/gcc/x86_64-linux-gnu/11/cc1 -E -quiet -imultiarch x86_64-linux-gnu - "-march=znver3" -mmmx -mpopcnt -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -mavx -mavx2 -msse4a -mno-fma4 -mno-xop -mfma -mavx512f -mbmi -mbmi2 -maes -mpclmul -mavx512vl -mavx512bw -mavx512dq -mavx512cd -mno-avx512er -mno-avx512pf -mavx512vbmi -mavx512ifma -mno-avx5124vnniw -mno-avx5124fmaps -mavx512vpopcntdq -mavx512vbmi2 -mgfni -mvpclmulqdq -mavx512vnni -mavx512bitalg -mavx512bf16 -mno-avx512vp2intersect -mno-3dnow -madx -mabm -mno-cldemote -mclflushopt -mclwb -mclzero -mcx16 -mno-enqcmd -mf16c -mfsgsbase -mfxsr -mno-hle -msahf -mno-lwp -mlzcnt -mmovbe -mno-movdir64b -mno-movdiri -mmwaitx -mno-pconfig -mpku -mno-prefetchwt1 -mprfchw -mno-ptwrite -mrdpid -mrdrnd -mrdseed -mno-rtm -mno-serialize -mno-sgx -msha -mshstk -mno-tbm -mno-tsxldtrk -mvaes -mno-waitpkg -mwbnoinvd -mxsave -mxsavec -mxsaveopt -mxsaves -mno-amx-tile -mno-amx-int8 -mno-amx-bf16 -mno-uintr -mno-hreset -mno-kl -mno-widekl -mno-avxvnni --param "l1-cache-size=32" --param "l1-cache-line-size=64" --param "l2-cache-size=1024" "-mtune=znver3" -fasynchronous-unwind-tables -fstack-protector-strong -Wformat -Wformat-security -fstack-clash-protection -fcf-protection -dumpbase -
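
    A cc1 line like the one above is easier to read once it is reduced to the ISA extensions it switches on. A minimal sketch, assuming a POSIX shell with tr, grep and sort; the function name enabled_isa_flags is made up for illustration and is not part of GCC or secp256k1:

```shell
# Read a cc1 command line on stdin and print the enabled -m<feature>
# options, one per line. Options of the form -mno-*, -march=... and
# -mtune=... are dropped, so only switched-on extensions remain.
enabled_isa_flags() {
    tr -s ' \t' '\n' |          # one token per line
        tr -d '"' |             # strip the quotes gcc puts around some options
        grep -E '^-m[a-z0-9.]+$' |
        grep -v '^-mno-' |
        LC_ALL=C sort -u
}

# Possible use on the affected machine (gcc prints the -### output on stderr):
#   echo | gcc -### -E - -march=native 2>&1 | enabled_isa_flags | grep avx512
```

    On the line posted above this would list -mavx512f among others, which is not something a znver3 target is expected to enable.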
    
  6. tersec commented at 9:40 am on October 24, 2024: none

    No -march=native. Reproduces on another machine, with

    gcc (Debian 14.2.0-7) 14.2.0
    Copyright (C) 2024 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    

    And, yes, it uses -fomit-frame-pointer:

    $ gcc -c -march=znver3 -mavx512f -O1 -fomit-frame-pointer -DENABLE_MODULE_EXTRAKEYS=1 -DUSE_ASM_X86_64 src/secp256k1.c
    In file included from src/scalar_impl.h:20,
                     from src/secp256k1.c:28:
    In function secp256k1_scalar_reduce_512,
        inlined from secp256k1_scalar_mul at src/scalar_4x64_impl.h:868:5:
    src/scalar_4x64_impl.h:361:5: error: asm operand has impossible constraints or there are not enough registers
      361 |     __asm__ __volatile__(
          |     ^~~~~~~
    src/scalar_4x64_impl.h:475:5: error: asm operand has impossible constraints or there are not enough registers
      475 |     __asm__ __volatile__(
          |     ^~~~~~~
    $ gcc -c -march=znver3 -mavx512f -O2 -fomit-frame-pointer -DENABLE_MODULE_EXTRAKEYS=1 -DUSE_ASM_X86_64 src/secp256k1.c
    In file included from src/scalar_impl.h:20,
                     from src/secp256k1.c:28:
    src/scalar_4x64_impl.h: In function secp256k1_scalar_reduce_512:
    src/scalar_4x64_impl.h:361:5: error: asm operand has impossible constraints or there are not enough registers
      361 |     __asm__ __volatile__(
          |     ^~~~~~~
    src/scalar_4x64_impl.h:475:5: error: asm operand has impossible constraints or there are not enough registers
      475 |     __asm__ __volatile__(
          |     ^~~~~~~
    $ gcc -march=znver3 -mavx512f -O3 -fomit-frame-pointer -DENABLE_MODULE_EXTRAKEYS=1 -DUSE_ASM_X86_64 src/secp256k1.c
    In file included from src/scalar_impl.h:20,
                     from src/secp256k1.c:28:
    src/scalar_4x64_impl.h: In function secp256k1_scalar_reduce_512:
    src/scalar_4x64_impl.h:361:5: error: asm operand has impossible constraints or there are not enough registers
      361 |     __asm__ __volatile__(
          |     ^~~~~~~
    src/scalar_4x64_impl.h:475:5: error: asm operand has impossible constraints or there are not enough registers
      475 |     __asm__ __volatile__(
          |     ^~~~~~~
    

    The relevant flag is -mavx512f.
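
    One way to isolate a flag like this is to append each candidate option in turn to the -march=znver3 command that compiles cleanly and note which ones make it fail. A rough sketch, assuming a POSIX shell; failing_flags is a hypothetical helper, not an existing tool:

```shell
# Read candidate flags from stdin, one per line. For each one, run the
# command given as arguments with that flag appended, and print the flag
# if the command exits with a non-zero status.
failing_flags() {
    while IFS= read -r flag; do
        [ -n "$flag" ] || continue
        # </dev/null keeps the command from eating the remaining flag list
        "$@" "$flag" >/dev/null 2>&1 </dev/null || printf '%s\n' "$flag"
    done
}

# Possible use from the secp256k1 source directory:
#   printf '%s\n' -mavx512f -mavx512vl -mgfni |
#       failing_flags gcc -c -march=znver3 -O2 -fomit-frame-pointer \
#           -DUSE_ASM_X86_64 src/secp256k1.c -o /dev/null
```

    This tests one flag at a time, so it would miss a failure that only appears when several flags are combined.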

  7. real-or-random commented at 10:02 pm on October 24, 2024: contributor

    Argh, all of this CPU stuff is so confusing.

    Model name: AMD Ryzen 7 PRO 8700GE w/ Radeon 780M Graphics

    That’s a Zen4. But znver4 is only supported on GCC 13 and newer. I share the expectation that -march=znver3 should be equivalent to -march=native on GCC 11… As you point out, the affected GCC 11 appears, for whatever reason, to be (overly) clever and adds the -mavx512f. Perhaps this was some initial/half-ready support for znver4, or simply a bug?

    In any case, I see this on my machine with gcc version 14.2.1 20240910:

    It works with -march=znver4 -mavx512f (and also without explicit -mavx512f, which should be implied):

    $ gcc -c -march=znver4 -mavx512f -O2 -fomit-frame-pointer -DUSE_ASM_X86_64 src/secp256k1.c
    

    It errors with -march=znver3 -mavx512f, but that’s a strange set of flags because no such CPU exists:

    $ gcc -c -march=znver3 -mavx512f -O2 -fomit-frame-pointer -DUSE_ASM_X86_64 src/secp256k1.c
    In file included from src/scalar_impl.h:20,
                     from src/secp256k1.c:28:
    src/scalar_4x64_impl.h: In function secp256k1_scalar_reduce_512:
    src/scalar_4x64_impl.h:361:5: error: asm operand has impossible constraints or there are not enough registers
      361 |     __asm__ __volatile__(
          |     ^~~~~~~
    src/scalar_4x64_impl.h:475:5: error: asm operand has impossible constraints or there are not enough registers
      475 |     __asm__ __volatile__(
          |     ^~~~~~~
    

    I still don’t know what the cause of this is, but the problem disappears with the correct flags on a recent GCC. So my conclusion so far is that this is not our bug.

  8. real-or-random commented at 10:28 pm on October 24, 2024: contributor

    I share the expectation that -march=znver3 should be equivalent to -march=native on GCC 11… As you point out, the affected GCC 11 appears, for whatever reason, to be (overly) clever and adds the -mavx512f. Perhaps this was some initial/half-ready support for znver4, or simply a bug?

    Okay, -march=native can detect individual CPU features. I believe this is precisely what you observe with GCC 11. And it turns out that this auto-detection produces a broken set of flags, presumably because no one had tested it on a Zen with AVX512 (because no such CPU existed when GCC 11 was released).
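
    Whether the CPU itself advertises AVX-512 can be checked independently of GCC. A small sketch, assuming Linux, where /proc/cpuinfo lists feature names on a "flags" line; cpu_has_flag is an illustrative name, and the optional second argument exists only so the function can be pointed at a saved copy of that file:

```shell
# Succeed (exit 0) if the named feature appears on the first "flags" line
# of a cpuinfo-style file. Defaults to /proc/cpuinfo (Linux-specific).
cpu_has_flag() {
    flag="$1"
    cpuinfo="${2:-/proc/cpuinfo}"
    grep -E '^flags[[:space:]]*:' "$cpuinfo" |
        head -n 1 |
        tr ' ' '\n' |                     # one feature name per line
        grep -xF -- "$flag" >/dev/null    # exact match only
}

# Possible use:
#   cpu_has_flag avx512f && echo "CPU reports avx512f"
```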

  9. real-or-random commented at 3:47 pm on November 1, 2024: contributor

    So my conclusion so far is that this is not our bug.

    Closing for now, but please don’t hesitate to reply if you think my analysis is wrong, or if you believe we should do something about this.

  10. real-or-random closed this on Nov 1, 2024


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin-core/secp256k1. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-21 17:15 UTC
