guix: re-enable exported symbol checking for RISC-V #28095

fanquake opened this issue on July 18, 2023
  1. fanquake commented at 10:12 am on July 18, 2023: member

    This is currently disabled, when it can likely be fixed and re-enabled: https://github.com/bitcoin/bitcoin/blob/673acab223c0f896767b1ae784659df9f95452ae/contrib/devtools/symbol-check.py#L210

    A Guix build with the exception dropped shows there are symbols being exported in the RISC-V bins:

    /opt/homebrew/opt/binutils/bin/objdump -T bitcoind

    0000000000080500  w   DF .text  00000000000000a6  Base        _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC1IS3_EEPKcRKS3_
    000000000007f6a0  w   DF .text  0000000000000064  Base        _ZNSt7__cxx1115basic_stringbufIcSt11char_traitsIcESaIcEED1Ev
    000000000007e9da  w   DF .text  0000000000000024  Base        _ZNKSt5ctypeIcE8do_widenEc

    std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string<std::allocator<char> >(char const*, std::allocator<char> const&)
    std::__cxx11::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> >::~basic_stringbuf()
    std::ctype<char>::do_widen(char) const
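
As a rough illustration of what such a check looks at, the `objdump -T` listing above can be parsed to pull out every defined dynamic symbol. This is a minimal Python sketch and is not taken from contrib/devtools/symbol-check.py; the helper name `exported_symbols`, the regular expression, and the `memcpy` import line in the sample are assumptions made for the example.

```python
import re

# One entry of `objdump -T` output: address, 7 flag characters, section,
# size, then an optional version string followed by the symbol name.
DYNSYM_RE = re.compile(
    r"^(?P<addr>[0-9a-f]+) (?P<flags>.{7}) (?P<section>\S+)\s+"
    r"(?P<size>[0-9a-f]+)\s+(?P<rest>.*)$"
)

# The first three lines are copied from the listing above; the last one is a
# hypothetical undefined (imported) symbol, added only to show the filtering.
SAMPLE = """\
0000000000080500  w   DF .text  00000000000000a6  Base        _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC1IS3_EEPKcRKS3_
000000000007f6a0  w   DF .text  0000000000000064  Base        _ZNSt7__cxx1115basic_stringbufIcSt11char_traitsIcESaIcEED1Ev
000000000007e9da  w   DF .text  0000000000000024  Base        _ZNKSt5ctypeIcE8do_widenEc
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.27  memcpy
"""


def exported_symbols(objdump_output):
    """Return (name, is_weak) for each defined function or object symbol."""
    found = []
    for line in objdump_output.splitlines():
        match = DYNSYM_RE.match(line)
        if match is None:
            continue  # header lines, blank lines
        flags = match["flags"]
        if match["section"] == "*UND*":
            continue  # imported from a shared library, not exported
        if flags[6] not in "FO":
            continue  # section or debug entries, not functions/objects
        rest = match["rest"].split(None, 1)
        if not rest:
            continue
        # The name is the last field; the version (e.g. "Base") precedes it.
        found.append((rest[-1], flags[1] == "w"))
    return found


if __name__ == "__main__":
    for name, is_weak in exported_symbols(SAMPLE):
        print("weak" if is_weak else "strong", name)
```

The sketch assumes mangled names (no `-C`), since demangled names contain spaces and would need a different split.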
    
  2. fanquake added the label Build system on Jul 18, 2023
  3. hebasto commented at 10:40 am on July 19, 2023: member

    Historically, symbol checking for RISC-V was skipped from the beginning for the following reason:

    Need to skip RISC-V for now, the linker would export so many symbols.

  4. fanquake commented at 10:41 am on July 19, 2023: member
    I don’t think that’s relevant anymore? I’ve listed the exported symbols above. It’s 3.
  5. laanwj assigned laanwj on Apr 11, 2024
  6. laanwj commented at 12:27 pm on April 11, 2024: member

    I’ll try to figure out why it exports those three symbols and whether that is an indication of a problem; if not, I’ll add them to the exceptions.

    demangled, these are:

    std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string<std::allocator<char> >(char const*, std::allocator<char> const&)
    std::__cxx11::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> >::~basic_stringbuf()
    std::ctype<char>::do_widen(char) const
    
  7. laanwj commented at 9:42 am on April 29, 2024: member
    Holding this off until #29987 is merged.
  8. fanquake commented at 4:00 pm on March 5, 2025: member

    Holding this off until #29987 is merged.

    This has gone in, and I’ve confirmed that this is still an issue with current master (0391d7e4c24e49ed186215e9fa375903c19af86e). @laanwj any chance you still wanted to investigate?

  9. laanwj commented at 3:01 am on March 6, 2025: member
    Yes, thanks for the reminder.
  10. laanwj commented at 2:53 pm on March 6, 2025: member

    I tried to figure this out today but haven’t gotten very far. Curiously, the current RISC-V build exports even more (weak) symbols than mentioned in the OP:

    0000000000091fd8  w   DF .text  0000000000000998  Base        _ZZSt8to_arrayINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEELm35EESt5arrayINSt9remove_cvIT_E4typeEXT0_EEOAT0__S8_ENKUlTpTnmSt16integer_sequenceImJXspT_EEEE_clIJLm0ELm1ELm2ELm3ELm4ELm5ELm6ELm7ELm8ELm9ELm10ELm11ELm12ELm13ELm14ELm15ELm16ELm17ELm18ELm19ELm20ELm21ELm22ELm23ELm24ELm25ELm26ELm27ELm28ELm29ELm30ELm31ELm32ELm33ELm34EEEEDaSF_
    0000000000093ade  w   DF .text  0000000000000148  Base        std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::replace(unsigned long, unsigned long, char const*, unsigned long)
    0000000000091efa  w   DF .text  000000000000005e  Base        std::array<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, 35ul>::~array()
    00000000000929de  w   DF .text  0000000000000042  Base        std::any::reset()
    000000000009389e  w   DF .text  000000000000008a  Base        std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_append(char const*, unsigned long)
    0000000000093928  w   DF .text  00000000000000be  Base        std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::append(char const*)
    0000000000091d88  w   DF .text  000000000000006e  Base        std::any::_Manager_internal<node::NodeContext*>::_S_manage(std::any::_Op, std::any const*, std::any::_Arg*)
    000000000009372c  w   DF .text  0000000000000172  Base        std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate(unsigned long, unsigned long, char const*, unsigned long)
    0000000000092a20  w   DF .text  0000000000000068  Base        std::__cxx11::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> >::~basic_stringbuf()
    0000000000091fa4  w   DF .text  0000000000000034  Base        std::_Function_base::~_Function_base()
    0000000000092afe  w   DF .text  000000000000004c  Base        std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_dispose()
    0000000000091cb0  w   DF .text  0000000000000028  Base        std::ctype<char>::do_widen(char) const
    

    One thing we can say for sure: the large number of symbols here makes it impractical to add specific exceptions.

    I haven’t been able to explain why RISC-V differs from other platforms here. I couldn’t find any ABI documentation that mentions a difference in the handling of weak symbols.

    As a next step, I’ll see if I can isolate a small test case that can be compiled with different cross-compilers to reproduce the issue.

  11. laanwj commented at 6:47 pm on March 6, 2025: member

    Reproduction (with gcc 13.3.0, binutils 2.38, everything from the Guix build context):

    #include <iostream>

    int main()
    {
        std::string test{"12345"};
        std::cout << test << std::endl;

        return 0;
    }
    
    $ x86_64-linux-gnu-g++ test.cpp -o test.x64 -Wl,--exclude-libs,ALL -fvisibility=hidden -static-libstdc++
    $ x86_64-linux-gnu-objdump -T test.x64|grep "\.text"
    (no output)

    $ riscv64-linux-gnu-g++ test.cpp -o test.rv -Wl,--exclude-libs,ALL -fvisibility=hidden -static-libstdc++
    $ riscv64-linux-gnu-objdump -T test.rv|grep "\.text"
    00000000000176b0 l    d  .text  0000000000000000              .text
    0000000000018810  w   DF .text  00000000000000fe  Base        _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE12_M_constructIPKcEEvT_S8_St20forward_iterator_tag
    
    • On x86_64, even providing just -static-libstdc++ gives no exported weak symbols from the binary.
    • On riscv64, leaving out -fvisibility=hidden gives one additional exported symbol, main. Leaving out -Wl,--exclude-libs,ALL makes the build export a ton of symbols.
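
To make the flag comparison above repeatable, the combinations can be generated rather than typed by hand. This is a minimal Python sketch that only prints the command lines; the helper `build_commands` and the output file naming are assumptions for illustration, while the flags and compiler names are the ones used in the reproduction.

```python
import itertools
import shlex

# Flags taken from the reproduction above.
ALWAYS = ["-static-libstdc++"]
TOGGLED = ["-Wl,--exclude-libs,ALL", "-fvisibility=hidden"]


def build_commands(compiler, source="test.cpp"):
    """Return (label, argv) pairs, one per on/off combination of TOGGLED."""
    arch = compiler.split("-")[0]  # e.g. "riscv64"; used only in the output name
    commands = []
    masks = itertools.product([False, True], repeat=len(TOGGLED))
    for index, mask in enumerate(masks):
        chosen = [flag for flag, on in zip(TOGGLED, mask) if on]
        label = " ".join(chosen) or "(neither flag)"
        argv = [compiler, source, "-o", f"test.{arch}.{index}", *chosen, *ALWAYS]
        commands.append((label, argv))
    return commands


if __name__ == "__main__":
    for compiler in ("x86_64-linux-gnu-g++", "riscv64-linux-gnu-g++"):
        for label, argv in build_commands(compiler):
            # Nothing is executed here; run each printed line by hand and
            # inspect the result with the matching `objdump -T`.
            print(f"# {label}\n{shlex.join(argv)}")
```

Each printed binary can then be checked with `objdump -T ... | grep "\.text"` as in the session above.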

    As the versions of all build tools and libraries are the same for both platforms, I suspect this is a linker bug specific to RISC-V. (I’ve checked the intermediate .o files, and the forward_iterator_tag symbols in those have the same flags, so I expect it is not the compiler.)

    All in all, this appears to be an upstream issue, not something we can fix in our build. When I get around to it, I’ll check with the latest binutils to see if it has been fixed upstream in the meantime.

  12. laanwj added the label Upstream on Mar 6, 2025
  13. laanwj commented at 4:00 am on March 8, 2025: member

    The problem still exists in the latest binutils built from git (d07a59a5ca830bf74705471f6bea6db1a47da2b5 as of 2025-03-07).

    I’ve been digging a bit into what causes some weak symbols (but not all) to end up in the binary: overall, if a weak function symbol is in both libstdc++.a and the .o file, it will end up exported in the binary. This behavior exists on riscv64 but not on x86_64.
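
The observation above suggests a quick way to predict which symbols will be affected: intersect the weak defined symbols of the object file with those of libstdc++.a. This is a minimal Python sketch working on `nm` output text; the helper `weak_defined`, the sample addresses, and the archive member name are illustrative assumptions, and only the `forward_iterator_tag` symbol name comes from the reproduction.

```python
SYMBOL = (
    "_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE"
    "12_M_constructIPKcEEvT_S8_St20forward_iterator_tag"
)

# Illustrative `nm` output (not captured from a real build).
OBJ_NM = f"""\
0000000000000000 T main
0000000000000000 W {SYMBOL}
                 U _ZSt4cout
"""

LIB_NM = f"""\
example.o:
0000000000000000 W {SYMBOL}
0000000000000000 W _ZNKSt5ctypeIcE8do_widenEc
"""


def weak_defined(nm_output):
    """Return the names that `nm` reports with type 'W' (weak, defined)."""
    names = set()
    for line in nm_output.splitlines():
        parts = line.split()
        # Defined symbols print as "<address> <type> <name>"; undefined
        # symbols and archive member headers have fewer fields.
        if len(parts) == 3 and parts[1] == "W":
            names.add(parts[2])
    return names


if __name__ == "__main__":
    # Candidates for unexpected export on riscv64, per the observation above.
    for name in sorted(weak_defined(OBJ_NM) & weak_defined(LIB_NM)):
        print(name)
```

In practice the two inputs would come from running the cross toolchain's `nm` on the .o file and on libstdc++.a, without `-C` so that names stay free of spaces.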

    The only exception is the monstrosity _ZZSt8to_arrayINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEELm35EESt5arrayINSt9remove_cvIT_E4typeEXT0_EEOAT0__S8_ENKUlTpTnmSt16integer_sequenceImJXspT_EEEE_clIJLm0ELm1ELm2ELm3ELm4ELm5ELm6ELm7ELm8ELm9ELm10ELm11ELm12ELm13ELm14ELm15ELm16ELm17ELm18ELm19ELm20ELm21ELm22ELm23ELm24ELm25ELm26ELm27ELm28ELm29ELm30ELm31ELm32ELm33ELm34EEEEDaSF_, which does not appear in libstdc++, and which objdump cannot demangle. It is likely coming from inside bitcoin’s code. But I will focus on the libstdc++ ones first.

    If you want to know what the actual ld invocation looks like for linking a simple binary (normally this is hidden two levels deep, inside a collect2 call inside g++):

    $LD --sysroot=/ --eh-frame-hdr -melf64lriscv \
        -dynamic-linker /gnu/store/whzxf4x97y7j2gcrg8p90n3iqiy9ssal-glibc-cross-riscv64-linux-gnu-2.31/lib/ld-linux-riscv64-lp64d.so.1 \
        -pie -o ${BINARY} \
        /gnu/store/whzxf4x97y7j2gcrg8p90n3iqiy9ssal-glibc-cross-riscv64-linux-gnu-2.31/lib/Scrt1.o \
        /gnu/store/f1blyzzq50h5gx0aimp7d0m72xvmdk3a-gcc-cross-riscv64-linux-gnu-13.3.0-lib/lib/gcc/riscv64-linux-gnu/13.3.0/crti.o \
        /gnu/store/f1blyzzq50h5gx0aimp7d0m72xvmdk3a-gcc-cross-riscv64-linux-gnu-13.3.0-lib/lib/gcc/riscv64-linux-gnu/13.3.0/crtbeginS.o \
        -L/gnu/store/f1blyzzq50h5gx0aimp7d0m72xvmdk3a-gcc-cross-riscv64-linux-gnu-13.3.0-lib/lib  \
        -L/gnu/store/f1blyzzq50h5gx0aimp7d0m72xvmdk3a-gcc-cross-riscv64-linux-gnu-13.3.0-lib/lib/gcc/riscv64-linux-gnu/13.3.0 \
        -L/gnu/store/whzxf4x97y7j2gcrg8p90n3iqiy9ssal-glibc-cross-riscv64-linux-gnu-2.31/lib \
        -L/gnu/store/25fk4fj4n7r3djxq9hqn54mf4pcs0irj-glibc-cross-riscv64-linux-gnu-2.31-static/lib \
        -L/gnu/store/f1blyzzq50h5gx0aimp7d0m72xvmdk3a-gcc-cross-riscv64-linux-gnu-13.3.0-lib/lib/gcc/riscv64-linux-gnu/13.3.0 \
        -L/gnu/store/f1blyzzq50h5gx0aimp7d0m72xvmdk3a-gcc-cross-riscv64-linux-gnu-13.3.0-lib/lib/gcc/riscv64-linux-gnu/13.3.0/../../../../riscv64-linux-gnu/lib \
        -L/gnu/store/whzxf4x97y7j2gcrg8p90n3iqiy9ssal-glibc-cross-riscv64-linux-gnu-2.31/lib \
        $OBJECT \
        --exclude-libs ALL -Bstatic -lstdc++ -Bdynamic -lm -lgcc_s -lgcc \
        -L/gnu/store/whzxf4x97y7j2gcrg8p90n3iqiy9ssal-glibc-cross-riscv64-linux-gnu-2.31/lib \
        -rpath-link=/gnu/store/whzxf4x97y7j2gcrg8p90n3iqiy9ssal-glibc-cross-riscv64-linux-gnu-2.31/lib \
        -rpath-link=/gnu/store/f1blyzzq50h5gx0aimp7d0m72xvmdk3a-gcc-cross-riscv64-linux-gnu-13.3.0-lib/riscv64-linux-gnu/lib \
        -lgcc_s -lc -lgcc_s -lgcc \
        /gnu/store/f1blyzzq50h5gx0aimp7d0m72xvmdk3a-gcc-cross-riscv64-linux-gnu-13.3.0-lib/lib/gcc/riscv64-linux-gnu/13.3.0/crtendS.o \
        /gnu/store/f1blyzzq50h5gx0aimp7d0m72xvmdk3a-gcc-cross-riscv64-linux-gnu-13.3.0-lib/lib/gcc/riscv64-linux-gnu/13.3.0/crtn.o
    

    I’ve compared the command lines for riscv64 and x86_64 but did not notice any differences that could explain the behavior. So I’m still fairly convinced it’s internal to binutils.

  14. laanwj commented at 1:54 pm on March 11, 2025: member

    It almost looks like the ld flag --no-export-dynamic, which is the default, doesn’t work on RISC-V, not even when added explicitly. Without --exclude-libs ALL, all weak symbols imported from libraries end up exported from the binary. With --exclude-libs ALL, only the ones that are exported by the .o file do (the behavior we’re seeing).

    Passing --export-dynamic to the x86_64 build approximates what we’re seeing on RISC-V, but not entirely: we get more symbols than that. So some filtering is happening on RISC-V, just less.

    Edit: narrowed it down to the following code in allocate_dynrelocs in bfd/elf64-riscv.c:

          /* Make sure this symbol is output as a dynamic symbol.
             Undefined weak syms won't yet be marked as dynamic.  */
          if (h->dynindx == -1
              && !h->forced_local)
            {
              if (! bfd_elf_link_record_dynamic_symbol (info, h))
                return false;
            }
    

    Commenting out the bfd_elf_link_record_dynamic_symbol call makes the extraneous symbols disappear. So it looks like this has something to do with dynamic relocations in the binary. This explains why it is RISC-V-specific, but doesn’t rule out that the symbols are spurious.

  15. laanwj commented at 9:17 am on March 12, 2025: member
    Filed an upstream issue for binutils: https://sourceware.org/bugzilla/show_bug.cgi?id=32783

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-03-31 09:12 UTC