Unnecessary symbols being exported in libbitcoinconsensus.so #26698

issue hebasto openend this issue on December 14, 2022
  1. hebasto commented at 2:00 pm on December 14, 2022: member

    Initially it was reported in #25020 (comment).

    0$ objdump -T ./bitcoin-24.0.1/lib/libbitcoinconsensus.so | grep bitcoinconsensus_
    1000000000002b1f0 g    DF .text	0000000000000037  Base        bitcoinconsensus_version
    2000000000002cc30 g    DF .text	000000000000006d  Base        bitcoinconsensus_verify_script
    3000000000002cbe0 g    DF .text	0000000000000049  Base        bitcoinconsensus_verify_script_with_amount
    
     0$ objdump -T ./bitcoin-24.0.1/lib/libbitcoinconsensus.so | grep -v UND
     1
     2./bitcoin-24.0.1/lib/libbitcoinconsensus.so:     file format elf64-x86-64
     3
     4DYNAMIC SYMBOL TABLE:
     5000000000002cd20  w   DF .text	00000000000000ce  Base        _ZStplIcSt11char_traitsIcESaIcEENSt7__cxx1112basic_stringIT_T0_T1_EEOS8_PKS5_
     6000000000003a000  w   DF .text	00000000000001e5  Base        _ZNSt6vectorIS_IhSaIhEESaIS1_EE17_M_realloc_insertIJS1_EEEvN9__gnu_cxx17__normal_iteratorIPS1_S3_EEDpOT_
     70000000000038590  w   DF .text	0000000000000054  Base        _ZNSt19bad_optional_accessD0Ev
     80000000000022050  w   DF .text	0000000000000034  Base        _ZNKSt5ctypeIcE8do_widenEc
     900000000001739a8  w   DO .data.rel.ro	0000000000000018  Base        _ZTISt19bad_optional_access
    100000000000038540  w   DF .text	0000000000000042  Base        _ZNSt19bad_optional_accessD2Ev
    1100000000000285e0  w   DF .text	0000000000000373  Base        _ZNSt20__uninitialized_copyILb0EE13__uninit_copyIN9__gnu_cxx17__normal_iteratorIPK5CTxInSt6vectorIS4_SaIS4_EEEEPS4_EET0_T_SD_SC_
    12000000000002b1f0 g    DF .text	0000000000000037  Base        bitcoinconsensus_version
    13000000000000596f  w   DF .text	0000000000000048  Base        _ZSt27__throw_bad_optional_accessv
    14000000000002cc30 g    DF .text	000000000000006d  Base        bitcoinconsensus_verify_script
    1500000000000393c0  w   DF .text	000000000000009e  Base        _ZNSt6vectorIS_IhSaIhEESaIS1_EED1Ev
    160000000000044890  w   DF .text	00000000000000d4  Base        _ZNSt8__detail18__from_chars_digitIjEEbRPKcS2_RT_i
    1700000000000397f0  w   DF .text	0000000000000120  Base        _ZNSt6vectorIS_IhSaIhEESaIS1_EE9push_backERKS1_
    180000000000039910  w   DF .text	000000000000012d  Base        _ZNSt6vectorIS_IhSaIhEESaIS1_EE8_M_eraseEN9__gnu_cxx17__normal_iteratorIPS1_S3_EES7_
    1900000000001739f0  w   DO .data.rel.ro	0000000000000028  Base        _ZTVSt19bad_optional_access
    20000000000002cbe0 g    DF .text	0000000000000049  Base        bitcoinconsensus_verify_script_with_amount
    210000000000038050  w   DF .text	0000000000000039  Base        _ZNKSt19bad_optional_access4whatEv
    220000000000039460  w   DF .text	00000000000000b6  Base        _ZNSt6vectorIhSaIhEEC2ERKS1_
    230000000000027ef0  w   DF .text	0000000000000062  Base        _ZNSt12_Vector_baseIhSaIhEED2Ev
    240000000000023020  w   DF .text	0000000000000073  Base        _ZNSt7__cxx1115basic_stringbufIcSt11char_traitsIcESaIcEED1Ev
    250000000000039460  w   DF .text	00000000000000b6  Base        _ZNSt6vectorIhSaIhEEC1ERKS1_
    260000000000039520  w   DF .text	00000000000002cd  Base        _ZNSt6vectorIS_IhSaIhEESaIS1_EE17_M_realloc_insertIJRKS1_EEEvN9__gnu_cxx17__normal_iteratorIPS1_S3_EEDpOT_
    270000000000038690  w   DF .text	0000000000000062  Base        _ZNSt6vectorIhSaIhEED1Ev
    280000000000056f10  w   DO .rodata	0000000000000018  Base        _ZTSSt19bad_optional_access
    2900000000000221e0  w   DF .text	00000000000000d3  Base        _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC1IS3_EEPKcRKS3_
    300000000000044360  w   DF .text	00000000000000dc  Base        _ZNSt8__detail18__from_chars_digitImEEbRPKcS2_RT_i
    3100000000000444e0  w   DF .text	0000000000000178  Base        _ZNSt6vectorISt4byteSaIS0_EE17_M_realloc_insertIJS0_EEEvN9__gnu_cxx17__normal_iteratorIPS0_S2_EEDpOT_
    3200000000000440f0  w   DF .text	00000000000000e2  Base        _ZNSt6vectorIhSaIhEE7reserveEm
    330000000000038540  w   DF .text	0000000000000042  Base        _ZNSt19bad_optional_accessD1Ev
    340000000000039e80  w   DF .text	0000000000000178  Base        _ZNSt6vectorIhSaIhEE17_M_realloc_insertIJhEEEvN9__gnu_cxx17__normal_iteratorIPhS1_EEDpOT_
    3500000000000221e0  w   DF .text	00000000000000d3  Base        _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC2IS3_EEPKcRKS3_
    36000000000002d670  w   DF .text	0000000000000373  Base        _ZNSt20__uninitialized_copyILb0EE13__uninit_copyIPK5CTxInPS2_EET0_T_S7_S6_
    37000000000002d4a0  w   DF .text	00000000000001cd  Base        _ZNSt6vectorIS_IhSaIhEESaIS1_EE17_M_realloc_insertIJEEEvN9__gnu_cxx17__normal_iteratorIPS1_S3_EEDpOT_
    380000000000053964 g    DF .fini	0000000000000000  Base        _fini
    390000000000004000 g    DF .init	0000000000000000  Base        _init
    4000000000000393c0  w   DF .text	000000000000009e  Base        _ZNSt6vectorIS_IhSaIhEESaIS1_EED2Ev
    410000000000039b30  w   DF .text	0000000000000160  Base        _ZNSt6vectorIS_IhSaIhEESaIS1_EE13_M_insert_auxIS1_EEvN9__gnu_cxx17__normal_iteratorIPS1_S3_EEOT_
    420000000000039a40  w   DF .text	00000000000000e3  Base        _ZNSt6vectorIS_IhSaIhEESaIS1_EE8_M_eraseEN9__gnu_cxx17__normal_iteratorIPS1_S3_EE
    430000000000039c90  w   DF .text	00000000000001e5  Base        _ZNSt6vectorIS_IhSaIhEESaIS1_EE17_M_default_appendEm
    4400000000000230a0  w   DF .text	0000000000000081  Base        _ZNSt7__cxx1115basic_stringbufIcSt11char_traitsIcESaIcEED0Ev
    450000000000027ef0  w   DF .text	0000000000000062  Base        _ZNSt12_Vector_baseIhSaIhEED1Ev
    460000000000023020  w   DF .text	0000000000000073  Base        _ZNSt7__cxx1115basic_stringbufIcSt11char_traitsIcESaIcEED2Ev
    470000000000038690  w   DF .text	0000000000000062  Base        _ZNSt6vectorIhSaIhEED2Ev
    4800000000000441e0  w   DF .text	0000000000000178  Base        _ZNSt6vectorIhSaIhEE17_M_realloc_insertIJRKhEEEvN9__gnu_cxx17__normal_iteratorIPhS1_EEDpOT_
    490000000000043fc0  w   DF .text	0000000000000129  Base        _ZStplIcSt11char_traitsIcESaIcEENSt7__cxx1112basic_stringIT_T0_T1_EERKS8_PKS5_
    
  2. hebasto commented at 2:37 pm on December 19, 2023: member

    The underlying problem is described in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=36022, and it was not considered as a bug.

    I think that moving code that uses C++ template functions/classes out from the bitcoinconsensus.{h,cpp} compilation unit, which were suggested in #24994, is a step in the right direction.

  3. ryanofsky commented at 5:22 pm on December 19, 2023: contributor

    From the linked issue https://gcc.gnu.org/bugzilla/show_bug.cgi?id=36022#c4:

    For C++, types that are to be used across shared libraries have to be visible

    Seems to suggest that it is a good thing for these symbols to be exported so the library and its callers use the same symbols, particularly type_info symbols, and I guess generally so symbols in shared libraries have the same visibility as symbols in static libraries. So linking dynamically doesn’t turn shared symbols into hidden symbols.

    I don’t know what the best approach is, but I guess my question is what is the harm of exporting these symbols?

  4. theuni commented at 6:05 pm on December 21, 2023: member

    Seems to suggest that it is a good thing for these symbols to be exported so the library and its callers use the same symbols, particularly type_info symbols

    We purposefully export only a C api to avoid exactly these type issues :) The user’s/lib’s c++ impls should be completely firewalled from each-other.

    So I agree that this is annoying and maybe worth a hack to fix. But I’d like to exhaust all other options first.

  5. hebasto commented at 12:44 pm on December 22, 2023: member

    @ryanofsky

    my question is what is the harm of exporting these symbols?

    I can add nothing to the best practises:

    symbol visibility should be viewed as part of the API interface contract

  6. theuni commented at 3:08 pm on December 22, 2023: member

    Adding one more bit to complicate and maybe explain why I described the boundary as a firewall… We build libbitcoinconsensus.so with guix using -static-libstdc++ (side-note, we should perhaps move this into the build-system rather than guix).

    So the idea here is that the entire c++ implementation should be compiled into the shared lib with no unresolved c++ stdlib symbols, and exporting only C symbols. That way the end-user can use any c++ impl (if any) they want without fear of collisions. And because there’s no cross-boundary c++ calls, there’s no benefit to having the shared typeinfo/exception symbols.

    Arguably this is different than the libstdc++ bug that was closed as WONTFIX because (at least as far as I can tell) we’re trying to do something completely safe.

    If it turns out that tight firewall idea doesn’t work because libstdc++ is stubborn about exporting symbols, we may need to either:

    • revisit the idea of -static-libstdc++ and
    • Use the --exclude-libs ALL @hebasto has suggested
  7. fanquake commented at 4:13 pm on December 22, 2023: member

    Adding one more bit to complicate

    Thanks, I’ll pile on some more. I’ve alluded to this in other threads (#24994), but still haven’t seen a summary of:

    • What currently happens with lld + libstdc++?
    • What currently happens with lld + libc++?
    • Maybe even, what happens with mold and either standard library?
    • What happens under LTO, in any combination of the above?

    There is much more to consider here than “What does ld happen to do”, and that means any “fix” should also work for/not break other toolchain combinations as well.

    The discussion seemed to land on, nothing can be wrong with ld/libstdc++, because a 15 year old thread says “RESOLVED INVALID”. However, it’s not clear to me if, in the present day, that thread is still relevant/the expected behaviour, and, it’s also completely possible that other bugs/changes in (expected) behaviour have emerged in ld, or libstdc++, over the past 15 years, leaving the discussion there outdated.

    I will also mention, that we currently skip exports checks for RISC-V binaries, because of std lib related symbol exports (#28095), so whatever the issue is here, it’s also not specific to, or limited to libraries.

  8. hebasto commented at 4:26 pm on December 22, 2023: member
    Our current release process uses a very particular toolchain. While LTO might be considered as its update in the near future, it is not clear how other combinations of tools are related to this problem? Unless we would prefer to alter the release toolchain instead of moving some code into a separated translation unit to fix the shared library symbol visibility.
  9. fanquake commented at 4:36 pm on December 22, 2023: member

    Our current release process

    This isn’t specific to the release process. Anyone can, and should expect to be able to build “working” libs/bins outside of Guix.

    it is not clear how other combinations of tools are related to this problem?

    If you “fix” anything by doing something that isn’t supported by those tools, i.e use a special link/compile flag, or rely on ld or libstdc++ specific behaviors, you’re going to break things/be incompatible with anyone using any other tools. Also, any change in our source code/build system may have side-effects that could break things for users of those tools that otherwise already worked perfectly fine.

  10. hebasto commented at 4:46 pm on December 22, 2023: member

    If you “fix’ anything by doing something that isn’t supported by those tools, i.e use a special link/compile flag, or rely on ld or libc++ specific behaviors, you’re going to break things/be incompatible with anyone using any other tools.

    Right. Requiring that a fix does not introduce any regressions is obvious. By the way, there were no comments regarding any introduced break in #24994.

    I mean, what is the point to analyze the current behavior with all possible tool combinations:

    • What currently happens with lld + libstdc++?

    • What currently happens with lld + libc++?

    • Maybe even, what happens with mold and either standard library?

    ?

    If any of those combinations works, I doubt we will modify the release build system to use this working toolchain.

  11. hebasto commented at 6:24 pm on December 22, 2023: member

    Thanks, I’ll pile on some more. I’ve alluded to this in other threads (#24994), but still haven’t seen a summary of:

    * What currently happens with `lld` + `libstdc++`?
    
    * What currently happens with `lld` + `libc++`?
    
    * Maybe even, what happens with [`mold`](https://github.com/rui314/mold) and either standard library?
    
    * What happens under LTO, in any combination of the above?
    

    I’ve done some tests on Ubuntu 22.04 using a simple model code, compiling and testing as follows:

    0g++ -fPIC --shared -fvisibility=hidden -o test_lib.so -I . test_lib.cpp
    1nm -C -D test_lib.so | grep " W "
    

    g++ with -fuse-ld={bfd,gold,lld,mold} does export unwanted symbols.

    clang++ with libc++ or libstdc++ does not export unwanted symbols.

  12. willcl-ark commented at 8:37 pm on February 4, 2024: contributor
    Can we close this following #29189 ?
  13. hebasto closed this on Feb 4, 2024

  14. fanquake commented at 8:37 pm on February 4, 2024: member
    Yep.

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-09-19 22:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me