Reproducibility issue with 0.20.1 for x86_64-linux-gnu #20389

monperrus opened this issue on November 14, 2020
  1. monperrus commented at 4:41 PM on November 14, 2020: none

    When running the Gitian pipeline for 0.20.1, there is a reproducibility issue for one linux binary.

    For bitcoin-0.20.1-aarch64-linux-gnu.tar.gz, bitcoin-0.20.1-arm-linux-gnueabihf.tar.gz, bitcoin-0.20.1-riscv64-linux-gnu.tar.gz, and bitcoin-0.20.1.tar.gz, the sha256 checksums match.

    However, for target x86_64-linux-gnu, the sha256 mismatches:

    • expected sha256, per SHA256SUMS.asc: 376194f06596ecfa40331167c39bc70c355f960280bd2a645fdbf18f66527397
    • actual sha256: 277599356bd2df760832c6636797fe5ea5a5c28d929d53635b685f5ac1e4689b

    (found by reproduction attempt, and also reported at https://bitcoin.stackexchange.com/questions/99967/reproducible-gitian-builds-but-not-the-same-hash-as-bitcoincore-org and https://www.onooks.com/trouble-reproducing-the-same-binary-as-bitcoin-org-or-bitcoincore-org/)

    Is that a build reproducibility bug or a supply chain attack?
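
A minimal sketch of the comparison described above, assuming the tarball has been downloaded or built locally. The local file path is hypothetical; the expected digest is the one quoted from SHA256SUMS.asc in this comment.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with a value taken from a checksum list."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Hypothetical local path; adjust to wherever the build output lives.
    tarball = Path("bitcoin-0.20.1-x86_64-linux-gnu.tar.gz")
    expected = "376194f06596ecfa40331167c39bc70c355f960280bd2a645fdbf18f66527397"
    if tarball.exists():
        print("match" if matches_expected(tarball, expected) else "MISMATCH")
```

A mismatch reported this way only says that the bytes differ. It does not say why, which is what the rest of the thread works out.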

  2. monperrus added the label Bug on Nov 14, 2020
  3. hebasto commented at 10:01 PM on November 14, 2020: member

    @monperrus Mind providing your bitcoin-core-linux-0.20-build.assert file for comparison?

  4. achow101 commented at 10:13 PM on November 14, 2020: member

    As stated in my answer on the referenced stackexchange question, this result is now expected due to changes to gcc and other build dependencies whose versions are not pinned. This is an unfortunate side effect of the gitian build system which should be solved by the move to guix.

    I examined the differences in the compiled binary using diffoscope and it's basically just a few bytes that changed resulting in different hashes. The specific changes were in the extra debug data that is generated, which resulted in a different debug data checksum. As the debug data checksum is embedded in the final binary (which has debugging data stripped), the build id is also different, and the final result is that the final binary hash is also different.
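
To make "just a few bytes" concrete, here is a rough sketch that lists the offsets at which two equally sized files differ. It is not what diffoscope does internally (diffoscope decodes the file structure); it is only a plain byte comparison, and the sample inputs are made up.

```python
from typing import List


def differing_offsets(left: bytes, right: bytes) -> List[int]:
    """Return the offsets where two equally sized byte strings differ."""
    if len(left) != len(right):
        raise ValueError("inputs differ in length; offsets are not comparable")
    return [i for i, (a, b) in enumerate(zip(left, right)) if a != b]


# Made-up stand-ins for two builds of the same binary.
build_a = b"\x7fELF....AAAA....BBBB"
build_b = b"\x7fELF....AAAA....BBBC"
print(differing_offsets(build_a, build_b))  # prints [19]
```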

  5. monperrus commented at 5:56 AM on November 15, 2020: none

    Thanks a lot for the explanation @achow101, that's very valuable.

    What's the reason for embedding debug data in the final released binary?

  6. achow101 commented at 5:59 AM on November 15, 2020: member

    Debug data isn't embedded in the binary itself, it is removed and put into separate files. In order to ensure that the debug data can be applied to the binary afterwards in a debugger, gcc will embed a bit of metadata into the final binary. It is this metadata which is causing the reproducibility failure in this particular instance.

  7. monperrus commented at 6:15 AM on November 15, 2020: none

    I'm not sure I understand what you mean by "separate file", because the binaries are different in bitcoin-0.20.1-x86_64-linux-gnu.tar.gz.

    Let us consider bin/bitcoind.

    • on bitcoin.org: 4ec74161b2a90293926ae8e20a2efbe952bd23b53aeebf051e6a6285ace18271
    • on a Gitian build from yesterday: 4ec74161b2a90293926ae8e20a2efbe952bd23b53aeebf051e6a6285ace18271
  8. achow101 commented at 7:23 AM on November 15, 2020: member

    Let me explain in far more detail.

    If you look at the gitian build results, you will see a file named bitcoin-0.20.1-x86_64-linux-gnu-debug.tar.gz. If you untar this file, there will be *.dbg files, e.g. bitcoind.dbg and bitcoin-qt.dbg. These *.dbg files contain the debug data for their respective binaries, i.e. bitcoind.dbg contains debug data for bitcoind.

    To ensure that you use the correct dbg file with the correct binary, gcc embeds a checksum of the dbg file within the binary itself. This means that bitcoind contains a checksum of bitcoind.dbg. This is to prevent attempting to debug bitcoind with another dbg file. For example, if you attempted to tell gdb that the bitcoin-qt.dbg file contained the debugging symbols for bitcoind, it would detect the mismatch and not attempt to load debugging symbols from bitcoin-qt.dbg.

    The debugging symbols are initially compiled into the bitcoind binary, but later during the gitian build these are removed and placed into the bitcoind.dbg file. However because they are compiled into bitcoind initially, the debugging symbols have an effect on the build id that gcc embeds into the binary. The build ID is a hash of the compiled binary, including the debugging data.
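
As an illustration of where the build ID lives, the sketch below parses the raw bytes of a .note.gnu.build-id section using the standard ELF note layout (name size, descriptor size, type, then name and descriptor, each padded to 4 bytes). It assumes the section bytes have already been extracted by some other means, and the example note is constructed by hand from a build ID quoted in this thread rather than read from a real binary.

```python
import struct

NT_GNU_BUILD_ID = 3  # note type used for GNU build IDs


def parse_build_id(note: bytes, little_endian: bool = True) -> str:
    """Extract the build ID as hex from raw .note.gnu.build-id bytes."""
    order = "<" if little_endian else ">"
    namesz, descsz, note_type = struct.unpack_from(order + "III", note, 0)
    name_start = 12  # three 4-byte header words
    name = note[name_start:name_start + namesz]
    # The name is padded so that the descriptor starts on a 4-byte boundary.
    desc_start = name_start + ((namesz + 3) & ~3)
    desc = note[desc_start:desc_start + descsz]
    if name != b"GNU\x00" or note_type != NT_GNU_BUILD_ID:
        raise ValueError("not a GNU build-id note")
    if len(desc) != descsz:
        raise ValueError("truncated note")
    return desc.hex()


# Hand-built note: 4-byte name "GNU\0", 20-byte (0x14) descriptor, type 3.
example_note = (
    struct.pack("<III", 4, 20, NT_GNU_BUILD_ID)
    + b"GNU\x00"
    + bytes.fromhex("6b464617f7f91fd270ac86f43ef4a58eeeedff19")
)
print(parse_build_id(example_note))
```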

    So the end result is that the published binary contains two commitments to the debug symbols, but does not actually contain them itself.

    You can read more about these separate debug files here: https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html

    The crux of this issue is that the debug symbols from recent gitian builds differ from the debug symbols that were generated for the original release. This means that both the build ID and the debug symbol checksum that we find inside of bitcoind are different. This then results in the bitcoind hashes being different (as well as all of the other binaries). And of course that causes the tarfile hashes to be different.
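
The chain of effects can be imitated with a toy model. This is not how the linker or the strip step actually lay out an ELF file; the function name, the choice of SHA-1 for the stand-in build ID, and the byte layout are all assumptions made for illustration. It only shows that identical code plus different debug data yields a different stripped artifact once the two commitments are embedded.

```python
import hashlib
import zlib


def toy_release_binary(code: bytes, debug_info: bytes) -> bytes:
    """Toy model: stripped binary = code + build ID + debug-file CRC."""
    # Stand-in for the build ID: a hash over code and debug data together.
    build_id = hashlib.sha1(code + debug_info).digest()
    # Stand-in for the .gnu_debuglink checksum of the separate debug file.
    debug_crc = zlib.crc32(debug_info).to_bytes(4, "little")
    return code + build_id + debug_crc


code = b"identical machine code"
old = toy_release_binary(code, b"decl_line=128")
new = toy_release_binary(code, b"decl_line=148")

# Same code, different debug data: the final hashes no longer agree.
print(hashlib.sha256(old).hexdigest() == hashlib.sha256(new).hexdigest())  # prints False
```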


    Here is the diff of the binaries that diffoscope generates:

    --- bitcoind
    +++ /mnt/archive/bitcoin/bitcoin-binaries/0.20.1/bitcoin-0.20.1/bin/bitcoind
    ├── readelf --wide --notes {}
    │ @@ -1,15 +1,15 @@
    │  
    │  Displaying notes found in: .note.ABI-tag
    │    Owner                Data size 	Description
    │    GNU                  0x00000010	NT_GNU_ABI_TAG (ABI version tag)	    OS: Linux, ABI: 3.2.0
    │  
    │  Displaying notes found in: .note.gnu.build-id
    │    Owner                Data size 	Description
    │ -  GNU                  0x00000014	NT_GNU_BUILD_ID (unique build ID bitstring)	    Build ID: 6b464617f7f91fd270ac86f43ef4a58eeeedff19
    │ +  GNU                  0x00000014	NT_GNU_BUILD_ID (unique build ID bitstring)	    Build ID: 3a439a31a5157ff7052ed310050df5643a02ea3f
    │  
    │  Displaying notes found in: .note.stapsdt
    │    Owner                Data size 	Description
    │    stapsdt              0x00000036	NT_STAPSDT (SystemTap probe descriptors)	    Provider: libstdcxx
    │      Name: throw
    │      Location: 0x00000000006e550d, Base: 0x000000000086d140, Semaphore: 0x0000000000000000
    │      Arguments: 8@%rdi 8@%rsi
    ├── readelf --wide --decompress --hex-dump=.gnu_debuglink {}
    │ @@ -1,5 +1,5 @@
    │  
    │  Hex dump of section '.gnu_debuglink':
    │    0x00000000 62697463 6f696e64 2e646267 00000000 bitcoind.dbg....
    │ -  0x00000010 b25ceebb                            .\..
    │ +  0x00000010 114d519d                            .MQ.
    

    As you can see, there are only two differences here, one in the build ID, and one in the .gnu_debuglink section. From the documentation I linked earlier, we can see that the first line of this .gnu_debuglink section is the debug filename followed by enough 0 bytes to pad to a 4 byte boundary. The second line is the 4 byte CRC checksum. And it is this CRC checksum that differs.
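
The layout just described can be decoded in a few lines. The sketch below takes the raw section bytes from the hex dump above and splits them into file name and CRC, assuming a little-endian target such as x86_64. The `debug_file_crc` helper relies on the assumption that the checksum in the GDB documentation is the same CRC-32 that `zlib.crc32` computes; treat that as something to verify against a real .dbg file, not as a given.

```python
import struct
import zlib
from typing import Tuple


def parse_debuglink(section: bytes, little_endian: bool = True) -> Tuple[str, int]:
    """Split raw .gnu_debuglink bytes into (debug file name, CRC value)."""
    name_end = section.index(b"\x00")
    name = section[:name_end].decode("ascii")
    # Name plus its NUL terminator is padded to the next 4-byte boundary.
    crc_offset = (name_end + 1 + 3) & ~3
    (crc,) = struct.unpack_from("<I" if little_endian else ">I", section, crc_offset)
    return name, crc


def debug_file_crc(contents: bytes) -> int:
    """CRC-32 of a debug file's full contents (assumed to match the stored value)."""
    return zlib.crc32(contents) & 0xFFFFFFFF


# Bytes copied from the two sides of the diffoscope hex dump.
rebuilt = bytes.fromhex("626974636f696e642e64626700000000" "b25ceebb")
released = bytes.fromhex("626974636f696e642e64626700000000" "114d519d")

for label, raw in (("rebuilt", rebuilt), ("released", released)):
    name, crc = parse_debuglink(raw)
    print(label, name, hex(crc))
```

Both sections name the same file, bitcoind.dbg, and differ only in the trailing four bytes.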

    So why do the debug symbols differ here? Again, diffoscope can help us a bit.

    --- bitcoind.dbg
    +++ /mnt/archive/bitcoin/bitcoin-binaries/0.20.1/bitcoin-0.20.1/bin/bitcoind.dbg
    ├── readelf --wide --notes {}
    │ @@ -1,15 +1,15 @@
    │  
    │  Displaying notes found in: .note.ABI-tag
    │    Owner                Data size 	Description
    │    GNU                  0x00000010	NT_GNU_ABI_TAG (ABI version tag)	    OS: Linux, ABI: 3.2.0
    │  
    │  Displaying notes found in: .note.gnu.build-id
    │    Owner                Data size 	Description
    │ -  GNU                  0x00000014	NT_GNU_BUILD_ID (unique build ID bitstring)	    Build ID: 6b464617f7f91fd270ac86f43ef4a58eeeedff19
    │ +  GNU                  0x00000014	NT_GNU_BUILD_ID (unique build ID bitstring)	    Build ID: 3a439a31a5157ff7052ed310050df5643a02ea3f
    │  
    │  Displaying notes found in: .note.stapsdt
    │    Owner                Data size 	Description
    │    stapsdt              0x00000036	NT_STAPSDT (SystemTap probe descriptors)	    Provider: libstdcxx
    │      Name: throw
    │      Location: 0x00000000006e550d, Base: 0x000000000086d140, Semaphore: 0x0000000000000000
    │      Arguments: 8@%rdi 8@%rsi
    ├── readelf --wide --debug-dump=info {}
    │┄ error from `readelf --wide --debug-dump=info {}`:
    │┄ readelf: Error: /build/binutils/src/binutils-gdb/binutils/dwarf.c:1989: read LEB value is too large to store in destination variable
    │┄ readelf: Error: /build/binutils/src/binutils-gdb/binutils/dwarf.c:1989: read LEB value is too large to store in destination variable
    │┄ readelf: Error: /build/binutils/src/binutils-gdb/binutils/dwarf.c:1989: read LEB value is too large to store in destination variable
    │┄ readelf: Error: /build/binutils/src/binutils-gdb/binutils/dwarf.c:1989: read LEB value is too large to store in destination variable
    │┄ readelf: Error: /build/binutils/src/binutils-gdb/binutils/dwarf.c:1989: read LEB value is too large to store in destination variable
    │ @@ -85052,36 +85052,36 @@
    │      <29607>   DW_AT_decl_line   : 124
    │      <29608>   DW_AT_decl_column : 16
    │      <29609>   DW_AT_type        : <0x28761>
    │      <2960d>   DW_AT_data_member_location: 12
    │   <2><2960e>: Abbrev Number: 30 (DW_TAG_member)
    │      <2960f>   DW_AT_name        : (indirect string, offset: 0x13aaf): __kind
    │      <29613>   DW_AT_decl_file   : 108
    │ -    <29614>   DW_AT_decl_line   : 148
    │ +    <29614>   DW_AT_decl_line   : 128
    │      <29615>   DW_AT_decl_column : 7
    │      <29616>   DW_AT_type        : <0x287a6>
    │      <2961a>   DW_AT_data_member_location: 16
    │   <2><2961b>: Abbrev Number: 30 (DW_TAG_member)
    │      <2961c>   DW_AT_name        : (indirect string, offset: 0x7a535): __spins
    │      <29620>   DW_AT_decl_file   : 108
    │ -    <29621>   DW_AT_decl_line   : 154
    │ +    <29621>   DW_AT_decl_line   : 134
    │      <29622>   DW_AT_decl_column : 3
    │      <29623>   DW_AT_type        : <0x2879a>
    │      <29627>   DW_AT_data_member_location: 20
    ...
    

    There's a lot more output that I haven't included because it's pretty much all the same.

    Now this isn't terribly helpful, but we can see that for a bunch of declarations, the line number of the declaration differs by 20 lines.
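
The 20-line shift can be read off mechanically. Here is a small sketch that pulls the DW_AT_decl_line values out of the removed and added lines of the excerpt above and subtracts them pairwise. The regular expression and the pairing-by-order logic are assumptions that fit this excerpt and would need care on a diff where removed and added lines do not alternate one-to-one.

```python
import re
from typing import List

# Matches e.g. "│ -    <29614>   DW_AT_decl_line   : 148"
DECL_LINE = re.compile(r"([-+])\s+<[0-9a-f]+>\s+DW_AT_decl_line\s*:\s*(\d+)")


def decl_line_shifts(diff_text: str) -> List[int]:
    """Return (removed - added) line numbers for paired DW_AT_decl_line changes."""
    removed, added = [], []
    for line in diff_text.splitlines():
        match = DECL_LINE.search(line)
        if match:
            target = removed if match.group(1) == "-" else added
            target.append(int(match.group(2)))
    return [minus - plus for minus, plus in zip(removed, added)]


SAMPLE = """\
│      <29613>   DW_AT_decl_file   : 108
│ -    <29614>   DW_AT_decl_line   : 148
│ +    <29614>   DW_AT_decl_line   : 128
│      <29620>   DW_AT_decl_file   : 108
│ -    <29621>   DW_AT_decl_line   : 154
│ +    <29621>   DW_AT_decl_line   : 134
"""
print(decl_line_shifts(SAMPLE))  # prints [20, 20]
```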

    To get some more information, I used dwarfdump. This is what it says for the new build for the two entries I show in diffoscope (the members named __kind and __spins).

    0x0002960e:     DW_TAG_member
                      DW_AT_name    ("__kind")
                      DW_AT_decl_file       ("/usr/include/x86_64-linux-gnu/bits/thread-shared-types.h")
                      DW_AT_decl_line       (148)
                      DW_AT_decl_column     (0x07)
                      DW_AT_type    (0x000287a6 "int")
                      DW_AT_data_member_location    (0x10)
    
    0x0002961b:     DW_TAG_member
                      DW_AT_name    ("__spins")
                      DW_AT_decl_file       ("/usr/include/x86_64-linux-gnu/bits/thread-shared-types.h")
                      DW_AT_decl_line       (154)
                      DW_AT_decl_column     (0x03)
                      DW_AT_type    (0x0002879a "short int")
                      DW_AT_data_member_location    (0x14)
    

    As you can see by the given file name, these declarations come from libraries installed to the system. These appear to be headers for gcc's implementation of the c++ stdlib.


    So what's happened is that libstdc++ has updated in Ubuntu. Whatever updates happened have moved some code in some header files that Bitcoin Core includes in its use of the c++ stdlib. In turn, compiling with those updated headers results in different debug symbols because function declarations have moved in those header files. This then results in gcc computing a different build ID and a different CRC checksum for the debug symbols. This lastly results in the final binaries being slightly different, which causes the hashes to mismatch.

  9. monperrus commented at 8:35 AM on November 15, 2020: none

    Wow, many thanks for the super detailed and super clear explanation.

    Now, if I understand correctly, to fix this problem, we would need to prevent gcc from embedding the checksum of the dbg file within the binary itself.

    We would lose foolproof debugging, but we would increase reproducibility.

    Is that something we'd like to have?

  10. fanquake removed the label Bug on Nov 16, 2020
  11. sipa commented at 8:07 PM on November 16, 2020: member

    @monperrus My guess is that that would be an improvement in some cases (including this one), but it's not a fundamental solution. As long as gitian builds aren't done with a deterministic toolchain, updates will cause changes to the build output. Here we were just "lucky" that it was only in the debug information. The checksum presumably serves a function, so removing it presumably also comes with a cost.

  12. MarcoFalke commented at 7:40 AM on November 17, 2020: member

    Indeed. Guix is our best bet right now to properly pin the toolchain to a version. Relying on Ubuntu to not upgrade theirs is inherently brittle.

  13. MarcoFalke closed this on Nov 17, 2020

  14. monperrus commented at 5:19 PM on November 17, 2020: none

    OK, thanks, I understand the trade-off.

    Any ETA for releasing the Linux binaries with the Guix pipeline?

  15. MarcoFalke commented at 5:47 PM on November 17, 2020: member

    We were hoping to ship 0.21 with guix, but it looks like it will be 22.0

  16. monperrus commented at 8:26 PM on November 17, 2020: none

    Great, looking forward to it!

  17. monperrus commented at 9:37 AM on December 5, 2020: none

    For the record about Guix

  18. DrahtBot locked this on Feb 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.