build: optimize .h generation in GenerateHeaderFrom{Raw,Json}.cmake #30888

pull l0rinc wants to merge 2 commits into bitcoin:master from l0rinc:lorinc/cmake-header-optimization changing 2 files +29 −37
  1. l0rinc commented at 6:02 pm on September 12, 2024: contributor

    Follow up of the #30883 revert.

    Replaced multiple file writes with a single string template write. The raw content is first grouped into 8 byte chunks, followed by another regex replace which wraps them in std::byte or just the raw bytes, prefixed with 0x.

    Tested the output with diff -w and they’re the same - only whitespace differences because slightly different source formatting.


    Tested the Raw performance with:

    0time cmake -DRAW_SOURCE_PATH=src/bench/data/block413567.raw -DHEADER_PATH=build/after/block413567.raw.h -DRAW_NAMESPACE=benchmark::data -P cmake/script/GenerateHeaderFromRaw.cmake
    

    Before:

    15.41s user 23.06s system 97% cpu 39.593 total

    After:

    0.77s user 0.06s system 97% cpu 0.849 total


    Tested the Json performance with:

    0time cmake -DJSON_SOURCE_PATH=src/secp256k1/src/wycheproof/ecdsa_secp256k1_sha256_bitcoin_test.json -DHEADER_PATH=build/after/ecdsa_secp256k1_sha256_bitcoin_test.json -P cmake/script/GenerateHeaderFromJson.cmake
    

    Before:

    3.57s user 6.01s system 94% cpu 10.136 total

    After:

    0.17s user 0.01s system 98% cpu 0.187 total

  2. DrahtBot commented at 6:02 pm on September 12, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK maflcko, hebasto, willcl-ark

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

  3. DrahtBot added the label Build system on Sep 12, 2024
  4. l0rinc force-pushed on Sep 12, 2024
  5. in cmake/script/GenerateHeaderFromRaw.cmake:5 in 51a8ac54dc outdated
    1@@ -2,26 +2,26 @@
    2 # Distributed under the MIT software license, see the accompanying
    3 # file COPYING or https://opensource.org/license/mit/.
    4 
    5+# Extract the base name of the source file
    


    maflcko commented at 6:28 pm on September 12, 2024:
    I think it is fine to drop comment around self-explanatory and self-documenting code

    l0rinc commented at 6:53 pm on September 12, 2024:
    Done
  6. in cmake/script/GenerateHeaderFromRaw.cmake:8 in 51a8ac54dc outdated
    1@@ -2,26 +2,26 @@
    2 # Distributed under the MIT software license, see the accompanying
    3 # file COPYING or https://opensource.org/license/mit/.
    4 
    5+# Extract the base name of the source file
    6+get_filename_component(raw_source_basename ${RAW_SOURCE_PATH} NAME_WE)
    7+
    8+# Read the raw file content in hexadecimal form
    


    maflcko commented at 6:28 pm on September 12, 2024:
    Same, etc.

    l0rinc commented at 6:53 pm on September 12, 2024:
    Done
  7. maflcko approved
  8. maflcko commented at 6:29 pm on September 12, 2024: member

    Concept ACK, but it would be more consistent and easier to maintain to also transform the json one.

    Seems a bit absurd that the regex replacment is faster than appending to a string, so it would be good to test this on Windows and Linux with the minimum and maximum supported cmake version to make sure this is really the last time this needs to be touched.

  9. l0rinc commented at 6:36 pm on September 12, 2024: contributor

    Seems a bit absurd that the regex replacment is faster than appending to a string

    I did some simple profiling and this regex version likely works with some mutable stringbuilder in the background, while the previous string solution was likely quadratic since it’s immutable, so it was recreated after each byte. And the file write is likely faster than that since it uses some buffering.

    also transform the json one

    I’m still working on that one, but given the concept ACK, I’ll do that here as well.

  10. hebasto commented at 6:52 pm on September 12, 2024: member
    Concept ACK on increasing performance.
  11. l0rinc force-pushed on Sep 12, 2024
  12. l0rinc renamed this:
    build: Minimize I/O operations in GenerateHeaderFromRaw.cmake
    build: Minimize I/O operations in GenerateHeaderFrom{Raw,Json}.cmake
    on Sep 12, 2024
  13. l0rinc renamed this:
    build: Minimize I/O operations in GenerateHeaderFrom{Raw,Json}.cmake
    build: optimize .h generation in GenerateHeaderFrom{Raw,Json}.cmake
    on Sep 12, 2024
  14. l0rinc commented at 6:59 pm on September 12, 2024: contributor

    it would be good to test this on Windows and Linux with the minimum and maximum supported cmake version to make sure this is really the last time this needs to be touched.

    While the speed difference is algorithmic, shouldn’t be OS dependent, I would appreciate if someone would do that.

    would be more consistent and easier to maintain to also transform the json one

    pushed

  15. in cmake/script/GenerateHeaderFromJson.cmake:12 in 3af3619495 outdated
    12 
    13-file(WRITE ${HEADER_PATH} "#include <string_view>\n")
    14-file(APPEND ${HEADER_PATH} "namespace json_tests{\n")
    15-get_filename_component(json_source_basename ${JSON_SOURCE_PATH} NAME_WE)
    16-file(APPEND ${HEADER_PATH} "inline constexpr char detail_${json_source_basename}_bytes[]{\n")
    17+# Write the content of the .h file
    


    maflcko commented at 6:59 pm on September 12, 2024:
    Same. file(WRITE ${HEADER_PATH}... should be self-documenting already.

    l0rinc commented at 7:35 pm on September 12, 2024:
    removed
  16. in cmake/script/GenerateHeaderFromRaw.cmake:8 in 3af3619495 outdated
    1@@ -2,26 +2,24 @@
    2 # Distributed under the MIT software license, see the accompanying
    3 # file COPYING or https://opensource.org/license/mit/.
    4 
    5+get_filename_component(raw_source_basename ${RAW_SOURCE_PATH} NAME_WE)
    6+
    7 file(READ ${RAW_SOURCE_PATH} hex_content HEX)
    8-string(REGEX MATCHALL "([A-Za-z0-9][A-Za-z0-9])" bytes "${hex_content}")
    9+# Split lines every 16 characters and format the bytes into std::byte format
    


    maflcko commented at 7:02 pm on September 12, 2024:
    nit: Instead of describing what the code does, it would be better to explain why it does something, where relevant. Here, it could say that string-REGEX-REPLACE` was found to be faster than file-APPEND or string-APPEND. But just a style nit, up to you.

    l0rinc commented at 7:35 pm on September 12, 2024:
    I’m not surprised that regex replaces are faster (hence I didn’t comment on that), but cmake syntax isn’t my main language, that’s why I explained that instead. I could explain the history, but the end result is so simple that it may not be needed. If other reviewers require it, I’ll add some extra comments, of course.
  17. in cmake/script/GenerateHeaderFromJson.cmake:8 in 3af3619495 outdated
    1@@ -2,25 +2,23 @@
    2 # Distributed under the MIT software license, see the accompanying
    3 # file COPYING or https://opensource.org/license/mit/.
    4 
    5+get_filename_component(json_source_basename ${JSON_SOURCE_PATH} NAME_WE)
    6+
    7 file(READ ${JSON_SOURCE_PATH} hex_content HEX)
    8-string(REGEX MATCHALL "([A-Za-z0-9][A-Za-z0-9])" bytes "${hex_content}")
    9+# Split lines every 16 characters and format the bytes into std::byte format
    


    maflcko commented at 7:02 pm on September 12, 2024:

    The comment is wrong. There is no std::byte in a string’s char type.

    (Maybe just remove the comments?)


    l0rinc commented at 7:34 pm on September 12, 2024:
    removed them
  18. maflcko approved
  19. maflcko commented at 7:03 pm on September 12, 2024: member
    lgtm
  20. maflcko commented at 7:06 pm on September 12, 2024: member

    it would be good to test this on Windows and Linux with the minimum and maximum supported cmake version to make sure this is really the last time this needs to be touched.

    While the speed difference is algorithmic, shouldn’t be OS dependent, I would appreciate if someone would do that.

    Some people claimed that string-APPEND was faster on Windows than file-APPEND, but it was the inverse on Linux. So I think it would be good to check on both platforms. (I can do Linux, but not Windows).

  21. l0rinc force-pushed on Sep 12, 2024
  22. in cmake/script/GenerateHeaderFromJson.cmake:5 in a57e7bf79b outdated
    1@@ -2,25 +2,21 @@
    2 # Distributed under the MIT software license, see the accompanying
    3 # file COPYING or https://opensource.org/license/mit/.
    4 
    5+get_filename_component(json_source_basename ${JSON_SOURCE_PATH} NAME_WE)
    


    hebasto commented at 7:51 pm on September 12, 2024:

    nit: Can be replaced with the cmake_path() command.

    From CMake docs:

    Changed in version 3.20: This command has been superseded by the cmake_path() command…


    l0rinc commented at 8:09 pm on September 12, 2024:
    Done, thanks
  23. in cmake/script/GenerateHeaderFromJson.cmake:8 in a57e7bf79b outdated
    1@@ -2,25 +2,21 @@
    2 # Distributed under the MIT software license, see the accompanying
    3 # file COPYING or https://opensource.org/license/mit/.
    4 
    5+get_filename_component(json_source_basename ${JSON_SOURCE_PATH} NAME_WE)
    6+
    7 file(READ ${JSON_SOURCE_PATH} hex_content HEX)
    8-string(REGEX MATCHALL "([A-Za-z0-9][A-Za-z0-9])" bytes "${hex_content}")
    9+string(REGEX REPLACE "................" "\\0\n" formatted_bytes "${hex_content}")
    


    hebasto commented at 7:55 pm on September 12, 2024:
    nit: Using the string(REPEAT ...) command to create a string of 16 dots can improve the code’s clarity, I think.

    l0rinc commented at 8:02 pm on September 12, 2024:

    I thought of that but couldn’t find a way to do it inline - and defing a new variable in a separate line just for this makes it less readable

    0string(REPEAT "." 16 line_width)
    1string(REGEX REPLACE "${line_width}" "\\0\n" formatted_bytes "${hex_content}")
    
  24. hebasto commented at 7:56 pm on September 12, 2024: member
    Changes look correct. Going to benchmark them on Windows tomorrow.
  25. build: Minimize I/O operations in GenerateHeaderFromRaw.cmake
    Replaced multiple file writes with a single string template write.
    The raw content is first grouped into 8 byte chunks, followed by another regex replace which wraps them in `std::byte`.
    
    Tested the output with `diff -w` and they're the same - only whitespace differences because slightly different source formatting.
    
    Tested the performance with:
    > time cmake -DRAW_SOURCE_PATH=src/bench/data/block413567.raw -DHEADER_PATH=build/after/block413567.raw.h -DRAW_NAMESPACE=benchmark::data -P cmake/script/GenerateHeaderFromRaw.cmake
    
    Before:
    > 15.41s user 23.06s system 97% cpu 39.593 total
    After:
    > 0.77s user 0.06s system 97% cpu 0.849 total
    aa003d1568
  26. build: Minimize I/O operations in GenerateHeaderFromJson.cmake
    Tested the performance with:
    >  time cmake -DJSON_SOURCE_PATH=src/secp256k1/src/wycheproof/ecdsa_secp256k1_sha256_bitcoin_test.json -DHEADER_PATH=build/after/ecdsa_secp256k1_sha256_bitcoin_test.json -P cmake/script/GenerateHeaderFromJson.cmake
    
    Before:
    > 3.57s user 6.01s system 94% cpu 10.136 total
    
    After:
    > 0.17s user 0.01s system 98% cpu 0.187 total
    2a581144f2
  27. l0rinc force-pushed on Sep 12, 2024
  28. hebasto commented at 11:51 am on September 13, 2024: member

    Here are benchmarks on Windows using the following command in PowerShell:

    0> (Measure-Command { cmake -DRAW_SOURCE_PATH="src/bench/data/block413567.raw" -DHEADER_PATH="build/bench/data/block413567.raw.h" -DRAW_NAMESPACE="benchmark::data" -P cmake/script/GenerateHeaderFromRaw.cmake }).TotalSeconds
    
    • the master branch @ 07c7c96022dd325be1cd3b353d575eb6a5593f55 using file(APPEND ...) calls:
    0296.326647
    
    • this PR (three consecutive runs)
    01.6235875
    11.5456055
    21.5283277
    

    Tested on Windows 11 Pro + Visual Studio 2022 17.11.2 with builtin CMake 3.29.5-msvc4.

  29. maflcko commented at 11:55 am on September 13, 2024: member

    Nice! So this looks like a massive speedup on Windows.

    I think the two commits can be squashed, because they do the same thing (just to different files), like the previous modifications to the scripts, which were also a single commit.

  30. fanquake commented at 11:56 am on September 13, 2024: member
    This needs benchmarking on the systems used by devs (i.e Linux, and ideally those that previously reported issues), to ensure that we don’t just have the same problem of Linux performance regressions again.
  31. l0rinc commented at 11:58 am on September 13, 2024: contributor

    think the two commits can be squashed

    I’ve put the before/after measurements in the commit messages, I’d prefer them separate

  32. fanquake commented at 11:58 am on September 13, 2024: member

    I’ve put the before/after measurements in the commit messages, I’d prefer them separate

    You can put that in the PR description, that will be included in the merge.

  33. hebasto commented at 11:59 am on September 13, 2024: member
    The minimum supported CMake version, 3.22, and the most recent version, 3.30, show similar performance on Linux as in the PR description.
  34. in cmake/script/GenerateHeaderFromRaw.cmake:20 in 2a581144f2
    34+};
    35 
    36-file(APPEND ${HEADER_PATH} "\n};\n")
    37-file(APPEND ${HEADER_PATH} "inline constexpr std::span ${raw_source_basename}{detail_${raw_source_basename}_raw};\n")
    38-file(APPEND ${HEADER_PATH} "}")
    39+inline constexpr std::span ${raw_source_basename}{detail_${raw_source_basename}_raw};
    


    l0rinc commented at 12:00 pm on September 13, 2024:
    would it make sense to change the return type to std::span<const std::byte> instead?
  35. l0rinc commented at 12:01 pm on September 13, 2024: contributor
    Thanks a lot for checking @hebasto!
  36. maflcko commented at 12:12 pm on September 13, 2024: member

    review ACK 2a581144f28bad44de40122864f2f7b9fc5000de 👒

    Signature:

    0untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
    1RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
    2trusted comment: review ACK 2a581144f28bad44de40122864f2f7b9fc5000de 👒
    3uN2m9f++xQ2eaxEpbPURfgVo/1fNzq6POUDlY+cVP7hiSpJ14LUhbjwlC2ruVH9aZ+KTK0KBu0oyjisI8BUUCA==
    
  37. DrahtBot requested review from hebasto on Sep 13, 2024
  38. maflcko commented at 12:24 pm on September 13, 2024: member
    (Also tested on a fresh Ubuntu 24.04 VM and saw a 30x speedup, which is useful, because this part isn’t cached by ccache between builds)
  39. hebasto approved
  40. hebasto commented at 1:12 pm on September 13, 2024: member

    ACK 2a581144f28bad44de40122864f2f7b9fc5000de.

    The new scripts introduce trailing spaces in the generated header, but these are not essential.

  41. maflcko commented at 1:36 pm on September 13, 2024: member

    The new scripts introduce trailing spaces in the generated header, but these are not essential.

    In theory the spaces in the data itself could be dropped completely, because for the compiler (and a human reader) a single , without space is sufficient. But I agree it doesn’t matter, so probably best to leave as-is for now.

  42. maflcko added the label DrahtBot Guix build requested on Sep 16, 2024
  43. DrahtBot commented at 5:44 pm on September 16, 2024: contributor

    Guix builds (on x86_64) [untrusted test-only build, possibly unsafe, not for production use]

    File commit 0c4ff18ee9ec91b424ad26d2643e42566aa45e40(master) commit a20ad9faa701164b639d8f12deccb583e8367649(master and this pull)
    SHA256SUMS.part 79b358d3fd281b65... 5bbbf668eaff2bfc...
    *-aarch64-linux-gnu-debug.tar.gz 4e496d0eb67133bf... 4809bcef048b5c04...
    *-aarch64-linux-gnu.tar.gz 43490ff6ab9a74e2... 9afe25eb1448a0b8...
    *-arm-linux-gnueabihf-debug.tar.gz 38c68631f041e4d8... addd0a7434e9f3c8...
    *-arm-linux-gnueabihf.tar.gz a80c39716b285df7... afaa3a0e77b74900...
    *-arm64-apple-darwin-unsigned.tar.gz 59dce49b541dff6d... ff63b25e8b9ca0d3...
    *-arm64-apple-darwin-unsigned.zip acb355306dcbd43c... 52defa959c2c2f40...
    *-arm64-apple-darwin.tar.gz b6fa18bedb7fe497... de59410584c47ffc...
    *-powerpc64-linux-gnu-debug.tar.gz ab07199c08dcbf3d... 94cec6e4990bebcb...
    *-powerpc64-linux-gnu.tar.gz c93841f69a8a21f2... 710cd733918174ad...
    *-riscv64-linux-gnu-debug.tar.gz fa11806d89e9abd7... 17b984656ab88e1f...
    *-riscv64-linux-gnu.tar.gz 379cad9361252115... a9a49e7a8f0f18aa...
    *-x86_64-apple-darwin-unsigned.tar.gz 03fd8b75278b79c7... 7c7c150fe98eb388...
    *-x86_64-apple-darwin-unsigned.zip e7f5613fc474ffdb... 4cb1fc5bfc33cae2...
    *-x86_64-apple-darwin.tar.gz 9c1621a91eba759d... d0920b6b70db193f...
    *-x86_64-linux-gnu-debug.tar.gz 0aaee8826bab2f2b... 8bb54c026791163a...
    *-x86_64-linux-gnu.tar.gz 163cb7ffd9eec265... 11396265f90ba4fa...
    *.tar.gz 73ef39f7a7c8ccd9... 747ad66d4fca95a8...
    guix_build.log 2c42f4ee42e6a171... 510e236d2a2c212c...
    guix_build.log.diff 922169528ef9f503...
  44. DrahtBot removed the label DrahtBot Guix build requested on Sep 16, 2024
  45. willcl-ark approved
  46. willcl-ark commented at 1:56 pm on September 17, 2024: member

    tACK 2a581144f28bad44de40122864f2f7b9fc5000de

    I didn’t review the code in too much detail, but I like hebasto also saw a good speedup on Ubuntu 23.10 x86_64 from 14 seconds down to half a second for block413567.raw with cmake 3.30:

     0will@ubuntu in ~/src/bitcoin on  master [$?⇕] via △ v3.30.3 : 🐍 (bitcoin) took 12s
     1$ rm -i -Rf build
     2
     3will@ubuntu in ~/src/bitcoin on  master [$?⇕] via △ v3.30.3 : 🐍 (bitcoin)
     4$ time cmake -DRAW_SOURCE_PATH=src/bench/data/block413567.raw -DHEADER_PATH=build/before/block413567.raw.h -DRAW_NAMESPACE=benchmark::data -P cmake/script/GenerateHeaderFromRaw.cmake
     5
     6________________________________________________________
     7Executed in   14.37 secs    fish           external
     8   usr time   10.11 secs  936.00 micros   10.11 secs
     9   sys time    4.13 secs  176.00 micros    4.13 secs
    10
    11
    12will@ubuntu in ~/src/bitcoin on  master [$?⇕] via △ v3.30.3 : 🐍 (bitcoin) took 14s
    13$ rm -i -Rf build
    14
    15will@ubuntu in ~/src/bitcoin on  master [$?⇕] via △ v3.30.3 : 🐍 (bitcoin)
    16✗ gh pr checkout 30888
    17From https://github.com/bitcoin/bitcoin
    18 * [new ref]                 refs/pull/30888/head -> lorinc/cmake-header-optimization
    19Switched to branch 'lorinc/cmake-header-optimization'
    20
    21will@ubuntu in ~/src/bitcoin on  lorinc/cmake-header-optimization [$?] via △ v3.30.3 : 🐍 (bitcoin)
    22$ time cmake -DRAW_SOURCE_PATH=src/bench/data/block413567.raw -DHEADER_PATH=build/after/block413567.raw.h -DRAW_NAMESPACE=benchmark::data -P cmake/script/GenerateHeaderFromRaw.cmake
    23
    24________________________________________________________
    25Executed in  673.32 millis    fish           external
    26   usr time  564.70 millis  851.00 micros  563.85 millis
    27   sys time  108.13 millis  163.00 micros  107.97 millis
    

    The generated headers looked ok, minus whitespace and no newline at end of file.

    Overall a nice speedup!

  47. fanquake merged this on Sep 17, 2024
  48. fanquake closed this on Sep 17, 2024

  49. l0rinc deleted the branch on Sep 17, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-03 15:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me