build: Embedded ASMap [3/3]: Build binary dump header file #28792

pull fjahr wants to merge 11 commits into bitcoin:master from fjahr:2023-10-asmap-in-source changing 30 files +504 −244
  1. fjahr commented at 10:13 pm on November 4, 2023: contributor

    Depends on #33878. This is the third in a tripled of PRs that implement the necessary changes for embedding of asmap data into the binary. This last part add the initial asmap data, implements the build changes and adds further documentation.

    Currently an asmap file needs to be acquired by there user from some location or the user needs to generate one themselves. Then they need to move the file to the right place in datadir or pass the path to the file as -asmap=PATH in order to use the asmap feature. The change here allows for builds to embed asmap data into the bitcoind binary which makes it possible to use the feature without handling of the asmap file by the user. If the user starts bitcoind with -asmap the embedded data will be used for bucketing of nodes.

    The data lives in the repository at src/node/data/ip_asn.dat and can be replaced with a new version at any time. The idea is that the data should be updated with every release. By default the data at that location is embedded into the binary but there is also a build option to prevent this (-DWITH_EMBEDDED_ASMAP=OFF). In this case the original behavior of the -asmap option is maintained.

  2. DrahtBot commented at 10:13 pm on November 4, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/28792.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK Sjors, jonatack, laanwj, hodlinator
    Stale ACK sipa

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #33878 (refactor, docs: Embedded ASMap [2/3]: Refactor asmap internals and add documentation by fjahr)
    • #33770 (init: Require explicit -asmap filename by ryanofsky)
    • #33631 (init: Split file path handling out of -asmap option by fjahr)
    • #31260 (scripted-diff: Type-safe settings retrieval by ryanofsky)
    • #17783 (common: Disallow calling IsArgSet() on ALLOW_LIST options by ryanofsky)
    • #17581 (refactor: Remove settings merge reverse precedence code by ryanofsky)
    • #17580 (refactor: Add ALLOW_LIST flags and enforce usage in CheckArgFlags by ryanofsky)
    • #17493 (util: Forbid ambiguous multiple assignments in config file by ryanofsky)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

    LLM Linter (✨ experimental)

    Possible typos and grammar issues:

    • contrib/asmapand -> contrib/asmap and [missing space between backtick and “and”, which makes the sentence run together and slightly harder to read]
    • creator of used ASMap file -> creator of the used ASMap file OR creator of the ASMap file used [missing article / awkward phrasing that makes the sentence grammatically off]

    2025-11-25

  3. DrahtBot added the label CI failed on Nov 4, 2023
  4. fjahr force-pushed on Nov 4, 2023
  5. fjahr force-pushed on Nov 4, 2023
  6. fjahr force-pushed on Nov 4, 2023
  7. sipa commented at 0:24 am on November 5, 2023: member
    Would it make sense to only check in the asmap.dat file into the repository, and then have the build system convert it to a .h at compile time (similar to the approach used in the src/test/data directory)? That would mean there is also a directly-manipulable file in the repository (that can be used with asmap-tool etc).
  8. DrahtBot removed the label CI failed on Nov 5, 2023
  9. maflcko commented at 5:32 pm on November 5, 2023: member
    How much data would this add per year to the repo, assuming updates happen?
  10. sipa commented at 5:59 pm on November 5, 2023: member
    @maflcko The asmap file added here is around 1.6 MiB, so assuming we update for every 6 months, probably a bit over 3 MiB per year (gzip compresses asmap files only ~5%).
  11. luke-jr commented at 4:17 pm on November 8, 2023: member
    Why is the option of loading it from a file even in dev builds, not considered?
  12. fjahr commented at 7:08 pm on November 9, 2023: contributor

    Would it make sense to only check in the asmap.dat file into the repository, and then have the build system convert it to a .h at compile time (similar to the approach used in the src/test/data directory)? That would mean there is also a directly-manipulable file in the repository (that can be used with asmap-tool etc).

    Yeah, that would work as well. I have drafted that approach here: https://github.com/fjahr/bitcoin/commit/0f9109fc49b8add9c057dcacd537250d4539ae57 Details can change of course (like keeping the raw file in init) but if people prefer that approach, I will pull this into here.

  13. fjahr commented at 7:11 pm on November 9, 2023: contributor

    Why is the option of loading it from a file even in dev builds, not considered?

    It is considered. Loading from a file is the only option for dev builds if we go the other route mentioned in the description: “The alternative approach is not having the data in the source code but only adding it during the build phase of a release.” But this also comes with the aforementioned downsides.

  14. willcl-ark commented at 9:56 am on December 6, 2023: member

    A general question about this approach: do most contemporary OSes handle loading large amounts of (potentially) unused data into physical memory well-enough, that they would (likely) not load this extra 1.5-3MB of data in the case that the -asmap is not being used? Perhaps this will be handled by the demand paging system? If not, it seems to be a bit of a shame to increase binary size by 20% in all cases.

    It seems to make most sense to me to take the path suggested by @sipa, including the .dat file in the repo, and having it converted to a .h at compile time. It might also be nice to have a configuration flag to disable the embedded header creation for faster builds.

  15. DrahtBot added the label Needs rebase on Dec 14, 2023
  16. fjahr force-pushed on Dec 14, 2023
  17. DrahtBot removed the label Needs rebase on Dec 15, 2023
  18. DrahtBot added the label CI failed on Jan 16, 2024
  19. fjahr renamed this:
    Embedding ASMap files as binary dump header file
    Embed default ASMap as binary dump header file
    on Aug 27, 2024
  20. fjahr force-pushed on Aug 28, 2024
  21. fjahr marked this as ready for review on Aug 28, 2024
  22. fjahr force-pushed on Aug 28, 2024
  23. fjahr commented at 11:48 pm on August 28, 2024: contributor

    I am picking this back up since we have come a long way in terms of tooling outside of Bitcoin Core and lately I was just waiting for cmake and v28 feature freeze. I will suggest an early merge in the cycle for v29 so more people are encouraged to use one steady file over a longer period of time.

    I am now using the suggested approach of building the header file at compile time and I added the functionality for both autotools and cmake (the autotools one has existed for a while, I can also remove it if people think it’s unnecessary). Of course this is rebased to make cmake available. First time I am editing the cmake build system so there are probably things to improve… And there is also a command for asmap-tool.py to generate the header file.

    I am adding the latest (1724248800) asmap file from asmap-data, the corresponding creation process can be reviewed in the issue and PR.

    I have not added a config option to skip the asmap header file but I will look into that next.

  24. DrahtBot removed the label CI failed on Aug 29, 2024
  25. Sjors commented at 9:13 am on August 29, 2024: member

    I am now using the suggested approach of building the header file at compile time

    I think the suggestion was to not include the header itself in that case.

    the autotools one has existed for a while, I can also remove it if people think it’s unnecessary

    Maybe keep it in until #30664 is merged.

  26. fjahr force-pushed on Aug 29, 2024
  27. fjahr commented at 6:35 pm on August 29, 2024: contributor
    Dropped the autotools build support after the discussion on IRC today (plan is to remove autotools in 24h) and rebased for good measure to have latest CMake fixups included.
  28. in src/init.cpp:1264 in e5a25fd708 outdated
    1265-                InitError(strprintf(_("Could not parse asmap file %s"), fs::quoted(fs::PathToString(asmap_path))));
    1266-                return false;
    1267+
    1268+            if (fs::exists(asmap_path)) {
    1269+                asmap = DecodeAsmap(asmap_path);
    1270+                if (asmap.size() == 0) {
    


    jonatack commented at 6:46 pm on August 29, 2024:

    nit, here and line 1271

    0                if (asmap.empty()) {
    

    fjahr commented at 8:48 pm on August 30, 2024:
    done
  29. fjahr commented at 6:51 pm on August 29, 2024: contributor

    I am now using the suggested approach of building the header file at compile time

    I think the suggestion was to not include the header itself in that case.

    I see. I am indifferent between having them in the repo or not but I am not sure what the arguments are for doing one or the other.

    We currently have two cases that seem comparable where we have the raw files as part of the repo and then make header files out of it in one way or another:

    • The test/data/* files that are built at compile time and the headers are not included in the repo
    • The chainparamsseeds that are built on demand for a release using the python script in contrib. This header file is included in the repo.

    We’ll have the raw file in the repo for sure but I am indifferent between using the build system or the python tool to make the header and I have implemented both which is good for testing anyway, I guess. I am also indifferent between having the header file in the repo or not. Conceptually the chainparamseeds seems closer to what we would do with an embedded asmap file: we basically also refresh it once per (point) release. So I guess that’s why I did it this way, because I was looking at chainparamsees as the reference case.

    I am curious if there are arguments for or against having the header file in the repo that I am currently not seeing. But even if not I am happy to go with what reviewers prefer because I just don’t have a preference right now. Maybe we’ll want to update the chainparamseeds process too if there is an argument for it.

  30. in contrib/asmap/asmap-tool.py:132 in e5a25fd708 outdated
    127@@ -127,6 +128,20 @@ def main():
    128                                    help="address file containing getnodeaddresses output to use in the comparison "
    129                                    "(make sure to set the count parameter to zero to get all node addresses, "
    130                                    "e.g. 'bitcoin-cli getnodeaddresses 0 > addrs.json')")
    131+
    132+    parser_gen_header = subparsers.add_parser("gen_header",
    


    jonatack commented at 7:06 pm on August 29, 2024:

    Might be good to add an example of these new commands and arguments to /contrib/asmap/README.md (and also mention the asmap tooling in /contrib/README.md).

    A higher level doc of overall steps to follow may be helpful, too, unless there is one already that I have forgotten about and overlooked.


    fjahr commented at 8:52 pm on August 30, 2024:

    I have expanded /contrib/asmap/README.md and /contrib/README.md and also added more details to the -asmap help.

    It sounds good to me to add a doc/asmap.md file with a higher level view but I would like to leave it as a follow-up that @jurraca might want to take on soon. He is currently working on getting https://asmap.org in shape and that doc would probably be a condensed version of that.

  31. jonatack commented at 7:08 pm on August 29, 2024: member
    First look at e5a25fd708858204cb6060393dc59d1dfed68acb
  32. Sjors commented at 10:11 am on August 30, 2024: member

    Concept ACK @fjahr if cmake can generate src/init/ip_asn.h from ip_asn.dat and it’s relative quick, then it should be sufficient to only include the .dat file. It saves space in the repo.

    If on the other hand you have to manually call a Python tool to produce the header file, then it would be too much inconvenience for everyone who wants to build Bitcoin Core from source.

    I just tried deleting src/init/ip_asn.h and running cmake again. It seems to take just a fraction of a second, but not sure how to measure that.

    ip_asn.h is also not human readable. It’s not useful to compare the diff between changes. If it were, that could be argument in favor of including it.

    Can you expand the documentation for -asmap to explain how it behaves? I haven’t studied the code in detail yet, but do we now always turn the -asmap feature on, using the default map? Or is it off by default and can be used with -asmap=1 for the default map and -asmap=somefile.dat for a custom / up to date map?

  33. fjahr force-pushed on Aug 30, 2024
  34. fjahr force-pushed on Aug 30, 2024
  35. DrahtBot added the label CI failed on Aug 30, 2024
  36. DrahtBot commented at 8:54 pm on August 30, 2024: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/29496076146

    Make sure to run all tests locally, according to the documentation.

    The failure may happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  37. fjahr commented at 9:09 pm on August 30, 2024: contributor

    I have addressed the latest feedback from @Sjors and @jonatack and rebased again because of the CMake situation and because I overhauled a lot of the changes anyway.

    In these latest changes I have

    • Re-organized the commits, I think review should be a bit easier like this
    • Expanded documentation and tests
    • Made it possible to build with the embedded ASMap data or without
    • Removed the header file from the repo
  38. fjahr commented at 9:24 pm on August 30, 2024: contributor

    Can you expand the documentation for -asmap to explain how it behaves? I haven’t studied the code in detail yet, but do we now always turn the -asmap feature on, using the default map? Or is it off by default and can be used with -asmap=1 for the default map and -asmap=somefile.dat for a custom / up to date map?

    I have expanded the help text for -asmap but I want to expand on it a bit here to explain my reasoning.

    Usage of the embedded asmap data is off by default right now. I am open to throw an on-by-default PR out there as a follow-up but I don’t want that conceptual discussion to hold up review of the implementation PR here. My feeling is that most people will be more comfortable with the v2-transport model: Implement as an off-by-default option for one release and then enable by default in the next version if there are no issues discovered that block it.

    On the -asmap functionality: If a file path is given (-asmap=foo.map), the file is either found and opened, otherwise we error. If no path is given (-asmap or -asmap=1) we first look for a file in the default file location. If there is no file in the default location and the build includes the embedded asmap data, we use the embedded data. I was going back and forth on that last part, i.e. using the embedded data instead of the file in the default location when both are present. But I opted for this way because 1. we are not breaking any workflows that were working before. If someone puts their file in the default location and expects it to be used, it’s still going to be used. And 2. dropping an asmap file in the default location can still be the most comfortable way of updating your asmap data, for example if you want to use the latest file from asmap-data instead of the embedded one.

  39. fjahr force-pushed on Aug 30, 2024
  40. fjahr force-pushed on Sep 1, 2024
  41. fjahr force-pushed on Sep 1, 2024
  42. fjahr force-pushed on Sep 1, 2024
  43. fjahr force-pushed on Sep 1, 2024
  44. Sjors commented at 6:42 am on September 2, 2024: member

    Usage of the embedded asmap data is off by default right now.

    That makes sense.

    If a file path is given (-asmap=foo.map), the file is either found and opened, otherwise we error. If no path is given (-asmap or -asmap=1) we first look for a file in the default file location. If there is no file in the default location and the build includes the embedded asmap data, we use the embedded data.

    I think that’s a good approach.

    CI is unhappy :-(

  45. fjahr commented at 9:31 am on September 2, 2024: contributor

    CI is unhappy :-(

    Yeah, I’ve been trying to fix that but it’s tricky because I can’t reproduce it locally. I guess I need to ask someone with more CMake experience.

  46. in cmake/script/WriteIPASN.cmake:14 in b20f0ef642 outdated
     8+execute_process(
     9+    COMMAND xxd -i "${RAW_FILE}"
    10+    COMMAND sed "s/unsigned char .*/static const unsigned char ip_asn[] = {/"
    11+    COMMAND sed "s/^unsigned int .*_len/static const unsigned int ip_asn_len/"
    12+    OUTPUT_VARIABLE HEX_DUMP
    13+)
    


    maflcko commented at 9:34 am on September 2, 2024:
    missing COMMAND_ERROR_IS_FATAL? Otherwise, a failure is silently ignored, no?

    fjahr commented at 9:53 am on September 2, 2024:
    Added, let’s see how it goes, I didn’t know the function could fail silently.
  47. in cmake/script/WriteIPASN.cmake:5 in b20f0ef642 outdated
    0@@ -0,0 +1,16 @@
    1+file(WRITE "${GENERATED_FILE}" "#ifndef BITCOIN_INIT_IP_ASN_H\n")
    2+file(APPEND "${GENERATED_FILE}" "#define BITCOIN_INIT_IP_ASN_H\n")
    3+file(APPEND "${GENERATED_FILE}" "/**\n")
    4+file(APPEND "${GENERATED_FILE}" " * ASMap data, mapping IP prefixes to ASNs,\n")
    5+file(APPEND "${GENERATED_FILE}" " * AUTOGENERATED by contrib/asmap/asmap-tool.py\n")
    


    maflcko commented at 9:38 am on September 2, 2024:
    Seems a bit odd to implement this both in Python and cmake (by calling xxd). Wouldn’t it be easier to just implement it once and require that to be called before use in the other place?

    fjahr commented at 12:04 pm on September 2, 2024:
    Now using the python script from cmake, so marking as resolved.
  48. maflcko commented at 9:38 am on September 2, 2024: member
    I presume the CI fails because xxd is missing?
  49. fjahr force-pushed on Sep 2, 2024
  50. fjahr force-pushed on Sep 2, 2024
  51. fjahr commented at 9:59 am on September 2, 2024: contributor

    I presume the CI fails because xxd is missing?

    Maybe but it would really surprise me because the failures are in Ubuntu 22.04 environments and I tried reproduce the error in a fresh Ubuntu 22.04 VM and it seemed xxd was included from the start.

  52. fjahr commented at 10:01 am on September 2, 2024: contributor

    Sounds like it can’t find the raw file…

    0CMake Error at /ci_container_base/cmake/script/WriteIPASN.cmake:8 (execute_process):
    1  execute_process error getting child return code: No such file or directory
    
  53. maflcko commented at 10:02 am on September 2, 2024: member

    Maybe but it would really surprise me because the failures are in Ubuntu 22.04 environments and I tried reproduce the error in a fresh Ubuntu 22.04 VM and it seemed xxd was included from the start.

    Are you sure? Note that the CI doesn’t use a desktop VM image, but a minimal docker image. I get:

    0# xxd
    1bash: xxd: command not found
    
  54. fjahr commented at 10:03 am on September 2, 2024: contributor

    Note that the CI doesn’t use a desktop VM image, but a minimal docker image

    Ah, then I guess that was my mistake

  55. fjahr force-pushed on Sep 2, 2024
  56. fjahr force-pushed on Sep 2, 2024
  57. fjahr commented at 12:04 pm on September 2, 2024: contributor

    Looks not using xxd fixed it, thanks @maflcko !

    Seems a bit odd to implement this both in Python and cmake (by calling xxd). Wouldn’t it be easier to just implement it once and require that to be called before use in the other place?

    Yeah, I first implemented it in python, then autotools and finally cmake. I kept the python version around because I wasn’t sure which approach we might actually want to go (see #28792 (comment)) and because I thought there might be some possible testing scenarios using it. But it doesn’t matter now since we needed an xxd-free approach to work on all environments and implementing the functionality in cmake directly seemed cumbersome, so I am now just using the python script from cmake as well, leaving us with just the python implementation.

  58. fjahr force-pushed on Sep 2, 2024
  59. in contrib/asmap/asmap-tool.py:225 in 39729a910a outdated
    220+ */
    221+unsigned char ip_asn[] = {{
    222+    {hex_data}
    223+}};
    224+
    225+unsigned int ip_asn_len = {len(data)};
    


    maflcko commented at 12:23 pm on September 2, 2024:
    Using a mutable C-style array seems a bit fragile. IIUC, starting with C++17, using inline constexpr std::array ip_asn{...}; should allow the compiler to optimize the layout, as well as enforce immutability. Moreover, std::array drops the need for ip_asn_len, because the size is embedded in the type of the array. Unrelated, you can change the 0x... above to std::byte{0x...} and then use MakeUCharSpan instead of legacy C-Style reinterpret cast and static cast in the code that uses the array.

    fjahr commented at 3:56 pm on September 2, 2024:
    done
  60. Sjors commented at 1:08 pm on September 2, 2024: member

    so I am now just using the python script from cmake as well, leaving us with just the python implementation

    Mmm, is Python already a build requirement though? Afaik it’s only a test requirement. So if the header can only be generated using a Python script, then that’s an argument for including it in the source code.

  61. maflcko commented at 1:16 pm on September 2, 2024: member

    if the header can only be generated using a Python script, then that’s an argument for including it in the source code.

    It is possible to generate it in native cmake as well. See for example cmake/script/GenerateHeaderFromJson.cmake. Note that previously the “cmake” generation used xxd, which isn’t a build-requirement either.

    Edit: To clarify: I am against reverting to including the data twice. 1.6 MB on every bump is bad enough. Twice that for no strong reason seems annoying to anyone cloning the repo in the future.

  62. Sjors commented at 1:35 pm on September 2, 2024: member

    The current version of cmake/script/GenerateHeaderFromJson.cmake in this PR calls asmap-tool.py: https://github.com/bitcoin/bitcoin/pull/28792/commits/07f05d5d3c773106fb75a9a25ae970ba0c6cf3a0#diff-cdce11cce5bf019b8a9c82d69718aefd2cca265cbaa45f7e0cebc497f60ddf25R25

    Is there a way to do this without relying either on Python or xxd?

    Twice that for no strong reason seems annoying to anyone cloning the repo in the future.

    Agreed.

  63. fjahr commented at 2:22 pm on September 2, 2024: contributor

    To clarify: I am against including the data twice. 1.6 MB on every bump is bad enough. Twice that for no strong reason seems annoying to anyone cloning the repo in the future.

    Now I’m confused, the header file isn’t included anymore since a while ago so the data is included only once already, or am I missing something? Are you just saying that shouldn’t be changed back?

  64. fjahr force-pushed on Sep 2, 2024
  65. fjahr commented at 3:57 pm on September 2, 2024: contributor
    Switched back to a native approach in cmake without python or xxd.
  66. fjahr force-pushed on Sep 2, 2024
  67. DrahtBot removed the label CI failed on Sep 2, 2024
  68. in cmake/script/GenerateHeaderAsmap.cmake:6 in b557dc5976 outdated
    0@@ -0,0 +1,33 @@
    1+# Copyright (c) 2023-present The Bitcoin Core developers
    2+# Distributed under the MIT software license, see the accompanying
    3+# file COPYING or https://opensource.org/license/mit/.
    4+
    5+file(READ ${RAW_FILE} hex_content HEX)
    6+string(REGEX MATCHALL "([A-Za-z0-9][A-Za-z0-9])" bytes "${hex_content}")
    


    maflcko commented at 12:57 pm on September 5, 2024:
    Could rebase and use the existing generate_header_from_raw, which should do the same thing, basically?

    fjahr commented at 10:00 am on September 13, 2024:
    Seems like there is still some tinkering going on on that one, I will give it a try when that has settled down.

    fjahr commented at 9:34 am on September 20, 2024:
    I am now using that function. I thought there might be issue with the missing header guards but the generated files are not listed. There is now also no more comment in the generated file.

    maflcko commented at 9:56 am on September 20, 2024:

    missing header guard

    That was somewhat intentional, I guess. A header guard is really only useful for module headers to avoid compile errors if the module is included several times transitively/recursively.

    However, for data-holding headers, the include should only ever happen in a single place in the whole program. Otherwise it seems too easy to bloat the binary. So a compile error in that case is actually intentional and useful.

  69. DrahtBot added the label CI failed on Sep 12, 2024
  70. DrahtBot removed the label CI failed on Sep 15, 2024
  71. DrahtBot added the label Needs rebase on Sep 17, 2024
  72. fjahr force-pushed on Sep 20, 2024
  73. fjahr commented at 9:32 am on September 20, 2024: contributor
    Rebased to incorporate latest relevant changes to the build system and AutoFile. This now also builds on #30901 so please ignore the first commit or leave your review of it in that PR.
  74. DrahtBot removed the label Needs rebase on Sep 20, 2024
  75. fjahr commented at 10:02 pm on September 20, 2024: contributor
    CI failures (timeouts) seem unrelated, should be ready for review as I am hoping #30901 will get merged quickly.
  76. DrahtBot added the label Needs rebase on Nov 6, 2024
  77. fjahr force-pushed on Nov 6, 2024
  78. DrahtBot removed the label Needs rebase on Nov 7, 2024
  79. sipa commented at 2:39 pm on December 2, 2024: member

    Concept ACK.

    It would be nice if the asmap data blob didn’t need to be converted to std::vector<bool> at runtime anymore, because that implies using memory twice (the blob itself, plus the decoded vector, despite containing effectively the same data). To improve upon that, I don’t think it’s hard to make the asmap interpreter code operate on a std::span<const std::byte> instead of a std::vector<bool>. Do you feel like looking into that, or do you want me to do so as a follow-up?

  80. fjahr commented at 2:07 pm on December 13, 2024: contributor

    It would be nice if the asmap data blob didn’t need to be converted to std::vector<bool> at runtime anymore, because that implies using memory twice (the blob itself, plus the decoded vector, despite containing effectively the same data). To improve upon that, I don’t think it’s hard to make the asmap interpreter code operate on a std::span<const std::byte> instead of a std::vector<bool>. Do you feel like looking into that, or do you want me to do so as a follow-up?

    That makes sense and I am giving it a try right now. Will follow-up here when it’s done!

  81. DrahtBot added the label Needs rebase on Dec 17, 2024
  82. fjahr force-pushed on Jan 15, 2025
  83. DrahtBot commented at 8:44 pm on January 15, 2025: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/35676029008

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  84. DrahtBot added the label CI failed on Jan 15, 2025
  85. fjahr force-pushed on Jan 15, 2025
  86. fjahr force-pushed on Jan 15, 2025
  87. DrahtBot removed the label CI failed on Jan 16, 2025
  88. fjahr commented at 4:13 pm on January 16, 2025: contributor

    Rebased with an updated #30901 and implemented @sipa ’s suggestion here: #28792 (comment) This took me a bit more time than I anticipated and a couple of times I wasn’t sure which approach to go with. This is why I have kept this part as a separate commit for now and I am looking for approach ACKs on that last part in particular first before I squash it into the appropriate previous commits.

    The most interesting pieces IMO:

    • Dual member approach in NetGroupManager with m_asmap and m_loaded_asmap (approach suggested by @sipa )
    • Usage of named constructors
    • Checksum calculation: I wrote a new serializer but didn’t use it because it didn’t feel we should add it just for this. Converting also feels ugly but seems to be the easiest approach. Or maybe I am overlooking something.
  89. in src/util/asmap.cpp:234 in 6cf4930d15 outdated
    233 {
    234-    LogInfo("Opened asmap data (%zu bytes) from embedded byte array\n", data.size());
    235-    return Decode(data);
    236+    HashWriter asmap_hasher;
    237+
    238+    // Asmap used to be a bool vector, so in order to keep the checksum
    


    sipa commented at 9:25 pm on January 16, 2025:
    Do we care about keeping the checksum calculation compatible? For almost all users, this PR being merged will mean a new asmap database anyway, so they’ll incur the price of addrman invalidation that brings anyway.

    fjahr commented at 6:36 pm on January 20, 2025:
    I guess you are right, I probably got a bit too caught up in this thinking that there should be an easier way to keep it consistent. I have removed that backwards compatibility code and updated the test.
  90. DrahtBot removed the label Needs rebase on Jan 17, 2025
  91. fjahr force-pushed on Jan 20, 2025
  92. in src/test/fuzz/asmap.cpp:17 in f2e673f944 outdated
    26-    true, true, false, true, true, true, true, true, true, true, false, false, false, false, false, false, false, false, // Match 0x00
    27-    true, true, false, true, true, true, true, true, true, true, false, false, false, false, false, false, false, false, // Match 0x00
    28-    true, true, false, true, true, true, true, true, true, true, false, false, false, false, false, false, false, false, // Match 0x00
    29-    true, true, false, true, true, true, true, true, true, true, true, true, true, true, true, true, true, true,         // Match 0xFF
    30-    true, true, false, true, true, true, true, true, true, true, true, true, true, true, true, true, true, true          // Match 0xFF
    31+static const std::vector<std::byte> IPV4_PREFIX_ASMAP = {
    


    sipa commented at 3:28 pm on January 29, 2025:

    This seems to be the prefix it’s trying to match itself, not code for matching it.

    I suggest: static const auto = "fb03ec0fb03fc0fe00fb03ec0fb03fc0fe00fb03ec0fb0fffffeff"_hex_v;


    fjahr commented at 8:29 pm on February 13, 2025:
    done
  93. in contrib/asmap/asmap-tool.py:229 in 97bee93a9c outdated
    224+
    225+        file_content = f"""#include <cstddef>
    226+#include <span>
    227+
    228+namespace node::data {{
    229+inline constexpr std::byte detail_ip_asn_raw[] {{
    


    sipa commented at 3:36 pm on January 29, 2025:
    This could be a lot smaller by using the ""_hex user-defined literal (if you include util/strencodings.h); it’d create an std::array<std::byte, N> instead of a C-array, but the std::span constructor below just work just as well on it.

    fjahr commented at 8:36 pm on February 13, 2025:
    This is neat but I found it to be not as straight forward as it seemed initially. Aside from the additional include I found that I needed to bump template-depth and constexpr-steps in order to make this work at compile time with clang. I am still a bit unclear on what the trade-offs are for this and I will open a separate PR with this so that this discussion doesn’t extend the scope here further.

    sipa commented at 8:51 pm on February 13, 2025:
    Yeah, given that the file isn’t committed its size doesn’t matter that much. And it may be that constexpr evaluating a huge amount of data is non-trivial for the compiler, unsure. This isn’t that important.

    fjahr commented at 8:59 pm on February 13, 2025:
    Opened #31861 so it can be discussed there :)
  94. in contrib/asmap/asmap-tool.py:12 in 97bee93a9c outdated
     8@@ -9,6 +9,7 @@
     9 import json
    10 import math
    11 from collections import defaultdict
    12+from pathlib import Path
    


    sipa commented at 3:39 pm on January 29, 2025:
    This new gen_header command doesn’t seem to be used by the build system; what is its point?

    fjahr commented at 8:29 pm on February 13, 2025:
    It was a left-over of a previous approach and I left it in initially in case it might be interesting for testing, experiments etc. just because it was already there. But the PR has grown in scope a lot and it’s becoming annoying to deal with for me and reviewers, so I dropped it now.
  95. sipa commented at 3:55 pm on January 29, 2025: member

    Concept ACK.

    I’d recommend integrating the last commit into the rest of the PR, making the std::vector<bool> -> std::span<std::byte> for the interpreter change first, and then introducing the embedded asmap data afterwards.

  96. DrahtBot added the label Needs rebase on Feb 12, 2025
  97. fjahr force-pushed on Feb 13, 2025
  98. fjahr commented at 8:37 pm on February 13, 2025: contributor

    Addressed @sipa ’s feedback and rebased on the latest version of #30901.

    The python tool command has been dropped and the commit have been reorganized as well.

  99. DrahtBot removed the label Needs rebase on Feb 13, 2025
  100. sipa commented at 3:24 pm on February 14, 2025: member
    Will review again soon, but a quick question: would it make sense to provide the ability for a user (who doesn’t necessarily have easy access to the source code) to extract the asmap data from a running node? For example, to be able to run asmap.py diff on it. It could be an RPC command that just spits out the data in textual format, or a REST endpoint that directly provides it in binary.
  101. DrahtBot added the label Needs rebase on Feb 14, 2025
  102. jonatack commented at 3:58 pm on February 28, 2025: member
    Concept ACK, will review after rebase.
  103. fjahr force-pushed on Mar 3, 2025
  104. DrahtBot removed the label Needs rebase on Mar 3, 2025
  105. fjahr commented at 6:44 pm on March 3, 2025: contributor

    Rebased the previous conflict and #30901 has been merged now as well 🥳

    I also added the -DWITH_EMBEDDED_ASMAP=NO to one of the CI jobs.

    would it make sense to provide the ability for a user (who doesn’t necessarily have easy access to the source code) to extract the asmap data from a running node?

    Let me address that in a follow-up. I think it’s a nice-to-have since the necessary data for diffing should always be available in https://github.com/asmap/asmap-data when we follow the workflow we propose but yeah, one less step of indirection and having the ability to do it all with the data that’s in the repo is definitely better.

  106. fjahr force-pushed on Mar 29, 2025
  107. fjahr commented at 10:46 pm on March 29, 2025: contributor
    Rebased to get all the latest cmake changes and swapped the included mapping data to the latest asmap run. This used the --fill feature in encoding which leads to additional space savings. This is why the embedded data has gone down significantly from ~1.7MB to ~1.4MB.
  108. in src/test/fuzz/asmap.cpp:34 in d07856e1b1 outdated
    32+    std::vector<std::byte> asmap_vec = ipv6 ? IPV6_PREFIX_ASMAP : IPV4_PREFIX_ASMAP;
    33     for (int i = 0; i < asmap_size; ++i) {
    34+        uint8_t byte = buffer[1 + i];
    35         for (int j = 0; j < 8; ++j) {
    36-            asmap.push_back((buffer[1 + i] >> j) & 1);
    37+            asmap_vec.push_back(static_cast<std::byte>((byte >> j) & 1));
    


    sipa commented at 6:31 pm on April 10, 2025:

    In commit “refactor: Use span of bytes for asmap”

    I don’t think this is what we want to happen: it’s converting every byte of input data into 8 bytes that are each 0x00 or 0x01. This commit is also changing the meaning of interpretation, which makes this a bit confusing.

    This should just be:

    0for (int i = 0; i < asmap_size; ++i) {
    1    asmap_vec.push_back(buffer[1 + i]);
    2}
    

    I think.


    fjahr commented at 10:52 pm on April 22, 2025:
    Yeah, that’s right, fixed it.
  109. in src/netgroup.h:66 in d07856e1b1 outdated
    60@@ -52,7 +61,10 @@ class NetGroupManager {
    61     bool UsingASMap() const;
    62 
    63 private:
    64-    /** Compressed IP->ASN mapping, loaded from a file when a node starts.
    65+    /** Compressed IP->ASN mapping.
    66+     *
    67+     * Data may beloaded from a file when a node starts or embedded in the
    


    sipa commented at 6:32 pm on April 10, 2025:

    In commit “refactor: Use span of bytes for asmap”

    Nit: typo be loaded


    fjahr commented at 10:52 pm on April 22, 2025:
    Fixed
  110. in src/netgroup.h:24 in d07856e1b1 outdated
    14@@ -15,9 +15,18 @@
    15  */
    16 class NetGroupManager {
    17 public:
    18-    explicit NetGroupManager(std::vector<bool> asmap)
    19-        : m_asmap{std::move(asmap)}
    20-    {}
    21+    static NetGroupManager WithEmbeddedAsmap(std::span<const std::byte> asmap) {
    


    sipa commented at 6:34 pm on April 10, 2025:

    In commit “refactor: Use span of bytes for asmap”

    I think the fact that this commit both changes the interpretation from individual-bools to bits-of-byte while also adding support for span-based embedded asmap file makes it a bit harder to review.

    Would it be possible to separate these into two commits (probably first switching the interpretation to be bit-of-std::byte based but leaving everything in std::vector<std::byte>), and then in a second commit add support for embedded span-based ones?

  111. DrahtBot added the label Needs rebase on Apr 11, 2025
  112. fjahr force-pushed on Apr 22, 2025
  113. fjahr commented at 11:03 pm on April 22, 2025: contributor

    Rebased and addressed @sipa ’s feedback. Also made a few further improvements: Added some missing includes, dropped some docs that have been added in #32110 already (silent merge conflict), unified asmap version/checksum naming (I went with version because that’s what we have been communicating to users already).

    Would it be possible to separate these into two commits (probably first switching the interpretation to be bit-of-std::byte based but leaving everything in std::vectorstd::byte), and then in a second commit add support for embedded span-based ones?

    Done, I split the commit. I hope it makes review easier to have these changes separate. The downside is that a lot of lines are now touched in two commits but I guess it will be ok since many of them are easy to grasp.

  114. DrahtBot removed the label Needs rebase on Apr 22, 2025
  115. achow101 added this to the milestone 30.0 on Apr 25, 2025
  116. in src/netgroup.cpp:91 in 828cf86a03 outdated
    90-    std::vector<bool> ip_bits(128);
    91+    std::vector<std::byte> ip_bits(16);
    92     if (address.HasLinkedIPv4()) {
    93         // For lookup, treat as if it was just an IPv4 address (IPV4_IN_IPV6_PREFIX + IPv4 bits)
    94         for (int8_t byte_i = 0; byte_i < 12; ++byte_i) {
    95-            for (uint8_t bit_i = 0; bit_i < 8; ++bit_i) {
    


    sipa commented at 1:21 am on April 26, 2025:

    In commit “refactor: Operate on bytes instead of bits in Asmap code”

    Simpler:

     0--- a/src/netgroup.cpp
     1+++ b/src/netgroup.cpp
     2@@ -90,18 +90,18 @@ uint32_t NetGroupManager::GetMappedAS(const CNetAddr& address) const
     3     if (address.HasLinkedIPv4()) {
     4         // For lookup, treat as if it was just an IPv4 address (IPV4_IN_IPV6_PREFIX + IPv4 bits)
     5         for (int8_t byte_i = 0; byte_i < 12; ++byte_i) {
     6-            ip_bits[byte_i] = static_cast<std::byte>(IPV4_IN_IPV6_PREFIX[byte_i]);
     7+            ip_bits[byte_i] = std::byte(IPV4_IN_IPV6_PREFIX[byte_i]);
     8         }
     9         uint32_t ipv4 = address.GetLinkedIPv4();
    10         for (int i = 0; i < 4; ++i) {
    11-            ip_bits[12 + i] = static_cast<std::byte>((ipv4 >> (24 - i * 8)) & 0xFF);
    12+            ip_bits[12 + i] = std::byte(ipv4 >> (24 - i * 8));
    13         }
    14     } else {
    15         // Use all 128 bits of the IPv6 address otherwise
    16         assert(address.IsIPv6());
    17         auto addr_bytes = address.GetAddrBytes();
    18         for (int8_t byte_i = 0; byte_i < 16; ++byte_i) {
    19-            ip_bits[byte_i] = static_cast<std::byte>(addr_bytes[byte_i]);
    20+            ip_bits[byte_i] = std::byte(addr_bytes[byte_i]);
    21         }
    22     }
    23     uint32_t mapped_as = Interpret(m_asmap, ip_bits);
    

    fjahr commented at 8:31 pm on May 8, 2025:
    Done
  117. in src/test/fuzz/asmap.cpp:32 in 828cf86a03 outdated
    32+    std::vector<std::byte> asmap_vec = ipv6 ? IPV6_PREFIX_ASMAP : IPV4_PREFIX_ASMAP;
    33     for (int i = 0; i < asmap_size; ++i) {
    34-        for (int j = 0; j < 8; ++j) {
    35-            asmap.push_back((buffer[1 + i] >> j) & 1);
    36-        }
    37+        asmap_vec.push_back(static_cast<std::byte>(buffer[1 + i]));
    


    sipa commented at 1:25 am on April 26, 2025:

    In commit “refactor: Operate on bytes instead of bits in Asmap code”

    Simpler:

    0asmap_vec.push_back(std::byte(buffer[1 + i]));
    

    fjahr commented at 8:30 pm on May 8, 2025:
    Taken
  118. in src/test/fuzz/asmap.cpp:34 in 828cf86a03 outdated
    34-        for (int j = 0; j < 8; ++j) {
    35-            asmap.push_back((buffer[1 + i] >> j) & 1);
    36-        }
    37+        asmap_vec.push_back(static_cast<std::byte>(buffer[1 + i]));
    38     }
    39+    std::vector<std::byte> asmap(asmap_vec);
    


    sipa commented at 1:26 am on April 26, 2025:

    In commit “refactor: Operate on bytes instead of bits in Asmap code”

    I don’t think this is necessary? The code above could operate on asmap instead of asmap_vec directly?


    fjahr commented at 8:30 pm on May 8, 2025:
    Hm, right, that’s a left-over from previous iterations. Done.
  119. in src/test/fuzz/asmap_direct.cpp:34 in 828cf86a03 outdated
    30@@ -31,19 +31,21 @@ FUZZ_TARGET(asmap_direct)
    31     if (buffer.size() - sep_pos - 1 > 128) return; // At most 128 bits in IP address
    32 
    33     // Checks on asmap
    34-    std::vector<bool> asmap(buffer.begin(), buffer.begin() + sep_pos);
    35+    std::vector<std::byte> asmap(reinterpret_cast<const std::byte*>(buffer.data()),
    


    sipa commented at 1:37 am on April 26, 2025:

    In commit “refactor: Operate on bytes instead of bits in Asmap code”

    This is not correct. The code (and comment) above indicate that buffer is being interpreted using 1 bit of asmap code per byte. The line here is interpreting it as 1 byte per byte.

    You need some conversion function, I think:

     0namespace {
     1
     2std::vector<std::byte> BitsToBytes(std::span<const uint8_t> input) noexcept
     3{
     4    std::vector<std::byte> ret;
     5    uint8_t next_byte{0};
     6    int next_byte_bits{0};
     7    for (uint8_t val : input) {
     8        next_byte |= (val & 1) << (next_byte_bits++);
     9        if (next_byte_bits == 8) {
    10            ret.push_back(std::byte(next_byte));
    11            next_byte = 0;
    12            next_byte_bits = 0;
    13        }
    14    }
    15    return ret;
    16}
    17
    18} // namespace
    19
    20FUZZ_TARGET(asmap_direct)
    21{
    22    ...
    23    auto asmap = BitsToBytes(std::span{buffer}.first(sep_pos));
    24    ...
    25    auto addr = BitsToBytes(std::span{buffer}.subspan(sep_pos + 1));
    26    ...
    27}
    

    fjahr commented at 8:31 pm on May 8, 2025:
    Taken but this code skipped the last byte if the input wasn’t a multiple of 8, so I added that.
  120. in src/test/fuzz/asmap_direct.cpp:38 in 828cf86a03 outdated
    36+                                 reinterpret_cast<const std::byte*>(buffer.data() + sep_pos));
    37     if (SanityCheckASMap(asmap, buffer.size() - 1 - sep_pos)) {
    38         // Verify that for valid asmaps, no prefix (except up to 7 zero padding bits) is valid.
    39-        std::vector<bool> asmap_prefix = asmap;
    40-        while (!asmap_prefix.empty() && asmap_prefix.size() + 7 > asmap.size() && asmap_prefix.back() == false) {
    41+        std::vector<std::byte> asmap_prefix = asmap;
    


    sipa commented at 1:40 am on April 26, 2025:

    In commit “refactor: Operate on bytes instead of bits in Asmap code”

    This doesn’t make sense anymore. You can’t slice off individual bits from an std::byte.


    fjahr commented at 8:33 pm on May 8, 2025:
    Right, I adapted this part to skip the last byte since the BitsToBytes function would pad again, but otherwise I think this now functions as the original fuzz test again.
  121. in src/util/asmap.cpp:25 in 828cf86a03 outdated
    21@@ -21,15 +22,14 @@ namespace {
    22 
    23 constexpr uint32_t INVALID = 0xFFFFFFFF;
    24 
    25-uint32_t DecodeBits(std::vector<bool>::const_iterator& bitpos, const std::vector<bool>::const_iterator& endpos, uint8_t minval, const std::vector<uint8_t> &bit_sizes)
    26+uint32_t DecodeBits(size_t& bitpos, const std::vector<std::byte>& data, uint8_t minval, const std::vector<uint8_t>& bit_sizes)
    


    sipa commented at 1:49 am on April 26, 2025:

    “In commit “refactor: Operate on bytes instead of bits in Asmap code”:

    I think it would be helpful to have a bit getter helper function:

    0inline bool GetBit(std::span<const std::byte> bytes, uint32_t bitpos) noexcept
    1{
    2    return (std::to_integer<uint8_t>(bytes[bitpos / 8]) >> (bitpos % 8)) & 1;
    3}
    

    And then for example:

    0    ...
    1    bit = GetBit(data, bitpos++);
    2    ...
    

    and

    0    ...
    1    if (GetBit(ip, ip.size() * 8 - bits)) {
    2        --bits;
    3    }
    4    ...
    

    fjahr commented at 8:30 pm on May 8, 2025:
    Good idea. Done, also made a BE version of it.
  122. in src/test/fuzz/asmap_direct.cpp:36 in c1c6943ea9 outdated
    32@@ -33,19 +33,19 @@ FUZZ_TARGET(asmap_direct)
    33     // Checks on asmap
    34     std::vector<std::byte> asmap(reinterpret_cast<const std::byte*>(buffer.data()),
    35                                  reinterpret_cast<const std::byte*>(buffer.data() + sep_pos));
    36-    if (SanityCheckASMap(asmap, buffer.size() - 1 - sep_pos)) {
    37+    if (SanityCheckASMap(std::span<const std::byte>(asmap), buffer.size() - 1 - sep_pos)) {
    


    sipa commented at 1:58 am on April 26, 2025:

    In commit “refactor: Use span instead of vector for data in util/asmap”

    No need for the explicit conversion:

    0-if (SanityCheckASMap(std::span<const std::byte>(asmap), buffer.size() - 1 - sep_pos)) {
    1+if (SanityCheckASMap(asmap, buffer.size() - 1 - sep_pos)) {
    

    Also twice the same further in this function.


    fjahr commented at 8:30 pm on May 8, 2025:
    Done
  123. fjahr force-pushed on May 8, 2025
  124. sipa commented at 3:47 pm on June 9, 2025: member
    To convince myself that with all the bit/byte order changes, no semantics of the actual interpretation are affected, I wrote some known-correct unit tests for the asmap interpreter, added before the changes in this PR, and maintained throughout: https://github.com/sipa/bitcoin/commits/pr28792. Feel free to grab any part of it.
  125. in src/init.cpp:1494 in 84a8ad67a8 outdated
    1494+                    } else {
    1495+                        InitError(strprintf(_("Could not parse asmap file in default location %s"), fs::quoted(fs::PathToString(asmap_path))));
    1496+                    }
    1497+                    return false;
    1498+                }
    1499+                node.netgroupman = std::make_unique<NetGroupManager>(NetGroupManager::WithLoadedAsmap(asmap));
    


    sipa commented at 7:41 pm on June 9, 2025:

    In commit “init, net: Implement usage of binary-embedded asmap data”

    Nit: use std::move(asmap) here to avoid a copy (of the entire loaded asmap blob).


    fjahr commented at 3:44 pm on June 11, 2025:
    Done
  126. sipa commented at 7:45 pm on June 9, 2025: member
    ACK 61714f390e74f49d0f37d7f96cc0645c8215728a
  127. DrahtBot requested review from Sjors on Jun 9, 2025
  128. DrahtBot requested review from jonatack on Jun 9, 2025
  129. fanquake commented at 9:53 am on June 10, 2025: member

    Can you update the PR description to reflect the current state of the PR, it seems like it describes this as one of multiple approaches, and mostly talks about a different approach?

    Looking through the changes here, I can’t see any documentation about where the .dat comes from, who’s responsible for generating it (what’s required determinism wise), how often it needs updating (who’s resposible for that, as well as maintaining any relevant tooling, and where that lives), how any of this effects the release process etc.

    If we are going to bundle the .dat, and use it by default in releases, can you elaborate on why we need a build option to disable the embedding, but not the asmap feature entirely; this seems to just complicate the settings/usage handling, i.e we need to handle the case where someone has disabled the embedding (built their own binary), but then passed the -asamp option, but not provided an asmap?

  130. fanquake commented at 3:25 pm on June 10, 2025: member

    Some other questions:

    • How quickly does the data degrade? Im guessing not so fast that there could be any adverse effects within the supported lifetime of a binary?
    • Is the expectation that we’d backport updates to any embedded data for every new (point) release?
  131. jurraca commented at 10:43 am on June 11, 2025: contributor

    How quickly does the data degrade?

    ran some numbers, counting by the number of addresses within networks, and diff’ing month to month, the ASmap from Dec 1 to Apr 1 drifted about 5%:

     0IPv4 drift from Dec 1
     1Jan 1: 1.40%
     2Feb 1: 3.72%
     3Mar 1: 4.90%
     4Apr 1: 5.23%
     5
     6IPv6 drift from Dec 1
     7Jan 1: 2.04%
     8Feb 1: 4.29%
     9Mar 1: 5.23%
    10Apr 1: 5.74%
    

    Method: running python3 contrib/asmap/asmap-tool.py diff map1.txt map2.txt which outputs the total count of addresses changed between the two files. I was running ASmap continually during those months but unfortunately don’t have files past Apr 1. Would be worth considering drift for a given node’s peerset against the ASmap. @0xB10C had previously shared some data which measured approximately 0.5% drift per month. (edit: this was the % of unmapped addresses in the peerset)

    EDIT: using diff_addrs docs to compare two ASmaps from March 23rd to June 11 with my node’s peers, the AS assignment drift is about 10%:

    0ASmap 1: 2025-03-23
    1ASmap 2: 2025-06-11
    2getnodeaddresses output from 2025-03-25
    3Summary: 5,092 (10.13%) of 50,260 addresses were reassigned (migrations=450, assignments=0, unassignments=4642)
    4getnodeaddresses output from 2025-06-10
    5Summary: 5,221 (10.37%) of 50,347 addresses were reassigned (migrations=420, assignments=0, unassignments=4801)
    

    This seems strangely high, and doesn’t match up with our previous estimates, so something may be off with my ASmaps…

  132. fjahr commented at 12:56 pm on June 11, 2025: contributor

    Can you update the PR description to reflect the current state of the PR, it seems like it describes this as one of multiple approaches, and mostly talks about a different approach?

    Right, fixed.

    Looking through the changes here, I can’t see any documentation about where the .dat comes from, who’s responsible for generating it (what’s required determinism wise), how often it needs updating (who’s resposible for that, as well as maintaining any relevant tooling, and where that lives), how any of this effects the release process etc.

    At a high level the description of the creation process as in practice at https://github.com/asmap/asmap-data here is still pretty much up-to-date: https://delvingbitcoin.org/t/asmap-creation-process/548

    I will add documentation on this here but I need to think about wording and placement a bit more. Similar to how we don’t share hard rules for when a PR is ready for merge I think it might be harmful to be overly specific with the description of the process here. Generally, the process is designed to share a lot of the characteristics of Core PRs/workflows to reduce cognitive overhead.

    What’s not spelled out on delving and won’t be in any documentation but should be said: Currently I am the sole maintainer of the data repo as in anyone can run the initiate the process and open a PR with the results but I am the only one who can merge stuff into asmap-data and kartograf currently. I am happy to share the access or move the repos if that is requested so there is some redundancy. The main question for me is if this should rather be shared with Bitcoin Core maintainers or rather regular contributors that are knowledgeable in the ASMap topic. I think I brought this up on IRC in the past but maybe now is a good time to revisit that question.

    If we are going to bundle the .dat, and use it by default in releases,

    The usage of asmap with the embedded asmap data is off by default and would be until we decide to change that (at least one release cycle), following the model of v2-transport. However, the .dat in the repository is embedded in builds by default, just to make sure we are on the same page.

    can you elaborate on why we need a build option to disable the embedding, but not the asmap feature entirely; this seems to just complicate the settings/usage handling, i.e we need to handle the case where someone has disabled the embedding (built their own binary), but then passed the -asamp option, but not provided an asmap?

    There was some interest expressed in this build option in the past though I am not sure anymore where and when that was. Might have just been the argument that this makes it easier for forks like Knots to build without the data. That and probably that embedding ~1.5MB of “obscure” data might scare some people in principle. I found the code overhead to be manageable but I am always happy to remove code and reduce the scope!

    EDIT: This was probably also before the data was checked into the repository. When discussing other solutions this build option would have been more important so that users could build when they had no file at hand.

    In the case where someone uses -asamp without a path to a file and the build does not have asmap data embedded the behavior falls back to the current one: The node looks for the file in the default location (in the datadir) and if it doesn’t find it, it returns an init error. There is a functional test that covers this behavior and one of the CI tasks builds without the embedded asmap. I wrote a bit more nuance on the parameter behavior here as well: #28792 (comment)

    How quickly does the data degrade? Im guessing not so fast that there could be any adverse effects within the supported lifetime of a binary?

    This is a difficult question to answer because part of the data is outdated by the time the map is created since part of the internet routing table is extremely volatile. In 2024 there were ~200k IPV4 BGP announcements daily for example. @jurraca has been building additional tooling and creating statistics to show how much volatility there really is and how much of it is relevant for the bitcoin network (deferring to his posts). Last but not least there is the dimension of legal entities vs. ASNs. Part of the volatility will harmless since the the same legal entity has just changed their ASN or shifted prefixes to a subsidiary etc. Unfortunately, this part is hard to quantify without considerable manual effort.

    But the guiding belief for this effort has been that even an badly outdated ASMap file should still be far better that the current default since prefixes don’t discriminate at all and it is not unusual for a node to have 2-3 their peers hosted at AWS or Hetzner. This was demonstrated in some of @virtu’s research though I can’t find a link to the presentation at the moment. Part of the relevant findings are teased here: #16599 (comment)

    Is the expectation that we’d backport updates to any embedded data for every new (point) release?

    I would recommend this, yes. I can’t think of a reason not to do it and it should be an easy commit to review because all it does is change the embedded .dat file to the same data that is in master so this should add considerable overhead afaict.

  133. fjahr force-pushed on Jun 11, 2025
  134. fjahr commented at 3:45 pm on June 11, 2025: contributor

    To convince myself that with all the bit/byte order changes, no semantics of the actual interpretation are affected, I wrote some known-correct unit tests for the asmap interpreter, added before the changes in this PR, and maintained throughout: https://github.com/sipa/bitcoin/commits/pr28792. Feel free to grab any part of it.

    Nice, I have added this as the first commit to show that it should pass on each of the following commits (with minor edits).

  135. fjahr force-pushed on Jun 13, 2025
  136. fjahr commented at 8:57 pm on June 13, 2025: contributor
    I have pushed an initial text for the asmap data documentation that should answer most of the questions @fanquake asked. I avoided answering the “who is responsible” part directly since the process is open for anyone to contribute just like contributing to bitcoin core is. IMO this is similar to how release-process.md describes the steps that need to be completed for the release but there is no prescription for who should perform these steps and it is also not spelled out explicitly that mainters need to merge the PRs that complete these steps.
  137. in doc/asmap-data.md:5 in 2495279f83 outdated
    0@@ -0,0 +1,58 @@
    1+# Embedded ASMap data
    2+
    3+## Background
    4+
    5+The ASMap feature (available via `-asmap`) makes it possible to use a peers AS Number (an ISP/hoster identifier)
    


    sipa commented at 11:01 pm on June 14, 2025:

    “peers AS Number” -> “peer’s AS number”

    Maybe add “(ASN)” as that term is used further.

  138. in doc/asmap-data.md:8 in 2495279f83 outdated
    0@@ -0,0 +1,58 @@
    1+# Embedded ASMap data
    2+
    3+## Background
    4+
    5+The ASMap feature (available via `-asmap`) makes it possible to use a peers AS Number (an ISP/hoster identifier)
    6+in netgroup bucketing in order to ensure a higher diversity in the peer
    7+set. By default the buckets are formed based on IP prefixes but this does not
    8+prevent having multiple connections to peers at the same large-scale hoster,
    


    sipa commented at 11:02 pm on June 14, 2025:
    “having multiple connections to peers” -> “having connections dominated by peers” (multiple isn’t so much a problem on itself)
  139. in doc/asmap-data.md:26 in 2495279f83 outdated
    21+Sourcing the data from a single trusted source is problematic as well.
    22+
    23+The [Kartograf](https://github.com/asmap/kartograf) tool was created to
    24+deal with these uncertainties as good as possible. The mapping data is sourced from RPKI, IRR and
    25+Routeviews. The former two are themselves used as security mechanisms to
    26+protect against BGP security issues, which why they are considered more secure and
    


    sipa commented at 11:09 pm on June 14, 2025:
    “which why” -> “which is why”
  140. in doc/asmap-data.md:27 in 2495279f83 outdated
    22+
    23+The [Kartograf](https://github.com/asmap/kartograf) tool was created to
    24+deal with these uncertainties as good as possible. The mapping data is sourced from RPKI, IRR and
    25+Routeviews. The former two are themselves used as security mechanisms to
    26+protect against BGP security issues, which why they are considered more secure and
    27+their data takes precedent. The latter is a trusted collector of BGP traffic
    


    sipa commented at 11:10 pm on June 14, 2025:
    “precedent” -> “precedence”
  141. in doc/asmap-data.md:31 in 2495279f83 outdated
    26+protect against BGP security issues, which why they are considered more secure and
    27+their data takes precedent. The latter is a trusted collector of BGP traffic
    28+and only used for IP space that is not covered by RPKI and IRR.
    29+
    30+The process in which the Kartograf project parses, processes and merges these
    31+data sources are deterministic. Given the raw download files from these
    


    sipa commented at 11:10 pm on June 14, 2025:
    “are” -> “is”
  142. in doc/asmap-data.md:33 in 2495279f83 outdated
    28+and only used for IP space that is not covered by RPKI and IRR.
    29+
    30+The process in which the Kartograf project parses, processes and merges these
    31+data sources are deterministic. Given the raw download files from these
    32+different sources, anyone can build their own map file and verify the content
    33+matches with other users results. Before the map is usable by Bitcoin Core
    


    sipa commented at 11:10 pm on June 14, 2025:
    “users” -> “users'”
  143. in doc/asmap-data.md:37 in 2495279f83 outdated
    32+different sources, anyone can build their own map file and verify the content
    33+matches with other users results. Before the map is usable by Bitcoin Core
    34+it needs to be encoded as well. This is done using `asmap-tool.py` in `contrib/asmap`
    35+and this step is deterministic as well.
    36+
    37+When it comes to obtaining the initial input data the high volatility remains
    


    sipa commented at 11:11 pm on June 14, 2025:
    comma after “data”
  144. in doc/asmap-data.md:54 in 2495279f83 outdated
    49+or an explicit schedule.
    50+
    51+## Release process
    52+
    53+As an upcoming release approaches the embedded ASMap data should be updated
    54+by replacing the `ip_asn.dat` in a newer ASMap file from the asmap-data
    


    sipa commented at 11:12 pm on June 14, 2025:
    “in” -> “to”
  145. fjahr force-pushed on Jun 15, 2025
  146. fjahr commented at 12:01 pm on June 15, 2025: contributor
    Addressed @sipa ’s comments (all taken) on the asmap data documentation.
  147. in doc/asmap-data.md:18 in ccbefd52a1 outdated
    13+available to users when they are unable to provide a file.
    14+
    15+## Data sourcing and tools
    16+
    17+ASMap is a mapping of IP prefix to ASN, essentially a snapshot of the
    18+internet routing table at a some point in time. Due to the high volatility
    


    maflcko commented at 7:51 am on June 16, 2025:
    In doc/asmap-data.md: “snapshot of the internet routing table at a some point in time.”
    a some → some [extra “a” makes the phrase ungrammatical]
    
    In doc/asmap-data.md under “Release process”: “replacing the ip_asn.dat to a newer ASMap file”
    to → with [wrong preposition]
    
    In doc/asmap-data.md under “Release process”: “obtaining a fresh for use in the upcoming release.”
    missing noun → “obtaining a fresh map for use in the upcoming release.” [omitted object makes it incomprehensible]
    

    fjahr commented at 11:21 am on June 16, 2025:
    All fixed
  148. fjahr force-pushed on Jun 16, 2025
  149. in doc/asmap-data.md:43 in 071483f649 outdated
    37+When it comes to obtaining the initial input data, the high volatility remains
    38+a challenge if users don't want to trust a single creator of used ASMap file.
    39+To overcome this, multiple users can start the download process at the exact
    40+same time which leads to a high likelihood that their downloaded data will be
    41+similar enough that they receive the same output at the end of the process.
    42+This process is regularly coordinated at the [asmap-data](https://github.com/asmap/asmap-data)
    


    sipa commented at 3:29 pm on June 24, 2025:

    See https://github.com/asmap/asmap-data/issues/25

    The repository description there says:

    This repository is strictly used for demonstration purposes to help with conceptual discussions about ASMap in the Bitcoin Core release process. Any data uploaded here should be treated with extreme caution and is not inteded for production use.

    which doesn’t seem to match the expectations around the repository as used here.


    fjahr commented at 10:50 pm on June 24, 2025:
    Thanks, I am addressing this in https://github.com/asmap/asmap-data/pull/26
  150. sipa commented at 5:24 pm on June 24, 2025: member
    Code review ACK 071483f64932e806ac8280d8c16e9bdf6be23eb3
  151. in doc/asmap-data.md:44 in 071483f649 outdated
    38+a challenge if users don't want to trust a single creator of used ASMap file.
    39+To overcome this, multiple users can start the download process at the exact
    40+same time which leads to a high likelihood that their downloaded data will be
    41+similar enough that they receive the same output at the end of the process.
    42+This process is regularly coordinated at the [asmap-data](https://github.com/asmap/asmap-data)
    43+project. If enough participants have joined the effort (5 or more is recommended) and a majority of the
    


    fanquake commented at 1:13 pm on June 25, 2025:

    If enough participants have joined the effort (5 or more is recommended)

    Just noting that of the last three additions to the repo (https://github.com/asmap/asmap-data/pull/23, https://github.com/asmap/asmap-data/pull/21, https://github.com/asmap/asmap-data/pull/19), it seems like none of them reached 5: I can see 3, 3, and 3 participants.


    fjahr commented at 4:37 pm on June 25, 2025:

    The reviewers of the PR are only ACKing that the actual data from the collaborative run was used and that the data was encoded correctly by reproducing the file hash. This is the part I am referring to a little further below:

    0Files will not be merged to the repository
    1without at least two additional reviewers confirming that the process described
    2above was followed as expected and that the encoding step yielded the same
    3file hash.
    

    The 5 participants is the number I am looking for in the collaborative run which can be seen in the issues that preceed the PRs, they should always be linked in the description of the PRs. For example this one https://github.com/asmap/asmap-data/issues/20 had 5 (preceeded the PR 21 you linked), and this one had 9 https://github.com/asmap/asmap-data/issues/22 (preceeded PR 23).

    I think can make the distinction more clear by spelling out which part is in the issue and which part is in the PR. Do you think that would make it clearer? I just try to strike a balance so that we don’t have to edit this document too frequently.

    The 5 participant minimum is kind of arbitrarily picked, I admit, but I try to get to what I would say is comparable to 3 ACKs. More would be nice but I wasn’t confident we would get more people involved on a regular basis and majority of 5 is 3 so basically this is a minimum of 3 ACKs on the input data. The mismatches are not comparable to NACKs because there is no right or wrong there (assuming no malice), they are just different because of timing or connection problems. On the mechanical encoding review part in the PR similarly I am looking for 2-3 ACKs.

  152. DrahtBot added the label Needs rebase on Jul 3, 2025
  153. fjahr force-pushed on Jul 5, 2025
  154. fjahr commented at 10:12 pm on July 5, 2025: contributor

    Rebased

    Also moving the conversation here that I started on IRC last week: I suggested some sharing of access to the tools and data repositories in the asmap github org in case of emergency and asked for feedback. @sipa suggested that instead the https://github.com/asmap/asmap-data could be moved to the bitcoin-core org instead. I now think this is actually even better because that repository holds data that is encoded particularly for use in Bitcoin Core while the rest of the tooling in the asmap org are more independent of this use case and could be used in other contexts.

    I would suggest we move the repository over once the PR is merged or very close to it, since it still needs some additional review. I guess I would still be able to help with administering the asmap-data repository after the move, similar to how @dergoegge administers the qa-assets repo for example, so there should be no additional workload for maintainers caused by this afaict.

    For completeness, here is a link to the (very brief) original conversation: https://bitcoin-irc.chaincode.com/bitcoin-core-dev/2025-06-26#1131356;

  155. DrahtBot removed the label Needs rebase on Jul 5, 2025
  156. DrahtBot added the label Needs rebase on Jul 19, 2025
  157. fjahr force-pushed on Jul 20, 2025
  158. fjahr commented at 9:56 pm on July 20, 2025: contributor
    Rebased and I shaved off 4 easy to review commits from here and put them into #33026. Maybe splitting this off helps to move things along a bit faster.
  159. DrahtBot removed the label Needs rebase on Jul 20, 2025
  160. DrahtBot added the label Needs rebase on Jul 25, 2025
  161. fjahr force-pushed on Jul 25, 2025
  162. DrahtBot removed the label Needs rebase on Jul 25, 2025
  163. DrahtBot added the label CI failed on Jul 25, 2025
  164. DrahtBot commented at 11:39 pm on July 25, 2025: contributor

    🚧 At least one of the CI tasks failed. Task no wallet, libbitcoinkernel: https://github.com/bitcoin/bitcoin/runs/46745894874 LLM reason (✨ experimental): Build failure caused by incorrect constructor invocation of ‘NetGroupManager’ in test code.

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  165. fjahr force-pushed on Jul 31, 2025
  166. fjahr force-pushed on Jul 31, 2025
  167. DrahtBot removed the label CI failed on Aug 1, 2025
  168. fjahr force-pushed on Aug 18, 2025
  169. fjahr commented at 4:34 pm on August 18, 2025: contributor
    Updated the map to that of the latest run: https://github.com/asmap/asmap-data/pull/30
  170. achow101 removed this from the milestone 30.0 on Aug 18, 2025
  171. achow101 added this to the milestone 31.0 on Aug 18, 2025
  172. fjahr commented at 3:41 pm on September 15, 2025: contributor

    Updated the map to that of the latest run: asmap/asmap-data#30

    FYI for reviewers, new maps are available in asmap-data and will continue to be updated over the next few months but I don’t plan to update the map data that is included here for at least a while. This is by choice: If this had been merged for v30 then this map data might have been included and it would have been the default until the release of v31. So keeping the data as-is kind of simulates that and allows for an easier analysis if there are noticeable signs of “decaying” data. But of course newer (and older) maps are always available in asmap-data for further testing. If you are looking for very old maps to test with you may have to check out the history of that repository since we remove maps that are outdated after a while.

  173. laanwj commented at 11:32 am on October 22, 2025: member
    Concept ACK
  174. TheBlueMatt commented at 3:02 pm on November 2, 2025: contributor

    I couldn’t really find a better place to leave this comment, even though its clearly not really quite targeted at this PR itself (which is awesome!), so I’ll leave it here and y’all can yell at me for not finding the right place later.

    I saw in the core-dev writeups that there appears to be some intent to build asmap files based on only RPKI data (if that’s wrong, please skip the rest of this comment). I fear that that is a particularly bad idea.

    While RPKI deployment has been quite successful, it is used primarily by larger ISPs and cloud service providers, with the long-tail of providers largely not using it. In seeking to use asmap data in Bitcoin Core, part of the goal is to improve diversity of connections across different ISPs to avoid a single large ISP or cloud service provider from being able to dominate connections and address tables. If we use an asmap built only from RPKI data, that (I assume? at least in part?) implies that we treat these large cloud providers each as separate entities while the long-tail of ISPs that we’d really like to bias towards will be treated as only one or a handful of ISPs [1].

    Historically I’ve heard many complaints that ASMap data isn’t deterministic when generated from non-RPKI data and that this prevents inclusion, but I find this argument incredibly lazy. Yes, the data isn’t deterministic when built from public looking glasses or native BGP feeds, and comparing two such resulting files to determine if the differences are “valid” isn’t really all that practical, but I don’t see why this is such a blocker. As long as any differences are small (under 0.1% of addresses, I’d bet), the ability to maliciously craft an ASMap such that the network biases towards your own addresses shouldn’t be all that practical. Further, if someone is trying to maliciously bias ASMap results, its rater trivial even with RPKI - generally RIRs don’t place very strict limits on ASN availability (there are still tons available!) so its rather easy to just request a hundred or two ASNs and split your address space across those. Worse, with an RPKI-only ASMap the impact of such activities is made greater as you can ensure that your prefixes are included as separate ISPs, giving you an edge over the long-tail ISPs that may not be.

    [1] this, of course, isn’t quite true, if we’re worried about eclipse attacks the long-tail of ISPs is likely connected to through T1s or other large carriers, and biasing towards the long-tail may end up giving large carriers an edge in their ability to be MiTM. This is, however, the price we pay for ISP diversity, which has much better non-MiTM sybil resistance which IMO is a much more likely threat model.

  175. sipa commented at 3:37 pm on November 2, 2025: member
    @TheBlueMatt FYI, the repository with asmap construction scripts is https://github.com/asmap/kartograf. My understanding (which may be incomplete) is that RPKI is used where available, but supplemented with other sources.
  176. fjahr commented at 3:51 pm on November 2, 2025: contributor

    I saw in the core-dev writeups that there appears to be some intent to build asmap files based on only RPKI data (if that’s wrong, please skip the rest of this comment). I fear that that is a particularly bad idea.

    You mean the asmap session probably and I think that’s a misunderstand of the notes. I don’t think they are released yet but maybe you can point me to the line you are referring to and we can amend that to be better understandable. I thought there might a btc transcripts PR already but I couldn’t find it so far. The notes are best effort and I reviewed them but only briefly so it wouldn’t surprise me if something slipped through that can be interpretted differently than it is intended.

    There is not intent to change what we have been doing for the coordinated runs in asmap-data, which is RPKI + RIR IRRs + BGP announcements from Routeviews (in order of priority). Someone could use Kartograf to build a RPKI-only map but we are working pretty hard to make sure we can use all the sources that make sense so get the best possible view of the network.

  177. fjahr commented at 3:55 pm on November 2, 2025: contributor

    I was misled by https://github.com/bitcointranscripts/bitcointranscripts/pull/593/files#diff-e25cced7d45bde5a991f8ac638162335638ea1ba39b21a5cb76d5e5e31b31345R9-R10

    Ok, I see, that could have been more clear as either “depends on rpki-client” (the only dependency outside of python) or “depends on RPKI, IRR and Routeviews downloads” or something. I will clarify in the PR.

  178. fjahr commented at 4:03 pm on November 2, 2025: contributor
    This should clarify it: https://github.com/bitcointranscripts/bitcointranscripts/pull/594 Sorry for triggering a false alarm with the notes :D
  179. DrahtBot added the label Needs rebase on Nov 4, 2025
  180. fjahr force-pushed on Nov 5, 2025
  181. DrahtBot removed the label Needs rebase on Nov 6, 2025
  182. in cmake/script/GenerateHeaderFromRaw.cmake:21 in e00deb2cfa outdated
    17@@ -18,6 +18,5 @@ ${formatted_bytes}
    18 };
    19 
    20 inline constexpr std::span ${raw_source_basename}{detail_${raw_source_basename}_raw};
    21-}
    22-")
    23+}")
    


    hodlinator commented at 3:59 pm on November 11, 2025:
    Why change this? Various tools expect an empty line before EOF (although I admit it’s probably rarely relevant for these generated files).

    fjahr commented at 9:27 pm on November 15, 2025:
    I can’t remember encountering such a tool but primarily I was going by our style guide, which I thought had mentioned that we don’t do new lines before EOF but I guess it doesn’t (anymore?) since I couldn’t find it right now. At least we have InsertNewlineAtEOF: false in our .clang-format.

    hodlinator commented at 7:48 pm on November 17, 2025:

    Good point regarding .clang-format, though that value from 13f36c020f0329b5e975282b45292fdf2a495e31 comes from clang-format defaults rather than a conscious decision on our part. (It will not remove newlines at EOF if they exist though, https://releases.llvm.org/16.0.0/tools/clang/docs/ClangFormatStyleOptions.html#insertnewlineateof).

    GitHub displays a red warning symbol in diffs when one removes the empty newline at ends of files.

    git apply diff.patch fails for…

     0diff --git a/src/util/asmap.cpp b/src/util/asmap.cpp
     1index 267bc652d5..3ebce045d6 100644
     2--- a/src/util/asmap.cpp
     3+++ b/src/util/asmap.cpp
     4@@ -335,6 +335,7 @@ std::vector<std::byte> DecodeAsmap(fs::path path)
     5 
     6     // Read entire file into memory
     7     std::vector<std::byte> buffer(length);
     8+    foo
     9     file.read(buffer);
    10 
    11     if (!CheckAsmap(buffer)) {
    

    ….if there is no newline after the ending open brace.

    0₿ git apply diff.patch
    1error: corrupt patch at line 12
    

    Tried with git apply --inaccurate-eof diff.patch and git apply --whitespace=fix diff.patch but they produce the same error.

    Please consider it a nit though.


    hodlinator commented at 9:41 am on November 18, 2025:
    Related: #33896
  183. in src/test/fuzz/asmap_direct.cpp:15 in e00deb2cfa
    11@@ -12,6 +12,24 @@
    12 
    13 #include <cassert>
    14 
    15+std::vector<std::byte> BitsToBytes(std::span<const uint8_t> input) noexcept
    


    hodlinator commented at 4:01 pm on November 11, 2025:

    A variant of BitsToBytes from src/merkleblock.cpp already exists, but it seems okay to make more modern divisionless version (albeit dividing by 8 probably gets optimized as 3 right shifts). I understand not wanting to touch & reuse consensus code for now.

    nit: Clearer parameter name?

    0std::vector<std::byte> BitsToBytes(std::span<const uint8_t> bits) noexcept
    

    fjahr commented at 2:48 pm on November 16, 2025:
  184. in src/util/asmap.h:16 in e00deb2cfa
    12 #include <cstdint>
    13+#include <span>
    14 #include <vector>
    15 
    16-uint32_t Interpret(const std::vector<bool> &asmap, const std::vector<bool> &ip);
    17+uint32_t Interpret(const std::span<const std::byte>& asmap, const std::span<const std::byte>& ip);
    


    hodlinator commented at 4:05 pm on November 11, 2025:

    nit: We tend to drop const...& for spans, similar to how we don’t use const int&.

    0uint32_t Interpret(std::span<const std::byte> asmap, std::span<const std::byte> ip);
    

    As the style in:

    https://github.com/bitcoin/bitcoin/blob/e00deb2cfa60e846ea3de48bed35c7d3992db4bd/src/test/fuzz/asmap_direct.cpp#L15

    (One can always add const for the function definition in the .cpp file, as it is the concern of the implementation).


    fjahr commented at 8:51 pm on November 16, 2025:
  185. in ci/test/00_setup_env_native_nowallet_libbitcoinkernel.sh:16 in e00deb2cfa
    12@@ -13,4 +13,4 @@ export PACKAGES="python3-zmq python3-pip clang-17 llvm-17 libc++abi-17-dev libc+
    13 export PIP_PACKAGES="--break-system-packages pycapnp"
    14 export DEP_OPTS="NO_WALLET=1 CC=clang-17 CXX='clang++-17 -stdlib=libc++'"
    15 export GOAL="install"
    16-export BITCOIN_CONFIG="-DREDUCE_EXPORTS=ON -DBUILD_UTIL_CHAINSTATE=ON -DBUILD_KERNEL_LIB=ON -DBUILD_KERNEL_TEST=ON -DBUILD_SHARED_LIBS=ON"
    17+export BITCOIN_CONFIG="-DREDUCE_EXPORTS=ON -DBUILD_UTIL_CHAINSTATE=ON -DBUILD_KERNEL_LIB=ON -DBUILD_KERNEL_TEST=ON -DBUILD_SHARED_LIBS=ON -DWITH_EMBEDDED_ASMAP=NO"
    


    hodlinator commented at 4:10 pm on November 11, 2025:

    nit: NO might work, but isn’t OFF a clearer inversion of ON?

    0export BITCOIN_CONFIG="-DREDUCE_EXPORTS=ON -DBUILD_UTIL_CHAINSTATE=ON -DBUILD_KERNEL_LIB=ON -DBUILD_KERNEL_TEST=ON -DBUILD_SHARED_LIBS=ON -DWITH_EMBEDDED_ASMAP=OFF"
    

    fjahr commented at 10:50 pm on November 16, 2025:

    One might say that “NO” is literally the inversion of “ON” 😜

    Changed in the latest push 🙂


    hodlinator commented at 12:37 pm on November 21, 2025:

    PR description nit:

    0- -DWITH_EMBEDDED_ASMAP=NO
    1+ -DWITH_EMBEDDED_ASMAP=OFF
    

    hodlinator commented at 8:14 am on November 25, 2025:

    I was pointing to this sentence in the PR description at the top, where it’s still using “NO” rather than “OFF”, as was changed in the code:

    By default the data at that location is embedded into the binary but there is also a build option to prevent this (-DWITH_EMBEDDED_ASMAP=NO).

    (You can respond by jumping here: #28792 (review)).


    fjahr commented at 1:17 pm on November 25, 2025:
    Ah, I get it now, I guess it was late yesterday. Seems like @fanquake already fixed it.
  186. in src/netgroup.cpp:95 in e00deb2cfa outdated
    94         for (int8_t byte_i = 0; byte_i < 12; ++byte_i) {
    95-            for (uint8_t bit_i = 0; bit_i < 8; ++bit_i) {
    96-                ip_bits[byte_i * 8 + bit_i] = (IPV4_IN_IPV6_PREFIX[byte_i] >> (7 - bit_i)) & 1;
    97-            }
    98+            ip_bits[byte_i] = std::byte(IPV4_IN_IPV6_PREFIX[byte_i]);
    99         }
    


    hodlinator commented at 7:00 pm on November 11, 2025:

    nit: Could replace fors with copy_n():

     0--- a/src/netgroup.cpp
     1+++ b/src/netgroup.cpp
     2@@ -89,9 +89,8 @@ uint32_t NetGroupManager::GetMappedAS(const CNetAddr& address) const
     3     std::vector<std::byte> ip_bits(16);
     4     if (address.HasLinkedIPv4()) {
     5         // For lookup, treat as if it was just an IPv4 address (IPV4_IN_IPV6_PREFIX + IPv4 bits)
     6-        for (int8_t byte_i = 0; byte_i < 12; ++byte_i) {
     7-            ip_bits[byte_i] = std::byte(IPV4_IN_IPV6_PREFIX[byte_i]);
     8-        }
     9+        std::copy_n(std::as_bytes(std::span{IPV4_IN_IPV6_PREFIX}).begin(),
    10+                    IPV4_IN_IPV6_PREFIX.size(), ip_bits.begin());
    11         uint32_t ipv4 = address.GetLinkedIPv4();
    12         for (int i = 0; i < 4; ++i) {
    13             ip_bits[12 + i] = std::byte((ipv4 >> (24 - i * 8)) & 0xFF);
    14@@ -100,9 +99,9 @@ uint32_t NetGroupManager::GetMappedAS(const CNetAddr& address) const
    15         // Use all 128 bits of the IPv6 address otherwise
    16         assert(address.IsIPv6());
    17         auto addr_bytes = address.GetAddrBytes();
    18-        for (int8_t byte_i = 0; byte_i < 16; ++byte_i) {
    19-            ip_bits[byte_i] = std::byte(addr_bytes[byte_i]);
    20-        }
    21+        assert(addr_bytes.size() == ip_bits.size());
    22+        std::copy_n(std::as_bytes(std::span{addr_bytes}).begin(),
    23+                    addr_bytes.size(), ip_bits.begin());
    24     }
    25     uint32_t mapped_as = Interpret(m_asmap, ip_bits);
    26     return mapped_as;
    

    fjahr commented at 2:39 pm on November 16, 2025:
    Addressed in #33878
  187. in src/test/fuzz/asmap.cpp:33 in e00deb2cfa
    33     for (int i = 0; i < asmap_size; ++i) {
    34-        for (int j = 0; j < 8; ++j) {
    35-            asmap.push_back((buffer[1 + i] >> j) & 1);
    36-        }
    37+        asmap.push_back(std::byte(buffer[1 + i]));
    38     }
    


    hodlinator commented at 7:22 pm on November 11, 2025:

    nit: Could replace for with:

    0    std::ranges::copy(std::as_bytes(buffer.subspan(1, asmap_size)), std::back_inserter(asmap));
    

    fjahr commented at 2:45 pm on November 16, 2025:
  188. in src/test/fuzz/asmap_direct.cpp:49 in e00deb2cfa
    45@@ -28,22 +46,24 @@ FUZZ_TARGET(asmap_direct)
    46     }
    47     if (!sep_pos_opt) return; // Needs exactly 1 separator
    48     const size_t sep_pos{sep_pos_opt.value()};
    49-    if (buffer.size() - sep_pos - 1 > 128) return; // At most 128 bits in IP address
    50+    const size_t ip_len = buffer.size() - sep_pos - 1;
    


    hodlinator commented at 8:00 pm on November 11, 2025:

    nit: Align with style of prior line (sep_pos)?

    0    const size_t ip_len{buffer.size() - sep_pos - 1};
    

    fjahr commented at 2:59 pm on November 16, 2025:
  189. in src/test/fuzz/asmap_direct.cpp:53 in e00deb2cfa
    51+    if (ip_len > 128) return; // At most 128 bits in IP address
    52 
    53     // Checks on asmap
    54-    std::vector<bool> asmap(buffer.begin(), buffer.begin() + sep_pos);
    55-    if (SanityCheckASMap(asmap, buffer.size() - 1 - sep_pos)) {
    56+    auto asmap = BitsToBytes(std::span{buffer}.first(sep_pos));
    


    hodlinator commented at 8:58 pm on November 11, 2025:

    nit: buffer is already a span:

     0--- a/src/test/fuzz/asmap_direct.cpp
     1+++ b/src/test/fuzz/asmap_direct.cpp
     2@@ -50,11 +50,11 @@ FUZZ_TARGET(asmap_direct)
     3     if (ip_len > 128) return; // At most 128 bits in IP address
     4 
     5     // Checks on asmap
     6-    auto asmap = BitsToBytes(std::span{buffer}.first(sep_pos));
     7+    auto asmap = BitsToBytes(buffer.first(sep_pos));
     8     if (SanityCheckASMap(asmap, ip_len)) {
     9         // Verify that for valid asmaps, no prefix (except up to 7 zero padding bits) is valid.
    10         for (size_t prefix_len = sep_pos - 1; prefix_len > 0; --prefix_len) {
    11-            auto prefix = BitsToBytes(std::span{buffer}.first(prefix_len));
    12+            auto prefix = BitsToBytes(buffer.first(prefix_len));
    13             // We have to skip the prefixes of the same length as the original
    14             // asmap, since they will contain some zero padding bits in the last
    15             // byte.
    16@@ -63,7 +63,7 @@ FUZZ_TARGET(asmap_direct)
    17         }
    18 
    19         // No address input should trigger assertions in interpreter
    20-        auto addr = BitsToBytes(std::span{buffer}.subspan(sep_pos + 1));
    21+        auto addr = BitsToBytes(buffer.subspan(sep_pos + 1));
    22         (void)Interpret(asmap, addr);
    23     }
    24 }
    

    fjahr commented at 3:01 pm on November 16, 2025:
  190. in src/util/asmap.cpp:148 in e00deb2cfa
    146+bool SanityCheckASMap(const std::span<const std::byte>& asmap, int bits)
    147 {
    148-    const std::vector<bool>::const_iterator begin = asmap.begin(), endpos = asmap.end();
    149-    std::vector<bool>::const_iterator pos = begin;
    150+    size_t pos{0};
    151+    size_t endpos{asmap.size() * 8};
    


    hodlinator commented at 10:07 pm on November 11, 2025:

    nit: So much happening in this function, please restore const:

    0    const size_t endpos{asmap.size() * 8};
    

    fjahr commented at 3:07 pm on November 16, 2025:
  191. in src/util/asmap.cpp:38 in e00deb2cfa
    34+inline bool GetBitBE(std::span<const std::byte> bytes, uint32_t bitpos) noexcept
    35+{
    36+    return (std::to_integer<uint8_t>(bytes[(bytes.size() * 8 - bitpos) / 8]) >> (7 - ((bytes.size() * 8 - bitpos) % 8))) & 1;
    37+}
    38+
    39+uint32_t DecodeBits(size_t& bitpos, const std::span<const std::byte>& data, uint8_t minval, const std::vector<uint8_t>& bit_sizes)
    


    hodlinator commented at 10:15 am on November 12, 2025:

    nit: Could avoid some static initialization+deinitialization CPU cycles:

    0uint32_t DecodeBits(size_t& bitpos, const std::span<const std::byte>& data, uint8_t minval, const std::span<const uint8_t> bit_sizes)
    
     0--- a/src/util/asmap.cpp
     1+++ b/src/util/asmap.cpp
     2@@ -35,7 +35,7 @@ inline bool GetBitBE(std::span<const std::byte> bytes, uint32_t bitpos) noexcept
     3     return (std::to_integer<uint8_t>(bytes[(bytes.size() * 8 - bitpos) / 8]) >> (7 - ((bytes.size() * 8 - bitpos) % 8))) & 1;
     4 }
     5 
     6-uint32_t DecodeBits(size_t& bitpos, const std::span<const std::byte>& data, uint8_t minval, const std::vector<uint8_t>& bit_sizes)
     7+uint32_t DecodeBits(size_t& bitpos, const std::span<const std::byte> data, uint8_t minval, const std::span<const uint8_t> bit_sizes)
     8 {
     9     uint32_t val = minval;
    10     bool bit;
    11@@ -70,29 +70,29 @@ enum class Instruction : uint32_t
    12     DEFAULT = 3,
    13 };
    14 
    15-const std::vector<uint8_t> TYPE_BIT_SIZES{0, 0, 1};
    16 Instruction DecodeType(size_t& bitpos, const std::span<const std::byte>& data)
    17 {
    18+    constexpr uint8_t TYPE_BIT_SIZES[]{0, 0, 1};
    19     return Instruction(DecodeBits(bitpos, data, 0, TYPE_BIT_SIZES));
    20 }
    21 
    22-const std::vector<uint8_t> ASN_BIT_SIZES{15, 16, 17, 18, 19, 20, 21, 22, 23, 24};
    23 uint32_t DecodeASN(size_t& bitpos, const std::span<const std::byte>& data)
    24 {
    25+    constexpr uint8_t ASN_BIT_SIZES[]{15, 16, 17, 18, 19, 20, 21, 22, 23, 24};
    26     return DecodeBits(bitpos, data, 1, ASN_BIT_SIZES);
    27 }
    28 
    29 
    30-const std::vector<uint8_t> MATCH_BIT_SIZES{1, 2, 3, 4, 5, 6, 7, 8};
    31 uint32_t DecodeMatch(size_t& bitpos, const std::span<const std::byte>& data)
    32 {
    33+    constexpr uint8_t MATCH_BIT_SIZES[]{1, 2, 3, 4, 5, 6, 7, 8};
    34     return DecodeBits(bitpos, data, 2, MATCH_BIT_SIZES);
    35 }
    36 
    37 
    38-const std::vector<uint8_t> JUMP_BIT_SIZES{5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30};
    39 uint32_t DecodeJump(size_t& bitpos, const std::span<const std::byte>& data)
    40 {
    41+    constexpr uint8_t JUMP_BIT_SIZES[]{5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30};
    42     return DecodeBits(bitpos, data, 17, JUMP_BIT_SIZES);
    43 }
    44 
    

    (Also removes data from file-global scope).


    fjahr commented at 4:11 pm on November 16, 2025:

    Addressed in #33878

    (Also removes data from file-global scope).

    Not a problem because of the anonymous namespace I think.

  192. in src/util/asmap.cpp:130 in e00deb2cfa
    145             matchlen = std::bit_width(match) - 1;
    146             if (bits < matchlen) break; // Not enough input bits
    147             for (uint32_t bit = 0; bit < matchlen; bit++) {
    148-                if ((ip[ip.size() - bits]) != ((match >> (matchlen - 1 - bit)) & 1)) {
    149+                if (GetBitBE(ip, bits) != ((match >> (matchlen - 1 - bit)) & 1)) {
    150                     return default_asn;
    


    hodlinator commented at 8:51 am on November 13, 2025:

    Side note: the documentation below indicates that a failure should be returned if no default ASN has been set, but the current code (also on master) returns zero: https://github.com/bitcoin/bitcoin/blob/e00deb2cfa60e846ea3de48bed35c7d3992db4bd/contrib/asmap/asmap.py#L158-L165

    I think it would be good to update the documentation or the code to match, but understand if others don’t want to do it in the context of this PR.


    fjahr commented at 3:06 pm on November 16, 2025:
    I have been working on improved documentation. Please check the new commit in #33878 to see if that is fixing this completely.
  193. in src/test/fuzz/asmap.cpp:20 in e00deb2cfa outdated
    30-    true, true, false, true, true, true, true, true, true, true, false, false, false, false, false, false, false, false, // Match 0x00
    31-    true, true, false, true, true, true, true, true, true, true, false, false, false, false, false, false, false, false, // Match 0x00
    32-    true, true, false, true, true, true, true, true, true, true, true, true, true, true, true, true, true, true,         // Match 0xFF
    33-    true, true, false, true, true, true, true, true, true, true, true, true, true, true, true, true, true, true          // Match 0xFF
    34-};
    35+static const auto IPV4_PREFIX_ASMAP = "fb03ec0fb03fc0fe00fb03ec0fb03fc0fe00fb03ec0fb0fffffeff"_hex_v;
    


    hodlinator commented at 9:28 am on November 13, 2025:

    remark in 35e6c5bdd467e8342ed606f338c3b4a363c2ef35 - “refactor: Operate on bytes instead of bits in Asmap code”:

    Endian-inversion on a bit-level

    The old vector<bool> data on master converts to "dfc0...."_hex (df being the bit-endian-inversion of fb, and c0 being the inversion of 03). But the old code would endian-invert each byte while loading from disk or from a byte-buffer:

    https://github.com/bitcoin/bitcoin/blob/8f7e02a08626a3141b64df09dfeb38d7d9e2974c/src/util/asmap.cpp#L212-L215

    https://github.com/bitcoin/bitcoin/blob/8f7e02a08626a3141b64df09dfeb38d7d9e2974c/src/test/netbase_tests.cpp#L633-L635

    https://github.com/bitcoin/bitcoin/blob/8f7e02a08626a3141b64df09dfeb38d7d9e2974c/src/test/addrman_tests.cpp#L56-L58

    And since we want to remove those loops in the new version and just consume the bytes, the “asmap-runtime” now needs to endian-invert each byte internally on the fly when interpreting the individual bits (GetBitLE).

    So it makes sense to invert the data in test/fuzz/asmap.cpp.

    Notable: fb03ec0fb03fc0fe00fb03ec0fb03fc0fe00fb03ec0fb0fffffeff exactly matches the first 27 bytes of src/test/data/asmap.raw.

  194. in src/test/fuzz/asmap_direct.cpp:33 in e00deb2cfa outdated
    28+    if (next_byte_bits) ret.push_back(std::byte(next_byte));
    29+
    30+    return ret;
    31+}
    32+
    33 FUZZ_TARGET(asmap_direct)
    


    hodlinator commented at 1:50 pm on November 13, 2025:

    How about switching to using raw bytes instead and dropping BitsToBytes?

     0FUZZ_TARGET(asmap_direct)
     1{
     2    FuzzedDataProvider provider{buffer.data(), buffer.size()};
     3    const int ip_len{provider.ConsumeIntegralInRange<int>(0, 128)}; // At most 128 bits in IP address
     4    const auto asmap_size{static_cast<int64_t>(provider.remaining_bytes()) - ((ip_len + 7) / 8)};
     5    if (asmap_size < 0) return;
     6
     7    // Checks on asmap
     8    const auto asmap{provider.ConsumeBytes<std::byte>(asmap_size)};
     9    if (SanityCheckASMap(asmap, ip_len)) {
    10        // Verify that for valid asmaps, no prefix (except up to 7 zero padding bits) is valid.
    11        auto asmap_prefix{asmap};
    12        // We have to skip the prefixes of the same length as the original
    13        // asmap, since they will contain some zero padding bits in the last
    14        // byte.
    15        asmap_prefix.pop_back();
    16        while (!asmap_prefix.empty()) {
    17            assert(!SanityCheckASMap(asmap_prefix, ip_len));
    18            for (int bit = 0; bit < 8; ++bit) {
    19                asmap_prefix.back() &= ~std::byte{uint8_t(1 << bit)};
    20                assert(!SanityCheckASMap(asmap_prefix, ip_len));
    21            }
    22            asmap_prefix.pop_back();
    23        }
    24        // No address input should trigger assertions in interpreter
    25        const auto addr{provider.ConsumeRemainingBytes<std::byte>()};
    26        (void)Interpret(asmap, addr);
    27    }
    28}
    

    fjahr commented at 2:58 pm on November 16, 2025:
    Probably not a bad idea in principle but the nature of the test changes with your suggested code, so I would like to manage the PR scope a bit and keep this for a follow-up.
  195. in src/netgroup.cpp:17 in e00deb2cfa
    15+uint256 NetGroupManager::GetAsmapVersion() const
    16 {
    17     if (!m_asmap.size()) return {};
    18-
    19-    return (HashWriter{} << m_asmap).GetHash();
    20+    return AsmapVersion(m_asmap);
    


    hodlinator commented at 8:02 pm on November 13, 2025:

    In 3da238bf0585fb2ed93e7976b8b2e27fa84536e3 “refactor: Unify asmap version calculation and naming”: If we keep naming the called function AsmapVersion() rather than AsmapChecksum(), it should probably be the one checking for zero size and returning {}/zeroes, rather than the if-statement above.

     0uint256 NetGroupManager::GetAsmapVersion() const
     1{
     2    return AsmapVersion(m_asmap);
     3}
     4
     5uint256 AsmapVersion(const std::span<const std::byte>& data)
     6{
     7    if (data.empty()) return {};
     8
     9    HashWriter asmap_hasher;
    10    asmap_hasher << data;
    11    return asmap_hasher.GetHash();
    12}
    

    fjahr commented at 3:17 pm on November 16, 2025:
    Done in #33878
  196. in src/util/asmap.cpp:217 in e00deb2cfa
    213+    if (!SanityCheckASMap(data, 128)) {
    214+        LogInfo("Sanity check of asmap data failed\n");
    215+        return {};
    216+    }
    217+    return data;
    218+}
    


    hodlinator commented at 8:17 pm on November 13, 2025:

    Seems more useful to return a bool instead of the input / empty. Could make it more like SanityCheckASMap, just with default bit-length.

    0bool CheckAsmap(const std::span<const std::byte>& data)
    1{
    2    if (!SanityCheckASMap(data, 128)) {
    3        LogWarning("Sanity check of asmap data failed\n");
    4        return false;
    5    }
    6    return true;
    7}
    

    Also, DecodeAsmap() could be changed to call CheckAsmap() since it’s also using 128.


    fjahr commented at 4:38 pm on November 16, 2025:
  197. in src/netgroup.h:92 in e00deb2cfa
    89+     * asmap data.
    90+     */
    91+    const std::span<const std::byte>  m_asmap;
    92+    std::vector<std::byte> m_loaded_asmap;
    93+
    94+    explicit NetGroupManager(std::span<const std::byte> asmap, std::vector<std::byte> loaded_asmap)
    


    hodlinator commented at 8:34 pm on November 13, 2025:

    The vector would be better as an r-value here and in WithLoadedAsmap() to make clear that we expect to take ownership (https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#f18-for-will-move-from-parameters-pass-by-x-and-stdmove-the-parameter).

    0    explicit NetGroupManager(std::span<const std::byte> asmap, std::vector<std::byte>&& loaded_asmap)
    

    Also, m_asmap(asmap.empty() ? std::span<const std::byte>() : asmap) seems overly verbose.

     0diff --git a/src/init.cpp b/src/init.cpp
     1index afc317ea74..21c0e140a5 100644
     2--- a/src/init.cpp
     3+++ b/src/init.cpp
     4@@ -1598,8 +1598,8 @@ bool AppInitMain(NodeContext& node, interfaces::BlockAndHeaderTipInfo* tip_info)
     5                         InitError(strprintf(_("Could not read embedded asmap data")));
     6                         return false;
     7                     }
     8-                    node.netgroupman = std::make_unique<NetGroupManager>(NetGroupManager::WithEmbeddedAsmap(asmap));
     9                     asmap_version = AsmapVersion(asmap);
    10+                    node.netgroupman = std::make_unique<NetGroupManager>(NetGroupManager::WithEmbeddedAsmap(asmap));
    11                     LogInfo("Opened asmap data (%zu bytes) from embedded byte array\n", asmap.size());
    12                 #else
    13                     // If there is no embedded data, fail and report the default
    14diff --git a/src/netgroup.h b/src/netgroup.h
    15index 8040656bb9..588fb4ca75 100644
    16--- a/src/netgroup.h
    17+++ b/src/netgroup.h
    18@@ -20,9 +20,8 @@ public:
    19         return NetGroupManager(asmap, {});
    20     }
    21 
    22-    static NetGroupManager WithLoadedAsmap(std::vector<std::byte> loaded_asmap) {
    23-        std::span<const std::byte> asmap_span(loaded_asmap);
    24-        return NetGroupManager(asmap_span, std::move(loaded_asmap));
    25+    static NetGroupManager WithLoadedAsmap(std::vector<std::byte>&& asmap) {
    26+        return NetGroupManager(std::span{asmap}, std::move(asmap));
    27     }
    28 
    29     static NetGroupManager NoAsmap() {
    30@@ -89,9 +88,9 @@ private:
    31     const std::span<const std::byte>  m_asmap;
    32     std::vector<std::byte> m_loaded_asmap;
    33 
    34-    explicit NetGroupManager(std::span<const std::byte> asmap, std::vector<std::byte> loaded_asmap)
    35-        : m_asmap(asmap.empty() ? std::span<const std::byte>() : asmap),
    36-          m_loaded_asmap(std::move(loaded_asmap))
    37+    explicit NetGroupManager(std::span<const std::byte> embedded_asmap, std::vector<std::byte>&& loaded_asmap)
    38+        : m_asmap{embedded_asmap},
    39+          m_loaded_asmap{std::move(loaded_asmap)}
    40     {}
    41 };
    42 
    43diff --git a/src/test/fuzz/util/net.h b/src/test/fuzz/util/net.h
    44index 537df1d133..acb9170913 100644
    45--- a/src/test/fuzz/util/net.h
    46+++ b/src/test/fuzz/util/net.h
    47@@ -239,7 +239,7 @@ public:
    48     if (!SanityCheckASMap(std::span<std::byte>(asmap), 128)) {
    49         return NetGroupManager::NoAsmap();
    50     }
    51-    return NetGroupManager::WithLoadedAsmap(asmap);
    52+    return NetGroupManager::WithLoadedAsmap(std::move(asmap));
    53 }
    54 
    55 inline CSubNet ConsumeSubNet(FuzzedDataProvider& fuzzed_data_provider) noexcept
    

    fjahr commented at 4:00 pm on November 16, 2025:
  198. in src/util/asmap.cpp:108 in e00deb2cfa
    117     uint32_t default_asn = 0;
    118     uint32_t jump, match, matchlen;
    119     Instruction opcode;
    120-    while (pos != endpos) {
    121-        opcode = DecodeType(pos, endpos);
    122+    while (pos < asmap.size() * 8) {
    


    hodlinator commented at 8:02 pm on November 14, 2025:

    nit: Would be nice to retain endpos here in Interpret() as done in SanityCheckASMap() to keep them more similar.

     0--- a/src/util/asmap.cpp
     1+++ b/src/util/asmap.cpp
     2@@ -101,11 +101,12 @@ uint32_t DecodeJump(size_t& bitpos, const std::span<const std::byte>& data)
     3 uint32_t Interpret(const std::span<const std::byte>& asmap, const std::span<const std::byte>& ip)
     4 {
     5     size_t pos{0};
     6+    const size_t endpos{asmap.size() * 8};
     7     uint8_t bits = ip.size() * 8;
     8     uint32_t default_asn = 0;
     9     uint32_t jump, match, matchlen;
    10     Instruction opcode;
    11-    while (pos < asmap.size() * 8) {
    12+    while (pos != endpos) {
    13         opcode = DecodeType(pos, asmap);
    14         if (opcode == Instruction::RETURN) {
    15             default_asn = DecodeASN(pos, asmap);
    16@@ -115,7 +116,7 @@ uint32_t Interpret(const std::span<const std::byte>& asmap, const std::span<cons
    17             jump = DecodeJump(pos, asmap);
    18             if (jump == INVALID) break; // Jump offset straddles EOF
    19             if (bits == 0) break; // No input bits left
    20-            if (int64_t{jump} >= static_cast<int64_t>(asmap.size() * 8 - pos)) break; // Jumping past EOF
    21+            if (int64_t{jump} >= static_cast<int64_t>(endpos - pos)) break; // Jumping past EOF
    22             if (GetBitBE(ip, bits)) {
    23                 pos += jump;
    24             }
    

    fjahr commented at 3:04 pm on November 16, 2025:
  199. in test/functional/feature_asmap.py:21 in e00deb2cfa outdated
    30@@ -31,7 +31,7 @@
    31 
    32 DEFAULT_ASMAP_FILENAME = 'ip_asn.map' # defined in src/init.cpp
    33 ASMAP = 'src/test/data/asmap.raw' # path to unit test skeleton asmap
    34-VERSION = 'fec61fa21a9f46f3b17bdcd660d7f4cd90b966aad3aec593c99b35f0aca15853'
    35+VERSION = 'bafc9da308f45179443bd1d22325400ac9104f741522d003e3fac86700f68895'
    


    hodlinator commented at 9:44 am on November 15, 2025:

    remark in 0b13a009d259adff3468ad5111295a80584ee51b “refactor: Use span instead of vector for data in util/asmap”: When will I get used to vectors being serialized with size and spans being serialized without size, leading to different hashes? :)

    Would be nice to point out in the commit message: “The version hash changes due to span’s being serialized without their size-prefix (unlike vectors).”


    fjahr commented at 10:54 pm on November 16, 2025:
    Missed the bit about the commit message initially, now also addressed in #33878
  200. in src/init.cpp:1560 in e00deb2cfa
    1555@@ -1544,33 +1556,65 @@ bool AppInitMain(NodeContext& node, interfaces::BlockAndHeaderTipInfo* tip_info)
    1556     ApplyArgsManOptions(args, peerman_opts);
    1557 
    1558     {
    1559-
    1560-        // Read asmap file if configured
    1561-        std::vector<bool> asmap;
    1562+        // Read asmap file or embedded data if configured and initialize
    1563+        // Netgroupman with or without it
    


    hodlinator commented at 9:56 am on November 15, 2025:
    meganit in 0b13a009d259adff3468ad5111295a80584ee51b “refactor: Use span instead of vector for data in util/asmap”: This comment should probably change in one of the later commits, like f5c8e41733fe91b6f7c66056a0a40c9b8ab86257 “init, net: Implement usage of binary-embedded asmap data”.

    fjahr commented at 3:20 pm on November 16, 2025:
    Removed the embedding mention in #33878
  201. in src/init.cpp:1581 in e00deb2cfa outdated
    1591+                    // on if it was passed or the default location
    1592+                    if (asmap_file_set) {
    1593+                        InitError(strprintf(_("Could not parse asmap file %s"), fs::quoted(fs::PathToString(asmap_path))));
    1594+                    } else {
    1595+                        InitError(strprintf(_("Could not parse asmap file in default location %s"), fs::quoted(fs::PathToString(asmap_path))));
    1596+                    }
    


    hodlinator commented at 12:29 pm on November 15, 2025:
    Why does it matter whether it was from the default location or not?

    fjahr commented at 9:28 pm on November 16, 2025:
    It’s just clearer where the error is come from. A user may have forgotten that they had a file in the default location and this could save them a bit of error chasing. But this is pretty much irrelevant now since it looks like the default location will be removed shortly, see #33770

    hodlinator commented at 8:05 pm on November 17, 2025:
    Ah, wasn’t on my radar, cheers! Does simplify things a bit.
  202. in src/init.cpp:1562 in e00deb2cfa
    1560-        // Read asmap file if configured
    1561-        std::vector<bool> asmap;
    1562+        // Read asmap file or embedded data if configured and initialize
    1563+        // Netgroupman with or without it
    1564+        assert(!node.netgroupman);
    1565+        uint256 asmap_version;
    


    hodlinator commented at 12:31 pm on November 15, 2025:
    nit: Could move asmap_version into the if-block since it’s not used outside it.

    fjahr commented at 3:21 pm on November 16, 2025:
    Done in #33878

    hodlinator commented at 8:07 pm on November 17, 2025:
    Thanks! (I believe that’s fixed in this PR - 093980af8e4e594716b9219931fd441266c1fcd4 “init, net: Implement usage of binary-embedded asmap data”)
  203. in doc/asmap-data.md:7 in e00deb2cfa
    0@@ -0,0 +1,58 @@
    1+# Embedded ASMap data
    2+
    3+## Background
    4+
    5+The ASMap feature (available via `-asmap`) makes it possible to use a peer's AS Number (ASN), an ISP/hoster identifier,
    6+in netgroup bucketing in order to ensure a higher diversity in the peer
    7+set. By default the buckets are formed based on IP prefixes but this does not
    


    hodlinator commented at 12:36 pm on November 15, 2025:

    nit: Make it clearer that we are not talking about ASMap here:

    0set. When not using this, the default behavior is to have the buckets formed based on IP prefixes but this does not
    

    fjahr commented at 10:50 pm on November 16, 2025:
    Done
  204. in src/util/asmap.h:18 in e00deb2cfa
    15 
    16-uint32_t Interpret(const std::vector<bool> &asmap, const std::vector<bool> &ip);
    17+uint32_t Interpret(const std::span<const std::byte>& asmap, const std::span<const std::byte>& ip);
    18 
    19-bool SanityCheckASMap(const std::vector<bool>& asmap, int bits);
    20+bool SanityCheckASMap(const std::span<const std::byte>& asmap, int bits);
    


    hodlinator commented at 12:40 pm on November 15, 2025:
    meganit: We use Asmap casing in the other functions, would be nice to make this SanityCheckAsmap. Maybe as an added commit to #33026.

    fjahr commented at 9:40 pm on November 16, 2025:
  205. in src/netgroup.h:95 in e00deb2cfa
    92+    std::vector<std::byte> m_loaded_asmap;
    93+
    94+    explicit NetGroupManager(std::span<const std::byte> asmap, std::vector<std::byte> loaded_asmap)
    95+        : m_asmap(asmap.empty() ? std::span<const std::byte>() : asmap),
    96+          m_loaded_asmap(std::move(loaded_asmap))
    97+    {}
    


    hodlinator commented at 1:21 pm on November 15, 2025:

    nit: Could also remove stray space and add sanity check in ctor:

    0    const std::span<const std::byte> m_asmap;
    1    std::vector<std::byte> m_loaded_asmap;
    2
    3    explicit NetGroupManager(std::span<const std::byte> asmap, std::vector<std::byte> loaded_asmap)
    4        : m_asmap(asmap.empty() ? std::span<const std::byte>() : asmap),
    5          m_loaded_asmap(std::move(loaded_asmap))
    6    {
    7        assert(m_loaded_asmap.empty() || m_asmap.data() == m_loaded_asmap.data());
    8    }
    

    fjahr commented at 4:04 pm on November 16, 2025:
  206. in src/netgroup.h:90 in e00deb2cfa
    87+     * thread-safe. m_asmap can either point to m_loaded_asmap which holds
    88+     * data loaded from an external file at runtime or it can point to embedded
    89+     * asmap data.
    90+     */
    91+    const std::span<const std::byte>  m_asmap;
    92+    std::vector<std::byte> m_loaded_asmap;
    


    hodlinator commented at 1:31 pm on November 15, 2025:

    Would be good to declare these public:

    0    NetGroupManager(const NetGroupManager&) = delete;
    1    NetGroupManager(NetGroupManager&&) = default;
    2    NetGroupManager& operator=(const NetGroupManager&) = delete;
    3    NetGroupManager& operator=(NetGroupManager&&) = delete;
    

    Otherwise with the auto-generated l-value versions there naively copy the span, making it point to the other NetGroupManager instance’s vector data which might go out of scope.


    hodlinator commented at 1:41 pm on November 15, 2025:

    Noticed in 0b13a009d259adff3468ad5111295a80584ee51b “refactor: Use span instead of vector for data in util/asmap” that you remove the const from the vector.

    0    const std::vector<std::byte> m_loaded_asmap;
    

    For some reason adding it back leads to AddrManImpl::Check() failures when running feature_asmap.py - any idea why?


    fjahr commented at 3:35 pm on November 16, 2025:
    Addressed in #33878

    fjahr commented at 3:54 pm on November 16, 2025:
    I didn’t fully validate this yet but it looks to me like with the const the vector can not be moved anymore, it is copied instead. This means that when the ctor WithLoadedAsmap tries to move it, it is copied instead and then the span pointer is dangling.

    hodlinator commented at 3:58 pm on November 21, 2025:

    Close, fixed the feature_asmap.py failure by implementing special logic in the move-ctor. But I recommend keeping it as a non-const member so that it can be moved from instead of copied using the default move-ctor.

     0--- a/src/netgroup.h
     1+++ b/src/netgroup.h
     2@@ -17,7 +17,10 @@
     3 class NetGroupManager {
     4 public:
     5     NetGroupManager(const NetGroupManager&) = delete;
     6-    NetGroupManager(NetGroupManager&&) = default;
     7+    NetGroupManager(NetGroupManager&& other)
     8+        : m_loaded_asmap{other.m_loaded_asmap},
     9+          m_asmap{m_loaded_asmap.empty() ? other.m_asmap : m_loaded_asmap}
    10+    {}
    11     NetGroupManager& operator=(const NetGroupManager&) = delete;
    12     NetGroupManager& operator=(NetGroupManager&&) = delete;
    13 
    14@@ -90,12 +93,12 @@ private:
    15      * data loaded from an external file at runtime or it can point to embedded
    16      * asmap data.
    17      */
    18+    const std::vector<std::byte> m_loaded_asmap;
    19     const std::span<const std::byte> m_asmap;
    20-    std::vector<std::byte> m_loaded_asmap;
    21 
    22     explicit NetGroupManager(std::span<const std::byte> embedded_asmap, std::vector<std::byte>&& loaded_asmap)
    23-        : m_asmap{embedded_asmap},
    24-          m_loaded_asmap{std::move(loaded_asmap)}
    25+        : m_loaded_asmap{std::move(loaded_asmap)},
    26+          m_asmap{embedded_asmap}
    27     {
    28         assert(m_loaded_asmap.empty() || m_asmap.data() == m_loaded_asmap.data());
    29     }
    
  207. hodlinator commented at 2:13 pm on November 15, 2025: contributor

    Reviewed e00deb2cfa60e846ea3de48bed35c7d3992db4bd

    Seems like this PR has evolved a bit since before CMake. Pretty good shape now (using CMake byte header generation instead of Python script etc).

    Left a bunch of nits but also some more in-depth suggestions & questions.

    Skipped aspects addressed in other PR with similar commits (https://github.com/bitcoin/bitcoin/pull/33026#pullrequestreview-3441618331).

  208. fjahr renamed this:
    Embed default ASMap as binary dump header file
    Embedded ASMap [3/3]: Build binary dump header file
    on Nov 15, 2025
  209. fjahr commented at 9:03 pm on November 15, 2025: contributor

    I am taking this PR back into draft status temporarily because I have been slicing it into 3 parts thanks to the suggestions of @hodlinator . The first part is #33026 which features some minor prep work and is ready for review right now. The second part is #33878 which can be reviewed now but has not addressed the latest feedback from today. There is one new commit in there which I hope is interesting: it adds further documentation to asmap.cpp to make it more accessible. This PR here will become the third part but I will need some time to address the latest comments and only then push an updated version to not rebase on the predecessor PRs too often.

    I have also resurrected a tracking issue for asmap at #33879 since there have also been some further spin-off PRs that should hopefully make it into the same with the embedding.

  210. fjahr marked this as a draft on Nov 15, 2025
  211. fjahr commented at 9:43 pm on November 16, 2025: contributor
    All the feedback that applied to #33878 has now been addressed there. I am marking these comments as resolved and if there are further notes on these please move the conversation over there. Thanks!
  212. fjahr renamed this:
    Embedded ASMap [3/3]: Build binary dump header file
    build: Embedded ASMap [3/3]: Build binary dump header file
    on Nov 16, 2025
  213. DrahtBot added the label Build system on Nov 16, 2025
  214. fjahr force-pushed on Nov 16, 2025
  215. fjahr force-pushed on Nov 16, 2025
  216. DrahtBot added the label CI failed on Nov 16, 2025
  217. DrahtBot commented at 10:58 pm on November 16, 2025: contributor

    🚧 At least one of the CI tasks failed. Task lint: https://github.com/bitcoin/bitcoin/actions/runs/19413285446/job/55537763402 LLM reason (✨ experimental): Lint failure: ruff flagged Python f-strings without placeholders (F541) causing the CI to fail.

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  218. fjahr commented at 10:59 pm on November 16, 2025: contributor
    This is now in sync with #33026 and #33878 again so technically ready for review but of course the other PRs should be checked out first if you haven’t yet.
  219. fjahr marked this as ready for review on Nov 16, 2025
  220. DrahtBot removed the label CI failed on Nov 17, 2025
  221. in doc/asmap-data.md:52 in 0ae539dbff outdated
    47+without at least two additional reviewers confirming that the process described
    48+above was followed as expected and that the encoding step yielded the same
    49+file hash. New files are created on an ongoing basis but without any central planning
    50+or an explicit schedule.
    51+
    52+## Release process
    


    fanquake commented at 10:10 am on November 18, 2025:

    a new creation process

    What is the “new creation process”:

    • Who is involved (and has the permissions needed to merge into the required repos).
    • How many total people are required?
    • Can it happen at anytime, or does it require coordination days ahead of time?
    • What should happen if the process creates data that isn’t usable?

    fjahr commented at 11:32 pm on November 19, 2025:

    I think this is just poor wording on my part, I don’t mean a completely new process there, I mean the process that is described in this document can be run again at any time so that there is a new map created that can be used for a release. If that answers your question then I will amend that part of the document to be more clear.

    I will still give short answer to the questions in case above doesn’t resolve them. As mentioned previously here: https://github.com/bitcoin/bitcoin/pull/28792/files/0ae539dbff83229b1512f92067522424125229ed#r2167155199 I am hesitant to write the specifics of the process in very specific details for similar reasons as we don’t write anywhere explicitly that “a PR will get merged when it has 3 ACKs” or so.

    Who is involved (and has the permissions needed to merge into the required repos).

    Anyone can be involved similar to how PRs here can be opened and reviewed by anyone, see for example the latest run here: https://github.com/asmap/asmap-data/issues/34. There is nothing special about the people participating. We just tag those that asked for it for their convenience so they don’t miss a new run. I am the only one who has permission to merge new maps into that repo. I have signaled several times that (for example here and here) I am happy to move the repo into the bitcoin-core org or share maintainership of the repo with core maintainers, whatever you think works best. I didn’t push for this since review here was slow for a long time. Happy to tackle this soon if you want to get it done before the embedding merge.

    How many total people are required?

    I answered this here nothing really has changed. I would add that the tooling also let’s a single person build a map but the numbers I came up with are just designed to roughly match core’s processes of distributing trust among multiple participants.

    The 5 participant minimum is kind of arbitrarily picked, I admit, but I try to get to what I would say is comparable to 3 ACKs. More would be nice but I wasn’t confident we would get more people involved on a regular basis and majority of 5 is 3 so basically this is a minimum of 3 ACKs on the input data. The mismatches are not comparable to NACKs because there is no right or wrong there (assuming no malice), they are just different because of timing or connection problems. On the mechanical encoding review part in the PR similarly I am looking for 2-3 ACKs.

    Can it happen at anytime, or does it require coordination days ahead of time?

    For convenience we currently schedule the runs several days in advance so people can set things up ahead of time and don’t miss a spontaneous event. But if you can have 5 people online with the tooling setup you can launch it at any point basically.

    I want to highlight also here that there has been great progress in the performance of kartograf because of the work of @jurraca . In my home which has good but not crazy connection speed and on my modern laptop the run now takes about 12 minutes while previously it took much much longer. This means a spontaneous run is much more feasible than it used to be.

    There are also efforts to make it more convenient to do the runs as a cron job or similar so regularly scheduled runs become even less of a hassle.

    What should happen if the process creates data that isn’t usable?

    This should be an exception, while we had some misses in the past most runs where successful and we are still researching better ways of optimizing the tools to increase the chances. But they will never go away completely. That being said, the options are either to run again or not and use an older version. The former would probably be used if there is a motivated group that wants to create a map that is as fresh as possible while the latter is what we do with our monthly runs. If it fails we just wait for the next one in a month because there is no pressure.

  222. in CMakeLists.txt:130 in 0ae539dbff outdated
    126@@ -127,6 +127,8 @@ if(WITH_ZMQ)
    127   find_package(ZeroMQ 4.0.0 MODULE REQUIRED)
    128 endif()
    129 
    130+option(WITH_EMBEDDED_ASMAP "Embed default ASMap data." ON)
    


    fanquake commented at 10:26 am on November 18, 2025:
    I can’t remember where we landed on making this is a build option. Has anyone asked for the ability to turn this off at build time, as opposed to just at runtime (if it becomes the default). I can imagine that could be asked for, for someone wanting to do a from-source build without any arbitrary blobs. In that case, what is the process for that person to recreate, (after the fact), the ip_asn.dat file that will exist in this repo?

    fjahr commented at 10:57 pm on November 19, 2025:

    I can’t remember where we landed on making this is a build option. Has anyone asked for the ability to turn this off at build time, as opposed to just at runtime (if it becomes the default)

    We didn’t really land anywhere. My last response on it was here and then the discussion didn’t continue, I think. Similar to the default location discussion I am not going to die on this hill, I just want to give users the options they might want to get the most out of asmap or for asmap to not get in their way if they don’t want it (yet). Something I could do is pull this part out of here into a separate, optional follow-up PR on top of this one. Then the embedding can be reviewed and potentially merged without it while the other PR gives users that want this option a venue to speak up and if they do we can still add this before a release. Wdyt?

    I can imagine that could be asked for, for someone wanting to do a from-source build without any arbitrary blobs. In that case, what is the process for that person to recreate, (after the fact), the ip_asn.dat file that will exist in this repo?

    Based on my suggested process for creating the maps and selecting one for the release the asmap-data repo should hold the map that was selected. So the easiest way to get the file is to get the map directly from there. It’s been discussed already that this repo will move into the bitcoin-core org when embedding is merged. @sipa has also suggested to have an export functionality in core nodes that have been built with the map, which I have planned to do as a quick follow-up to this PR.

  223. fanquake referenced this in commit 509dc91db1 on Nov 19, 2025
  224. DrahtBot added the label Needs rebase on Nov 19, 2025
  225. fjahr force-pushed on Nov 19, 2025
  226. fjahr commented at 11:47 pm on November 19, 2025: contributor
    Rebased since #33026 was merged, of course still based on #33878.
  227. DrahtBot removed the label Needs rebase on Nov 20, 2025
  228. DrahtBot added the label Needs rebase on Nov 20, 2025
  229. fjahr force-pushed on Nov 20, 2025
  230. fjahr commented at 2:55 pm on November 20, 2025: contributor
    Only rebased again
  231. DrahtBot removed the label Needs rebase on Nov 20, 2025
  232. in src/CMakeLists.txt:289 in 1dcd2e8a98
    285@@ -286,6 +286,13 @@ target_link_libraries(bitcoin_node
    286     $<TARGET_NAME_IF_EXISTS:libevent::pthreads>
    287     $<TARGET_NAME_IF_EXISTS:USDT::headers>
    288 )
    289+if (WITH_EMBEDDED_ASMAP)
    


    hodlinator commented at 1:00 pm on November 21, 2025:

    nit: Convention seems to be

    0if(WITH_EMBEDDED_ASMAP)
    

    fjahr commented at 1:00 am on November 25, 2025:
    Done
  233. hodlinator commented at 2:24 pm on November 21, 2025: contributor

    Concept ACK 1dcd2e8a983406d12009e41bac5dbaeb12f330be

    I do understand the distaste at adding 1MB+ binary blobs into a reproducible build. From that perspective, it would be cleaner to include the ASMap blob as a separate file included in our Bitcoin Core releases.

    But having a separate file opens up the can of worms of where to install it on various operating systems and distributions, how to encourage package maintainers to include and ensure it ends up in the right locations, and making Bitcoin Core search an ever changing list of directories to check for ASMap file existence (or requiring always specifying -noasmap or -asmap=/path/on/distro/x/asmap.dat). That would hamper broad ASMap adoption, especially if we want to make it default-on in the future.

    Might be good to add a blurb about the consequences of not embedding to the PR description.

    Generation times unaffected

    OFF

    0$ ccache -C && rm -rf build && time cmake -B build --preset dev-mode -DENABLE_IPC=OFF -DBUILD_GUI=OFF -GNinja -DWITH_EMBEDDED_ASMAP=OFF
    1
    2real	0m18.115s
    3real	0m18.091s
    4real	0m18.027s
    

    ON

    0$ ccache -C && rm -rf build && time cmake -B build --preset dev-mode -DENABLE_IPC=OFF -DBUILD_GUI=OFF -GNinja -DWITH_EMBEDDED_ASMAP=ON
    1
    2real	0m18.050s
    3real	0m17.949s
    4real	0m17.995s
    

    Compile times fairly unaffected

    OFF

    0$ ccache -C && rm -rf build && cmake -B build --preset dev-mode -DENABLE_IPC=OFF -DBUILD_GUI=OFF -GNinja -DWITH_EMBEDDED_ASMAP=OFF && \
    1time cmake --build build -t bitcoind
    2
    3real	2m28.611s
    4real	2m27.686s
    5real	2m26.425s
    

    ON

    0$ ccache -C && rm -rf build && cmake -B build --preset dev-mode -DENABLE_IPC=OFF -DBUILD_GUI=OFF -GNinja -DWITH_EMBEDDED_ASMAP=ON && \
    1time cmake --build build -t bitcoind
    2
    3real	2m29.730s
    4real	2m29.345s
    5real	2m30.751s
    6real	2m28.495s
    7real	2m26.334s
    
  234. DrahtBot added the label Needs rebase on Nov 21, 2025
  235. refactor: Operate on bytes instead of bits in Asmap code
    Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com>
    87f72d9ed0
  236. refactor: Unify asmap version calculation and naming
    Calculate the asmap version only in one place: A dedicated function in util/asmap.
    
    The version was also referred to as asmap checksum in several places. To avoid confusion call it asmap version everywhere.
    b89ffbc56a
  237. refactor: Use span instead of vector for data in util/asmap
    This prevents holding the asmap data in memory twice.
    
    The version hash changes due to spans being serialized without their size-prefix (unlike vectors).
    dc0c4310ba
  238. doc: Add more extensive docs to asmap implementation
    Also makes minor improvement on the python implementation documentation.
    d150b24c88
  239. refactor: Simplify Interpret asmap function
    This aligns it more with SanityCheckAsmap and reduces variable scope.
    
    Also unify asmap casing in SanityCheckAsmap function name.
    
    Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com>
    825d677a86
  240. build: Add embedded asmap data
    The data embedded is from the latest ASMap file from the asmap-data
    repository: https://github.com/asmap/asmap-data/blob/main/1755187200_asmap.dat
    dff775f412
  241. build: Generate ip_asn.dat.h during build process
    This can be disabled with -DWITH_EMBEDDED_ASMAP=OFF.
    c35bf3d5b0
  242. fjahr force-pushed on Nov 25, 2025
  243. fjahr commented at 1:00 am on November 25, 2025: contributor

    Addressed comments and rebased after #33770 was merged including the latest changes in #33878. This PR now necessarily re-adds the possibility to set a bool option for -asmap to make it possible to use asmap without an external file. I have done this a bit differently now and I think it’s better.

    Re @hodlinator #28792 (review): I don’t think I understand your comment. I am pretty sure I have addressed your previous comment to use OFF but the file was renamed in the meantime which I pulled in here with my previous rebase. This is probably why I can’t answer to your comment inline. Can you clarify if this isn’t addressed yet somehow or if I am misunderstanding something? Thanks!

  244. init, net: Implement usage of binary-embedded asmap data a99f1e5036
  245. doc: Expand documentation on asmap feature and tooling af1632ca06
  246. ci: Use without embedded asmap build option in one ci job d9692f2da9
  247. doc: Add initial asmap data documentation 89b87c5592
  248. fjahr force-pushed on Nov 25, 2025
  249. DrahtBot added the label CI failed on Nov 25, 2025
  250. DrahtBot commented at 1:04 am on November 25, 2025: contributor

    🚧 At least one of the CI tasks failed. Task lint: https://github.com/bitcoin/bitcoin/actions/runs/19654544240/job/56288400840 LLM reason (✨ experimental): CI failed due to a lint error from ruff: an f-string without placeholders in the Python tests (feature_asmap).

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  251. DrahtBot removed the label Needs rebase on Nov 25, 2025
  252. DrahtBot removed the label CI failed on Nov 25, 2025

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-11-28 03:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me