Avoid rebucketing on restart when it's not necessary #20539

pull naumenkogs wants to merge 2 commits into bitcoin:master from naumenkogs:2020-12-01-rebucket-asmap-fix changing 1 files +15 −2
  1. naumenkogs commented at 8:21 AM on December 1, 2020: member

    Previously, we would re-bucket AddrMan entries if, at a restart, a node reads a peers.dat file of pre-asmap format.

    This should not happen if an actual asmap file was not provided, because bucket assignment won't change.

    Re-bucketing should be avoided because it leads to a data loss: since we don't remember where those addrs came from, they will be placed only into 1 bucket (as opposed to placing in up to 8, if we receive them from 8 different peers during node operation).

    For peers.dat files with pre-asmap format, re-bucket only if an asmap file was supported.

  2. naumenkogs commented at 8:23 AM on December 1, 2020: member

    The bug was found by @jnewbery here.

    This probably should be backported to 0.20 and 0.21.

    cc @vasild

  3. fanquake added the label P2P on Dec 1, 2020
  4. naumenkogs commented at 8:26 AM on December 1, 2020: member

    To be clear, the issue we're solving here happens only once when a node updates from <0.20 to 0.20+. After that, even if asmap was not supplied, peers.dat will be stored on disk in Format::V2_ASMAP, and next time a node restarted, the check won't be triggered. (even before this PR)

  5. DrahtBot commented at 9:36 AM on December 1, 2020: member

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--174a7506f384e20aa4161008e828411d-->

    Conflicts

    No conflicts as of last run.

  6. in src/addrman.h:548 in 58dfe2b090 outdated
     547 | +                (asmap_was_supported && serialized_asmap_version != supplied_asmap_version);
     548 | +
     549 | +            const bool constants_changed_incompatibly = nUBuckets != ADDRMAN_NEW_BUCKET_COUNT || info.nRefCount >= ADDRMAN_NEW_BUCKETS_PER_ADDRESS;
     550 | +            const bool assignment_changed = vvNew[bucket][nUBucketPos] != -1;
     551 | +
     552 | +            if (!asmap_changed && !constants_changed_incompatibly && !assignment_changed) {
    


    vasild commented at 2:04 PM on December 1, 2020:

    I think this would be more readable as:

    if (asmap_is_same && constants_are_same && assignment_is_same) {
    

    (and of course variables renamed and re-assigned)


    naumenkogs commented at 4:31 PM on December 1, 2020:

    "Same" is not quite true... For example, if "ADDRMAN_NEW_BUCKETS_PER_ADDRESS" was increased, we should ignore it here, although technically it changed :)

  7. in src/addrman.h:543 in 58dfe2b090 outdated
     542 | +            // asmap has changed in two cases:
     543 | +            // - either it wasn't supported by the addr file, but now asmap is supplied;
     544 | +            // - or it was supported, but we have a different asmap now (the lack of asmap counts as well)
     545 | +            const bool asmap_was_supported = format >= Format::V2_ASMAP;
     546 | +            const bool asmap_changed = (!asmap_was_supported && m_asmap.size() != 0) ||
     547 | +                (asmap_was_supported && serialized_asmap_version != supplied_asmap_version);
    


    vasild commented at 2:24 PM on December 1, 2020:
    I think the second commit is not an equivalent change (it is behavioral change):
    
    asmap_was_supported == format >= Format::V2_ASMAP;
    asmap_changed ==
        (!asmap_was_supported && m_asmap.size() != 0) ||
        (asmap_was_supported && serialized_asmap_version != supplied_asmap_version);
    
    is equivalent to
    
    asmap_changed ==
        (!(format >= Format::V2_ASMAP) && m_asmap.size() != 0) ||
        (format >= Format::V2_ASMAP && serialized_asmap_version != supplied_asmap_version);
    
    is equivalent to
    
    asmap_changed ==
        (format < Format::V2_ASMAP && m_asmap.size() != 0) ||
        (format >= Format::V2_ASMAP && serialized_asmap_version != supplied_asmap_version);
    
    is equivalent to
    
    !asmap_changed ==
        !(format < Format::V2_ASMAP && m_asmap.size() != 0) &&
        !(format >= Format::V2_ASMAP && serialized_asmap_version != supplied_asmap_version);
    
    is equivalent to
    
    !asmap_changed ==
        (format >= Format::V2_ASMAP || m_asmap.size() == 0) &&
        (format < Format::V2_ASMAP || serialized_asmap_version == supplied_asmap_version);
    
    but in the "if", !asmap_changed replaces not the above but something else:
    
    (format >= Format::V2_ASMAP && serialized_asmap_version == supplied_asmap_version) ||
    (format < Format::V2_ASMAP && m_asmap.size() == 0)
    

    naumenkogs commented at 8:22 AM on December 2, 2020:

    Well, I think they are actually equal, assuming that both can be simplified to serialized_asmap_version == supplied_asmap_version in our case :)


    naumenkogs commented at 8:22 AM on December 2, 2020:

    Anyway, this part will be no relevant when I simplify the first commit.

  8. in src/addrman.h:531 in 58dfe2b090 outdated
     526 | @@ -527,10 +527,25 @@ friend class CAddrManTest;
     527 |              CAddrInfo &info = mapInfo[n];
     528 |              int bucket = entryToBucket[n];
     529 |              int nUBucketPos = info.GetBucketPosition(nKey, true, bucket);
     530 | -            if (((format >= Format::V2_ASMAP && serialized_asmap_version == supplied_asmap_version) ||
     531 | -                (format < Format::V2_ASMAP && m_asmap.size() == 0)) &&
    


    vasild commented at 2:29 PM on December 1, 2020:

    This is introduced in the first commit:

    (format >= Format::V2_ASMAP && serialized_asmap_version == supplied_asmap_version) ||
    (format < Format::V2_ASMAP && m_asmap.size() == 0)
    

    Is it not equivalent to just serialized_asmap_version == supplied_asmap_version? After all, if the hash of the supplied and the serialized version is the same, then asmap did not change.

  9. Previously, we would re-bucket AddrMan entries if, at a restart,
    a node reads a peers.dat file of pre-asmap format.
    
    This should not happen if an actual asmap file was not provided,
    because bucket assignment won't change.
    
    Re-bucketing should be avoided because it leads to a data loss:
    since we don't remember where those addrs came from, they
    will be placed only into 1 bucket (as opposed to placing
    in up to 8, if we receive them from 8 different peers during node operation).
    
    For peers.dat files with pre-asmap format, re-bucket only if an asmap file
    was supported, which can be achieved by an already existing check:
    serialized_asmap_version == supplied_asmap_version
    e03a716fad
  10. naumenkogs force-pushed on Dec 2, 2020
  11. naumenkogs force-pushed on Dec 2, 2020
  12. naumenkogs force-pushed on Dec 2, 2020
  13. Refactor deciding to re-bucket on restart
    No behavior change.
    c2b3e4b97e
  14. in src/addrman.h:544 in 0288ed8dbf outdated
     541 | +            // - either it wasn't supported by the addr file, but now asmap is supplied;
     542 | +            // - or it was supported, but we have a different asmap now (the lack of asmap counts as well)
     543 | +            const bool asmap_is_same = serialized_asmap_version == supplied_asmap_version;
     544 | +
     545 | +            const bool constants_are_compatible = nUBuckets == ADDRMAN_NEW_BUCKET_COUNT && info.nRefCount < ADDRMAN_NEW_BUCKETS_PER_ADDRESS;
     546 | +            const bool assignment_is_compatible = vvNew[bucket][nUBucketPos] == -1;
    


    jnewbery commented at 9:23 AM on December 2, 2020:

    I still think these assignments should happen outside the loop, since they're constant and don't depend on the loop variable. See #16702 (review).


    naumenkogs commented at 9:25 AM on December 2, 2020:

    Yeah, some of them. I'll move them out.

  15. naumenkogs force-pushed on Dec 2, 2020
  16. vasild approved
  17. vasild commented at 6:23 AM on December 3, 2020: member

    ACK c2b3e4b97

  18. jnewbery commented at 11:40 AM on December 3, 2020: member

    #20557 fixes this issue and a couple of other bugs in addrman unserialization. I suggest we close this PR in favour of that one.

  19. naumenkogs closed this on Dec 3, 2020

  20. DrahtBot locked this on Feb 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-27 06:14 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me