[p2p] Pimpl AddrMan to abstract implementation details #22950

pull amitiuttarwar wants to merge 13 commits into bitcoin:master from amitiuttarwar:2021-09-impl-addrman-pimpl changing 20 files +857 −704
  1. amitiuttarwar commented at 3:02 am on September 11, 2021: contributor

    Introduce the pimpl pattern for AddrMan to separate the implementation details from the externally used object representation. This reduces compile-time dependencies and conceptually clarifies AddrMan’s interface from the implementation specifics.

    Since the unit & fuzz tests currently rely on accessing AddrMan internals, this PR introduces addrman_impl.h, which is exclusively imported by addrman.cpp and test files.

  2. amitiuttarwar added the label P2P on Sep 11, 2021
  3. fanquake requested review from jnewbery on Sep 11, 2021
  4. fanquake requested review from ajtowns on Sep 11, 2021
  5. DrahtBot commented at 2:56 pm on September 11, 2021: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #23140 (Make CAddrman::Select_ select buckets, not positions, first by sipa)
    • #23035 (p2p, rpc, test: expose tried and refcount in getnodeaddresses, update/improve tests by jonatack)
    • #22937 (refactor: Forbid calling unsafe fs::path(std::string) constructor and fs::path::string() method by ryanofsky)
    • #22872 (log: improve checkaddrman logging with duration in milliseconds by jonatack)
    • #22839 (log: improve addrman logging by mzumsande)
    • #22508 (fuzz: replace every fuzzer-controlled while loop with a macro by apoelstra)
    • #20295 (rpc: getblockfrompeer by Sjors)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  6. amitiuttarwar force-pushed on Sep 13, 2021
  7. amitiuttarwar force-pushed on Sep 13, 2021
  8. amitiuttarwar commented at 10:57 pm on September 13, 2021: contributor

    Fixed the lint failure that was caused by the circular dependencies check with 95a066efff8c63b7d81d93977c1980e019708fa9. The check treated file_name.h and file_name.cpp as the same module, so the import pattern here of addrman.h <- addrman_impl.h <- addrman.cpp was perceived as a circular dependency. I fixed this by telling it to also treat file_name_impl.h as part of the same module.

    More info about updated fix here: #22950 (comment)

  9. jnewbery commented at 12:14 pm on September 14, 2021: member

    Concept ACK. I’m a big fan of compilation firewall patterns. From Herb Sutter:

    One big advantage of this idiom is that it breaks compile-time dependencies. First, system builds run faster because using a Pimpl can eliminate extra #includes. I have worked on projects where converting just a few widely-visible classes to use Pimpls has halved the system’s build time. Second, it localizes the build impact of code changes because the parts of a class that reside in the Pimpl can be freely changed – that is, members can be freely added or removed – without recompiling client code. Because it’s so good at eliminating compilation cascades due to changes in only the now-hidden members, it’s often dubbed a “compilation firewall.”

    (https://herbsutter.com/gotw/_100/)

    Both are relevant here. addrman.h is eventually included by just about everything, so minimizing its compilation time is potentially a big win. There’s also a plan to rework its implementation. If we first make addrman into a pimpl, that could theoretically be done without any impact on the rest of the codebase.

  10. jonatack commented at 5:59 pm on September 15, 2021: member

    Concept ACK on improved separation and shorter build times. Debug build clean. Just read through pages 147-156 of Scott Meyers’ “Effective Modern C++” about the (unique) pointer to implementation idiom and started looking at the commit organisation, particularly the choices in the second (“Introduce CAddrMan::Impl to encapsulate addrman implementation”) and third (“Remove external dependencies on CAddrInfo objects”) commits where the action appears to be.

    Edit, noting for myself, also:

    • Scott Meyers, “Effective C++”, item 31
    • Herb Sutter, “Exceptional C++”, items 26-30
  11. laanwj commented at 4:24 pm on September 16, 2021: member

    Concept ACK

    Both are relevant here. addrman.h is eventually included by just about everything, so minimizing its compilation time is potentially a big win.

    Yes, at some point there was discussion about doing this for txmempool.h too, as it’s included a fair bit and includes all these boost multi_index structures, which i’m sure slow down compilation quite a bit.

  12. amitiuttarwar force-pushed on Sep 16, 2021
  13. amitiuttarwar force-pushed on Sep 16, 2021
  14. amitiuttarwar commented at 4:34 pm on September 16, 2021: contributor
    adopted a different fix to the detected circular dependency- I made the impl a stand-alone class, so it’s now AddrManImpl instead of AddrMan::Impl. this means addrman_impl.h does not need to import addrman.h, and the linter script does not need to be updated :)
  15. laanwj added this to the "Blockers" column in a project

  16. DrahtBot added the label Needs rebase on Sep 20, 2021
  17. JeremyRubin commented at 3:29 am on September 21, 2021: contributor
    concept ack!
  18. amitiuttarwar force-pushed on Sep 21, 2021
  19. amitiuttarwar commented at 9:55 pm on September 21, 2021: contributor
    thanks for the concept ACKs! rebased to incorporate the recent addrman changes.
  20. amitiuttarwar force-pushed on Sep 21, 2021
  21. DrahtBot removed the label Needs rebase on Sep 21, 2021
  22. amitiuttarwar commented at 0:35 am on September 22, 2021: contributor
    I had missed a silent merge conflict, fixed now and ready for review 🎈
  23. in src/addrman.h:51 in 11926b0993 outdated
    63-
    64-/**
    65- * Stochastical (IP) address manager
    66- */
    67-class CAddrMan
    68+class AddrManImpl;
    


    jnewbery commented at 11:33 am on September 22, 2021:
    This is a forward declaration, so the usual place for it would be at the top of the file. I think placing it here would also means that the doxygen comment would be associated with AddrManImpl instead of AddrMan.

    amitiuttarwar commented at 10:20 pm on September 22, 2021:
    fixed
  24. in src/addrman.cpp:1010 in 11926b0993 outdated
    1008+template void AddrMan::Unserialize(CHashVerifier<CAutoFile>& s);
    1009+template void AddrMan::Unserialize(CDataStream& s);
    1010+template void AddrMan::Unserialize(CHashVerifier<CDataStream>& s);
    1011+
    1012+template <typename Stream>
    1013+void AddrMan::Serialize(Stream& s_) const
    


    jnewbery commented at 12:26 pm on September 22, 2021:
    This may just be personal preference, but I think this file would be tidier/easier to read if all of these AddrMan boilerplate forwarding functions were collected at the bottom (as is done in txrequest.cpp), rather than interleaved with AddrManImpl functions.

    amitiuttarwar commented at 10:21 pm on September 22, 2021:
    I agree, updated
  25. in src/addrman_impl.h:279 in 11926b0993 outdated
    274+
    275+    /**
    276+     * This internal function adds one parameter to the GetAddr wrapper.
    277+     *
    278+     * @param[out] vAddr         Vector of randomly selected addresses from vRandom.
    279+     */
    


    jnewbery commented at 12:31 pm on September 22, 2021:
    Maybe we should just update the internal GetAddr_() function to return a std::vector<CAddress> instead of introducing this comment?

    amitiuttarwar commented at 10:22 pm on September 22, 2021:
    sounds good, added commit 5a8415750cdc93b719a8e4c18ee245e236b93006 to handle that
  26. in src/addrman_impl.h:12 in 11926b0993 outdated
     5+#ifndef BITCOIN_ADDRMAN_IMPL_H
     6+#define BITCOIN_ADDRMAN_IMPL_H
     7+
     8+#include <logging.h>
     9+
    10+#include <sync.h>
    


    jnewbery commented at 12:32 pm on September 22, 2021:
    Should be grouped above with logging.h, since sync.h is a project header and not a stl header.

    amitiuttarwar commented at 10:22 pm on September 22, 2021:
    done
  27. in src/addrman_impl.h:19 in 11926b0993 outdated
     7+
     8+#include <logging.h>
     9+
    10+#include <sync.h>
    11+#include <unordered_map>
    12+#include <unordered_set>
    


    jnewbery commented at 12:38 pm on September 22, 2021:

    Also include:

    • cstdint (for fixed width integer types)
    • netaddress.h (for CNetAddr)
    • serialize.h (for serialization functions)
    • protocol.h (for CAddress)
    • uint256.h (for uint256)
    • vector (for std::vector)
    • utility (for std::pair)
    • set (for std::set)
    • optional (for std::optional)

    amitiuttarwar commented at 10:23 pm on September 22, 2021:
    done
  28. in src/addrman.h:12 in 11926b0993 outdated
     5@@ -6,94 +6,19 @@
     6 #ifndef BITCOIN_ADDRMAN_H
     7 #define BITCOIN_ADDRMAN_H
     8 
     9-#include <fs.h>
    10-#include <logging.h>
    11 #include <netaddress.h>
    12 #include <protocol.h>
    13-#include <sync.h>
    14 #include <timedata.h>
    


    jnewbery commented at 12:41 pm on September 22, 2021:
    Also include streams.h

    amitiuttarwar commented at 10:23 pm on September 22, 2021:
    done
  29. jnewbery commented at 12:49 pm on September 22, 2021: member

    Code review ACK 11926b09935f491c33b611f1a112a800377a481a. Just a few very minor comments inline, and a few commit order comments:

    1.

    In the first commit ([addrman] Introduce CAddrMan::Impl to encapsulate addrman implementation.), you slightly change the function bodies of SelectTriedCollision() and Select():

    0         Check();
    1-        const CAddrInfo ret = SelectTriedCollision_();
    2+        const auto ret = SelectTriedCollision_();
    3         Check();
    

    and

    0         Check();
    1-        const CAddrInfo addrRet = Select_(newOnly);
    2+        const auto addrRet = Select_(newOnly);
    3         Check();
    

    at the same time as moving that code from addrman.h to addrman.cpp. That means that it doesn’t show up as moved code when using git diff --color-moved=dimmed-zebra. I think those changes to auto can happen in the second commit ([net, addrman] Remove external dependencies on CAddrInfo objects)

    2.

    You add #include <addrman_impl.h> to fuzz/deserialize.cpp in the first commit ([addrman] Introduce CAddrMan::Impl to encapsulate addrman implementation.). I think it’s only needed there from the 3rd commit ([move-only] Move CAddrInfo to test-only header file), since the include is only needed for the definition of CAddrInfo.

    3.

    The updates to the Select() and SelectTriedCollision() doxygen comments should be included in the changes to the interface in commit 2 ([net, addrman] Remove external dependencies on CAddrInfo objects)

    4.

    In commit scripted-diff: Rename CAddrMan to AddrMan, is there a reason not to use git grep -l CAddrMan?

  30. amitiuttarwar force-pushed on Sep 22, 2021
  31. amitiuttarwar commented at 10:24 pm on September 22, 2021: contributor

    thanks for the review @jnewbery. I took all your suggestions except for this one that I have a question about-

    In commit scripted-diff: Rename CAddrMan to AddrMan, is there a reason not to use git grep -l CAddrMan?

    I was trying this out, but am unsure how to exclude release notes. This is what I have so far, any suggestions? git grep -l CAddrMan | xargs sed -i 's/CAddrMan/AddrMan/g'

    also, thank you for the feedback about commit breakdown. I hope its all straightened out now :)

  32. jamesob commented at 11:39 pm on September 22, 2021: member
    If this is being done in the name of compilation performance, do you have any measurements justifying the change?
  33. MarcoFalke commented at 6:18 am on September 23, 2021: member

    I was trying this out, but am unsure how to exclude release notes.

    git grep -l "foo_bar" ./src

  34. in src/addrman_impl.h:102 in bb22a365c8 outdated
     97+};
     98+
     99+class AddrManImpl
    100+{
    101+public:
    102+    AddrManImpl(std::vector<bool>&& asmap, bool deterministic, int32_t consistency_check_ratio)
    


    jnewbery commented at 10:14 am on September 23, 2021:
    What do you think about having the ctor and dtor function bodies in addrman.cpp, so that addrman_impl.h only contains the function declarations?

    amitiuttarwar commented at 9:10 pm on September 24, 2021:
    done
  35. in src/addrman.cpp:1118 in bb22a365c8 outdated
    1093+const std::vector<bool>& AddrManImpl::GetAsmap() const
    1094+{
    1095+    return m_asmap;
    1096+}
    1097+
    1098+// explicit instantiation
    


    jnewbery commented at 10:15 am on September 23, 2021:
    It seems more natural to have these instantiations below the function definitions for Serialize() and Unserialize().

    amitiuttarwar commented at 8:40 pm on September 24, 2021:

    to make sure I understand correctly, is this the order you’re suggesting?

    • AddrManImpl::Serialize
    • AddrManImpl::Unserialize
    • explicit instantiations
    • AddrManImpl [ other functions ]
    • AddrMan::Serialize
    • AddrMan::Unserialize

    The reason I think the current order makes more sense is because the explicit instantiations are for the AddrMan template, not the impl template.


    jnewbery commented at 9:47 am on September 27, 2021:
    I mean that these instantiations of AddrMan::Serialize() and AddrMan::Unserialize() should be below the template function definitions for AddrMan::Serialize() and AddrMan::Unserialize() (i.e. just move them down 10 lines).

    amitiuttarwar commented at 2:40 am on September 29, 2021:
    ahhhh, that makes more sense. done
  36. in src/addrman.cpp:106 in bb22a365c8 outdated
    118-        for (auto& entry : bucket) {
    119-            entry = -1;
    120-        }
    121-    }
    122-}
    123+AddrMan::AddrMan(std::vector<bool> asmap, bool deterministic, int32_t consistency_check_ratio)
    


    jnewbery commented at 10:25 am on September 23, 2021:
    Could these AddrMan ctor and dtor methods also be moved to the bottom, below the AddrManImpl methods?

    amitiuttarwar commented at 9:10 pm on September 24, 2021:
    done
  37. in src/addrman_impl.h:268 in bb22a365c8 outdated
    277+    //! Consistency check, taking into account m_consistency_check_ratio. Will std::abort if an inconsistency is detected.
    278+    void Check() const EXCLUSIVE_LOCKS_REQUIRED(cs);
    279+
    280+    //! Perform consistency check, regardless of m_consistency_check_ratio.
    281+    //! @returns an error code or zero.
    282+    int ForceCheckAddrman() const EXCLUSIVE_LOCKS_REQUIRED(cs);
    


    jnewbery commented at 10:30 am on September 23, 2021:
    Perhaps these two methods should be moved above Good_, so that all the underscore internal functions are grouped together.

    amitiuttarwar commented at 8:51 pm on September 24, 2021:
    hm, but they are still internal functions?

    jnewbery commented at 9:50 am on September 27, 2021:
    By “internal functions”, I mean private functions that are called by the public “outer functions”. There’s a pattern that the public Good() function calls private Good_(), public Add() calls private Add_(), etc. It makes sense to group all of the FunctionName_() functions together.

    amitiuttarwar commented at 2:42 am on September 29, 2021:
    okay, I introduced this new commit: #22950. I moved Check and ForceCheck to be after all the Function_. And I moved some additional stuff around to be consistently ordered in declarations & definitions.
  38. jnewbery commented at 10:41 am on September 23, 2021: member

    A couple more suggestions for the git ordering/commit logs:

    1.

    I think the first commit [addrman] Introduce CAddrMan::Impl to encapsulate addrman implementation. could be split into two commits, to make it even easier for reviewers to verify that there have been no changes. The first commit would move all of the function bodies (eg the outer size() ,Add(), Good(), etc) out of addrman.h into addrman.cpp and could be very easily reviewed with git diff --color-moved=dimmed-zebra --color-moved-ws=ignore-all-space. The second commit would add the pimpl. That means the commit isn’t a mixture of new code, renames and moves.

    2.

    The commit log for [doc] Update comments includes “Remove them from the internal ones unless there is a divergence in function behaviors.”, which can now be removed, since there are no longer any instances where the behaviour diverges.

  39. jnewbery commented at 11:11 am on September 23, 2021: member

    If this is being done in the name of compilation performance, do you have any measurements justifying the change? @jamesob - the most significant impact would be on recompilation times. By totally separating the implementation from the interface, we can make changes to the implementation without any impact on client code that calls into this component. For example, on this branch, making the following change:

     0diff --git a/src/addrman.cpp b/src/addrman.cpp
     1index ac2635249c..1257a7503f 100644
     2--- a/src/addrman.cpp
     3+++ b/src/addrman.cpp
     4@@ -1015,7 +1015,7 @@ bool AddrManImpl::Add(const std::vector<CAddress>& vAddr, const CNetAddr& source
     5         nAdd += Add_(*it, source, nTimePenalty) ? 1 : 0;
     6     Check();
     7     if (nAdd) {
     8-        LogPrint(BCLog::ADDRMAN, "Added %i addresses from %s: %i tried, %i new\n", nAdd, source.ToString(), nTried, nNew);
     9+        LogPrint(BCLog::ADDRMAN, "Added %i addresses from %s: %i tried, %i new, %i collisions\n", nAdd, source.ToString(), nTried, nNew, m_tried_collisions.size());
    10     }
    11     return nAdd > 0;
    12 }
    

    Recompiles for me in about 35s:

     0→ time make
     1Making all in src
     2make[1]: Entering directory '/home/vagrant/bitcoin/src'
     3make[2]: Entering directory '/home/vagrant/bitcoin/src'
     4make[3]: Entering directory '/home/vagrant/bitcoin'
     5make[3]: Leaving directory '/home/vagrant/bitcoin'
     6  CXX      libbitcoin_server_a-addrman.o
     7  AR       libbitcoin_server.a
     8  CXXLD    bitcoind
     9  CXXLD    test/test_bitcoin
    10  CXXLD    bench/bench_bitcoin
    11  CXXLD    test/fuzz/fuzz
    12make[2]: Leaving directory '/home/vagrant/bitcoin/src'
    13make[1]: Leaving directory '/home/vagrant/bitcoin/src'
    14Making all in doc/man
    15make[1]: Entering directory '/home/vagrant/bitcoin/doc/man'
    16make[1]: Nothing to be done for 'all'.
    17make[1]: Leaving directory '/home/vagrant/bitcoin/doc/man'
    18make[1]: Entering directory '/home/vagrant/bitcoin'
    19make[1]: Nothing to be done for 'all-am'.
    20make[1]: Leaving directory '/home/vagrant/bitcoin'
    21
    22real    0m35.758s
    23user    0m26.836s
    24sys     0m7.951s
    

    On master, making the equivalent change:

     0diff --git a/src/addrman.h b/src/addrman.h
     1index 7dd8528bef..08f158f6c6 100644
     2--- a/src/addrman.h
     3+++ b/src/addrman.h
     4@@ -174,7 +174,7 @@ public:
     5             nAdd += Add_(*it, source, nTimePenalty) ? 1 : 0;
     6         Check();
     7         if (nAdd) {
     8-            LogPrint(BCLog::ADDRMAN, "Added %i addresses from %s: %i tried, %i new\n", nAdd, source.ToString(), nTried, nNew);
     9+            LogPrint(BCLog::ADDRMAN, "Added %i addresses from %s: %i tried, %i new, %i collisions\n", nAdd, source.ToString(), nTried, nNew, m_tried_collisions.size());
    10         }
    11         return nAdd > 0;
    12     }
    

    results in a very long build of 9m29s, since those changes get included in many other translation units:

      0 time make
      1Making all in src
      2make[1]: Entering directory '/home/vagrant/bitcoin/src'
      3make[2]: Entering directory '/home/vagrant/bitcoin/src'
      4make[3]: Entering directory '/home/vagrant/bitcoin'
      5make[3]: Leaving directory '/home/vagrant/bitcoin'
      6  CXX      init/bitcoind-bitcoind.o
      7  CXX      libbitcoin_server_a-addrdb.o
      8  CXX      libbitcoin_server_a-addrman.o
      9  CXX      libbitcoin_server_a-init.o
     10  CXX      libbitcoin_server_a-mapport.o
     11  CXX      libbitcoin_server_a-net.o
     12  CXX      libbitcoin_server_a-net_processing.o
     13  CXX      node/libbitcoin_server_a-context.o
     14  CXX      node/libbitcoin_server_a-interfaces.o
     15  CXX      node/libbitcoin_server_a-transaction.o
     16  CXX      rpc/libbitcoin_server_a-mining.o
     17  CXX      rpc/libbitcoin_server_a-net.o
     18  CXX      libbitcoin_server_a-torcontrol.o
     19  CXX      libbitcoin_server_a-txorphanage.o
     20  CXX      libbitcoin_server_a-txrequest.o
     21  CXX      wallet/libbitcoin_server_a-init.o
     22  AR       libbitcoin_server.a
     23  CXX      interfaces/libbitcoin_util_a-init.o
     24  AR       libbitcoin_util.a
     25  CXXLD    bitcoind
     26  CXXLD    bitcoin-cli
     27  CXXLD    bitcoin-tx
     28  CXXLD    bitcoin-wallet
     29  CXXLD    bitcoin-util
     30  CXX      test/test_bitcoin-addrman_tests.o
     31  CXX      test/test_bitcoin-denialofservice_tests.o
     32  CXX      test/test_bitcoin-i2p_tests.o
     33  CXX      test/test_bitcoin-validation_tests.o
     34  CXX      test/test_bitcoin-net_peer_eviction_tests.o
     35  CXX      test/test_bitcoin-net_tests.o
     36  CXX      test/test_bitcoin-txrequest_tests.o
     37  CXX      test/util/libtest_util_a-net.o
     38  CXX      test/util/libtest_util_a-setup_common.o
     39  AR       libtest_util.a
     40  CXXLD    test/test_bitcoin
     41  CXX      test/fuzz/fuzz-addition_overflow.o
     42  CXX      test/fuzz/fuzz-addrman.o
     43  CXX      test/fuzz/fuzz-autofile.o
     44  CXX      test/fuzz/fuzz-banman.o
     45  CXX      test/fuzz/fuzz-block_header.o
     46  CXX      test/fuzz/fuzz-blockfilter.o
     47  CXX      test/fuzz/fuzz-bloom_filter.o
     48  CXX      test/fuzz/fuzz-buffered_file.o
     49  CXX      test/fuzz/fuzz-chain.o
     50  CXX      test/fuzz/fuzz-checkqueue.o
     51  CXX      test/fuzz/fuzz-coins_view.o
     52  CXX      test/fuzz/fuzz-connman.o
     53  CXX      test/fuzz/fuzz-crypto.o
     54  CXX      test/fuzz/fuzz-crypto_aes256.o
     55  CXX      test/fuzz/fuzz-crypto_aes256cbc.o
     56  CXX      test/fuzz/fuzz-crypto_chacha20.o
     57  CXX      test/fuzz/fuzz-crypto_chacha20_poly1305_aead.o
     58  CXX      test/fuzz/fuzz-crypto_common.o
     59  CXX      test/fuzz/fuzz-crypto_hkdf_hmac_sha256_l32.o
     60  CXX      test/fuzz/fuzz-crypto_poly1305.o
     61  CXX      test/fuzz/fuzz-cuckoocache.o
     62  CXX      test/fuzz/fuzz-deserialize.o
     63  CXX      test/fuzz/fuzz-fee_rate.o
     64  CXX      test/fuzz/fuzz-fees.o
     65  CXX      test/fuzz/fuzz-flatfile.o
     66  CXX      test/fuzz/fuzz-float.o
     67test/fuzz/float.cpp: In function void float_fuzz_target(FuzzBufferType):
     68test/fuzz/float.cpp:60:30: warning: tmp may be used uninitialized in this function [-Wmaybe-uninitialized]
     69   60 |         assert(std::isnan(d) || d == d_deserialized);
     70      |                              ^~
     71  CXX      test/fuzz/fuzz-golomb_rice.o
     72  CXX      test/fuzz/fuzz-http_request.o
     73  CXX      test/fuzz/fuzz-i2p.o
     74  CXX      test/fuzz/fuzz-integer.o
     75  CXX      test/fuzz/fuzz-kitchen_sink.o
     76  CXX      test/fuzz/fuzz-load_external_block_file.o
     77  CXX      test/fuzz/fuzz-merkleblock.o
     78  CXX      test/fuzz/fuzz-message.o
     79  CXX      test/fuzz/fuzz-muhash.o
     80  CXX      test/fuzz/fuzz-multiplication_overflow.o
     81  CXX      test/fuzz/fuzz-net.o
     82  CXX      test/fuzz/fuzz-net_permissions.o
     83  CXX      test/fuzz/fuzz-netaddress.o
     84  CXX      test/fuzz/fuzz-netbase_dns_lookup.o
     85  CXX      test/fuzz/fuzz-node_eviction.o
     86  CXX      test/fuzz/fuzz-p2p_transport_serialization.o
     87  CXX      test/fuzz/fuzz-parse_hd_keypath.o
     88  CXX      test/fuzz/fuzz-policy_estimator.o
     89  CXX      test/fuzz/fuzz-policy_estimator_io.o
     90  CXX      test/fuzz/fuzz-pow.o
     91  CXX      test/fuzz/fuzz-primitives_transaction.o
     92  CXX      test/fuzz/fuzz-process_message.o
     93  CXX      test/fuzz/fuzz-process_messages.o
     94  CXX      test/fuzz/fuzz-protocol.o
     95  CXX      test/fuzz/fuzz-random.o
     96  CXX      test/fuzz/fuzz-rbf.o
     97  CXX      test/fuzz/fuzz-rolling_bloom_filter.o
     98  CXX      test/fuzz/fuzz-rpc.o
     99  CXX      test/fuzz/fuzz-script.o
    100  CXX      test/fuzz/fuzz-script_bitcoin_consensus.o
    101  CXX      test/fuzz/fuzz-script_descriptor_cache.o
    102  CXX      test/fuzz/fuzz-script_interpreter.o
    103  CXX      test/fuzz/fuzz-script_ops.o
    104  CXX      test/fuzz/fuzz-script_sigcache.o
    105  CXX      test/fuzz/fuzz-script_sign.o
    106  CXX      test/fuzz/fuzz-scriptnum_ops.o
    107  CXX      test/fuzz/fuzz-secp256k1_ec_seckey_import_export_der.o
    108  CXX      test/fuzz/fuzz-secp256k1_ecdsa_signature_parse_der_lax.o
    109  CXX      test/fuzz/fuzz-signature_checker.o
    110  CXX      test/fuzz/fuzz-signet.o
    111  CXX      test/fuzz/fuzz-socks5.o
    112  CXX      test/fuzz/fuzz-span.o
    113  CXX      test/fuzz/fuzz-string.o
    114  CXX      test/fuzz/fuzz-strprintf.o
    115  CXX      test/fuzz/fuzz-system.o
    116  CXX      test/fuzz/fuzz-timedata.o
    117  CXX      test/fuzz/fuzz-torcontrol.o
    118  CXX      test/fuzz/fuzz-tx_pool.o
    119  CXX      test/fuzz/fuzz-txrequest.o
    120  CXX      test/fuzz/fuzz-utxo_snapshot.o
    121  CXX      test/fuzz/fuzz-validation_load_mempool.o
    122  CXX      test/fuzz/fuzz-versionbits.o
    123  CXX      test/fuzz/libtest_fuzz_a-util.o
    124  AR       libtest_fuzz.a
    125  CXXLD    test/fuzz/fuzz
    126make[2]: Leaving directory '/home/vagrant/bitcoin/src'
    127make[1]: Leaving directory '/home/vagrant/bitcoin/src'
    128Making all in doc/man
    129make[1]: Entering directory '/home/vagrant/bitcoin/doc/man'
    130make[1]: Nothing to be done for 'all'.
    131make[1]: Leaving directory '/home/vagrant/bitcoin/doc/man'
    132make[1]: Entering directory '/home/vagrant/bitcoin'
    133make[1]: Nothing to be done for 'all-am'.
    134make[1]: Leaving directory '/home/vagrant/bitcoin'
    135
    136real	9m29.276s
    137user	8m34.052s
    138sys	0m51.490s
    

    (gui and bench were disabled for all builds - I expect the difference would be even more pronounced with those components also enabled).

    There may be a marginal speed up in compiling from scratch since the header is slimmed down, but I expect it’d be a small improvement.

    However, for me the improved recompilation time is not the main motivation for this change. There are two motivations which I think are more important:

    1. Slimming down the header file to just well-documented interface methods makes it much easier for people writing client code that relies on addrman to very quickly understand the class’s responsibilities and contracts without having to concern themselves with the internal implementation details.
    2. The upcoming rework of addrman (https://github.com/sipa/bitcoin/tree/202106_multiindex_addrman) can be achieved with just changes to addrman.cpp, leaving the interface and header file the same. That makes it much easier to review, verify that there are no functional changes, and test conformance between the old and new implementations.
  40. DrahtBot added the label Needs rebase on Sep 24, 2021
  41. amitiuttarwar force-pushed on Sep 24, 2021
  42. amitiuttarwar commented at 9:09 pm on September 24, 2021: contributor

    latest push:

    • rebased to address a small conflict with #23025
    • updated the scripted diffs to use grep to generate the file names
    • split up the first commit into two parts as suggested in #22950#pullrequestreview-761829604
    • moved around some code to address review comments
  43. DrahtBot removed the label Needs rebase on Sep 24, 2021
  44. in src/addrman.cpp:1013 in 21481fe2fe outdated
    1009@@ -1010,9 +1010,8 @@ CAddrInfo AddrManImpl::SelectTriedCollision_()
    1010     int tried_bucket = newInfo.GetTriedBucket(nKey, m_asmap);
    1011     int tried_bucket_pos = newInfo.GetBucketPosition(nKey, false, tried_bucket);
    1012 
    1013-    int id_old = vvTried[tried_bucket][tried_bucket_pos];
    1014-
    1015-    return mapInfo[id_old];
    1016+    auto info_old = mapInfo[vvTried[tried_bucket][tried_bucket_pos]];
    


    mzumsande commented at 9:05 pm on September 25, 2021:
    nit: I think using AddrInfo& instead of auto would make it clearer that it gets sliced to CAddress in the next line.

    amitiuttarwar commented at 2:44 am on September 29, 2021:
    makes sense, done
  45. in src/addrman.cpp:702 in 21481fe2fe outdated
    699 {
    700     AssertLockHeld(cs);
    701 
    702-    if (vRandom.empty())
    703-        return CAddrInfo();
    704+    if (vRandom.empty()) return {};
    


    mzumsande commented at 9:20 pm on September 25, 2021:
    Since this wasn’t obvious to me immediately, just noting that this works because the default constructor of std::pair value-initializes both elements (i.e. this returns a pair with a default-initialized CAddress and 0 for int64_t).

    amitiuttarwar commented at 2:44 am on September 29, 2021:
    great point, thanks for verifying & sharing!
  46. in src/addrman_impl.h:40 in f113b09d8e outdated
    10+ */
    11+class CAddrInfo : public CAddress
    12+{
    13+private:
    14+    //! last try whatsoever by us (memory only)
    15+    int64_t nLastTry{0};
    


    mzumsande commented at 9:38 pm on September 25, 2021:
    The move-only commit f113b09d8ea31bc2e1cf6a5178b3d1d93a1c9bb2 makes nLastTry and nLastCountAttempt private for one commit.

    amitiuttarwar commented at 2:45 am on September 29, 2021:
    fixed it
  47. mzumsande commented at 11:50 am on September 26, 2021: member

    Concept ACK

    I did a first round of review of the changes and it looks really good to me, just some minor comments below.

    One general point: I wonder whether the pattern AddrManImpl::Function() and AddrManImpl::Function_() should still exist in the context of the PImpl, because now there are already two sets of functions that mostly just pass things along. With the recent changes to Check() allowing much more flexibility when and how often to call the checks, maybe it is not necessary anymore and the Check() calls could be merged into the Function_() ? (Changing this shouldn’t be part of this PR, just wanted to bring up the idea).

    a typo: “objecjt” in 527cbd168acb776eb03e4506f186afb5e0a85d90 commit message

  48. in src/addrman_impl.h:23 in 527cbd168a outdated
    18+    template <typename Stream>
    19+    void Unserialize(Stream& s_) EXCLUSIVE_LOCKS_REQUIRED(!cs);
    20+
    21+    size_t size() const EXCLUSIVE_LOCKS_REQUIRED(!cs);
    22+
    23+    bool Add(const std::vector<CAddress> &vAddr, const CNetAddr& source, int64_t nTimePenalty = 0)
    


    martinus commented at 9:06 am on September 27, 2021:
    nit: I’d remove all the default arguments in AddrManImpl, at least in the public interface. Having defaults in both AddrMan and AddrManImpl can be a source of error when not both are updated

    amitiuttarwar commented at 2:46 am on September 29, 2021:

    definitely agree with this feedback! disagree with classifying it as a nit :)

    I believe I removed all the defaults from the public interface, but I left them on the private functions because they aren’t redundant in that situation.

    thank you!

  49. jnewbery commented at 9:37 am on September 27, 2021: member

    One general point: I wonder whether the pattern AddrManImpl::Function() and AddrManImpl::Function_() should still exist in the context of the PImpl, because now there are already two sets of functions that mostly just pass things along. With the recent changes to Check() allowing much more flexibility when and how often to call the checks, maybe it is not necessary anymore and the Check() calls could be merged into the Function_() ? (Changing this shouldn’t be part of this PR, just wanted to bring up the idea). @mzumsande - I agree that this would be nice. Once this PR has been merged, that kind of change can be made without touching the header file and recompiling anything.

    Do you think it would be overkill to have a variadic template CheckExecuteCheck() function that implements the pattern of locking cs, running Check(), calling the internal function, then running Check() again?

  50. jnewbery commented at 10:31 am on September 27, 2021: member

    utACK 0cc7031002079b35953bac1e9d4268b6321893c7

    If you retouch again, you could move the function bodies of CAddrMan::~CAddrMan() and CAddrMan::GetAsmap() into addrman.cpp in the first commit ([move-only] Move CAddrMan function definitions to cpp) to make the second commit cleaner.

  51. DrahtBot added the label Needs rebase on Sep 27, 2021
  52. in src/addrman.cpp:1087 in 29252f3f60 outdated
    1082@@ -1080,10 +1083,9 @@ std::vector<CAddress> AddrManImpl::GetAddr(size_t max_addresses, size_t max_pct,
    1083 {
    1084     LOCK(cs);
    1085     Check();
    1086-    std::vector<CAddress> vAddr;
    1087-    GetAddr_(vAddr, max_addresses, max_pct, network);
    1088+    const auto addresses = GetAddr_(max_addresses, max_pct, network);
    


    theuni commented at 10:26 pm on September 27, 2021:

    Nit: If the compiler doesn’t elide this, the const will force an extra copy: https://stackoverflow.com/questions/25784544/will-returning-a-const-object-from-a-function-prevent-move-construction-from-out/25786015

    Edit: TIL. Looks like this is “permitted” to be elided according to the c++17 final working draft: (emphasis mine)

    This elision of copy/move operations, called copy elision, is permitted in the following circumstances (which may be combined to eliminate multiple copies):

    • (1.1) in a return statement in a function with a class return type, when the expression is the name of a non-volatile automatic object (other than a function parameter or a variable introduced by the exception-declaration of a handler (18.3)) with the same type (ignoring cv-qualification) as the function return type, the copy/move operation can be omitted by constructing the automatic object directly into the function call’s return object

    But “required” according to cppreference:

    Under the following circumstances, the compilers are required to omit the copy and move construction of class objects, even if the copy/move constructor and the destructor have observable side-effects. The objects are constructed directly into the storage where they would otherwise be copied/moved to. The copy/move constructors need not be present or accessible:

    • In a return statement, when the operand is a prvalue of the same class type (ignoring cv-qualification) as the function return type

    So.. 🤷

    Maybe that draft wasn’t final wrt elision or maybe cppreference is wrong. It really doesn’t matter either way :)


    amitiuttarwar commented at 2:50 am on September 29, 2021:

    that’s so interesting, I hadn’t thought about the correlation between copy elision and temporary const variables. thanks for digging in!

    next question is how I get my hands on the spec..


    sipa commented at 3:15 am on September 29, 2021:
    It certainly looks like in practice the const doesn’t prevent elision: https://godbolt.org/z/Go8xhGP1b
  53. theuni commented at 10:59 pm on September 27, 2021: member

    utACK. Needs rebase.

    Only skimmed the move-only and scripted-diff commits. Reviewed in-order and each commit is well-explained. +1 to the contained interface.

    Agree with @mzumsande about cleaning up the internal functions as a next step, but not as a part of this PR.

    It would be nice if we had a time type so it’d be obvious what the second member of the pair was in the return values from Select and friends, but that’s not worth addressing here.

    My only other nit has already been pointed out here: #22950 (review) .

  54. [move-only] Move CAddrMan function definitions to cpp
    In preparation for introducing the pimpl pattern to addrman, move all function
    bodies out of the header file.
    
    Review hint: use git diff --color-moved=dimmed-zebra
    --color-moved-ws=ignore-all-space
    5faa7dd6d8
  55. [move-only] Match ordering of CAddrMan declarations and definitions
    Also move `Check` and `ForceCheckAddrman` to be after the `FunctionName_` functions.
    
    Review hint: use git diff --color-moved=dimmed-zebra
    --color-moved-ws=ignore-all-space
    f2e5f38f09
  56. [addrman] Introduce CAddrMan::Impl to encapsulate addrman implementation.
    Introduce the pimpl pattern for CAddrMan to separate the implementation details
    from the externally used object representation. This reduces compile-time
    dependencies and conceptually clarifies AddrMan's interface from the
    implementation specifics.
    
    Since the unit & fuzz tests currently rely on accessing CAddrMan internals, this
    commit introduces addrman_impl.h, which is exclusively imported by addrman.cpp
    and test files.
    
    Review hint: git diff --color-moved=dimmed-zebra
    --color-moved-ws=ignore-all-space
    8af5b54f97
  57. [net, addrman] Remove external dependencies on CAddrInfo objects
    CAddrInfo objects are an implementation detail of how AddrMan manages and adds
    metadata to different records. Encapsulate this logic by updating Select &
    SelectTriedCollision to return the additional info that the callers need.
    7cba9d5618
  58. [move-only] Move CAddrInfo to test-only header file
    Now that no bitcoind callers require knowledge of the CAddrInfo object, it can
    be moved into the test-only header file.
    
    Review hint: use git diff --color-moved=dimmed-zebra
    --color-moved-ws=ignore-all-space
    e3f1ea659c
  59. [addrman] Change CAddrInfo access
    Since knowledge of CAddrInfo is limited to callsites that import
    addrman_impl.h, only objects in addrman.cpp or the tests have access. Thus we
    can remove calling them friends and make the members public.
    7cf41bbb38
  60. [move-only] Move constants to test-only header
    Review hint: git diff --color-moved=dimmed-zebra
    --color-moved-ws=ignore-all-space
    40acd6fc9a
  61. [refactor] Update GetAddr_() function signature
    Update so the internal function signature matches the external one, as is the
    case for the other addrman functions.
    14f9e000d0
  62. [doc] Update comments
    Maintain comments on the external interfaces rather than on the internal
    functions that implement them.
    29727c2aa1
  63. [includes] Fix up included files 3c263d3f63
  64. scripted-diff: Rename CAddrMan to AddrMan
    -BEGIN VERIFY SCRIPT-
    git grep -l CAddrMan src/ test/ | xargs sed -i 's/CAddrMan/AddrMan/g'
    -END VERIFY SCRIPT-
    dd8f7f2500
  65. scripted-diff: Rename CAddrInfo to AddrInfo
    -BEGIN VERIFY SCRIPT-
    git grep -l CAddrInfo src/ | xargs sed -i 's/CAddrInfo/AddrInfo/g'
    -END VERIFY SCRIPT-
    375750387e
  66. [style] Run changed files through clang formatter. 021f86953e
  67. amitiuttarwar force-pushed on Sep 29, 2021
  68. amitiuttarwar commented at 3:28 am on September 29, 2021: contributor

    Thank you for the thoughtful reviews @mzumsande, @martinus, @jnewbery & @theuni !! I believe I’ve incorporated all the review comments.

    A note on the latest push:

    Mostly I rebased & addressed comments, but in doing so I added another commit- f2e5f38f09ee40933f752680fe7d75ee8e529fae. I generally attempted to get the grouping / ordering of functions to be consistent, between the declarations & definitions in AddrMan & AddrManImpl. Since (for now) the functions are repeated a few times, having predictable ordering seems valuable for making the file easier for human parsing.

    Thoughts on future work:

    1. Reduce the redundancy of having all interface functions defined 3 times. @mzumsande, @jnewbery, @theuni

    With the recent changes to Check() allowing much more flexibility when and how often to call the checks, maybe it is not necessary anymore and the Check() calls could be merged into the Function_() ?

    Do you think it would be overkill to have a variadic template CheckExecuteCheck() function that implements the pattern of locking cs, running Check(), calling the internal function, then running Check() again?

    Agree with @mzumsande about cleaning up the internal functions as a next step,

    Hehe, I’ve been meaning to play out the variadic template and take a look, it seems very appealing to reduce the boilerplate code. That said, I think incorporating Check() into Function_() would also be an improvement. One of these improvements should come next.

    2. Improving the interface of Select & SelectTriedCollision @theuni

    It would be nice if we had a time type so it’d be obvious what the second member of the pair was in the return values from Select and friends

    I agree, it’s unfortunate that this PR made things slightly worse because addrman’s interface now exposes an int64_t to represent time. I opted for this solution over making it a chrono in an attempt to keep the changes of this PR more mechanical.

    Both of these functions are exclusively used by CConnman::ThreadOpenConnections, and I think the best solution would be to simplify the interfaces. When connman identifies that it is time to open a feeler, I think it should simply ask addrman for the top priority address. The concept of prioritizing tried collisions over a random connection should be internal to addrman.

    Making the code change is not difficult, but we’d have to make the change with care because it would logically change the handling of repeated failures here. So, this is a long winded way of saying, I hope returning a pair is a temporary thing.

    3. Improving doxygen comments of AddrMan functions - there are a few AddrMan interface functions that could benefit from improved comments.

  69. DrahtBot removed the label Needs rebase on Sep 29, 2021
  70. jnewbery commented at 8:36 am on September 29, 2021: member

    ACK 021f86953e8a1dff8ecc768186368d345c865cc2

    The incremental commits make this really easy to review. Thank you!

  71. in src/net.h:800 in 021f86953e
    796@@ -797,7 +797,7 @@ class CConnman
    797         m_onion_binds = connOptions.onion_binds;
    798     }
    799 
    800-    CConnman(uint64_t seed0, uint64_t seed1, CAddrMan& addrman, bool network_active = true);
    801+    CConnman(uint64_t seed0, uint64_t seed1, AddrMan& addrman, bool network_active = true);
    


    rajarshimaitra commented at 1:28 pm on September 29, 2021:

    As we are dropping the C from AddrMan and AddrInfo, is that a convention that will be applied elsewhere in the codebase from now on too?

    My understanding was the C before these structures indicated they are a class. Are we deciding to drop that indication from now on?

    Just trying to understand the rationale of name changes in https://github.com/bitcoin/bitcoin/pull/22950/commits/dd8f7f250095e58bbf4cd4effb481b52143bd1ed and https://github.com/bitcoin/bitcoin/pull/22950/commits/375750387e35ed751d1f5ab48860bdec93977f64


    martinus commented at 1:37 pm on September 29, 2021:
    Since #10461 the developer-notes.md states “Do not prefix class names with C

    michaelfolkson commented at 2:36 pm on September 29, 2021:
    @rajarshimaitra: See https://bitcoin.stackexchange.com/questions/106558/what-do-the-classes-chainstatemanager-cchainstate-cchain-and-blockmanager-do-i or the Core PR review club that links to. New classes shouldn’t have C in front of them but old ones aren’t being renamed.

    rajarshimaitra commented at 3:15 pm on September 29, 2021:

    Oh ok.. Thanks for the clarification. @michaelfolkson Just to understand further, its ok for old ones to drop the C in later PRs if they are being touched right? Just like the case here with AddrMan and AddrInfo?

    So eventually the goal is to remove “C for classes” everywhere?


    sipa commented at 3:17 pm on September 29, 2021:
    New code should follow the style described in the developer notes. If you’re substantially changing a piece of code, it makes sense to also update it to follow that convention. This isn’t just the case for the “C” prefix but also variable naming, bracing, …
  72. rajarshimaitra commented at 1:29 pm on September 29, 2021: contributor

    Concept + Code Review ACK https://github.com/bitcoin/bitcoin/pull/22950/commits/021f86953e8a1dff8ecc768186368d345c865cc2

    To check the effect of compilation firewall, I added a private integer member in CAddrMan of master and AddrManImpl of #22950.

    Verified that the addition causes recompilation of many tests and and other object files in master, where as in case of #22950, only the compilation of relevant test files (that includes addrman_impl header) are triggered..

    Overall the commit changes are very clear and easy to review. Compilation time reduction is significant and very much observable.

    Also thanks for such an elaborate review club description. Helped me to learn a ton about compiler firewall.

    Below is one naming question for my own understanding.

  73. GeneFerneau commented at 7:37 pm on September 29, 2021: none
    utACK 021f869
  74. mzumsande commented at 8:48 pm on October 2, 2021: member

    ACK 021f86953e8a1dff8ecc768186368d345c865cc2

    Reviewed the code and did some sanity checks running this for a while on mainnet with -checkaddrman=1, but no testing beyond that.

  75. in src/addrman.cpp:762 in 14f9e000d0 outdated
    758@@ -759,8 +759,9 @@ void AddrManImpl::GetAddr_(std::vector<CAddress>& vAddr, size_t max_addresses, s
    759 
    760     // gather a list of random nodes, skipping those of low quality
    761     const int64_t now{GetAdjustedTime()};
    762+    std::vector<CAddress> addresses;
    


    theuni commented at 2:11 pm on October 5, 2021:
    Tiny nit for one of the refactor PRs: addresses.reserve(std::min(vRandom.size(), nNodes));
  76. theuni approved
  77. theuni commented at 2:37 pm on October 5, 2021: member

    ACK 021f86953e8a1dff8ecc768186368d345c865cc2

    I had a last-minute panic that some of the unsuspicious moves from CAddrInfo to CAddress could quietly change the behavior of calls like addr.IsValid() if isValid had previously been overridden by CAddrInfo, but fortunately no such overrides existed.

  78. MarcoFalke merged this on Oct 5, 2021
  79. MarcoFalke closed this on Oct 5, 2021

  80. theStack commented at 4:08 pm on October 5, 2021: member
    Post-merge concept and code-review ACK 021f86953e8a1dff8ecc768186368d345c865cc2
  81. sidhujag referenced this in commit 4438e7dc09 on Oct 5, 2021
  82. laanwj removed this from the "Blockers" column in a project

  83. Empact referenced this in commit f396b3d3fd on Apr 22, 2022
  84. Fabcien referenced this in commit 9e405b4d2a on Oct 21, 2022
  85. Fabcien referenced this in commit aa2641ec9a on Oct 21, 2022
  86. Fabcien referenced this in commit 447abdcd54 on Oct 21, 2022
  87. Fabcien referenced this in commit 4413bf85b4 on Oct 21, 2022
  88. Fabcien referenced this in commit 9e47c5912d on Oct 21, 2022
  89. Fabcien referenced this in commit 88cbd67a76 on Oct 21, 2022
  90. Fabcien referenced this in commit a4315672f0 on Oct 21, 2022
  91. Fabcien referenced this in commit 4956ab6420 on Oct 21, 2022
  92. Fabcien referenced this in commit 4e34300aa0 on Oct 21, 2022
  93. Fabcien referenced this in commit e19231be72 on Oct 21, 2022
  94. Fabcien referenced this in commit 10ab03ff92 on Oct 21, 2022
  95. Fabcien referenced this in commit 04cc356dea on Oct 21, 2022
  96. DrahtBot locked this on Oct 30, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-18 15:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me