test: add end-to-end tests for CConnman and PeerManager #26812

pull vasild wants to merge 6 commits into bitcoin:master from vasild:e2e_tests changing 11 files +821 −93
  1. vasild commented at 2:39 pm on January 4, 2023: contributor

    Add fuzz and unit tests that write (fuzzed) data to a mocked socket and inspect what CConnman/PeerManager have written back to the socket, or check the internal state to verify that the behavior is as expected.

    This is now possible, after most of #21878 has been merged - we don’t do any syscalls (e.g. connect(), recv()) from the high level code and using a mocked socket allows testing the entire networking stack without opening actual network connections.

  2. DrahtBot commented at 2:39 pm on January 4, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK Sjors
    Stale ACK jonatack

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    No conflicts as of last run.

  3. DrahtBot added the label Tests on Jan 4, 2023
  4. in src/test/netmsg_tests.cpp:74 in 9b7e9dad27 outdated
    69+    DebugLogHelper debug_log{"getblocks -1 to end"};
    70+    pipes->recv.PushNetMsg(NetMsgType::GETBLOCKS, block_locator, hash_stop);
    71+    debug_log.EndAndThrowIfNotFound(30s);
    72+}
    73+
    74+BOOST_AUTO_TEST_CASE(ping)
    


    vasild commented at 2:41 pm on January 4, 2023:

    dergoegge commented at 9:08 pm on January 4, 2023:
    Key difference in #25515 is that it is testing net processing in isolation (at least isolated from CConnman), which imo is something we definitely should be doing. We already have e2e tests i.e. the functional tests, so I don’t see the benefit to your approach here (besides the c++ tests being faster than the python ones).

    vasild commented at 2:16 pm on January 12, 2023:

    Yes, both testing in isolation and e2e is good to have. The difference between the tests included in this PR and the functional tests is that when the test code runs inside the same process:

    • it has greater control and observability - the test can read any global or member variable and can call arbitrary functions
    • it can do fuzzing

    I think all those tests are complementary, not mutually exclusive.


    jonatack commented at 3:03 pm on April 19, 2023:
    The unit tests are what I run locally over and over, as they run so much more quickly. IIRC there was a project idea a few years ago to convert the functional tests in Python to C++. Having better unit test coverage, or shifting coverage from the Python tests to C++ ones, seems beneficial.
  5. in src/test/fuzz/netmsg.cpp:44 in 9b7e9dad27 outdated
    38+        }                                                  \
    39+    } const static g_##msg_type##count_before_main;        \
    40+    FUZZ_TARGET_INIT(netmsg_##msg_type, initialize_netmsg) \
    41+    {                                                      \
    42+        fuzz_target(buffer, #msg_type);                    \
    43+    }
    


    vasild commented at 2:46 pm on January 4, 2023:
    This is duplicated from src/test/fuzz/process_message.cpp. It should be deduplicated by putting it in a common header file, but I wonder if the new tests in fuzz/netmsg.cpp make the tests from fuzz/process_message.cpp and fuzz/process_messages.cpp obsolete and whether the latter should be removed. @MarcoFalke, as the author of process_message[s], what do you think?

    maflcko commented at 3:12 pm on January 12, 2023:
    Why not de-duplicate by putting it in the existing ./src/test/fuzz/process_message.cpp file?

    vasild commented at 4:03 pm on January 12, 2023:
    You mean to put the entire contents of the newly introduced fuzz/netmsg.cpp inside fuzz/process_message.cpp?

    maflcko commented at 4:07 pm on January 12, 2023:
    yes

    vasild commented at 12:58 pm on January 14, 2023:
    Looks plausible. I will try to do that and see how it will look.

    vasild commented at 1:05 pm on February 20, 2023:
    Done. This also reduced the size of the diff (the last commit). Thanks!

    vasild commented at 10:26 am on June 6, 2023:
    I put the new e2e tests in a separate new file because this code (which I was trying to avoid duplicating) is gone after https://github.com/bitcoin/bitcoin/pull/27766
  6. in src/test/fuzz/netmsg.cpp:58 in 9b7e9dad27 outdated
    52+        MineBlock(g_setup->m_node, CScript() << OP_TRUE);
    53+    }
    54+    SyncWithValidationInterfaceQueue();
    55+}
    56+
    57+static void fuzz_target(FuzzBufferType buffer, const std::string& limit_to_message_type)
    


    dergoegge commented at 9:24 pm on January 4, 2023:

    This target is averaging 2 execs/s on my machine. A good target should average a 1000 execs/s (https://github.com/google/fuzzing/blob/master/docs/good-fuzz-target.md#speed).

    I don’t quite see the upside of this target over the process_message(s) targets. It might be able to find additional bugs if run with ThreadSanitizer but the current performance is gonna stop it from doing so effectively.

    Fuzzing is particularly effective when the targets are kept narrow. For example: fuzzing net, net processing, validation or any of their sub components (e.g. TxRequestTracker or AddrMan) in isolation makes sense but all of it together will be inefficient.


    vasild commented at 2:21 pm on January 12, 2023:

    Pushed an improvement that changes it to about 17 exec/s, it tends to increase if the test is left to run for some time.

    IMO e2e fuzzing has some value.


    dergoegge commented at 5:31 pm on January 12, 2023:

    IMO e2e fuzzing has some value.

    Yes, but I think it should focus on what you are trying to test. If you keep the target to broad the fuzzer is gonna have a hard time finding interesting test cases (especially if the target is slow). One obvious way of focusing is to only fuzz smaller units of code and in the e2e case you probably want to narrow the way you are interpreting the fuzz input so that you are actually testing what you want to test.

    The approach here currently seems to be very similar to the process_message(s) targets (which is why you suggested to replace them iiuc), in that you are interpreting most of the fuzz input as a sequence of arbitrary messages, so the questions that I have are:

    • What do you want to test here? Message processing? connman internals? the interactions between connman/peerman?
    • What bugs is your target capable of finding that process_message(s) (or other targets) are not? In other words which areas/interactions is this fuzzing that are previously not covered?
    • Are the message contents (and their mutations) of all message types useful for what you are trying to test here? e.g. does it matter if a tx message contains a transaction with 1 input or 5? because the fuzzer will likely find that difference interesting but it doesn’t seem like this would add value to what you are trying to test here, since the goal (correct me if I am wrong) isn’t fuzzing validation.
    • Could the input interpretation be narrowed to only include the messages that lead to interactions you are trying to test?

    brunoerg commented at 8:12 pm on July 20, 2023:
    After running it for some time, the maximum exec/s I got was 21 exec/s which is extremely low. It’s not all the targets that reach more than 1000 exec/s here but 21 is too slow.

    vasild commented at 11:58 am on December 2, 2023:
    “resolving” this thread because I removed the fuzz tests from this PR
  7. in src/test/fuzz/netmsg.cpp:101 in 9b7e9dad27 outdated
     96+            BlockTransactionsRequest{.blockhash = chainstate.m_chain.Tip()->GetBlockHash(),
     97+                                     .indexes = {1, UINT16_MAX}});
     98+        if (pipes->send.GetBytes(buf, sizeof(buf)) == 0) {
     99+            break;
    100+        }
    101+        std::this_thread::sleep_for(50ms);
    


    dergoegge commented at 9:27 pm on January 4, 2023:
    I suspect this is the main reason for the poor performance, but the additional threading also doesn’t help.

    vasild commented at 2:22 pm on January 12, 2023:
    It wasn’t but I ditched it anyway in 9b7e9dad27...daee83c1c5.
  8. in src/test/fuzz/netmsg.cpp:96 in 9b7e9dad27 outdated
     97+                                     .indexes = {1, UINT16_MAX}});
     98+        if (pipes->send.GetBytes(buf, sizeof(buf)) == 0) {
     99+            break;
    100+        }
    101+        std::this_thread::sleep_for(50ms);
    102+    }
    


    dergoegge commented at 9:28 pm on January 4, 2023:
    I think this is missing a call to SyncWithValidationInterfaceQueue here.

    vasild commented at 2:24 pm on January 12, 2023:
    Maybe. I am probably missing something. Can you elaborate why it is missing a call to SyncWithValidationInterfaceQueue()?

    maflcko commented at 2:29 pm on January 12, 2023:
    If (in theory) it is possible to push an event on the queue, it should be handled before starting the next input. (Not sure if applicable here)

    vasild commented at 2:47 pm on January 12, 2023:
    Added.
  9. vasild force-pushed on Jan 12, 2023
  10. vasild commented at 2:07 pm on January 12, 2023: contributor
    9b7e9dad27...daee83c1c5: optimize the fuzz test a bit
  11. vasild force-pushed on Jan 12, 2023
  12. DrahtBot added the label Needs rebase on Feb 1, 2023
  13. vasild force-pushed on Feb 9, 2023
  14. vasild commented at 2:00 pm on February 9, 2023: contributor
    32ab679f54...7c591c868d: rebase due to conflicts
  15. DrahtBot removed the label Needs rebase on Feb 9, 2023
  16. vasild force-pushed on Feb 20, 2023
  17. vasild commented at 1:07 pm on February 20, 2023: contributor
    7c591c868d...a81fe4ff9b: rebase and put the fuzz tests in fuzz/process_message.cpp instead of in a new file, as suggested.
  18. jonatack commented at 3:06 pm on April 19, 2023: contributor

    There may be a silent merge conflict. After rebase to current master at fde224a6610699a6a28cc27e299ac14cbf7e16ca, building with Clang 16 fails at commit test: add a mocked Sock that allows inspecting what has been Send() to it.

     0test/util/net.cpp:280:12: error: no viable conversion from returned value of type 'const CNetMessage' to function return type 'std::optional<CNetMessage>'
     1    return msg;
     2           ^~~
     3/opt/homebrew/opt/llvm/bin/../include/c++/v1/optional:736:41: note: candidate constructor not viable: no known conversion from 'const CNetMessage' to 'const optional<CNetMessage> &' for 1st argument
     4    _LIBCPP_INLINE_VISIBILITY constexpr optional(const optional&) = default;
     5                                        ^
     6/opt/homebrew/opt/llvm/bin/../include/c++/v1/optional:737:41: note: candidate constructor not viable: no known conversion from 'const CNetMessage' to 'optional<CNetMessage> &&' for 1st argument
     7    _LIBCPP_INLINE_VISIBILITY constexpr optional(optional&&) = default;
     8                                        ^
     9/opt/homebrew/opt/llvm/bin/../include/c++/v1/optional:738:41: note: candidate constructor not viable: no known conversion from 'const CNetMessage' to 'nullopt_t' for 1st argument
    10    _LIBCPP_INLINE_VISIBILITY constexpr optional(nullopt_t) noexcept {}
    11                                        ^
    12/opt/homebrew/opt/llvm/bin/../include/c++/v1/optional:762:15: note: candidate template ignored: requirement '_CheckOptionalArgsCtor<const CNetMessage &>::__enable_implicit()' was not satisfied [with _Up = const CNetMessage &]
    13    constexpr optional(_Up&& __v)
    14              ^
    15/opt/homebrew/opt/llvm/bin/../include/c++/v1/optional:777:35: note: candidate template ignored: could not match 'const optional<_Up>' against 'const CNetMessage'
    16    _LIBCPP_CONSTEXPR_SINCE_CXX20 optional(const optional<_Up>& __v)
    17                                  ^
    18/opt/homebrew/opt/llvm/bin/../include/c++/v1/optional:795:35: note: candidate template ignored: could not match 'optional<_Up>' against 'const CNetMessage'
    19    _LIBCPP_CONSTEXPR_SINCE_CXX20 optional(optional<_Up>&& __v)
    20                                  ^
    21/opt/homebrew/opt/llvm/bin/../include/c++/v1/optional:748:24: note: explicit constructor is not a candidate
    22    constexpr explicit optional(_InPlaceT, _Args&&... __args)
    23                       ^
    24/opt/homebrew/opt/llvm/bin/../include/c++/v1/optional:769:24: note: explicit constructor is not a candidate
    25    constexpr explicit optional(_Up&& __v)
    26                       ^
    27/opt/homebrew/opt/llvm/bin/../include/c++/v1/optional:785:44: note: explicit constructor is not a candidate
    28    _LIBCPP_CONSTEXPR_SINCE_CXX20 explicit optional(const optional<_Up>& __v)
    29                                           ^
    30/opt/homebrew/opt/llvm/bin/../include/c++/v1/optional:803:44: note: explicit constructor is not a candidate
    31    _LIBCPP_CONSTEXPR_SINCE_CXX20 explicit optional(optional<_Up>&& __v)
    32                                           ^
    331 error generated.
    34make[2]: *** [test/util/libtest_util_a-net.o] Error 1
    35make[2]: *** Waiting for unfinished jobs....
    36make[1]: *** [all-recursive] Error 1
    37make: *** [all-recursive] Error 1
    
    0$ clang --version
    1Homebrew clang version 16.0.1
    2Target: arm64-apple-darwin22.4.0
    3Thread model: posix
    
  19. dergoegge commented at 3:12 pm on April 19, 2023: member

    There may be a silent merge conflict.

    Probably because of #27324, copying CNetMessages is no longer possible.

  20. DrahtBot added the label Needs rebase on Apr 26, 2023
  21. vasild force-pushed on May 12, 2023
  22. vasild commented at 11:34 am on May 12, 2023: contributor
    a81fe4ff9b...f559067e27: rebase due to conflicts, thanks for looking into this!
  23. DrahtBot removed the label Needs rebase on May 12, 2023
  24. DrahtBot added the label Needs rebase on May 24, 2023
  25. vasild force-pushed on Jun 6, 2023
  26. vasild commented at 10:23 am on June 6, 2023: contributor
    f559067e27...4c867de996: rebase due to conflicts
  27. in src/test/fuzz/process_message_e2e.cpp:84 in 4c867de996 outdated
    79+    LIMITED_WHILE(fuzzed_data_provider.remaining_bytes() > 0, 100) {
    80+
    81+        const std::string message_type{
    82+            LIMIT_TO_MESSAGE_TYPE.empty() ?
    83+                fuzzed_data_provider.ConsumeBytesAsString(CMessageHeader::COMMAND_SIZE) :
    84+                LIMIT_TO_MESSAGE_TYPE};
    


    maflcko commented at 10:53 am on June 6, 2023:
    Changing the fuzz input format with a run-time setting seems fragile and not intuitive when one wants to re-use fuzz inputs or reproduce runs.

    vasild commented at 12:31 pm on June 6, 2023:

    Are you suggesting to replace it with this:

    0        const std::string random_message_type{fuzzed_data_provider.ConsumeBytesAsString(CMessageHeader::COMMAND_SIZE)};
    1        if (!LIMIT_TO_MESSAGE_TYPE.empty() && random_message_type != LIMIT_TO_MESSAGE_TYPE) { 
    2            continue;
    3        } 
    

    vasild commented at 7:50 am on June 8, 2023:
    I would address your suggestion, but I am not sure I understand it. Is the above what you meant or did I get it wrong?

    maflcko commented at 8:24 am on June 8, 2023:
    Yeah, I think you can just copy paste it from the other test, no?

    vasild commented at 9:22 am on June 8, 2023:

    Done. Now it is the same as in process_message.cpp.

    It seems inefficient to me - if LIMIT_TO_MESSAGE_TYPE is set to e.g. filterclear the fuzzer has to brute force all possible strings with length 11 to find it? That is 288 possibilities.


    maflcko commented at 9:29 am on June 8, 2023:
    A modern fuzz engine will read the LIMIT_TO_MESSAGE_TYPE and inject it into the fuzz input, so it shouldn’t take more than a few seconds to guess. In any case it shouldn’t take 2^88 tries

    vasild commented at 3:43 pm on June 26, 2023:
    Marking this as resolved, let me know if you think it is not.
  28. in src/test/fuzz/process_message_e2e.cpp:86 in 4c867de996 outdated
    81+        const std::string message_type{
    82+            LIMIT_TO_MESSAGE_TYPE.empty() ?
    83+                fuzzed_data_provider.ConsumeBytesAsString(CMessageHeader::COMMAND_SIZE) :
    84+                LIMIT_TO_MESSAGE_TYPE};
    85+
    86+        CDataStream payload{
    


    maflcko commented at 10:54 am on June 6, 2023:
    Could this use DataStream, or drop it completely?

    vasild commented at 12:36 pm on June 6, 2023:
    Good point, dropped it!
  29. DrahtBot removed the label Needs rebase on Jun 6, 2023
  30. vasild force-pushed on Jun 8, 2023
  31. vasild commented at 7:48 am on June 8, 2023: contributor
    4c867de996...35fc849412: address a suggestion (remove usage of CDataStream)
  32. DrahtBot added the label CI failed on Jun 8, 2023
  33. vasild force-pushed on Jun 8, 2023
  34. vasild commented at 9:12 am on June 8, 2023: contributor

    35fc849412...bedbdf4a15: address #26812 (review)

    bedbdf4a15...b96bd52f3a: ditto, but move the check to the start of the test

  35. vasild force-pushed on Jun 8, 2023
  36. vasild force-pushed on Jun 26, 2023
  37. vasild commented at 3:48 pm on June 26, 2023: contributor
    b96bd52f3a...4557cc336f: rebase
  38. DrahtBot removed the label CI failed on Jun 26, 2023
  39. in src/test/util/setup_common.h:16 in 4557cc336f outdated
    12@@ -13,6 +13,7 @@
    13 #include <pubkey.h>
    14 #include <random.h>
    15 #include <stdexcept>
    16+#include <test/util/net.h>
    


    jonatack commented at 10:38 pm on June 27, 2023:
    Rather than including everything, it might be good to have a separate compilation unit for code that needs to be in util/setup_common, versus the current util/net code that is only needed for a handful of the tests.

    vasild commented at 7:21 am on June 28, 2023:

    I tried extracting the mocked sockets out from test/util/net.h into a separate test/util/mocked_sockets.h and then compared the re-compilation time if that is modified (e.g. something in DynSock declaration is changed). In both cases recompilation takes ~28 seconds, so it makes no difference (in practice).

    Here is the change (on top of this PR):

      0commit 398a2d5717b6ef6760e48eac92aa92127157a67a (HEAD -> e2e_tests)
      1Parent: 4557cc336fd8eb321b0db024b70213f46017071c
      2Author:     Vasil Dimov <vd@FreeBSD.org>
      3AuthorDate: Wed Jun 28 09:10:41 2023 +0200
      4Commit:     Vasil Dimov <vd@FreeBSD.org>
      5CommitDate: Wed Jun 28 09:10:41 2023 +0200
      6gpg: Signature made Wed Jun 28 09:11:03 2023 CEST
      7gpg:                using RSA key E64D8D45614DB07545D9CCC154DF06F64B55CBBF
      8gpg: Good signature from "Vasil Dimov <vd@myforest.net>" [ultimate]
      9gpg:                 aka "Vasil Dimov <vd@FreeBSD.org>" [ultimate]
     10gpg:                 aka "Vasil Dimov <vasild@gmail.com>" [ultimate]
     11
     12
     13    wip: mocked_sockets.h
     14
     15diff --git a/src/test/util/net.h b/src/test/util/mocked_sockets.h
     16similarity index 67%
     17copy from src/test/util/net.h
     18copy to src/test/util/mocked_sockets.h
     19index b5ab7699f0..cdf6cb5d96 100644
     20--- a/src/test/util/net.h
     21+++ b/src/test/util/mocked_sockets.h
     22@@ -1,111 +1,27 @@
     23-// Copyright (c) 2020-2022 The Bitcoin Core developers
     24+// Copyright (c) 2023-2023 The Bitcoin Core developers
     25 // Distributed under the MIT software license, see the accompanying
     26 // file COPYING or http://www.opensource.org/licenses/mit-license.php.
     27 
     28-#ifndef BITCOIN_TEST_UTIL_NET_H
     29-#define BITCOIN_TEST_UTIL_NET_H
     30+#ifndef BITCOIN_TEST_UTIL_MOCKED_SOCKETS_H
     31+#define BITCOIN_TEST_UTIL_MOCKED_SOCKETS_H
     32 
     33 #include <compat/compat.h>
     34 #include <net.h>
     35-#include <netaddress.h>
     36 #include <netmessagemaker.h>
     37-#include <node/eviction.h>
     38-#include <util/sock.h>
     39 #include <sync.h>
     40+#include <util/sock.h>
     41+#include <version.h>
     42 
     43-#include <array>
     44-#include <cassert>
     45 #include <chrono>
     46-#include <condition_variable>
     47-#include <cstring>
     48+#include <cstdint>
     49 #include <memory>
     50 #include <optional>
     51 #include <string>
     52 #include <vector>
     53 
     54-struct ConnmanTestMsg : public CConnman {
     55-    using CConnman::CConnman;
     56-
     57-    void SetPeerConnectTimeout(std::chrono::seconds timeout)
     58-    {
     59-        m_peer_connect_timeout = timeout;
     60-    }
     61-
     62-    void AddTestNode(CNode& node)
     63-    {
     64-        LOCK(m_nodes_mutex);
     65-        m_nodes.push_back(&node);
     66-    }
     67-    void ClearTestNodes()
     68-    {
     69-        LOCK(m_nodes_mutex);
     70-        for (CNode* node : m_nodes) {
     71-            delete node;
     72-        }
     73-        m_nodes.clear();
     74-    }
     75-
     76-    void Handshake(CNode& node,
     77-                   bool successfully_connected,
     78-                   ServiceFlags remote_services,
     79-                   ServiceFlags local_services,
     80-                   int32_t version,
     81-                   bool relay_txs)
     82-        EXCLUSIVE_LOCKS_REQUIRED(NetEventsInterface::g_msgproc_mutex);
     83-
     84-    void ProcessMessagesOnce(CNode& node) EXCLUSIVE_LOCKS_REQUIRED(NetEventsInterface::g_msgproc_mutex) { m_msgproc->ProcessMessages(&node, flagInterruptMsgProc); }
     85-
     86-    void NodeReceiveMsgBytes(CNode& node, Span<const uint8_t> msg_bytes, bool& complete) const;
     87-
     88-    bool ReceiveMsgFrom(CNode& node, CSerializedNetMsg& ser_msg) const;
     89-};
     90-
     91-constexpr ServiceFlags ALL_SERVICE_FLAGS[]{
     92-    NODE_NONE,
     93-    NODE_NETWORK,
     94-    NODE_BLOOM,
     95-    NODE_WITNESS,
     96-    NODE_COMPACT_FILTERS,
     97-    NODE_NETWORK_LIMITED,
     98-};
     99-
    100-constexpr NetPermissionFlags ALL_NET_PERMISSION_FLAGS[]{
    101-    NetPermissionFlags::None,
    102-    NetPermissionFlags::BloomFilter,
    103-    NetPermissionFlags::Relay,
    104-    NetPermissionFlags::ForceRelay,
    105-    NetPermissionFlags::NoBan,
    106-    NetPermissionFlags::Mempool,
    107-    NetPermissionFlags::Addr,
    108-    NetPermissionFlags::Download,
    109-    NetPermissionFlags::Implicit,
    110-    NetPermissionFlags::All,
    111-};
    112-
    113-constexpr ConnectionType ALL_CONNECTION_TYPES[]{
    114-    ConnectionType::INBOUND,
    115-    ConnectionType::OUTBOUND_FULL_RELAY,
    116-    ConnectionType::MANUAL,
    117-    ConnectionType::FEELER,
    118-    ConnectionType::BLOCK_RELAY,
    119-    ConnectionType::ADDR_FETCH,
    120-};
    121-
    122-constexpr auto ALL_NETWORKS = std::array{
    123-    Network::NET_UNROUTABLE,
    124-    Network::NET_IPV4,
    125-    Network::NET_IPV6,
    126-    Network::NET_ONION,
    127-    Network::NET_I2P,
    128-    Network::NET_CJDNS,
    129-    Network::NET_INTERNAL,
    130-};
    131-
    132-std::vector<NodeEvictionCandidate> GetRandomNodeEvictionCandidates(int n_candidates, FastRandomContext& random_context);
    133-
    134 /**
    135  * A mocked Sock alternative that succeeds on all operations.
    136  * Returns infinite amount of 0x0 bytes on reads.
    137  */
    138 class ZeroSock : public Sock
    139 {
    140@@ -257,7 +173,8 @@ void DynSock::Pipe::PushNetMsg(const std::string& type, Args&&... payload)
    141     LOCK(m_mutex);
    142     m_data.insert(m_data.end(), header.begin(), header.end());
    143     m_data.insert(m_data.end(), msg.data.begin(), msg.data.end());
    144     m_cond.notify_all();
    145 }
    146 
    147-#endif // BITCOIN_TEST_UTIL_NET_H
    148+#endif // BITCOIN_TEST_UTIL_MOCKED_SOCKETS_H
    149+
    150diff --git a/src/test/util/net.cpp b/src/test/util/net.cpp
    151index ac866d7c89..1acdbf7061 100644
    152--- a/src/test/util/net.cpp
    153+++ b/src/test/util/net.cpp
    154@@ -8,12 +8,13 @@
    155 #include <node/eviction.h>
    156 #include <net.h>
    157 #include <net_processing.h>
    158 #include <netmessagemaker.h>
    159 #include <span.h>
    160 #include <sync.h>
    161+#include <test/util/mocked_sockets.h>
    162 
    163 #include <chrono>
    164 #include <optional>
    165 #include <vector>
    166 
    167 void ConnmanTestMsg::Handshake(CNode& node,
    168diff --git a/src/test/util/net.h b/src/test/util/net.h
    169index b5ab7699f0..7bad1ee4af 100644
    170--- a/src/test/util/net.h
    171+++ b/src/test/util/net.h
    172@@ -100,164 +100,7 @@ constexpr auto ALL_NETWORKS = std::array{
    173     Network::NET_CJDNS,
    174     Network::NET_INTERNAL,
    175 };
    176 
    177 std::vector<NodeEvictionCandidate> GetRandomNodeEvictionCandidates(int n_candidates, FastRandomContext& random_context);
    178 
    179-/**
    180- * A mocked Sock alternative that succeeds on all operations.
    181- * Returns infinite amount of 0x0 bytes on reads.
    182- */
    183-class ZeroSock : public Sock
    184-{
    185-public:
    186-    ZeroSock();
    187-
    188-    ~ZeroSock() override;
    189-
    190-    ssize_t Send(const void*, size_t len, int) const override;
    191-
    192-    ssize_t Recv(void* buf, size_t len, int flags) const override;
    193-
    194-    int Connect(const sockaddr*, socklen_t) const override;
    195-
    196-    int Bind(const sockaddr*, socklen_t) const override;
    197-
    198-    int Listen(int) const override;
    199-
    200-    std::unique_ptr<Sock> Accept(sockaddr* addr, socklen_t* addr_len) const override;
    201-
    202-    int GetSockOpt(int level, int opt_name, void* opt_val, socklen_t* opt_len) const override;
    203-
    204-    int SetSockOpt(int, int, const void*, socklen_t) const override;
    205-
    206-    int GetSockName(sockaddr* name, socklen_t* name_len) const override;
    207-
    208-    bool SetNonBlocking() const override;
    209-
    210-    bool IsSelectable() const override;
    211-
    212-    bool Wait(std::chrono::milliseconds timeout,
    213-              Event requested,
    214-              Event* occurred = nullptr) const override;
    215-
    216-    bool WaitMany(std::chrono::milliseconds timeout, EventsPerSock& events_per_sock) const override;
    217-
    218-private:
    219-    ZeroSock& operator=(Sock&& other) override;
    220-};
    221-
    222-/**
    223- * A mocked Sock alternative that returns a statically contained data upon read and succeeds
    224- * and ignores all writes. The data to be returned is given to the constructor and when it is
    225- * exhausted an EOF is returned by further reads.
    226- */
    227-class StaticContentsSock : public ZeroSock
    228-{
    229-public:
    230-    explicit StaticContentsSock(const std::string& contents);
    231-
    232-    /**
    233-     * Return parts of the contents that was provided at construction until it is exhausted
    234-     * and then return 0 (EOF).
    235-     */
    236-    ssize_t Recv(void* buf, size_t len, int flags) const override;
    237-
    238-private:
    239-    StaticContentsSock& operator=(Sock&& other) override;
    240-
    241-    const std::string m_contents;
    242-    mutable size_t m_consumed{0};
    243-};
    244-
    245-/**
    246- * A mocked Sock alternative that allows providing the data to be returned by Recv()
    247- * and inspecting the data that has been supplied to Send().
    248- */
    249-class DynSock : public ZeroSock
    250-{
    251-public:
    252-    /**
    253-     * Unidirectional bytes or CNetMessage queue (FIFO).
    254-     */
    255-    class Pipe
    256-    {
    257-    public:
    258-        /**
    259-         * Get bytes and remove them from the pipe.
    260-         * [@param](/bitcoin-bitcoin/contributor/param/)[in] buf Destination to write bytes to.
    261-         * [@param](/bitcoin-bitcoin/contributor/param/)[in] len Write up to this number of bytes.
    262-         * [@param](/bitcoin-bitcoin/contributor/param/)[in] flags Same as the flags of `recv(2)`. Just `MSG_PEEK` is honored.
    263-         * [@return](/bitcoin-bitcoin/contributor/return/) The number of bytes written to `buf`. `0` if `Eof()` has been called.
    264-         * If no bytes are available then `-1` is returned and `errno` is set to `EAGAIN`.
    265-         */
    266-        ssize_t GetBytes(void* buf, size_t len, int flags = 0) EXCLUSIVE_LOCKS_REQUIRED(!m_mutex);
    267-
    268-        /**
    269-         * Deserialize a `CNetMessage` and remove it from the pipe.
    270-         * If not enough bytes are available then the function will wait. If parsing fails
    271-         * or EOF is signaled to the pipe, then `std::nullopt` is returned.
    272-         */
    273-        std::optional<CNetMessage> GetNetMsg() EXCLUSIVE_LOCKS_REQUIRED(!m_mutex);
    274-
    275-        /**
    276-         * Push bytes to the pipe.
    277-         */
    278-        void PushBytes(const void* buf, size_t len) EXCLUSIVE_LOCKS_REQUIRED(!m_mutex);
    279-
    280-        /**
    281-         * Construct and push CNetMessage to the pipe.
    282-         */
    283-        template <typename... Args>
    284-        void PushNetMsg(const std::string& type, Args&&... payload) EXCLUSIVE_LOCKS_REQUIRED(!m_mutex);
    285-
    286-        /**
    287-         * Signal end-of-file on the receiving end (`GetBytes()` or `GetNetMsg()`).
    288-         */
    289-        void Eof() EXCLUSIVE_LOCKS_REQUIRED(!m_mutex);
    290-
    291-    private:
    292-        Mutex m_mutex;
    293-        std::condition_variable m_cond;
    294-        std::vector<uint8_t> m_data GUARDED_BY(m_mutex);
    295-        bool m_eof GUARDED_BY(m_mutex){false};
    296-    };
    297-
    298-    struct Pipes {
    299-        Pipe recv;
    300-        Pipe send;
    301-    };
    302-
    303-    explicit DynSock(std::shared_ptr<Pipes> pipes);
    304-
    305-    ~DynSock();
    306-
    307-    ssize_t Recv(void* buf, size_t len, int flags) const override;
    308-
    309-    ssize_t Send(const void* buf, size_t len, int) const override;
    310-
    311-    bool Wait(std::chrono::milliseconds timeout,
    312-              Event requested,
    313-              Event* occurred = nullptr) const override;
    314-
    315-    bool WaitMany(std::chrono::milliseconds timeout, EventsPerSock& events_per_sock) const override;
    316-
    317-private:
    318-    DynSock& operator=(Sock&&) override;
    319-
    320-    std::shared_ptr<Pipes> m_pipes;
    321-};
    322-
    323-template <typename... Args>
    324-void DynSock::Pipe::PushNetMsg(const std::string& type, Args&&... payload)
    325-{
    326-    auto msg = CNetMsgMaker(INIT_PROTO_VERSION).Make(type, std::forward<Args>(payload)...);
    327-    std::vector<unsigned char> header;
    328-    V1TransportSerializer().prepareForTransport(msg, header);
    329-
    330-    LOCK(m_mutex);
    331-    m_data.insert(m_data.end(), header.begin(), header.end());
    332-    m_data.insert(m_data.end(), msg.data.begin(), msg.data.end());
    333-    m_cond.notify_all();
    334-}
    335-
    336 #endif // BITCOIN_TEST_UTIL_NET_H
    337diff --git a/src/test/util/setup_common.h b/src/test/util/setup_common.h
    338index d3137fa69e..1daac4a57c 100644
    339--- a/src/test/util/setup_common.h
    340+++ b/src/test/util/setup_common.h
    341@@ -10,13 +10,13 @@
    342 #include <node/caches.h>
    343 #include <node/context.h> // IWYU pragma: export
    344 #include <primitives/transaction.h>
    345 #include <pubkey.h>
    346 #include <random.h>
    347 #include <stdexcept>
    348-#include <test/util/net.h>
    349+#include <test/util/mocked_sockets.h>
    350 #include <util/chaintype.h>
    351 #include <util/check.h>
    352 #include <util/fs.h>
    353 #include <util/string.h>
    354 #include <util/vector.h>
    355 
    

    Or did you mean something else?


    maflcko commented at 7:47 am on June 28, 2023:

    (No opinion on the change), just

    In both cases recompilation takes ~28 seconds

    Seems odd that re-compilation of all tests takes only 28 seconds when mocked_sockets.h is modified, which is included in setup_common.h, which is included in all test files, no?


    vasild commented at 9:07 am on June 28, 2023:

    Compiling everything from scratch (after git clean -fdx) takes about 1m 30sec (including autogen and ./configure which are single-threaded). The CPU on the machine is:

    AMD Ryzen 9 7950X 16-Core Processor (4491.70-MHz K8-class CPU)


    maflcko commented at 9:20 am on June 28, 2023:
    For a useful benchmark, you’ll have to measure the total CPU time, not the CPU time of just the translation unit that takes the longest.

    vasild commented at 9:46 am on June 28, 2023:
     0PR:
     1real    0m29.623s
     2user    10m58.337s
     3sys     0m51.187s
     4
     5real    0m29.829s
     6user    10m58.359s
     7sys     0m50.912s
     8
     9mocked_sockets.h:
    10real    0m29.528s
    11user    11m12.208s
    12sys     1m0.406s
    13
    14real    0m28.831s
    15user    10m45.615s
    16sys     0m49.680s
    

    I guess the “total CPU time” is real+sys from above. It all seems within noise. In both cases all tests would be recompiled, maybe I got @jonatack’s comment wrong?


    maflcko commented at 9:54 am on June 28, 2023:

    I guess the “total CPU time” is real+sys from above

    user+sys, but thanks for re-checking.


    jonatack commented at 3:42 pm on June 28, 2023:
    Thanks for looking into it. I like the idea of keeping them separate and avoiding including unneeded code, in general and for building with slower, older CPUs. So 👍 for your diff in #26812 (review). Great work so far AFAICT. Will continue reviewing the remaining commits.

    jonatack commented at 3:45 pm on June 28, 2023:

    In both cases all tests would be recompiled

    Yes, but future changes to util/net would not require compiling all the tests.


    vasild commented at 1:07 pm on June 29, 2023:

    Alright, so I got it the other way around. So the point is to avoid recompiling all tests if e.g. ALL_NETWORKS is changed (defined in test/util/net.h). That lowers the re-compilation from 28-29 sec to about 10 sec (if src/test/util/net.h is modified).

    I will leave it as it is because moving the code around would further increase the size of this PR which suffers from lack of interest from reviewers and I think expanding it may further turn people away.

  40. in src/test/banman_tests.cpp:33 in 4557cc336f outdated
    37-            banman.GetBanned(entries_read);
    38-            BOOST_CHECK_EQUAL(entries_read.size(), 1);
    39-        }
    40+        // The invalid entries will be dropped, but the valid one remains
    41+        DebugLogHelper debug_log1{"Dropping entry with unparseable address or subnet (aaaaaaaaa) from ban list"};
    42+        DebugLogHelper debug_log2{"Dropping entry with unknown version (2) from ban list"};
    


    jonatack commented at 10:56 pm on June 27, 2023:

    d2f46c7 Verified the behavior improvement in this commit by changing this line to make it fail.

     0$ ./src/test/test_bitcoin -t banman_tests       
     1Running 1 test case...
     2libc++abi: terminating due to uncaught exception of type std::runtime_error: 'Dropping entry with unknown version (2) rom ban list' not found in debug log
     3
     4******** errors disabling the alternate stack:
     5	#error:22
     6	Invalid argument
     7unknown location:0: fatal error: in "banman_tests/file": signal: SIGABRT (application abort requested)
     8test/banman_tests.cpp:38: last checkpoint
     9
    10*** 1 failure is detected in the test module "Bitcoin Core Test Suite"
    
    0$ ./src/test/test_bitcoin -t banman_tests
    1Running 1 test case...
    2unknown location:0: fatal error: in "banman_tests/file": std::runtime_error: 'Dropping entry with unknown version (2) rom ban list' not found in debug log
    3
    4test/banman_tests.cpp:37: last checkpoint
    5
    6*** 1 failure is detected in the test module "Bitcoin Core Test Suite"
    
  41. jonatack commented at 10:59 pm on June 27, 2023: contributor
  42. in src/Makefile.test.include:116 in 4557cc336f outdated
    112@@ -113,6 +113,7 @@ BITCOIN_TESTS =\
    113   test/net_peer_eviction_tests.cpp \
    114   test/net_tests.cpp \
    115   test/netbase_tests.cpp \
    116+  test/netmsg_tests.cpp \
    


    jonatack commented at 6:15 pm on June 28, 2023:
    37cd2f5 naming: maybe call the test file net_msg_tests.cpp (and move this to line 113)

    vasild commented at 1:07 pm on June 29, 2023:
    Done.
  43. in src/test/netmsg_tests.cpp:32 in 4557cc336f outdated
    27+    // Wait for all messages due to the initial handshake to be Send() to the socket.
    28+    // The FEEFILTER is the last one, so quit when we get that.
    29+    for (;;) {
    30+        auto msg = pipes->send.GetNetMsg();
    31+        if (msg.has_value()) {
    32+            count_sent_messages[msg->m_type]++;
    


    jonatack commented at 6:19 pm on June 28, 2023:

    37cd2f5

     0@@ -22,20 +22,15 @@ BOOST_FIXTURE_TEST_SUITE(netmsg_tests, NetTestingSetup)
     1 BOOST_AUTO_TEST_CASE(initial_messages_exchange)
     2 {
     3     std::unordered_map<std::string, size_t> count_sent_messages;
     4-    auto pipes = m_sockets_pipes.PopFront();
     5+    const auto pipes{m_sockets_pipes.PopFront()};
     6 
     7     // Wait for all messages due to the initial handshake to be Send() to the socket.
     8     // The FEEFILTER is the last one, so quit when we get that.
     9     for (;;) {
    10         auto msg = pipes->send.GetNetMsg();
    11-        if (msg.has_value()) {
    12-            count_sent_messages[msg->m_type]++;
    13-            if (msg->m_type == NetMsgType::FEEFILTER) {
    14-                break;
    15-            }
    16-        } else {
    17-            break;
    18-        }
    19+        if (!msg.has_value()) break;
    20+        ++count_sent_messages[msg->m_type];
    21+        if (msg->m_type == NetMsgType::FEEFILTER) break;
    22     }
    23 
    24     BOOST_CHECK_EQUAL(count_sent_messages[NetMsgType::VERSION], 1);
    25@@ -51,7 +46,7 @@ BOOST_AUTO_TEST_CASE(initial_messages_exchange)
    26 
    27 BOOST_AUTO_TEST_CASE(addr)
    28 {
    29-    auto pipes = m_sockets_pipes.PopFront();
    30+    const auto pipes{m_sockets_pipes.PopFront()};
    31     std::vector<CAddress> addresses{5};
    32 
    33     DebugLogHelper debug_log{strprintf("Received addr: %u addresses", addresses.size())};
    34@@ -61,7 +56,7 @@ BOOST_AUTO_TEST_CASE(addr)
    35 
    36 BOOST_AUTO_TEST_CASE(getblocks)
    37 {
    38-    auto pipes = m_sockets_pipes.PopFront();
    39+    const auto pipes{m_sockets_pipes.PopFront()};
    40     std::vector<uint256> hashes{5};
    41     CBlockLocator block_locator{std::move(hashes)};
    42     uint256 hash_stop;
    43@@ -73,7 +68,7 @@ BOOST_AUTO_TEST_CASE(getblocks)
    44 
    45 BOOST_AUTO_TEST_CASE(ping)
    46 {
    47-    auto pipes = m_sockets_pipes.PopFront();
    48+    const auto pipes{m_sockets_pipes.PopFront()};
    49 
    50     auto WaitForPingStats = [this](std::chrono::microseconds min,
    51                                    std::chrono::microseconds last,
    52@@ -173,7 +168,7 @@ BOOST_AUTO_TEST_CASE(ping)
    53 
    54 BOOST_AUTO_TEST_CASE(redundant_verack)
    55 {
    56-    auto pipes = m_sockets_pipes.PopFront();
    57+    const auto pipes{m_sockets_pipes.PopFront()};
    

    vasild commented at 1:08 pm on June 29, 2023:
    Done.
  44. in src/test/fuzz/process_message_e2e.cpp:31 in 4557cc336f outdated
    32+#include <cstdlib>
    33+#include <iostream>
    34+#include <memory>
    35+#include <string>
    36+#include <string_view>
    37+#include <vector>
    


    jonatack commented at 7:01 pm on June 28, 2023:

    vasild commented at 1:13 pm on June 29, 2023:

    Done, except this:

     0test/net_msg_tests.cpp should add these lines:
     1#include <boost/preprocessor/arithmetic/limits/dec_256.hpp>
     2#include <boost/preprocessor/comparison/limits/not_equal_256.hpp>
     3#include <boost/preprocessor/control/expr_iif.hpp>
     4#include <boost/preprocessor/control/iif.hpp>
     5#include <boost/preprocessor/detail/limits/auto_rec_256.hpp>
     6#include <boost/preprocessor/logical/compl.hpp>
     7#include <boost/preprocessor/logical/limits/bool_256.hpp>
     8#include <boost/preprocessor/repetition/detail/limits/for_256.hpp>
     9#include <boost/preprocessor/repetition/for.hpp>
    10#include <boost/preprocessor/seq/limits/elem_256.hpp>
    11#include <boost/preprocessor/seq/limits/size_256.hpp>
    12#include <boost/preprocessor/tuple/elem.hpp>
    13#include <boost/preprocessor/variadic/limits/elem_64.hpp>
    14#include <boost/test/tools/old/interface.hpp>
    15#include <boost/test/unit_test_log.hpp>
    16#include <boost/test/unit_test_suite.hpp>
    17#include <boost/test/utils/basic_cstring/basic_cstring.hpp>
    18#include <boost/test/utils/lazy_ostream.hpp>
    19
    20test/net_msg_tests.cpp should remove these lines:
    21- #include <boost/test/unit_test.hpp>  // lines 18-18
    

    which seems strange.

  45. in src/test/fuzz/process_message_e2e.cpp:60 in 4557cc336f outdated
    55+
    56+    static const auto testing_setup = MakeNoLogFileContext<NetTestingSetup>(
    57+            /*chain_type=*/ChainType::REGTEST,
    58+            /*extra_args=*/{"-txreconciliation"});
    59+    g_setup = testing_setup.get();
    60+    for (int i = 0; i < 2 * COINBASE_MATURITY; i++) {
    


    jonatack commented at 7:01 pm on June 28, 2023:

    https://github.com/bitcoin/bitcoin/commit/4557cc336fd8eb321b0db024b70213f46017071c

     0@@ -57,7 +57,7 @@ void initialize_process_message_e2e()
     1             /*chain_type=*/ChainType::REGTEST,
     2             /*extra_args=*/{"-txreconciliation"});
     3     g_setup = testing_setup.get();
     4-    for (int i = 0; i < 2 * COINBASE_MATURITY; i++) {
     5+    for (int i = 0; i < 2 * COINBASE_MATURITY; ++i) {
     6         MineBlock(g_setup->m_node, CScript() << OP_TRUE);
     7     }
     8     SyncWithValidationInterfaceQueue();
     9@@ -75,9 +75,9 @@ FUZZ_TARGET_INIT(process_message_e2e, initialize_process_message_e2e)
    10     SetMockTime(GetMockTime() + std::chrono::minutes{fuzzed_data_provider.ConsumeIntegral<uint8_t>()});
    11 
    12     g_setup->m_node.connman->OpenNetworkConnection(
    13-        CAddress{}, false, nullptr, "1.2.3.4:8333", ConnectionType::OUTBOUND_FULL_RELAY);
    14+        CAddress{}, /*fCountFailure=*/false, /*grantOutbound=*/nullptr, "1.2.3.4:8333", ConnectionType::OUTBOUND_FULL_RELAY);
    15 
    16-    auto pipes = g_setup->m_sockets_pipes.PopFront();
    17+    const auto pipes{g_setup->m_sockets_pipes.PopFront()};
    18 
    19     uint8_t buf[1024];
    

    vasild commented at 1:12 pm on June 29, 2023:

    Done.

    I changed it to use {} as suggested but I don’t see a point to use {} instead of = for initialization when auto is used. I find the {} variant a little bit less readable but see the merit to use it when Type1 x{expresson_of_Type2}; is used to detect incompatibilities between Type1 and Type2. But why do that for auto x{expressoin_of_any_type};?


    jonatack commented at 7:31 pm on June 29, 2023:
    With C++11, my understanding was to avoid auto with uniform (braced) initialization because in some cases it could have unexpected results. Since C++17, that was mostly fixed and the CPP Guideline is to prefer the {}-initializer syntax as its rules are simpler, more general, less ambiguous, and safer. See also https://ianyepan.github.io/posts/cpp-uniform-initialization/. Though it didn’t matter in this case, IIUC, so I didn’t mention it explicitly.

    vasild commented at 3:34 pm on July 19, 2023:

    I am still not convinced about auto and brace-initialization.

    https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#es23-prefer-the--initializer-syntax mentions:

    Use = only when you are sure that there can be no narrowing conversions For built-in arithmetic types, use = only with auto

    There will not be narrowing conversions with auto x = expression;

    One of the examples down there uses auto p = ....

  46. jonatack commented at 7:13 pm on June 28, 2023: contributor

    ACK 4557cc336fd8eb321b0db024b70213f46017071c

    Happy to re-review for #26812 (review) and any of the suggestions below.

  47. vasild force-pushed on Jun 29, 2023
  48. vasild commented at 1:00 pm on June 29, 2023: contributor

    4557cc336f...353d323356: take some suggestions

    Invalidates ACK from @jonatack

    353d323356...612ba17fca: proper rename of the test netmsg -> net_msg

  49. vasild force-pushed on Jun 29, 2023
  50. DrahtBot added the label CI failed on Jun 29, 2023
  51. DrahtBot removed the label CI failed on Jun 29, 2023
  52. jonatack commented at 7:03 pm on June 29, 2023: contributor
    ACK 612ba17fcadc955c142f21f9b1af0b60f0def55f
  53. DrahtBot added the label CI failed on Jul 13, 2023
  54. in src/test/util/logging.cpp:34 in d2f46c7055 outdated
    34+
    35+void DebugLogHelper::EndAndThrowIfNotFound()
    36+{
    37+    StopReceivingLog();
    38+    if (!m_found) {
    39         throw std::runtime_error(strprintf("'%s' not found in debug log\n", m_message));
    


    maflcko commented at 6:06 am on July 14, 2023:
    Not sure about changing this. It just makes it harder to use. If you prefer to not have an exception, this can use an Assert? Also, in the commit description, it would be good to refer to documentation why it is “not allowed”, and/or include an example of what happens.

    vasild commented at 7:11 am on July 20, 2023:

    I will look into replacing this with Assert().

    “exception from destructor in c++” is a topic well covered all over the internet. The compiler produces a warning for that. Here is an example:

     0class A 
     1{
     2public:
     3    ~A() { 
     4        std::cout << "before throw a\n";
     5        // warning: '~A' has a non-throwing exception specification but can still throw [-Wexceptions]
     6        //171 |         throw std::runtime_error("a");
     7        //    |         ^
     8        throw std::runtime_error("a");
     9    } 
    10};
    11
    12int main(int, char**)
    13{
    14    try { 
    15        A a;
    16        std::cout << "before throw b\n";
    17        throw std::runtime_error("b");
    18    } catch (const std::exception& e) { 
    19        std::cout << "catch\n"; // never reached
    20    } 
    21    return 0;
    22}
    
    0$ ./t
    1before throw b
    2before throw a
    3Abort trap (core dumped)
    4$
    

    I will look into improving the commit message as well.


    maflcko commented at 7:25 am on July 20, 2023:
    Yeah, but this being an “Abort trap” is well defined, not “not allowed”. And I think the tests aborting on failure is fine.

    vasild commented at 1:28 pm on July 20, 2023:

    Changed to use std::abort() in the destructor. Reverted to the original interface:

    0{
    1    ASSERT_DEBUG_LOG("expected message");
    2    produce log messages;
    3}
    

    The change was necessary in order to pass the timeout to the “final check or abort” function which was the destructor and there is no way to pass parameters to the destructor. Anyway, now I pass the timeout to the constructor, save it in a member variable and use it in the destructor. With the wait it is now:

    0{
    1    ASSERT_DEBUG_LOG_WAIT("expected message", 30s);
    2    produce log messages;
    3} // the destructor will wait 30s for the message to be logged before aborting
    

    vasild commented at 4:56 pm on August 15, 2023:

    Marking this as “resolved” because I reverted to the original interface of ASSERT_DEBUG_LOG() and changed the exception from the destructor to std::abort().

    Let me know if you still have concerns (aka if I should “unresolve” it). Thanks, it looks better now!

  55. maflcko commented at 6:07 am on July 14, 2023: member

    Not sure about changing the framework.

    Also, needs rebase if still relevant.

  56. jonatack commented at 4:19 pm on July 19, 2023: contributor

    needs rebase if still relevant.

    For the previous releases CI? (I don’t find any merge conflicts with current master.)

  57. maflcko commented at 4:29 pm on July 19, 2023: member

    I don’t find any merge conflicts with current master.

    Are you sure you ran a full build after the merge?

  58. jonatack commented at 4:33 pm on July 19, 2023: contributor

    Ah, the second try did it :)

     0test/fuzz/process_message_e2e.cpp:59:1: error: a type specifier is required for all declarations
     1FUZZ_TARGET_INIT(process_message_e2e, initialize_process_message_e2e)
     2^
     3test/fuzz/process_message_e2e.cpp:59:18: error: use of undeclared identifier 'process_message_e2e'
     4FUZZ_TARGET_INIT(process_message_e2e, initialize_process_message_e2e)
     5                 ^
     6test/fuzz/process_message_e2e.cpp:59:70: error: expected ';' after top level declarator
     7FUZZ_TARGET_INIT(process_message_e2e, initialize_process_message_e2e)
     8                                                                     ^
     9                                                                     ;
    103 errors generated.
    11make[2]: *** [test/fuzz/fuzz-process_message_e2e.o] Error 1
    12make[2]: *** Waiting for unfinished jobs....
    13test/validation_chainstatemanager_tests.cpp:664:9: error: use of undeclared identifier 'ASSERT_DEBUG_LOG'
    14        ASSERT_DEBUG_LOG("failed to validate the -assumeutxo snapshot state");
    15        ^
    161 error generated.
    17make[2]: *** [test/test_bitcoin-validation_chainstatemanager_tests.o] Error 1
    18wallet/test/walletload_tests.cpp:61:20: error: no matching constructor for initialization of 'DebugLogHelper'
    19    DebugLogHelper logHelper("The descriptor ID calculated by the wallet differs from the one in DB", [&](const std::string* s) {
    20                   ^         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    21./test/util/logging.h:23:14: note: candidate constructor not viable: no known conversion from '(lambda at wallet/test/walletload_tests.cpp:61:103)' to 'MatchFn' (aka 'function<bool (const basic_string<char> &)>') for 2nd argument
    22    explicit DebugLogHelper(std::string message, MatchFn match = [](const std::string&){ return true; });
    23             ^
    24./test/util/logging.h:18:7: note: candidate constructor (the implicit copy constructor) not viable: requires 1 argument, but 2 were provided
    25class DebugLogHelper
    26      ^
    271 error generated.
    
  59. vasild force-pushed on Jul 20, 2023
  60. vasild commented at 1:22 pm on July 20, 2023: contributor
    612ba17fca...55c84c2d3b: rebase due to conflicts plus reduce the changes to DebugLogHelper: #26812 (review).
  61. in src/test/util/logging.h:63 in 55c84c2d3b outdated
    67+    void StopReceivingLog();
    68 };
    69 
    70 #define ASSERT_DEBUG_LOG(message) DebugLogHelper UNIQUE_NAME(debugloghelper)(message)
    71 
    72+#define ASSERT_DEBUG_LOG_WAIT(message, timeout) \
    


    maflcko commented at 3:32 pm on July 20, 2023:

    It is already ugly to have a timeout in the functional tests, and use it to sync them. (At least this one is std::optional)

    However, my preference would still be to not have this and instead force the caller to sync.

    Otherwise this will just lead to brittle code down the line and potentially intermittent test issues when running with libc++ sanitizers in valgrind on arm64, etc.


    vasild commented at 7:00 am on July 21, 2023:

    Your concern is that the timeout will turn out not to be enough in some slow environments?

    When you say “force the caller to sync” do you mean something like this:

    0{
    1    ASSERT_DEBUG_LOG("foo");
    2    dostuff
    3    somehow ensure the code that is supposed to log "foo" has executed
    4}  // fail if "foo" is not in the log as soon as this is reached, like in `master`
    

    ?


    maflcko commented at 7:43 am on July 21, 2023:

    Yes.

    For example, if you are waiting for the scheduler to execute and log a validationinterface event, you can call SyncWithValidationInterfaceQueue() to sync without a timeout.


    vasild commented at 8:24 am on July 21, 2023:

    Thinking about this - when one thread triggers an event which another thread is supposed to pick and do something about it and the first thread wants to confirm that was done, then some waiting must be involved in the first thread.

    SyncWithValidationInterfaceQueue() uses https://en.cppreference.com/w/cpp/thread/future/wait which is unlimited, without a timeout. In practice I guess that means a 2h (?) timeout until some guard in the CI kills the waiting test.

    Syncing in the test, explicitly, outside of ASSERT_DEBUG_LOG_WAIT() would require putting sync points in various places in the high-level code - where a flag is raised that e.g. “a ping has been send”, or “we have received GETBLOCKS message” and then the test waits for a particular flag to be raised. If that is done then ASSERT_DEBUG_LOG* is unnecessary. In the current approach the log messages are used as sync points. I think that is better because it does not require littering the high-level code with test-specific sync-points/flags.

    What about changing ASSERT_DEBUG_LOG_WAIT() to wait infinitely, no timeout? Then it would be similar to SyncWithValidationInterfaceQueue() - a test failure due to ~message not logged~ the expected code not executed by the other thread would be designated not by a “timeout: message not found in log” after 30s but by infinite wait/hang of the test.


    maflcko commented at 8:37 am on July 21, 2023:

    Not sure about an infinite timeout for ASSERT_DEBUG_LOG. For example, this can easily be hit when the log string is modified by a developer, turning a (doc) change in one place into a hard to debug unit test issue.

    I know that it is possible to hit an infinite runtime in any other place, but that would generally mean a major fault or injected bug elsewhere. For example, if the scheduler thread is stopped and one called SyncWithValidationInterfaceQueue(), it would be quite obvious from the diff why the unit test never finishes. Alternatively, SyncWithValidationInterfaceQueue() could be extended with an assert to check the thread is running to map an infinitive runtime to an immediate crash to catch this type of bug, if it is expected to happen to at least once.

    It would be good to explain why you need the debug log to sync and why it is not (easily) possible to just sync in this pull.


    vasild commented at 10:03 am on July 21, 2023:

    Not sure about an infinite timeout for ASSERT_DEBUG_LOG. For example, this can easily be hit when the log string is modified by a developer, turning a (doc) change in one place into a hard to debug unit test issue.

    I agree. I will stick to a meaningful message after a timeout instead of infinite wait (without a message).

    I know that it is possible to hit an infinite runtime in any other place, but that would generally mean a major fault or injected bug elsewhere. For example, if the scheduler thread is stopped and one called SyncWithValidationInterfaceQueue()

    I agree with that too. But the parallel with SyncWithValidationInterfaceQueue() is not very close to the tests in this PR because SyncWithValidationInterfaceQueue() does not provide fine-grained sync points, see below what I mean.

    It would be good to explain why you need the debug log to sync and why it is not (easily) possible to just sync in this pull.

    It is possible to not use the debug log. Here is how it would look like:

      0diff --git i/src/net_processing.cpp w/src/net_processing.cpp
      1index 1c73ca4eb4..35301a6a2d 100644
      2--- i/src/net_processing.cpp
      3+++ w/src/net_processing.cpp
      4@@ -3529,12 +3529,13 @@ void PeerManagerImpl::ProcessMessage(CNode& pfrom, const std::string& msg_type,
      5     // At this point, the outgoing message serialization version can't change.
      6     const CNetMsgMaker msgMaker(pfrom.GetCommonVersion());
      7 
      8     if (msg_type == NetMsgType::VERACK) {
      9         if (pfrom.fSuccessfullyConnected) {
     10             LogPrint(BCLog::NET, "ignoring redundant verack message from peer=%d\n", pfrom.GetId());
     11+            g_stage_redundant_verack.post();
     12             return;
     13         }
     14 
     15         if (!pfrom.IsInboundConn()) {
     16             const auto mapped_as{m_connman.GetMappedAS(pfrom.addr)};
     17             LogPrintf("New outbound peer connected: version: %d, blocks=%d, peer=%d%s%s (%s)\n",
     18@@ -3781,12 +3782,14 @@ void PeerManagerImpl::ProcessMessage(CNode& pfrom, const std::string& msg_type,
     19         }
     20         peer->m_addr_processed += num_proc;
     21         peer->m_addr_rate_limited += num_rate_limit;
     22         LogPrint(BCLog::NET, "Received addr: %u addresses (%u processed, %u rate-limited) from peer=%d\n",
     23                  vAddr.size(), num_proc, num_rate_limit, pfrom.GetId());
     24 
     25+        g_stage_received_addr.post();
     26+
     27         m_addrman.Add(vAddrOk, pfrom.addr, 2h);
     28         if (vAddr.size() < 1000) peer->m_getaddr_sent = false;
     29 
     30         // AddrFetch: Require multiple addresses to avoid disconnecting on self-announcements
     31         if (pfrom.IsAddrFetchConn() && vAddr.size() > 1) {
     32             LogPrint(BCLog::NET, "addrfetch connection completed peer=%d; disconnecting\n", pfrom.GetId());
     33@@ -3948,12 +3951,15 @@ void PeerManagerImpl::ProcessMessage(CNode& pfrom, const std::string& msg_type,
     34 
     35         // Send the rest of the chain
     36         if (pindex)
     37             pindex = m_chainman.ActiveChain().Next(pindex);
     38         int nLimit = 500;
     39         LogPrint(BCLog::NET, "getblocks %d to %s limit %d from peer=%d\n", (pindex ? pindex->nHeight : -1), hashStop.IsNull() ? "end" : hashStop.ToString(), nLimit, pfrom.GetId());
     40+        if (pindex == nullptr && hashStop.IsNull()) {
     41+            g_stage_getblocks_all.post();
     42+        }
     43         for (; pindex; pindex = m_chainman.ActiveChain().Next(pindex))
     44         {
     45             if (pindex->GetBlockHash() == hashStop)
     46             {
     47                 LogPrint(BCLog::NET, "  getblocks stopping at %d %s\n", pindex->nHeight, pindex->GetBlockHash().ToString());
     48                 break;
     49@@ -4765,26 +4771,31 @@ void PeerManagerImpl::ProcessMessage(CNode& pfrom, const std::string& msg_type,
     50                     } else {
     51                         // This should never happen
     52                         sProblem = "Timing mishap";
     53                     }
     54                 } else {
     55                     // Nonce mismatches are normal when pings are overlapping
     56-                    sProblem = "Nonce mismatch";
     57                     if (nonce == 0) {
     58                         // This is most likely a bug in another implementation somewhere; cancel this ping
     59                         bPingFinished = true;
     60                         sProblem = "Nonce zero";
     61+                        g_stage_ping_nonce_zero.post();
     62+                    } else {
     63+                        sProblem = "Nonce mismatch";
     64+                        g_stage_ping_nonce_mismatch.post();
     65                     }
     66                 }
     67             } else {
     68                 sProblem = "Unsolicited pong without ping";
     69+                g_stage_unsolicited_pong_without_ping.post();
     70             }
     71         } else {
     72             // This is most likely a bug in another implementation somewhere; cancel this ping
     73             bPingFinished = true;
     74             sProblem = "Short payload";
     75+            g_stage_ping_short_payload.post();
     76         }
     77 
     78         if (!(sProblem.empty())) {
     79             LogPrint(BCLog::NET, "pong peer=%d: %s, %x expected, %x received, %u bytes\n",
     80                 pfrom.GetId(),
     81                 sProblem,
     82@@ -5259,12 +5270,13 @@ void PeerManagerImpl::MaybeSendPing(CNode& node_to, Peer& peer, std::chrono::mic
     83             m_connman.PushMessage(&node_to, msgMaker.Make(NetMsgType::PING, nonce));
     84         } else {
     85             // Peer is too old to support ping command with nonce, pong will never arrive.
     86             peer.m_ping_nonce_sent = 0;
     87             m_connman.PushMessage(&node_to, msgMaker.Make(NetMsgType::PING));
     88         }
     89+        g_stage_sending_ping.post();
     90     }
     91 }
     92 
     93 void PeerManagerImpl::MaybeSendAddr(CNode& node, Peer& peer, std::chrono::microseconds current_time)
     94 {
     95     // Nothing to do for non-address-relay peers
     96@@ -5928,6 +5940,15 @@ bool PeerManagerImpl::SendMessages(CNode* pto)
     97         if (!vGetData.empty())
     98             m_connman.PushMessage(pto, msgMaker.Make(NetMsgType::GETDATA, vGetData));
     99     } // release cs_main
    100     MaybeSendFeefilter(*pto, *peer, current_time);
    101     return true;
    102 }
    103+
    104+CSemaphore g_stage_received_addr{0};
    105+CSemaphore g_stage_getblocks_all{0};
    106+CSemaphore g_stage_sending_ping{0};
    107+CSemaphore g_stage_ping_short_payload{0};
    108+CSemaphore g_stage_unsolicited_pong_without_ping{0};
    109+CSemaphore g_stage_ping_nonce_mismatch{0};
    110+CSemaphore g_stage_ping_nonce_zero{0};
    111+CSemaphore g_stage_redundant_verack{0};
    112diff --git i/src/net_processing.h w/src/net_processing.h
    113index 1bd2b1e8ca..de4a12705f 100644
    114--- i/src/net_processing.h
    115+++ w/src/net_processing.h
    116@@ -91,7 +91,16 @@ public:
    117                                 const std::chrono::microseconds time_received, const std::atomic<bool>& interruptMsgProc) EXCLUSIVE_LOCKS_REQUIRED(g_msgproc_mutex) = 0;
    118 
    119     /** This function is used for testing the stale tip eviction logic, see denialofservice_tests.cpp */
    120     virtual void UpdateLastBlockAnnounceTime(NodeId node, int64_t time_in_seconds) = 0;
    121 };
    122 
    123+extern CSemaphore g_stage_received_addr;
    124+extern CSemaphore g_stage_getblocks_all;
    125+extern CSemaphore g_stage_sending_ping;
    126+extern CSemaphore g_stage_ping_short_payload;
    127+extern CSemaphore g_stage_unsolicited_pong_without_ping;
    128+extern CSemaphore g_stage_ping_nonce_mismatch;
    129+extern CSemaphore g_stage_ping_nonce_zero;
    130+extern CSemaphore g_stage_redundant_verack;
    131+
    132 #endif // BITCOIN_NET_PROCESSING_H
    133diff --git i/src/test/net_msg_tests.cpp w/src/test/net_msg_tests.cpp
    134index 49c97c1440..17bd74cb3a 100644
    135--- i/src/test/net_msg_tests.cpp
    136+++ w/src/test/net_msg_tests.cpp
    137@@ -63,25 +63,27 @@ BOOST_AUTO_TEST_CASE(initial_messages_exchange)
    138 
    139 BOOST_AUTO_TEST_CASE(addr)
    140 {
    141     const auto pipes{m_sockets_pipes.PopFront()};
    142     std::vector<CAddress> addresses{5};
    143 
    144-    ASSERT_DEBUG_LOG_WAIT(strprintf("Received addr: %u addresses", addresses.size()), 30s);
    145+    BOOST_REQUIRE(!g_stage_received_addr.try_wait());
    146     pipes->recv.PushNetMsg(NetMsgType::ADDRV2, addresses);
    147+    g_stage_received_addr.wait();
    148 }
    149 
    150 BOOST_AUTO_TEST_CASE(getblocks)
    151 {
    152     const auto pipes{m_sockets_pipes.PopFront()};
    153     std::vector<uint256> hashes{5};
    154     CBlockLocator block_locator{std::move(hashes)};
    155     uint256 hash_stop;
    156 
    157-    ASSERT_DEBUG_LOG_WAIT("getblocks -1 to end", 30s);
    158+    BOOST_REQUIRE(!g_stage_getblocks_all.try_wait());
    159     pipes->recv.PushNetMsg(NetMsgType::GETBLOCKS, block_locator, hash_stop);
    160+    g_stage_getblocks_all.wait();
    161 }
    162 
    163 BOOST_AUTO_TEST_CASE(ping)
    164 {
    165     const auto pipes{m_sockets_pipes.PopFront()};
    166 
    167@@ -132,14 +134,16 @@ BOOST_AUTO_TEST_CASE(ping)
    168             }
    169         }
    170     };
    171 
    172     auto SendPing = [&]() {
    173         {
    174-            ASSERT_DEBUG_LOG_WAIT("sending ping", 30s);
    175+            while (g_stage_sending_ping.try_wait())
    176+                ;
    177             SetMockTime(GetMockTime() + PING_INTERVAL + 1s);
    178+            g_stage_sending_ping.wait();
    179         }
    180         return GetPingNonceSent();
    181     };
    182 
    183     BOOST_TEST_MESSAGE("Ensure initial messages exchange has completed with the sending of a ping "
    184                        "with nonce != 0 and the ping stats indicate a pending ping.");
    185@@ -147,37 +151,41 @@ BOOST_AUTO_TEST_CASE(ping)
    186     auto time_elapsed = 1s;
    187     SetMockTime(GetMockTime() + time_elapsed);
    188     WaitForPingStats(/*min=*/std::chrono::microseconds::max(), /*last=*/0us, /*wait=*/time_elapsed);
    189 
    190     BOOST_TEST_MESSAGE("Check that receiving a PONG without nonce cancels our PING");
    191     {
    192-        ASSERT_DEBUG_LOG_WAIT("Short payload", 30s);
    193+        BOOST_REQUIRE(!g_stage_ping_short_payload.try_wait());
    194         pipes->recv.PushNetMsg(NetMsgType::PONG);
    195+        g_stage_ping_short_payload.wait();
    196     }
    197     WaitForPingStats(/*min=*/std::chrono::microseconds::max(), /*last=*/0us, /*wait=*/0us);
    198 
    199     BOOST_TEST_MESSAGE("Check that receiving an unrequested PONG is logged and ignored");
    200     {
    201-        ASSERT_DEBUG_LOG_WAIT("Unsolicited pong without ping", 30s);
    202+        BOOST_REQUIRE(!g_stage_unsolicited_pong_without_ping.try_wait());
    203         pipes->recv.PushNetMsg(NetMsgType::PONG, /*nonce=*/uint64_t{0});
    204+        g_stage_unsolicited_pong_without_ping.wait();
    205     }
    206 
    207     BOOST_TEST_MESSAGE("Check that receiving a PONG with the wrong nonce does not cancel our PING");
    208     uint64_t nonce{SendPing()};
    209     {
    210-        ASSERT_DEBUG_LOG_WAIT("Nonce mismatch", 30s);
    211+        BOOST_REQUIRE(!g_stage_ping_nonce_mismatch.try_wait());
    212         pipes->recv.PushNetMsg(NetMsgType::PONG, nonce + 1);
    213+        g_stage_ping_nonce_mismatch.wait();
    214     }
    215     time_elapsed = 5s;
    216     SetMockTime(GetMockTime() + time_elapsed);
    217     WaitForPingStats(/*min=*/std::chrono::microseconds::max(), /*last=*/0us, /*wait=*/time_elapsed);
    218 
    219     BOOST_TEST_MESSAGE("Check that receiving a PONG with nonce=0 cancels our PING");
    220     {
    221-        ASSERT_DEBUG_LOG_WAIT("Nonce zero", 30s);
    222+        BOOST_REQUIRE(!g_stage_ping_nonce_zero.try_wait());
    223         pipes->recv.PushNetMsg(NetMsgType::PONG, /*nonce=*/uint64_t{0});
    224+        g_stage_ping_nonce_zero.wait();
    225     }
    226     WaitForPingStats(/*min=*/std::chrono::microseconds::max(), /*last=*/0us, /*wait=*/0us);
    227 
    228     BOOST_TEST_MESSAGE("Check that receiving a PONG with the correct nonce cancels our PING");
    229     nonce = SendPing();
    230     time_elapsed = 5s;
    231@@ -187,11 +195,12 @@ BOOST_AUTO_TEST_CASE(ping)
    232 }
    233 
    234 BOOST_AUTO_TEST_CASE(redundant_verack)
    235 {
    236     const auto pipes{m_sockets_pipes.PopFront()};
    237 
    238-    ASSERT_DEBUG_LOG_WAIT("ignoring redundant verack message", 30s);
    239+    BOOST_REQUIRE(!g_stage_redundant_verack.try_wait());
    240     pipes->recv.PushNetMsg(NetMsgType::VERACK);
    241+    g_stage_redundant_verack.wait();
    242 }
    243 
    244 BOOST_AUTO_TEST_SUITE_END()
    

    I could be missing some easier alternative. Let me know if you have something better in mind.


    maflcko commented at 10:16 am on July 21, 2023:

    better in mind?

    Yeah, any reason you can’t just call ProcessMessagesOnce. Alternatively, except for the ping/verack case you can just do what the python def sync_with_ping does?


    vasild commented at 12:35 pm on July 21, 2023:

    If ProcessMessagesOnce() is called directly, then it will not be an end-to-end test. It will defeat the purpose of this PR.

    Sending ping and waiting for pong (sync_with_ping) might be an alternative. It is not as flexible and fine-grained though as waiting for a particular log message to be logged and it involves a timeout too (which could expire too soon on slow platforms). What would be the advantage of a sync_with_ping approach? I have to think about this.


    maflcko commented at 1:06 pm on July 21, 2023:

    If ProcessMessagesOnce() is called directly, then it will not be an end-to-end test. It will defeat the purpose of this PR.

    Makes sense. I guess I misunderstood the goal of this pull. So the goal is to mimic the functional tests that already do the same checks, but do it without spinning up a real socket, but use a mocked socket?


    vasild commented at 3:06 pm on July 21, 2023:

    Yes. My idea is to complement the functional tests, not obsolete them.

    In general, when the test is run from inside the same process then we have greater control of what happens - we can inspect variables and check state in more detail than a functional test can do from outside of bitcoind - it has to use the RPC. Also, from inside the process we can alter the state by modifying variables or calling certain functions that is not possible from outside of the process.


    vasild commented at 5:16 pm on August 15, 2023:

    When one thread waits for another to execute a particular line of code, if that line of code is not executed for some time the waiting thread has to assume a timeout failure. I.e. some waiting is unavoidable, I am leaving the code as it is.

    Do you consider this resolved?


    vasild commented at 1:06 pm on September 15, 2023:
    Marking this as resolved. Comment if you think it is not.
  62. DrahtBot removed the label CI failed on Jul 20, 2023
  63. in src/test/fuzz/process_message_e2e.cpp:69 in 55c84c2d3b outdated
    63+    const std::string random_message_type{fuzzed_data_provider.ConsumeBytesAsString(CMessageHeader::COMMAND_SIZE).c_str()};
    64+    if (!LIMIT_TO_MESSAGE_TYPE.empty() && random_message_type != LIMIT_TO_MESSAGE_TYPE) {
    65+        return;
    66+    }
    67+
    68+    SetMockTime(GetMockTime() + std::chrono::minutes{fuzzed_data_provider.ConsumeIntegral<uint8_t>()});
    


    brunoerg commented at 8:55 pm on July 20, 2023:
    In 55c84c2d3bff09784ad127aba68a166f3f36f215: Is there any specific reason for this approach - Instead of using a fixed value like process_message does?

    vasild commented at 7:54 am on July 21, 2023:
    You mean SetMockTime(1610000000);? Connman is already started at this point and I thought that the clock going backwards might be a problem, but replacing this with SetMockTime(1610000000); seems to work so I guess either one is fine. Do you think I should change it?

    maflcko commented at 9:15 am on July 21, 2023:

    Do you think I should change it?

    In any case, there should be a comment to explain why it sets the mock time, no?


    vasild commented at 5:23 pm on August 15, 2023:
    Added a comment.
  64. vasild force-pushed on Aug 15, 2023
  65. vasild commented at 5:22 pm on August 15, 2023: contributor
    55c84c2d3b...ca7a9983eb: add a comment in the source code, as per #26812 (review)
  66. in src/test/fuzz/process_message_e2e.cpp:69 in ca7a9983eb outdated
    64+    if (!LIMIT_TO_MESSAGE_TYPE.empty() && random_message_type != LIMIT_TO_MESSAGE_TYPE) {
    65+        return;
    66+    }
    67+
    68+    // any time to successfully reset ibd; use a time in the future because Connman is already running
    69+    SetMockTime(GetMockTime() + std::chrono::minutes{fuzzed_data_provider.ConsumeIntegral<uint8_t>()});
    


    maflcko commented at 5:29 pm on August 15, 2023:
    ConsumeIntegral can return 0, so this won’t be in the future. Can just drop ConsumeIntegral completely?

    vasild commented at 11:11 am on August 16, 2023:
    I stared at this a bit and realized it does not make sense without the ResetIbd() call and further, the point is to jump out of IBD at some point later during the test (by changing the mock time so that it is near or before the chain tip). Added that and some comments.
  67. DrahtBot added the label CI failed on Aug 15, 2023
  68. vasild force-pushed on Aug 16, 2023
  69. vasild commented at 11:12 am on August 16, 2023: contributor
    ca7a9983eb...ae75114975: enter IBD at the start and possibly jump out at some point, like in src/test/fuzz/process_messages.cpp.
  70. vasild force-pushed on Aug 17, 2023
  71. vasild commented at 9:41 am on August 17, 2023: contributor
    ae75114975...d31cb74c44: rebase for CI
  72. DrahtBot removed the label CI failed on Aug 18, 2023
  73. bitcoin deleted a comment on Aug 18, 2023
  74. DrahtBot added the label CI failed on Aug 29, 2023
  75. DrahtBot commented at 8:33 pm on September 4, 2023: contributor
    Needs rebase if still relevant
  76. vasild force-pushed on Sep 15, 2023
  77. vasild commented at 1:05 pm on September 15, 2023: contributor
    d31cb74c44...4bc062fb9c: rebase due to conflicts
  78. DrahtBot removed the label CI failed on Sep 15, 2023
  79. DrahtBot added the label Needs rebase on Oct 3, 2023
  80. vasild force-pushed on Oct 6, 2023
  81. vasild commented at 9:12 am on October 6, 2023: contributor
    4bc062fb9c...8f21993a06: rebase due to conflicts
  82. DrahtBot removed the label Needs rebase on Oct 8, 2023
  83. jonatack commented at 5:14 am on October 17, 2023: contributor
    Will re-review this soon.
  84. DrahtBot added the label CI failed on Nov 3, 2023
  85. maflcko commented at 7:34 am on November 4, 2023: member
    0test/net_msg_tests.cpp:70:48: error: ‘WithParams’ was not declared in this scope;
    
  86. DrahtBot added the label Needs rebase on Nov 8, 2023
  87. dergoegge commented at 11:53 am on November 10, 2023: member

    Coverage after 1000 CPU hours of fuzzing: https://dergoegge.github.io/bitcoin-coverage/pr26812/fuzz.coverage/index.html

    Doesn’t reach much in net_processing, so it’s probably not doing much in terms of e2e testing.

  88. vasild force-pushed on Nov 15, 2023
  89. vasild commented at 2:06 pm on November 15, 2023: contributor
    8f21993a06...2386ef0a54: rebase due to conflicts and drop the newly added fuzz test. @dergoegge, thanks for looking into this and testing it. The reason for the low coverage is, I guess, that the fuzzer has a hard time generating a valid or “interesting” data to push to the socket. Same as the reason for #28692. According to @maflcko’s advice #21387 (review) a dictionary or a custom mutator could resolve this. I am yet to study how to do that. So, I have removed the last commit (the fuzz test) from this PR.
  90. DrahtBot removed the label Needs rebase on Nov 15, 2023
  91. vasild force-pushed on Nov 16, 2023
  92. vasild commented at 9:54 am on November 16, 2023: contributor
    2386ef0a54...0858b0c084: remove duplicate #include <sync.h>
  93. DrahtBot removed the label CI failed on Nov 17, 2023
  94. DrahtBot added the label Needs rebase on Nov 29, 2023
  95. vasild commented at 1:09 pm on December 1, 2023: contributor
    0858b0c084...8b10990aa0: rebase due to conflicts
  96. vasild force-pushed on Dec 1, 2023
  97. DrahtBot removed the label Needs rebase on Dec 1, 2023
  98. DrahtBot added the label CI failed on Jan 13, 2024
  99. Sjors commented at 5:38 pm on February 16, 2024: member
    Concept ACK. The ability to (better) mock a socket would be very helpful in my Stratum v2 Template Provider work #29432. I found out the hard way that using a real socket makes the test rather brittle.
  100. in src/test/util/net.h:235 in f42e4f3b3b outdated
    227+        /**
    228+         * Deserialize a `CNetMessage` and remove it from the pipe.
    229+         * If not enough bytes are available then the function will wait. If parsing fails
    230+         * or EOF is signaled to the pipe, then `std::nullopt` is returned.
    231+         */
    232+        std::optional<CNetMessage> GetNetMsg() EXCLUSIVE_LOCKS_REQUIRED(!m_mutex);
    


    Sjors commented at 10:16 am on February 19, 2024:

    f42e4f3b3b4b3ca1945d1ea298b443f1cecaf2ea: this only works with v1 transport. In both v2 transport and the (proposed) Stratum v2 transport (#29432) bytes on the wire need further processing to reconstruct the underlying message.

    For those it’s more useful to have a GetBytes(size_t n) helper method that waits until n bytes have been received.

    (Specifically for the Stratum v2 I’m also trying to completely avoid a dependency on CNetMessage, but the design for that is still in flux)


    vasild commented at 3:53 pm on February 27, 2024:

    Like this?

      0commit 2e8674478813e2e672e5eb7dbb87f28ee11a7268
      1Parent: ad806a0969226806bb33a890356b89e39904c1a8
      2Author:     Vasil Dimov <vd@FreeBSD.org>
      3AuthorDate: Tue Feb 27 14:59:04 2024 +0100
      4Commit:     Vasil Dimov <vd@FreeBSD.org>
      5CommitDate: Tue Feb 27 15:48:18 2024 +0100
      6gpg: Signature made Tue Feb 27 15:48:19 2024 CET
      7gpg:                using RSA key E64D8D45614DB07545D9CCC154DF06F64B55CBBF
      8gpg: Good signature from "Vasil Dimov <vd@myforest.net>" [ultimate]
      9gpg:                 aka "Vasil Dimov <vd@FreeBSD.org>" [ultimate]
     10gpg:                 aka "Vasil Dimov <vasild@gmail.com>" [ultimate]
     11
     12
     13    test: add blocking "get bytes" method to DynSock
     14
     15diff --git a/src/test/util/net.h b/src/test/util/net.h
     16index 368c84d622..d03674cb23 100644
     17--- a/src/test/util/net.h
     18+++ b/src/test/util/net.h
     19@@ -224,12 +224,20 @@ public:
     20          * [@param](/bitcoin-bitcoin/contributor/param/)[in] flags Same as the flags of `recv(2)`. Just `MSG_PEEK` is honored.
     21          * [@return](/bitcoin-bitcoin/contributor/return/) The number of bytes written to `buf`. `0` if `Eof()` has been called.
     22          * If no bytes are available then `-1` is returned and `errno` is set to `EAGAIN`.
     23          */
     24         ssize_t GetBytes(void* buf, size_t len, int flags = 0) EXCLUSIVE_LOCKS_REQUIRED(!m_mutex);
     25 
     26+        /**
     27+         * Wait for `len` bytes and remove them from the pipe.
     28+         * [@param](/bitcoin-bitcoin/contributor/param/)[in] buf Destination to write bytes to.
     29+         * [@param](/bitcoin-bitcoin/contributor/param/)[in] len Write this number of bytes to `buf` or less if EOF is signaled.
     30+         * [@return](/bitcoin-bitcoin/contributor/return/) The number of bytes written to `buf`. Less than `len` means EOF was signaled.
     31+         */
     32+        size_t GetBytesWait(void* buf, size_t len) EXCLUSIVE_LOCKS_REQUIRED(!m_mutex);
     33+
     34         /**
     35          * Deserialize a `CNetMessage` and remove it from the pipe.
     36          * If not enough bytes are available then the function will wait. If parsing fails
     37          * or EOF is signaled to the pipe, then `std::nullopt` is returned.
     38          */
     39         std::optional<CNetMessage> GetNetMsg() EXCLUSIVE_LOCKS_REQUIRED(!m_mutex);
     40@@ -243,13 +251,13 @@ public:
     41          * Construct and push CNetMessage to the pipe.
     42          */
     43         template <typename... Args>
     44         void PushNetMsg(const std::string& type, Args&&... payload) EXCLUSIVE_LOCKS_REQUIRED(!m_mutex);
     45 
     46         /**
     47-         * Signal end-of-file on the receiving end (`GetBytes()` or `GetNetMsg()`).
     48+         * Signal end-of-file on the receiving end (`GetBytes()`, `GetBytesWait()` or `GetNetMsg()`).
     49          */
     50         void Eof() EXCLUSIVE_LOCKS_REQUIRED(!m_mutex);
     51 
     52     private:
     53         /**
     54          * Return when there is some data to read or EOF has been signaled.
     55diff --git a/src/test/util/net.cpp b/src/test/util/net.cpp
     56index 0fee2041f0..2bd4815e89 100644
     57--- a/src/test/util/net.cpp
     58+++ b/src/test/util/net.cpp
     59@@ -256,12 +256,31 @@ ssize_t DynSock::Pipe::GetBytes(void* buf, size_t len, int flags)
     60         m_data.erase(m_data.begin(), m_data.begin() + read_bytes);
     61     }
     62 
     63     return read_bytes;
     64 }
     65 
     66+size_t DynSock::Pipe::GetBytesWait(void* buf, size_t len)
     67+{
     68+    WAIT_LOCK(m_mutex, lock);
     69+
     70+    size_t total_read_bytes{0};
     71+
     72+    do {
     73+        WaitForDataOrEof(lock);
     74+        if (!m_data.empty()) {
     75+            const size_t read_bytes{std::min(len - total_read_bytes, m_data.size())};
     76+            std::memcpy(reinterpret_cast<uint8_t*>(buf) + total_read_bytes, m_data.data(), read_bytes);
     77+            m_data.erase(m_data.begin(), m_data.begin() + read_bytes);
     78+            total_read_bytes += read_bytes;
     79+        }
     80+    } while (total_read_bytes < len && !m_eof);
     81+
     82+    return total_read_bytes;
     83+}
     84+
     85 std::optional<CNetMessage> DynSock::Pipe::GetNetMsg()
     86 {
     87     V1Transport transport{NodeId{0}};
     88 
     89     {
     90         WAIT_LOCK(m_mutex, lock);
     91@@ -309,12 +328,22 @@ void DynSock::Pipe::Eof()
     92 {
     93     LOCK(m_mutex);
     94     m_eof = true;
     95     m_cond.notify_all();
     96 }
     97 
     98+void DynSock::Pipe::WaitForDataOrEof(UniqueLock<Mutex>& lock)
     99+{
    100+    Assert(lock.mutex() == &m_mutex);
    101+
    102+    m_cond.wait(lock, [&]() EXCLUSIVE_LOCKS_REQUIRED(m_mutex) {
    103+        AssertLockHeld(m_mutex);
    104+        return !m_data.empty() || m_eof;
    105+    });
    106+}
    107+
    108 DynSock::DynSock(std::shared_ptr<Pipes> pipes) : m_pipes{pipes} {}
    109 
    110 DynSock::~DynSock()
    111 {
    112     m_pipes->send.Eof();
    113 }
    

    I did not add that method in this PR because it would be unused. Can be added any time later as part of a PR that needs it.


    Sjors commented at 4:09 pm on February 27, 2024:
    Something like that yes. Maybe it can be included and used a helper function by GetNetMsg?

    vasild commented at 4:15 pm on February 27, 2024:
    I do not think so (or maybe I just don’t see it). Inside GetNetMsg() we don’t know how many bytes are needed. We throw something at transport.ReceivedBytes() and it tells us how much of that it consumed.

    Sjors commented at 4:25 pm on February 27, 2024:
    size_t len could be an optional argument. Not sure if that otherwise creates a mess though.

    vasild commented at 5:00 pm on February 27, 2024:
    You mean to combine the two methods GetBytes() and GetBytesWait()? I considered this but ditched the idea. The bodies of the methods are too different and the len argument cannot be omitted as it tells the size of the buffer.
  101. Sjors commented at 10:18 am on February 19, 2024: member
    Reviewed the fist commit bee6bdf0a5084b2d5749ff06ad63c0e77816c733. Added a question for the second.
  102. vasild force-pushed on Feb 27, 2024
  103. vasild commented at 3:46 pm on February 27, 2024: contributor
    8b10990aa0...ad806a0969: rebase and move the local lambda WaitForDataOrEof() as a method of the Pipe class so it can be reused in the future, see #26812 (review)
  104. vasild force-pushed on Feb 27, 2024
  105. DrahtBot removed the label CI failed on Feb 27, 2024
  106. DrahtBot commented at 3:19 am on March 15, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/22041198321

  107. DrahtBot added the label CI failed on Mar 15, 2024
  108. vasild force-pushed on Mar 28, 2024
  109. vasild commented at 10:13 am on March 28, 2024: contributor
    c54a458cb9...9a158da46c: rebase due to conflicts
  110. DrahtBot removed the label CI failed on Mar 28, 2024
  111. DrahtBot added the label CI failed on May 2, 2024
  112. DrahtBot commented at 5:54 pm on May 2, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/23192935231

  113. DrahtBot marked this as a draft on May 13, 2024
  114. DrahtBot commented at 8:03 am on May 13, 2024: contributor
    Turned into draft for now, due to failing CI. If this is no longer a WIP, you can move it out of draft.
  115. test: put the generic parts from StaticContentsSock into a separate class
    This allows reusing them in other mocked implementations.
    
    Also move the implementation (method definitions) to
    `test/util/net.cpp` to make the header `test/util/net.h` easier to follow.
    051f390c0d
  116. test: add a mocked Sock that allows inspecting what has been Send() to it
    And also allows gradually providing the data to be returned by `Recv()`
    and sending and receiving net messages (`CNetMessage`).
    9e738cf127
  117. test: add NetTestingSetup that starts connman with mocked sockets 117b1101e2
  118. test: don't throw from the destructor of DebugLogHelper
    Throwing an exception from the destructor of a class is a bad practice
    because the destructor will be called when an object of that type is
    alive on the stack and another exception is thrown, which will result
    in "exception during the exception". This would terminate the program
    without any messages.
    
    Instead print the message to the standard error output and call
    `std::abort()`.
    f5cbcec951
  119. test: make it possible for DebugLogHelper to wait for messages
    Extend `DebugLogHelper::~DebugLogHelper()` to be able to
    optionally wait for messages, possibly logged from another thread, to
    arrive before performing the final check.
    b533f98157
  120. test: add unit tests exercising full call chain of CConnman and PeerManager
    Add tests that use a mocked socket to similate network IO and cover the
    full `CConnman`, `PeerManager` and the interaction between them.
    ecba2fbd20
  121. in src/test/util/setup_common.cpp:51 in 9a158da46c outdated
    47@@ -48,6 +48,7 @@
    48 #include <test/util/net.h>
    49 #include <test/util/random.h>
    50 #include <test/util/txmempool.h>
    51+#include <timedata.h>
    


    jonatack commented at 8:09 pm on May 13, 2024:

    Didn’t look deeply, but with the following diff this branch builds again after rebase to current master ff8c606cf1eaefd0eab9f144561120a and the unit tests pass:

     0--- a/src/test/util/setup_common.cpp
     1+++ b/src/test/util/setup_common.cpp
     2@@ -45,7 +45,6 @@
     3 #include <test/util/txmempool.h>
     4-#include <timedata.h>
     5 #include <txdb.h>
     6@@ -648,10 +647,6 @@ NetTestingSetup::~NetTestingSetup()
     7     fNameLookup = DEFAULT_NAME_LOOKUP;
     8     fListen = true;
     9     CreateSock = CreateSockOS;
    10-    // PeerManager::ProcessMessage() calls AddTimeData() which changes the internal state
    11-    // in timedata.cpp and later confuses the test "timedata_tests/addtimedata". Thus reset
    12-    // that state as it was before our test was run.
    13-    TestOnlyResetTimeData();
    14 }
    

    vasild commented at 9:21 am on May 23, 2024:
    Yes, indeed! Fixed, thanks!
  122. vasild force-pushed on May 23, 2024
  123. vasild commented at 9:21 am on May 23, 2024: contributor
    9a158da46c...ecba2fbd20: rebase due to conflicts, thanks, @jonatack!
  124. vasild marked this as ready for review on May 23, 2024
  125. DrahtBot removed the label CI failed on May 23, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-01 10:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me