Use mockable time everywhere in net_processing #20027

pull sipa wants to merge 2 commits into bitcoin:master from sipa:202009_mockable_netproc changing 3 files +34 −39
  1. sipa commented at 9:53 pm on September 27, 2020: member

    The fact that net_processing uses a mix of mockable tand non-mockable time functions made it hard to write functional tests for #19988.

    I’m opening this as a separate PR as I believe it’s independently useful. In some ways this doesn’t go quite as far as it could, as there are now several data structures that could be converted to std::chrono types as well now. I haven’t done that here, but I’m happy to reconsider that.

  2. sipa force-pushed on Sep 27, 2020
  3. DrahtBot added the label P2P on Sep 27, 2020
  4. sipa force-pushed on Sep 28, 2020
  5. fanquake deleted a comment on Sep 28, 2020
  6. DrahtBot commented at 3:53 am on September 28, 2020: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #20044 (Make all of net_processing (and some of net) use std::chrono types by sipa)
    • #19893 (test: Remove or explain syncwithvalidationinterfacequeue by MarcoFalke)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  7. in src/net_processing.cpp:4157 in 1281d7d31d outdated
    4153@@ -4155,7 +4154,7 @@ bool PeerManager::SendMessages(CNode* pto)
    4154             // Only actively request headers from a single peer, unless we're close to today.
    4155             if ((nSyncStarted == 0 && fFetch) || pindexBestHeader->GetBlockTime() > GetAdjustedTime() - 24 * 60 * 60) {
    4156                 state.fSyncStarted = true;
    4157-                state.nHeadersSyncTimeout = GetTimeMicros() + HEADERS_DOWNLOAD_TIMEOUT_BASE + HEADERS_DOWNLOAD_TIMEOUT_PER_HEADER * (GetAdjustedTime() - pindexBestHeader->GetBlockTime())/(consensusParams.nPowTargetSpacing);
    4158+                state.nHeadersSyncTimeout = current_time.count() + HEADERS_DOWNLOAD_TIMEOUT_BASE + HEADERS_DOWNLOAD_TIMEOUT_PER_HEADER * (GetAdjustedTime() - pindexBestHeader->GetBlockTime())/(consensusParams.nPowTargetSpacing);
    


    MarcoFalke commented at 6:27 am on September 28, 2020:
    This is not type safe, because someone could changed the type of current_time to have more or less precision than it has now. Please see the count_ helpers.

    sipa commented at 6:56 am on September 28, 2020:

    That’s indeed better, done.

    A bit more abstractly, I don’t think that’s very different from the risk of nHeadersSyncTimeout being changed to represent a time in nanoseconds instead of microseconds, for example. We’re mixing std::chrono types with integers, and the conversions will always involve these risks. The correct approach is to get rid of all of those, and replace them all with std::chrono types. I’m happy to look into that as follow-up PR.

  8. in src/net_processing.cpp:4632 in 1281d7d31d outdated
    4631@@ -4636,7 +4632,6 @@ bool PeerManager::SendMessages(CNode* pto)
    4632             !pto->HasPermission(PF_FORCERELAY) // peers with the forcerelay permission should not filter txs to us
    4633         ) {
    4634             CAmount currentFilter = m_mempool.GetMinFee(gArgs.GetArg("-maxmempool", DEFAULT_MAX_MEMPOOL_SIZE) * 1000000).GetFeePerK();
    4635-            int64_t timeNow = GetTimeMicros();
    


    MarcoFalke commented at 6:31 am on September 28, 2020:
    Is there any reason to subtract the time it took to send previous messages from the next send? It probably doesn’t matter too much, because we assume that sending all messages is instantaneous, but getting the current time is also trivial complexity, so I have a slight preference to keep the getter here.

    sipa commented at 7:04 am on September 28, 2020:
    I generally think we’re making way too many time calls already, and I doubt microsecond precisions matters at all for the fee filter sending.

    ajtowns commented at 0:36 am on October 7, 2020:

    a) We’re not actually sending the messages there, just moving them in memory to a different queue, so it should take microseconds b) This is often only called in intervals of 100ms, and randomly delayed depending on how long it takes to deal with other peers, so accuracy won’t be that fine-grained anyway c) The only times affected by not using an updated timestamp here are ones chosen in comparison to 5min or 10min intervals

    So I don’t see how not using a bumped time here can have any meaningful impact?

  9. in test/functional/wallet_resendwallettransactions.py:69 in 1281d7d31d outdated
    63@@ -64,6 +64,10 @@ def run_test(self):
    64         # Transaction should be rebroadcast approximately 24 hours in the future,
    65         # but can range from 12-36. So bump 36 hours to be sure.
    66         node.setmocktime(now + 36 * 60 * 60)
    67+        # Give some time for trickle to occur
    68+        time.sleep(1)
    69+        node.syncwithvalidationinterfacequeue()
    


    MarcoFalke commented at 6:32 am on September 28, 2020:
    Can you explain why this is needed, what this is trying to achieve? #19893

    sipa commented at 7:05 am on September 28, 2020:
    I copied it from elsewhere; it works fine without the syncwithvalidationinterfacequeue call, so I’ve removed it.
  10. in src/net_processing.cpp:4662 in 1281d7d31d outdated
    4662             // until scheduled broadcast, then move the broadcast to within MAX_FEEFILTER_CHANGE_DELAY.
    4663-            else if (timeNow + MAX_FEEFILTER_CHANGE_DELAY * 1000000 < pto->m_tx_relay->nextSendTimeFeeFilter &&
    4664+            else if (current_time.count() + MAX_FEEFILTER_CHANGE_DELAY * 1000000 < pto->m_tx_relay->nextSendTimeFeeFilter &&
    4665                      (currentFilter < 3 * pto->m_tx_relay->lastSentFeeFilter / 4 || currentFilter > 4 * pto->m_tx_relay->lastSentFeeFilter / 3)) {
    4666-                pto->m_tx_relay->nextSendTimeFeeFilter = timeNow + GetRandInt(MAX_FEEFILTER_CHANGE_DELAY) * 1000000;
    4667+                pto->m_tx_relay->nextSendTimeFeeFilter = current_time.count() + GetRandInt(MAX_FEEFILTER_CHANGE_DELAY) * 1000000;
    


    MarcoFalke commented at 6:33 am on September 28, 2020:
    none of the .count in this scope are type safe (see comment above)

    sipa commented at 7:05 am on September 28, 2020:
    Fixed.
  11. MarcoFalke commented at 6:34 am on September 28, 2020: member
    Concept ACK, but I don’t like that this patch makes the code more fragile.
  12. naumenkogs commented at 6:42 am on September 28, 2020: member
    Concept ACK
  13. sipa force-pushed on Sep 28, 2020
  14. MarcoFalke added the label Refactoring on Sep 28, 2020
  15. practicalswift commented at 1:13 pm on September 28, 2020: contributor
    Concept ACK
  16. in test/functional/p2p_tx_download.py:162 in e5c0223614 outdated
    174-
    175-        self.test_inv_block()
    176-
    177-        self.test_tx_requests()
    178+        # Run each test against new bitcoind instances, as setting mocktimes has long-term effects on when
    179+        # the next trickle relay event happens.
    


    ajtowns commented at 0:14 am on October 7, 2020:

    I’m not sure it’s a good idea to link mockable time to network delays in general – if you bump the mocktime by a hours/days that shouldn’t risk triggering “hey this peer hasn’t said anything for ages, let’s disconnect” conditions, and equally if you backdate the mocktime for some reason, that shouldn’t cause network tasks to say “oh, we decided we don’t need to do anything until time t, which is apparently years away” and effectively hang. Having to tweak tests feels a bit like a canary…

    My intuition for this was we have two sorts of delays – short ones (<30s) that just prevent us busy looping when there’s nothing to actually do and don’t generally need to be precisely controlled, and long ones (minutes/hours/days) that are human visible and that we’ll obviously need mocktime to test because minutes/hours/days is effectively forever.

    But maybe we’ve really got two different sorts of mocking that we want to do – “set the time to X” and it stays there until we’re ready to set it to some different time “Y” (what we’ve got now), and “everything that was going to happen in the next 5 minutes, get it done now” (something more like what the mockscheduler rpc does)? You could implement the latter by bumping the system clock (rather than replacing it), and only allowing the bump amount to increase.

    I suspect at a minimum we’d want to switch to the std::chrono::time_point abstraction and use different clocks so that we could tell these cases apart by type, but that doesn’t seem like a good thing to dive into immediately before feature freeze. (I had a sketch I was working on a while back at https://github.com/ajtowns/bitcoin/commits/201908-systime fwiw)

    And mocktime is only relevant for regtest, so the downsides of this approach are pretty limited (presumably to just these test case changes)


    sipa commented at 10:40 pm on October 7, 2020:

    I think we’ll indeed a number of timers for different purposes, eventually:

    • System time (possibly adjusted by peer data), for use in block and wallet transaction timestamps (as they have an absolute UNIX epoch as reference ).
    • Steady time, mockable, for timing application level events (tx relay, pinging, addr rebroadcast, …).
    • Steady time, for avoiding busy looping (like select() timeout, message handler 100ms delay, …).

    The last one indeed I think shouldn’t be mockable, but things like ping timeouts I think should be. It just becomes too complicated when multiple clocks are mixed within one layer of the code - even if it means using a mockable clock for some delays that are in practice very short.

    As for dealing with the fact that making small delays mockable complicates testing, I think the best solution is a “add X to mock time offset, but keep the clock running”, if that’s what you meant.


    MarcoFalke commented at 9:40 am on October 8, 2020:

    I disagree that system time (used for blocks and wallet txs) should not be mockable. For example, sometimes we want to create a deterministic blockchain, so the block times need to be fixed. I know it is possible to use the python block generator, but I don’t see a strong reason to disallow system time to be mocked.

    Edit: So it is also important to be able to say “set time to a constant” instead of just “add offset, but keep clock running”.


    ajtowns commented at 7:54 pm on October 8, 2020:
    I think the idea was to have two ways of mocking – fix the system time to X (and prevent it from advancing, so time goes t, t+1, t+2, t+3, X, X, X, X), or bump the system time by D (so time goes t, t+1, t+2, t+3, t+D+3, t+D+4, t+D+5).
  17. in src/net_processing.cpp:562 in e5c0223614 outdated
    558@@ -559,7 +559,7 @@ static bool MarkBlockAsReceived(const uint256& hash) EXCLUSIVE_LOCKS_REQUIRED(cs
    559         }
    560         if (state->vBlocksInFlight.begin() == itInFlight->second.second) {
    561             // First block on the queue was received, update the start download time for the next one
    562-            state->nDownloadingSince = std::max(state->nDownloadingSince, GetTimeMicros());
    563+            state->nDownloadingSince = std::max<int64_t>(state->nDownloadingSince, GetTime<std::chrono::microseconds>().count());
    


    ajtowns commented at 0:39 am on October 7, 2020:
    Swich to count_microseconds(GetTime<std::chrono::microseconds>()) and lose the <int64_t> ?

    sipa commented at 11:24 pm on October 7, 2020:
    Done.
  18. in src/net_processing.cpp:3640 in e5c0223614 outdated
    3609@@ -3610,7 +3610,7 @@ void PeerManager::ProcessMessage(CNode& pfrom, const std::string& msg_type, CDat
    3610                     // Matching pong received, this ping is no longer outstanding
    3611                     bPingFinished = true;
    3612                     const auto ping_time = ping_end - pfrom.m_ping_start.load();
    3613-                    if (ping_time.count() > 0) {
    3614+                    if (ping_time.count() >= 0) {
    


    ajtowns commented at 1:29 am on October 7, 2020:
    Not really directly related? I think this makes sense for instances where time isn’t increasing, which can happen with mocktime or (I think) on systems that return cached values of the system time.

    sipa commented at 11:23 pm on October 7, 2020:
    WIthout this you see “timing mishap” errors in the function test logs.

    ajtowns commented at 0:14 am on October 8, 2020:
    Right, but I mean you already see those errors without the other changes in this patch.

    sipa commented at 2:12 am on October 8, 2020:
    Oh, ok. Gone.

    ajtowns commented at 2:37 am on October 8, 2020:
    Err? A separate commit with Allow for ping response time of 0us (avoids "timing mishap" errors when mocktime is set) or similar would be fine and clear IMO, seems a bit trivial to separate into a PR of its own. (Or just leaving it in and letting people who’re curious trace it back to this PR and read the explanation above would also be fine) But either way.

    sipa commented at 2:41 am on October 8, 2020:
    Ok!
  19. ajtowns approved
  20. ajtowns commented at 2:34 am on October 7, 2020: member

    ACK e5c0223614d61d3d885b1c7eaf5ee095cd7d15ed

    Checked conversion from GetTime() to count_microsecounds(GetTime<>()) and that variables that are now mockable are only compared with mockable times.

  21. jnewbery commented at 8:41 am on October 7, 2020: member
    ACK e5c0223614d61d3d885b1c7eaf5ee095cd7d15ed
  22. sipa force-pushed on Oct 7, 2020
  23. sipa force-pushed on Oct 8, 2020
  24. in test/functional/wallet_resendwallettransactions.py:68 in 44b17b2a4d outdated
    63@@ -64,6 +64,9 @@ def run_test(self):
    64         # Transaction should be rebroadcast approximately 24 hours in the future,
    65         # but can range from 12-36. So bump 36 hours to be sure.
    66         node.setmocktime(now + 36 * 60 * 60)
    67+        # Give some time for trickle to occur
    68+        time.sleep(1)
    


    MarcoFalke commented at 7:26 am on October 8, 2020:

    It seems a bit confusing to “use mockable time everywhere”, but then start adding wallclock sleeps in tests. Also, the comment refers to “trickle”, but should probably say “wallet resend”. (The trickle happens when you call setmocktime + 600 in the next line). I suggest the following diff:

     0diff --git a/test/functional/wallet_resendwallettransactions.py b/test/functional/wallet_resendwallettransactions.py
     1index 9336f94ee3..f2f48b8bcd 100755
     2--- a/test/functional/wallet_resendwallettransactions.py
     3+++ b/test/functional/wallet_resendwallettransactions.py
     4@@ -64,8 +64,8 @@ class ResendWalletTransactionsTest(BitcoinTestFramework):
     5         # Transaction should be rebroadcast approximately 24 hours in the future,
     6         # but can range from 12-36. So bump 36 hours to be sure.
     7         node.setmocktime(now + 36 * 60 * 60)
     8-        # Give some time for trickle to occur
     9-        time.sleep(1)
    10+        # Tell scheduler to call MaybeResendWalletTxs now
    11+        node.mockscheduler(1)
    12         node.setmocktime(now + 36 * 60 * 60 + 600)
    13         peer_second.wait_for_broadcast([txid])
    14 
    

    sipa commented at 8:05 am on October 8, 2020:
    Done.
  25. MarcoFalke approved
  26. MarcoFalke commented at 7:39 am on October 8, 2020: member

    ACK 93fca5fbd84ff793625f1c2bd469cfee68aa07f1 🐎

    Signature:

     0-----BEGIN PGP SIGNED MESSAGE-----
     1Hash: SHA512
     2
     3ACK 93fca5fbd84ff793625f1c2bd469cfee68aa07f1 🐎
     4-----BEGIN PGP SIGNATURE-----
     5
     6iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
     7pUg9lwv9Hr28B7UGDp2zRH285YsHrhEiqoQTWbSpuL9dQPRWTAD4ygHN9qW4TbZt
     8DOsuHRZOAjFBinWH/gYIUUW3F8WvIchHMgNY+p4+IkNbYIAMruDmMvbDTldUq0Hu
     9n9QzyB6Oz2OW9p8hMFJTbPB+8rVwMwA4H8ZwooghiKNszd/sFVFi5j5Pxgd6XlFi
    10MDkYpUmS3wQt1Kb2ya4SyOVPdlGQpje9ASjqCNSDz+CJOoJ9fFRZ+txEgn5lgTKI
    11o/oPoNVd2ZIQamx79+1y4kzLmkBBxbxP96ulC6l1UCMA9kKzX1m6yafSgN4Lc3Wo
    1201G5qBGawYXtWb48LPQGEYhVVIM0AcukRvpMA1TBpXkVPKDtvO/8/QEqkMZRnmaI
    13C10s9QP3TNuGhkIY4AUwVQkFmQEbWaePNVVdMd/ee3WO1N2VKuiH/+9/zxpMq9cf
    14tuoffg+0Y2MhRxVFjey23gjDoea+OQjo2KfRApyhELaqejX48k9A3oUbwW7NrUYb
    15IznFTsZW
    16=b0jx
    17-----END PGP SIGNATURE-----
    

    Timestamp of file with hash 03cb2a91902242d76811cd547cea6874233f1f46c555bd142069e3cbfa6b48fc -

  27. Use mockable time everywhere in net_processing ec3916f40a
  28. Avoid 'timing mishap' warnings when mocking b6834e312a
  29. sipa force-pushed on Oct 8, 2020
  30. MarcoFalke commented at 8:15 am on October 8, 2020: member

    ACK b6834e312a 🌶

    Signature:

     0-----BEGIN PGP SIGNED MESSAGE-----
     1Hash: SHA512
     2
     3ACK b6834e312a 🌶
     4-----BEGIN PGP SIGNATURE-----
     5
     6iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
     7pUiqVAwAkgjqSWeoLAE5Eej6vdWXcdAqJlztAhCbdlyCZJJRd/Wu34rtBvHC0fwW
     8UsmVfevPkIL4ejJqUsII+h2QidQv8Ie//EiX+9HtdeQIbpX0AgWJd6mb5jdS8NJw
     9sXwpFfL+x+3H8weUk7mxlIvPtYLZm1VsrlhWR6QW1HG0wL3o9/MVNho3c78lOjxL
    10nQz//fbX6hCLD067gE2ujx7wJsBzbux8jn57oifZq2KPK+6ENmVy8s63u7qDocRw
    11Gc7xpBp2KmnRlfX0Juik8QjNOGOIuNBytRhuZwaLNWKupQT3ItxGWvI8jk7G8mov
    12cKL6csUlRx7iM0z3bHjzZRabbOp8lT29chErWaB6mprVpmC0QJMEw/+JkW6NARnd
    13LNpniw8ttC6Y4Nyci0DCQsDLzyQ942I6tGTWV+gR7XqHMC3DRd5/zCJsP/zhkrXT
    14Aac9tNB45brc8Vl534j+og+pZ7oxSKug6IUwmImavdqRjl0rq/+RsTdFr7Vk9IeG
    152MJK2Tjc
    16=pOLL
    17-----END PGP SIGNATURE-----
    

    Timestamp of file with hash df57ab8fe3ed162f8bb109f94ab3daf5cb2dd502b92b384bc065c8b7c69084eb -

  31. jnewbery commented at 9:17 am on October 8, 2020: member
    utACK b6834e312a6a7bb395ec7266bc9469384639df96
  32. fanquake requested review from ajtowns on Oct 8, 2020
  33. naumenkogs commented at 9:31 am on October 8, 2020: member

    utACK b6834e3

    I agree with the comments by @ajtowns, but I guess it can only introduce problems on regtest? And since this PR is necessary for #19988, I think it’s fine, that can be fixed later.

  34. MarcoFalke merged this on Oct 8, 2020
  35. MarcoFalke closed this on Oct 8, 2020

  36. in src/net_processing.cpp:585 in b6834e312a
    581@@ -582,7 +582,7 @@ static bool MarkBlockAsReceived(const uint256& hash) EXCLUSIVE_LOCKS_REQUIRED(cs
    582         }
    583         if (state->vBlocksInFlight.begin() == itInFlight->second.second) {
    584             // First block on the queue was received, update the start download time for the next one
    585-            state->nDownloadingSince = std::max(state->nDownloadingSince, GetTimeMicros());
    586+            state->nDownloadingSince = std::max(state->nDownloadingSince, count_microseconds(GetTime<std::chrono::microseconds>()));
    


    rebroad commented at 3:35 pm on February 23, 2021:
    seems like a backwards step turning neat code into lengthy code - why not use a function for this or change GetTimeMicros() itself?
  37. Fabcien referenced this in commit 6f09bb224d on May 18, 2021
  38. DrahtBot locked this on Aug 16, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-18 18:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me