test: Add random interval upper bound in regtest #19588

pull hebasto wants to merge 1 commits into bitcoin:master from hebasto:200725-uni-test changing 1 files +13 −2
  1. hebasto commented at 7:24 am on July 25, 2020: member

    Actually, the exponential distribution is unbounded that causes intermittent timeout errors in functional tests.

    Fix #18832 Fix #18969 Fix #19265 Fix #19311

  2. in src/init.cpp:570 in 237bc3edbb outdated
    566@@ -567,6 +567,7 @@ void SetupServerArgs(NodeContext& node)
    567     gArgs.AddArg("-rpcwhitelistdefault", "Sets default behavior for rpc whitelisting. Unless rpcwhitelistdefault is set to 0, if any -rpcwhitelist is set, the rpc server acts as if all rpc users are subject to empty-unless-otherwise-specified whitelists. If rpcwhitelistdefault is set to 1 and no -rpcwhitelist is set, rpc server acts as if all rpc users are subject to empty whitelists.", ArgsManager::ALLOW_BOOL, OptionsCategory::RPC);
    568     gArgs.AddArg("-rpcworkqueue=<n>", strprintf("Set the depth of the work queue to service RPC calls (default: %d)", DEFAULT_HTTP_WORKQUEUE), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::RPC);
    569     gArgs.AddArg("-server", "Accept command line and JSON-RPC commands", ArgsManager::ALLOW_ANY, OptionsCategory::RPC);
    570+    gArgs.AddArg("-uniformintervaltest", "Use uniform distribution for random intervals instead of exponential one (default: 0)", ArgsManager::ALLOW_BOOL | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
    


    MarcoFalke commented at 7:40 am on July 25, 2020:

    not sure if adding another mock-setting is the right approach here

    • This is not guarded and can accidentally be set on mainnet
    • This will run a different code path when set (partially code that has been added for this test-setting only), thus reducing the effectiveness of tests and increasing the maintenance burden
    • There is already a way to mock the poisson dist with the noban permission (This could be extracted to a hidden test-only permission if there are concerns about noban being too broad)
    • A timeout of 60s with the expected value of the poisson dist (5s or 7s or whatever) yields a failure rate that is far lower than the failure rate we see due to infrastructure problems or other test problems and races. The issue here is that the nodes are daisy chained and thus the delays add up, which could bring the delay over 60s more likely.

    hebasto commented at 8:38 am on July 25, 2020:
    • This is not guarded and can accidentally be set on mainnet

    Yes. It must be guarded. Going to fix it.

    • This will run a different code path when set (partially code that has been added for this test-setting only), thus reducing the effectiveness of tests and increasing the maintenance burden

    The current code path, which is skipped in tests with this PR, is not covered by tests in master.

    • There is already a way to mock the poisson dist with the noban permission (This could be extracted to a hidden test-only permission if there are concerns about noban being too broad)

    This approach does not work universally, e.g., in wallet_backup.py: https://github.com/bitcoin/bitcoin/blob/40a04814d130dfc9131af3f568eb44533e2bcbfc/test/functional/wallet_backup.py#L52-L57

    • A timeout of 60s with the expected value of the poisson dist (5s or 7s or whatever) yields a failure rate that is far lower than the failure rate we see due to infrastructure problems or other test problems and races. The issue here is that the nodes are daisy chained and thus the delays add up, which could bring the delay over 60s more likely.

    The assumption about test failures that would be fixed by this PR could be biased as in the most cases we have only CI logs that lack data about infrastructure problems.


    MarcoFalke commented at 7:55 pm on August 12, 2020:

    The current code path, which is skipped in tests with this PR, is not covered by tests in master.

    The same might be true for this change. Capping the time might never execute code paths (in other functions) that only run after a long timeout.

  3. hebasto force-pushed on Jul 25, 2020
  4. hebasto renamed this:
    test: Use uniform distribution for random intervals in functional tests
    test: Add random interval upper bound in regtest
    on Jul 25, 2020
  5. hebasto commented at 9:55 am on July 25, 2020: member

    Reworked. @MarcoFalke Mind taking another look?

    The changes are pretty local, and if they will not justify themselves, they could be easily reverted back :smiley:

  6. DrahtBot commented at 6:33 pm on July 26, 2020: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #18077 (net: Add NAT-PMP port forwarding support by hebasto)
    • #17428 (p2p: Try to preserve outbound block-relay-only connections during restart by hebasto)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  7. DrahtBot added the label P2P on Jul 26, 2020
  8. DrahtBot added the label Tests on Jul 26, 2020
  9. hebasto force-pushed on Jul 29, 2020
  10. hebasto commented at 11:41 am on July 29, 2020: member
    Rebased c6d1cbdb4b7d21ebb3254781f8f6a6c8a679837d -> 8a945c0a4834b60c5453485a78cbae45ce65b3d9 (pr19588.02 -> pr19588.03) to include the recent CI changes.
  11. test: Add random interval upper bound in regtest
    These changes fix intermittent timeout errors in the functional tests.
    b34b8edfb2
  12. in src/net.cpp:2858 in 8a945c0a48 outdated
    2854@@ -2853,7 +2855,16 @@ int64_t CConnman::PoissonNextSendInbound(int64_t now, int average_interval_secon
    2855 
    2856 int64_t PoissonNextSend(int64_t now, int average_interval_seconds)
    2857 {
    2858-    return now + (int64_t)(log1p(GetRand(1ULL << 48) * -0.0000000000000035527136788 /* -1/2^48 */) * average_interval_seconds * -1000000.0 + 0.5);
    2859+    const int64_t average_interval_ms = average_interval_seconds * 1000000ULL;
    


    instagibbs commented at 3:10 pm on August 5, 2020:
    math isn’t my strongest suit: Why are we multiplying seconds by 100k to get ms? 🤔

    hebasto commented at 3:14 pm on August 5, 2020:

    math isn’t my strongest suit: Why are we multiplying seconds by 100k to get ms? thinking

    Why 1000000ULL is 100k? :)


    hebasto commented at 3:20 pm on August 5, 2020:
    Going to update names.

    hebasto commented at 3:26 pm on August 5, 2020:
    Thanks! Updated.

    instagibbs commented at 3:32 pm on August 5, 2020:
    too many zeroes clearly :eyes: , thanks for update of name
  13. hebasto force-pushed on Aug 5, 2020
  14. hebasto commented at 3:29 pm on August 5, 2020: member

    Updated 8a945c0a4834b60c5453485a78cbae45ce65b3d9 -> b34b8edfb24a7660376466bb136affcc6d688b75 (pr19588.03 -> pr19588.04, diff).

    Addressed @instagibbscomment:

    math isn’t my strongest suit: Why are we multiplying seconds by 100k to get ms?

  15. in src/net.cpp:2862 in b34b8edfb2
    2854@@ -2853,7 +2855,16 @@ int64_t CConnman::PoissonNextSendInbound(int64_t now, int average_interval_secon
    2855 
    2856 int64_t PoissonNextSend(int64_t now, int average_interval_seconds)
    2857 {
    2858-    return now + (int64_t)(log1p(GetRand(1ULL << 48) * -0.0000000000000035527136788 /* -1/2^48 */) * average_interval_seconds * -1000000.0 + 0.5);
    2859+    const int64_t average_interval_microseconds = average_interval_seconds * 1000000ULL;
    2860+    int64_t delta_microseconds = -std::log1p(GetRand(1ULL << 48) * -0.0000000000000035527136788 /* -1/2^48 */) * average_interval_microseconds + 0.5;
    2861+
    2862+    // Upper bound to avoid timeouts in the functional tests.
    2863+    static const bool regtest = gArgs.GetChainName() == CBaseChainParams::REGTEST;
    


    laanwj commented at 6:02 pm on August 12, 2020:
    I think this is really ugly, both doing a string compare here, making the function depend on a global, and doing this at first call (which is not guaranteed to be after initializing the chain name). Though I don’t really have a proposal how to do this differently.
  16. hebasto closed this on Aug 15, 2020

  17. DrahtBot locked this on Feb 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-09-29 04:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me