test: Seed test RNG context for each test case, print seed #16978

pull MarcoFalke wants to merge 1 commits into bitcoin:master from MarcoFalke:1909-testSeed changing 9 files +46 −15
  1. MarcoFalke commented at 7:39 pm on September 27, 2019: member

    Debugging failing unit tests is hard if the failure is non-deterministic and the seed is not known.

    Fix that by printing the seed and making it possible to set the seed from outside.

  2. fanquake added the label Tests on Sep 27, 2019
  3. MarcoFalke force-pushed on Sep 27, 2019
  4. MarcoFalke force-pushed on Sep 27, 2019
  5. MarcoFalke force-pushed on Sep 27, 2019
  6. practicalswift commented at 8:09 am on September 28, 2019: contributor
    Concept ACK
  7. MarcoFalke force-pushed on Sep 30, 2019
  8. MarcoFalke force-pushed on Oct 1, 2019
  9. in src/test/setup_common.h:43 in fa74667c47 outdated
    37@@ -38,9 +38,21 @@ extern FastRandomContext g_insecure_rand_ctx;
    38  */
    39 extern bool g_mock_deterministic_tests;
    40 
    41-static inline void SeedInsecureRand(bool deterministic = false)
    42+enum class SeedRand {
    43+    ZEROS, //!< Seed with a compile time constant of zeros
    44+    SEEDR, //!< Call the Seed() helper
    


    practicalswift commented at 8:24 pm on October 3, 2019:
    Is this intentionally abbreviated as SEEDR, or should it be SEED/SEEDER?

    MarcoFalke commented at 8:30 pm on October 3, 2019:
    Happy to change it to something else if you can find a better name

    practicalswift commented at 8:37 pm on October 3, 2019:
    I was thinking simply SEED? :)

    MarcoFalke commented at 8:46 pm on October 3, 2019:
    Changed. Let’s hope this doesn’t conflict with any windows symbols
  10. MarcoFalke force-pushed on Oct 3, 2019
  11. in src/test/setup_common.cpp:52 in fab503d6df outdated
    47+void Seed(FastRandomContext& ctx)
    48+{
    49+    // Should be enough to get the seed once for the process
    50+    static uint256 seed{};
    51+    static const std::string RANDOM_CTX_SEED{"RANDOM_CTX_SEED"};
    52+    if (seed.IsNull()) seed = GetUintFromEnv(RANDOM_CTX_SEED);
    


    davereikher commented at 11:32 am on October 4, 2019:
    Is if (seed.IsNull()) necessary in this line? Isn’t seed always null here?

    MarcoFalke commented at 1:05 pm on October 4, 2019:
    It will be set once for the process, so (ideally) should only be null once
  12. in contrib/devtools/test_deterministic_coverage.sh:16 in fab503d6df outdated
    11@@ -12,6 +12,9 @@ export LC_ALL=C
    12 # Use GCOV_EXECUTABLE="llvm-cov gcov" if compiling with clang.
    13 GCOV_EXECUTABLE="gcov"
    14 
    15+# Keep random seed constant over multiple runs.
    16+export RANDOM_CTX_SEED=${RANDOM_CTX_SEED:-8c9678d7cde616175258686f6aaf7c2446657f40a9bc5196ba97ca21b5e3c5c4}
    


    davereikher commented at 12:09 pm on October 4, 2019:
    I think the contrib/devtools/test_deterministic_coverage.sh script will fail to detect non-deterministic coverage if the seed is fixed between runs. Isn’t it the whole point of that script? To make sure I ran a single test, wallet_tests/coin_mark_dirty_immature_credit 30 times using the script and I always got the same coverage. If I’m not mistaken, the main purpose here is to log the seed, so maybe this should be changed to a random number read from /dev/random, something like export RANDOM_CTX_SEED=$(xxd -l 32 -p /dev/urandom | tr -d " \n") Also, for this purpose, I believe this should be moved to within the while [[ ${TEST_RUN_ID} -lt ${N_TEST_RUNS} ]]; do loop so that the seed will be different for each iteration.

    MarcoFalke commented at 1:16 pm on October 4, 2019:
    Thanks, switched to xxd -l 32 -p /dev/urandom | tr -d " \n"
  13. davereikher changes_requested
  14. MarcoFalke force-pushed on Oct 4, 2019
  15. MarcoFalke commented at 1:16 pm on October 4, 2019: member
    Addressed feedback by @davereikher
  16. in contrib/devtools/test_deterministic_coverage.sh:16 in faad0e8e4d outdated
    11@@ -12,6 +12,9 @@ export LC_ALL=C
    12 # Use GCOV_EXECUTABLE="llvm-cov gcov" if compiling with clang.
    13 GCOV_EXECUTABLE="gcov"
    14 
    15+# Keep random seed constant over multiple runs.
    16+export RANDOM_CTX_SEED=${RANDOM_CTX_SEED:-$(xxd -l 32 -p /dev/urandom | tr -d " \n")}
    


    davereikher commented at 2:30 pm on October 4, 2019:
    This makes the seed different between subsequent runs of the script, but the script will still show deterministic coverage each time it runs, since the seed doesn’t change between iterations within the script, so this line should be moved down into this loop: https://github.com/bitcoin/bitcoin/blob/faad0e8e4d61dfb115331c18e68adffb2b3ee471/contrib/devtools/test_deterministic_coverage.sh#L106

    MarcoFalke commented at 5:45 pm on October 7, 2019:
    Huh? If you change the seed between runs and then compare the results of those runs, you are pretty much guaranteed to get non-deterministic coverage. I think this test should check for non-determinism across runs, assuming everything kept constant. That is, it should act as a proxy to be able to answer the question “How reliable can test failures be reproduced”.

    MarcoFalke commented at 5:49 pm on October 7, 2019:
    I think you are looking for the test to answer the question “How deterministic is coverage across runs”, which is also a useful question to ask. Not sure how to accommodate both.

    MarcoFalke commented at 2:30 pm on October 8, 2019:

    And a random seed is already picked per process in the while loop you referenced below. So moving the export RANDOM_CTX_SEED wouldn’t make any observable difference.

    So either I leave it as is or remove the export RANDOM_CTX_SEED from this file again. What do you think?


    davereikher commented at 11:10 am on October 9, 2019:

    Sorry for the delay,

    Huh? If you change the seed between runs and then compare the results of those runs, you are pretty much guaranteed to get non-deterministic coverage. I think this test should check for non-determinism across runs, assuming everything kept constant.

    You’re right, many tests fail with a fixed seed (I tried 5 different tests from the list in the script and all of them exhibited non-deterministic coverage after a few tens of runs), thanks for pointing this out! On the other hand, I’ve been running the script for all tests which are not in the list inside the script while having the seed changed every time (I moved the line inside the loop just for this test) and no non-determinism in coverage has been detected so far (I’m at iteration ~70 now). I’ll update when it gets to a 1000.

    …it should act as a proxy to be able to answer the question “How reliable can test failures be reproduced”

    But then I think it doesn’t perfectly serve it’s purpose - the fact is that the script’s actual mechanism of operation is to compare the coverage between runs and look for discrepancies. If the coverage is found to be the same between runs, it still does not guarantee that a failure can always be reproduced. That’s because for a test that might fail, at some point in the code there is a branching into two outcomes - test failing and test succeeding. Since all tests probably always succeed when running this script, the succeeding branch is always taken and we never know what’s going on in the failing branch. There might be non-determinism there which could still make the failure irreproducible.

    And a random seed is already picked per process in the while loop you referenced below. So moving the export RANDOM_CTX_SEED wouldn’t make any observable difference.

    So either I leave it as is or remove the export RANDOM_CTX_SEED from this file again. What do you think?

    Isn’t the seed set to RANDOM_CTX_SEED if this environment variable is found and thus the seed is the same for all runs in the loop if this line is located outside the loop?


    MarcoFalke commented at 2:28 pm on October 9, 2019:

    Isn’t the seed set to RANDOM_CTX_SEED if this environment variable is found and thus the seed is the same for all runs in the loop if this line is located outside the loop?

    Yes, but putting it inside the loop will make it pick a new seed every time. This is already the default behaviour when no seed is provided at all. So might as well leave it out completely.


    davereikher commented at 3:02 pm on October 9, 2019:
    Oh ok, so yes, I think it’s best to remove it then.

    MarcoFalke commented at 3:20 pm on October 9, 2019:
    Ok, removed. Thanks for the discussion!

    davereikher commented at 5:04 am on October 10, 2019:
    Likewise!

    davereikher commented at 5:12 am on October 10, 2019:

    BTW,

    … I’ll update when it gets to a 1000.

    The script found non-deterministic behaviour at iteration 399, meaning it ran the entire test suite 398 times with a different seed each run and did not detect non-deterministic coverage in any of them. The non-determinism in run 399 might be caused by a test that should be included in the list of tests in the script. I’m not sure how to isolate the test that causes this non-determinism though. It requires some detective work. I’ll try working on it.

  17. test: Seed test RNG context for each test case, print seed fae43a97ca
  18. MarcoFalke force-pushed on Oct 9, 2019
  19. davereikher commented at 5:03 am on October 10, 2019: contributor
    Tested ACK fae43a97ca947cd0802392e9bb86d9d0572c0fba
  20. DrahtBot added the label Needs rebase on Nov 7, 2019
  21. DrahtBot commented at 2:27 pm on November 7, 2019: member
  22. MarcoFalke referenced this in commit 772673dfbe on Nov 7, 2019
  23. MarcoFalke merged this on Nov 7, 2019
  24. MarcoFalke closed this on Nov 7, 2019

  25. MarcoFalke deleted the branch on Nov 7, 2019
  26. sidhujag referenced this in commit 9bc18304f1 on Nov 9, 2019
  27. fanquake removed the label Needs rebase on Jul 26, 2020
  28. jasonbcox referenced this in commit ee2ce03e6d on Oct 19, 2020
  29. sidhujag referenced this in commit 7e8c01256e on Nov 10, 2020
  30. PastaPastaPasta referenced this in commit ee3f303a07 on Dec 22, 2021
  31. PastaPastaPasta referenced this in commit 81e86e6954 on Dec 22, 2021
  32. PastaPastaPasta referenced this in commit 601f39505b on Dec 22, 2021
  33. furszy referenced this in commit 73ac5f699e on Feb 7, 2022
  34. DrahtBot locked this on Feb 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-17 15:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me