Refactor: Improve setup_clean_chain semantics #19168

pull fjahr wants to merge 3 commits into bitcoin:master from fjahr:test_chain changing 111 files +108 −118
  1. fjahr commented at 6:36 pm on June 4, 2020: member

    The semantics of setup_clean_chain are off in my opinion. If it is set to true, there is actually no real chain setup, we only start with the Genesis block. That is not what I would expect ‘clean’ to mean. The 200 blocks chain that we set up if it is set to false could also be called ‘clean’ since there are no transactions or reorgs.

    I think this is also the reason that other people have been confused. The documentation for test_shell is not correct and the default has been re-declared in several tests. This PR fixes the documentation and removes settings that don’t change the default. Additionally I suggest renaming setup_clean_chain to skip_chain_setup.

  2. MarcoFalke commented at 6:39 pm on June 4, 2020: member
    The first commit can also be a scripted diff with the sed d action?
  3. MarcoFalke commented at 6:40 pm on June 4, 2020: member
    ACK 6e1a6ad68a9ebf3bc114cc40c96c5075937a93d4
  4. fjahr force-pushed on Jun 4, 2020
  5. fjahr commented at 6:59 pm on June 4, 2020: member

    The first commit can also be a scripted diff with the sed d action?

    Yepp, done.

  6. in test/functional/example_test.py:81 in 3719653585 outdated
    77@@ -78,7 +78,10 @@ def set_test_params(self):
    78         """Override test parameters for your individual test.
    79 
    80         This method must be overridden and num_nodes must be explicitly set."""
    81-        self.setup_clean_chain = True
    82+        # By default the test pre-mines a chain of 200 blocks automatically.
    


    jonatack commented at 7:12 pm on June 4, 2020:
    The test pre-mines, or it loads the cached data directories?

    fjahr commented at 9:51 pm on June 4, 2020:
    Yeah, it loads a pre-mined chain. Given that mostly absolute beginners are going to look at this I was trying to make it easier to understand but in this case was giving the wrong impression. Updated.
  7. jonatack commented at 7:27 pm on June 4, 2020: member
    My issue with this setting is not how it’s named, but the lack of clarity on whether the default setting (e.g. setup_clean_chain=False) is the efficient one, and if it is not, why is the inefficient setting the default and what the real-world tradeoffs and differences are – and to clarify all this better in the README and the example test. I don’t see this renaming making that clearer but would love for the docs to be clearer.
  8. MarcoFalke commented at 7:31 pm on June 4, 2020: member

    @jonatack If you need the blocks in the chain, then loading them from the cache is efficient. If you don’t need them, then you may skip the cache, but it wouldn’t hurt if you forget to skip the cache.

    Setting the default the other way round would risk that tests that need blocks generate them instead of loading them from the cache.

  9. jonatack commented at 7:50 pm on June 4, 2020: member
    Thanks @MarcoFalke, so IIUC the default is for safety. @fjahr from my personal perspective, adding helpful discussion in the docs about the tradeoffs and reasons would be very helpful (I’m less enthusiastic about the current renaming proposal, esp. if it is loading a cached chain rather than doing setup).
  10. MarcoFalke commented at 8:05 pm on June 4, 2020: member
    I like the rename, but naming is always subjective to some extent
  11. DrahtBot added the label Docs on Jun 4, 2020
  12. DrahtBot added the label Refactoring on Jun 4, 2020
  13. DrahtBot added the label Tests on Jun 4, 2020
  14. jonatack commented at 8:54 pm on June 4, 2020: member
    Maybe I’m thoroughly confused, but isn’t the new name inaccurate? :) ISTM it ought to be strictly better to justify a change. For the docs, there is also this recent conversation #18638 (review) about when to use this. It’s very late here but will revisit this at a better hour and try to come up with a clear addition to the docs that everyone agrees with, unless @fjahr you’re keen to do it (which would be great).
  15. fjahr commented at 9:12 pm on June 4, 2020: member

    Maybe I’m thoroughly confused, but isn’t the new name inaccurate? :) ISTM it ought to be strictly better to justify a change. For the docs, there is also this recent conversation #18638 (comment) about when to use this. It’s very late here but will revisit this at a better hour and try to come up with a clear addition to the docs that everyone agrees with, unless @fjahr you’re keen to do it (which would be great).

    It is a different abstraction layer. The person writing a test using BitcoinTestFramework should not have to learn every little detail of how the framework works internally. That’s why it is a framework. I mentioned why I think the name change is better: because the current name is very confusing. I think most people do not understand ‘clean’ to be what it is supposed to mean here. I am trying to use a name that is easy to understand and appropriate for the abstraction layer and I think setup is a neutral term that does not imply mining or loading from cache.

    Feel free to open an additional PR for more docs updates or give me a hint where you think I could extend this here and I will add it.

  16. fjahr force-pushed on Jun 4, 2020
  17. fjahr force-pushed on Jun 4, 2020
  18. fjahr force-pushed on Jun 4, 2020
  19. fjahr commented at 10:19 pm on June 4, 2020: member
    Added some more small adjustments to the docs based on feedback from @jonatack to make sure users understand performance implications.
  20. DrahtBot commented at 10:36 pm on June 4, 2020: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #20954 (test: Declare nodes type in test_framework.py. by kiminuo)
    • #19013 (test: add v0.20.1 to backwards compatibility test by Sjors)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  21. jonatack commented at 8:21 am on June 5, 2020: member

    I think this renaming is unwise and am NACK on it. Why:

    • Semi-subjective: The new name is more confusing. “Set to true to not do something” is not an improvement in clarity.

    • Objective: The change has consequences that are strictly worse. Changing all these files is noise and churn, messing with git history and erasing all usability of git blame for these settings across the functional test suite. These aren’t copyright headers but test settings that are important enough that the convention has been to incude them explicitly in most every file. The only wise reason to change these lines is to change the setting.

    Concept ACK on improving the documentation. Frameworks increase, not decrease, the need for good documentation and clear explanation of when and why to use settings. Links to source code or terms to grep should be added. Reviewers and contributors aren’t beginners to hide details from; they need to thoroughly understand what is happening under the hood. That is why this documentation exists.

  22. MarcoFalke removed the label Docs on Jun 5, 2020
  23. MarcoFalke removed the label Refactoring on Jun 5, 2020
  24. in test/functional/example_test.py:84 in d2183c7942 outdated
    77@@ -78,7 +78,10 @@ def set_test_params(self):
    78         """Override test parameters for your individual test.
    79 
    80         This method must be overridden and num_nodes must be explicitly set."""
    81-        self.setup_clean_chain = True
    82+        # By default the test loads a pre-mined chain of 200 blocks from cache.
    83+        # Set skip_chain_setup to True to skip this and start from the Genesis
    84+        # block.
    85+        self.skip_chain_setup = True
    


    MarcoFalke commented at 11:30 am on June 5, 2020:
    0        self.use_cached_chain = False
    

    The double negative is a good point. What about this name?


    fjahr commented at 11:52 am on June 8, 2020:
    Sounds good to me, happy to change it to that if other reviewers prefer it. :)

    MarcoFalke commented at 11:57 am on June 8, 2020:

    Sounds good to me, happy to change it to that if other reviewers prefer it. :)

    It would make me feel less guilty of picking the “clean chain” name in fac93497986b5f74716383bf26c78406253c625a, so I hope that other reviewers take that into consideration :sweat_smile:

  25. fjahr commented at 11:51 am on June 8, 2020: member
    • Semi-subjective: The new name is more confusing. “Set to true to not do something” is not an improvement in clarity.

    I don’t understand this argument. The naming itself is misleading. So you would like it to be clearly misleading rather than not so clearly non-misleading? Anyway, I think @MarcoFalke ’s suggested renaming would solve this.

    • Objective: The change has consequences that are strictly worse. Changing all these files is noise and churn, messing with git history and erasing all usability of git blame for these settings across the functional test suite. These aren’t copyright headers but test settings that are important enough that the convention has been to include them explicitly in almost every file. The only wise reason to change these lines is to change the setting.

    If we would follow this rule blindly we would never do any refactorings. However, we do them sometimes if there is an added benefit. The benefit here is clarity of what this setting is doing. There is no doubt it’s confusing as I have laid out before. And in this case, it is also less critical because this line never stands on its own. It is setup code that enables other stuff to work but is never a standalone thing that is relevant without the other code it makes work. The other will be the lines that developers will want to git blame, not the setup code. At least that is my impression of the code I have seen in the files I am changing. Feel free to give counterexamples if I am wrong.

  26. MarcoFalke commented at 11:59 am on June 8, 2020: member
    btw, git blame on recent versions of git can ignore specific commits (like refactoring scripted diffs)
  27. jonatack commented at 12:18 pm on June 8, 2020: member
    • Objective: The change has consequences that are strictly worse. Changing all these files is noise and churn, messing with git history and erasing all usability of git blame for these settings across the functional test suite. These aren’t copyright headers but test settings that are important enough that the convention has been to include them explicitly in almost every file. The only wise reason to change these lines is to change the setting.

    If we would follow this rule blindly we would never do any refactorings.

    I am not suggesting to follow this blindly, but to follow it in this case for the reasons provided.

  28. jonatack commented at 12:21 pm on June 8, 2020: member
    I suggest improving the documentation that can be pointed to in the future regarding the use of this setting.
  29. DrahtBot added the label Needs rebase on Jun 11, 2020
  30. fjahr force-pushed on Jun 30, 2020
  31. fjahr commented at 3:35 pm on June 30, 2020: member
    rebased
  32. fjahr commented at 3:36 pm on June 30, 2020: member

    I suggest improving the documentation that can be pointed to in the future regarding the use of this setting.

    Doesn’t 5d94c2ebf70a3bc4634ff97da18a19040fd95cbb do that?

  33. DrahtBot removed the label Needs rebase on Jun 30, 2020
  34. fjahr force-pushed on Jul 29, 2020
  35. fjahr force-pushed on Jul 29, 2020
  36. fjahr commented at 1:40 pm on July 29, 2020: member
    Giving this one last effort with the rename to use_chached_chain as suggested by @MarcoFalke . Rebased on master to fix silent merge conflicts.
  37. DrahtBot added the label Needs rebase on Aug 13, 2020
  38. fjahr force-pushed on Aug 16, 2020
  39. fjahr commented at 9:41 pm on August 16, 2020: member
    Rebased. @jonatack could you check if the latest changes make you reconsider your NACK? Incorrect use of the setup_clean_chain config still happens regularly (and slips though reviews) as in this commit that was just recently merged: https://github.com/bitcoin/bitcoin/commit/3bd67ba5a4ef3c20ef1f873b63c9f53a6c8608b6 (and it had a lot of experienced reviewers).
  40. DrahtBot removed the label Needs rebase on Aug 16, 2020
  41. DrahtBot added the label Needs rebase on Aug 17, 2020
  42. fjahr force-pushed on Aug 17, 2020
  43. DrahtBot removed the label Needs rebase on Aug 17, 2020
  44. DrahtBot added the label Needs rebase on Aug 18, 2020
  45. fjahr force-pushed on Aug 19, 2020
  46. jonatack commented at 3:41 pm on August 19, 2020: member
    Sure, sorry about that. Re-reviewing soon.
  47. DrahtBot removed the label Needs rebase on Aug 19, 2020
  48. in test/functional/example_test.py:82 in ee74d2398e outdated
    77@@ -78,7 +78,10 @@ def set_test_params(self):
    78         """Override test parameters for your individual test.
    79 
    80         This method must be overridden and num_nodes must be explicitly set."""
    81-        self.setup_clean_chain = True
    82+        # By default the test loads a pre-mined chain of 200 blocks from cache.
    83+        # Set use_chached_chain to False to skip this and start from the Genesis
    84+        # block.
    85+        self.use_cached_chain = False
    


    glozow commented at 11:40 pm on August 20, 2020:
    Noob confession: when I first started tinkering with example_test.py, I struggled a lot because I didn’t quite understand what a “clean chain” meant and wasn’t aware that coinbases needed to mature (I was trying to create transactions with sendtoaddress and getting “insufficient funds”). Maybe comments/docs here or elsewhere could help?

    fjahr commented at 0:12 am on August 21, 2020:
    Hm, I get your point but I think explaining coinbase maturity, a very basic part of the protocol, here in the example test or in the functional test README, feels out of place to me.
  49. glozow commented at 11:45 pm on August 20, 2020: member

    utACK https://github.com/bitcoin/bitcoin/pull/19168/commits/ee74d2398e4753124b442701d61d8a8febec5ab1

    For me personally, I think this is clearer: if I want to start the chain from scratch (which is not the default), I set the option to False.

    I have a question about why use_cached_chain is the default instead of the other way around? It seems appropriate to me that, when we’re trying to test bitcoind behavior, starting with an existing chain and wallets funded represents a typical situation. However, it seems like starting from scratch is more common than using the cached chain in functional tests? I think using the cached chain is more convenient, but are there other reasons to use it?

  50. fjahr commented at 11:56 pm on August 20, 2020: member

    I have a question about why use_cached_chain is the default instead of the other way around? It seems appropriate to me that, when we’re trying to test bitcoind behavior, starting with an existing chain and wallets funded represents a typical situation. However, it seems like starting from scratch is more common than using the cached chain in functional tests? I think using the cached chain is more convenient, but are there other reasons to use it?

    Thanks for the review! I think @MarcoFalke mostly answered that question here: #19168 (comment) The downside of not using the cached chain when you could is bigger than using it when you don’t need it.

  51. in test/functional/README.md:66 in ee74d2398e outdated
    62@@ -63,10 +63,11 @@ don't have test cases for.
    63 - Avoid stop-starting the nodes multiple times during the test if possible. A
    64   stop-start takes several seconds, so doing it several times blows up the
    65   runtime of the test.
    66-- Set the `self.setup_clean_chain` variable in `set_test_params()` to control whether
    67-  or not to use the cached data directories. The cached data directories
    68+- Set the `self.use_chached_chain` variable in `set_test_params()` to `False` if
    


    jonatack commented at 9:12 am on August 21, 2020:

    s/chached/cached/ throughout commit ee74d239, error which could be avoided by including the renaming in this commit in the previous commit’s scripted diff?

    Could mention the default is True in the doc here.


    fjahr commented at 5:09 pm on August 21, 2020:

    Thanks. I fixed the misspelling. I couldn’t fully automatically do the renamings through a scripted diff because of the boolean flip (or at least it would have been a lot more work). And I thought it is easier to review if I would leave all the doc changes together in one commit, rather than doing part of it through the scripted diff and another part in a manual commit. Now together with the squashed doc improvements, it’s a more concise commit I think.

    I mention the default is true now.

  52. in test/functional/test-shell.md:181 in ee74d2398e outdated
    177@@ -178,7 +178,7 @@ can be called after the TestShell is shut down.
    178 | `num_nodes` | `1` | Sets the number of initialized bitcoind processes. |
    179 | `perf` | False | Profiles running nodes with `perf` for the duration of the test if set to `True`. |
    180 | `rpc_timeout` | `60` | Sets the RPC server timeout for the underlying bitcoind processes. |
    181-| `setup_clean_chain` | `False` | Initializes an empty blockchain by default. A 199-block-long chain is initialized if set to `True`. |
    182+| `use_chached_chain` | `True` | A 199-block-long chain is initialized by default. Initializes an empty blockchain if set to `False`. |
    


    jonatack commented at 9:15 am on August 21, 2020:
    Why 199 here when 200 is stated in the other docs?

    fjahr commented at 4:31 pm on August 21, 2020:
    Probably because one counted the genesis block and the other didn’t. I think I will change it to 200 because that’s the actual height and that seems more intuitive to me.
  53. jonatack commented at 9:21 am on August 21, 2020: member

    Concept ACK. I’d suggest:

    • merging the two manual doc edit commits into one after the scripted diffs

    • possibly automating the misspelled manual renaming by adding it to the scripted diff

    • possibly dropping the commit that removes the default setting from tests, the rationale being that it is helpful and saves time when running the test with the opposite setting, when it is present.

  54. fjahr force-pushed on Aug 21, 2020
  55. fjahr commented at 5:20 pm on August 21, 2020: member

    Thanks for the review @jonatack !

    • merging the two manual doc edit commits into one after the scripted diffs

    done

    • possibly automating the misspelled manual renaming by adding it to the scripted diff

    See my answer on your inline comment

    • possibly dropping the commit that removes the default setting from tests, the rationale being that it is helpful and saves time when running the test with the opposite setting, when it is present.

    I’m not 100% sure I fully understand what you are saying. But in general, I would prefer not to do this because I find code that has no effect very irritating. I expect a line of code to have an effect, otherwise, it should just be a comment which does documentation just as well and also makes clear there is no effect.

  56. jonatack commented at 11:27 am on August 23, 2020: member

    Doc improvement commit 55e5bc178cf9e562a36d204c79a01a0d7db1a066: ACK

    Renaming commit e081377f85ca2fd39376f5a6d1d7969a0e23d028: Light ACK

    I like having the explicit setting in each test setup (and not needing to re-verify the default setting when I don’t remember what it is), so I’m not highly supportive of 82e61646b49aec364c990c1f8e08472e181c9d99 but ok if others feel strongly in favor.

    Incorrect use of the setup_clean_chain config still happens regularly (and slips though reviews) as in this commit that was just recently merged: 3bd67ba (and it had a lot of experienced reviewers).

    (I don’t want to derail further but it might be good to fix some of the wrong settings, not necessarily in this PR.)

  57. DrahtBot added the label Needs rebase on Sep 1, 2020
  58. fjahr force-pushed on Sep 1, 2020
  59. fjahr commented at 12:50 pm on September 1, 2020: member
    Had to do another rebase, no code changes git range-diff e9b30126545d6ddd8772363e4079d1e4908ad117..55e5bc178cf9e562a36d204c79a01a0d7db1a066 bab4cce1b0eedc1a51692aaf83ba54dd0a9d17e6..708bfc31388cdb7fcf8b3348f012e6753116bfbc. Would you mind taking another look @MarcoFalke @gzhao408 ?
  60. DrahtBot removed the label Needs rebase on Sep 1, 2020
  61. DrahtBot added the label Needs rebase on Sep 11, 2020
  62. fjahr force-pushed on Sep 13, 2020
  63. DrahtBot removed the label Needs rebase on Sep 13, 2020
  64. DrahtBot added the label Needs rebase on Oct 2, 2020
  65. fjahr force-pushed on Oct 2, 2020
  66. fjahr commented at 9:15 pm on October 2, 2020: member
    Ain’t no party like a rebase party, ‘cause a rebase party don’t stop…
  67. DrahtBot removed the label Needs rebase on Oct 2, 2020
  68. fjahr force-pushed on Oct 3, 2020
  69. DrahtBot added the label Needs rebase on Oct 16, 2020
  70. fjahr force-pushed on Oct 16, 2020
  71. fjahr commented at 9:53 pm on October 16, 2020: member
    Rebased
  72. DrahtBot removed the label Needs rebase on Oct 16, 2020
  73. DrahtBot added the label Needs rebase on Nov 2, 2020
  74. scripted-diff: Remove setup_clean_chain if default is not changed
    -BEGIN VERIFY SCRIPT-
    git grep -l "self.setup_clean_chain = False" test/functional/*.py | xargs sed -i "/self.setup_clean_chain = False/d";
    -END VERIFY SCRIPT-
    b9c7e66079
  75. scripted-diff: Rename setup_clean_chain to use_cached_chain
    Also switched the effect of the bool value, i.e. setup_clean_chain=True is now use_cached_chain=False.
    
    -BEGIN VERIFY SCRIPT-
    git grep -l "self.setup_clean_chain = True" test/functional | xargs sed -i "s/self.setup_clean_chain = True/self.use_cached_chain = False/g";
    
    git grep -l "self.setup_clean_chain = False" test/functional/test_framework | xargs sed -i "s/self.setup_clean_chain = False/self.use_cached_chain = True/g";
    
    git grep -l "not self.setup_clean_chain" test/functional/test_framework | xargs sed -i "s/not self.setup_clean_chain/self.use_cached_chain/g";
    
    git grep -l "self.setup_clean_chain" test/functional/test_framework | xargs sed -i "s/self.setup_clean_chain/not self.use_cached_chain/g";
    
    git grep -l "setup_clean_chain" test/functional/test_framework | xargs sed -i "s/setup_clean_chain/use_cached_chain/g";
    -END VERIFY SCRIPT-
    044f5fa8ab
  76. doc: Clean up use_cached_chain renaming in docs
    Also clarify behavior better.
    f8ebae118b
  77. fjahr force-pushed on Nov 29, 2020
  78. fjahr commented at 10:49 pm on November 29, 2020: member
    herwaardeer (rebased in Afrikaans)
  79. DrahtBot removed the label Needs rebase on Nov 30, 2020
  80. DrahtBot added the label Needs rebase on Jan 31, 2021
  81. DrahtBot commented at 10:06 am on January 31, 2021: member

    🐙 This pull request conflicts with the target branch and needs rebase.

    Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a “draft”.

  82. fjahr commented at 8:55 pm on January 31, 2021: member
    Closing this since the renaming seems to be too controversial/is not interesting enough to most reviewers. I opened a new PR without the renaming and some improved doc cleanups in #21042.
  83. fjahr closed this on Jan 31, 2021

  84. MarcoFalke referenced this in commit ea5a50f92a on Feb 4, 2021
  85. glozow referenced this in commit cbf452bb18 on Feb 23, 2021
  86. glozow referenced this in commit 834f22e64a on Feb 23, 2021
  87. glozow referenced this in commit b1ea5c5bf7 on Feb 23, 2021
  88. glozow referenced this in commit 7eebc809a3 on Mar 31, 2021
  89. glozow referenced this in commit bb998fbc2c on Apr 5, 2021
  90. glozow referenced this in commit 8a365df558 on Apr 8, 2021
  91. windsok referenced this in commit fbc3a4e599 on Apr 23, 2021
  92. DrahtBot locked this on Aug 16, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-06-26 16:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me