Remove CMerkleTx #16451

pull jnewbery wants to merge 3 commits into bitcoin:master from jnewbery:2019-07-remove-CMerkleTx changing 2 files +72 −85
  1. jnewbery commented at 2:50 pm on July 24, 2019: member

    CMerkleTx is only used as a base class for CWalletTx. It was previously also used for vtxPrev which was removed in 93a18a3650292afbb441a47d1fa1b94aeb0164e3.

    This PR moves all of the CMerkleTx members and logic into CWalletTx. The CMerkleTx class is kept for deserialization and serialization of old wallet files.

    This makes the refactor in #15931 cleaner.

  2. DrahtBot added the label Wallet on Jul 24, 2019
  3. in src/wallet/wallet.h:368 in c3a4d162c4 outdated
    364@@ -365,82 +365,24 @@ struct COutputEntry
    365     int vout;
    366 };
    367 
    368-/** A transaction with a merkle branch linking it to the block chain. */
    369+/** Legacy class used for deserializing vtxPrev for backwards compatibility **/
    


    MarcoFalke commented at 3:29 pm on July 24, 2019:
    Could add the commit or wallet version where this was changed?

    jnewbery commented at 6:10 pm on July 24, 2019:
    I’ve expanded the comment here
  4. in src/wallet/wallet.h:381 in c3a4d162c4 outdated
    411 
    412     template <typename Stream, typename Operation>
    413     inline void SerializationOp(Stream& s, Operation ser_action) {
    414+        CTransactionRef tx;
    415+        uint256 hashBlock;
    416+        int nIndex;
    


    MarcoFalke commented at 3:34 pm on July 24, 2019:
    Would this write uninitialized memory to the wallet file?

    jnewbery commented at 3:52 pm on July 24, 2019:

    No CMerkleTx object is ever serialized.

    I could go even further in this PR and remove the CMerkleTx object entirely. Let me add a commit to see what that looks like.


    MarcoFalke commented at 5:36 pm on July 24, 2019:
    Yeah, I am interested in seeing that commit. That probably means you can get rid of the static_cast<const CMerkleTx&>(*this) that happens in the serialization of CWalletTx

    MarcoFalke commented at 5:39 pm on July 24, 2019:

    static_cast<const CMerkleTx&>(*this)

    Oh, that probably means that serialization is called, so I’d like to re-raise my previous question of uninitialized memory.


    jnewbery commented at 6:08 pm on July 24, 2019:

    That probably means you can get rid of the static_cast<const CMerkleTx&>(*this) that happens in the serialization of CWalletTx

    Huh? This PR already does get rid of the static_cast<const CMerkleTx&>(*this)


    promag commented at 10:43 pm on July 24, 2019:

    fd2773f2eba659d78fae9d74461d9dd952609aa9

    No CMerkleTx object is ever serialized.

    Right, the vector is always empty. You could add assert(ser_action.ForRead()). But the following removes the serialization support so maybe not worth it?


    MarcoFalke commented at 1:58 pm on July 25, 2019:
    I like the assert

    jnewbery commented at 2:01 pm on July 25, 2019:

    But the following removes the serialization support so maybe not worth it?

    I agree (if we decide to keep the second commit)

  5. MarcoFalke commented at 3:36 pm on July 24, 2019: member
    Just to clarify: When I open+close an “old” wallet file, this will overwrite all merkle txs with zeros? If yes, would it be safe to open such a file with the older version of Bitcoin Core?
  6. MarcoFalke commented at 3:36 pm on July 24, 2019: member
    Concept ACK
  7. jnewbery commented at 3:49 pm on July 24, 2019: member

    When I open+close an “old” wallet file, this will overwrite all merkle txs with zeros?

    No. When opening the old file, the CWalletTx deserialization logic will deserialize the merkle txs into a locally scoped vUnused which is thrown away. When closing the wallet, CWalletTx::Serialize() serializes an empty vector vUnused into the file. From my reading of serialize.h, that just writes a single zero byte (a CompactSize of 0).

    That behaviour should be unchanged by this PR.

  8. jonatack commented at 4:35 pm on July 24, 2019: member
    Concept ACK. Light code review, build, all tests pass locally.
  9. ariard commented at 5:54 pm on July 24, 2019: member
    Concept ACK, should go before #15931 as it lets me remove one commit
  10. DrahtBot commented at 5:59 pm on July 24, 2019: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #16341 (Introduce ScriptPubKeyMan interface and use it for key and script management (aka wallet boxes) by achow101)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  11. jnewbery force-pushed on Jul 24, 2019
  12. in src/wallet/wallet.h:384 in f7fe30c7c0 outdated
    424-        std::vector<uint256> vMerkleBranch; // For compatibility with older versions.
    425-        READWRITE(tx);
    426-        READWRITE(hashBlock);
    427-        READWRITE(vMerkleBranch);
    428-        READWRITE(nIndex);
    429+        s << tx << hashBlock << vMerkleBranch << nIndex;
    


    sipa commented at 6:11 pm on July 24, 2019:
    You’re invoking the serialization operators inside an Unserialize function. I suspect you mean to call the >> ones instead.

    jnewbery commented at 9:41 pm on July 24, 2019:
    gah. Thanks. Fixed.
  13. jnewbery force-pushed on Jul 24, 2019
  14. jnewbery commented at 6:13 pm on July 24, 2019: member

    @MarcoFalke

    Yeah, I am interested in seeing that commit.

    I’ve added a second commit that removes the CMerkleTx serialization code (since we never actually have to use it). I couldn’t figure out how to replicate the deserialization code for CMerkleTx in CWalletTx::deserialize() and be sure that I’d replicated it byte-for-byte, so I left that in.

    Let me know if you think that second commit is useful. I’m happy to keep it or leave it.

  15. ryanofsky commented at 8:32 pm on July 24, 2019: member

    IMO, messing with this serialization code is not an improvement. It makes the PR more longer and more confusing than it needs to be and now repeats the READWRITE(tx, hashBlock, vMerkleBranch, nIndex) logic across 3 different functions instead of just leaving it in one place.

    I’d suggest leave the inheritance untouched, leave the serialization untouched, getting rid of the comment explaining serialization workarounds, and just move all the CMerkleTx methods to CWalletTx, leaving CMerkleTx as a simple struct with a serialization operator.

  16. jnewbery force-pushed on Jul 24, 2019
  17. jnewbery commented at 9:47 pm on July 24, 2019: member

    IMO, messing with this serialization code is not an improvement. It makes the PR more longer and more confusing than it needs to be and now repeats the READWRITE(tx, hashBlock, vMerkleBranch, nIndex) logic across 3 different functions instead of just leaving it in one place.

    I’d suggest leave the inheritance untouched, leave the serialization untouched, getting rid of the comment explaining serialization workarounds, and just move all the CMerkleTx methods to CWalletTx, leaving CMerkleTx as a simple struct with a serialization operator.

    Thanks for reviewing @ryanofsky . I prefer having all the CWalletTx logic in a single place, but I can understand your point about making this PR smaller.

    I’m happy to keep this PR as it is with two commits, reduce to just the first commit, or reduce further as you’ve suggested (put I’d probably add some comments to explain why we have a base class for CWalletTx that is just used for serialization/deserialization). I’ll wait for others to give their opinions before changing this.

  18. ryanofsky commented at 10:19 pm on July 24, 2019: member

    but I can understand your point about making this PR smaller

    PR size isn’t the main issue. The main thing I don’t like is that READWRITE(tx, hashBlock, vMerkleBranch, nIndex) logic is now repeated in 3 places, rewritten 3 slightly different ways, when it only needs to live in one place. The current serialization seems fine and little seems to be gained by blowing it up, since either way you have to keep the CMerkleTx class. The difference is that with my suggestion, CMerkleTx is something normal: a straightforward struct with a serialization operator. While with the current commits, CMerkleTx turns into this weird duplicative dangling thing requiring explanations from ancient history.

    I also don’t understand the division between the two commits. First a “Remove CMerkleTx” commit that doesn’t actually remove it, then a “tidy up” commit with no description that changes more random things?

    So I don’t get why a simple MOVEONLY commit that just moves all CMerkleTx methods except SerializationOp to CWalletTx isn’t the obvious choice here. But I’m mostly just perplexed by the choice. If you actually think the changes here are great and make sense then please keep them!

  19. in src/wallet/wallet.h:461 in fd2773f2eb outdated
    460@@ -512,7 +461,10 @@ class CWalletTx : public CMerkleTx
    461     mutable bool fInMempool;
    462     mutable CAmount nChangeCached;
    463 
    464-    CWalletTx(const CWallet* pwalletIn, CTransactionRef arg) : CMerkleTx(std::move(arg))
    465+    CWalletTx(const CWallet* pwalletIn, CTransactionRef arg)
    466+        : tx(std::move(arg)),
    467+          hashBlock(uint256()),
    


    promag commented at 10:45 pm on July 24, 2019:

    fd2773f2eba659d78fae9d74461d9dd952609aa9

    nit, personally I prefer

    0: tx(std::move(arg))
    1, hashBlock(uint256())
    2, nIndex(-1)
    

    MarcoFalke commented at 1:55 pm on July 25, 2019:
    That is neither in the style-guide nor in clang-format

    promag commented at 3:13 pm on July 25, 2019:

    6779b7cbdeec091bf2c5a704c07a018641625d5f

    None is AFAICT.



    jnewbery commented at 3:49 pm on July 25, 2019:
    I actually prefer your style @promag , but I’ll keep this as is for now to not invalidate your review. If I need to update the PR again I’ll swap to your style.

    laanwj commented at 6:19 am on July 31, 2019:
    This code is fine. Please do not give formatting comments about preferences, that are not in the code style or developer notes,
  20. in src/wallet/wallet.h:503 in fad5decb1f outdated
    499@@ -505,22 +500,21 @@ class CWalletTx
    500             mapValueCopy["timesmart"] = strprintf("%u", nTimeSmart);
    501         }
    502 
    503-        std::vector<uint256> dummy_vector; //!< Used to be vMerkleBranch
    504-        s << tx << hashBlock << dummy_vector << nIndex;
    505-        std::vector<CMerkleTx> vUnused; //!< Used to be vtxPrev
    506-        s << vUnused << mapValueCopy << vOrderForm << fTimeReceivedIsTxTime << nTimeReceived << fFromMe << fSpent;
    507+        std::vector<char> dummy_vector1; //!< Used to be vMerkleBranch
    508+        std::vector<char> dummy_vector2; //!< Used to be vtxPrev
    


    promag commented at 11:56 pm on July 24, 2019:

    fad5decb1fdab249f626c6468174c50be15f14cf

    Just to be sure, this means that an empty vector<char> serialized is compatible with the unserialization of vector<CMerkleTx>?

    Edit: nervermind, went and see Unserialize_impl for vectors.


    jnewbery commented at 2:00 pm on July 25, 2019:
    Correct. Empty vector<>s get serialized to a single 0 byte (the compactSize of an empty vector)

    MarcoFalke commented at 2:32 pm on July 25, 2019:

    in commit fad5dec:

    I think you can keep everything in this function as is (except changing the type of the vector). This should simplify review.


    jnewbery commented at 3:01 pm on July 25, 2019:
    See #16451 (comment). I’ll remove everything after the first commit if others agree.
  21. in src/wallet/wallet.h:514 in fad5decb1f outdated
    519-        std::vector<uint256> dummy_vector; //!< Used to be vMerkleBranch
    520-        s >> tx >> hashBlock >> dummy_vector >> nIndex;
    521-        std::vector<CMerkleTx> vUnused; //!< Used to be vtxPrev
    522-        s >> vUnused >> mapValue >> vOrderForm >> fTimeReceivedIsTxTime >> nTimeReceived >> fFromMe >> fSpent;
    523+        std::vector<uint256> dummy_vector1; //!< Used to be vMerkleBranch
    524+        std::vector<CMerkleTx> dummy_vector2; //!< Used to be vtxPrev
    


    promag commented at 0:00 am on July 25, 2019:

    fad5decb1fdab249f626c6468174c50be15f14cf

    Could remove this and inline CMerkleTx::Unserialize, see:

     0--- a/src/wallet/wallet.h
     1+++ b/src/wallet/wallet.h
     2@@ -365,26 +365,6 @@ struct COutputEntry
     3     int vout;
     4 };
     5
     6-/** Legacy class used for deserializing vtxPrev for backwards compatibility.
     7- * vtxPrev was removed in commit 93a18a3650292afbb441a47d1fa1b94aeb0164e3,
     8- * but old wallet.dat files may still contain vtxPrev vectors of CMerkleTxs.
     9- * These need to get deserialized for field alignment when deserializing
    10- * a CWalletTx, but the deserialized values are discarded.**/
    11-class CMerkleTx
    12-{
    13-public:
    14-    template<typename Stream>
    15-    void Unserialize(Stream& s)
    16-    {
    17-        CTransactionRef tx;
    18-        uint256 hashBlock;
    19-        std::vector<uint256> vMerkleBranch;
    20-        int nIndex;
    21-
    22-        s >> tx >> hashBlock >> vMerkleBranch >> nIndex;
    23-    }
    24-};
    25-
    26 //Get the marginal bytes of spending the specified output
    27 int CalculateMaximumSignedInputSize(const CTxOut& txout, const CWallet* pwallet, bool use_max_sig = false);
    28
    29@@ -512,9 +492,17 @@ public:
    30         Init(nullptr);
    31
    32         std::vector<uint256> dummy_vector1; //!< Used to be vMerkleBranch
    33-        std::vector<CMerkleTx> dummy_vector2; //!< Used to be vtxPrev
    34         char dummy_char; //! Used to be fSpent
    35-        s >> tx >> hashBlock >> dummy_vector1 >> nIndex >> dummy_vector2 >> mapValue >> vOrderForm >> fTimeReceivedIsTxTime >> nTimeReceived >> fFromMe >> dummy_char;
    36+        s >> tx >> hashBlock >> dummy_vector1 >> nIndex;
    37+        {
    38+            CTransactionRef tx;
    39+            uint256 hashBlock;
    40+            std::vector<uint256> vMerkleBranch;
    41+            int nIndex;
    42+
    43+            s >> tx >> hashBlock >> vMerkleBranch >> nIndex; //!< Used to be vtxPrev
    44+        }
    45+        s >> mapValue >> vOrderForm >> fTimeReceivedIsTxTime >> nTimeReceived >> fFromMe >> dummy_char;
    46
    47         ReadOrderPos(nOrderPos, mapValue);
    48         nTimeSmart = mapValue.count("timesmart") ? (unsigned int)atoi64(mapValue["timesmart"]) : 0;
    

    And if you do so then I’d squash now that CMerkleTx is actually removed.


    jnewbery commented at 2:00 pm on July 25, 2019:

    This isn’t equivalent. vtxPrev was a vector, so:

    • there can be multiple CMerkleTx objects serialized
    • the serialization is preceded with a compactSize

    I would like to remove CMerkleTx entirely, but implementing the low-lever deserialization logic for a vector of CMerkleTxs here seems like it could be easy to introduce an inconsistency (potentially causing wallets to get corrupted) and it seems wrong to be adding low-level deserialization code to this file.


    promag commented at 3:11 pm on July 25, 2019:
    Ah sorry.. You could read compact size, then loop to read dummy values but really not worth it. You can resolve this.

    jnewbery commented at 3:44 pm on July 25, 2019:

    You could read compact size, then loop to read dummy values but really not worth it.

    Exactly. Seems error-prone and redundant.

  22. promag commented at 0:17 am on July 25, 2019: member

    Concept ACK.

    First a “Remove CMerkleTx” commit that doesn’t actually remove it

    I was going to say the same. BTW nit, add wallet prefix to commit message. @ryanofsky suggestion to keep CMerkleTx just for serialization/unserialization make the code more straightforward and easy to understand than the last commit, IMO.

    Otherwise my comment below has a suggestion to actually drop CMerkleTx, but I think with Russ suggestion for now.

  23. jnewbery commented at 2:26 pm on July 25, 2019: member

    The main thing I don’t like is that READWRITE(tx, hashBlock, vMerkleBranch, nIndex) logic is now repeated in 3 places, rewritten 3 slightly different ways, when it only needs to live in one place.

    CWalletTx already has separate functions for serialize and deserialize, and adding the logic to both places seems to fit with that pattern.

    I much prefer that after this PR I can look at a single function to see how a wallet tx will be serialized or deserialized, rather than seeing the object being static_casted to a CMerkleTx, then (de)serialized, then a vector of CMerkleTxs (de)serialized.

    Longer term, I think it might make sense to move all of the serialization logic for the wallet objects to walletdb.{cpp,h}. There are already some objects that are defined in that header file with (de)serialization functions and deserialization fixups in ReadKeyValue().

    I also don’t understand the division between the two commits. First a “Remove CMerkleTx” commit that doesn’t actually remove it, then a “tidy up” commit with no description that changes more random things?

    I’ll try to improve the commit structure/messages.

    So I don’t get why a simple MOVEONLY commit that just moves all CMerkleTx methods except SerializationOp to CWalletTx isn’t the obvious choice here.

    I can certainly do that. The main aim of this PR is to move the CMerkleTx members into CWalletTx. Flattening the class hierarchy seemed like a good next step, but I can remove it from this PR if that’s what others want.

  24. in src/wallet/wallet.h:555 in fad5decb1f outdated
    512 
    513     template<typename Stream>
    514     void Unserialize(Stream& s)
    515     {
    516         Init(nullptr);
    517-        char fSpent;
    


    MarcoFalke commented at 2:31 pm on July 25, 2019:

    in commit fad5decb1fdab249f626c6468174c:

    I think this function does not need to be changed and can be kept as is.


    jnewbery commented at 3:01 pm on July 25, 2019:
    See #16451 (comment). I’ll remove everything after the first commit if others agree.
  25. MarcoFalke commented at 2:33 pm on July 25, 2019: member
    ACK fad5dec
  26. jnewbery force-pushed on Jul 25, 2019
  27. jnewbery commented at 3:00 pm on July 25, 2019: member

    @ryanofsky

    I also don’t understand the division between the two commits.

    I’ve resplit the changes into three commits:

    1. Moves the all CMerkleTx methods except initialization and serialization into CWalletTx
    2. Moves CMerkleTx data members into CWalletTx and moves the serialization logic into CWalletTx.
    3. Removes CMerkleTx serialization, just leaving CMerkleTx deserialization.

    Only (1) is needed. I think (2) and (3) are nice-to-haves, but can split them into a separate PR if others disagree.

  28. in src/wallet/wallet.h:378 in 6779b7cbde outdated
    408-
    409     ADD_SERIALIZE_METHODS;
    410 
    411     template <typename Stream, typename Operation>
    412     inline void SerializationOp(Stream& s, Operation ser_action) {
    413+        CTransactionRef tx;
    


    promag commented at 3:36 pm on July 25, 2019:

    6779b7cbdeec091bf2c5a704c07a018641625d5f

    Just noting that it’s not really necessary to make these local variables since that unserialized CMerkleTx are discarded anyway. I guess you are doing this to underline the idea, so maybe also add a comment here?


    jnewbery commented at 3:48 pm on July 25, 2019:

    Yeah, it’s equivalent, but I think the intent is clearer this way. Why have data members in a class where any instantiated objects are just going to be thrown away?

    I don’t think an additional comment is necessary here - the code along with the comment on the CMerkleTx class should be clear enough.

  29. promag commented at 3:36 pm on July 25, 2019: member
    ACK 458e43ed7cc3f4ddc196fc4e6e369ef8eba34aef, refactor reviewed, confirmed moved lines. I think 2nd and 3rd commits are fine.
  30. sipa commented at 4:12 pm on July 25, 2019: member

    I couldn’t figure out how to replicate the deserialization code for CMerkleTx in CWalletTx::deserialize() and be sure that I’d replicated it byte-for-byte, so I left that in.

    FWIW, it’s possible to do something like std::vector<std::pair<CTransactionRef, std::pair<uint256, std::pair<...>>>> tmp; ss >> tmp; (with a nesting of pairs correspond to all the fields in the original CMerkleTx).

    Not sure that’s an improvement though.

  31. jnewbery commented at 4:38 pm on July 25, 2019: member

    @sipa

    FWIW, it’s possible to do something like std::vector<std::pair<CTransactionRef, std::pair<uint256, std::pair<…»» tmp; ss » tmp; (with a nesting of pairs correspond to all the fields in the original CMerkleTx).

    Thanks! I’ve implemented that here: https://github.com/jnewbery/bitcoin/commit/efcdb2e02c2c2b1726fc0c973869bb325685a893, which removes CMerkleTx entirely. I don’t feel confident reviewing that it will always deserialize the bytes identically, and I’d want to test with an old wallet file before proposing that we make that change.

  32. jonatack commented at 4:59 pm on July 25, 2019: member
    Light ACK 458e43ed7cc3f4ddc196fc4e6e369ef8eba34aef, code review of the 3 new commits, build with no warnings, all tests pass.
  33. meshcollider commented at 11:02 am on July 27, 2019: contributor

    I’ve reviewed the code and it looks good 458e43ed7cc3f4ddc196fc4e6e369ef8eba34aef

    I have no strong opinion on whether the second and third commits are kept or not, I think the PR is good as-is

  34. DrahtBot added the label Needs rebase on Jul 30, 2019
  35. [wallet] Move CMerkleTx functions into CWalletTx
    CMerkleTx only exists as a base class for CWalletTx and for wallet file
    serialization/deserialization. Move CMerkleTx methods into CWalletTx,
    but leave class hierarchy and serialization logic in place.
    b3a9d179f2
  36. [wallet] Flatten CWalletTx class hierarchy
    Removes CMerkleTx as a base class for CWalletTx. Serialization logic is
    moved from CMerkleTx to CWalletTx.
    783a76f23b
  37. [wallet] Remove CMerkleTx serialization logic
    CMerkleTx is only used for deserialization of old wallet files. Remove
    the serialization logic, and tidy up CWalletTx serialization logic.
    05b56d1c93
  38. jnewbery force-pushed on Jul 30, 2019
  39. jnewbery commented at 3:57 pm on July 30, 2019: member
    rebased
  40. DrahtBot removed the label Needs rebase on Jul 30, 2019
  41. laanwj commented at 6:57 am on July 31, 2019: member

    ACK 05b56d1c937b7667ad51400d2f9fb674af72953f. Looks good to me.

    I can certainly do that. The main aim of this PR is to move the CMerkleTx members into CWalletTx. Flattening the class hierarchy seemed like a good next step

    Having an intermediate base class just for serialization/deserialization is just silly and I’m all for getting rid of that in this PR.

    And I think flattening the class hierarchy when possible is great in general, this simplifies understanding of the code.

    Changing it to a separate “legacy class” solely for de-serializing makes a lot of sense to me, it’s easier to review for correctness/compatibility and also self-documenting, more so than std::pair<CTransactionRef, std::pair<uint256, std::pair<std::vector<uint256>, int>>>.

  42. laanwj merged this on Jul 31, 2019
  43. laanwj closed this on Jul 31, 2019

  44. laanwj referenced this in commit 8241b51504 on Jul 31, 2019
  45. sidhujag referenced this in commit 8e647ab7ee on Aug 1, 2019
  46. furszy referenced this in commit 8a65d7d99a on Feb 19, 2021
  47. Munkybooty referenced this in commit 2bd48bf151 on Nov 17, 2021
  48. Munkybooty referenced this in commit 6cae26ad5e on Nov 17, 2021
  49. Munkybooty referenced this in commit 438010ab74 on Nov 17, 2021
  50. Munkybooty referenced this in commit ae87f054f5 on Nov 18, 2021
  51. Munkybooty referenced this in commit 338e7c21c9 on Nov 24, 2021
  52. Munkybooty referenced this in commit c7e5c321e2 on Nov 30, 2021
  53. Munkybooty referenced this in commit 4d83b241ae on Dec 15, 2021
  54. DrahtBot locked this on Dec 16, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-03 13:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me