refactor: Convert GenTxid to std::variant #32631

pull marcofleon wants to merge 9 commits into bitcoin:master from marcofleon:2025/05/gentxid-variant changing 31 files +298 −269
  1. marcofleon commented at 3:45 pm on May 28, 2025: contributor

    Part of the type safety refactor.

    This PR changes the GenTxid class to a variant, which holds both Txids and Wtxids. This provides compile-time type safety and eliminates the manual type check (bool m_is_wtxid). Variables that can be either a Txid or a Wtxid are now using the new GenTxid variant, instead of uint256.

  2. Implement GenTxid as a variant
    Reimplements the GenTxid class as a variant for better type safety.
    Also adds two temporary functions to the old GenTxid class that
    convert to and from the new variant.
    cbea2a6e49
  3. Convert txmempool `exists` to GenTxidVariant 811d2f34a0
  4. Convert `info` and `info_for_relay` to GenTxidVariant ad6d02720c
  5. Convert `CompareInvMempoolOrder` to GenTxidVariant
    This completes the conversion to the variant in txmempool.
    6598f26cbd
  6. DrahtBot commented at 3:45 pm on May 28, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32631.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK dergoegge, stickies-v, instagibbs

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #32189 (refactor: Txid type safety (parent PR) by marcofleon)
    • #31829 (p2p: improve TxOrphanage denial of service bounds and increase -maxorphantxs by glozow)
    • #30116 (p2p: Fill reconciliation sets (Erlay) attempt 2 by sr-gi)
    • #29641 (scripted-diff: Use LogInfo over LogPrintf [WIP, NOMERGE, DRAFT] by maflcko)
    • #29415 (Broadcast own transactions only via short-lived Tor or I2P connections by vasild)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  7. DrahtBot added the label Refactoring on May 28, 2025
  8. marcofleon commented at 3:46 pm on May 28, 2025: contributor

    In txrequest the ByPeer and ByTxHash indexes still sort announcements by the underlying uint256 txhash only, as opposed to using the GenTxid comparators which sort by type and then hash. The patch below can be used to make sure that there are no accidental switches from uint256 comparisons to GenTxid comparisons.

     0diff --git a/src/net_processing.cpp b/src/net_processing.cpp
     1index 011e3cd928..1a47a39dfd 100644
     2--- a/src/net_processing.cpp
     3+++ b/src/net_processing.cpp
     4@@ -206,6 +206,12 @@ struct QueuedBlock {
     5     std::unique_ptr<PartiallyDownloadedBlock> partialBlock;
     6 };
     7 
     8+struct CompareGenTxid {
     9+    bool operator()(const GenTxid& a, const GenTxid& b) const {
    10+        return std::tuple(a.index(), a.ToUint256()) < std::tuple(b.index(), b.ToUint256());
    11+    }
    12+};
    13+
    14 /**
    15  * Data structure for an individual peer. This struct is not protected by
    16  * cs_main since it does not contain validation-critical data.
    17@@ -302,7 +308,7 @@ struct Peer {
    18          *  non-wtxid-relay peers, wtxid for wtxid-relay peers). We use the
    19          *  mempool to sort transactions in dependency order before relay, so
    20          *  this does not have to be sorted. */
    21-        std::set<GenTxid> m_tx_inventory_to_send GUARDED_BY(m_tx_inventory_mutex);
    22+        std::set<GenTxid, CompareGenTxid> m_tx_inventory_to_send GUARDED_BY(m_tx_inventory_mutex);
    23         /** Whether the peer has requested us to send our complete mempool. Only
    24          *  permitted if the peer has NetPermissionFlags::Mempool or we advertise
    25          *  NODE_BLOOM. See BIP35. */
    26@@ -856,7 +862,7 @@ private:
    27     std::shared_ptr<const CBlock> m_most_recent_block GUARDED_BY(m_most_recent_block_mutex);
    28     std::shared_ptr<const CBlockHeaderAndShortTxIDs> m_most_recent_compact_block GUARDED_BY(m_most_recent_block_mutex);
    29     uint256 m_most_recent_block_hash GUARDED_BY(m_most_recent_block_mutex);
    30-    std::unique_ptr<const std::map<GenTxid, CTransactionRef>> m_most_recent_block_txs GUARDED_BY(m_most_recent_block_mutex);
    31+    std::unique_ptr<const std::map<GenTxid, CTransactionRef, CompareGenTxid>> m_most_recent_block_txs GUARDED_BY(m_most_recent_block_mutex);
    32 
    33     // Data about the low-work headers synchronization, aggregated from all peers' HeadersSyncStates.
    34     /** Mutex guarding the other m_headers_presync_* variables. */
    35@@ -2027,7 +2033,7 @@ void PeerManagerImpl::NewPoWValidBlock(const CBlockIndex *pindex, const std::sha
    36         std::async(std::launch::deferred, [&] { return NetMsg::Make(NetMsgType::CMPCTBLOCK, *pcmpctblock); })};
    37 
    38     {
    39-        auto most_recent_block_txs = std::make_unique<std::map<GenTxid, CTransactionRef>>();
    40+        auto most_recent_block_txs = std::make_unique<std::map<GenTxid, CTransactionRef, CompareGenTxid>>();
    41         for (const auto& tx : pblock->vtx) {
    42             most_recent_block_txs->emplace(tx->GetHash(), tx);
    43             most_recent_block_txs->emplace(tx->GetWitnessHash(), tx);
    44diff --git a/src/util/transaction_identifier.h b/src/util/transaction_identifier.h
    45index 3cdeb664d4..6ab5d98263 100644
    46--- a/src/util/transaction_identifier.h
    47+++ b/src/util/transaction_identifier.h
    48@@ -88,13 +88,12 @@ public:
    49         return std::visit([](const auto& id) -> const uint256& { return id.ToUint256(); }, *this);
    50     }
    51 
    52-    friend bool operator==(const GenTxid& a, const GenTxid& b) {
    53-        return a.index() == b.index() && a.ToUint256() == b.ToUint256();
    54-    }
    55-
    56-    friend bool operator<(const GenTxid& a, const GenTxid& b) {
    57-        return std::tuple(a.index(), a.ToUint256()) < std::tuple(b.index(), b.ToUint256());
    58-    }
    59+    friend bool operator==(const GenTxid& a, const GenTxid& b) = delete;
    60+    friend bool operator!=(const GenTxid& a, const GenTxid& b) = delete;
    61+    friend bool operator<(const GenTxid& a, const GenTxid& b) = delete;
    62+    friend bool operator<=(const GenTxid& a, const GenTxid& b) = delete;
    63+    friend bool operator>(const GenTxid& a, const GenTxid& b) = delete;
    64+    friend bool operator>=(const GenTxid& a, const GenTxid& b) = delete;
    65 };
    66 
    67 #endif // BITCOIN_UTIL_TRANSACTION_IDENTIFIER_H
    68 
    

    Then run cmake --build build --target bitcoind. To check all tests as well, comment out the unit and fuzz tests for txrequest, as these tests directly compare GenTxid objects.

  9. Convert `txdownloadman_impl` to GenTxidVariant
    Convert all of `txdownloadman_impl` to the new variant except for
    `GetRequestsToSend`, which will be easier to switch at the same
    time as `txrequest`.
    a954a49fdf
  10. Convert `txrequest` to GenTxidVariant
    Switch all instances of GenTxid to the new variant
    in `txrequest` and complete `txdownloadman_impl` by
    converting `GetRequestsToSend`.
    d40849859c
  11. Convert remaining instances of GenTxid to GenTxidVariant 772e9a47b9
  12. Remove old GenTxid class c9888c5b73
  13. scripted-diff: Replace GenTxidVariant with GenTxid
    -BEGIN VERIFY SCRIPT-
    sed -i 's/GenTxidVariant/GenTxid/g' $(git grep -l 'GenTxidVariant')
    -END VERIFY SCRIPT-
    98358c4d85
  14. in src/test/fuzz/txdownloadman.cpp:384 in e66a5540e2 outdated
    382+                GenTxid gtxid;
    383+                if (fuzzed_data_provider.ConsumeBool()) {
    384+                    rand_tx->GetHash();
    385+                } else {
    386+                    rand_tx->GetWitnessHash();
    387+                }
    


    dergoegge commented at 4:13 pm on May 28, 2025:
    The return values for GetHash and GetWitnessHash need to be assigned to gtxid
  15. marcofleon force-pushed on May 28, 2025
  16. in src/net_processing.cpp:5820 in 98358c4d85
    5818+                        std::set<GenTxid>::iterator it = vInvTx.back();
    5819                         vInvTx.pop_back();
    5820-                        uint256 hash = *it;
    5821-                        CInv inv(peer->m_wtxid_relay ? MSG_WTX : MSG_TX, hash);
    5822+                        GenTxid hash = *it;
    5823+                        CInv inv(peer->m_wtxid_relay ? MSG_WTX : MSG_TX, hash.ToUint256());
    


    dergoegge commented at 4:19 pm on May 28, 2025:
    0                        Assume(peer->m_wtxid_relay == std::holds_alternative<Wtxid>(hash));
    1                        CInv inv(peer->m_wtxid_relay ? MSG_WTX : MSG_TX, hash.ToUint256());
    
  17. in src/node/txdownloadman_impl.cpp:380 in 98358c4d85
    377@@ -378,27 +378,27 @@ node::RejectedTxTodo TxDownloadManagerImpl::MempoolRejectedTx(const CTransaction
    378             // We can tolerate having up to 1 parent in m_lazy_recent_rejects_reconsiderable since we
    379             // submit 1p1c packages. However, fail immediately if any are in m_lazy_recent_rejects.
    380             std::optional<uint256> rejected_parent_reconsiderable;
    


    dergoegge commented at 4:26 pm on May 28, 2025:

    not that it really matters but could do

    0            std::optional<Txid> rejected_parent_reconsiderable;
    
  18. in src/util/transaction_identifier.h:97 in 98358c4d85
    92+        return a.index() == b.index() && a.ToUint256() == b.ToUint256();
    93+    }
    94+
    95+    friend bool operator<(const GenTxid& a, const GenTxid& b) {
    96+        return std::tuple(a.index(), a.ToUint256()) < std::tuple(b.index(), b.ToUint256());
    97+    }
    


    stickies-v commented at 9:12 am on May 29, 2025:

    nit: can also be implemented as a single three-way operator:

    0    friend auto operator<=>(const GenTxid& a, const GenTxid& b) {
    1        return std::tuple(a.index(), a.ToUint256()) <=> std::tuple(b.index(), b.ToUint256());
    2    }
    

    (needs to include <compare> and <tuple>)

  19. in src/txmempool.h:656 in 98358c4d85
    655-        return (mapTx.count(gtxid.GetHash()) != 0);
    656+        return std::visit(util::Overloaded{
    657+            [this](const Wtxid& wtxid) EXCLUSIVE_LOCKS_REQUIRED(cs) { return (mapTx.get<index_by_wtxid>().count(wtxid) != 0); },
    658+            [this](const Txid& txid) EXCLUSIVE_LOCKS_REQUIRED(cs) { return (mapTx.count(txid) != 0); }
    659+        }, gtxid);
    660     }
    


    stickies-v commented at 9:51 am on May 29, 2025:

    In 811d2f34a0309a62a69897a15db47214bbfec776:

    Isn’t it more straightforward to just to have two bool exists(const Txid& txid) and bool exists(const Wtxid& wtxid) overloads?

  20. dergoegge commented at 9:52 am on May 29, 2025: member

    Concept ACK

    I’m not sure if the intermediate commits make things better here. Looking at the final diff seems simple enough.

  21. in src/txmempool.cpp:907 in 98358c4d85
    900@@ -893,7 +901,11 @@ CTransactionRef CTxMemPool::get(const uint256& hash) const
    901 TxMempoolInfo CTxMemPool::info(const GenTxid& gtxid) const
    902 {
    903     LOCK(cs);
    904-    indexed_transaction_set::const_iterator i = (gtxid.IsWtxid() ? get_iter_from_wtxid(gtxid.GetHash()) : mapTx.find(gtxid.GetHash()));
    905+    indexed_transaction_set::const_iterator i = std::visit(util::Overloaded{
    906+        [this](const Wtxid& wtxid) EXCLUSIVE_LOCKS_REQUIRED(cs) { return get_iter_from_wtxid(wtxid); },
    907+        [this](const Txid& txid) EXCLUSIVE_LOCKS_REQUIRED(cs) { return mapTx.find(txid); }
    908+    }, gtxid);
    


    stickies-v commented at 10:23 am on May 29, 2025:

    Similarly to my comment about TxMempool::exists(), I think it would be better here to just overload info:

     0diff --git a/src/net_processing.cpp b/src/net_processing.cpp
     1index c0f6ac3f03..c34511d27d 100644
     2--- a/src/net_processing.cpp
     3+++ b/src/net_processing.cpp
     4@@ -5821,7 +5821,7 @@ bool PeerManagerImpl::SendMessages(CNode* pto)
     5                             continue;
     6                         }
     7                         // Not in the mempool anymore? don't bother sending it.
     8-                        auto txinfo = m_mempool.info(ToGenTxid(inv).ToVariant());
     9+                        auto txinfo{std::visit([&](const auto& tx) { return m_mempool.info(tx); }, ToGenTxid(inv).ToVariant())};
    10                         if (!txinfo.tx) {
    11                             continue;
    12                         }
    13diff --git a/src/txmempool.cpp b/src/txmempool.cpp
    14index 4153794218..a60c692e84 100644
    15--- a/src/txmempool.cpp
    16+++ b/src/txmempool.cpp
    17@@ -890,17 +890,18 @@ CTransactionRef CTxMemPool::get(const uint256& hash) const
    18     return i->GetSharedTx();
    19 }
    20 
    21-TxMempoolInfo CTxMemPool::info(const GenTxidVariant& gtxid) const
    22+TxMempoolInfo CTxMemPool::info(const Txid& txid) const
    23 {
    24     LOCK(cs);
    25-    indexed_transaction_set::const_iterator i = std::visit(util::Overloaded{
    26-        [this](const Wtxid& wtxid) EXCLUSIVE_LOCKS_REQUIRED(cs) { return get_iter_from_wtxid(wtxid); },
    27-        [this](const Txid& txid) EXCLUSIVE_LOCKS_REQUIRED(cs) { return mapTx.find(txid); }
    28-    }, gtxid);
    29+    indexed_transaction_set::const_iterator i{mapTx.find(txid)};
    30+    return i == mapTx.end() ? TxMempoolInfo{} : GetInfo(i);
    31+}
    32 
    33-    if (i == mapTx.end())
    34-        return TxMempoolInfo();
    35-    return GetInfo(i);
    36+TxMempoolInfo CTxMemPool::info(const Wtxid& wtxid) const
    37+{
    38+    LOCK(cs);
    39+    indexed_transaction_set::const_iterator i{get_iter_from_wtxid(wtxid)};
    40+    return i == mapTx.end() ? TxMempoolInfo{} : GetInfo(i);
    41 }
    42 
    43 TxMempoolInfo CTxMemPool::info_for_relay(const GenTxidVariant& gtxid, uint64_t last_sequence) const
    44diff --git a/src/txmempool.h b/src/txmempool.h
    45index f96434a344..5421f029bb 100644
    46--- a/src/txmempool.h
    47+++ b/src/txmempool.h
    48@@ -663,7 +663,8 @@ public:
    49         AssertLockHeld(cs);
    50         return mapTx.project<0>(mapTx.get<index_by_wtxid>().find(wtxid));
    51     }
    52-    TxMempoolInfo info(const GenTxidVariant& gtxid) const;
    53+    TxMempoolInfo info(const Txid& txid) const;
    54+    TxMempoolInfo info(const Wtxid& wtxid) const;
    55 
    56     /** Returns info for a transaction if its entry_sequence < last_sequence */
    57     TxMempoolInfo info_for_relay(const GenTxidVariant& gtxid, uint64_t last_sequence) const;
    
  22. stickies-v commented at 10:31 am on May 29, 2025: contributor

    Concept ACK, the new GenTxid allows us to better benefit from the type safety of Txid and Wtxid and this PR takes a clear and structured approach that seems straightforward to review.

    Approach-wise, I think I have a slightly different view. I find code generally easier to understand when we use the underlying types {Txid, Wtxid} directly and as such would prefer functions to be overloaded with those types, pushing the handling of variants to the edge as much as possible. This also has the side benefit of gently nudging users of those functions to not using variants if they don’t have to. I have left 2 such comments on the first commits, but I suspect it applies to later commits too (did not investigate that in detail yet). This is largely a style preference, and I don’t want to block the PRs progress over it if other reviewers disagree.

  23. instagibbs commented at 4:42 pm on May 29, 2025: member
    concept ACK

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-05-30 00:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me