p2p: Fill reconciliation sets (Erlay)

naumenkogs commented at 9:06 am on November 1, 2023: member

Keep track of per-peer reconciliation sets containing transactions to be exchanged efficiently. The remaining transactions are announced via usual flooding.

Erlay Project Tracking: #28646

DrahtBot commented at 9:06 am on November 1, 2023: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type	Reviewers
Concept ACK	brunoerg
Stale ACK	sr-gi, mzumsande

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

No conflicts as of last run.

DrahtBot added the label P2P on Nov 1, 2023

in src/net_processing.cpp:5762 in b07029a67c outdated

5758@@ -5752,9 +5759,12 @@ bool PeerManagerImpl::SendMessages(CNode* pto)
5759         }
5760 
5761         if (auto tx_relay = peer->GetTxRelay(); tx_relay != nullptr) {
5762+                // Lock way before it's used to maintain lock ordering.

mzumsande commented at 8:27 pm on November 1, 2023:

Could you explain a bit more why this is necessary, what is the lock order that would get violated if we did the locking later (just here, not necessarily in the comment)?

naumenkogs commented at 7:58 am on November 2, 2023:

I tried a whole bunch of combinations. Say, you move the LOCK(m_peer_mutex); to L5831, where m_peer_map is used.

Then you get something like this in Cirrus (not exactly!).

 0 node0 2023-08-17T12:01:53.647510Z [msghand] [sync.cpp:97] [potential_deadlock_detected] POTENTIAL DEADLOCK DETECTED 
 1 node0 2023-08-17T12:01:53.647516Z [msghand] [sync.cpp:98] [potential_deadlock_detected] Previous lock order was: 
 2 node0 2023-08-17T12:01:53.647523Z [msghand] [sync.cpp:107] [potential_deadlock_detected]  'NetEventsInterface::g_msgproc_mutex' in net.cpp:2095 (in thread 'msghand') 
 3 node0 2023-08-17T12:01:53.647531Z [msghand] [sync.cpp:107] [potential_deadlock_detected]  'cs_main' in net_processing.cpp:5473 (in thread 'msghand') 
 4 node0 2023-08-17T12:01:53.647574Z [msghand] [sync.cpp:107] [potential_deadlock_detected]  (2) 'm_peer_mutex' in net_processing.cpp:5686 (in thread 'msghand') 
 5 node0 2023-08-17T12:01:53.647581Z [msghand] [sync.cpp:107] [potential_deadlock_detected]  'tx_relay->m_tx_inventory_mutex' in net_processing.cpp:5688 (in thread 'msghand') 
 6 node0 2023-08-17T12:01:53.647587Z [msghand] [sync.cpp:107] [potential_deadlock_detected]  'tx_relay->m_bloom_filter_mutex' in net_processing.cpp:5768 (in thread 'msghand') 
 7 node0 2023-08-17T12:01:53.647593Z [msghand] [sync.cpp:107] [potential_deadlock_detected]  (1) 'm_mempool.cs' in net_processing.cpp:5850 (in thread 'msghand') 
 8 node0 2023-08-17T12:01:53.647598Z [msghand] [sync.cpp:111] [potential_deadlock_detected] Current lock order is: 
 9 node0 2023-08-17T12:01:53.647604Z [msghand] [sync.cpp:122] [potential_deadlock_detected]  'NetEventsInterface::g_msgproc_mutex' in net.cpp:2095 (in thread 'msghand') 
10 node0 2023-08-17T12:01:53.647610Z [msghand] [sync.cpp:122] [potential_deadlock_detected]  'm_chainstate_mutex' in validation.cpp:3102 (in thread 'msghand') 
11 node0 2023-08-17T12:01:53.647616Z [msghand] [sync.cpp:122] [potential_deadlock_detected]  'cs_main' in validation.cpp:3124 (in thread 'msghand') 
12 node0 2023-08-17T12:01:53.647622Z [msghand] [sync.cpp:122] [potential_deadlock_detected]  (1) 'MempoolMutex()' in validation.cpp:3126 (in thread 'msghand') 
13 node0 2023-08-17T12:01:53.647627Z [msghand] [sync.cpp:122] [potential_deadlock_detected]  'cs_main' in net_processing.cpp:2013 (in thread 'msghand') 
14 node0 2023-08-17T12:01:53.647633Z [msghand] [sync.cpp:122] [potential_deadlock_detected]  (2) 'm_peer_mutex' in net_processing.cpp:1593 (in thread 'msghand')

From this log you see that m_peer_mutex should go before m_mempool.cs. I admit it might be not the 100% optimal placement, feels NP-hard to me :) Do you know how to approach this better?

mzumsande commented at 4:02 pm on November 21, 2023:

no, I don’t know a better approach, but thanks for the explanation!

in src/test/txreconciliation_tests.cpp:144 in b07029a67c outdated

139+                                               /*inbounds_nonrcncl_tx_relay=*/0, /*outbounds_nonrcncl_tx_relay=*/0);
140+        }
141+        BOOST_CHECK_EQUAL(total_fanouted, 3);
142+    }
143+
144+    // // Don't relay if there is sufficient non-reconciling peers

mzumsande commented at 8:35 pm on November 1, 2023:

nit: remove one //

in src/node/txreconciliation.cpp:214 in b07029a67c outdated

209@@ -193,6 +210,104 @@ class TxReconciliationTracker::Impl
210         LOCK(m_txreconciliation_mutex);
211         return IsPeerRegistered(peer_id);
212     }
213+
214+    std::vector<NodeId> GetFanoutTargets(CSipHasher& deterministic_randomizer_with_wtxid,

mzumsande commented at 8:36 pm on November 1, 2023:

I wonder if this algorithm (which took me a while to fully understand) could be simpler. E.g., if we used a sorted container for best_peers instead of a vector, inserted all of the peers, and then finally return the first targets elements of that container, I think we could do without the try_fanout_candidate lambda. Or would that be incorrect / less performant?

I’m thinking of something like the following (just to show idea, I didn’t test it):

struct ComparePairs {
    bool operator()(const std::pair<uint64_t, NodeId>& left, const std::pair<uint64_t, NodeId>& right) const {
        return left.first > right.first;
    }
};
std::vector<NodeId> GetFanoutTargets(CSipHasher& deterministic_randomizer_with_wtxid,
                                     bool we_initiate, double limit) const EXCLUSIVE_LOCKS_REQUIRED(m_txreconciliation_mutex)
{
    // The algorithm works as follows. We iterate through the peers (of a given direction)
    // hashing them with the given wtxid, and sort them by this hash.
    // We then consider top `limit` peers to be low-fanout flood targets.
    // The randomness should be seeded with wtxid to return consistent results for every call.

    double integer_part;
    double fractional_peer = std::modf(limit, &integer_part);
    const bool drop_peer_if_extra = deterministic_randomizer_with_wtxid.Finalize() > fractional_peer * double(UINT64_MAX);
    const size_t targets = drop_peer_if_extra ? size_t(integer_part): size_t(integer_part) + 1;

    std::set<std::pair<uint64_t, NodeId>, ComparePairs> best_peers;

    for (auto indexed_state : m_states) {
        const auto cur_state = std::get_if<TxReconciliationState>(&indexed_state.second);
        if (cur_state && cur_state->m_we_initiate == we_initiate) {
            uint64_t hash_key = deterministic_randomizer_with_wtxid.Write(cur_state->m_k0).Finalize();
            best_peers.insert(std::make_pair(hash_key, indexed_state.first));
        }
    }

    std::vector<NodeId> result;
    auto it = best_peers.begin();
    for (size_t i = 0; i < targets && it != best_peers.end(); ++i, ++it) {
        result.push_back(it->second);
    }
    return result;
}

naumenkogs commented at 8:06 am on November 2, 2023:

This is certainly better. For no good reason, i just chose to follow a pattern we use elsewhere and never reconsidered it. I will take this code.

in src/node/txreconciliation.cpp:227 in b07029a67c outdated

222+        // To handle fractional values, we add one peer optimistically and then probabilistically
223+        // drop it later.
224+        double integer_part;
225+        double fractional_peer = std::modf(limit, &integer_part);
226+        const size_t targets = size_t(integer_part) + 1;
227+        const bool drop_peer_if_extra = deterministic_randomizer_with_wtxid.Finalize() > fractional_peer * double(UINT64_MAX);

mzumsande commented at 8:39 pm on November 1, 2023:

Do we have to first add and then (maybe) drop a peer here (instead of determining how many peers we want at the beginning, and then getting as many peers as we can up the desired number).

in src/test/txreconciliation_tests.cpp:192 in b07029a67c outdated

121+    for (int i = 0; i < 100; ++i) {
122+        BOOST_CHECK(tracker.ShouldFanoutTo(GetRandHash(), hasher, peer_id0,
123+                                           /*inbounds_nonrcncl_tx_relay=*/0, /*outbounds_nonrcncl_tx_relay=*/0));
124+    }
125+
126+    // Now for inbound connections.

mzumsande commented at 8:43 pm on November 1, 2023:

10% of 30 is integer, so maybe also add an example with a fraction. If we run it often enough, we could probably assert that two values for total_fanouted are possible.

naumenkogs commented at 8:32 am on November 2, 2023:

I’m always not sure what to do with these kinds of probabilistic scenarios… Say you run 1000 experiments, and get 1000/0, so the assert fails. Is 1,000,000 sufficient in that case? Or how do you think this should be asserted otherwise.

sipa commented at 3:06 pm on January 10, 2024:

In commit “p2p: Add transactions to reconciliation set”

You could pick a fixed seed and wtxid for which you assert that with say 35 peers 3 are picked, and another seed/wtxid for which you assert that with 35 peers 4 are picked. That would at least show that the fractional probability code isn’t just rounding down.

in src/test/txreconciliation_tests.cpp:130 in b07029a67c outdated

125+
126+    // Now for inbound connections.
127+    for (int i = 1; i < 31; ++i) {
128+        tracker.PreRegisterPeer(i);
129+        BOOST_REQUIRE_EQUAL(tracker.RegisterPeer(i, /*is_peer_inbound=*/true, 1, 1), ReconciliationRegisterResult::SUCCESS);
130+

mzumsande commented at 8:43 pm on November 1, 2023:

nit: remove empty line

mzumsande commented at 8:45 pm on November 1, 2023: contributor

Approach ACK

naumenkogs force-pushed on Nov 2, 2023

brunoerg commented at 1:16 pm on November 2, 2023: contributor

Concept ACK

in src/net_processing.cpp:5892 in 983f8c6e30 outdated

5888+                                    // child we need to to check if any of the parents is currently
5889+                                    // reconciled so that the child isn't fanouted ahead. But then
5890+                                    // it gets tricky when reconciliation sets are full: a) the child
5891+                                    // can't just be added; b) removing parents from reconciliation
5892+                                    // sets for this one child is not good either.
5893+                                    fanout = true;

brunoerg commented at 1:28 pm on November 2, 2023:

In 983f8c6e305fd9707c109c2a92637825262b9b09: Since fanout is already initialized true, couldn’t we simplify it?

 0@@ -5878,19 +5878,17 @@ bool PeerManagerImpl::SendMessages(CNode* pto)
 1                         if (reconciles_txs) {
 2                             auto txiter = m_mempool.GetIter(txinfo.tx->GetHash());
 3                             if (txiter) {
 4-                                if ((*txiter)->GetCountWithDescendants() > 1) {
 5-                                    // If a transaction has in-mempool children, always fanout it.
 6-                                    // Until package relay is implemented, this is needed to avoid
 7-                                    // breaking parent+child relay expectations in some cases.
 8-                                    //
 9-                                    // Potentially reconciling parent+child would mean that for every
10-                                    // child we need to to check if any of the parents is currently
11-                                    // reconciled so that the child isn't fanouted ahead. But then
12-                                    // it gets tricky when reconciliation sets are full: a) the child
13-                                    // can't just be added; b) removing parents from reconciliation
14-                                    // sets for this one child is not good either.
15-                                    fanout = true;
16-                                } else {
17+                                // If a transaction has in-mempool children, always fanout it.
18+                                // Until package relay is implemented, this is needed to avoid
19+                                // breaking parent+child relay expectations in some cases.
20+                                //
21+                                // Potentially reconciling parent+child would mean that for every
22+                                // child we need to to check if any of the parents is currently
23+                                // reconciled so that the child isn't fanouted ahead. But then
24+                                // it gets tricky when reconciliation sets are full: a) the child
25+                                // can't just be added; b) removing parents from reconciliation
26+                                // sets for this one child is not good either.
27+                                if ((*txiter)->GetCountWithDescendants() <= 1) {
28                                     auto fanout_randomizer = m_connman.GetDeterministicRandomizer(RANDOMIZER_ID_FANOUTTARGET);
29                                     fanout = m_txreconciliat
30ion->ShouldFanoutTo(wtxid, fanout_randomizer, pto->GetId(),
31                                                             
32                    inbounds_nonrcncl_tx_relay, outbounds_non
33rcncl_tx_relay);

in src/net_processing.cpp:5829 in 983f8c6e30 outdated

5823@@ -5814,6 +5824,28 @@ bool PeerManagerImpl::SendMessages(CNode* pto)
5824                     // No reason to drain out at many times the network's capacity,
5825                     // especially since we have many peers and some will draw much shorter delays.
5826                     unsigned int nRelayedTransactions = 0;
5827+
5828+                    size_t inbounds_nonrcncl_tx_relay = 0, outbounds_nonrcncl_tx_relay = 0;
5829+                    if (m_txreconciliation) {

brunoerg commented at 2:16 pm on November 2, 2023:

In 983f8c6e305fd9707c109c2a92637825262b9b09: Correct me if I’m wrong, but we’re going use inbounds_nonrcncl_tx_relay and outbounds_nonrcncl_tx_relay only whether reconciles_txs is true, couldn’t we only fill them if so?:

0                    if (reconciles_txs) {

naumenkogs force-pushed on Nov 7, 2023

in src/test/txreconciliation_tests.cpp:149 in 2af4d12410 outdated

80@@ -81,4 +81,75 @@ BOOST_AUTO_TEST_CASE(IsPeerRegisteredTest)
81     BOOST_CHECK(!tracker.IsPeerRegistered(peer_id0));
82 }
83 
84+BOOST_AUTO_TEST_CASE(ShouldFanoutToTest)

ariard commented at 6:51 pm on November 12, 2023:

I think a test can be added to exercise AddToSet and the check of the boundary MAX_SET_SIZE value.

in src/node/txreconciliation.cpp:177 in 7c4f8d9956 outdated

149+        // (2) really a lot of valid fee-paying transactions were dumped on us at once.
150+        // We don't care about a laggy peer (1) because we probably can't help them even if we fanout transactions.
151+        // However, exploiting (2) should not prevent us from relaying certain transactions.
152+        //
153+        // Transactions which don't make it to the set due to the limit are announced via fan-out.
154+        if (recon_state.m_local_set.size() >= MAX_SET_SIZE) return false;

ariard commented at 6:53 pm on November 12, 2023:

I think can be > only. Generally sounds maximum in net_processing is understood as strictly superior, e.g MAX_INV_SZ usage.

naumenkogs commented at 10:50 am on November 15, 2023:

This happens before the addition. Your suggestion would mean that the size could be 3001, which is not desired.

naumenkogs force-pushed on Nov 15, 2023

DrahtBot added the label CI failed on Nov 15, 2023

DrahtBot commented at 11:43 am on November 16, 2023: contributor

0node/txreconciliation.cpp:173 AddToSet: Assertion `recon_state.m_local_set.insert(wtxid).second' failed.

naumenkogs force-pushed on Nov 17, 2023

in src/node/txreconciliation.h:81 in 6977a5a156 outdated

73@@ -74,6 +74,22 @@ class TxReconciliationTracker
74     ReconciliationRegisterResult RegisterPeer(NodeId peer_id, bool is_peer_inbound,
75                                               uint32_t peer_recon_version, uint64_t remote_salt);
76 
77+    /**
78+     * Step 1. Add a new transaction we want to announce to the peer to the local reconciliation set
79+     * of the peer, so that it will be reconciled later, unless the set limit is reached.
80+     * Returns whether it was added.
81+     */

sr-gi commented at 9:47 pm on November 17, 2023:

Given the docs on {Pre}RegisterPeer and the last paragraph on TryRemovingFromSet, shouldn’t this also advise the caller to make sure the peer is registered?

naumenkogs commented at 7:57 am on November 23, 2023:

I think i’d rather drop it from TryRemovingFromSet. It’s not that strong anymore anyway, doesn’t help much and pretty obvious i think. Let me know if you think differently.

in src/node/txreconciliation.cpp:160 in 50dc9bf216 outdated

155+        AssertLockNotHeld(m_txreconciliation_mutex);
156+        LOCK(m_txreconciliation_mutex);
157+        if (!IsPeerRegistered(peer_id)) return false;
158+        auto& recon_state = std::get<TxReconciliationState>(m_states.find(peer_id)->second);
159+
160+        // Check if reconciliation set is not at capacity for two reasons:

sr-gi commented at 4:39 pm on November 20, 2023:

nit: the reconciliation set

naumenkogs commented at 7:58 am on November 23, 2023:

took this, but generally i think we have these mistakes all over the code and it’s fine :)

in src/node/txreconciliation.cpp:213 in 50dc9bf216 outdated

172+
173+        Assume(recon_state.m_local_set.insert(wtxid).second);
174+        LogPrintLevel(BCLog::TXRECONCILIATION, BCLog::Level::Debug, "Added %s to the reconciliation set for peer=%d. " /* Continued */
175+                                                                    "Now the set contains %i transactions.\n",
176+                      wtxid.ToString(), peer_id, recon_state.m_local_set.size());
177+        return true;

sr-gi commented at 7:34 pm on November 20, 2023:

I think this is a bit counter-intuitive. If the transaction to be added is already in the set, we will treat this as if it was added when that is actually not the case.

I guess the reasoning is that there should be no way of adding the same transaction more than once, so false is being used to signal failures. However, this seems like a way to potentially shoot ourselves in the foot.

If we want to keep the logic as is, I’d suggest that we at least mention this in the functions docs, given it currently reads:

0[...]
1* Returns whether it was added.

naumenkogs commented at 8:03 am on November 23, 2023:

You’re right, i shall think how to make it better.

naumenkogs commented at 8:53 am on November 24, 2023:

Improved the documentation around it a bit.

in src/node/txreconciliation.cpp:92 in 50dc9bf216 outdated

84@@ -77,6 +85,11 @@ class TxReconciliationTracker::Impl
85 private:
86     mutable Mutex m_txreconciliation_mutex;
87 
88+    /**
89+     * ReconciliationTracker-wide randomness to choose fanout targets for a given txid.
90+     */
91+    const SaltedTxidHasher m_txid_hasher;
92+

sr-gi commented at 8:13 pm on November 20, 2023:

Is this being used?

naumenkogs commented at 8:04 am on November 23, 2023:

no, it’s legacy i guess. wondering why the compiler haven’t found it.

naumenkogs commented at 8:04 am on November 23, 2023:

i think it’s from the latter commits. can be dropped here.

in src/node/txreconciliation.cpp:223 in 50dc9bf216 outdated

218+        // hashing them with the given wtxid, and sort them by this hash.
219+        // We then consider top `limit` peers to be low-fanout flood targets.
220+        // The randomness should be seeded with wtxid to return consistent results for every call.
221+
222+        double integer_part;
223+        double fractional_peer = std::modf(limit, &integer_part);

sr-gi commented at 8:17 pm on November 20, 2023:

I’m guessing this is supposed to be integral_part and fractional_part (?)

source: https://en.cppreference.com/w/cpp/numeric/math/modf

in src/node/txreconciliation.cpp:226 in 50dc9bf216 outdated

221+
222+        double integer_part;
223+        double fractional_peer = std::modf(limit, &integer_part);
224+        // Handle fractional value.
225+        const bool add_extra = deterministic_randomizer_with_wtxid.Finalize() > fractional_peer * double(UINT64_MAX);
226+        const size_t targets = add_extra ? size_t(integer_part): size_t(integer_part) + 1;

sr-gi commented at 8:22 pm on November 20, 2023:

Shouldn’t this need to be the other way around?

Maybe I’m confused by the name of the variables, but it seems like if add_extra is true, then you should do x+1

sr-gi commented at 9:10 pm on November 20, 2023:

nit: also, tagets_size may be a better name, in lieu of limit which is already being used, given this does not really refer to the targets themselves, but to the size of the returned targets collection

naumenkogs commented at 10:36 am on November 23, 2023:

flipped the < and flipped the ternary conditions. The behavior remains the same. Indeed, it was double-upside-down before.

in src/node/txreconciliation.cpp:1 in 50dc9bf216

sr-gi commented at 8:44 pm on November 20, 2023:

The TODOs for the pubic fields referring to m_we_initiate and m_k0 should be fixable at this point, given they are both used by GetFanoutTargets, shouldn’t they?

naumenkogs commented at 10:44 am on November 23, 2023:

I think this stuff remained from the legacy code where its use was encapsulated…. just deleting the comments.

in src/node/txreconciliation.cpp:276 in 50dc9bf216 outdated

281+            // number of inbound targets.
282+            const double inbound_targets = (inbounds_nonrcncl_tx_relay + inbound_rcncl_peers) * INBOUND_FANOUT_DESTINATIONS_FRACTION;
283+            destinations = inbound_targets - inbounds_nonrcncl_tx_relay;
284+        }
285+
286+        if (destinations < 0.01) {

sr-gi commented at 9:33 pm on November 20, 2023:

Where does this value come from? If we are going to use an arbitrary number, shouldn’t we at least make it constant and add some reasoning?

naumenkogs commented at 10:45 am on November 23, 2023:

eh, it’s just a workaround to handle almost-0-fractional-value C++ thing i guess. Might as well be 0.0001 or 0.000000001. Not sure how to make it beautiful.

sr-gi commented at 3:33 pm on November 27, 2023:

Right, got you. I guess we can also ask this the other way around to see if there is a cleaner workaround. In what case is destinations smaller than 1 but we still want to pass it to GetFanoutTargets?

naumenkogs commented at 8:17 am on December 1, 2023:

Say we have 5 inbound Erlay peers and 0 inbound legacy peers. inbound_targets = (5+0) * 0.1 = 0.5 destinations = 0.5 - 0 = 0.5

It just means that we will take 1 inbound peer for fanout with a 50% chance.

I guess it will work just fine if i drop the 0.01 (1% or less chance) check. It’s kinda unfortunate we have to do the compute if chance is that small. Fine with me either way, let me know what you think.

sr-gi commented at 3:24 pm on December 1, 2023:

Just adding a comment with some context, saying that for chances lower than 1% it may not be worth the computation may be fine. It just struck me as weird where that value came from.

in src/node/txreconciliation.cpp:259 in 50dc9bf216 outdated

254+        AssertLockNotHeld(m_txreconciliation_mutex);
255+        LOCK(m_txreconciliation_mutex);
256+        if (!IsPeerRegistered(peer_id)) return true;
257+        // We use the pre-determined randomness to give a consistent result per transaction,
258+        // thus making sure that no transaction gets "unlucky" if every per-peer roll fails.
259+        deterministic_randomizer.Write(wtxid.GetUint64(0));

sr-gi commented at 10:07 pm on November 20, 2023:

Shouldn’t we move this a bit down the method given it may not even be used if we return early? (such as is the destinations are too small)

sr-gi commented at 8:11 pm on November 21, 2023:

Either that, or seed it outside the method so you don’t even need to pass the wtxid in, is writing to the randomizer is cheap enough.

in src/net_processing.cpp:5766 in 50dc9bf216 outdated

5757@@ -5751,9 +5758,12 @@ bool PeerManagerImpl::SendMessages(CNode* pto)
5758         }
5759 
5760         if (auto tx_relay = peer->GetTxRelay(); tx_relay != nullptr) {
5761+                // Lock way before it's used to maintain lock ordering.
5762+                LOCK2(m_mempool.cs, m_peer_mutex);
5763                 LOCK(tx_relay->m_tx_inventory_mutex);
5764                 // Check whether periodic sends should happen
5765                 bool fSendTrickle = pto->HasPermission(NetPermissionFlags::NoBan);
5766+                const bool reconciles_txs = m_txreconciliation && m_txreconciliation->IsPeerRegistered(pto->GetId());

sr-gi commented at 4:11 pm on November 21, 2023:

Isn’t reconciles_txs only used within fSendTrickle == true? Wouldn’t it be worth moving it to that context?

naumenkogs commented at 10:54 am on November 23, 2023:

it will have to be moved in the following PR, but yeah, I’ll stick to your suggestion in this one.

in src/node/txreconciliation.h:82 in 6977a5a156 outdated

73@@ -74,6 +74,22 @@ class TxReconciliationTracker
74     ReconciliationRegisterResult RegisterPeer(NodeId peer_id, bool is_peer_inbound,
75                                               uint32_t peer_recon_version, uint64_t remote_salt);
76 
77+    /**
78+     * Step 1. Add a new transaction we want to announce to the peer to the local reconciliation set
79+     * of the peer, so that it will be reconciled later, unless the set limit is reached.
80+     * Returns whether it was added.
81+     */
82+    bool AddToSet(NodeId peer_id, const uint256& wtxid);

glozow commented at 4:39 pm on November 21, 2023:

Could this use the Wtxid type?

in src/node/txreconciliation.cpp:162 in 6977a5a156 outdated

155+        // However, exploiting (2) should not prevent us from relaying certain transactions.
156+        //
157+        // Transactions which don't make it to the set due to the limit are announced via fan-out.
158+        if (recon_state.m_local_set.size() >= MAX_SET_SIZE) return false;
159+
160+        Assume(recon_state.m_local_set.insert(wtxid).second);

mzumsande commented at 6:32 pm on November 21, 2023:

The test fails in debug mode because this assume conflicts with the test saying “Adding a duplicate transaction should not happen, but it does happen, nothing should break.”.

naumenkogs commented at 7:32 am on November 23, 2023:

I’m going to drop this test, unless you know a way how to handle this better.

in src/net_processing.cpp:5892 in 50dc9bf216 outdated

5887+                                // Until package relay is implemented, this is needed to avoid
5888+                                // breaking parent+child relay expectations in some cases.
5889+                                //
5890+                                // Potentially reconciling parent+child would mean that for every
5891+                                // child we need to to check if any of the parents is currently
5892+                                // reconciled so that the child isn't fanouted ahead. But then

mzumsande commented at 6:37 pm on November 21, 2023:

Although this reduced it, I think the situation where the child is fanouted ahead could still happen if we receive the parent first, add it to the recon set, and only after that receive the child and decide to fanout it. Not sure if that is a problem though.

naumenkogs commented at 7:49 am on November 23, 2023:

You’re right.

My fear with this is unexpected behavior for tx sender: e.g., you craft a “package” thinking parent always goes ahead, but then child gets ahead (potentially with the attacker’s help) and dropped on the floor due to some policy. Something along this, but maybe I’m making it up. Are these concerns at least semi-valid? @glozow

I can add “see whether a parent is in the set already” check, when looking at a child, if we think it’s worth it.

ariard commented at 11:56 pm on February 20, 2024:

I confirm with current implemented bip331 approach there is currently a MAX_ORPHAN_TOTAL_SIZE limit. You can always get a non-standard parent (e.g an under-dust output) and yet the child be policy valid. There is currently no sanitization of policy equivalence among a set of parent within ancpkginfo. In the future, you could have reconciliation at the pkgtxns-level or at package announcement (ancpkginfo). Ideally both, though that something that can be seen once erlay or bip331 are deployed.

Assuming there is no substantial timely delay between the parent being reconciliated and the child being fanout to the peer which would allow an overflow of MAX_ORPHAN_TOTAL_SIZE, I don’t think it’s altering the package acceptance of the receiving peer.

Assuming no exploitable timers, one can still make the simulation to quantity the “child drift” risk for a distribution of parent / child being reconciliated / fanout on the average time discrepancies between those 2 tx announcement strategy. Ideally in the future, we would move to sender-initiated package, which would remove this concern from my understanding. However, this is already a post-bip331 future, we’re talking about.

in src/node/txreconciliation.cpp:291 in 50dc9bf216 outdated

289+        if (destinations < 0.01) {
290+            return false;
291+        }
292+
293+        auto fanout_candidates = GetFanoutTargets(deterministic_randomizer, recon_state.m_we_initiate, destinations);
294+        return std::count(fanout_candidates.begin(), fanout_candidates.end(), peer_id);

sr-gi commented at 7:38 pm on November 21, 2023:

I wonder why are we using std::count here when we will find, at most, a single instance of peer_id within fanout_candidates. Wouldn’t std::find be a better option?

On the same line, it may even be worth making GetFanoutTargets return a set instead of a vector, given we are sure that the elements of the collection won’t be repeated

in src/node/txreconciliation.cpp:270 in 50dc9bf216 outdated

218+                                        bool we_initiate, double limit) const EXCLUSIVE_LOCKS_REQUIRED(m_txreconciliation_mutex)
219+    {
220+        // The algorithm works as follows. We iterate through the peers (of a given direction)
221+        // hashing them with the given wtxid, and sort them by this hash.
222+        // We then consider top `limit` peers to be low-fanout flood targets.
223+        // The randomness should be seeded with wtxid to return consistent results for every call.

sr-gi commented at 7:48 pm on November 21, 2023:

Is the assumption that we will never call this method more than once for the same wtxid? I wonder because it feels like the results may be deterministic based on the ordering of the m_states, which may not be persistent if a peer changes (?)

naumenkogs commented at 11:32 am on November 23, 2023:

We call this for every peer. So there might be, say, 8 calls for the same wtxid. The risk of changing m_states i thought is acceptable, it’s a rare event (outbound peers) and hard to exploit in a meaningful way. But I’m open to more analysis.

sr-gi commented at 9:58 pm on November 30, 2023:

Why not just make a copy of the seeded randomizer instead of accumulating to it all the changes from m_states? That way you ensure that a change in the ordering is not going to, potentially, produce a completely different order of the whole best_peers set. In this case, the worst that could happen would be that the newly added peer jumps places, and becomes one of the selected/not selected peers, instead of a complete rearrange of the collection

naumenkogs commented at 8:35 am on December 1, 2023:

done

in src/net_processing.cpp:5895 in 50dc9bf216 outdated

5890+                                // Potentially reconciling parent+child would mean that for every
5891+                                // child we need to to check if any of the parents is currently
5892+                                // reconciled so that the child isn't fanouted ahead. But then
5893+                                // it gets tricky when reconciliation sets are full: a) the child
5894+                                // can't just be added; b) removing parents from reconciliation
5895+                                // sets for this one child is not good either.

sr-gi commented at 8:39 pm on November 21, 2023:

Can you give more context regarding this?

naumenkogs commented at 11:42 am on November 23, 2023:

Sure, what exactly?

here I talk about an alternative way to implement this. Say, we want to add a child to the set, because the parent is already there, and we don’t want child ahead of parent. But the set is full. We can’t ignore the limit — that’s (a). (b) means that we could fanout child+parent in this case, but this remove parent from set operation is harder to reason about. Should we then remove parents of a parent too?

Maybe I overthink this issue.

sr-gi commented at 3:39 pm on November 27, 2023:

I was referring to b). So this is trying to prevent a situation in which, potentially, the whole set is a cluster and once we are full, a new transaction belonging to the cluster is tried to be added, triggering a cascade removal and making us waste resources?

naumenkogs commented at 8:40 am on December 1, 2023:

Yes. My primary concern is the complexity though. Cascade removal is more difficult to reason about. I just thought it was not worth it.

sr-gi commented at 3:24 pm on December 1, 2023:

Fair

sr-gi commented at 9:51 pm on November 21, 2023: member

Concept ACK

naumenkogs force-pushed on Nov 23, 2023

DrahtBot removed the label CI failed on Nov 23, 2023

naumenkogs force-pushed on Nov 24, 2023

naumenkogs commented at 8:53 am on November 24, 2023: member

Addressed the comments, mostly refactoring. Some conversations pending above. The code is good for review.

naumenkogs force-pushed on Dec 1, 2023

naumenkogs force-pushed on Dec 4, 2023

naumenkogs commented at 9:13 am on December 4, 2023: member

Addressed all comments. Ready for review.

in src/node/txreconciliation.cpp:52 in 150aad3764 outdated

58@@ -51,9 +59,6 @@ class TxReconciliationState
59     bool m_we_initiate;
60 
61     /**
62-     * TODO: These fields are public to ignore -Wunused-private-field. Make private once used in
63-     * the following commits.
64-     *

sr-gi commented at 3:20 pm on December 4, 2023:

nit: if you have created a refactor commit to deal with removing the TODOs, it may be worth moving that change there.

Not sure if you’re planning to squash it or not. Feel free to disregard.

sr-gi commented at 3:20 pm on December 4, 2023: member

ACK f895ae4

DrahtBot requested review from mzumsande on Dec 4, 2023

DrahtBot requested review from brunoerg on Dec 4, 2023

naumenkogs force-pushed on Dec 5, 2023

naumenkogs commented at 8:23 am on December 5, 2023: member

@sr-gi implemented your suggestion and moved the last commit to the front, so that these legacy field don’t distract reviewers.

sr-gi commented at 4:49 pm on December 5, 2023: member

re-ACK 3a062b2 the diff is mainly moving the removal of TODOs between commits.

I’ve noticed that the co-authorship of 3a062b2bdc6dc787b967947872f55131522cd2ac was dropped, which may have been unintended.

Also, looks like this is failing CI, but it may be unrelated.

refactor: remove legacy comments

These comments became irrelevant in one of the previous code changes.
They simply don't make sense anymore.

2ef69fbbe3

in src/test/txreconciliation_tests.cpp:146 in 3a062b2bdc outdated

141+    CSipHasher hasher(0x0706050403020100ULL, 0x0F0E0D0C0B0A0908ULL);
142+
143+    // If peer is not registered for reconciliation, it should be always chosen for flooding.
144+    BOOST_REQUIRE(!tracker.IsPeerRegistered(peer_id0));
145+    for (int i = 0; i < 100; ++i) {
146+        BOOST_CHECK(tracker.ShouldFanoutTo(GetRandHash(), hasher, peer_id0,

maflcko commented at 5:06 pm on December 5, 2023:

Seems better to use a deterministic fast random context in tests, so that failures, if they happen, are deterministic?

naumenkogs force-pushed on Dec 6, 2023

naumenkogs commented at 10:26 am on December 6, 2023: member

@sr-gi fixed the co-authorship. Thank you for patience :)

took @maflcko suggestion on deterministic tests.

sr-gi commented at 3:57 pm on December 6, 2023: member

tACK dca7990

in src/node/txreconciliation.h:105 in dca7990c3b outdated

 97@@ -84,6 +98,12 @@ class TxReconciliationTracker
 98      * Check if a peer is registered to reconcile transactions with us.
 99      */
100     bool IsPeerRegistered(NodeId peer_id) const;
101+
102+    /**
103+     * Returns whether the peer is chosen as a low-fanout destination for a given tx.
104+     */
105+    bool ShouldFanoutTo(const uint256& wtxid, CSipHasher deterministic_randomizer, NodeId peer_id,

sr-gi commented at 3:59 pm on December 6, 2023:

Ups, I forgot to add this.

nit: Shouldn’t this also be Wtxid& for consistency with the rest of the interface?

This is non-blocking, but I realized when reviewing the last changes to the tests

naumenkogs commented at 8:20 am on December 7, 2023:

as long as you reack a new version it’s fine, since no other acks are pending yet :)

naumenkogs force-pushed on Dec 7, 2023

naumenkogs commented at 8:42 am on December 7, 2023: member

Took a minor suggestion from @sr-gi, fixed clang-formatting, fixed code distribution between the two last commits.

sr-gi commented at 5:08 pm on December 7, 2023: member

re-ACK f56ec8a

in src/node/txreconciliation.cpp:243 in f56ec8a3b0 outdated

238+            result.insert(it->second);
239+        }
240+        return result;
241+    }
242+
243+    bool ShouldFanoutTo(const Wtxid& wtxid, CSipHasher deterministic_randomizer, NodeId peer_id,

mzumsande commented at 5:14 pm on December 12, 2023:

I think that deterministic_randomizer could be passed by reference

naumenkogs commented at 8:04 am on December 13, 2023:

Will fix if i end up retouching it :)

sipa commented at 8:33 pm on December 13, 2023:

I believe the appropriate type in this case is CSipHasher&& deterministic_randomizer. This avoids a copy, allows the local variable inside ShouldFanoutTo to be modified, and the caller is constructing a fresh CSipHasher anyway.

mzumsande commented at 6:21 pm on December 12, 2023: contributor

Code Review ACK f56ec8a3b086b0f2f8a0fbde861447e40ffbc3d9

One thing I’m unsure about is how the way we call ShouldFanoutTo() multiple times for each transaction might affect performance. If we have 120 reconciling peers, and get a dump of MAX_SET_SIZE=3000 transactions, I think we’d call this function 360000 times within a short timeframe. I wonder how long that would take, maybe we could add a benchmark for this? (could be done a follow-up)

naumenkogs commented at 8:11 am on December 13, 2023: member

@mzumsande Previously I also felt it’s unfortunate, but never found a good way to fix it. Now I think perhaps we should compute GetFanoutTargets once for every transaction, and then memorize it in a std::map<Wtxid, list_of_peers> inside the TxReconciliation module. We can cap the map at 1000 in a FIFO-fashion. Worst case we drop a relevant transaction and then have to recompute it (that’s current behaviour default case). Memory should be ok too: 1000 txs * 120 peers * 0.1 ratio = 12,000 entries of NodeId. What do you think?

mzumsande commented at 8:08 pm on December 13, 2023: contributor

What do you think?

To get an estimate about performance, I wrote a simple benchmark, see https://github.com/mzumsande/bitcoin/commit/22f98d6f54042b3e8ee41889ceb362f6355e4fa0. On my computer (neither particularly fast nor slow) I get ~1000op/s per wtxid to call ShouldFanoutTo() for all 120 peers, so a batch of 3000 txs should take ~3s time in ShouldFanoutTo. So I think that caching could make sense.

I wonder if there is there a good way to restrict this map to wtxids that are still in some peer’s inv queue and remove them afterwards, or would the map just always fill up?

in src/node/txreconciliation.cpp:207 in f56ec8a3b0 outdated

205+        AssertLockNotHeld(m_txreconciliation_mutex);
206+        LOCK(m_txreconciliation_mutex);
207+        return IsPeerRegistered(peer_id);
208+    }
209+
210+    std::set<NodeId> GetFanoutTargets(CSipHasher& deterministic_randomizer_with_wtxid,

sipa commented at 8:18 pm on December 13, 2023:

The deterministic_randomizer_with_wtxid variable could be const CSipHasher& here (to make it clear that this object is only ever used as a starting point for (independent) hashes).

in src/node/txreconciliation.cpp:225 in f56ec8a3b0 outdated

223+
224+        auto cmp_by_key = [](const std::pair<uint64_t, NodeId>& left, const std::pair<uint64_t, NodeId>& right) {
225+            return left.first > right.first;
226+        };
227+
228+        std::set<std::pair<uint64_t, NodeId>, decltype(cmp_by_key)> best_peers(cmp_by_key);

sipa commented at 8:21 pm on December 13, 2023:

I believe it would be significantly faster to make this an std::vector<std::pair<uint64_t, NodeId>>, .reserve() it once, fill it up like in the loop below, and then std::sort it in-place using cmp_by_key. This has far better memory locality, less allocation overhead, and ought to have the same asymptotic complexity (O(n log n)).

in src/node/txreconciliation.cpp:286 in f56ec8a3b0 outdated

284+
285+        // We use the pre-determined randomness to give a consistent result per transaction,
286+        // thus making sure that no transaction gets "unlucky" if every per-peer roll fails.
287+        deterministic_randomizer.Write(wtxid.ToUint256());
288+
289+        auto fanout_candidates = GetFanoutTargets(deterministic_randomizer, recon_state.m_we_initiate, destinations);

sipa commented at 8:31 pm on December 13, 2023:

Constructing the full set of fanout candidates any time you want to consider relaying anything to any node has O(n^2 log n) scaling (you’ll call GetFanoutTargets O(n) times, and each does O(n log n) work going over all nodes), I believe.

That may or may not matter, but I’d consider at least one of these variants:

Cache the GetFanoutTargets result per transaction as discussed elsewhere in this PR, making it O(n log n) only.
Instead of computing an std::set<NodeId> in GetFanoutTargets (which at least involves an allocation per result), pass the peer_id to GetFanoutTargets, which can then just return a boolean (see if the peer_id appears in the first target_size elements of best_peers after sorting). This doesn’t change the asymptotic behavior, but means there is just one allocation per ShouldFanoutTo.

in src/node/txreconciliation.cpp:219 in f56ec8a3b0 outdated

217+
218+        double integral_part;
219+        double fractional_part = std::modf(limit, &integral_part);
220+        // Handle fractional value.
221+        const bool add_extra = deterministic_randomizer_with_wtxid.Finalize() < fractional_part * double(UINT64_MAX);
222+        const size_t targets_size = add_extra ? size_t(integral_part) + 1 : size_t(integral_part);

sipa commented at 8:34 pm on December 13, 2023:

Nit: const size_t targets_size = integral_part + add_extra.

sipa commented at 8:43 pm on December 13, 2023:

An alternative which may be slightly faster (unsure, depends on how fast std::modf is):

const size_t targets_size = ((deterministic_randomizer_with_wtxid.Finalize() & 0xFFFFFFFF) + uint64_t(limit * 0x100000000)) >> 32;

It has lower precision, but I suspect 32 bits is more than sufficient here.

in src/node/txreconciliation.cpp:264 in f56ec8a3b0 outdated

262+        double destinations;
263+        if (recon_state.m_we_initiate) {
264+            destinations = OUTBOUND_FANOUT_DESTINATIONS - outbounds_nonrcncl_tx_relay;
265+        } else {
266+            const size_t inbound_rcncl_peers = std::count_if(m_states.begin(), m_states.end(),
267+                                                             [](std::pair<NodeId, std::variant<uint64_t, TxReconciliationState>> indexed_state) {

sipa commented at 8:40 pm on December 13, 2023:

Pass indexed_state by reference. This is creating a copy of the entire TxReconciliationState for every element in m_states.

[](const auto& indexed_state) { ... } works.

in src/node/txreconciliation.cpp:231 in f56ec8a3b0 outdated

229+
230+        for (auto indexed_state : m_states) {
231+            const auto cur_state = std::get_if<TxReconciliationState>(&indexed_state.second);
232+            if (cur_state && cur_state->m_we_initiate == we_initiate) {
233+                uint64_t hash_key = CSipHasher(deterministic_randomizer_with_wtxid).Write(cur_state->m_k0).Finalize();
234+                best_peers.insert(std::make_pair(hash_key, indexed_state.first));

sipa commented at 8:45 pm on December 13, 2023:

best_peers.emplace(hash_key, indexed_state.first); (avoids a temporary std::pair object in the caller).

(it’d become best_peers.emplace_back(hash_key indexed_state.first); if best_peers is converted to a vector as suggested above.

in src/node/txreconciliation.cpp:227 in f56ec8a3b0 outdated

225+            return left.first > right.first;
226+        };
227+
228+        std::set<std::pair<uint64_t, NodeId>, decltype(cmp_by_key)> best_peers(cmp_by_key);
229+
230+        for (auto indexed_state : m_states) {

sipa commented at 8:46 pm on December 13, 2023:

Use const auto& indexed_state : m_states) here; you’re making a copy of the entire TxReconciliationState for every iteration as written here.

in src/net_processing.cpp:5834 in f56ec8a3b0 outdated

5826@@ -5818,6 +5827,29 @@ bool PeerManagerImpl::SendMessages(CNode* pto)
5827                     // No reason to drain out at many times the network's capacity,
5828                     // especially since we have many peers and some will draw much shorter delays.
5829                     unsigned int nRelayedTransactions = 0;
5830+
5831+                    size_t inbounds_nonrcncl_tx_relay = 0, outbounds_nonrcncl_tx_relay = 0;
5832+                    const bool reconciles_txs = m_txreconciliation && m_txreconciliation->IsPeerRegistered(pto->GetId());
5833+                    if (reconciles_txs) {
5834+                        for (auto [cur_peer_id, cur_peer] : m_peer_map) {

sipa commented at 8:50 pm on December 13, 2023:

Use const auto& [cur_peer_id, cur_peer] : m_peer_map here. You’re making a copy of the PeerRef object for every iteration here (which isn’t the end of the world, it’s just a shared pointer, but copying those does involve an atomic memory operation).

sipa commented at 8:54 pm on December 13, 2023: member

Not a full review; mostly performance improvement suggestions as I saw there was a suggestion to cache things. Before considering that, it may be worth benchmarking if that’s still relevant with these review comments addressed.

See https://github.com/sipa/bitcoin/commits/pr28765 for a commit incorporating these changes (excluding caching). Feel free to use whatever part of it. I assume you’d at least want to rename GetFanoutTargets if its return value becomes a bool like here, but I haven’t included that to minimize changes.

naumenkogs force-pushed on Dec 14, 2023

naumenkogs commented at 9:23 am on December 14, 2023: member

Before @sipa optimization:

0|               ns/op |                op/s |    err% |     total | benchmark
1|--------------------:|--------------------:|--------:|----------:|:----------
2|        1,401,785.00 |              713.38 |    1.6% |      0.02 | `ShouldFanoutTo`

After:

0|               ns/op |                op/s |    err% |     total | benchmark
1|--------------------:|--------------------:|--------:|----------:|:----------
2|          369,620.50 |            2,705.48 |    1.4% |      0.01 | `ShouldFanoutTo`

4x improvement, I think batching is still worth implementing. I will suggest code changes shortly.

naumenkogs force-pushed on Dec 20, 2023

naumenkogs commented at 9:02 am on December 20, 2023: member

Finally submitted the optimized code. I also tried caching, but the speed went down 10x (either map or unordered_map). Perhaps @sipa @mzumsande have interest in looking what could be the issue.

mzumsande commented at 5:43 pm on January 8, 2024: contributor

Finally submitted the optimized code.

Did you perhaps push the wrong branch? You upvoted several of sipa’s suggestions but they don’t seem to be addressed in the current code.

naumenkogs force-pushed on Jan 9, 2024

naumenkogs commented at 8:37 am on January 9, 2024: member

@mzumsande you’re right i lost between branches, sorry.

Apparently caching gives another 12x boost, assuming it works correctly :)

0|               ns/op |                op/s |    err% |     total | benchmark
1|--------------------:|--------------------:|--------:|----------:|:----------
2|           35,617.11 |           28,076.40 |    1.9% |      0.01 | `ShouldFanoutTo`

naumenkogs force-pushed on Jan 9, 2024

in src/node/txreconciliation.cpp:103 in c7c6bcc5c6 outdated

 98+     * be hard to manipulate).
 99+     *
100+     * No need to use LRU (bump transaction order upon access) because in most cases
101+     * transactions are processed almost-sequentially.
102+     */
103+    std::deque<Wtxid> tx_fanout_targes_cache_order;

sr-gi commented at 7:19 pm on January 9, 2024:

nit: targes -> targets

in src/node/txreconciliation.cpp:268 in c7c6bcc5c6 outdated

266+            new_fanout_candidates.insert(it->second);
267+        }
268+
269+        tx_fanout_targets_cache_data.emplace(wtxid, new_fanout_candidates);
270+        // Replace the oldest cache item with this new one.
271+        if (tx_fanout_targes_cache_order.size () == 3000) {

sr-gi commented at 7:34 pm on January 9, 2024:

nit: I’d be nice to define this as a constant

in src/node/txreconciliation.cpp:272 in c7c6bcc5c6 outdated

270+        // Replace the oldest cache item with this new one.
271+        if (tx_fanout_targes_cache_order.size () == 3000) {
272+            auto expired_tx = tx_fanout_targes_cache_order.front();
273+            tx_fanout_targets_cache_data.erase(expired_tx);
274+            tx_fanout_targes_cache_order.pop_front();
275+            tx_fanout_targes_cache_order.push_back(wtxid);

sr-gi commented at 7:36 pm on January 9, 2024:

I don’t think this is right(?), this logic is never triggered.

You are checking whether the size of tx_fanout_targes_cache_order is over the limit, but you are only adding elements to the queue within that same if context, meaning that the queue is always empty.

sr-gi commented at 7:42 pm on January 9, 2024: member

Some additional comments after the recent optimizations + caching

naumenkogs force-pushed on Jan 10, 2024

in src/node/txreconciliation.cpp:131 in 1d06576d74 outdated

127@@ -128,14 +128,20 @@ class TxReconciliationTracker::Impl
128         }
129     }
130 
131-    bool IsPeerRegistered(NodeId peer_id) const EXCLUSIVE_LOCKS_REQUIRED(!m_txreconciliation_mutex)
132+    bool IsPeerRegistered(NodeId peer_id) const EXCLUSIVE_LOCKS_REQUIRED(m_txreconciliation_mutex)

sipa commented at 1:31 pm on January 10, 2024:

In “refactor: Add a pre-mutexed version of IsPeerRegistered”:

I’d prefer to swap the names (= have an external IsPeerRegistered, and an internal IsPeerRegisteredInternal), as external callers ought to only care about the external one. Also is it possible to make the internal one private?

in src/node/txreconciliation.cpp:61 in 64cae4514b outdated

56+     * Store all wtxids which we would announce to the peer (policy checks passed, etc.)
57+     * in this set instead of announcing them right away. When reconciliation time comes, we will
58+     * compute a compressed representation of this set ("sketch") and use it to efficiently
59+     * reconcile this set with a set on the peer's side.
60+     */
61+    std::set<Wtxid> m_local_set;

sipa commented at 1:42 pm on January 10, 2024:

If this set may contain up to 3000 (=MAX_SET_SIZE) elements, it may be worth using an std::unordered_set instead (with something like SaltedTxidHasher as hasher).

in src/node/txreconciliation.cpp:236 in baaf390c75 outdated

231+        }
232+
233+        static constexpr auto cmp_by_key = [](const auto& left, const auto& right) {
234+            return left.first > right.first;
235+        };
236+        std::sort(best_peers.begin(), best_peers.end(), cmp_by_key);

sipa commented at 2:58 pm on January 10, 2024:

In commit “p2p: Add transactions to reconciliation sets”

It doesn’t actually matter whether we pick the lowest-hash_key entries or highest-hash_key entries to fanout to, I think? If so, can just use std::sort(best_peers.begin(), best_peers.end());, and drop the cmp_by_key.

Also, since you only care about the top targets_size results, I believe using std::partial_sort may be faster (but benchmark it).

naumenkogs commented at 10:01 am on January 12, 2024:

ShouldFanoutTo looks fitting for benchmarking, and partial_sort had no effect. I took the former suggestion of course.

in src/node/txreconciliation.cpp:251 in baaf390c75 outdated

246+                        size_t inbounds_nonrcncl_tx_relay, size_t outbounds_nonrcncl_tx_relay)
247+        const EXCLUSIVE_LOCKS_REQUIRED(!m_txreconciliation_mutex)
248+    {
249+        AssertLockNotHeld(m_txreconciliation_mutex);
250+        LOCK(m_txreconciliation_mutex);
251+        if (!IsPeerRegistered(peer_id)) return true;

sipa commented at 2:59 pm on January 10, 2024:

In commit “p2p: Add transactions to reconciliation set”

Would it make sense to have an (internal) helper function instead of IsPeerRegistered, that instead of returning a bool returns a TxReconciliationState* (nullptr if not found)? That would avoid locking/finding twice. I suspect it’d be usable in other places too.

in src/node/txreconciliation.cpp:265 in baaf390c75 outdated

260+        // whether the given peer falls into this list.
261+        double destinations;
262+        if (recon_state.m_we_initiate) {
263+            destinations = OUTBOUND_FANOUT_DESTINATIONS - outbounds_nonrcncl_tx_relay;
264+        } else {
265+            const size_t inbound_rcncl_peers = std::count_if(m_states.begin(), m_states.end(),

sipa commented at 3:02 pm on January 10, 2024:

In commit “p2p: Add transactions to reconciliation set”

This value could be cached inside TxReconciliationTracker::Impl I think, and updated on registering/deregister, to avoid recomputating it every time?

in src/node/txreconciliation.cpp:108 in b54d8ec9ff outdated

103+     *
104+     * No need to use LRU (bump transaction order upon access) because in most cases
105+     * transactions are processed almost-sequentially.
106+     */
107+    std::deque<Wtxid> tx_fanout_targets_cache_order;
108+    std::map<Wtxid, std::set<NodeId>> tx_fanout_targets_cache_data GUARDED_BY(m_txreconciliation_mutex);

sipa commented at 3:12 pm on January 10, 2024:

In commit “Cache fanout candidates to optimize txreconciliation”

Since the cache entry std::set<NodeId> don’t change until being replaced entirely, I believe it would be more efficient to use a std::vector<NodeId>, with the nodeids in sorted order. You can then use std::binary_search to query in that vector.

It’d be more efficient as std::vector has much better memory locality than std::set, and fewer indirections.

If you do this, can also just reuse the best_peers vector, and shrink it to target_size (but make sure to call shrink_to_fit, as otherwise the memory for the remainder won’t be released.

in src/node/txreconciliation.cpp:270 in b54d8ec9ff outdated

267-            if (it->second == peer_id) return true;
268+            new_fanout_candidates.insert(it->second);
269         }
270-        return false;
271+
272+        tx_fanout_targets_cache_data.emplace(wtxid, new_fanout_candidates);

sipa commented at 3:19 pm on January 10, 2024:

In commit “Cache fanout candidates to optimize txreconciliation”

Use std::move around new_fanout_candidates (or best_peers if you take my earlier suggestion to use a vector), to avoid a copy. If so, lookup the peer_id in it before moving, so you can return that value.

DrahtBot added the label CI failed on Jan 12, 2024

naumenkogs force-pushed on Jan 19, 2024

p2p: Functions to add/remove wtxids to tx reconciliation sets

They will be used later on.

be8ef38d29

naumenkogs force-pushed on Jan 19, 2024

naumenkogs force-pushed on Jan 22, 2024

DrahtBot removed the label CI failed on Jan 22, 2024

in src/net_processing.cpp:5832 in 82126d3ca5 outdated

5827@@ -5818,6 +5828,29 @@ bool PeerManagerImpl::SendMessages(CNode* pto)
5828                     // No reason to drain out at many times the network's capacity,
5829                     // especially since we have many peers and some will draw much shorter delays.
5830                     unsigned int nRelayedTransactions = 0;
5831+
5832+                    size_t inbounds_nonrcncl_tx_relay = 0, outbounds_nonrcncl_tx_relay = 0;

ariard commented at 11:23 pm on February 20, 2024:

for code readability, such variables could be renamed inbounds_flooding_tx_relay / outbounds_flooding_tx_relay, which is matching the low-fanout flooding strategy used for them.

in src/net_processing.cpp:5897 in 82126d3ca5 outdated

5893+                                // reconciled so that the child isn't fanouted ahead. But then
5894+                                // it gets tricky when reconciliation sets are full: a) the child
5895+                                // can't just be added; b) removing parents from reconciliation
5896+                                // sets for this one child is not good either.
5897+                                if ((*txiter)->GetCountWithDescendants() <= 1) {
5898+                                    fanout = m_txreconciliation->ShouldFanoutTo(wtxid, pto->GetId(),

ariard commented at 11:36 pm on February 20, 2024:

can add a LogPrint(BCLog::NET, “Non-signaling reconciliation inbound peers flooding %d Outbound peers flooding %d for debug”); for debug purpose and observation

in src/node/txreconciliation.cpp:342 in 82126d3ca5 outdated

337+            destinations = inbound_targets - inbounds_nonrcncl_tx_relay;
338+        }
339+
340+        // Pure optimization to avoid going through the peers when the odds of picking one are
341+        // too low.
342+        if (destinations < 0.001) {

ariard commented at 5:56 pm on February 21, 2024:

can be constified same than OUTBOUND_FANOUT_DESTINATIONS and INBOUND_FANOUT_DESTINATIONS_FRACTION as explanation for this value is tight to them

naumenkogs commented at 7:09 am on February 22, 2024:

Not really related. Change these constants in any way, and the 0.001 value won’t be affected.

ariard commented at 7:24 pm on February 22, 2024:

okay will read back the txrelayism issue on const selection.

p2p: Add transactions to reconciliation sets

Transactions eligible for reconciliation are added to the
reconciliation sets. For the remaining txs, low-fanout is used.

Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
Co-authored-by: Pieter Wuille <pieter.wuille@gmail.com>

cad6f0a374

naumenkogs force-pushed on Feb 22, 2024

in src/node/txreconciliation.cpp:306 in b3db2bcd01 outdated

301+
302+        // If the cache is full, make room for the new entry.
303+        if (m_tx_fanout_targets_cache_order.size() == FANOUT_TARGETS_PER_TX_CACHE_SIZE) {
304+            auto expired_tx = m_tx_fanout_targets_cache_order.front();
305+            m_tx_fanout_targets_cache_data.erase(expired_tx);
306+            m_tx_fanout_targets_cache_order.pop_front();

ariard commented at 7:37 pm on February 22, 2024:

In the eventuality of an influx of inbound transactions, faster than we can flush out them to low-fanout flooding peers, my understanding of dropping the upfront wtxid candidate, we would keep propagating this transaction only according to tx-relay policy and connection state of other peers (not this NodeId anymore).

I understand we’re fanning out only to outbound peers (m_tx_fanout_targets_cache_data doc), though here it’s more a dependency on the perfomance capabilities of the full-node itself (i.e how fast you process vInv(MSG_WTX) and how fast you-reannounce them to downstream peers if valid). To interferes with a transaction propagation, assuming a non-listening node, an attacker would have to be puppet or compromise all our low-fanout outbound peers, I think ? Obviously more outbound peers would make things better on this front, which should be allowed by Erlay tx-relay bandwidth savings.

in src/node/txreconciliation.cpp:69 in b3db2bcd01 outdated

67      * These values are used to salt short IDs, which is necessary for transaction reconciliations.
68      */
69     uint64_t m_k0, m_k1;
70 
71+    /**
72+     * Store all wtxids which we would announce to the peer (policy checks passed, etc.)

ariard commented at 2:08 am on February 24, 2024:

In terms of peers-side policy check (i.e m_fee_filter_received), this policy limit is at the tx-relay link level and this is unilaterally initiated by the peer. As such I think there is no guarantee that between time point A we add a Wtxid in m_local_set and time point B we reconciliate, we have not received a new bip133 message, updating the m_fee_filter_received. I believe we can retro-actively stale stored Wtxid and as such a bandwidth performance leak, under situations of sudden network mempool spikes.

I don’t think there is that much a tx-announcement strategy (either flooding or reconciliation) can do it in itself, unless assuming some extensions to bip133 messages to commit on a feerate-level duration. As such, I think any improvement is out of scope for this PR.

in src/node/txreconciliation.cpp:197 in b3db2bcd01 outdated

196+        // Check if the reconciliation set is not at capacity for two reasons:
197+        // - limit sizes of reconciliation sets and short id mappings;
198+        // - limit CPU use for sketch computations.
199+        //
200+        // Since we reconcile frequently, reaching capacity either means:
201+        // (1) a peer for some reason does not request reconciliations from us for a long while, or

ariard commented at 2:33 am on February 24, 2024:

I think “(1)” can be extended a bit more e.g “Memory DoS issue for a laggy peer are bounded by DEFAULT_MAX_PEER_CONNECTIONS and reconciliation state is clean up with FinalizeNode".

ariard commented at 2:35 am on February 24, 2024: none

Reviewed up to be8ef38d29, still reading back txrelayism issue on announcement-related bandwidth / latency and responsibilities trade-off for the choice of current constants.

in src/bench/txreconciliation.cpp:34 in 17ae36c0f6 outdated

29+    for (size_t i = 0; i < 1000; i++) {
30+        txs.push_back(Wtxid::FromUint256(rc.rand256()));
31+    }
32+
33+    bench.run([&] {
34+        size_t rand_i = rand() % txs.size();

brunoerg commented at 1:18 pm on February 26, 2024:

In 17ae36c0f60b976c237bdb522059c31cf113c773: Is there any reason to define rand_i into bench.run? Couldn’t we have it out of it? Notice that without it, the bench is 70% faster.

naumenkogs commented at 10:03 am on February 27, 2024:

I’m dropping it. Do you understand why this simple op as var assignment eats so much perf? :)

maflcko commented at 10:57 am on February 27, 2024:

Do you understand why this simple op as var assignment eats so much perf? :)

I don’t think std::rand() has any quality or performance documentations, so it is free to do whatever it wants

add bench for ShouldFanoutTo 87e0ec0b4c

p2p: Cache fanout candidates to optimize txreconciliation 07f3ad4d94

p2p: Cache inbound reconciling peers count

It helps to avoid recomputing every time we consider
a transaction for fanout/reconciliation.

a14dfd9c87

naumenkogs force-pushed on Feb 27, 2024

in src/net_processing.cpp:5896 in a14dfd9c87

5892+                                // child we need to check if any of the parents is currently
5893+                                // reconciled so that the child isn't fanouted ahead. But then
5894+                                // it gets tricky when reconciliation sets are full: a) the child
5895+                                // can't just be added; b) removing parents from reconciliation
5896+                                // sets for this one child is not good either.
5897+                                if ((*txiter)->GetCountWithDescendants() <= 1) {

ariard commented at 9:12 pm on February 27, 2024:

One follow-up improvement, all the descendants in GetCountWithDescendants() could be marked with parent_fanout=true, that way we guarantee more stringently that all the members of a chain of transactions are tx-announcement relayed through the same strategy (either erlay or low-fanout flooding). I’ll check if there is test coverage here.

in src/net_processing.cpp:179 in a14dfd9c87

174@@ -175,6 +175,8 @@ static constexpr double MAX_ADDR_RATE_PER_SECOND{0.1};
175 static constexpr size_t MAX_ADDR_PROCESSING_TOKEN_BUCKET{MAX_ADDR_TO_SEND};
176 /** The compactblocks version we support. See BIP 152. */
177 static constexpr uint64_t CMPCTBLOCKS_VERSION{2};
178+/** Used to determine whether to use low-fanout flooding (or reconciliation) for a tx relay event. */
179+static const uint64_t RANDOMIZER_ID_FANOUTTARGET = 0xbac89af818407b6aULL; // SHA256("fanouttarget")[0:8]

Empact commented at 8:46 pm on February 29, 2024:

constexpr?

Empact commented at 8:47 pm on February 29, 2024:

FANOUTTARGET -> FANOUT_TARGET?

in src/node/txreconciliation.cpp:295 in a14dfd9c87

290+        std::sort(best_peers.begin(), best_peers.end());
291+        best_peers.resize(targets_size);
292+
293+        std::vector<NodeId> new_fanout_candidates;
294+        new_fanout_candidates.reserve(targets_size);
295+        for_each(best_peers.begin(), best_peers.end(),

Empact commented at 9:01 pm on February 29, 2024:

std::for_each?

in src/node/txreconciliation.cpp:265 in a14dfd9c87

260+        }
261+
262+        // We use the pre-determined randomness to give a consistent result per transaction,
263+        // thus making sure that no transaction gets "unlucky" if every per-peer roll fails.
264+        CSipHasher deterministic_randomizer{m_deterministic_randomizer};
265+        deterministic_randomizer.Write(wtxid.ToUint256());

ariard commented at 7:30 pm on March 1, 2024:

Looking on CSipHasher, given it’s a pseudo-random hash function, verified it’s well-initialized from two hidden random 64-bit seeds in src/init.cpp (L1239). Then we add a CSipHasher instance provided by CConman at TxReconciliationTracker initialization in src/net_processing.cpp. This respect the SipHash’s PRF’s requirement to initialize it with a random 128-bit key. I still wonder if in the future TxReconciliationTracker shouldn’t get it’s own random seed (i.e use GetRand(), it promises fast entropy generation) to isolate tx-announcement from the rest of network connection management.

sr-gi commented at 8:08 pm on May 14, 2024:

This seems to be consistent with how a deterministic randomizer is seeded in many other places in the codebase. What is your rationale for making it different here?

in src/node/txreconciliation.cpp:255 in a14dfd9c87

249@@ -142,9 +250,104 @@ class TxReconciliationTracker::Impl
250         return (recon_state != m_states.end() &&
251                 std::holds_alternative<TxReconciliationState>(recon_state->second));
252     }
253+
254+    // Not const because of caching.
255+    bool IsFanoutTarget(const Wtxid& wtxid, NodeId peer_id, bool we_initiate, double limit) EXCLUSIVE_LOCKS_REQUIRED(m_txreconciliation_mutex)

ariard commented at 7:33 pm on March 1, 2024:

This variable name can be called destination rather than limit to be consistent with ShouldFanoutTo and denotates more clearly it’s the sample space boundary.

in src/node/txreconciliation.cpp:284 in a14dfd9c87

279+        best_peers.reserve(m_states.size());
280+
281+        for (const auto& indexed_state : m_states) {
282+            const auto cur_state = std::get_if<TxReconciliationState>(&indexed_state.second);
283+            if (cur_state && cur_state->m_we_initiate == we_initiate) {
284+                uint64_t hash_key = CSipHasher(deterministic_randomizer).Write(cur_state->m_k0).Finalize();

ariard commented at 8:53 pm on March 7, 2024:

I think the comment L66 in src/node/txreconciliation.cpp can be updated to reflect the usage of m_k0 as a siphash input string for low-fanout flood peers selection. Not only used in ComputeShortID.

sr-gi commented at 7:01 pm on May 15, 2024:

I think this can actually be seeded with anything, it doesn’t have to be m_k0. IMO it’d better not be, to not repurpose something that is meant for something completely different

in src/node/txreconciliation.cpp:93 in be8ef38d29 outdated

88+    {
89+        AssertLockHeld(m_txreconciliation_mutex);
90+        auto salt_or_state = m_states.find(peer_id);
91+        if (salt_or_state == m_states.end()) return nullptr;
92+
93+        auto* state = std::get_if<TxReconciliationState>(&salt_or_state->second);

brunoerg commented at 5:58 pm on March 21, 2024:

nit: you could return it directly.

in src/node/txreconciliation.cpp:199 in a14dfd9c87

198+        // - limit CPU use for sketch computations.
199+        //
200+        // Since we reconcile frequently, reaching capacity either means:
201+        // (1) a peer for some reason does not request reconciliations from us for a long while, or
202+        // (2) really a lot of valid fee-paying transactions were dumped on us at once.
203+        // We don't care about a laggy peer (1) because we probably can't help them even if we fanout transactions.

brunoerg commented at 6:08 pm on March 21, 2024:

What does “laggy peer” mean?

sr-gi commented at 7:46 pm on May 14, 2024:

I’m guessing “a peer for some reason does not request reconciliations from us for a long while”, hence why it references (1)

achow101 commented at 2:15 pm on May 16, 2024: member

Superseded by #30116

achow101 closed this on May 16, 2024

bitcoin locked this on May 16, 2025

p2p: Fill reconciliation sets (Erlay) #28765

Code Coverage

Reviews

Conflicts