This is an implementation of Erlay , using primitives in the BIP-330 (see the updated spec here ). Please refer to these two to understand the design. My talk is here.
Abstract
Erlay uses both flooding (announcing using INV messages to all peers) and reconciliation to announce transactions. Flooding is expensive, so Erlay seeks to use it sparingly and in strategic locations - only well-connected publicly reachable nodes flood transactions to other publicly reachable nodes via outbound connections. Since every unreachable node is directly connected to several reachable nodes, this policy ensures that a transaction is quickly propagated to be within one hop from most of the nodes in the network.
All transactions not propagated through flooding are propagated through efficient set reconciliation. To do this, every node keeps a reconciliation set for each peer, in which transactions are placed which would have been announced using INV messages absent this protocol. Every 2 seconds every node chooses a peer from its outbound connections in a predetermined order to reconcile with, resulting in both sides learning the transactions known to the other side. After every reconciliation round, the corresponding reconciliation set is cleared.
I think both paper and the BIP motives the changes, but I’ll mention them briefly once again here:
save 40% of the bandwidth consumed by a node
increase network connectivity for almost no bandwidth or latency cost
improves privacy as a side-effect
How to review
I suggest doing make clean && autogen.sh && configure before you try building it locally.
naumenkogs marked this as a draft
on Mar 23, 2021
DrahtBot added the label
Build system
on Mar 23, 2021
DrahtBot added the label
P2P
on Mar 23, 2021
DrahtBot added the label
Utils/log/libs
on Mar 23, 2021
in
src/txreconciliation.h:12
in
8110da8b1boutdated
Looks like insertion should always happen? If so you could assert .second of emplace() result. If not, then GetRand() could be avoided in case it exists, and below it should return the existing salt.
promag
commented at 9:50 pm on March 23, 2021:
member
Just noticed it’s marked as draft, I’ll resume once you mark it ready for review.
naumenkogs force-pushed
on Mar 23, 2021
naumenkogs force-pushed
on Mar 23, 2021
naumenkogs force-pushed
on Mar 24, 2021
DrahtBot
commented at 9:59 am on March 24, 2021:
contributor
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
Reviews
See the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.
Conflicts
Reviewers, this pull request conflicts with the following ones:
#28912 (refactor: VectorWriter and SpanReader without nVersion by maflcko)
#28892 (refactor: P2P transport without serialize version and type by maflcko)
#28765 (p2p: Fill reconciliation sets (Erlay) by naumenkogs)
#28463 (p2p: Increase inbound capacity for block-relay only connections by mzumsande)
#28451 (refactor: Remove unused SER_DISK, SER_NETWORK, SER_GETHASH by maflcko)
#28429 (Do not log p2p bip61 reject messages, improve log, add tests by jonatack)
#27826 (validation: log which peer sent us a header by Sjors)
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
naumenkogs force-pushed
on Mar 24, 2021
naumenkogs force-pushed
on Mar 24, 2021
naumenkogs force-pushed
on Mar 25, 2021
naumenkogs force-pushed
on Mar 26, 2021
naumenkogs force-pushed
on Mar 26, 2021
naumenkogs force-pushed
on Mar 26, 2021
naumenkogs force-pushed
on Mar 26, 2021
naumenkogs force-pushed
on Mar 26, 2021
naumenkogs force-pushed
on Mar 26, 2021
naumenkogs force-pushed
on Mar 26, 2021
naumenkogs force-pushed
on Mar 26, 2021
naumenkogs force-pushed
on Mar 27, 2021
naumenkogs closed this
on Mar 28, 2021
naumenkogs reopened this
on Mar 28, 2021
naumenkogs marked this as ready for review
on Mar 28, 2021
naumenkogs force-pushed
on Mar 28, 2021
naumenkogs force-pushed
on Mar 28, 2021
naumenkogs force-pushed
on Mar 28, 2021
in
src/txreconciliation.cpp:60
in
3e4022a4a8outdated
The problem is that you can hit undefined behavior with the current code - assuming peer can be disconnected between IsPeerRegistered and IsPeerChosenForFlooding.
Edit: Looks like the above is not possible.
promag
commented at 10:43 pm on March 28, 2021:
member
Code review ACK up to 99906e0c3399225ee361ce85291074971d2025e1.
naumenkogs force-pushed
on Mar 29, 2021
in
src/net_processing.cpp:4557
in
3692db74f8outdated
Making const. Can’t move because it’s erased in two lines after, so it won’t be accessible in the inner scope.
And I can’t call .second on .erase, because it returns a number of erased elements.
One alternative could be to use txid_to_ann_protocol in vInvTx (instead of uint256). But then I won’t be able to reuse CompareInvMempoolOrder in AnnounceTxs.
in
src/txreconciliation.cpp:126
in
05ac9dc21eoutdated
119@@ -120,6 +120,13 @@ class TxReconciliationTracker::Impl {
120 */
121 std::unordered_map<NodeId, ReconciliationState> m_states GUARDED_BY(m_mutex);
122123+ /**
124+ * Maintains a queue of reconciliations we should initiate. To achieve higher bandwidth
125+ * conservation and avoid overflows, we should reconcile in the same order, because then it’s
126+ * easier to estimate set differene size.
nit, enum class and use as ReconciliationPhase::NONE?
in
src/net_processing.h:56
in
272b3b2db9outdated
52+ /**
53+ * Relay transaction to every node.
54+ * The announcement protocol (flooding or reconciliation) will be chosen based on from which
55+ * peer the transaction came to us. For transactions originated locally, pass from=nullopt.
56+ */
57+ virtual void RelayTransaction(const uint256& txid, const uint256& wtxid, std::optional<NodeId> from=std::nullopt)
There’s no need to update this public interface method. The only users of this method outside net_processing will always call RelayTransaction() without a from parameter.
(it’s also unnecessary to wrap an optional parameter in a std::optional since the fact that it’s an optional parameter already gives you a way to indicate no value)
Yes, you’ll need to change the function signature in PeerManagerImpl() as well. An override function must have the same signature as the function it’s overriding.
in
src/net_processing.cpp:376
in
272b3b2db9outdated
371@@ -365,6 +372,9 @@ class PeerManagerImpl final : public PeerManager
372 /** Number of peers with wtxid relay. */
373 int m_wtxid_relay_peers GUARDED_BY(cs_main) = 0;
374375+ /** Number of outbound peers we may flood transactions to. */
376+ int m_out_flood_to GUARDED_BY(cs_main) = 0;
No need to protect this with cs_main. You can just make it atomic.
in
src/net_processing.cpp:611
in
272b3b2db9outdated
606@@ -597,6 +607,9 @@ struct CNodeState {
607 //! Whether this peer relays txs via wtxid
608 bool m_wtxid_relay{false};
609610+ //! Whether this peer may be used by us for flooding txs.
611+ bool m_we_may_flood_to{false};
Again, no need for this to be protected by main. I think it can easily live in Peer.
promag
commented at 11:38 am on March 29, 2021:
member
Code review up to f842ec7839e36adf24782228934f4fff223279a5.
in
src/net_processing.cpp:2068
in
272b3b2db9outdated
2062@@ -2017,8 +2063,9 @@ void PeerManagerImpl::ProcessHeadersMessage(CNode& pfrom, const Peer& peer,
2063 * orphan will be reconsidered on each call of this function. This set
2064 * may be added to if accepting an orphan causes its children to be
2065 * reconsidered.
2066+ * @param from The peer for which we are processing the orphan set.
2067 */
2068-void PeerManagerImpl::ProcessOrphanTx(std::set<uint256>& orphan_work_set)
2069+void PeerManagerImpl::ProcessOrphanTx(std::set<uint256>& orphan_work_set, NodeId from)
DrahtBot added the label
Needs rebase
on Apr 1, 2021
naumenkogs force-pushed
on Apr 2, 2021
naumenkogs force-pushed
on Apr 2, 2021
DrahtBot removed the label
Needs rebase
on Apr 2, 2021
in
src/txreconciliation.h:37
in
baa461151doutdated
26+ * required because the initiator knows only short IDs of those transactions.
27+ * 5. Sometimes reconciliation fails if the difference is larger than the parties estimated,
28+ * then there is one sketch extension round, in which the initiator requests for extra data.
29+ * 6. If extension succeeds, go to step 4.
30+ * 7. If extension fails, the initiator notifies the peer
31+ */
Thanks for the clear write-up, may I suggest to insert a stable link to the paper ?
in
src/txreconciliation.h:38
in
baa461151doutdated
27+ * 5. Sometimes reconciliation fails if the difference is larger than the parties estimated,
28+ * then there is one sketch extension round, in which the initiator requests for extra data.
29+ * 6. If extension succeeds, go to step 4.
30+ * 7. If extension fails, the initiator notifies the peer
31+ */
32+class TxReconciliationTracker {
I think one obvious extension of the test coverage would be to join basic fuzz coverage of TxReconciliationTracker interface to assert its robustness. Have a look on src/test/fuzz/addrman.cpp or src/test/fuzz/banman.cpp.
Fuzz coverage might test further sanity of TxReconciliationTracker after each call or in function of m_we_initiate or m_flood_to. Still thinking about concrete suggestions there.
ariard
commented at 9:32 pm on April 14, 2021:
member
I think PR description might incorporate a test plan describing in details functional/unit/fuzz coverage/manual testing coverage. Working on a suggestion.
At least for manual testing, we should start to run nodes on a signet with this branch. But should add a preferential peering to find erlay peers on the default signet network or should we encourage signet node operators to upgrade their peers with this branch ? Or even boostrap a new signet ?
in
src/net_processing.cpp:139
in
baa461151doutdated
130+static constexpr auto INBOUND_INVENTORY_BROADCAST_INTERVAL = 2s;
131 /** Average delay between trickled inventory transmissions for outbound peers.
132 * Use a smaller delay as there is less privacy concern for them.
133 * Blocks and peers with noban permission bypass this. */
134-static constexpr auto OUTBOUND_INVENTORY_BROADCAST_INTERVAL = 2s;
135+static constexpr auto OUTBOUND_INVENTORY_BROADCAST_INTERVAL = 1s;
Surprised that modifying those 2 values doesn’t lead to functional test changes. Apparently, they’re not covered in p2p_tx_download.py. I can extend coverage there in another PR.
in
src/net_processing.cpp:1596
in
baa461151doutdated
1550+ // inbounds* (see above), so non-reachable nodes would never flood any transactions.
1551+ // Now, if they flood their local transactions, it will be obvious for their immediate peers
1552+ // that those transactions belong to those non-reachable nodes.
1553+ //
1554+ // On the other hand, if non-reachable nodes reconcile those transactions, they would look
1555+ // like any other transactions of non-reachable nodes.
Do we have privacy leaks for the following cases not mentioned in this comment ?
A non-reachable node reconciliate with reconciliation-supporting flooding outbound peers ? As it would be a deviation from the expected announcement policy.
A reachable node reconciliate with reconciliation-supporting flooding inbound/outbound peers ? Likewise, it would be a deviation from the expected announcement policy on those links.
I think those cases are handled well at tx-announcement in SendMessages. Though I believe this comment would be better if it said that reconciliation is downgraded to flooding if this the announcement previously selected for the to-be-announced-to peer.
Let’s continue this same discussion in one place: #21515 (review)
in
src/net_processing.cpp:3312
in
baa461151doutdated
2388+ std::vector<CInv> remote_missing_invs;
2389+ remote_missing_invs.reserve(std::min<size_t>(remote_missing_wtxids.size(), MAX_INV_SZ));
2390+
2391+ while (!remote_missing_wtxids.empty()) {
2392+ // No need to add transactions to peer's filter or do checks
2393+ // because it was already done when adding to the reconciliation set.
In theory, peer’s feefilter might have increased between transaction selection for announcement at INV sending and receiving a SKETCH or RECONCILDIFF messages. Though AVG_FEEFILTER_BROADCAST_INTERVAL is 10min so it should be really okay.
Is this gap could be exploited by an off-path attacker to interfere with tx-relay ? I don’t think so, or at least without costly manipulation of victim’s mempools.
naumenkogs
commented at 9:45 am on April 21, 2021:
Accidentally some nodes might ignore fee filter, but we don’t punish, so it’s safe in that sense. Some useless transactions could be sent, but I think it’s too hard to exploit in a real way.
Yes I recognize that’s hard to exploit but even further any mempool manipulation comes up with a priced cost for the attacker as you need utxo value to fan out in victim(s) mempools.
in
src/net_processing.cpp:2406
in
baa461151doutdated
Not only with that. I ran pycodestyle and got many warnings about it.
naumenkogs
commented at 4:11 pm on April 21, 2021:
@brunoerg I think following best practices beyond what our linter does is not necessary, but still I tried to resolve most of those complaints from pycodestyle where it makes sense.
in
src/net_processing.cpp:2797
in
a9270714e5outdated
2792+ uint32_t recon_version;
2793+ uint64_t remote_salt;
2794+ vRecv >> they_initiator >> they_responder >> recon_version >> remote_salt;
2795+
2796+ // Since this is called before VERACK, m_out_flooding_peers doesn't include this peer
2797+ // being considered for reconciliation support, so no need for substraction.
in
src/net_processing.cpp:4003
in
a9270714e5outdated
3998+ // we want to request extension).
3999+ m_connman.PushMessage(&pfrom, msgMaker.Make(NetMsgType::RECONCILDIFF,
4000+ *recon_result, txs_to_request));
4001+ } else {
4002+ // No final result means we should request sketch extension to make another
4003+ // reconciliation attempt without loosing the initial data.
Why 1024? Is it an arbitrary number? I tested it using 500 blocks and it seems to work. Is there an explanation?
naumenkogs
commented at 7:14 am on April 19, 2021:
Yeah, pretty arbitrary. I could generate them in generate_transactions function, but then it would force block sync across nodes, which is not ideal to test that transactions are not lost between scenarios (block sync could hide this effect by forcing tx sync)
DrahtBot added the label
Needs rebase
on Apr 19, 2021
in
src/net_processing.cpp:2910
in
a9270714e5outdated
2780+ return;
2781+ }
2782+
2783+ LOCK(cs_main);
2784+ if (!State(pfrom.GetId())->m_wtxid_relay) {
2785+ // Disconnect peers that send a SENDRECON message before/without WTXIDRELAY.
I think we should update BIP330 to mark the dependency on BIP339. Also, should we bump the protocol version ? Otherwise a bip330 peer will send SENDRECON for nothing to bip339-only peer (core version 20+).
naumenkogs
commented at 9:25 am on April 21, 2021:
I think we should update BIP330 to mark the dependency on BIP339.
Agree.
Also, should we bump the protocol version ? Otherwise a bip330 peer will send SENDRECON for nothing to bip339-only peer (core version 20+).
A bip330 peer sends a SENDRECON message to a wtxid-relay peer but not supporting bip330.
The wtxid-relay peer doesn’t understand SENDRECON and will ignore the message.
The bip330 wasted SENDRECON message bandwidth.
If the bip330 peer would have been aware that the wtxid-relay peer didn’t support bip330 it could have avoid sending SENDRECON. Right now, I think your check implies that wtxid-relay peers also support bip330, I believe a protocol version bump would avoid this minor issue.
in
src/net_processing.cpp:2672
in
a9270714e5outdated
2653+ we_may_flood_to = true;
2654+ }
2655+ } else {
2656+ // All non-reconciling peers which pass the MightSupportTransactionRelay and
2657+ // !IsInboundConn checks are outbound flooding peers.
2658+ we_may_flood_to = true;
I don’t think what said in comment “which pass… !IsInboundConn” is enforced and actually inbound peers might take outbound flooding peers slots ?
Turning this flag to false break test_outgoing_recon as all added peers L475-479 aren’t occupying anymore flooding peers slots and as such the tested peers is flood to. IIRC add_p2p_connection’s peers are seen as inbound ?
naumenkogs
commented at 9:02 am on April 21, 2021:
I don’t think what said in comment “which pass… !IsInboundConn” is enforced and actually inbound peers might take outbound flooding peers slots ?
See line above: if (!pfrom.IsInboundConn()) {, that’s where it’s enforced.
IIRC add_p2p_connection’s peers are seen as inbound ?
Nope, I added a hack for that: 108b0233be7735b793e488ba5bda836d24b75d4e
Ah sorry didn’t see the patch was the reason of my confusion. So yes it’s tested and the logic works as documented.
Maybe add a friendly note L475 in p2p_erlay.py “Contrary to default mininode setting, we mark those peers as node_outgoing=true to simulate them as outgoing connections and thus have them consume regular flood tx forwarding” ?
in
src/net_processing.cpp:2818
in
a9270714e5outdated
2794+ vRecv >> they_initiator >> they_responder >> recon_version >> remote_salt;
2795+
2796+ // Since this is called before VERACK, m_out_flooding_peers doesn't include this peer
2797+ // being considered for reconciliation support, so no need for substraction.
2798+ m_reconciliation.EnableReconciliationSupport(pfrom.GetId(), pfrom.IsInboundConn(),
2799+ they_initiator, they_responder, recon_version, remote_salt, m_out_flood_to);
current outbound flooding peers=MAX_OUTBOUND_FLOOD_TO - 1
outbound peer 1 connects
outbound peer 2 connects
outbound peer 1 send a SENDRECON is marked as outbound flooding peers as a slot is still available
outbound peer 2 send a SENDRECON is marked as outbound flooding peers as a slot is still available
outbound peer 1 send a VERACK, current outbound flooding peers = MAX_OUTBOUND_FLOOD_TO
outbound peer 2 send a VERACK, current outbound flooding peers = MAX_OUTBOUND_FLOOD_TO +1
What do you think about moving outbound flooding peers accouting from VERACK to SENDRECON ? Independently of my scenario truthiness, I believe it would even avoid to reason on how message ordering is influencing accounting.
naumenkogs
commented at 10:01 am on April 27, 2021:
Yes, this scenario is possible.
What do you think about moving outbound flooding peers accouting from VERACK to SENDRECON ?
The problem with your suggestion is that we also need to account for peers which will never send SENDRECON (legacy flooding peers). There are 2 possible solutions:
Handle legacy peers as a special case, so have this accounting twice (in VERACK for legacy, in SENDRECON for reconciliation).
Always initialize ReconciliationState with m_flood_to=false, and update it to true on VERACK if there are still slots (along with tracking out peers in VERACK)
naumenkogs
commented at 12:24 pm on April 27, 2021:
I will implement the latter.
in
src/txreconciliation.cpp:47
in
a9270714e5outdated
42+/**
43+ * When considering whether we should flood to an outbound connection supporting reconciliation,
44+ * see how many outbound connections are already used for flooding. Flood only if the limit is not reached.
45+ * It helps to save bandwidth and reduce the privacy leak.
46+ */
47+constexpr uint32_t MAX_OUTBOUND_FLOOD_TO = 8;
I think this variable belong more to our transaction relay policy rather than the reconciliation module in itself. Its effect on the reconciliation module are function of the connection state, which is located outside this module. I’m leaning towards removing any “flooding” mention from this module, it should be agnostic about the existence of another transaction-announcement protocol.
I guess “flood” could be a flag in CNodeState/Peer and allow to remove IsPeerChosenForFlooding here.
What do you think ?
in
src/net_processing.cpp:4826
in
a9270714e5outdated
4780+ // We don't care about a laggy peer (1) because we probably can't help them even if we flood transactions.
4781+ // However, exploiting (2) should not prevent us from relaying certain transactions.
4782+ //
4783+ // Transactions which don't make it to the set due to the limit are announced via flooding.
4784+ const size_t recon_set_size = *m_reconciliation.GetPeerSetSize(pto->GetId());
4785+ if (txs_to_reconcile.size() + recon_set_size < MAX_PEER_TX_ANNOUNCEMENTS) {
I don’t think MAX_PEER_TX_ANNOUNCEMENTS is covered by p2p_erlay.py ? This diff doesn’t yell an error.
0diff --git a/src/net_processing.cpp b/src/net_processing.cpp
1index 689b899f1..0703fcea7 100644 2--- a/src/net_processing.cpp
3+++ b/src/net_processing.cpp
4@@-81,7+81,7@@static constexpr int32_t MAX_PEER_TX_REQUEST_IN_FLIGHT =100;
5* the actual transaction (from any peer) in response to requests for them. 6* Also limits a maximum number of elements to store in the reconciliation set. 7*/ 8-static constexpr int32_t MAX_PEER_TX_ANNOUNCEMENTS =5000;
9+static constexpr int32_t MAX_PEER_TX_ANNOUNCEMENTS =4999;
10/** How long to delay requesting transactions via txids, if we have wtxid-relaying peers */11static constexpr auto TXID_RELAY_DELAY = std::chrono::seconds{2};
12/** How long to delay requesting transactions from non-preferred peers */
IIUC, erlay is introducing an interdependency between transaction’s source and selection of outgoing announcement-protocol (cf RelayTransaction), but I’m not if this is covered by p2p_erlay.py ? Following diff doesn’t yell a failure. Same we turning to always true.
0diff --git a/src/net_processing.cpp b/src/net_processing.cpp
1index 1818fc0f2..a4e847110 100644
2--- a/src/net_processing.cpp
3+++ b/src/net_processing.cpp
4@@ -4732,7 +4732,8 @@ bool PeerManagerImpl::SendMessages(CNode* pto)
5 // Fetch the top element from the heap
6 std::pop_heap(vInvTx.begin(), vInvTx.end(), compareInvMempoolOrder);
7 uint256 hash = vInvTx.back();
8- bool flood_tx = pto->m_tx_relay->m_txs_to_announce.at(hash);
9+ //bool flood_tx = pto->m_tx_relay->m_txs_to_announce.at(hash);
10+ bool flood_tx = false;
11 vInvTx.pop_back();
12 CInv inv(state.m_wtxid_relay ? MSG_WTX : MSG_TX, hash);
13 // Remove it from the to-be-sent set
in
src/net_processing.cpp:4787
in
a9270714e5outdated
4785+ if (txs_to_reconcile.size() + recon_set_size < MAX_PEER_TX_ANNOUNCEMENTS) {
4786+ // Check that:
4787+ // 1) the peer isn't set for flooding OR
4788+ // 2) the transaction isn't set for flooding
4789+ const bool recon_peer_flood_to = *m_reconciliation.IsPeerChosenForFlooding(pto->GetId());
4790+ if (!recon_peer_flood_to || !flood_tx) {
I think it should be AND not an OR here. If recon_peer_flood_to= false, even if transaction is marked for flooding, it will be added to reconciliation set. Is it the semantic intended ?
naumenkogs
commented at 8:57 am on April 21, 2021:
Is it the semantic intended?
Yes. Whether the transaction is marked for flooding only matters for reconciliation peers which are set for flooding.
So, if flood_tx=true but recon_peer_flood_to= false, we should reconcile.
Perhaps this comment covers it already, no:
0/**1* Per BIP-330, we may want to flood certain transactions to a subset of peers with whom we
2* reconcile.3* If the peer was not previously registered for reconciliations, returns nullopt.4*/5 std::optional<bool> IsPeerChosenForFlooding(NodeId peer_id) const;
So, if flood_tx=true but recon_peer_flood_to= false, we should reconcile.
This case I understand. The case I don’t get the rational for is recon_peer_flood_to=true and flood_tx=false.
AFAICT, it’s motivated to preserve privacy of transactions originated locally as detailed in RelayTransaction ? I don’t think it’s achieving its goal as this announcement (i.e reconciliation) will diverge from the announcement-protocol expected on the link (i.e flooding) and this is observable by the peer on the other side of the link ?
Overall, yes the comment could be better as it’s describing code behavior but not it’s rational (or pointing where the rational is laid out). Instead, what do you think of ?
01// Unless sketch capacity has been reached, expected announcement policy should not diverge
2// for reconciling peers.
3// Outbound flooding peers may the expected announcement policy overridden in function of
4// the transaction source. Transactions originated locally should be always announced through
5// reconciliation, see RelayTransaction().
naumenkogs
commented at 8:43 am on April 27, 2021:
This case I understand. The case I don’t get the rational for is recon_peer_flood_to=true and flood_tx=false.
Ok, what exactly are you suggesting? Always flood if recon_peer_flood_to=true and drop flood_tx at all?
If this is applied, non-reachable nodes will flood transactions to 7 (erlay enabled) peers after receiving transactions from the 8th. No bandwidth savings because almost all transactions are flooded, bad.
How erlay solves this? Adding a policy: “flood only if a transaction is received from reconciliation initiator [or legacy peer]”. Now non-reachable nodes will never flood, because they have no recon initiator peers (because they have no inbounds). Assuming no legacy peers.
Now, what to do with local transactions? If non-reachable nodes flood them, their reconciling peers will easily tell those are local transactions, because this is the only case non-reachable node floods.
For this special case, we also use reconciliation, to provide privacy.
——————-––––——-
Now, I assume you can change step (2) or even step (1) here and have some different logic. Feel free to suggest, but this is my thinking for now, and Erlay is based on it.
If this is applied, non-reachable nodes will flood transactions to 7 (erlay enabled) peers after receiving transactions from the 8th. No bandwidth savings because almost all transactions are flooded, bad.
That said, I agree with your laid out reasoning on flooding-vs-reconciling peers selection for non-reachable nodes and how to protect their local transaction privacy.
But I don’t think current code logic hinders well reachable nodes’s local transactions as you will reconciliate (flood_tx=false) with your outbound flooding peers.
Let me write a functional test to sustain (or not) my thinking :)
For a non-reachable node, all the peers are inbound=false…. and I just realized that my logic only works if connectivity=MAX_OUTBOUND_FLOOD_TO. Is that understandable?
Now, we should think what to do when we bump the connectivity… Non-reachable nodes will have both flood_to false AND true peers.
Now, we should think what to do when we bump the connectivity… Non-reachable nodes will have both flood_to false AND true peers.
Ok, so this is fine, because again, local transactions and transactions from outbounds are always reconciled, so it doesn’t matter if a non-reachable node has some peers with flood_to=true.
I will improve the docs.
Keeping this open, it’s likely gonna change a bit after I do some experiments.
in
src/net_processing.cpp:4724
in
a9270714e5outdated
4678+ if (!supports_recon && pto->IsInboundConn()) {
4679 pto->m_tx_relay->nNextInvSend = m_connman.PoissonNextSendInbound(current_time, INBOUND_INVENTORY_BROADCAST_INTERVAL);
4680 } else {
4681+ // Use half the delay for outbound peers, as there is less privacy concern for them.
4682+ // Also use half delay for inbound peers if they use reconciliations,
4683+ // because we have an additional delay for sending out requested sketches.
If your peer has been marked as a responder but it sends you anyway a REQRECON, should we severe the connection ? I think it’s qualify as a violation of protocol roles (sender, responder) as negotiated in SENDRECON.
Same reasoning for other Erlay messages, should we enforce protocol flow ?
naumenkogs
commented at 6:55 am on April 22, 2021:
It is enforced in txreconciliation.cpp, but currently these requests are just ignored. I’m not sure if we should disconnect.
in
src/net_processing.cpp:4976
in
0b142cce89outdated
What if a peer send you peer_recon_set_size=UINT16_MAX, may this provoke issues in EstimateSketchCapacity or ComputeBaseSketch, which are both consuming this value or product of it ?
naumenkogs
commented at 6:56 am on April 22, 2021:
What if a peer send you peer_recon_set_size=UINT16_MAX, may this provoke issues in EstimateSketchCapacity or ComputeBaseSketch, which are both consuming this value or product of it ?
This is taken care by capacity = std::min(capacity, MAX_SKETCH_CAPACITY); in ComputeBaseSketch, so this shouldn’t cause any harm.
in
src/txreconciliation.cpp:805
in
0b142cce89outdated
806+ // returning values
807+ std::vector<uint32_t>& txs_to_request, std::vector<uint256>& txs_to_announce, std::optional<bool>& result)
808+ {
809+ uint16_t remote_sketch_capacity = uint16_t(skdata.size() / BYTES_PER_SKETCH_CAPACITY);
810+ // Protocol violation: our peer exceeded the sketch capacity, or sent a malformed sketch.
811+ if (remote_sketch_capacity > MAX_SKETCH_CAPACITY) {
How do we handle failure if the responder never replies to our reconciliation request ?
I don’t think this is handled by TxReconciliationTracker for now. After RECON_REQUEST_INTERVAL expires again, is MaybeRequestReconciliation going to generate a new reconciliation request ?
naumenkogs
commented at 6:58 am on April 22, 2021:
is MaybeRequestReconciliation going to generate a new reconciliation request ?
No, because if (recon_state->second.m_state_init_by_us.m_phase != Phase::NONE) return std::nullopt; takes care of that. Basically, an initiator will wait forever.
in
src/net_processing.cpp:5001
in
0b142cce89outdated
4021+ if (valid_finalization) {
4022+ AnnounceTxs(remote_missing, pfrom);
4023+ } else {
4024+ // Disconnect peers that send reconciliation finalization violating the protocol.
4025+ LogPrint(BCLog::NET, "reconcildiff from peer=%d violates reconciliation protocol; disconnecting\n", pfrom.GetId());
4026+ pfrom.fDisconnect = true;
linter turned up:
test/functional/p2p_erlay.py:20:1: F401 'test_framework.p2p.MIN_P2P_VERSION_SUPPORTED' imported but unused
naumenkogs
commented at 7:42 pm on April 22, 2021:
oops, sorry I keep missing these issues. Linter broke on my machine for some reason, I better fix it before pushing again.
in
src/net_processing.cpp:3611
in
5b8c65d32coutdated
2803+ }
2804+
2805+ if (pfrom.GetCommonVersion() < RECONCILIATION_RELAY_VERSION) {
2806+ LogPrint(BCLog::NET, "ignoring sendrecon due to old common version=%d from peer=%d\n", pfrom.GetCommonVersion(), pfrom.GetId());
2807+ return;
2808+ }
Further, should we disconnect sendrecon sender with a protocol version inferior to RECONCILIATION_RELAY_VERSION ?
Good point to keep the check on WTXID_RELAY_VERSION support, I thought at first it should be removed but you might have sendrecon sender who don’t signal wtxid-relay.
Further, should we disconnect sendrecon sender with a protocol version inferior to RECONCILIATION_RELAY_VERSION ?
Yeah seems right.
in
src/txreconciliation.h:61
in
5b8c65d32coutdated
55+ * - whether we agree to respond to reconciliation requests (send our sketches)
56+ * - reconciliation protocol version
57+ * - salt used for short ID computation required for reconciliation
58+ * A peer can't be registered for future reconciliations without this call.
59+ */
60+ std::tuple<bool, bool, uint32_t, uint64_t> SuggestReconciling(NodeId peer_id, bool inbound);
Should interface documentation say that peer must not be registered more than once ?
I think that’s a property enforced by the assert in the implementation.
naumenkogs
commented at 9:59 am on April 30, 2021:
I think there is a confusion about “register”. Register for me is when Enable[…] is called and when ReconciliationState is created Here, we just generate the salt.
Ok, adding the mention.
in
src/test/txreconciliation_tests.cpp:136
in
5b8c65d32coutdated
What’s purpose serves the TxReconciliationTrackerTest wrapper given in that unit test you’re using directly the reference implementation of TxReconciliationTracker ?
naumenkogs
commented at 9:56 am on April 30, 2021:
The purpose is to deduplicate the repeatable valid handshake (Suggest/Enable) by hiding it under the TxReconciliationTrackerTest
in
src/txreconciliation.cpp:628
in
5b8c65d32coutdated
625+ // for the second time should not happen
626+ LOCK(m_mutex);
627+ if (m_states.find(peer_id) != m_states.end()) return;
628+
629+ recon_version = std::min(recon_version, RECON_VERSION);
630+ if (recon_version < 1) return;
W.r.t to future reconciliation upgrades, IIUC if peer support a lower version than us we downgrade to the lowest common version ? I think it’s interesting to document in the EnableReconciliationSupport declaration how reconciliation version conflicts are negotiated. Further, why RECON_VERSION=1 and not just 0 ?
Really likely we’ll have to upgrade reconciliation if we introduce package_ids as tx-relay announcements in a near future.
naumenkogs
commented at 9:54 am on April 30, 2021:
W.r.t to future reconciliation upgrades, IIUC if peer support a lower version than us we downgrade to the lowest common version ? I think it’s interesting to document in the EnableReconciliationSupport declaration how reconciliation version conflicts are negotiated.?
Agree.
Further, why RECON_VERSION=1 and not just 0 ?
I think this is about personal taste, 0 might have special meaning or something. Gonna keep it 1.
Really likely we’ll have to upgrade reconciliation if we introduce package_ids as tx-relay announcements in a near future.
Then we have v2 :)
in
src/test/txreconciliation_tests.cpp:20
in
5b8c65d32coutdated
0diff --git a/src/test/txreconciliation_tests.cpp b/src/test/txreconciliation_tests.cpp
1index e7f20d7de..8dbad9461 100644
2--- a/src/test/txreconciliation_tests.cpp
3+++ b/src/test/txreconciliation_tests.cpp
4@@ -146,6 +146,10 @@ BOOST_AUTO_TEST_CASE(EnableReconciliationSupportTest)
5 // Test inbound peers.
6 bool inbound = true;
7 8+ NodeId non_registered_peer = 100;
9+ // Do nothing if peer is not registered
10+ tracker.EnableReconciliationSupport(non_registered_peer, inbound, false, false, 1, salt, 0);
11+
12 NodeId peer_id0 = 0;
13 tracker.SuggestReconciling(peer_id0, inbound);
14 // Both roles are false, don't register.
It’s maybe covered in other tests but otherwise asserting on
0diff --git a/src/txreconciliation.cpp b/src/txreconciliation.cpp
1index d4cfc8ee4..a325e0a2a 100644
2--- a/src/txreconciliation.cpp
3+++ b/src/txreconciliation.cpp
4@@ -632,7 +632,9 @@ class TxReconciliationTracker::Impl {
5 auto local_salt = m_local_salts.find(peer_id);
6 7 // This function should be called only after generating the local salt.
8- if(local_salt == m_local_salts.end()) return;
9+ if(local_salt == m_local_salts.end()) {
10+ assert(false);
11+ }
1213 // Must match SuggestReconciling logic.
14 bool we_may_initiate = !inbound, we_may_respond = inbound;
Doesn’t break EnableReconciliationSupportTest.
in
src/txreconciliation.cpp:644
in
5b8c65d32coutdated
641+ bool we_initiate = we_may_initiate && they_may_respond;
642+ // If we ever announce we_initiate && we_may_respond, this will need tie-breaking. For now,
643+ // this is mutually exclusive because both are based on the inbound flag.
644+ assert(!(they_initiate && we_initiate));
645+
646+ if (!(they_initiate || we_initiate)) return;
IIUC, you can only hit this return if the peer is sending us both they_may_initiate=false and they_may_respond=false. This is def a pathological case and maybe we should disconnect such buggy peer ?
in
src/test/txreconciliation_tests.cpp:106
in
5b8c65d32coutdated
262+ SetMockTime(start_time + 20000);
263+ // Still don't request because the other peer is earlier in the queue.
264+ assert(!tracker.MaybeRequestReconciliation(peer_id2));
265+ tracker.RemovePeer(peer_id1);
266+ assert(tracker.MaybeRequestReconciliation(peer_id2));
267+}
This doesn’t seem to break HandleReconciliationRequestTest ?
0diff --git a/src/txreconciliation.cpp b/src/txreconciliation.cpp
1index d4cfc8ee4..65dcc2675 100644
2--- a/src/txreconciliation.cpp
3+++ b/src/txreconciliation.cpp
4@@ -715,7 +715,9 @@ class TxReconciliationTracker::Impl {
5 auto recon_state = m_states.find(peer_id);
6 if (recon_state == m_states.end()) return;
7 if (recon_state->second.m_state_init_by_them.m_phase != Phase::NONE) return;
8- if (recon_state->second.m_we_initiate) return;
9+ if (recon_state->second.m_we_initiate) {
10+ assert(false);
11+ }
1213 double peer_q_converted = peer_q * 1.0 / Q_PRECISION;
14 recon_state->second.m_state_init_by_them.m_remote_q = peer_q_converted;
ariard
commented at 6:58 pm on April 29, 2021:
member
Currently looking through unit tests.
naumenkogs force-pushed
on Apr 30, 2021
ariard
commented at 3:11 pm on April 30, 2021:
member
See comment here if you think replacing assert with exception is better to exercise coverage in unit tests : #16700 (comment)
in
src/test/txreconciliation_tests.cpp:442
in
96c261630coutdated
310+ // The node receives a reconciliation request noting that the initiator has an empty set.
311+ // The node should terminate reconciliation by sending an empty sketch.
312+ TxReconciliationTrackerTest tracker_test(true);
313+ tracker_test.HandleReconciliationRequest(1);
314+ SetMockTime(start_time + 2000);
315+ assert(tracker_test.RespondToReconciliationRequest(skdata));
What’s the unit of the mocktime bump here, seconds or microseconds ? If it’s second I would recommend to replace the bumping value by RECON_RESPOND_INTERVAL, I think it can be taken verbatim from txreconciliation.cpp like you’re already doing for few values.
That way you can assert that’s the module is binding to the interval expectation. Otherwise, slightly increasing RECON_RESPOND_INTERVAL doesn’t break the test.
in
src/test/txreconciliation_tests.cpp:425
in
1d8d43ed09outdated
in
src/test/txreconciliation_tests.cpp:324
in
1d8d43ed09outdated
319+ tracker_test.HandleReconciliationRequest(1);
320+ SetMockTime(start_time + 2000);
321+ assert(tracker_test.RespondToReconciliationRequest(skdata));
322+ assert(skdata.size() == 0);
323+
324+ // The node receives a reconciliation request noting that their local set is empty.
I think this comment should be reverse with the test case above ?
Local set is increased by AddTransaction(5) though initiator set is empty HandleReconciliationRequest
in
src/test/txreconciliation_tests.cpp:329
in
1d8d43ed09outdated
324+ // The node receives a reconciliation request noting that their local set is empty.
325+ // The node should terminate reconciliation by sending an empty sketch.
326+ TxReconciliationTrackerTest tracker_test2(true);
327+ tracker_test2.AddTransactions(5);
328+ tracker_test2.HandleReconciliationRequest(0);
329+ SetMockTime(start_time + 3000);
Adding a check for local_set, but it’s hard to get into local_set_snapshot because it’s a private variable with no getter method. I could possibly check it implicitly, but that’s more of a functional test thing.
in
src/txreconciliation.cpp:747
in
1d8d43ed09outdated
765+ // Update local reconciliation state for the peer.
766+ recon_state->second.m_state_init_by_them.m_phase = Phase::EXT_RESPONDED;
767+
768+ // Local extension sketch can be null only if initial sketch or initial capacity was 0,
769+ // in which case we would have terminated reconciliation already.
770+ uint32_t extended_capacity = std::min(uint32_t(recon_state->second.m_capacity_snapshot * 2), MAX_SKETCH_CAPACITY * 2);
No way to test this. HandleSketch() can be legally called only after MaybeRequestSketch (m_state_init_by_us.m_phase = Phase::INIT_REQUESTED) , which is not possible for responders in the first place.
in
src/test/txreconciliation_tests.cpp:520
in
1d8d43ed09outdated
I would verify returned pointer before to try Deserialize(). Beyond capacity == 0, minisketch API will also return a nullptr in case of memory allocation failure inside library. I don’t know how to tread it, but definitively not crashing on it :/
in
src/txreconciliation.cpp:515
in
1d8d43ed09outdated
510+ recon_state->second.m_local_set.GetRelevantIDsFromShortIDs(differences, txs_to_request, txs_to_announce);
511+
512+ // Update local reconciliation state for the peer.
513+ size_t local_set_size = recon_state->second.m_local_set.GetSize();
514+ recon_state->second.FinalizeInitByUs(true,
515+ RecomputeQ(local_set_size, txs_to_request.size(), txs_to_announce.size()));
Note, coefficient recomputation doesn’t seem covered by unit tests but sounds to be by functional tests. I would favor covering it unit tests too. An unrelated change in p2p_erlay.py in the future might remove coverage, up to you.
in
src/txreconciliation.cpp:39
in
1d8d43ed09outdated
34+*/
35+constexpr unsigned int RECON_FALSE_POSITIVE_COEF = 16;
36+static_assert(RECON_FALSE_POSITIVE_COEF <= 256,
37+ "Reducing reconciliation false positives beyond 1 in 2**256 is not supported");
38+/** Default value for the coefficient used to estimate reconciliation set differences. */
39+constexpr double DEFAULT_RECON_Q = 0.02;
@naumenkogs#21859 should be sufficiently ready to rebase on, if you want CI-testable code. Note the src/minisketchwrapper module, which automatically benchmarks.
naumenkogs force-pushed
on May 31, 2021
naumenkogs force-pushed
on Jun 28, 2021
DrahtBot removed the label
Needs rebase
on Jun 28, 2021
naumenkogs force-pushed
on Jun 29, 2021
rebroad
commented at 11:07 pm on June 29, 2021:
contributor
How can I help test this? I’ve been running it for a few hours so far on mainnet, but haven’t noticed much difference compared to the master branch.
naumenkogs
commented at 4:43 pm on June 30, 2021:
member
@rebroad It only makes a difference if your erlay-enabled node connects to other erlay-enabled nodes :)
I’m planning to invite people to a such sub-network soon (where we can connect to each other by –addnode manually). I’ll let you know once I start that :)
You can obviously run several nodes with this patch by yourself too, just make sure to connect them.
naumenkogs force-pushed
on Jun 30, 2021
naumenkogs force-pushed
on Jul 1, 2021
naumenkogs force-pushed
on Jul 1, 2021
naumenkogs force-pushed
on Jul 2, 2021
naumenkogs force-pushed
on Jul 4, 2021
naumenkogs force-pushed
on Jul 4, 2021
in
doc/README.md:33
in
1a0868c90doutdated
29@@ -30,8 +30,8 @@ Drag Bitcoin Core to your applications folder, and then run Bitcoin Core.
3031 * See the documentation at the [Bitcoin Wiki](https://en.bitcoin.it/wiki/Main_Page)
32 for help and more information.
33-* Ask for help on the [Bitcoin StackExchange](https://bitcoin.stackexchange.com)
34-* Ask for help on [#bitcoin](https://webchat.freenode.net/#bitcoin) on Freenode. If you don't have an IRC client, use [webchat here](https://webchat.freenode.net/#bitcoin).
35+* Ask for help on [Bitcoin StackExchange](https://bitcoin.stackexchange.com).
oops, definitely not indented. Probably a rebasing mistake.
in
test/functional/test_runner.py:330
in
1a0868c90doutdated
338- parser.add_argument('--keepcache', '-k', action='store_true', help='the default behavior is to flush the cache directory on startup. --keepcache retains the cache from the previous testrun.')
339- parser.add_argument('--quiet', '-q', action='store_true', help='only print dots, results summary and failure logs')
340- parser.add_argument('--tmpdirprefix', '-t', default=tempfile.gettempdir(), help="Root directory for datadirs")
341- parser.add_argument('--failfast', action='store_true', help='stop execution after the first test failure')
342- parser.add_argument('--filter', help='filter scripts to run by regular expression')
343+ parser.add_argument('--ansi', action='store_true', default=sys.stdout.isatty(),
oops, definitely not indented. Probably a rebasing mistake.
naumenkogs force-pushed
on Jul 6, 2021
DrahtBot added the label
Needs rebase
on Jul 7, 2021
naumenkogs force-pushed
on Jul 15, 2021
naumenkogs force-pushed
on Jul 20, 2021
DrahtBot removed the label
Needs rebase
on Jul 20, 2021
rebroad
commented at 4:07 pm on July 23, 2021:
contributor
I’ve been testing this, and so far I’ve found that the efficiency is reduced with this patch - i.e. a lower PERCENTAGE of the bandwidth is used for ACCEPTED transactions compared to not using this patch. Therefore it is better NOT to use this patch. Therefore, NACK, for now, perhaps premature.
I’ll continue testing, and share my data in the near future.
sipa
commented at 7:29 pm on July 23, 2021:
member
@rebroad That’s interesting, and surprising. Can you share your data? Which other Erlay-capable nodes have you been testing with? Substantial improvements are only expected once significant subgraphs adopt it.
How can I help test this? I’ve been running it for a few hours so far on mainnet, but haven’t noticed much difference compared to the master branch.
I’ve been testing this, and so far I’ve found that the efficiency is reduced with this patch - i.e. a lower PERCENTAGE of the bandwidth is used for ACCEPTED transactions compared to not using this patch. Therefore it is better NOT to use this patch. Therefore, NACK.
sipa:
Can you share your data?
+1 I love NACK data. Please share else NACK is meaningless.
naumenkogs
commented at 2:23 pm on July 24, 2021:
member
@sipa@prayank23@rebroad
I’ve spent last couple days running my own subnetwork of Erlay nodes connected to mainnet. I was slightly tuning implementation/configurations and observing savings of 20-50% of overall bandwidth.
I can imagine this PR with no modifications yeilding suboptimal results. I will update the PR with good actual-peformance (according to my real-time experiments) in the next few days, and bring that up at the meeting.
I think judging/nacking this based on the perf is premature.
Sorry for spending your time testing this commit. At this point, the PR was open mainly for concept/code review rather than pefr (maybe it could have been a draft PR, yeah).
naumenkogs marked this as a draft
on Jul 24, 2021
naumenkogs force-pushed
on Aug 3, 2021
naumenkogs force-pushed
on Aug 3, 2021
naumenkogs force-pushed
on Aug 3, 2021
naumenkogs force-pushed
on Aug 3, 2021
rebroad
commented at 10:19 am on August 4, 2021:
contributor
I’m now running the latest erlay patch on 2 nodes, and created a manual connection between them. Compared to non-erlay nodes, the percentage of mempool-successful receive traffic seems below average, and the upload bps seems significantly higher than for the other nodes.
Update (5th August 2021) - still getting very poor TX percentage compared to regular nodes - and a lot of wasted bandwidth - far more than any other connection. I get better percentage with an onion connection even though it has a much larger minimum ping - usually the lower the ping the higher the TX percentage, but not with erlay!
naumenkogs force-pushed
on Aug 5, 2021
naumenkogs force-pushed
on Aug 5, 2021
naumenkogs force-pushed
on Aug 5, 2021
naumenkogs force-pushed
on Aug 5, 2021
naumenkogs force-pushed
on Aug 7, 2021
naumenkogs force-pushed
on Aug 7, 2021
naumenkogs force-pushed
on Aug 8, 2021
naumenkogs force-pushed
on Aug 8, 2021
naumenkogs force-pushed
on Aug 8, 2021
naumenkogs force-pushed
on Aug 11, 2021
naumenkogs force-pushed
on Aug 11, 2021
jonatack
commented at 5:54 pm on August 17, 2021:
contributor
Debug build is clean. I’ll try to start looking at this soon.
naumenkogs force-pushed
on Aug 19, 2021
naumenkogs force-pushed
on Aug 19, 2021
naumenkogs force-pushed
on Aug 19, 2021
naumenkogs
commented at 6:37 pm on August 19, 2021:
member
I created 2 discussion boards, please discuss not-code-related topics there:
rebroad
commented at 3:04 pm on August 20, 2021:
contributor
After running this for a few minutes, I’m so far seeing better performance from non-erlay nodes, but perhaps we need more erlay nodes for testing to really see if there’s really an improvement.
naumenkogs
commented at 5:17 pm on August 20, 2021:
member
@rebroad no, the bandwidth savings should come immediately for your node. Maybe the difference is that you run too many conns (I always tested 8/12 erlay conns and that’s it)…
We could also try to understand why it eats more bandwidth in your case, but perhaps we should stick to the realistic case? E.g., 4 erlay conns + 4 legacy conns for.
Anyway, could we continue the bandwidth-related discussion here?
naumenkogs force-pushed
on Sep 17, 2021
naumenkogs marked this as ready for review
on Sep 20, 2021
rebroad
commented at 9:10 am on September 20, 2021:
contributor
Maybe this is an “the emperor has no clothes moment”, but I shall say what’s on my mind without fear of appearing stupid.
To me, this change looks like the improvements are simply due to reducing the delay between sending TXs. With this change I see greater bandwidth, greater TX throughput but reduced efficiency (i.e. percentage of useful bandwidth). I’ve been seeing this since I started testing months ago.
For example, there’s a node in Toronto currently that is managing to be the main provider of TXs, and is not using erlay. I suspect they have changed the code to reduce the delay between TXs:-
This seems easily achieveable without erlay. Erlay also reduces the delay between sending TXs, so it’s an easy hack to get better TX throughput, but the difference with erlay is that it also reduces efficiency (in all my tests so far).
Erlay does well with throughput:-
but badly with efficency:-
I’d like to see some test results from others that contradict mine though.
DrahtBot added the label
Needs rebase
on Sep 30, 2021
naumenkogs force-pushed
on Nov 5, 2021
DrahtBot removed the label
Needs rebase
on Nov 5, 2021
DrahtBot added the label
Needs rebase
on Nov 10, 2021
fanquake referenced this in commit
c1fb30633b
on Nov 12, 2021
jnewbery
commented at 10:59 am on November 12, 2021:
contributor
Minisketch is merged! Please rebase :)
maflcko
commented at 11:07 am on November 12, 2021:
member
#23491 will conflict with this pull after rebase. Let me know if this acceptable or if you prefer to defer 23491 until after all other changes are merged.
fanquake referenced this in commit
d0923098c6
on Nov 16, 2021
naumenkogs force-pushed
on Nov 17, 2021
naumenkogs force-pushed
on Nov 17, 2021
DrahtBot removed the label
Needs rebase
on Nov 17, 2021
rebroad
commented at 1:20 pm on November 25, 2021:
contributor
what is the 2021-11-erlay1 branch please? confused as it looks more recent (from the name) but this one is also getting updated.
rebroad
commented at 7:26 pm on November 25, 2021:
contributor
This does not compile:-
08d4354499f39a49bbb8675494c532802ef034b03 is the first bad commit
1commit 8d4354499f39a49bbb8675494c532802ef034b03 2Author: Gleb Naumenko <naumenko.gs@gmail.com>
3Date: Sat Mar 20 17:36:08 2021 +0200
4 5 Respond to a reconciliation request
6 7 When the time comes, we should send a sketch of our
8 local reconciliation set to the reconciliation initiator.
910 src/net_processing.cpp | 15 +++++++++++++++
11 src/protocol.cpp | 2 ++
12 src/protocol.h | 6 ++++++
13 src/txreconciliation.cpp | 50 +++++++++++++++++++++++++++++++++++++++++++++++-
14 src/txreconciliation.h | 8 ++++++++
15 5 files changed, 80 insertions(+), 1 deletion(-)
168d4354499f39a49bbb8675494c532802ef034b03 is the first bad commit
naumenkogs
commented at 9:53 am on November 29, 2021:
member
what is the 2021-11-erlay1 branch please? confused as it looks more recent (from the name) but this one is also getting updated.
That branch is a first batch of this PR commits, forked to be merged separately within that PR.
I try to keep the OP PR up-to-date with that one, but also not too often to avoid too many notifications here.
8d4354499f39a49bbb8675494c532802ef034b03 is the first bad commit
Yeah I somehow can’t reproduce CI compiler errors neither on my mac nor on my linux, are you referring to those as well? Or can you share your compilation output?
naumenkogs force-pushed
on Nov 29, 2021
naumenkogs force-pushed
on Nov 29, 2021
glozow
commented at 2:13 pm on November 29, 2021:
member
Concept ACK, thanks for splitting the big PR up :) Will review soon.
naumenkogs
commented at 2:57 pm on November 29, 2021:
member
Fixing new CI issues here, for now please review the first batch separately here :)
naumenkogs force-pushed
on Dec 1, 2021
DrahtBot added the label
Needs rebase
on Jan 6, 2022
in
src/txreconciliation.cpp:32
in
5728fac4d3outdated
27+constexpr unsigned int BYTES_PER_SKETCH_CAPACITY = RECON_FIELD_SIZE / 8;
28+/**
29+ * Limit sketch capacity to avoid DoS. This applies only to the original sketches,
30+ * and implies that extended sketches could be at most twice the size.
31+ */
32+constexpr uint32_t MAX_SKETCH_CAPACITY = 2 << 12;
dergoegge
commented at 1:27 pm on February 13, 2022:
The value chosen here seems too large based on some tests that i did.
On my laptop (with a decent CPU) a sketch with a capacity of 2 << 13 and 2 << 13 differences (possible if an extension is requested) took ~20 seconds to decode. On lower end devices this will likely be worse. This is not super bad since sketches are only requested on outbound connections but having the msghand thread block like this is undesirable in any case.
I am not entirely sure what an appropriate value would be but given the expected average set difference of 7 i would assume that 256 - 1024 as the max. sketch capacity would suffice.
naumenkogs
commented at 8:11 am on February 15, 2022:
Excellent point. I will address this once I have to rebase these changes.
in
src/Makefile.test.include:156
in
5728fac4d3outdated
0$ ./src/test/test_bitcoin -t txreconciliation_tests
1Running 4 test cases...
23*** No errors detected
jonatack
commented at 4:00 pm on May 7, 2022:
contributor
Here is a rebase of the pull at https://github.com/jonatack/bitcoin/commits/erlay-PR21515-rebase, if helpful. It may be good to keep this parent PR up to date to provide an overall view of the proposed changes. At first glance it also looks like there are some changes in #23443 that are not present here.
Due to the net_processing renamings and refactoring changes in the interim, there were quite a few updates and some silent merge conflicts that only show up when building each commit, so please reverify: git range-diff 59ac8ba 5728fac 9d93cbe
Doing this helped become acquainted with the changes. Next will begin review of #23443.
Independently of the rebase, most of the commits have one or several build -Wunused-function warnings and re-organizing is needed to add functions in the same commit where they are first used (which facilitates review as well). In order:
Beginning with commit “Add helper to compute reconciliation salt”:
For a subset of reconciling peers we announce transactions
via low fanout. We need to set lower intervals for that to
achieve lower relay latency.
Note that for privacy reasons the ratio between inbound and outbound
delays matter much more than the actual delays. That ratio is preserved
here, so it is not a privacy degradation.
89de0f71dc
p2p: Add peers to reconciliation queue on negotiation
When we're finalizing negotiation, we should add the peers
for which we will initiate reconciliations to the queue.
0ec1b427ec
p2p: Track reconciliation requests schedule
We initiate reconciliation by looking at the queue periodically
with equal intervals between peers to achieve efficiency.
This will be later used to see whether it's time to initiate.
4b57aef2f1
p2p: Initiate reconciliation round
When the time comes for the peer, we send a
reconciliation request with the parameters which
will help the peer to construct a (hopefully) sufficient
reconciliation sketch for us. We will then use that
sketch to find missing transactions.
272f0d5fa8
test: functional test for reqtxrcncle710451051
p2p: Handle reconciliation request
Store the parameters the peer sent us inside the
reconciliation request.
9e0c8cd0a3
Add helper to see whether we should respond to recon request
For initial reconciliatin requests we track whether was the last time
we responded to them, and if it was too recent, we respond after
a small delay.
f99fa46911
Add helper to compute reconciliation tx short idb68cada9db
build: Link minisketch librariese5d8e4a5be
Add helper to compute sketches for tx reconciliation9d27b63575
naumenkogs force-pushed
on Sep 11, 2023
DrahtBot added the label
CI failed
on Sep 11, 2023
DrahtBot removed the label
Needs rebase
on Sep 11, 2023
naumenkogs force-pushed
on Sep 12, 2023
Respond to a reconciliation request
When the time comes, we should send a sketch of our
local reconciliation set to the reconciliation initiator.
e2fbf20193
p2p: Add a function to identify local/remote missing txs
When the sketches from both sides are combined successfully,
the diff is produced. Then this diff can (together with the local txs)
be used to identified which transactions are missing locally and remotely.
89312ab824
Use txid/uint256 in CompareInvMempoolOrder
This will help to reuse the code later on in the function
to announce transactions.
dab16090ea
p2p: Handle reconciliation sketch and successful decoding2edc141a2a
p2p: Request extension if decoding failed
If after decoding a reconciliation sketch it turned out
to be insufficient to find set difference, request extension.
c227ba23f7
Be ready to receive sketch extension
Store the initial sketches so that we are able to process
extension sketch while avoiding transmitting the same data.
8dc4cd2238
p2p: Prepare for sketch extension request
To be ready to respond to a sketch extension request
from our peer, we should store a snapshot of our state
and capacity of the initial sketch, so that we compute
extension of the same size and over the exact same
transactions.
Transactions arriving during this reconciliation will
be instead stored in the regular set.
a6e4a13372
p2p: Keep track of announcements during txrcncl extension41b3325d16
p2p: Handle reconciliation extension request
If peer failed to reconcile based on our initial response sketch,
they will ask us for a sketch extension. Store this request to respond later.
db671c63cc
p2p: Respond to sketch extension request
Sending an extension may allow the peer to reconcile
transactions, because now the full sketch has twice
as much capacity.
44b83dbbc8
Handle sketch extension
If a peer sent us an extension sketch, we should
reconstruct a full sketch from it with the snapshot
we stored initially, and attempt to decode the difference.
p2p: Handle sketch extension
If a peer sent us an extension sketch, we should
reconstruct a full sketch from it with the snapshot
we stored initially, and attempt to decode the difference.
14b2eac934
p2p: Add a finalize incoming reconciliation function
This currently unused function is supposed to be used once
a reconciliation round is done. It cleans the state corresponding
to the passed reconciliation.
b11a05219a
Handle reconciliation finalization message
Once a peer tells us reconciliation is done, we should behave as follows:
- if it was successful, just respond them with the transactions they asked
by short ID.
- if it was a full failure, respond with all local transactions from the reconciliation
set snapshot
- if it was a partial failure (only low or high part was failed after a bisection),
respond with all transactions which were asked for by short id,
and announce local txs which belong to the failed chunk.
test: Update mempool_packages.py test for reconciliation
First, extend the timeout to facilitate reconciliation
relay which might take longer than previously.
Second, since reconciliation can't guarantee that
descendants of same-degree (from the parent) are
relayed in the same order they were received, loosen
the check for descendant limit.
This is the case because, if the parent is flooded,
the children might be both flooded and reconciled,
and the decision is made independently. So even if
one arrived earlier, it can be relayed later because
flooding.
Note, it's possible to make txreconciliation.cpp be aware
of the order of adding transactions to the set, but this
won't help to solve the above issue.
3e08dab56b
naumenkogs force-pushed
on Sep 14, 2023
DrahtBot added the label
Needs rebase
on Nov 28, 2023
DrahtBot
commented at 12:44 pm on November 28, 2023:
contributor
🐙 This pull request conflicts with the target branch and needs rebase.
DrahtBot
commented at 2:02 am on February 26, 2024:
contributor
⌛ There hasn’t been much activity lately and the patch still needs rebase. What is the status here?
Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
Is it no longer relevant? ➡️ Please close.
Did the author lose interest or time to work on this? ➡️ Please close it and mark it ‘Up for grabs’ with the label, so that it can be picked up in the future.
DrahtBot
commented at 0:19 am on May 25, 2024:
contributor
⌛ There hasn’t been much activity lately and the patch still needs rebase. What is the status here?
Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
Is it no longer relevant? ➡️ Please close.
Did the author lose interest or time to work on this? ➡️ Please close it and mark it ‘Up for grabs’ with the label, so that it can be picked up in the future.
This is a metadata mirror of the GitHub repository
bitcoin/bitcoin.
This site is not affiliated with GitHub.
Content is generated from a GitHub metadata backup.
generated: 2025-01-21 12:12 UTC
This site is hosted by @0xB10C More mirrored repositories can be found on mirror.b10c.me