This is a full implementation of Erlay. Its purpose is to check the integrity and correctness of the implementation against changes/additions that may originate from the review process and/or rebases on top of newer functionality.
This is not to be merged. Functionality will be spread across multiple smaller PRs to ease the review process.
Approach
The implementation approach builds on the following assumptions:
Fanout (the current relay method) is faster than Erlay, but less bandwidth efficient
Fanout is optimal if the node we want to announce a certain transaction to doesn’t know about it (but of course, we don’t have that information)
The general approach works as follows:
Reconciliation is used alongside fanout to relay transactions across the network. For Erlay nodes, the relay method will be decided per-transaction, instead of per connection, meaning that Erlay connections will do both fanout and reconciliation depending on the transaction (legacy connections will do only fanout, obviously).
The parameters selected for fanout are kept as low as possible to maximize the bandwidth savings. The currently selected defaults are 4 outbound peers and 10% of inbounds. The relay logic depends on the type of connection and how the transaction has been received:
For outbound connections, if the transaction was received via fanout (or originates from us), we fanout until N peers know about it. We count those peers by checking each peer's m_tx_inventory_known_filter, so announcements received from them also count. If the transaction was received via reconciliation, we simply reconcile with the rest of our peers.
For inbounds, we select 10% of our connections and rotate that selection periodically.
The reasoning is that we are trying to guess how far the transaction has made it into the network with imperfect information. Knowing that fanout is faster than reconciliation, we want a higher fanout rate at the very beginning of the propagation, to get as far as we can while fanout is still fully efficient. This can be tied to how many of our peers already know about the transaction. Once the transaction is sufficiently spread, we can just reconcile it with the rest of our peers.
This does not apply to inbounds: they are not trusted, so the metric could easily be abused, and it may also be used to leak transaction origin information. For them, we just keep a low fanout rate.
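The decision rules described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `RelayMethod`, `TxRelayContext`, `ChooseRelayMethod` and `OUTBOUND_FANOUT_THRESHOLD` are hypothetical names, not the identifiers used in this branch, and the count of outbound peers that know the transaction is assumed to have been derived beforehand from each peer's m_tx_inventory_known_filter.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names for illustration only; not the PR's actual API.
enum class RelayMethod { FANOUT, RECONCILE };

struct TxRelayContext {
    bool peer_is_inbound;             // direction of the connection to this peer
    bool peer_in_inbound_fanout_set;  // peer is in the current rotating ~10% inbound selection
    bool received_via_reconciliation; // how we learned about the transaction
    size_t outbound_peers_knowing_tx; // outbounds whose known-inventory filter already holds the tx
};

// Default from the description: fanout until 4 outbound peers know the tx.
constexpr size_t OUTBOUND_FANOUT_THRESHOLD{4};

RelayMethod ChooseRelayMethod(const TxRelayContext& ctx, size_t threshold = OUTBOUND_FANOUT_THRESHOLD)
{
    if (ctx.peer_is_inbound) {
        // Inbounds are untrusted: ignore propagation heuristics and only
        // fanout to the pre-selected subset.
        return ctx.peer_in_inbound_fanout_set ? RelayMethod::FANOUT : RelayMethod::RECONCILE;
    }
    // A tx learned via reconciliation is assumed to be well spread already.
    if (ctx.received_via_reconciliation) return RelayMethod::RECONCILE;
    // Otherwise fanout while fewer than `threshold` outbounds know about it.
    return ctx.outbound_peers_knowing_tx < threshold ? RelayMethod::FANOUT : RelayMethod::RECONCILE;
}
```

For example, a transaction that originates from us and is known by two outbound peers would still be fanned out to the next outbound peer, while one already known by four would be left for reconciliation.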
Testing and simulating
The last two commits of this PR are currently for simulation only. They make it easy to configure the inbound/outbound fanout rates without having to recompile the code, and make full reconciliation more efficient.
DrahtBot
commented at 5:18 pm on June 12, 2024:
contributor
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
#30116 (p2p: Fill reconciliation sets (Erlay) attempt 2 by sr-gi)
#28690 (build: Introduce internal kernel library by TheCharlatan)
#28463 (p2p: Increase inbound capacity for block-relay only connections by mzumsande)
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
LLM Linter (✨ experimental)
Possible typos and grammar issues:
In RelayTransaction comment: “we do no reconcile by txid” → “we do not reconcile by txid” [fix negation for clarity]
In RelayTransaction inner comment: “Skipp it otherwiese.” → “Skip it otherwise.” [correct spelling]
No other grammatical or typographic errors impacting comprehension were found.
sr-gi marked this as a draft
on Jun 12, 2024
sr-gi force-pushed
on Jun 12, 2024
DrahtBot added the label
CI failed
on Jun 13, 2024
DrahtBot
commented at 4:17 am on June 13, 2024:
contributor
🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.
Try to run the tests locally, according to the documentation. However, a CI failure may still
happen due to a number of reasons, for example:
Possibly due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the
affected test.
An intermittent issue.
Leave a comment here, if you need help tracking down a confusing failure.
sr-gi force-pushed
on Apr 14, 2025
sr-gi force-pushed
on Apr 15, 2025
sr-gi force-pushed
on Apr 15, 2025
sr-gi force-pushed
on Apr 15, 2025
sr-gi force-pushed
on Apr 16, 2025
sr-gi force-pushed
on Apr 16, 2025
DrahtBot removed the label
CI failed
on Apr 16, 2025
sr-gi force-pushed
on Apr 18, 2025
sr-gi force-pushed
on Apr 18, 2025
sr-gi force-pushed
on Apr 18, 2025
DrahtBot added the label
CI failed
on Apr 18, 2025
DrahtBot
commented at 8:13 pm on April 18, 2025:
contributor
Try to run the tests locally, according to the documentation. However, a CI failure may still
happen due to a number of reasons, for example:
Possibly due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the
affected test.
An intermittent issue.
Leave a comment here, if you need help tracking down a confusing failure.
sr-gi force-pushed
on Apr 18, 2025
sr-gi force-pushed
on Apr 21, 2025
sr-gi force-pushed
on Apr 21, 2025
sr-gi force-pushed
on Apr 21, 2025
DrahtBot removed the label
CI failed
on Apr 21, 2025
sr-gi force-pushed
on Apr 22, 2025
DrahtBot added the label
CI failed
on Apr 22, 2025
DrahtBot removed the label
CI failed
on Apr 24, 2025
DrahtBot added the label
Needs rebase
on Apr 29, 2025
sr-gi force-pushed
on Jun 18, 2025
dergoegge
commented at 3:43 pm on June 20, 2025:
member
I was fuzzing this branch with fuzzamoto and it looks like it actually found a remotely reachable assertion.
I’m surprised our existing fuzz tests do not catch this (process_messages might but I haven’t tried), but it looks like we actually never call Minisketch::Deserialize with bytes straight from the fuzzer but rather only with result from Minisketch::Serialize (maybe to avoid the assertion? not sure):
When computing sketches, the capacity was derived from the size of the received data, but it was never checked that the received data size was a multiple of the sketch element size {BYTES_PER_SKETCH_CAPACITY}. Therefore, a sketch could be created such that the capacity was smaller than the data to be decoded into it, making it crash.
Happy to run the fuzzer over the new code in case I’ve missed anything.
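The missing check described above can be illustrated with a minimal sketch. It assumes that one unit of sketch capacity serializes to 4 bytes (a 32-bit field); the constant name is taken from the comment, but its value and the helper `SketchCapacityFromPayload` are assumptions made for illustration and are not taken from the branch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Assumed value: one capacity unit serializes to 4 bytes (32-bit field elements).
constexpr size_t BYTES_PER_SKETCH_CAPACITY{4};

// Hypothetical helper: derive the sketch capacity from an untrusted payload.
// Returns std::nullopt when the payload length is not a whole number of
// elements, so the caller can reject the message instead of building a
// sketch whose capacity is smaller than the data deserialized into it.
std::optional<size_t> SketchCapacityFromPayload(const std::vector<uint8_t>& skdata)
{
    if (skdata.size() % BYTES_PER_SKETCH_CAPACITY != 0) return std::nullopt;
    return skdata.size() / BYTES_PER_SKETCH_CAPACITY;
}
```

With integer division alone, a 7-byte payload would yield a capacity of 1 (4 bytes) while 7 bytes are handed to the deserializer; the explicit modulo check rejects it up front.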
sr-gi force-pushed
on Jun 20, 2025
dergoegge
commented at 4:45 pm on June 23, 2025:
member
Have been running the fuzzer all day and the bug appears to be fixed (and no other bugs so far).
sr-gi force-pushed
on Jun 23, 2025
maflcko
commented at 3:34 pm on June 26, 2025:
member
I’m surprised our existing fuzz tests do not catch this (process_messages might but I haven’t tried), but it looks like we actually never call Minisketch::Deserialize with bytes straight from the fuzzer but rather only with result from Minisketch::Serialize (maybe to avoid the assertion? not sure):
I tried this by starting 8 fuzz processes 5 days ago. 7 still run and one of them crashed after two days. I minimized the input:
DrahtBot removed the label
Needs rebase
on Jul 17, 2025
DrahtBot added the label
Needs rebase
on Jul 18, 2025
sr-gi force-pushed
on Jul 18, 2025
DrahtBot removed the label
Needs rebase
on Jul 19, 2025
sr-gi force-pushed
on Jul 23, 2025
sr-gi force-pushed
on Jul 23, 2025
sr-gi force-pushed
on Jul 23, 2025
DrahtBot added the label
CI failed
on Jul 23, 2025
DrahtBot
commented at 9:10 pm on July 23, 2025:
contributor
🚧 At least one of the CI tasks failed.
Task CentOS, depends, gui: https://github.com/bitcoin/bitcoin/runs/46596474277
LLM reason (✨ experimental): The CI failure is caused by an assertion failure in the txreconciliation.cpp file during the txreconciliation_tests, leading to the test subprocess abort.
DrahtBot removed the label
CI failed
on Jul 24, 2025
sr-gi force-pushed
on Jul 24, 2025
sr-gi force-pushed
on Jul 24, 2025
sr-gi force-pushed
on Jul 24, 2025
sr-gi force-pushed
on Jul 25, 2025
sr-gi force-pushed
on Jul 25, 2025
sr-gi force-pushed
on Jul 28, 2025
sr-gi force-pushed
on Jul 28, 2025
sr-gi force-pushed
on Jul 28, 2025
sr-gi force-pushed
on Jul 28, 2025
sr-gi force-pushed
on Jul 28, 2025
sr-gi force-pushed
on Jul 28, 2025
sr-gi force-pushed
on Jul 28, 2025
sr-gi force-pushed
on Jul 28, 2025
DrahtBot added the label
CI failed
on Jul 29, 2025
DrahtBot
commented at 4:10 am on July 29, 2025:
contributor
🚧 At least one of the CI tasks failed.
Task tidy: https://github.com/bitcoin/bitcoin/runs/46894854808
LLM reason (✨ experimental): Missing constructor arguments and undefined identifiers caused the build to fail.
sr-gi force-pushed
on Jul 29, 2025
DrahtBot removed the label
CI failed
on Jul 29, 2025
DrahtBot added the label
Needs rebase
on Jul 30, 2025
sr-gi force-pushed
on Jul 31, 2025
sr-gi force-pushed
on Jul 31, 2025
sr-gi force-pushed
on Jul 31, 2025
DrahtBot added the label
CI failed
on Jul 31, 2025
DrahtBot
commented at 8:26 pm on July 31, 2025:
contributor
🚧 At least one of the CI tasks failed.
Task lint: https://github.com/bitcoin/bitcoin/runs/47151055383
LLM reason (✨ experimental): The CI failure was caused by a lint error due to missing a required trailing newline.
DrahtBot removed the label
Needs rebase
on Jul 31, 2025
sr-gi force-pushed
on Aug 1, 2025
sr-gi force-pushed
on Aug 1, 2025
sr-gi force-pushed
on Aug 1, 2025
sr-gi force-pushed
on Aug 1, 2025
sr-gi force-pushed
on Aug 1, 2025
sr-gi force-pushed
on Aug 1, 2025
sr-gi force-pushed
on Aug 1, 2025
DrahtBot removed the label
CI failed
on Aug 2, 2025
sr-gi force-pushed
on Aug 4, 2025
sr-gi force-pushed
on Aug 6, 2025
sr-gi force-pushed
on Aug 7, 2025
sr-gi force-pushed
on Aug 7, 2025
sr-gi force-pushed
on Aug 7, 2025
sr-gi force-pushed
on Aug 11, 2025
sr-gi force-pushed
on Aug 11, 2025
sr-gi force-pushed
on Aug 11, 2025
DrahtBot added the label
CI failed
on Aug 11, 2025
DrahtBot
commented at 8:56 pm on August 11, 2025:
contributor
🚧 At least one of the CI tasks failed.
Task CentOS, depends, gui: https://github.com/bitcoin/bitcoin/runs/47850758479
LLM reason (✨ experimental): The failure is caused by compilation errors due to missing constructor overload and incorrect usage of std::floor.
sr-gi force-pushed
on Aug 11, 2025
sr-gi force-pushed
on Aug 12, 2025
sr-gi force-pushed
on Aug 12, 2025
sr-gi force-pushed
on Aug 12, 2025
sr-gi force-pushed
on Aug 12, 2025
sr-gi force-pushed
on Aug 13, 2025
DrahtBot removed the label
CI failed
on Aug 13, 2025
sr-gi force-pushed
on Aug 13, 2025
DrahtBot added the label
CI failed
on Aug 13, 2025
sr-gi force-pushed
on Aug 14, 2025
refactor: redesigns txreconciliation file split and namespace
Splits the txreconciliation logic in three files instead of two, allowing the
TxreconciliationState to be properly tested, instead of being internal to
txreconciliation.cpp.
Also defines everything in the node namespace, instead of being part
of an anonymous one.
d1388c0f9a
refactor: remove legacy comments
These comments became irrelevant in one of the previous code changes.
They simply don't make sense anymore.
a5ad054842
refactor: Defines generic error to be used in several reconciliation methods
218e8fd2ea
refactor: use LogDebug instead of LogPrintLevel in txreconciliation
09315916f9
p2p: Functions to add/remove wtxids to tx reconciliation sets
They will be used later on.
Co-authored-by: Gleb Naumenko <naumenko.gs@gmail.com>
262e21ffc4
p2p: Add PeerManager method to count the number of inbound/outbound fanout peers
b8ca50c23d
p2p: Cache inbound reconciling peers count
It helps to avoid recomputing every time we consider
a transaction for fanout/reconciliation.
Co-authored-by: Gleb Naumenko <naumenko.gs@gmail.com>
0a80697fd1
p2p: Add method to decide whether to fanout or reconcile a transaction
Fanout or reconciliation is decided on a transaction basis, based on the following criteria:
If the peer is inbound, we fanout to a pre-defined subset of peers (which is rotated periodically).
If the peer is outbound, we will reconcile the transaction if we received it via reconciliation, or
defer the decision to relay time otherwise. At relay time, we will fanout to outbounds until a threshold is met
(selecting peers in the order their timers go off) and reconcile with the rest.
With this approach we try to fanout when we estimate to be early in the propagation of the transaction,
and reconcile otherwise. Notice these heuristics don't apply to inbound peers, since they would be easily
exploitable. For inbounds we just aim for a target subset picked at random.
984f1c5100
p2p: Add transactions to reconciliation sets
Transactions eligible for reconciliation are added to the reconciliation sets. For the remaining txs, low-fanout is used.
Co-authored-by: Gleb Naumenko <naumenko.gs@gmail.com>
18ef701733
p2p: Add helper to compute reconciliation tx short ids and a cache of short ids to wtxids
44edf74dd1
p2p: Deal with shortid collisions for reconciliation sets
If a transaction to be added to a peer's recon set has a short id collision (a previously
added wtxid maps to the same short id), both transactions should be fanned out, given
our peer may have added the opposite transaction to our recon set, and these two
transactions won't be reconciled.
e5f1244d68
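A rough sketch of the collision handling described in the commit message above. `ReconSet`, `AddResult` and the string-based `Wtxid` are hypothetical stand-ins rather than the types used in this branch, and removing the earlier entry from the set on collision is an assumption about how "both transactions should be fanned out" could be arranged.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in: a wtxid is represented as a string here.
using Wtxid = std::string;

struct AddResult {
    bool added;                     // true if the wtxid was inserted into the set
    std::optional<Wtxid> collision; // earlier wtxid that maps to the same short id, if any
};

class ReconSet
{
    std::map<uint32_t, Wtxid> m_short_id_to_wtxid;

public:
    AddResult Add(uint32_t short_id, const Wtxid& wtxid)
    {
        auto [it, inserted] = m_short_id_to_wtxid.try_emplace(short_id, wtxid);
        if (inserted) return {true, std::nullopt};
        if (it->second == wtxid) return {false, std::nullopt}; // already tracked
        // Collision: drop the earlier entry and report it, so that the caller
        // can fanout both transactions instead of reconciling either.
        Wtxid previous = it->second;
        m_short_id_to_wtxid.erase(it);
        return {false, previous};
    }

    size_t Size() const { return m_short_id_to_wtxid.size(); }
};
```

A caller would fanout both the new wtxid and the returned `collision` whenever `collision` is set.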
p2p: Add peers to reconciliation queue on negotiation
When we're finalizing negotiation, we should add the peers
for which we will initiate reconciliations to the queue.
Co-authored-by: Gleb Naumenko <naumenko.gs@gmail.com>
7c4f38f127
p2p: Track reconciliation requests schedule
We initiate reconciliation by looking at the queue periodically
with equal intervals between peers to achieve efficiency.
This will be later used to see whether it's time to initiate.
Co-authored-by: Gleb Naumenko <naumenko.gs@gmail.com>
b10d5a0e41
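The equal spacing between peers mentioned in the commit message above could look roughly like this. `ReconSchedule` is a hypothetical class, and the per-peer interval is passed in by the caller because the actual value used by the branch is not stated here.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <deque>
#include <optional>

using PeerId = int64_t;

// Hypothetical scheduler: each queued peer is reconciled with once per
// `per_peer_interval`, with requests to different peers spaced evenly.
class ReconSchedule
{
    std::deque<PeerId> m_queue;
    std::chrono::microseconds m_interval;
    std::chrono::microseconds m_next_request{0};

public:
    explicit ReconSchedule(std::chrono::microseconds per_peer_interval) : m_interval{per_peer_interval} {}

    void AddPeer(PeerId peer) { m_queue.push_back(peer); }

    // Returns the peer to send a reconciliation request to if one is due at
    // `now`, and moves that peer to the back of the queue.
    std::optional<PeerId> NextDue(std::chrono::microseconds now)
    {
        if (m_queue.empty() || now < m_next_request) return std::nullopt;
        const PeerId peer = m_queue.front();
        m_queue.pop_front();
        m_queue.push_back(peer);
        // Evenly spread requests: the gap is the per-peer interval divided
        // by the number of peers currently in the queue.
        m_next_request = now + m_interval / static_cast<int64_t>(m_queue.size());
        return peer;
    }
};
```

With two peers and an 8 second per-peer interval, a request would be due every 4 seconds, alternating between the peers.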
p2p: Initiate reconciliation round
When the time comes for the peer, we send a
reconciliation request with the parameters which
will help the peer to construct a (hopefully) sufficient
reconciliation sketch for us. We will then use that
sketch to find missing transactions.
Co-authored-by: Gleb Naumenko <naumenko.gs@gmail.com>
When the time comes, we should send a sketch of our
local reconciliation set to the reconciliation initiator.
Co-authored-by: Gleb Naumenko <naumenko.gs@gmail.com>
eef78293ac
p2p: Add a function to identify local/remote missing txs
When the sketches from both sides are combined successfully,
the diff is produced. Then this diff can (together with the local txs)
be used to identify which transactions are missing locally and remotely.
Co-authored-by: Gleb Naumenko <naumenko.gs@gmail.com>
a817497b56
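As a sketch of the step described in the commit message above: once the combined sketch decodes to a list of short ids (the symmetric difference), each id either matches a local transaction (the peer is missing it) or does not (we are missing it). `ReconDiff`, `SplitDifference` and the string-based `Wtxid` are hypothetical names for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in: a wtxid is represented as a string here.
using Wtxid = std::string;

struct ReconDiff {
    std::vector<uint32_t> to_request; // short ids we do not have locally
    std::vector<Wtxid> to_announce;   // local wtxids the peer does not have
};

// `diff` holds the decoded short ids; `local_set` maps the short ids of our
// reconciliation set to their wtxids.
ReconDiff SplitDifference(const std::vector<uint32_t>& diff, const std::map<uint32_t, Wtxid>& local_set)
{
    ReconDiff out;
    for (const uint32_t short_id : diff) {
        const auto it = local_set.find(short_id);
        if (it == local_set.end()) {
            out.to_request.push_back(short_id); // missing locally
        } else {
            out.to_announce.push_back(it->second); // missing remotely
        }
    }
    return out;
}
```

Short ids that appear in the local set but not in the difference are known to both sides and need no further action.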
Use txid/uint256 in CompareInvMempoolOrder
This will help to reuse the code later on in the function to announce transactions.
Co-authored-by: Gleb Naumenko <naumenko.gs@gmail.com>
f9b5fa26ed
p2p: Handle reconciliation sketch and successful decoding
SQUASH-ME: Flags tx as received via recon if it was requested via recondiff
TODO: We may be OK defining a smaller m_recently_requested_short_ids, since
its contents only really matter for less than a minute
6686213c4f
p2p: Request extension if decoding failed
If after decoding a reconciliation sketch it turned out
to be insufficient to find set difference, request extension.
Co-authored-by: Gleb Naumenko <naumenko.gs@gmail.com>
1bcb8e0789
p2p: Be ready to receive sketch extension
Store the initial sketches so that we are able to process
extension sketch while avoiding transmitting the same data.
Co-authored-by: Gleb Naumenko <naumenko.gs@gmail.com>
ac6e06ce91
p2p: Prepare for sketch extension request
To be ready to respond to a sketch extension request
from our peer, we should store a snapshot of our state
and capacity of the initial sketch, so that we compute
extension of the same size and over the exact same
transactions.
Transactions arriving during this reconciliation will
be instead stored in the regular set.
Co-authored-by: Gleb Naumenko <naumenko.gs@gmail.com>
4f17cd9026
p2p: Keep track of announcements during txrcncl extension
If the peer failed to reconcile based on our initial response sketch,
they will ask us for a sketch extension. Store this request to respond later.
Co-authored-by: Gleb Naumenko <naumenko.gs@gmail.com>
3e39fd871f
p2p: Respond to sketch extension request
Sending an extension may allow the peer to reconcile
transactions, because now the full sketch has twice
as much capacity.
Co-authored-by: Gleb Naumenko <naumenko.gs@gmail.com>
797dcc94ab
p2p: Handle sketch extension
If a peer sent us an extension sketch, we should
reconstruct a full sketch from it with the snapshot
we stored initially, and attempt to decode the difference.
Co-authored-by: Gleb Naumenko <naumenko.gs@gmail.com>
3b34b4059a
p2p: Add a finalize incoming reconciliation function
This currently unused function is supposed to be used once
a reconciliation round is done. It cleans the state corresponding
to the passed reconciliation.
Co-authored-by: Gleb Naumenko <naumenko.gs@gmail.com>
06e20f76a0
p2p: Handle reconciliation finalization message
Once a peer tells us reconciliation is done, we should behave as follows:
- if it was successful, just respond to them with the transactions they asked for
by short ID.
- if it was a full failure, respond with all local transactions from the reconciliation
set snapshot
- if it was a partial failure (only the low or high part failed after a bisection),
respond with all transactions which were asked for by short id,
and announce local txs which belong to the failed chunk.
Co-authored-by: Gleb Naumenko <naumenko.gs@gmail.com>
8fa7b8347a
p2p, test: Add tx reconciliation functional tests
We may still need to add more tests, especially around extensions (if we keep them)
Co-authored-by: Gleb Naumenko <naumenko.gs@gmail.com>
3f79937866
REMOVE-ME, SIMS-ONLY: Adds options to configure in/out fanout rates
f6d810a595
sr-gi force-pushed
on Aug 14, 2025
DrahtBot removed the label
CI failed
on Aug 14, 2025
This is a metadata mirror of the GitHub repository
bitcoin/bitcoin.
This site is not affiliated with GitHub.
Content is generated from a GitHub metadata backup.
generated: 2025-08-15 18:12 UTC