ML thread: https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-May/020493.html
Supersedes #1324.
ML thread: https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-May/020493.html
Supersedes #1324.
Assigned BIP # 331
Updated for number
137 | +[[File:./bip-package-relay/tx_scenarios.png|800px]] 138 | + 139 | +'''No Package Information Round:''' Instead of having a package information round, just use the 140 | +child's wtxid to refer to the package and always send the entire package together. This would make 141 | +it very likely for honest nodes to redownload duplicate transactions. 142 | +[[File:./bip-package-relay/no_package_info.png|900px]]
Need to update paths to images now that bip number is assigned
Doh, thanks!
I'm not a big mempool expert, but this seems like a good direction.
There are obvious typos and misalignments you're probably aware of, i assume we shouldn't merge them?
Package Information Only
This example seems confusing: it violates the following rule A node MUST NOT send a "ancpkginfo" message that has not been requested by the recipient. Upon receipt of an unsolicited "ancpkginfo", a node may disconnect the sender.
It will likely work if you expand the example to 3 transactions?
Unclear what to do when transactions are replaced in the middle of this process. Should that be covered? Probably will become clearer when you prototype.
Same about timestamping concerns i guess.
As a style comment, I'd suggest doing some compression of the text :)
Is Package Erlay possible?
6a) I actually made some attempt to handle parent/child within Erlay, do you think we should exercise the overlap between the two better? E.g., for those packages flowing through the network (not the IBD case, but fee-bump or something) Erlay might do the batching you need.
6b) One might also think about "what if we reconcile packages"? Probably bad idea, because the sets (packages) are too small? My point is that this should not be confused with what you describing.
MSG_ANCPKGINFO and getpkgtxns have some redundancy in the data... The latter could refer to the hash of the former + the indexes, for example? Not sure savings worth the complexity (how often this will be used? every lightning closure?) but maybe worth making it into q/a :)
Thanks for the review @naumenkogs! yes, I think it's best to talk specifics with the actual implementation but I'll throw some quick answers for now:
Unclear what to do when transactions are replaced in the middle of this process. Should that be covered? Probably will become clearer when you prototype.
One should treat this situation the same way we do for transactions that get replaced in between inv and getdata (I don't see any reason it should be different). Currently, we send "notfound". If in the future we decide to relay recently-replaced transactions, then package relay would include recently-replaced transactions as well. Though I agree this seems more like an implementation question than a spec question. In the spec, I would leave it as "the receiver may respond with notfound."
I actually made some attempt to handle parent/child within Erlay, do you think we should exercise the overlap between the two better? E.g., for those packages flowing through the network (not the IBD case, but fee-bump or something) Erlay might do the batching you need.
Yes it's great that Erlay orders them correctly. Note below-fee-filter transactions to the reconciliation bucket so not every case is handled. I plan to have a rebase-on-top-of-Erlay version so we can exercise the two together.
One might also think about "what if we reconcile packages"? Probably bad idea, because the sets (packages) are too small? My point is that this should not be confused with what you describing.
Mostly the "Is Package Erlay possible" section was to clarify that they aren't mutually exclusive. Reconciling packages is I think something that may come up when we get to sender-initiated package relay. I'm thinking, after reconciliation, it may be useful to send a package inv instead of individual tx invs if there are fee-bumps... But for now, since packages must be requested by the receiver, we don't need to think about that yet.
MSG_ANCPKGINFO and getpkgtxns have some redundancy in the data... The latter could refer to the hash of the former + the indexes, for example?
Yeah that's an interesting idea, we could do a bitfield of 4 bytes (should cover any package since max count is 25). However this means you can only request getpkgtxns from somebody who sent an ancpkginfo (currently this is not a requirement). It also requires the sender to keep track of (allocate memory for, remember to expire) what they announced so they know what "the ith transaction from this package" means. Thoughts?
Though I agree this seems more like an implementation question than a spec question. In the spec, I would leave it as "the receiver may respond with notfound."
The thing with current notfound is that it's atomic I guess — makes it easy to consider every announcement independent and our code treats it as such (in a poor way but there are also no expectations). With packages, I imagine a CPFP transaction gets replaced and dropped from the package due. A package recipient would then do what, drop the entire package, or consider it invalid?
A requestor will get this package which doesn't pass the fees, and drop it on the floor? Or should it re-request the package? At the moment, I'm not interested in the final answer — but interested in making sure we don't miss such questions.
Reconciling packages is I think something that may come up when we get to sender-initiated package relay. I'm thinking, after reconciliation, it may be useful to send a package inv instead of individual tx invs if there are fee-bumps... But for now, since packages must be requested by the receiver, we don't need to think about that yet.
Not sure this anything to do with who initiates packages. Let's think about the current proposal. Imagine, upon the announcement of C, the receiver sends getdata(MSG_ANCPKGINFO, C), along with some compressed data saying that he already has 1 parent and 1 child of C (by just looking at the wtxid). Then it would save us announcing these 2 wtxids in the reverse direction.
Yeah that's an interesting idea, we could do a bitfield of 4 bytes (should cover any package since max count is 25). However this means you can only request getpkgtxns from somebody who sent an ancpkginfo (currently this is not a requirement). It also requires the sender to keep track of (allocate memory for, remember to expire) what they announced so they know what "the ith transaction from this package" means. Thoughts?
I don't think memory allocation and stuff is a show-stopper, but it might be too much complexity if the gain is low. We should measure how much savings it allows... I think saving 28 byte per announced tx might be worthy. As for the "same peer", do you ever expect packages to be requested from a different peer from the one announced them?
88 | +changing its txid; a node cannot use txid to deduplicate transactions it has already downloaded 89 | +or validated. Ideally, two nodes that both support BIP339 wtxid-based transaction relay shouldn't 90 | +ever need to use txid-based transaction relay. 91 | + 92 | +A single use case of txid-based relay remains: handling "orphan" transactions that spend 93 | +output(s) from a transaction they are unaware of. Orphans are common for new nodes that have just
output(s) from a transaction the receiving node is unaware of. Orphans are common for new nodes that have just
95 | + 96 | +Nodes may handle orphans by storing them in a cache and requesting any missing parent(s) by txid 97 | +(prevouts specify txid, not wtxid). These parents may end up being orphans as well, if they also 98 | +spend unconfirmed inputs that the node is unaware of. This method of handling orphans is problematic 99 | +for two reasons: it requires nodes to allocate memory for unvalidated data received on the p2p 100 | +network and txid-based relay between two wtxid-relay peers.
network and it relies on txid-based relay between two wtxid-relay peers.
108 | + 109 | +A transaction's '''ancestors''' include, recursively, its parents, the parents of its parents, etc. 110 | +A transaction's '''descendants''' include, recursively, its children, the children of its children, 111 | +etc. A transaction's parent is its ancestor, but an ancestor is not necessarily a parent. 112 | + 113 | +A '''package''' is an ordered list of transactions, representable by a connected Directed Acyclic
"an ordered list". Is there a well-defined ordering? What is it?
131 | +For example, consider the following scenarios (where the mempool minimum feerate is 3sat/vB and all 132 | +transactions are the same size): 133 | + 134 | +* Package {A, B} where A pays 0 satoshis and B pays 8000 satoshis in fees. 135 | +* Package {C, D} where C pays 0 satoshis and D pays 1200 satoshis in fees. 136 | +* Package {E, F, G, H, J} that pays 4000, 8000, 0, 2000, and 4000 satoshis in fees, respectively.
* Package {E, F, G, H, J} that pay 4000, 8000, 0, 2000, and 4000 satoshis in fees, respectively.
129 | +network bandwidth, avoid downloading a transaction more than once, avoid downloading transactions 130 | +that are eventually rejected, and minimize storage allocated for not-yet-validated transactions. 131 | +For example, consider the following scenarios (where the mempool minimum feerate is 3sat/vB and all 132 | +transactions are the same size): 133 | + 134 | +* Package {A, B} where A pays 0 satoshis and B pays 8000 satoshis in fees.
Why not make the transactions' virtual size 1kvB to make the math super easy?
133 | + 134 | +* Package {A, B} where A pays 0 satoshis and B pays 8000 satoshis in fees. 135 | +* Package {C, D} where C pays 0 satoshis and D pays 1200 satoshis in fees. 136 | +* Package {E, F, G, H, J} that pays 4000, 8000, 0, 2000, and 4000 satoshis in fees, respectively. 137 | +[[File:./bip-0331/tx_scenarios.png|800px]] 138 | +
< br/>
The line below ("No Package Information Round...") wraps strangely on certain resolution displays. The "No Package" appears to the right of the image and then there's a new line for "Information Round...". I think you can force a line break so that the text appears on a new line.
138 | + 139 | +'''No Package Information Round:''' Instead of having a package information round, just use the 140 | +child's wtxid to refer to the package and always send the entire package together. This would make 141 | +it very likely for honest nodes to redownload duplicate transactions. 142 | +[[File:./bip-0331/no_package_info.png|900px]] 143 | +
As above, force a newline here:
<br />
159 | +Too-low-feerate transactions (i.e. below the node's minimum mempool feerate) with high-feerate 160 | +descendants can also be relayed this way. If the peers are using BIP133 fee filters and a 161 | +low-feerate transaction is below the node's fee filer, the sender will not announce it. The 162 | +high-feerate transaction is seen and handled as an orphan, the transactions are validated as a 163 | +package, and so the protocol naturally works for this use case.<ref>'''Does this mean BIP133 is 164 | +required for package relay to work?'''No. If the low-feerate transaction was sent because it wasn't
required for package relay to work?''' No. If the low-feerate transaction was sent because it wasn't
269 | + 270 | +# The "pkgtxns" message has the structure defined above, with pchCommand == "pkgtxns". 271 | + 272 | +# A "pkgtxns" message should contain the transaction data requested using "getpkgtxns". 273 | + 274 | +# Upon reeipt of a "pkgtxns" message, the node should validate the transactions together as a package rather than individually.
# Upon receipt of a "pkgtxns" message, the node should validate the transactions together as a package rather than individually.
331 | +# This inv type should only be used if both peers agreed to send version 0 packages. If a "getdata" message with type MSG_ANCPKGINFO is received from a peer with which ancestor package relay was not negotiated, the sender may be disconnected. 332 | + 333 | +====MSG_PKGTXNS==== 334 | + 335 | +# A new inv type (MSG_PKGTXNS == 0x7) is added, for use only in "notfound" messages pertaining to 336 | +package transactions.
# A new inv type (MSG_PKGTXNS == 0x7) is added, for use only in "notfound" messages pertaining to package transactions.
203 | + 204 | +===Combined Hash=== 205 | + 206 | +A "combined hash" serves as a unique "package id" for some list of unique transactions. 207 | + 208 | +The combined hash of a package of transactions is equal to the hash of each transaction's wtxid, concatenated in lexicographical order.
Which hash function?
314 | + 315 | +# Upon receipt of a malformed "ancpkginfo" message that does not contain the ancestor package of the transaction (which the node learns after receiving the transaction data and validating them), the sender may be disconnected. 316 | + 317 | +# A node MUST NOT send a "ancpkginfo" message that has not been requested by the recipient. Upon receipt of an unsolicited "ancpkginfo", a node may disconnect the sender. 318 | + 319 | +# A "ancpkginfo" message should only be sent if both peers agreed to send version 0 packages. If a "ancpkginfo" message is received from a peer with which ancestor package relay was not negotiated, the sender may be disconnected.
It'd be good to explicitly define what a "version 0 package" is in this doc.
Sorry about that, the BIP-327 title issue has been fixed, so please rebase.
25 | +selecting them for inclusion in blocks 26 | +<ref>[https://github.com/bitcoin/bitcoin/pull/7600 Select transactions using feerate-with-ancestors]</ref>. 27 | +Incentive-compatible mempool and miner policies help create a fair, fee-based market for block 28 | +space. While miners maximize transaction fees in order to earn higher block rewards, non-mining 29 | +users participating in transaction relay reap many benefits from employing policies that result in a 30 | +mempool with the same contents, including faster compact block relay and more accurate fee
Perhaps "the same contents" -> "similar contents". I think that better conveys that there's a spectrum, where the closer you are to the miners' mempools, the more benefit you get from compact block relay and fee estimation.
61 | +</ref> 62 | + 63 | +These transactions must meet a certain confirmation target to be effective, but their feerates 64 | +are negotiated well ahead of broadcast time. If the forecasted feerate was too low and no 65 | +fee-bumping options are available, attackers can steal money from their counterparties. Always 66 | +overestimating fees may sidestep this issue temporarily (while mempool traffic is low and
overestimating fees may sidestep this issue (at least while mempool traffic is low and
I think the word 'temporarily' here could be misinterpreted.
59 | +* [https://github.com/t-bast/lightning-docs/blob/master/pinning-attacks.md T-bast's Pinning Attacks Document] 60 | +* [https://gist.github.com/instagibbs/60264606e181451e977e439a49f69fe1 Eltoo Pinning] 61 | +</ref> 62 | + 63 | +These transactions must meet a certain confirmation target to be effective, but their feerates 64 | +are negotiated well ahead of broadcast time. If the forecasted feerate was too low and no
Perhaps prefer "forecast" as the past of "forecast":
are negotiated well ahead of broadcast time. If the forecast feerate was too low and no
(both "forecast" and "forecasted" are acceptable forms for the past participle, but "forecast" seems to be more common)
62 | + 63 | +These transactions must meet a certain confirmation target to be effective, but their feerates 64 | +are negotiated well ahead of broadcast time. If the forecasted feerate was too low and no 65 | +fee-bumping options are available, attackers can steal money from their counterparties. Always 66 | +overestimating fees may sidestep this issue temporarily (while mempool traffic is low and 67 | +predictable), but this solution is not foolproof and wastes users' money. For some attacks,
Perhaps "foolproof" -> "guaranteed to work". "foolproof" implies more user error to me, whereas this method may fail due to circumstances outside the user's control.
77 | + 78 | +Theoretically, developing a safe and incentive-compatible package mempool acceptance policy is 79 | +sufficient to solve this issue. Nodes could opportunistically accept packages (e.g. by trying 80 | +combinations of transactions rejected from their mempools), but this practice would likely be 81 | +inefficient at best and open new Denial of Service attacks at worst. As such, this proposal 82 | +suggests adding new p2p messages enabling nodes to request and share package validation-related
Does the "related" refer to "package validation"? If yes, I think this is better:
suggests adding new p2p messages enabling nodes to request and share package-validation-related
88 | +changing its txid; a node cannot use txid to deduplicate transactions it has already downloaded 89 | +or validated. Ideally, two nodes that both support BIP339 wtxid-based transaction relay shouldn't 90 | +ever need to use txid-based transaction relay. 91 | + 92 | +A single use case of txid-based relay remains: handling "orphan" transactions that spend output(s) 93 | +from a transaction the receiving node is unaware of. Orphans are common for new nodes that have
from an unconfirmed transaction the receiving node is unaware of. Orphans are common for new nodes that have
132 | +transactions are 1000vB in size): 133 | + 134 | +* Package {A, B} where A pays 0 satoshis and B pays 8000 satoshis in fees. 135 | +* Package {C, D} where C pays 0 satoshis and D pays 1200 satoshis in fees. 136 | +* Package {E, F, G, H, J} that pay 4000, 8000, 0, 2000, and 4000 satoshis in fees, respectively. 137 | +[[File:./bip-0331/tx_scenarios.png|800px]]
You still need to update the values in these bullet points and the image. Also, the image says the virtual size is 1600Wu, but virtual size should be in vB. I'd also remove the "size" from the image.
If it doesn't clutter it too much it might also be nice to add the package feerates? Although they are simple enough to calculate too...
Ended up removing this as I don't think it was adding very much
133 | + 134 | +* Package {A, B} where A pays 0 satoshis and B pays 8000 satoshis in fees. 135 | +* Package {C, D} where C pays 0 satoshis and D pays 1200 satoshis in fees. 136 | +* Package {E, F, G, H, J} that pay 4000, 8000, 0, 2000, and 4000 satoshis in fees, respectively. 137 | +[[File:./bip-0331/tx_scenarios.png|800px]] 138 | +< br/>
Sorry, I got the whitespace wrong in my review comment. It should be:
<br />
Same on line 144 below.
205 | + 206 | +===Combined Hash=== 207 | + 208 | +A "combined hash" serves as a unique "package id" for some list of unique transactions. 209 | + 210 | +The combined hash of a package of transactions is equal to the hash of each transaction's wtxid, concatenated in lexicographical order.
I think you should include the name of the hash function in this definition.
Should we use a BIP340 tagged hash here? (I'm not sure, thinking out loud). For example, could be relevant for these rolling bloom filters that contain hashes corresponding to both txes and packages?
67 | +predictable), but this solution is not foolproof and wastes users' money. For some attacks, 68 | +the available defenses require nodes to have a bird's-eye view of Bitcoin nodes' mempools, 69 | +which is an unreasonable security requirement. 70 | + 71 | +The best solution is to enable nodes to consider packages of transactions as a unit, e.g. one or 72 | +more low-fee parent transactions with a high-fee child, instead of separately. A package-aware
more low-fee ancestor transactions with a high-fee descendant, instead of separately. A package-aware
149 | +transaction download fails for some reason, it shouldn't be used as the default because it always 150 | +requires storage of unvalidated transactions. 151 | +[[File:./bip-0331/package_info_only.png|1200px]] 152 | +</ref> 153 | + 154 | +Upon encountering an orphan, a node may request ancestor package information delineating the wtxids
Upon receiving an orphan transaction, a node may request ancestor package information delineating the wtxids
119 | + 120 | +==Specification== 121 | + 122 | +===Intended Protocol Flow=== 123 | + 124 | +Package Relay consists of two phases: a package information round and a transaction data round.
I'd add a sentence here along the lines of "This package relay protocol satisfies both use cases (orphan transaction handling and high-feerate descendant paying for low-feerate ancestors)." and then have sub-headings for "orphan transaction handling", "high-feerate descendants" and "feature negotiation" below. Without the sub-headings, it's a bit difficult to follow the structure of the doc.
159 | +[[File:./bip-0331/protocol_flow.png|1200px]] 160 | + 161 | +Too-low-feerate transactions (i.e. below the node's minimum mempool feerate) with high-feerate 162 | +descendants can also be relayed this way. If the peers are using BIP133 fee filters and a 163 | +low-feerate transaction is below the node's fee filer, the sender will not announce it. The 164 | +high-feerate transaction is seen and handled as an orphan, the transactions are validated as a
high-feerate transaction ''will'' be sent by the sender, and received and handled as an orphan by the receiver. The transactions are validated as a
160 | + 161 | +Too-low-feerate transactions (i.e. below the node's minimum mempool feerate) with high-feerate 162 | +descendants can also be relayed this way. If the peers are using BIP133 fee filters and a 163 | +low-feerate transaction is below the node's fee filer, the sender will not announce it. The 164 | +high-feerate transaction is seen and handled as an orphan, the transactions are validated as a 165 | +package, and so the protocol naturally works for this use case.<ref>'''Does this mean BIP133 is
I suggest you bring this into the main body of this section rather than have it as a footnote.
281 | +It was [https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-May/020496.html suggested] that 282 | +this message is not package-specific and generally useful for downloading transactions as 283 | +batches.</ref> 284 | + 285 | + 286 | +====ancpkginfo====
Would it make sense to put this above the definition of "getpkgtxns"? "ancpkginfo comes first both sequentially in the protocol flow and alphabetically.
Do we want to redefine a package info message for every package version? Could also use a generic pkginfo with version field matching the one in sendpackages?
Do we want to redefine a package info message for every package version?
Yes
Could also use a generic pkginfo with version field matching the one in sendpackages?
I think it makes more sense to have a separate protocol message for separate things than to add 4-byte overhead on every package info message.
326 | + 327 | +# A new inv type (MSG_ANCPKGINFO == 0x6) is added, for use only in getdata requests pertaining to ancestor packages. 328 | + 329 | +# As a getdata request type, it indicates that the sender wants a "ancpkginfo" containing all of the unconfirmed ancestors of a transaction, referenced by wtxid. 330 | + 331 | +# Upon receipt of a "getdata(MSG_ANCPKGINFO)" request for, the node should respond with the "ancpkginfo" corresponding to the transaction's unconfirmed ancestor package, or with NOTFOUND. The wtxid of the requested transaction should be the last item in the list. The node should not assume that the sender is requesting the transaction data as well.
# Upon receipt of a "getdata(MSG_ANCPKGINFO)" request, the node should respond with the "ancpkginfo" corresponding to the transaction's unconfirmed ancestor package, or with NOTFOUND. The wtxid of the requested transaction should be the last item in the list. The node should not assume that the sender is requesting the transaction data as well.
# Upon receipt of a "getdata(MSG_ANCPKGINFO)" request for, the node should respond with an "ancpkginfo" message corresponding to the transaction's unconfirmed ancestor package, or with NOTFOUND. The wtxid of the requested transaction should be the last item in the list. The node should not assume that the sender is requesting the transaction data as well.
I think "The node should not assume that the sender is requesting the transaction data as well." is unnecessary, since the description of the protocol above has already specified that the protocol has separate package information and transaction data rounds.
328 | + 329 | +# As a getdata request type, it indicates that the sender wants a "ancpkginfo" containing all of the unconfirmed ancestors of a transaction, referenced by wtxid. 330 | + 331 | +# Upon receipt of a "getdata(MSG_ANCPKGINFO)" request for, the node should respond with the "ancpkginfo" corresponding to the transaction's unconfirmed ancestor package, or with NOTFOUND. The wtxid of the requested transaction should be the last item in the list. The node should not assume that the sender is requesting the transaction data as well. 332 | + 333 | +# The inv type should not be used in announcements, i.e. "inv(MSG_ANCPKGINFO)" should never be sent.
What should happen if a node receives an inv(MSG_ANCPKGINFO)? Drop/ignore the message or disconnect/punish?
314 | +downloaded. See [https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-June/020558.html discussion]. 315 | +</ref> 316 | + 317 | +# Upon receipt of a "ancpkginfo" message, the node may use it to request the transactions not already in its mempool. 318 | + 319 | +# Upon receipt of a malformed "ancpkginfo" message that does not contain the ancestor package of the transaction (which the node learns after receiving the transaction data and validating them), the sender may be disconnected.
This seems to slightly contradict the "Should a peer be punished if they provide incorrect package info" footnote, or am I misunderstanding?
Fixed, described what malformed means
330 | + 331 | +# Upon receipt of a "getdata(MSG_ANCPKGINFO)" request for, the node should respond with the "ancpkginfo" corresponding to the transaction's unconfirmed ancestor package, or with NOTFOUND. The wtxid of the requested transaction should be the last item in the list. The node should not assume that the sender is requesting the transaction data as well. 332 | + 333 | +# The inv type should not be used in announcements, i.e. "inv(MSG_ANCPKGINFO)" should never be sent. 334 | + 335 | +# This inv type should only be used if both peers agreed to send version 0 packages. If a "getdata" message with type MSG_ANCPKGINFO is received from a peer with which ancestor package relay was not negotiated, the sender may be disconnected.
Perhaps:
# This inv type should only be used if both peers agreed to send version 0 packages. If a "getdata" message with type MSG_ANCPKGINFO is received from a peer with which ancestor package relay was not negotiated, no response should be sent and the sender may be disconnected.
318 | + 319 | +# Upon receipt of a malformed "ancpkginfo" message that does not contain the ancestor package of the transaction (which the node learns after receiving the transaction data and validating them), the sender may be disconnected. 320 | + 321 | +# A node MUST NOT send a "ancpkginfo" message that has not been requested by the recipient. Upon receipt of an unsolicited "ancpkginfo", a node may disconnect the sender. 322 | + 323 | +# A "ancpkginfo" message should only be sent if both peers agreed to send version 0 packages. If a "ancpkginfo" message is received from a peer with which ancestor package relay was not negotiated, the sender may be disconnected.
This seems slightly redundant with the point above. If package relay has not been negotiated, then the peer shouldn't have sent a getdata(MSG_ANCPKGINFO)
254 | + 255 | +# The "getpkgtxns" message has the structure defined above, with pchCommand == "getpkgtxns". 256 | + 257 | +# A "getpkgtxns" message should be used to request all or some of the transactions previously announced in a "ancpkginfo" message, specified by witness transaction id. This message is intended to allow nodes to avoid downloading and storing transactions that cannot be validated immediately. 258 | + 259 | +# Upon receipt of a "getpkgtxns" message, a node must respond with either a "pkgtxns" containing the requested transactions or "notfound" messages indicating one or more of the transactions is unavailable.<ref>'''Why isn't package relay negotiation required for getpkgtxns?'''
If I read this bullet point and footnote literally, it says that all nodes must respond to getpkgtxns messages, even if they haven't negotiated package relay. That seems wrong. It means that all existing nodes on the network would be in violation of this rule.
Yes my bad, a node should respond.
186 | +right, I propose we hold off on sender-initiated for now, deploy receiver-initiated package relay, 187 | +observe its usage and figure out where we can save a round trip, and then introduce a 188 | +well-researched sender-initiated package relay protocol.</ref> 189 | + 190 | +Ancestor package relay is negotiated between two peers during the version handshake using a 191 | +"sendpackages" message containing version=1. It requires both peers to support wtxid-based relay
Should this be version=0?
238 | +Yes, this is allowed in order to reduce the number of negotiation steps. This means nodes can 239 | +announce features without first checking what the other peer has sent, and then apply logic at the 240 | +end. See [https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-May/020510.html discussion]. 241 | +</ref> 242 | + 243 | +# Version=0 packages correspond to ancestor packages, i.e. "ancpkginfo" and "MSG_ANCPKGINFO".
I think it'd be better to pull this definition out and make it more explicit. It'd also be useful to distinguish between the parts of this proposal that are specific to version 0 (MSG_ANCPKGINFO and ancpkginfo) and the parts that are generic to any kind of package (sendpackages, getpkgtxns, pkgtxns and MSG_PKGTXNS)
Done
348 | +Older clients remain fully compatible and interoperable after this change. Clients implementing this 349 | +protocol will only attempt to send and request packages if agreed upon during the version handshake. 350 | +<ref>'''Will package relay cause non-package relay nodes to waste bandwidth on low-feerate transactions?''' 351 | +If a node supports package relay, it may accept low-feerate transactions (e.g. paying zero fees) 352 | +into its mempool, but non-package relay nodes would most likely reject them. To mitigate bandwidth 353 | +waste, a package relay node should probably not announce descendants of below-fee-filter
Can we remove "probably" here?
362 | + 363 | +==Implementation== 364 | + 365 | +Implementation for Bitcoin Core: https://github.com/glozow/bitcoin/tree/2023-01-package-relay 366 | + 367 | +Caveat: the crux of package relay is a mempool policy that works.
Caveat/crux/"that works" all feel a bit informal/imprecise. Perhaps:
Sample implementation for Bitcoin Core: https://github.com/glozow/bitcoin/tree/2023-01-package-relay. A prerequisite for implementing a safe package relay protocol is to have a mempool acceptance policy that is able to safely validate packages of transactions.
110 | +A transaction's '''descendants''' include, recursively, its children, the children of its children, 111 | +etc. A transaction's parent is its ancestor, but an ancestor is not necessarily a parent. 112 | + 113 | +A '''package''' is a list of transactions, representable by a connected Directed Acyclic 114 | +Graph (a directed edge exists between a transaction that spends the output of another transaction). 115 | +Typically, a package refers to unconfirmed transactions.
In this proposal, a package is limited to unconfirmed transactions.
What happens if my node sends a package including a confirmed ancestor transaction?
What happens if my node sends a package including a confirmed ancestor transaction?
The receiver should be able to deal with it. It is not a violation to do so, as you might just have different chainstates.
154 | +Upon encountering an orphan, a node may request ancestor package information delineating the wtxids 155 | +of the transaction's unconfirmed ancestors. This is done without using txid-based relay. The package 156 | +information can be used to request transaction data. As these transactions are dependent upon one 157 | +another to be valid, the transactions can be requested and sent as a batch. 158 | + 159 | +[[File:./bip-0331/protocol_flow.png|1200px]]
I think it's supposed to be getdata(MSG_ANCPKGINFO, ...) instead of getdata(ANCPKGINFO, ...)?
Also general comment: perhaps figure annotations would be helpful?
I omit the MSG because all the inv types have it
218 | +====sendpackages==== 219 | + 220 | +{| 221 | +| Field Name || Type || Size || Purpose 222 | +|- 223 | +|version || uint32_t || 4 || Denotes a package version supported by the node.
If version is a uint32_t, it seems to indicate that we want to leave room for a significant amount more package versions (and even if we don't, it's an infrequently sent message, so no biggie to have this be oversized). In that light, would it make sense to have instead have a versions array<uint32_t>, so in a future with lots of possible package versions we don't need to flood sendpackages messages, and making it an array now is not a lot of overhead?
I'd be surprised if we end up with a huge number of package versions so I'm probably exaggerating here, but I think even for a small number of versions it makes sense, since the expectation is that we'll have multiple versions?
It's a shame we can't version getdata messages, because then we could have a generic MSG_PKGINFO (and MSG_PKGTXNS INV type, and use the same version that we announce in sendpackages. That would avoid having to create a new INV type for every package version.
I think we can make things more consistent by having the version(s) field in sendpackages correspond to the INV type, though? I.e. instead of version=0 we'd use version=0x6 for ancestor packages?
I agree that having space for 4294967295 package versions is probably more than we'll ever need.
An alternative would be to treat this as a bit field, in the way that nServices flags are signalled in the version message. This BIP would define version (1 << 0) and future versions would be defined as (1 << n). When we receive a sendpackages message, we XOR the received versions with the received versions from previous sendpackages messages.
Yes I think a bitfield is best :+1: 32 package versions seems plenty, and this makes the implementation cleaner as well.
231 | +# Upon receipt of a "sendpackages" message with a version that is not supported, a node must treat the peer as if it never received the message. 232 | + 233 | +# The "sendpackages" message MUST be sent before sending a "verack" message. If a "sendpackages" message is received afer "verack", the sender may be disconnected. 234 | + 235 | +# A node that sends "sendpackages" MUST also send "wtxidrelay". Upon receipt of a "verack", if the sender has sent a "sendpackages" but not "wtxidrelay", the sender may be disconnected. 236 | +
This is inconsistent with Erlay behaviour. I think I prefer the "may be disconnected" approach, but just flagging that since ideally they're aligned?
Removed
334 | + 335 | +# This inv type should only be used if both peers agreed to send version 0 packages. If a "getdata" message with type MSG_ANCPKGINFO is received from a peer with which ancestor package relay was not negotiated, the sender may be disconnected. 336 | + 337 | +====MSG_PKGTXNS==== 338 | + 339 | +# A new inv type (MSG_PKGTXNS == 0x7) is added, for use only in "notfound" messages pertaining to package transactions.
supernit: I know it really doesn't matter, and yet it feels so much cleaner if the version-less MSG_PKGTXNS has the lower number (0x6) so all the package-specific invs can increase from there? (i.e. MSG_ANCPKGINFO (0x7))
Done, I like it better too :)
29 | +users participating in transaction relay reap many benefits from employing policies that result in a 30 | +mempool with the same contents, including faster compact block relay and more accurate fee 31 | +estimation. Additionally, users may take advantage of mempool and miner policy to bump the priority 32 | +of their transactions by attaching high-fee descendants (Child Pays for Parent or CPFP). 33 | + 34 | +Only considering transactions one at a time for submission to the mempool creates a limitation in
You could consider re-wording this to:
"Individually considering transactions for submission to the mempool creates a limitation..."
37 | +transaction's descendants when considering which of two conflicting transactions to keep (Replace by 38 | +Fee or RBF). 39 | + 40 | +When a user's transaction does not meet a mempool's minimum feerate and they cannot create a 41 | +replacement transaction directly, their transaction will simply be rejected by this mempool. They 42 | +also cannot attach a descendant to pay for replacing a conflicting transaction.
I think it's alluded to, but perhaps it's worth clarifying that they also currently cannot submit the high-fee descendant first, followed by the low-fee ancestor?
158 | + 159 | +[[File:./bip-0331/protocol_flow.png|1200px]] 160 | + 161 | +Too-low-feerate transactions (i.e. below the node's minimum mempool feerate) with high-feerate 162 | +descendants can also be relayed this way. If the peers are using BIP133 fee filters and a 163 | +low-feerate transaction is below the node's fee filer, the sender will not announce it. The
low-feerate transaction is below the node's fee filter, the sender will not announce it. The
183 | +to design the sender-initiated protocol carefully, and inform the design decisions using data 184 | +collected from the mainnet p2p network. However, there is no historical transaction data to use 185 | +because the goal is to enable currently-rejected transactions to propagate. In order to get this 186 | +right, I propose we hold off on sender-initiated for now, deploy receiver-initiated package relay, 187 | +observe its usage and figure out where we can save a round trip, and then introduce a 188 | +well-researched sender-initiated package relay protocol.</ref>
I thought that a decent middle-ground could be to apply something like the following:
This way you still retain some bandwidth minimisation, whilst also being somewhat proactive about broadcasting packages and reducing the # of roundtrips. Also it places the onus on the (original) sender to be broadcasting as a package if they want it relayed (as quickly as possible) as a package to the network.
But in thinking it through more, this could permit malicious nodes to start sending many large packages around unsolicited wasting the bandwidth of our P2P network...
Therefore I agree that reciever-initiated makes the most sense here.
203 | + 204 | +[[File:./bip-0331/version_negotiation.png|600px]] 205 | + 206 | +===Combined Hash=== 207 | + 208 | +A "combined hash" serves as a unique "package id" for some list of unique transactions.
What exactly is this combined hash for -- is it only used in the inv(MSG_PKGTXNS) message?
My understanding is that we are referring to packages by the child wtxid?
I think one use case to unambiguously determine if we've already validated (/rejected) a certain (exact) package as per this comment
Yes, the combined hash is only used in notfound(MSG_PKGTXNS).
We cannot use the child wtxid because the child wtxid is not included in the getpkgtxns request.
Thanks @jnewbery @stickies-v @willcl-ark for the feedback. I'm working on incorporating the suggestions, apologies for the delay.
Changed the version field in "sendpackages" to versions, now interpreted as a bitfield, and updated the negotiation procedure accordingly. (https://github.com/bitcoin/bips/pull/1382#discussion_r1173605835)
Swapped the inv values for MSG_ANCPKGINFO and MSG_PKGTXNS (https://github.com/bitcoin/bips/pull/1382#discussion_r1156267093)
Neither of these are implemented in the branch I had linked before, so I've removed that link. Will update when I've done that.
I've restructured the BIP to make the difference between "generic" (can be used in future versions of package relay) vs "ancestor info" (first defined version) parts more clear (https://github.com/bitcoin/bips/pull/1382#discussion_r1155782599). Also updated the diagrams and took wording suggestions.
Maybe it's obvious or I am missing something but I couldn't see the answer in the doc: How would a node decide between sending Inv(PKGINFO, C) or Inv(WTX, C) in the Generic Package Relay example graphic? Is it always sending PKGINFO if the peer wants packages and C is part of a package?
@fjahr "The inv type should not be used in announcements" it's a type used for getdata only IIUC
@fjahr "The inv type should not be used in announcements" it's a type used for
getdataonly IIUC
Hmm, ok, first insight: the inconsistencies in the message naming can be confusing. I guess it gets obvious when someone goes deeper but I did a pretty superficial pass and then tried to find an answer to this question and I guess I stumbled over that it's PKGINFO in the graphics but ANCPKGINFO in the text.
It seems that the name has changed before which makes researching past discussions on the mailing list harder. Maybe that could be noted somewhere (the artist formerly known as PKGINFO, also formerly know as MSG_PCKG1 etc.). Not sure it's worth the hassle but in this case it would have helped me.
In an old ML post I found:
''Q: Under what circumstances should a sender announce a
child-with-unconfirmed-parents package?''
A child-with-unconfirmed-parents package for a transaction should be
announced when it meets the peer's fee filter but one or more of its
parents don't; a "inv(MSG_PCKG1)" instead of "inv(WTX)" should be sent
for the child. Each of the parents which meet the peer's fee filter
should still be announced normally.
(from https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-May/020493.html)
But this is now changed it seems so maybe the graphics can be updated in this regard as well and the inv messages removed?
Yes, PKGINFO is not specified anywhere; think of it as an "abstract class." The purpose of that diagram is to show how this protocol might be extended or upgraded in the future, i.e. we can define a new type of PKGINFO message to be used in invs and/or getdatas, and continue using the same getpkgtxns and pkgtxns message for transaction download. I want to separate those 2 concepts: exchanging package information and exchanging transaction data.
For example, CHUNKPKGINFO could contain a chunk (with cluster-based mempool), perhaps specifying the chunk fee and size as well. Originally, I proposed a PCKGINFO1 type which was a child and its parents to communicate a fee-bumping relationship.
The reason there is an "inv(PKGINFO)" there is to communicate that, while inv(ANCPKGINFO) is not allowed, some other kind of inv(PKGINFO) can be allowed in the future for a sender-initiated protocol.
I will update the diagram to make this more clear, thanks for the feedback and sorry for the confusion!
Linking https://github.com/nostr-protocol/nips/pull/476, nostr-based package relay.
233 | +historical data is available, as one of the primary goals of this protocol is to enable 234 | +currently-rejected transactions to propagate. After deploying receiver-initiated package relay, we 235 | +can observe its usage and then introduce a sender-initiated package relay protocol informed by data 236 | +collected from the p2p network.</ref> 237 | + 238 | +===Combined Hash===
I think this could be moved into the definitions section.
Unsure... somebody would need to use this to implement MSG_PKGTXNS. That seems like it should be "specification" ?
Kept this as is for now
It would be good to have the motivation (ie "to provide a meaningful but short notfound response to getpkgtxns") mentioned before the definition
258 | +|- 259 | +|} 260 | + 261 | +# The "sendpackages" message has the structure defined above, with pchCommand == "sendpackages". 262 | + 263 | +# During version handshake, nodes should one "sendpackages" message indicating they support package relay, with the versions field indicating which versions of they support.
should one => should send one
of they => they
Thanks, fixed
122 | +ancestors. 123 | + 124 | +In a '''topologically sorted''' package, each parent appears somewhere in the list before its child. 125 | + 126 | +==Specification== 127 | +
Consider adding something here like "This BIP defines both Generic Package Relay, the unconditional part of the protocol that all nodes opt-in to when they indicate support for package relay, as well as Ancestor Package Relay, the first optional part of the protocol that nodes may opt-in to during the version handshake. The version handshake itself is considered part of Generic Package Relay."
Thanks, yeah, it's weird the way drew a line between generic/ancestor package relay!
Had an offline conversation with someone about the "generic" thing. I decided to get rid of that and instead add the information in an "Extensibility" section. At the end of the day, I'm just suggesting what a future package relay upgrade could look like and we are not locked into using the format I outlined.
178 | +supporting version n, it should be easy to announce support for version n-1, but not version n, 179 | +packages.</ref> 180 | + 181 | +[[File:./bip-0331/version_negotiation.png|400px]] 182 | + 183 | +===Ancestor Package Information===
Was it ever considered to name Ancestor Package Relay together with the sendpackages version number it uses? It's possible that there is a different way of exchanging ancestors in the future and then that might be confusing if we are not referring to it with v0 ancestor package relay and v1 ancestor package relay. It's also much shorter to reference ;)
Heh, I think in the future if we have 2, we could probably say PKG_RELAY_ANC (the name of the version bit) to refer to this one?
159 | + 160 | +This protocol can be extended to include more types of package information in the future, while 161 | +continuing to use the same messages for transaction data download. 162 | +[[File:./bip-0331/generic_package_relay.png|600px]] 163 | + 164 | +Package relay is negotiated between two peers during the version handshake using a "sendpackages"
Consider moving the version stuff at the beginning of the Generic Package Relay section or even giving it it's own section. It feels more natural and aligns with the chronology and the order of the messages in "New Messages".
:+1: I've gotten rid of Generic and hopefully the flow is now more natural since there's only 1 section in "New Messages"
280 | +|txns||List of wtxids||txns_length * 32|| The wtxids of each transaction in the package. 281 | +|} 282 | + 283 | +# The "getpkgtxns" message has the structure defined above, with pchCommand == "getpkgtxns". 284 | + 285 | +# A "getpkgtxns" message should be used to request some list of transactions specified by witness transaction id. It indicates that the node wants to receiver either all the specified transactions or none of them. This messages is intended to allow nodes to avoid downloading and storing transactions that cannot be validated without each other. The list of transactions does not need to correspond to a previously-received pkginfo message.
receiver => receive
This messages => This message
260 | + 261 | +# The "sendpackages" message has the structure defined above, with pchCommand == "sendpackages". 262 | + 263 | +# During version handshake, nodes should one "sendpackages" message indicating they support package relay, with the versions field indicating which versions of they support. 264 | + 265 | +# The "sendpackages" message MUST be sent before sending a "verack" message. If a "sendpackages" message is received afer "verack", the sender may be disconnected.
afer => after
282 | + 283 | +# The "getpkgtxns" message has the structure defined above, with pchCommand == "getpkgtxns". 284 | + 285 | +# A "getpkgtxns" message should be used to request some list of transactions specified by witness transaction id. It indicates that the node wants to receiver either all the specified transactions or none of them. This messages is intended to allow nodes to avoid downloading and storing transactions that cannot be validated without each other. The list of transactions does not need to correspond to a previously-received pkginfo message. 286 | + 287 | +# Upon receipt of a "getpkgtxns" message, a node should respond with either a "pkgtxns" containing all of the requested transactions in the same order specified in the "getpkgtxns" request or one "notfound" message of type MSG_PKGTXNS and combined hash indicating one or more of the transactions is unavailable.<ref>'''Why isn't package relay negotiation required for getpkgtxns?'''
notfound message is sent as a response, there is never also a pgtxns sent, even if some of the txs are available, right?Yep! tried to clarify this
303 | + 304 | +# A "pkgtxns" message should contain the transaction data requested using "getpkgtxns". 305 | + 306 | +# Upon receipt of a "pkgtxns" message, the node should validate the transactions together as a package rather than individually. 307 | + 308 | +# A "pkgtxns" message should only be sent to a peer that requested the package using "getpkgtxns". If a node receives an unsolicited package, the sender may be disconnected.<ref>'''Why isn't package relay negotiation required for pkgtxns?'''
What if a node sends more transactions than were requested? Might that also be a reason to disconnect?
Yes, reason to disconnect. I think that is the same thing as an unsolicited package.
180 | + 181 | +[[File:./bip-0331/version_negotiation.png|400px]] 182 | + 183 | +===Ancestor Package Information=== 184 | + 185 | +Ancestor Package Relay allows nodes to send and request a transaction's list of unconfirmed
"Ancestor Package Relay" and "Ancestor Package Information" is used interchangeably in the proposal. The reason is probably that this only contains an *info message. Nonetheless, I think it would be better if this could be made consistent. I don't have a preference since both are correct, it may even be "Ancestor Package Information Relay"
Changed all the proper noun instances to Ancestor Package Relay
262 | + 263 | +# During version handshake, nodes should one "sendpackages" message indicating they support package relay, with the versions field indicating which versions of they support. 264 | + 265 | +# The "sendpackages" message MUST be sent before sending a "verack" message. If a "sendpackages" message is received afer "verack", the sender may be disconnected. 266 | + 267 | +# Upon successful connection ("verack" sent by both peers), a node may relay packages with the peer if they did not set "fRelay" to false in the "version" message, both peers sent "wtxidrelay", and both peers sent "sendpackages" for matching version(s). Peers should relay packages corresponding to versions that both sent "sendpackages" for.<ref>'''Is it ok to send "sendpackages" to a peer that specified fRelay=false in their "version" message?'''
Should versions == 0 be allowed?
# Upon successful connection ("verack" sent by both peers), a node may relay packages with the peer if they did not set "fRelay" to false in the "version" message, both peers sent "wtxidrelay", and both peers sent "sendpackages" for matching version bit(s)). Peers should relay packages corresponding to versions that both sent "sendpackages" for.<ref>'''Is it ok to send "sendpackages" to a peer that specified fRelay=false in their "version" message?'''
Yes, though ignored. I've added a note about this now
Thanks, taken
184 | + 185 | +Ancestor Package Relay allows nodes to send and request a transaction's list of unconfirmed 186 | +ancestors. Nodes indicate support for Ancestor Package Relay by adding the 187 | +<code>PKG_RELAY_ANCPKG = (1 << 0)</code> bit in their "sendpackages" messages during version handshake. 188 | + 189 | +===Intended Protocol Flow===
While this section does talk about protocol flow it does so in the context of specific examples. I would probably call this section "Practical application of the Protocol with examples" or something like that, and move it out of and right after the specification section. I.e. have it between Specification and Compatibility. But I don't feel too strongly about this.
I've changed this to "Protocol Flow Examples" but kept them in the same place. I'm thinking they might be helpful earlier on as a quick way to learn how the dialogue looks. Lmk if you disagree though!
In my opinion, the proposed protocol is sound and the BIP provides almost all the necessary information. I am asking for a little more clarification in some places but most of my suggestions are intended to make to document easier to parse for first-time readers. While the information is there it is often ordered differently than I would expect. Maybe others' brains work differently but for me, this would have made a significant difference when I started studying this.
bip-0331/package_cpfp_flow.png <--- looks like receiver is asking for A even though they already have it
bip-0331/package_cpfp_flow.png <--- looks like receiver is asking for A even though they already have it
Thanks for catching!!! Fixed
ACK, thanks for the updates, I find it an easier read now and didn't see any other issues in my latest pass.
One more question: this protocol seems pretty convenient for fingerprinting nodes by giving specific ancestors to different connections and then observing which tx the connections request to complete the package. I don't think this is a huge problem since there are several other options that allow fingerprinting. But I am curious if this has been discussed somewhere in the past or if there are already ideas for mitigations in the implementation (which I haven't fully reviewed yet).
@fjahr can you think of a fundamentally new vectors vs the back and forth orphan fetching? Package fetching seems to be more of a optimization on that. Higher resolution fingerprinting, if it's more reliable I guess :)
@fjahr can you think of a fundamentally new vectors vs the back and forth orphan fetching? Package fetching seems to be more of a optimization on that. Higher resolution fingerprinting, if it's more reliable I guess :)
I don't think it's fundamentally new, that's why I don't think it's a big issue. But I do think it makes the attack more convenient to scale.
What I had in mind: 1. Make an unconfirmed tx chain A through G. 2. Give different connections different combinations of the txs B through F. This should be 5! = 120 unique fingerprints. 3. Send G and respond to the getdata(ANCPKGINFO, G) to see which tx of B through F the connection is aware of.
@fjahr fingerprinting is a good consideration but neither new nor made easier. An even more convenient approach: make invalid transactions (as many as you want for free), each with different wtxids, and send one to each node. Nodes will cache their unique rejections and not send a getdata on announcement.
El El lun, 5 jun 2023 a las 21:47, Fabian Jahr @.***> escribió:
@fjahr https://github.com/fjahr can you think of a fundamentally new vectors vs the back and forth orphan fetching? Package fetching seems to be more of a optimization on that. Higher resolution fingerprinting, if it's more reliable I guess :)
I don't think it's fundamentally new, that's why I don't think it's a big issue. But I do think it makes the attack more convenient to scale.
What I had in mind: 1. Make an unconfirmed tx chain A through G. 2. Give different connections different combinations of the txs B through F. This should be 5! = 120 unique fingerprints. 3. Send G and respond to the getdata(ANCPKGINFO, G) to see which tx of B through F the connection is aware of.
— Reply to this email directly, view it on GitHub https://github.com/bitcoin/bips/pull/1382#issuecomment-1577450727, or unsubscribe https://github.com/notifications/unsubscribe-auth/AGAEGGKBQPRGH6NQJYKJNG3XJZAVLANCNFSM6AAAAAAREVY6QA . You are receiving this because you authored the thread.Message ID: @.***>
45 | +This limitation harms users' ability to fee-bump their transactions. Further, it presents a security 46 | +issue in contracting protocols which rely on presigned, time-sensitive transactions<ref>'''Examples of time-sensitive pre-signed transactions in L2 protocols.''' 47 | +* [https://github.com/lightning/bolts/blob/master/03-transactions.md#htlc-timeout-and-htlc-success-transactions HTCL-Timeout in LN Penalty] 48 | +* [https://github.com/revault/practical-revault/blob/master/transactions.md#cancel_tx Unvault Cancel in Revault] 49 | +* [https://github.com/discreetlogcontracts/dlcspecs/blob/master/Transactions.md#refund-transaction Refund Transaction in Discreet Log Contracts] 50 | +* [https://gist.github.com/instagibbs/60264606e181451e977e439a49f69fe1 Updates in Eltoo]
There is one more L2 protocol with time-sensitive pre-signed transactions deployed today in production (at least by Eclair and CLN) that benefits from package relay: https://github.com/ElementsProject/peerswap/blob/master/docs/peer-protocol.md#peer-protocol-for-peerswap-swaps
The maker can pin the off-chain commitment transaction until the claim_by_csv path becomes final with the CSV expiration, and therefore spend both the incoming HTLC output and the opening transaction output.
Added
137 | +that are eventually rejected, and minimize storage allocated for not-yet-validated transactions. 138 | + 139 | +<br /> 140 | + 141 | +'''No Package Information Round:''' One proposal is to just use the child's wtxid to refer to the 142 | +package and always send the entire package together, skipping the package information round.
And I think such design would suffer from transaction censorship vector if you have a parent with multiple childs of unequal feerate ?
Let’s say you have LN commitment transaction with anchor output support. Both anchors are present as both counterparties have balances. If your counterparty has broadcast its commitment with its own child, you can attach your to_remote child downgrading the feerate of the whole package.
Of course, from the sender-side, you could re-announce the package for each known child, though it sounds a bandwidth waste as you should do ancestor feerate to propagate the parent first.
I need to look on the PR to see how package are assembled on the sender-side.
Note that this comment is on an alternative design considered but not proposed.
And I think such design would suffer from transaction censorship vector if you have a parent with multiple childs of unequal feerate ?
you can attach your to_remote child downgrading the feerate of the whole package.
Sure, an additional child may exist. But it is not included in the package because it's not part of the actual cpfp child's ancestor set. Its low fees should not be considered in combination with transactions that don't rely on it.
As for potential censorship, that is solved by caching "reconsiderable" rejections separately (see m_recent_rejects_reconsiderable in PR). You can relay a separate package that is the ancestor set of the additional child, and its rejection should not affect whether another overlapping ancestor package is considered/accepted.
As for potential censorship, that is solved by caching "reconsiderable" rejections separately (see m_recent_rejects_reconsiderable in PR). You can relay a separate package that is the ancestor set of the additional child, and its rejection should not affect whether another overlapping ancestor package is considered/accepted.
Yes sounds correct to process each cpfp’s ancestor set separately.
160 | +multiple versions of packages. Package relay requires both peers to support wtxid-based relay 161 | +because package transactions are referenced by their wtxids. 162 | +<ref>'''Why do we need multiple versions? Why can't we just support arbitrary packages?''' 163 | +Attempting to support arbitrary packages in mempool validation may result in very complex logic, new 164 | +Denial of Service attack vectors, and policy limitations that could be leveraged to censor 165 | +transactions (aka "pinning attacks"). This protocol is extensible to support up to 32 types of
It sounds that type of cheap bandwidth saving that might reveal as a downside a decade from now ? And it’s one time message per-link.
Let’s hypothetize you will have half a dozen of time-sensitive second-layers deployed few years from now (lightning, vaults, statechains, submarine swaps, time-locked wallets, discreet log contracts). Among those second-layers, they might need different variants of packages for each of their protocol versions (e.g for Lightning taproot channels, eltoo channels, demux channels for jamming) and for each protocol phase (dual-funding, splicing).
And this becomes worst if you have to combine package size characteristics (e.g N-parent for 1-child) with other future package information like feerare or relationships between transactions.
Using an extensible feature flags (e.g TLV) like Lightning is doing for most of its peer messages has been discussed so far in the context of BIP 331 ?
It sounds that type of cheap bandwidth saving that might reveal as a downside a decade from now ? And it’s one time message per-link.
Let’s hypothetize you will have half a dozen of time-sensitive second-layers deployed few years from now (lightning, vaults, statechains, submarine swaps, time-locked wallets, discreet log contracts). Among those second-layers, they might need different variants of packages for each of their protocol versions (e.g for Lightning taproot channels, eltoo channels, demux channels for jamming) and for each protocol phase (dual-funding, splicing).
Er, what exactly is the "cheap bandwidth saving?" Are you suggesting to increase the field size?
I've removed the mention of 32 versions to leave room for other ways to extend. For now, I don't think we should over-engineer the versions field for a hypothetical super-complex future that may or may not need dozens of types of packages. I think 32 is plenty, and the worst case scenario is defining a new message and sending another few bytes once per connection.
Er, what exactly is the "cheap bandwidth saving?" Are you suggesting to increase the field size?
Yes increasing the filed size to a u64 or u128 to have more bits available for mechanism.
I think 32 is plenty, and the worst case scenario is defining a new message and sending another few bytes once per connection.
Sure we can always define a new message, or an extension of it at the price of more p2p code surface and complexity (and note all the bikeshedding with fRelay). Sounds package relay will get more complex not less, though happy to take the bet and wait when this hypothetical super-complex future does happen :)
I've changed this to u64.
happy to take the bet and wait when this hypothetical super-complex future does happen :)
haha :handshake: If this super-complex future happens and we run out of package types, I'll send you sats via each protocol
215 | +<ref>'''Why no sender-initiated protocol?''' Sender-initiated package 216 | +relay can, theoretically, save a round trip by notifying the receiver ahead of time that they will 217 | +probably need to request and validate a group of transactions together in order for them to be 218 | +accepted. As with any proactive communication, there is a chance that the receiver already knows 219 | +this information, so this network bandwidth may be wasted. Shortened latency is less significant 220 | +than wasted bandwidth.
I think this assumption is dependent on the bumping patterns of package broadcast. In the sense that if the package is re-broadcast as it is (and there is no mempool fluctuations), there is a complete waste of bandwidth as both parent and child are known.
If the package is rebroadcast with the CPFP bumped (probably the future usage for LN where the parent commitment transaction stays an invariant), there is only a need to re-announce the child to the peer nodes.
And note bandwidth assumptions we’re making on our side can be falsified in changes in the re-broadcast logic of LN implementations, like it has been done recently.
I think a realistic simulation model would encompass at least few dimensions
More dimensions to think about ?
I think this is where practical experience is going to help. Receiver initiated imo helps out in the sense of it fits somewhat more neatly in the current orphan/feefilter paradigm?
Yeah so sender-initiated is not being proposed here. The general philosophy is that since everything here is receiver-initiated in certain situations, and we are careful to not waste bandwidth in honest cases in the implementation, hopefully we are not blowing up bandwidth.
If we can do even better by doing sender-initiated in some cases in the future, that'd be nice. I don't think any simulation can be realistic, as we have 0 data about package relay usage until we actually deploy it and people use it to relay transactions that they couldn't before. But these are very helpful considerations for measuring things, thanks will keep in mind.
Receiver initiated imo helps out in the sense of it fits somewhat more neatly in the current orphan/feefilter paradigm?
Correct the update frequency sounds another dimension to think about to evaluate bandwidth consumption.
Yeah so sender-initiated is not being proposed here. The general philosophy is that since everything here is receiver-initiated in certain situations, and we are careful to not waste bandwidth in honest cases in the implementation, hopefully we are not blowing up bandwidth.
Note here I’m not thinking to evaluate future hypothetical sender-initiated though the performance of this current receiver-initiated version, especially the burden than rebroadcast frequency of Lightning node might throw on the base layer. Yeah not a blocker right now to deploy, I might start to do simulation by myself once I’m good reviewing the code, really interested to have a clearer picture, and tweak back in consequence Lightning backend.
Review at “Combined Hash” excluded.
384 | +continuing to use the same messages for transaction data download. One would define a new package 385 | +information message, allocate its corresponding inv type, and its bit in the versions field of 386 | +"sendpackages". A future version of package relay may allow a sender-initiated dialogue if it is 387 | +specified that the "*PKGINFO" inv type can be used in an "inv" message. 388 | +<br /> 389 | +[[File:./bip-0331/generic_package_relay.png|600px]]
Actually not totally clear what this image is showing. why is there a notfound followed by packagetxns?
Changed it and added some more descriptions. I had wanted to show multiple possible dialogues but it just looks confusing.
15 | +Peer-to-peer protocol messages enabling nodes to request and relay the unconfirmed ancestor package 16 | +of a given transaction, and to request and relay transactions in batches. 17 | + 18 | +==Motivation== 19 | + 20 | +(1) Help incentive-compatible transactions propagate.
Use a subheading? ===Help incen...=== ?
"incentive-compatible" seems a bit euphemistic -- wouldn't it be better just to say "help ensure the transactions paying the most fees propogate" ?
81 | +combinations of transactions rejected from their mempools), but this practice would likely be 82 | +inefficient at best and open new Denial of Service attacks at worst. As such, this proposal 83 | +suggests adding new p2p messages enabling nodes to request and share package-validation-related 84 | +information with one another, resulting in a more efficient and reliable way to propagate packages. 85 | + 86 | +(2) Eliminate the need for txid-based transaction relay and make orphan handling more robust.
You still "need" txid-based relay for backwards compatibility.
Arguably, txid-based relay would be more efficient than the proposed protocol: if you see an orphan Z, whose parent Y does not spend any in-mempool txs, then this protocol would make three round trips:
but you could alternatively do two round trips:
That approach would not reduce round-trips when receiving orphans that do have an in-mempool grandparent that the receiver does not already know about (but it wouldn't increase the number of roundtrips either), and would also require storing both the child and parents while waiting for any grandparents.
I suppose it's true that this protocol is not going to have fewer round trips in every possible case of orphans, but we wouldn't have a way of detecting those cases on the receiver end.
123 | + 124 | +In a '''topologically sorted''' package, each parent appears somewhere in the list before its child. 125 | + 126 | +==Specification== 127 | + 128 | +Ancestor Package Relay includes two parts: package information and transaction data download.
"package information round and package transaction download round" might read better -- as it is my first read associated "download" with both the "information" and "data". (Being synonyms, "information" and "data" aren't great ways of describing two different things, either...)
126 | +==Specification== 127 | + 128 | +Ancestor Package Relay includes two parts: package information and transaction data download. 129 | +A package information round is used to help a receiver learn what transactions are in a package and 130 | +whether they want to download them. The transaction data round is used to help a node download 131 | +multiple transactions in one message instead of as separate messages.
The information round doesn't really help you learn "whether you want to download the transactions" -- you know that already, either because you already have the txs, or you've remembered that you've downloaded them previously and rejected them.
In future package types where you might get additional information, e.g., package feerates or similar this may make more sense to say.
I don't think it's that simple - we have cases of
In future package types where you might get additional information, e.g., package feerates or similar this may make more sense to say.
Okay, "the information round described here doesn't really help you learn..."
I haven't seen this tx before but there's another tx in the package that I know is consensus-invalid so I'm not going to download any of them
In that case, (on the implementation side) would we need different reject caches, one for "consensus invalid", one for "policy invalid"? Not sure that would be worth the hassle (honest peers sending consensus invalid stuff should be extremely rare; attackers could just send novel txs each time avoiding the cache)
In that case, (on the implementation side) would we need different reject caches, one for "consensus invalid", one for "policy invalid"?
Yes. Two rejection caches is necessary, see https://github.com/glozow/bitcoin/pull/8#issuecomment-1453791776. But not policy vs consensus, just ones that are eligible for reconsideration in a package (i.e. too low feerate or missing ephemeral anchor spend) vs ones that are not. I said "consensus invalid" to provide a clear example, but I just mean "something we don't need to validate again, even if it's in a package." If something was larger than max standard size or has too many ancestors, trying it in a package won't make it valid.
I guess as long as they're global caches and not per-peer it doesn't matter much anyway.
I'm going to resolve this for now, but let me know if you still feel that I should change this. Right now the text is "The package information round is used to help a receiver learn what transactions are in a package and decide whether they want to download them."
156 | +</ref> 157 | + 158 | +Package relay is negotiated between two peers during the version handshake using a "sendpackages" 159 | +message. The versions field within "sendpackages" is interpreted as a bitfield; peers may relay 160 | +multiple versions of packages. Package relay requires both peers to support wtxid-based relay 161 | +because package transactions are referenced by their wtxids.
You could break ancpkginfo ("ancestor package information") and pkgtxns ("package tx data relay") into two separate features, with the idea that maybe someday we'll have something better than ancestor package relay and want to deprecate the ancpkginfo part, while continuing to have the pkgtxns part.
If so, you could consider sending sendpackages 3 instead of sendpackages 1 -- with bit 0 indicating support of the pkgtxns part, and bit 1 indicating support of the ancpkginfo part.
Ah good idea.
167 | +different types of packages and/or contain more information than a list of wtxids, e.g. feerate or 168 | +relationships between transactions.</ref> 169 | +<ref>'''Why send one message per version instead of sending a single message for the highest 170 | +supported version?''' It should also be possible to support some subset of the existing package 171 | +types. For example, if a node's mempool policy doesn't support or a node implementation stops 172 | +supporting version n, it should be easy to announce support for version n-1, but not version n,
"doesn't support" seems sufficient and clearer?
303 | + 304 | +====MSG_ANCPKGINFO==== 305 | + 306 | +# A new inv type (MSG_ANCPKGINFO == 0x7) is added, for use only in getdata requests pertaining to ancestor packages. 307 | + 308 | +# As a getdata request type, it indicates that the sender wants a "ancpkginfo" containing all of the unconfirmed ancestors of a transaction, referenced by wtxid.
an "ancpkginfo"
305 | + 306 | +# A new inv type (MSG_ANCPKGINFO == 0x7) is added, for use only in getdata requests pertaining to ancestor packages. 307 | + 308 | +# As a getdata request type, it indicates that the sender wants a "ancpkginfo" containing all of the unconfirmed ancestors of a transaction, referenced by wtxid. 309 | + 310 | +# Upon receipt of a "getdata(MSG_ANCPKGINFO)" request, the node should respond with an "ancpkginfo" message corresponding to the transaction's unconfirmed ancestor package, or with "notfound". The wtxid of the requested transaction should be the last item in the list.
"The ancestor package should be topologically sorted, and therefore the requested transaction should be the last item in the list." ?
Alternatively, if we're using the last item in the list to determine which request this was in response to, perhaps change "should" to "must" ?
Maybe be explicit about whether a node that receive a ancpkginfo set that's not topologically sorted (which they determine either because the requested tx isn't last, or they figure out after downloading the tx contents) should potentially disconnect in response? (I assume they shouldn't)
Maybe note that topological sort can be achieved by sorting by mempool acceptance order, if you always accept parents prior to accepting any of their children?
Updated to clarify that the last transaction must be the requested one, the rest can be in any order (nodes should not disconnect or punish), but sender "should" sort topologically for the receiver's convenience, and that can be achieved through sorting by acceptance order.
307 | + 308 | +# As a getdata request type, it indicates that the sender wants a "ancpkginfo" containing all of the unconfirmed ancestors of a transaction, referenced by wtxid. 309 | + 310 | +# Upon receipt of a "getdata(MSG_ANCPKGINFO)" request, the node should respond with an "ancpkginfo" message corresponding to the transaction's unconfirmed ancestor package, or with "notfound". The wtxid of the requested transaction should be the last item in the list. 311 | + 312 | +# The inv type should not be used in announcements, i.e. "inv(MSG_ANCPKGINFO)" must never be sent. If a "inv(MSG_ANCPKGINFO)" is received, the sender may be disconnected.
an "inv()"
Maybe better to phrase it as "the MSG_ANCPKGINFO inv type must only be used in getdata message; it must not be used with the inv message" or similar?
146 | + 147 | +[[File:./bip-0331/no_package_info.png|600px]] 148 | +<br /> 149 | + 150 | +'''Package Information Only:''' Just having package information gives enough information for the 151 | +receiver to accept the packages. Omit the "getpkgtxns" and "pkgtxns" messages and just download the
change "Omit the" to "that is, rather than using getpkgtxns and pkgtxns, send getdata and receive each tx individually" ?
346 | + 347 | +# A "pkgtxns" message should contain the transaction data requested using "getpkgtxns". 348 | + 349 | +# Upon receipt of a "pkgtxns" message, the node should validate the transactions together as a package rather than individually. 350 | + 351 | +# A "pkgtxns" message should only be sent to a peer that requested the package using "getpkgtxns". If a node receives an unsolicited package, the sender may be disconnected.<ref>'''Why isn't package relay negotiation required for pkgtxns?'''
Maybe also mention that a node might simply ignore an unsolicited package, or may attempt to validate it, or may accept a subset of transactions from the package that it can validate, and may disconnect only if validation fails, or only if validation fails due the package being consensus invalid?
I assume what core would do is only disconnect if one or more of the txs fails validation due to TX_CONSENSUS.
leaving it undefined seems ok, since it's just a bucket of "mays" at this point?
The way I would read the above is that most of those "mays" would be violating the previous paragraph: "the node should validate the transactions together as a package rather than individually".
Do I understand correctly that you're saying we might want to validate an unsolicited package and not disconnect the sender? Why would we do this? It doesn't seem like a good idea to spend resources validating transactions if we didn't ask for them.
"the node should validate the transactions together as a package rather than individually".
hm good point. I'm not actually sure what this phrase means? Even the implementation for Core tries transactions individually and in subpackages.
Why would we do this? It doesn't seem like a good idea to spend resources validating transactions if we didn't ask for them.
It's kind of an implementation detail, in that it doesn't effect the underlying protocol(?)
Ah good point, I suppose I can just delete this bullet. Doesn't really matter for protocol what the node does with the transactions.
325 | + 326 | +# The "getpkgtxns" message has the structure defined above, with pchCommand == "getpkgtxns". 327 | + 328 | +# A "getpkgtxns" message should be used to request some list of transactions specified by witness transaction id. It indicates that the node wants to receive either all the specified transactions or none of them. This message is intended to allow nodes to avoid downloading and storing transactions that cannot be validated without each other. The list of transactions does not need to correspond to a previously-received pkginfo message. 329 | + 330 | +# Upon receipt of a "getpkgtxns" message, a node should respond with either a "pkgtxns" containing all of the requested transactions in the same order specified in the "getpkgtxns" request or one "notfound" message of type MSG_PKGTXNS and combined hash of all of the wtxids in the "getpkgtxns" request (only one "notfound" message and nothing else), indicating one or more of the transactions is unavailable.<ref>'''Why isn't package relay negotiation required for getpkgtxns?'''
What if you received a getpkgtxns for wtxids that weren't in an ancpkginfo that you set? Should you always accept and respond if you have the txs and they all fit into a single message? Can you say "I'll never accept a tx with more than 20 ancestors in my mempool, so I'll never send more than 21 wtxids in an ancpkginfo, so I'll never send more than 21 txs in response to a getpkgtxns?
What if you received a getpkgtxns for wtxids that weren't in an ancpkginfo that you set?
That's fine imo - we'll also respond to getdata(tx) for transactions we didn't announce under certain conditions.
Can you say "I'll never accept a tx with more than 20 ancestors in my mempool, so I'll never send more than 21 wtxids in an ancpkginfo, so I'll never send more than 21 txs in response to a getpkgtxns?
I think it's better for ancpkginfo and getpkgtxns logic to be independent, as the idea is to be able to reuse getpkgtxns for other package info types in the future. One might imagine a future where we have cluster package info (or something), and even though ancestor limit is 21, there is a cluster with 22 transactions.
I agree in principle, but I think there probably should be some explicit limit for what you can put in getpkgtxns. With 65 byte txs being standard, you could fit ~60k txs in a pkgtxns response; is there anything beyond "whoops, your response would exceed MAX_PROTOCOL_MESSAGE_LENGTH so you get a notfound instead" that we could use to trim nonsense requests early? There's a big difference between 21 and 60k!
Ah good point. Perhaps we limit to, say, 50? 100? (Arbitrarily picked, 25 just seems a little bit too limiting)
If we pick an explicit limit, and want to re-use this message for later info fetching types, we should pick something "stupidly" high to not interfere with known ancestor package/cluster sizes being considered
e.g., 500?
Marking as resolved based on #1382 (review)
372 | +non-package relay peers. 373 | +</ref> 374 | +<ref>'''Is Package Erlay possible?''' 375 | +A client using BIP330 reconciliation-based transaction relay (Erlay) is able to use package relay 376 | +without interference. After reconciliation, any transactions with unconfirmed ancestors may be 377 | +relayed using ancestor package relay.
We already have the transaction at that point, so maybe better to phrase it as "any transaction with unconfirmed ancestors may have those ancestors resolved using ancestor package relay" ?
344 | + 345 | +# The "pkgtxns" message has the structure defined above, with pchCommand == "pkgtxns". 346 | + 347 | +# A "pkgtxns" message should contain the transaction data requested using "getpkgtxns". 348 | + 349 | +# Upon receipt of a "pkgtxns" message, the node should validate the transactions together as a package rather than individually.
As per discussion, I'm not sure what this sentence actually means. It's up to the node implementation to decide how to handle these, including trying individual validation first, then subpackage subsets...
Deleted
Updated for suggestions from @ajtowns, thanks!
Major changes (still need to update implementation to reflect these):
PKG_RELAY_PKGTXNS = 1 << 0 and PKG_RELAY_ANC is now 1 << 1 (signaling both is versions=3). (https://github.com/bitcoin/bips/pull/1382#discussion_r1225156027)getpkgtxns request. (https://github.com/bitcoin/bips/pull/1382#discussion_r1228918874)329 | + 330 | +# A "getpkgtxns" message should be used to request some list of transactions specified by witness transaction id. It indicates that the node wants to receive either all the specified transactions or none of them. This message is intended to allow nodes to avoid downloading and storing transactions that cannot be validated without each other. The list of transactions does not need to correspond to a previously-received pkginfo message. 331 | + 332 | +# Upon receipt of a "getpkgtxns" message, a node should respond with either a "pkgtxns" containing all of the requested transactions in the same order specified in the "getpkgtxns" request or one "notfound" message of type MSG_PKGTXNS and combined hash of all of the wtxids in the "getpkgtxns" request (only one "notfound" message and nothing else), indicating one or more of the transactions is unavailable. 333 | + 334 | +# A "getpkgtxns" message must contain at most 100 wtxids. Upon receipt of a "getpkgtxns" message with more than 100 wtxids, a node may ignore the message (to avoid calculating the combined hash) and disconnect the sender.