wtxidrelay
and addrv2
/sendaddrv2
, and add fSuccessfullyConnected
doxygen documentation to clarify that it is set to true on VERACK.
wtxidrelay
and addrv2
/sendaddrv2
, and add fSuccessfullyConnected
doxygen documentation to clarify that it is set to true on VERACK.
If we’re going to change the software, just making it ignore the message if received post-VERACK (for compatibility with earlier drafts of the BIP and hence rc2) seems simpler than adding special cases.
Deliberately disconnecting nodes for behaving in ways we know is not harmful seems very backwards to me.
Tend toward NACK. There are 20.99 nodes on signet without addrv2 support and will keep a connection just fine. If your addrman is populated with those (or with rc3/final/master nodes), you won’t have any issues either. The issue #20637 is purely about a vanilla start with empty addrman and the only seed being the node that is run by @ajtowns or @kallewoof (I don’t remember). The solution is to update that node, not to introduce more branches in net processing.
Even if people want to go forward with patching the source code, the current patch is too broad. The only thing changes should be the disconnection, not the way how addrv2 messages are recognized.
the current patch is too broad. The only thing changes should be the disconnection, not the way how addrv2 messages are recognized
This does only change the disconnection?
This does only change the disconnection?
It will run pfrom.m_wants_addrv2 = true;
on signet, which will not be run on mainnet. The whole goal of signet was to be exactly like mainet, except for the pow.
Well, adding a test for the signet chain pretty much guarantees different behaviour on signet than mainnet… (And should probably be if (m_chainparams.GetConsensus().signet_blocks)
anyway, if it were desirable)
I think just being deliberately compatible with (or at least tolerant of) implementations of earlier versions of the BIP would make much more sense, so either:
0if (pfrom->m_wants_addrv2) {
1 LogPrintf(BCLog::NET, "peer=%d sent SENDADDRV2 twice; disconnecting\n", pfrom->GetId());
2 pfrom->fDisconnect = 1;
3 return;
4}
5pfrom->m_wants_addrv2 = true;
6return;
or just:
0if (pfrom.fSuccessfullyConnected) return; // ignore sendaddrv2 received too late
1pfrom->m_wants_addrv2 = true;
2return;
(or both, even).
Deliberately disconnecting nodes for behaving in ways we know is not harmful seems very backwards to me.
Tell that to your past self. The disconnection logic has been suggested by you for wtxid, and as such carried over to addv2 ;)
https://github.com/bitcoin/bitcoin/pull/18044/files#r446755938
Deliberately disconnecting nodes for behaving in ways we know is not harmful seems very backwards to me.
Tell that to your past self. The disconnection logic has been suggested by you for wtxid, and as such carried over to addv2 ;)
https://github.com/bitcoin/bitcoin/pull/18044/files#r446755938
If there had already been nodes on the network sending wtxid relay mesages after verack, disconnecting them would have also been a bad idea.
should probably be
if (m_chainparams.GetConsensus().signet_blocks)
Thanks! I was telling myself there was surely a better way of checking for it.
2530+ // Ignore peers that send a SENDADDRV2 message after VERACK.
2531+ return;
2532+ }
2533+ if (pfrom.m_wants_addrv2) {
2534+ // Disconnect peers that send SENDADDRV2 twice.
2535+ LogPrint(BCLog::NET, "peer=%d sent SENDADDRV2 twice; disconnecting\n", pfrom.GetId());
~0 on this, I’d prefer to keep it as is if remotely possible, and not change mainnet logic for a temporary signet issue at the least.
Deliberately disconnecting nodes for behaving in ways we know is not harmful seems very backwards to me.
It’s not harmful but also, it’s unclear what to do in that case? Should we ignore the message? Should be back-patch addrv2 support into the existing connection?
The reason for adding the disconnection is that at verack
time the protocol negotiation is finalized. This might allow for some kinds of optimizations, as well as knowing for sure whether a peer supports addrv1 or not at a clear point in time. This is useful if in the future there are plans to phase out addrv1.
Dropped commits 764042ecf74b778bd999961ba44d9648ae5bf4cb and f26e5ebc60ea2f41b959bc8b136719e250fe8761 and updated 5866a3ef2948945b15b936b572cae1a7321849a9 based on the review feedback (thanks!)
Will drop 5866a3ef2948945b15b936b572cae1a7321849a9 if #20648 is preferred. I tested #20648 and it resolves the issue as well.
Sorry for the temporary PR title, reinstated the original title.
2556@@ -2557,8 +2557,10 @@ void PeerManager::ProcessMessage(CNode& pfrom, const std::string& msg_type, CDat
2557 // BIP155 defines feature negotiation of addrv2 and sendaddrv2, which must happen
2558 // between VERSION and VERACK.
2559 if (msg_type == NetMsgType::SENDADDRV2) {
2560- if (pfrom.fSuccessfullyConnected) {
2561+ if (pfrom.fSuccessfullyConnected && !m_chainparams.GetConsensus().signet_blocks) {
2562 // Disconnect peers that send a SENDADDRV2 message after VERACK.
2563+ // A temporary exception is made for bootstrapping signet support in v0.21.0.
2564+ // After 0.21.0, this can be reverted.
2565 pfrom.fDisconnect = true;
Behaviour doesn’t match description – per the commit title, this should be:
0 if (pfrom.fSuccessfullyConnected) {
1 if (!m_chainparams.GetConsensus().signet_blocks) {
2 pfrom.fDisconnect = true;
3 }
4 return;
5 }
If you want the behaviour that’s implemented, should be described as “support sendaddrv2 after verack on signet only for compatibility with earlier bip drafts” or similar.
~0 on this, I’d prefer to keep it as is if remotely possible, and not change mainnet logic for a temporary signet issue at the least.
It’s not a signet issue; it’s a “nodes running #19954 (oct 11) can’t communicate with nodes running #20564 (dec 9)” issue. That happens to affect signet seriously, since signet was only merged a couple of weeks earlier (sep 21), so all existing signet nodes that haven’t updated in the past week are running code that’s incompatible with current master/rc3.
I thought the whole point of #20564 was to avoid disconnecting peers running other software unnecessarily, so I’m at a complete loss as to why we’d want to simultaneously introduce code that disconnects earlier versions of our own code, especially ones that have already been release candidates. This just makes absolutely no sense to me.
Deliberately disconnecting nodes for behaving in ways we know is not harmful seems very backwards to me.
It’s not harmful but also, it’s unclear what to do in that case? Should we ignore the message?
Yes. Ignoring messages is how we tolerate other nodes doing protocols we don’t support – in this case the unsupported protocol is the earlier version of sendaddrv2/bip155.
Should be back-patch addrv2 support into the existing connection?
That would be fine too – it’d be removing three lines of code instead of one – but it doesn’t seem particularly necessary. The idea isn’t that you have to support every protocol every other node uses, you just have to tolerate them as far as reasonably possible.
The reason for adding the disconnection is that at
verack
time the protocol negotiation is finalized. This might allow for some kinds of optimizations, as well as knowing for sure whether a peer supports addrv1 or not at a clear point in time. This is useful if in the future there are plans to phase out addrv1.
That’s the reason for finalising it before verack, and ignoring it afterwards; it’s not a reason for disconnecting peers who send it afterwards.
The reason for adding the disconnection is just to make it as obvious as possible for anyone developing new software and testing it against core that they’ve misimplemented the spec (ref)– they’ll test it, get an immediate disconnect, debug it, read the spec and go “oops” and fix it. Or at least that was the reasoning for disconnecting when receiving wtxidrelay late. And that’s free if you’ve got a new message for the feature that no one else has ever used before in an incompatible way. But while that was true for wtxidrelay, it’s not true for sendaddrv2. It’s obviously not true on signet, but it’s not true on mainnet either.
I mean, ultimately, it doesn’t matter – everyone running master/0.21rc’s released this past quarter will upgrade pretty soon, and if the upgrade’s a bit ugly, that’s survivable too. But keeping our p2p graph well connected is pretty important, and this just seems… way too carefree for something that important is the politest phrasing I can come up with.
I’m at a complete loss as to why we’d want to simultaneously introduce code that disconnects earlier versions of our own code, especially ones that have already been release candidates.
master
isn’t considered to be stable and shouldn’t be used by anyone except developers. rc
s might be considered more stable, but we never promised to support rc
s that have been followed up by other rc
s or point releases. rc
s should only be used to do initial testing. We shouldn’t assume that people download rc1 and run it in production for years.
I don’t think it matters much if we ignore the message or disconnect. The only reason I am NACK on this pull is that it is adding signet specific code for an issue that has nothing to do with signet fundamentally.
I also dislike adding signet-specific exception code here.
As far as disconnecting for late-term sendaddrv2 goes, I think the discussion has been primarily between (1) accepting and acting upon sendaddrv2 at any point and (2) rejecting it entirely if it arrives too late. The possibility of just ignoring it if it arrives too late also exists, and doesn’t incur the risk of needing to forever support clients requesting it late either.
Conceptually, I think it makes sense to disconnect connections whenever a peer asks us something we cannot provide, and we have no less invasive way of signalling that. E.g. we disconnect a peer that asks for a block we don’t have, because it’s the polite thing to do - otherwise they’d keep waiting for us.
I don’t think that really applies here. sendaddrv2 means “I understand addrv2 and prefer it”, but it’s only a minor disservice to a peer to not obey it. That’s not the same as wtxid relay, which specifies that transaction announcements must use wtx - in that case there is no way we can still service the peer’s request.
I don’t think it’s a disaster to disconnect on late sendaddrv2, as it is a violation of the protocol spec. But there also isn’t a very strong reason for it.
Dropped 5866a3ef2948945b15b936b572cae1a7321849a9 as #20637 is now resolved (and I agree that the fixes that arrived after this one are preferable).
In light of the discussion, replaced it with 2327cadff6685ec353f33f409c0693310af2baa3 that ignores post-verack sendaddrv2 messages rather than disconnecting, and updated the PR title/description. Feel free to suggest closing this as well.
Right, the solution to ignore seems most reasonable to me if we don’t disconnect for it.
review ACK 2327cadff6685ec353f33f409c0693310af2baa3
ok, I’m ±0. The code seems fine and safe but I don’t see a huge benefit to making this change.
The benefit of a polite disconnect is pointed out by @ajtowns above - it’s immediately obvious to implementers that they’ve done something wrong if they send a sendaddrv2
after verack
, whereas with this change it’s a bit harder to troubleshoot.
review ACK 2327cadff6685ec353f33f409c0693310af2baa3
The code looks correct. However, the only software affected by this is Bitcoin Core pre-release versions. All version that are affected by the incompatibility won’t be made compatible with this change. This simply changes incompatibility by disconnection to incompatibility by not sending addrv2. Disconnection makes it easier to spot incorrectly implemented new software, because not everyone reads the debug log or even has -debug=net enabled. So longer term, if this gets merged, it might also be good to revert it back to disconnection.
Concept meh. A quick disconnect indeed seems more developer friendly, all things equal.
I also run signet seed nodes at v0.21.0rc3, which I’m happy to update at the next release candidate.
Code review ACK 2327cad (the first documentation commit is worth merging in any case)
This needs a 0.21 milestone.
Let’s also clarify in net.h
that fSuccessfullyConnected
implies VERACK
has been sent or received.
The change to ignore rather than disconnect might be safer wrt p2p network connectivity for the launch of 0.21, but we seem to be mitigated here, so I’ve dropped the commit unless there is consensus to bring it back.
Let’s also clarify in
net.h
thatfSuccessfullyConnected
impliesVERACK
has been sent or received.
Done.
2553@@ -2555,10 +2554,11 @@ void PeerManager::ProcessMessage(CNode& pfrom, const std::string& msg_type, CDat
2554 return;
2555 }
2556
2557+ // BIP155 defines feature negotiation of addrv2 and sendaddrv2, which must happen
Thanks for catching the French spelling slipping into the English :smiley:; fixed the commit message.
BIP155 refers to both addrv2
and sendaddrv2
as messages and doesn’t use the word “feature”, so I think the current wording is ok but happy to consider a new/better proposal.
Concept ACK, referring to BIPs makes the life of potential code readers easier
nit: in the commit message, s/negociation/negotiation/
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
No conflicts as of last run.
and add fSuccessfullyConnected doxygen documentation
to clarify that it is set to true on VERACK