Allow block announcements with headers

sdaftuar commented at 8:03 pm on July 30, 2015: member

This is an implementation of #5982 (and based off @sipa’s commit in #5307, which has been squashed into this commit).

Currently new blocks are announced via inv messages. Since headers-first download was introduced in 0.10, blocks are not processed unless the headers for the block connect to known, valid headers, so peers receiving inv messages respond with a getheaders and a getdata: the expectation is that the announcing peer will send the headers for the block and any parents unknown to the peer before delivering the new block.

In the case of a reorg, nodes currently announce only the new tip, and not the blocks that lead to the tip. This means that an extra round-trip is required to relay a reorg, because the peer receiving the inv for the tip must wait for the response to the getheaders message before being able to request the missing blocks. (For additional reasons discussed in #5307 (comment), it’s not optimal to inv all the blocks added in the reorg to a headers-first peer, because this results in needless traffic due to the way getheaders requests are generated when direct-fetching inv‘ed blocks.)

This pull implements a new sendheaders p2p message, which a node sends to its peers to indicate that the node prefers to receive new block announcements via a headers message rather than an inv. This implementation does a few things:

When announcing a new block to a peer that prefers headers, try to announce all the blocks that are being added, back to the last fork point, by sending headers for each.
Avoid sending headers for blocks that are already known to the peer (determined by checking pindexBestKnownBlock and pindexBestHeaderSent and their ancestors).
- pindexBestKnownBlock is updated based on inv messages, headers sent from that peer, and getheaders messages sent from the peer.
- pindexBestHeaderSent is a new variable that tracks the best header we have sent to the peer
Avoid sending more than 8 headers to the peer. This code is designed to be optimized for relaying at the tip, so a large reorg is some kind of error condition; in that case avoid spamming the peer with a lot of headers and instead fall back to the old inv mechanism.
If the headers to be announced aren’t known to connect to headers that the peer has, then revert to sending an inv at the tip instead. This is designed to avoid DoS points from sending headers that don’t connect. Since every new block should generally be announced by one peer in any pair of peers, it’s expected that once headers-announcement begins on a link (ie once two nodes are known to be synced), it should be able to continue.
After #6148 is merged, this would allow pruning nodes to generally be able to relay reorgs to peers via headers announcements. The code that implements direct fetching upon receipt of a headers message will send getdata requests to any peer that announces a block via a headers message.

This pull includes a new python p2p test exercising the new functionality.

BIP 130 describes the proposed new sendheaders p2p message here: https://github.com/bitcoin/bips/blob/master/bip-0130.mediawiki

rebroad commented at 8:55 pm on July 30, 2015: contributor

This isn’t really needed is it? If a node is already sending “getheaders” then presumably a node can assume it prefers headers rather than invs, perhaps?

sdaftuar commented at 0:28 am on July 31, 2015: member

@rebroad Sorry, I just realized I should have emphasized more clearly that this PR also includes code to enable direct-fetching of blocks based on headers messages. Without that change, nodes that received a headers message would otherwise wait to download blocks through the existing parallel fetch mechanism, which would generally make headers-announcement inferior to inv-announcements (because we have direct-fetch code in the inv-processing logic, which means we’d be quicker to request a new block).

Consequently I think it makes sense for nodes to somehow opt-in to the new behavior. 0.10 and 0.11 nodes should continue to receive inv’s even after this code is merged.

One thing that wasn’t clear to me was whether it’s really necessary to do the protocol version bump in order to deploy the new p2p message, given that the new message is backwards compatible – nodes are free to ignore it. However I’m not sure I have a good grasp of how the protocol versions should be thought about.

laanwj added the label P2P on Jul 31, 2015

rebroad commented at 11:35 am on July 31, 2015: contributor

@sdaftuar Why not use the existing parallel fetch mechanism? I don’t see any advantage in fetching the block outside of that.

sipa commented at 12:37 pm on July 31, 2015: member

The existing parallel fetch mechanism has a delay and an extra roundtrip, because it first needs to ask for headers before it can start fetching any block.

Sending all headers immediately when announcing the block avoids that round-trip, potentially improving propagation speed significantly over high-latency links.

sdaftuar force-pushed on Aug 3, 2015

sdaftuar commented at 5:44 pm on August 3, 2015: member

Fixed @casey’s nits

sdaftuar force-pushed on Sep 9, 2015

sdaftuar commented at 5:09 pm on September 9, 2015: member

Rebased.

sdaftuar force-pushed on Sep 22, 2015

sdaftuar commented at 3:38 pm on September 24, 2015: member

I’m planning to send the draft BIP out to the mailing list for comments, but before I do so, does anyone here have any guidance about whether it is necessary to bump the protocol version to introduce the new sendheaders p2p message? It is safe to ignore the message, but I’m not sure if its generally considered helpful to change the protocol version when adding even an optional message…

laanwj commented at 3:58 pm on September 24, 2015: member

I think it’s helpful to bump the protocol version in this case. Although it’s only an optional hint, it may provide more clarity (eg when documenting) and help troubleshooting.

laanwj commented at 9:52 am on October 6, 2015: member

How is the BIP for this coming along? Need any help?

sdaftuar commented at 2:34 pm on October 6, 2015: member

@theuni suggested some alternate ideas to the “sendheaders” p2p message: either extend the version message to include a bool that indicates the preference to receive headers announcements, or even just allocate a service bit that indicates that preference.

Between those 3 options I don’t feel strongly about the best way to deploy – and in particular I can understand why it might not be great to go with adding a new p2p message that causes node state/behavior to change. @theuni mentioned he might respond on-list with his alternate suggestions; any thoughts here on those ideas?

sipa commented at 2:42 pm on October 6, 2015: member

@sdaftuar @theuni Extending “version” continuously isn’t a very scalable approach, and requires pretty strong consensus to do, as the order of entries matters. I don’t really understand why we kept doing that for so long.

A service bit has the same “synchronization” problem, but does have the extra advantage of making the property searchable.

gmaxwell commented at 4:25 pm on October 6, 2015: contributor

I don’t think it’s important that the property is searchable. For me I’d ask should this be its own message, or should we define a “flags” message, for sending more, non-searchable capabilities.

theuni commented at 2:16 am on October 8, 2015: member

@sipa @gmaxwell My preference for not sending a new messages comes from an event-driven implementor’s POV. If a remote nodes are allowed to switch preferences mid-stream, it can greatly (and needlessly) complicate the local node’s sending logic.

The easy way to avoid that is to disallow changing the preference back once set (this seems to be the case in @sdaftuar’s BIP). Taking that a step further, it also makes sense to only accept the message early in the connection process, say directly after version/verack, and before any inv. And if that’s the case, it may as well just be considered as part of the handshake.

Essentially, I would much prefer to avoid making life-of-the-connection properties stateful unless necessary. Extending the version message makes sense to me, but I understand @sipa’s objection there. @gmaxwell’s suggestion seems reasonable, assuming that the “flags” message had the requirement of being sent directly after version/verack. Absence of the message would actively (though perhaps unknowingly) communicate the desire for default flags (historic behavior). Again, this just seems like an extension of the version message to me, but if changes there are deemed problematic, this would be my preference.

sdaftuar force-pushed on Oct 16, 2015

sdaftuar commented at 6:27 pm on October 16, 2015: member

Rebased.

gavinandresen commented at 7:34 pm on October 21, 2015: contributor

This significantly speeds up new block propagation in the normal, build-on-the-best-chain case.

Benchmark numbers from a 5-node, 4-network-hop test network I created that relays empty blocks from massachusetts to los angeles and back, twice (round-trip latency of 100 msec):

Before: 1,300 msec latency from first block-creating node sending an ‘inv’ to last node in the chain receiving the ‘block’ message

With this pull: 670 msec

So concept ACK: I haven’t reviewed the code yet, and have no opinion on bumping protocol version versus new message.

gmaxwell commented at 11:56 pm on October 21, 2015: contributor

@gavinandresen Any chance you could repeat your test with the code in #6867? (This change is great, and we should do it, but I expect that almost all of the improvement in that benchmark will be from setting TCP_NODELAY.)

gavinandresen commented at 0:18 am on October 22, 2015: contributor

Yes, I’ll re-run tomorrow when I’m back in my office.

Gavin Andresen

On Oct 21, 2015, at 7:56 PM, Gregory Maxwell notifications@github.com wrote:

@gavinandresen Any chance you could repeat your test with the code in #6867? (This change is great, and we should do it, but I expect that almost all of the improvement in that benchmark will be from setting TCP_NODELAY.)

— Reply to this email directly or view it on GitHub.

sdaftuar commented at 8:46 pm on October 23, 2015: member

I believe this is ready for review; so far it doesn’t seem like the BIP (which relates to activation/deployment only) is likely to change substantively from what was originally proposed (see bitcoin/bips#221). It would be great to get this merged for 0.12 if possible.

gmaxwell added this to the milestone 0.12.0 on Nov 5, 2015

laanwj commented at 12:22 pm on November 13, 2015: member

I don’t think it’s important that the property is searchable. For me I’d ask should this be its own message, or should we define a “flags” message, for sending more, non-searchable capabilities.

Late to the party, but I prefer the way how it is implemented now - to have a separate message for requesting this feature - to a generic ‘flags’ message, as well as to adding a version bit.

Rationale: If centralization bottlenecks can be avoided (“another set of flags to allocate”), that’s strongly preferable.

utACK, intend to test.

in src/main.cpp: in 83602363a3 outdated

4526@@ -4439,6 +4527,28 @@ bool static ProcessMessage(CNode* pfrom, string strCommand, CDataStream& vRecv,
4527             pfrom->PushMessage("getheaders", chainActive.GetLocator(pindexLast), uint256());
4528         }
4529 
4530+        CNodeState *nodestate = State(pfrom->GetId());
4531+        // If this set of headers is valid and ends in a block with more work

sipa commented at 4:10 pm on November 13, 2015:

Can’t this result in duplicate in-flights? The fact that it’s not yet our tip does not imply we’re not downloading it yet.

EDIT: nevermind, those aren’t added to vToFetch.

in src/main.cpp: in 83602363a3 outdated

5104+                BOOST_FOREACH(const uint256 &hash, pto->vBlockHashesToAnnounce) {
5105+                    BlockMap::iterator mi = mapBlockIndex.find(hash);
5106+                    assert(mi != mapBlockIndex.end());
5107+                    CBlockIndex *pindex = mi->second;
5108+                    if (chainActive[pindex->nHeight] != pindex) {
5109+                        // Bail out if we reorged away from this block

sipa commented at 4:17 pm on November 13, 2015:

Do we still want to inv in this case?

sdaftuar commented at 5:58 pm on November 13, 2015:

It’s possible (probable?) that the last entry in vBlockHashesToAnnounce is our current tip (so we do want to inv it), but it is also possible that it hasn’t made it there yet, and that last entry is some old tip that has now been reorged out.

I think it would be safe to add a check below to only inv the last item if it’s an ancestor of our current tip; does that sound right?

sipa commented at 6:09 pm on November 13, 2015:

I think it’s harmless, as the current code in this PR mimicks the old behaviour. But if we’re doing the check anyway, I think we can skip the entry.

sdaftuar commented at 6:11 pm on November 13, 2015:

I think the code to address this edge case is easy to write, but I believe this situation is so rare that it might be better to not add this code that we can’t really test? (At least I can’t figure out how I’d test it…)

I think it might be better to just add a LogPrint("net", ...) debug message in the event that we’re sending an inv for a block that is not on the main chain, rather than change network behavior. The downside to sending an extra inv in what we think is a rare situation is quite small, whereas if there’s a bug in this code somehow and it overly suppresses inv’s that could be both problematic and difficult to track down.

sipa commented at 6:13 pm on November 13, 2015:

Fair enough. It won’t hurt.

sipa commented at 4:22 pm on November 13, 2015: member

Concept ACK. Thorough code review ACK with one small nit. Lightly tested (two peers with this pull succesfully relay blocks across the internet, verified with -debug=net).

sipa commented at 4:25 pm on November 13, 2015: member

Needs rebase.

sdaftuar force-pushed on Nov 13, 2015

sdaftuar commented at 6:25 pm on November 13, 2015: member

Rebased and added log message when trying to announce a stale block.

sdaftuar commented at 10:03 pm on November 16, 2015: member

I’m looking into an issue @morcos noticed this afternoon; it seems problematic to update pindexBestKnownBlock from a locator received with a getheaders message, because that would imply we can download such blocks from our peer. Yet our peer can generate locators from headers they have rather than from blocks they have.

I am testing just removing that block of code; will update this PR once I confirm that is safe.

sipa commented at 10:42 pm on November 16, 2015: member

@sdaftuar @morcos Good catch, and too bad. I doubt it matters much.

sdaftuar force-pushed on Nov 17, 2015

sdaftuar commented at 3:40 pm on November 17, 2015: member

Updated this PR to eliminate updating pindexBestKnownBlock from the locator, and squashed back down to one commit.

The reason I had put this in initially was that I was concerned about there being a potential bootstrapping problem, but after further thought and some light testing I don’t think there’s a problem. The initial getheaders sync that happens after a connection is established should ensure that headers announcements start flowing immediately.

in src/main.cpp: in 4faca45a67 outdated

4581             if (pindexLast != NULL && header.hashPrevBlock != pindexLast->GetBlockHash()) {
4582                 Misbehaving(pfrom->GetId(), 20);
4583                 return error("non-continuous headers sequence");
4584             }
4585+            BlockMap::iterator it = mapBlockIndex.find(header.GetHash());
4586+            if (it != mapBlockIndex.end()) {

morcos commented at 6:10 pm on November 17, 2015:

It might be better to delete this check. pindexLast shouldn’t be set to a block that returns false from AcceptBlockHeader even if we already have the header.

in src/main.cpp: in 4faca45a67 outdated

4573@@ -4515,12 +4574,21 @@ bool static ProcessMessage(CNode* pfrom, string strCommand, CDataStream& vRecv,
4574         }
4575 
4576         CBlockIndex *pindexLast = NULL;
4577+        std::vector<CBlockIndex *> vToFetch;
4578+        bool fCanDirectFetch = CanDirectFetch(chainparams.GetConsensus());

morcos commented at 6:24 pm on November 17, 2015:

add !fimporting && !fReindex

sdaftuar commented at 8:40 pm on November 17, 2015:

This block of code already has a guard so that we don’t process headers received while importing/reindexing.

in src/main.cpp: in 4faca45a67 outdated

4615@@ -4542,6 +4616,28 @@ bool static ProcessMessage(CNode* pfrom, string strCommand, CDataStream& vRecv,
4616             pfrom->PushMessage("getheaders", chainActive.GetLocator(pindexLast), uint256());
4617         }
4618 
4619+        CNodeState *nodestate = State(pfrom->GetId());
4620+        // If this set of headers is valid and ends in a block with more work
4621+        // than our tip, download as much as possible.
4622+        if (fCanDirectFetch && pindexLast->IsValid(BLOCK_VALID_TREE) && chainActive.Tip()->nChainWork < pindexLast->nChainWork) {

morcos commented at 6:25 pm on November 17, 2015:

nit: already checked fCanDirectFetch above. maybe check vToFetch.size()

in qa/rpc-tests/sendheaders.py: in 4faca45a67 outdated

22+Part 1: No headers announcements before "sendheaders"
23+a. node mines a block [expect: inv]
24+   send getdata for the block [expect: block]
25+b. node mines another block [expect: inv]
26+   send getheaders and getdata [expect: headers, then block]
27+c. node mines another block [expect: inv]

morcos commented at 7:59 pm on November 17, 2015:

comment incorrect

in qa/rpc-tests/sendheaders.py: in 4faca45a67 outdated

285+        # PART 2
286+        # 2. Send a sendheaders message and test that headers announcements
287+        # commence and keep working.
288+        test_node.send_message(msg_sendheaders())
289+        prev_tip = int("0x" + self.nodes[0].getbestblockhash() + "L", 0)
290+        test_node.get_headers(locator=[prev_tip], hashstop=0L)

morcos commented at 8:04 pm on November 17, 2015:

this shouldn’t be necessary

sdaftuar commented at 9:02 pm on November 17, 2015:

Surprisingly, this code is necessary to make the test work. Turns out there’s an unusual feature in the getheaders handling, where we set pindexBestHeaderSent to chainActive.Tip() in two situations: if we actually are sending the tip, or if we have nothing to send. I think the latter case can only happen if our peer has our tip in their locator, so I believe this behavior is desirable, if surprising.

I’ve added a comment to the relevant line of code in main.cpp explaining this behavior.

in qa/rpc-tests/sendheaders.py: in 4faca45a67 outdated

293+
294+        # Now that we've synced headers, headers announcements should work
295+        tip = self.mine_blocks(1)
296+        assert_equal(inv_node.check_last_announcement(inv=[tip]), True)
297+        assert_equal(test_node.check_last_announcement(headers=[tip]), True)
298+        test_node.get_data([tip])

morcos commented at 8:05 pm on November 17, 2015:

why do you do this?

sdaftuar commented at 1:17 pm on November 18, 2015: member

Pushed a commit to address @morcos’ comments.

In cleaning up the RPC test I was surprised with what happens to pindexBestHeaderSent when a peer sends a getheaders with a locator that is up-to-date with our tip. Though I took out the code to parse the locator and update pindexBestKnownBlock, it turns out that if we generate no headers to return then we will end up setting pindexBestHeaderSent to equal chainActive.Tip().

This behavior was just due to a quirk in the code (the test that does that is checking to see if we’ve walked off the end of chainActive and assumes that if we’re at a NULL pointer then we must have returned our tip to the peer), but I believe this is correct behavior. I added a comment to the code explaining this quirk, but am leaving it in because I think it’s correct and beneficial.

On a separate note: I was reminded when I looked at this again that if a peer sends us a getheaders with a locator that is caught up to our tip, then we’ll send back an empty headers message. Is it important that this behavior be preserved? I left it unchanged for now, but I can clean this up and suppress an empty response if it’s safe to do so, either in this pull or separately.

sdaftuar force-pushed on Nov 23, 2015

Allow block announcements with headers

This replaces using inv messages to announce new blocks, when a peer requests
(via the new "sendheaders" message) that blocks be announced with headers
instead of inv's.

Since headers-first was introduced, peers send getheaders messages in response
to an inv, which requires generating a block locator that is large compared to
the size of the header being requested, and requires an extra round-trip before
a reorg can be relayed.  Save time by tracking headers that a peer is likely to
know about, and send a headers chain that would connect to a peer's known
headers, unless the chain would be too big, in which case we revert to sending
an inv instead.

Based off of @sipa's commit to announce all blocks in a reorg via inv,
which has been squashed into this commit.

9fa8c48abb

sdaftuar force-pushed on Nov 23, 2015

sdaftuar commented at 8:49 pm on November 23, 2015: member

I discovered there were some problems in the direct fetch logic.

The code was structured so that we could only direct fetch blocks which were announced in a set of headers. However, in the case of a reorg, we may have some of the headers on the reorged-to-chain, which our peers will not re-announce to us – meaning that the direct-fetch logic wouldn’t request needed missing blocks immediately.

I’ve restructured the direct fetch logic so that if we want to download towards the latest block a peer has announced to us, then we walk back from that tip and consider every block until we find one that is on our current chain. (We try to download every block that we don’t have that is not already in-flight, and we bail out if we’re walking back too many blocks.)

I also realized that sendheaders.py had no tests for the direct fetch logic, so I updated it to exercise that part of the code as well.

dcousens commented at 1:39 am on November 24, 2015: contributor

Great work @sdaftuar

in src/main.cpp: in 9fa8c48abb

4583@@ -4521,6 +4584,7 @@ bool static ProcessMessage(CNode* pfrom, string strCommand, CDataStream& vRecv,
4584                 Misbehaving(pfrom->GetId(), 20);
4585                 return error("non-continuous headers sequence");
4586             }
4587+            BlockMap::iterator it = mapBlockIndex.find(header.GetHash());

sdaftuar commented at 1:18 pm on November 24, 2015:

Looks like I forgot to remove this.

sipa commented at 1:03 pm on November 28, 2015: member

Needs rebase.

sipa commented at 12:06 pm on November 29, 2015: member

Closed via #7129.

sipa closed this on Nov 29, 2015

rebroad referenced this in commit 44fd3250da on Dec 7, 2016

MarcoFalke locked this on Sep 8, 2021

Allow block announcements with headers #6494