Relay compact block messages prior to full block connection

TheBlueMatt commented at 7:35 am on December 19, 2016: member

First clean up net_processing to never have non-const access to CBlockIndex*es.

Then take advantage of #9026 to relay compact block announcements prior to full validation of the block. This has the (incredibly) nice property of not re-deserializing the same block over and over again in SendMessages when we go to announce the block.

Finally caches the block which was just announced to respond to getblocktxn requests without a similar deserialize (this could probably be extended to be used by other requests as well, but other requests are still going to want to lock cs_main and check that the block didn’t get rejected as invalid first).

fanquake added the label P2P on Dec 19, 2016

fanquake added the label Validation on Dec 19, 2016

gmaxwell commented at 7:40 am on December 19, 2016: contributor

could we have some benchmarks do demonstrate the gains from this? though I don’t doubt them, this is a fair amount of code… and benchmarks would be important for regression testing this important behavior.

TheBlueMatt commented at 7:58 am on December 19, 2016: member

Which part, specifically, would you like to see benchmarks on?

On December 18, 2016 11:40:22 PM PST, Gregory Maxwell notifications@github.com wrote:

could we have some benchmarks do demonstrate the gains from this? though I don’t doubt them, this is a fair amount of code… and benchmarks would be important for regression testing this important behavior.

– You are receiving this because you authored the thread. Reply to this email directly or view it on GitHub: https://github.com/bitcoin/bitcoin/pull/9375#issuecomment-267900573

gmaxwell commented at 8:41 am on December 19, 2016: contributor

latency to relay a block across many connections would be the obvious thing?

TheBlueMatt force-pushed on Dec 19, 2016

morcos commented at 4:50 pm on December 21, 2016: member

Is there ever a case where we would accept a block ourselves but we wouldn’t want to immediately announce it? I looked through all the places we call AcceptBlock/ProcessNewBlock and couldn’t immediately see a problem, but want to make sure we think about it carefully. I noticed this because I found it odd you had to change the preciousblock.py test. BTW, a better change there may just be to allow duplicate-inconclusive in the list of results..

in src/net_processing.cpp: in 0e6f73893f outdated

1531+                const CBlock& block = *most_recent_block;
1532+                BlockTransactions resp(req);
1533+                for (size_t i = 0; i < req.indexes.size(); i++) {
1534+                    if (req.indexes[i] >= block.vtx.size()) {
1535+                        Misbehaving(pfrom->GetId(), 100);
1536+                        LogPrintf("Peer %d sent us a getblocktxn with out-of-bounds tx indices", pfrom->id);

rebroad commented at 5:26 pm on December 21, 2016:

this chunk of code is duplicated elsewhere?

TheBlueMatt commented at 8:07 am on December 22, 2016:

OK, DRY’d.

in src/net_processing.cpp: in 1fcf0cc93b outdated

767+        CNodeState &state = *State(pnode->GetId());
768+        // If the peer has, or we announced to them the previous block already,
769+        // but we don't think they have this one, go ahead and announce it
770+        if (state.fPreferHeaderAndIDs && (!fWitnessEnabled || state.fWantsCmpctWitness) &&
771+                !PeerHasHeader(&state, pindex) && PeerHasHeader(&state, pindex->pprev)) {
772+            connman->PushMessage(pnode, msgMaker.Make(NetMsgType::CMPCTBLOCK, cmpctblock));

morcos commented at 7:10 pm on December 21, 2016:

maybe add LogPrint("net", "%s sending header-and-ids %s to peer %d..

TheBlueMatt commented at 8:07 am on December 22, 2016:

Fixed.

in src/net_processing.cpp: in 0e6f73893f outdated

759+void PeerLogicValidation::NewPoWValidBlock(const CBlockIndex *pindex, const std::shared_ptr<const CBlock>& pblock) {
760+    CBlockHeaderAndShortTxIDs cmpctblock(*pblock, true);
761+    CNetMsgMaker msgMaker(PROTOCOL_VERSION);
762+
763+    LOCK(cs_main);
764+    bool fWitnessEnabled = IsWitnessEnabled(pindex, Params().GetConsensus());

morcos commented at 7:35 pm on December 21, 2016:

shouldn’t this be pindex->pprev? I don’t suppose it matters much, but I think you’ll be thinking segwit activated 1 block early

TheBlueMatt commented at 8:07 am on December 22, 2016:

Fixed.

rebroad commented at 0:06 am on December 27, 2016:

@morcos don’t you mean one block late?

morcos commented at 7:54 pm on December 21, 2016: member

This seems like its going to be super awesome. code review ACK modulo nits, doing some testing…

TheBlueMatt commented at 8:08 am on December 22, 2016: member

Fixed comments, note that the SendBlockTransactions function seems a bit odd now, but the goal longer-term is to move messages into their own processing groups, where such functions will be really sane.

TheBlueMatt force-pushed on Dec 22, 2016

gmaxwell commented at 12:36 pm on December 22, 2016: contributor

Perhaps Newpowvalidblock should be gated on there actually being any peers that request a compact block from us? Otherwise we’ll go through the work of constructing a compact block there only to send it to no one. :)

Thoughts on caching the compact block like you cache the block? You would want it for non-pref peers that request it after you announce the header… and it’s reasonably small.

morcos commented at 3:25 pm on December 22, 2016: member

utACK 5bc61fa

jonasschnelli added this to the milestone 0.14.0 on Dec 22, 2016

TheBlueMatt commented at 11:31 pm on December 22, 2016: member

@gmaxwell, well those requests are somewhat contradictory, so I picked the cache-compact-representation option :p.

gmaxwell commented at 6:08 pm on December 23, 2016: contributor

@TheBlueMatt not so! as it could skip processing when there are no peers requesting, and update caches any time it processes!

I don’t think it matters except in one case:

Isolated mining node with no HB requesting peers, constructing the short block earlier will delay block validation which delays updating the create new block. I’m guessing the delay is <10ms, so probably inconsequential. But thought I’d bring it to your attention.

TheBlueMatt commented at 6:18 pm on December 23, 2016: member

Yea, I’m not worried about the processing time for calling that once. Long term I’d like to run most of the CValidationInterface callbacks into background threads, but that’s probably an 0.15-targeted thing.

On December 23, 2016 1:08:51 PM EST, Gregory Maxwell notifications@github.com wrote:

@TheBlueMatt not so! as it could skip processing when there are no peers requesting, and update caches any time it processes!

I don’t think it matters except in one case:

Isolated mining node with no HB requesting peers, constructing the short block earlier will delay block validation which delays updating the create new block. I’m guessing the delay is <10ms, so probably inconsequential. But thought I’d bring it to your attention.

– You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub: https://github.com/bitcoin/bitcoin/pull/9375#issuecomment-269026347

in src/net_processing.cpp: in 1fcf0cc93b outdated

759+    LOCK(cs_main);
760+    bool fWitnessEnabled = IsWitnessEnabled(pindex, Params().GetConsensus());
761+
762+    connman->ForEachNode([this, &cmpctblock, pindex, &msgMaker, fWitnessEnabled](CNode* pnode) {
763+        // TODO: Avoid the repeated-serialization here
764+        if (pnode->nVersion < INVALID_CB_NO_BAN_VERSION)

gmaxwell commented at 10:27 pm on December 23, 2016:

I believe this needs a fDisconnect check.

TheBlueMatt commented at 2:05 am on December 24, 2016:

Fixed

TheBlueMatt force-pushed on Dec 24, 2016

TheBlueMatt commented at 4:18 pm on December 24, 2016: member

Squashed with no diff-tree.

in src/validation.cpp: in c3084dd684 outdated

3034 {
3035     {
3036         LOCK(cs_main);
3037         for (const CBlockHeader& header : headers) {
3038-            if (!AcceptBlockHeader(header, state, chainparams, ppindex)) {
3039+            // cast away the ppindex-returns-const CBlockIndex - we're just assigning it to a CBlockIndex*

theuni commented at 8:15 pm on December 29, 2016:

Hmm, this is non-obvious to a caller. And it’s especially bad that one of them relies on the const not actually being const :(

Maybe return an updated index, or null in the case of failure? Or return a pair?

TheBlueMatt commented at 8:41 pm on December 29, 2016:

Really? I dont see anything non-obvious here? The caller passes in a reference to a const CBlockIndex*, and the function sets that element to the new CBlockIndex*, ie it is returning a const CBlockIndex*, not editing a CBlockIndex which was passed in.

ryanofsky commented at 5:31 pm on January 3, 2017:

To avoid the const cast, you could do something like:

0            CBlockIndex* pindex = nullptr;
1            if (!AcceptBlockHeader(header, state, chainparams, &pindex)) {
2                return false;
3            }
4            *ppindex = pindex;

sipa commented at 6:25 pm on January 11, 2017:

I prefer @ryanofsky’s code.

TheBlueMatt commented at 10:57 pm on January 11, 2017:

OK, changed.

in src/validation.cpp: in 1af5e30d2c outdated

3801@@ -3799,16 +3802,19 @@ bool LoadExternalBlockFile(const CChainParams& chainparams, FILE* fileIn, CDiskB
3802                     std::pair<std::multimap<uint256, CDiskBlockPos>::iterator, std::multimap<uint256, CDiskBlockPos>::iterator> range = mapBlocksUnknownParent.equal_range(head);
3803                     while (range.first != range.second) {
3804                         std::multimap<uint256, CDiskBlockPos>::iterator it = range.first;
3805-                        if (ReadBlockFromDisk(block, it->second, chainparams.GetConsensus()))

theuni commented at 8:34 pm on December 29, 2016:

Maybe teach ReadBlockFromDisk to read to a shared_ptr reference to avoid the temptation?

theuni commented at 8:37 pm on December 29, 2016:

Or just use a new precursiveblock here with limited scope

TheBlueMatt commented at 10:16 pm on December 29, 2016:

Fixed.

in src/net_processing.cpp: in a14126ad3b outdated

756+    CBlockHeaderAndShortTxIDs cmpctblock(*pblock, true);
757+    CNetMsgMaker msgMaker(PROTOCOL_VERSION);
758+
759+    LOCK(cs_main);
760+    bool fWitnessEnabled = IsWitnessEnabled(pindex->pprev, Params().GetConsensus());
761+    uint256 hashBlock(pblock->GetHash());

theuni commented at 8:53 pm on December 29, 2016:

How about generating a set of to-be-sent nodes here, with cs_main scope-locked. Then pass the set into ForEachNode and send only to the matches?

That saves cs_main (or whatever lock) from being held for the duration, and opens the door to deduped sending.

TheBlueMatt commented at 9:11 pm on December 29, 2016:

AFICT the only difference is that we can skip calling connman->PushMessage….is connman->PushMessage slow enough to care?

theuni commented at 9:18 pm on December 29, 2016:

You can avoid calling it for each node by iterating mapNodeState for the set up front.

One PushMessage is no concern, but avoiding it for the entire ForEachNode would be nice.

theuni commented at 9:25 pm on December 29, 2016:

Actually, nevermind. We can improve CConnman’s interface for this kind of thing later.

TheBlueMatt commented at 9:27 pm on December 29, 2016:

Yup, that was my (too heavily implied) point :).

in src/net_processing.cpp: in 2ad1f254aa outdated

763@@ -760,6 +764,12 @@ void PeerLogicValidation::NewPoWValidBlock(const CBlockIndex *pindex, const std:
764     bool fWitnessEnabled = IsWitnessEnabled(pindex->pprev, Params().GetConsensus());
765     uint256 hashBlock(pblock->GetHash());
766 
767+    {

theuni commented at 9:04 pm on December 29, 2016:

Make this a member of PeerLogicValidation, then you can use a weak_ptr for the global, and I think no locks are needed

TheBlueMatt commented at 9:57 pm on December 29, 2016:

I’m confused as to exactly what you’re suggesting here. I don’t particularly want a weak_ptr because I do want to hold the latest block for responses for a while after the block is connected/broadcast.

theuni commented at 1:17 am on December 30, 2016:

These are the types of changes that make me nervous about shared_ptrs, because it gets easy to start stashing things that have no obvious lifetime constraint.

I was suggesting that you hold the shared_ptr in the PeerLogicValidation class. You then use a global weak_ptr which is overwritten by each new block. Then at any random time you can use std::shared_ptr<const CBlock> bestblock = shared_ptrglobal_last_block.lock().

That way you can extend the lifetime of the whatever block is current, but it’s also obvious when it will be destroyed (with the PeerLogicValidation destruction).

TheBlueMatt commented at 3:37 am on January 1, 2017:

Hmm…does this make it that much more obvious? We still have a shared_ptr which is overwritten in NewPoWValidBlock and other sites which take references to it to extend its lifetime. I can add a comment noting that users MUST NOT extend the lifetime significantly, would that be sufficient to address your concern?

in src/net_processing.cpp: in 2ad1f254aa outdated

1103+        }
1104+        resp.txn[i] = block.vtx[req.indexes[i]];
1105+    }
1106+    LOCK(cs_main);
1107+    CNetMsgMaker msgMaker(pfrom->GetSendVersion());
1108+    int nSendFlags = State(pfrom->GetId())->fWantsCmpctWitness ? 0 : SERIALIZE_TRANSACTION_NO_WITNESS;

theuni commented at 9:11 pm on December 29, 2016:

No need to stay locked for PushMessage

TheBlueMatt commented at 9:59 pm on December 29, 2016:

Is it slow enough to matter? After #9414 (and some std::atomic usage) we need to remove the cs_main lock completely here to make getblocktxn for the current head cs_main-free.

TheBlueMatt force-pushed on Dec 29, 2016

TheBlueMatt commented at 10:18 pm on December 29, 2016: member

Fixed one of @theuni’s comments and rebased. Diff is https://0bin.net/paste/lZfeK4vg0ZdebkqX#AJKsFyZ3Ai1oiaRiUt-1S0tjFK9wrZ70aP74IepwbsY

gmaxwell commented at 2:48 am on December 30, 2016: contributor

One behavior that should be mentioned is that caching means that if I am connected to a since node via multiple channels, it will identify itself as the same node by using the same salt across them. The salt sharing is something the protocol was carefully designed to enable, and there are many other ways that a node identifies itself as equal on different links: so I don’t consider it a concern. But since we do sometimes fix ‘fingerprinting’ it might surprise someone.

(My view is that we fix fingerprinting that happens at different times– that the host that was at 1.2.3.4 is now at 2.3.4.5–, but verifying that two peers you are currently connected to are really the same peer, is too hard to avoid.)

morcos commented at 5:40 pm on December 30, 2016: member

re-ACK 8022035

huge improvement to cache the cmpctblock, reduces time to relay cmpctblock to each additional peer from 30ms to 1ms.

morcos commented at 0:29 am on December 31, 2016: member

Mentioned on IRC, combining this with #9400 can lead to intermittent failures of sendheaders.py

#9400 causes the node announcing the reorg in that test to be a HB peer, and block requests generated by FindNextBlocksToDownload in response to cmpctblocks relayed by NewPoWValidBlock are in a race condition against the tip being updated in the announcing peer.

TheBlueMatt commented at 4:16 pm on December 31, 2016: member

As @morcos points out, this needs a bit more thought - currently you might announce a compact block in HB mode, the receiving peer might then, for various reasons, request a response in the form of a full block instead of as a getblocktxn (eg #8800). At this point, you unlock cs_main prior to ActivateBestChain in ProcessNewBlock, allowing the other peer’s block request the be processed, but you will not respond because you “dont have the block”. A simple fix might be calling ActivateBestChain prior to block requests (since we cannot simply respond with the block without having validated it - this would be a protocol violation).

TheBlueMatt commented at 3:32 am on January 1, 2017: member

Pushed a new commit which appears to fix @morcos’ test failures in #9447. See the BIP clarifications at https://github.com/bitcoin/bips/pull/486 for associated info.

in src/net_processing.cpp: in 73817e2294 outdated

781+        // If the peer has, or we announced to them the previous block already,
782+        // but we don't think they have this one, go ahead and announce it
783+        if (state.fPreferHeaderAndIDs && (!fWitnessEnabled || state.fWantsCmpctWitness) &&
784+                !PeerHasHeader(&state, pindex) && PeerHasHeader(&state, pindex->pprev)) {
785+
786+            LogPrint("net", "%s sending header-and-ids %s to peer %d\n", "PeerLogicValidation::NewPoWValidBlock",

rebroad commented at 10:27 am on January 1, 2017:

__func__ can be used here?

TheBlueMatt commented at 3:35 pm on January 1, 2017:

I’m not sure that it can due to the use of the inline function - do you care to test what __FUNC__ means on different compilers in this case?

morcos commented at 5:44 pm on January 1, 2017:

I tried that in my first patch to print this. it printed “operator ()”

in src/net_processing.cpp: in 73817e2294 outdated

1524+        // known chain now. This is super overkill, but we handle it better
1525+        // for getheaders requests, and there are no known nodes which support
1526+        // compact blocks but still use getblocks to request blocks.
1527+        {
1528+            CValidationState dummy;
1529+            ActivateBestChain(dummy, Params());

morcos commented at 5:53 pm on January 1, 2017:

Just a note that we’ll be calling this for every GETBLOCKS request. But I suppose ActivateBestChain isn’t too expensive in the common case.

TheBlueMatt commented at 11:54 pm on January 1, 2017:

Yea, its lightweight-enough that we cant really consider it a DoS issue (please review, but I’m pretty sure here), and its not like we care much about the performance of GETBLOCKS anymore…

sipa commented at 8:25 pm on January 11, 2017:

All it does is a FindMostWorkChain, which should be very cheap if we’re already synced (return the end from setBlockIndexCandidates, and comparing it to chainActive).

sipa commented at 8:27 pm on January 11, 2017:

ActivateBestChain is usually called without holding cs_main (as it tries to release the lock in between activations if there are multiple). Is it necessary to call it within cs_main here?

theuni commented at 8:59 pm on January 11, 2017:

Mentioned on IRC as well:

An example of an unintended side-effect of this:

miner mines a block, hands it to ProcessNewBlock()
ProcessMessages thread simultaneously receives a GETBLOCKS
ProcessMessages calls ActivateBestChain just after miner finishes AcceptBlock() and releases cs_main.
ConnectTip ends up getting called with a null block, activating the chain with the miner’s new block, but forcing it to be read from disk
Miner’s ActivateBestChain is a no-op

The above is unlikely and harmless (minus the msec it takes to read/de-serialize from disk), but it’s an example of how someone hammering a miner with GETBLOCKS would be capable of affecting validation. Presumably with multi-threaded message processing, a node hammering with GETBLOCKS might be able to force all block reads to disk this way.

If the intention is simply to ensure that we’ve fully validated all known blocks at this moment, I’d be much more comfortable with some function that simply blocks until any current ActivateBestChain() completes. That would also protect us from oopses in the future when processing becomes multi-threaded.

TheBlueMatt commented at 6:57 pm on January 12, 2017:

@sipa Yes, moved this one out of the cs_main, but the other one is much harder to move out of cs_main (luckily the one that handles getdatas is much less likely to have long-runtime because its specifically gated already). @theuni Will follow up on IRC, though I fixed the particular issue you mentioned (not that its a response to your general concern).

sipa commented at 5:59 pm on January 1, 2017: member

I wish that we had a way to announce that we accept blocks that are not fully validated (as long as they satisfy PoW and size limits). Then compact blocks could be orthogonal to the choice of relay before/after validation.

morcos commented at 3:30 am on January 2, 2017: member

ACK 73817e2

EDIT (ACK contingent on @sdaftuar’s 3 proposed changes)

instagibbs commented at 4:45 pm on January 3, 2017: member

utACK https://github.com/bitcoin/bitcoin/commit/73817e22946d0f00957b18443635656ec40914ad

in src/net_processing.cpp: in 802203579d outdated

2796                     int nSendFlags = state.fWantsCmpctWitness ? 0 : SERIALIZE_TRANSACTION_NO_WITNESS;
2797-                    connman.PushMessage(pto, msgMaker.Make(nSendFlags, NetMsgType::CMPCTBLOCK, cmpctblock));
2798+
2799+                    LOCK(cs_most_recent_block);
2800+                    if (most_recent_block_hash == pBestIndex->GetBlockHash()) {
2801+                        connman.PushMessage(pto, msgMaker.Make(nSendFlags, NetMsgType::CMPCTBLOCK, *most_recent_compact_block));

sdaftuar commented at 6:21 pm on January 3, 2017:

The cached compact block was built using wtxid’s rather than txid’s, so we have to check to see if the peer wants compact witness ids before we can use the cached version, no?

sdaftuar commented at 6:54 pm on January 3, 2017: member

Is there ever a case where we would accept a block ourselves but we wouldn’t want to immediately announce it? @morcos @TheBlueMatt One area worth some thought is how to handle block relay while processing a reorg. Before this PR, we wouldn’t relay any block with less work than our tip unless we had completed a reorg to a new tip. After this PR, I believe we would relay blocks with work less than or equal to our tip (as compact blocks, even) before we know that the reorg would succeed.

And in fact, upon receipt of a compact block with <= work than our tip, we don’t bother trying to process. So I’m not sure there’s any benefit to relaying before validation for blocks that aren’t just building on our tip?

sdaftuar commented at 9:03 pm on January 3, 2017: member

Actually I think there are some problems with relaying blocks that have work equal to our tip. Conceptually, we shouldn’t be helping our peers end up on a tip that is competing with ours, so even though we download blocks at the same work as our tip, I think it’s important that we not relay such blocks.

More practically, in the CMPCTBLOCK handling code, we call UpdateBlockAvailability() after processing the header, which will overwrite the value of pindexBestKnownBlock for our peer. I believe this means that if we relay a block that has the same work as our tip, then our peers will “forget” that they can even download our actual tip from us(!).

My suggestion: only call NewPoWValidBlock() in AcceptBlock() for blocks that build on our tip.

in src/net_processing.cpp: in 73817e2294 outdated

948@@ -949,6 +949,16 @@ void static ProcessGetData(CNode* pfrom, const Consensus::Params& consensusParam
949                 BlockMap::iterator mi = mapBlockIndex.find(inv.hash);
950                 if (mi != mapBlockIndex.end())
951                 {
952+                    if (mi->second->nChainTx && !mi->second->IsValid(BLOCK_VALID_SCRIPTS) &&

sdaftuar commented at 9:06 pm on January 3, 2017:

I should think this through a bit more, but I believe the requirement that the block be VALID_SCRIPTS is stronger than is needed – in particular, if two blocks arrive simultaneously, we might relay both before picking which of the two is our tip, and then gating ProcessGetData() on VALID_SCRIPTS will cause us to stall a peer who picked the other.

Perhaps we should change VALID_SCRIPTS to VALID_TRANSACTIONS below, and then actually I’m not sure if we would still need this code path?

morcos commented at 9:48 pm on January 3, 2017:

@gmaxwell see this comment. I think this would cover the case of needing to cache more than one block..

TheBlueMatt commented at 1:54 am on January 4, 2017:

This seems like a rather large change (that I dont particularly want to do in this PR). Is it sufficient to not do the pre-forward unless the new block builds on our tip?

morcos commented at 3:02 am on January 4, 2017:

commented on IRC, but no i think we need a fix here.

Otherwise we can be in a situation where we stall a peer who asks for a block we just announced to them.

i.e. We try and connect A and A’ near simultaneously as new tips. We announce both, but whichever gets connected first, the other we will stall requests for. This can happen even without multi-threaded ProcessMessages via a race with submitblock.

sdaftuar commented at 5:15 pm on January 4, 2017:

I think it is safe to drop the VALID_SCRIPTS check altogether (ie no need to replace with any other validity check such as VALID_TRANSACTIONS as I suggested above) when we are setting the send variable below, and we don’t need to call ActivateBestChain() here at all. We are still adequately protected against fingerprinting attacks with the GetBlockProofEquivalentTime() check.

When we added the fingerprinting protection, it was the case that we only relayed things that were on our chain (at some point), so the VALID_SCRIPTS check was fine as a belt-and-suspenders test, but now that this PR relaxes that requirement, I think it makes sense to remove.

sdaftuar commented at 5:16 pm on January 4, 2017:

ping @sipa, please see above…

sdaftuar commented at 7:25 pm on January 6, 2017:

Ok after further thought and offline discussion, I retract my objection to leaving the VALID_SCRIPTS check in.

My original concern was that there were some race conditions where we might announce a block pre-validation, and then never test the block’s validity, resulting in stalling block download to a peer for a valid block. After the change to only do pre-validation announcements for a single block at a given height and only for blocks that build on our tip, I think this is no longer an issue.

A secondary concern was generally figuring out how to handle block requests for invalid blocks. In the typical case of a cmpctblock announcement being followed by a getblocktxn request, I believe everything works as intended (a node would provide the blocktxn, and the peer would not punish if the block turned out to be invalid). However if a peer somehow fell back to a getdata request for that block, then this code would cause us to stall the peer’s block download, likely resulting in them disconnecting us.

The alternative, however, of delivering the block when we know it is invalid is potentially even worse – they would ban us, absent more code changes to handle this situation (which #9026 overlooked), which is a lot more code complexity.

I haven’t been able to come up with realistic scenarios where someone could provoke these conditions (ie, produce an invalid block in such a way as to also cause nodes to fall back to getdata responses to the cmpctblock announcement), so I think this is an acceptable behavior to introduce for now. In the future, though it would be nice to add a protocol extension so that we could have the option of telling our peer that a block request is being ignored, so that the peer could do something smarter than just disconnect us for stalling download.

To remind us, I think it’d be nice to add a comment in ProcessGetData() that mentions this issue, something like:

0// TODO: extend the protocol to allow us to tell our peer that we're not going to deliver a block that we've previously announced, eg because we don't think the block is valid.
1// This would give allow our peer to be make a better decision than just wait indefinitely or disconnect us for stalling block download.

TheBlueMatt commented at 1:59 am on January 4, 2017: member

Pushed fixes for @sdaftuar’s comments, note that travis should fail the second-to-last-one as I added a test for the bug identified.

Make CBlockIndex*es in net_processing const 80175472d1

[qa] Make compact blocks test construction using fetch methods 9a0b2f4c5b

[qa] Avoid race in preciousblock test.

If node 0 is sufficiently fast to announce its block to node 1,
node 1 might already have the block by the time the
node_sync_via_rpc loop gets around to node 1, resulting in the
submitblock result "duplicate-inconclusive" as node 1 has the block,
but prefers an alternate chain.

8baaba653e

Call AcceptBlock with the block's shared_ptr instead of CBlock& 180586fd44

TheBlueMatt force-pushed on Jan 4, 2017

TheBlueMatt commented at 8:57 pm on January 4, 2017: member

Rebased to clean up merge conflict, squashing the fixup commits. No changes to code.

Add a CValidationInterface::NewPoWValidBlock callback 6987219577

Relay compact block messages prior to full block connection c802092142

Cache most-recently-announced block's shared_ptr 9eaec08dd2

Cache most-recently-connected compact block 5749a853b9

Ensure we meet the BIP 152 old-relay-types response requirements

In order to do this, we must call ActivateBestChain prior to
responding getdata requests for blocks which we announced using
compact blocks.

For getheaders responses we dont need code changes, but do note
that we must reset the bestHeaderSent so that the SendMessages call
re-announces the header in question.

While we could do something smarter for getblocks, calling
ActivateBestChain is simple and more obviously correct, instead of
doing something more similar to getheaders.

See-also the BIP clarifications at
https://github.com/bitcoin/bips/pull/486

9eb67f5000

TheBlueMatt force-pushed on Jan 5, 2017

TheBlueMatt commented at 3:35 pm on January 5, 2017: member

Added a check that we only ever fast-announce once per block height, see https://0bin.net/paste/f1-cJq66rGul-VBY#vUW-L7QZR4+e8u/vx1ZNk1x3+5qn50ysFOghGC1gyLy I dont think this is strictly required, but it simplifies review significantly and is likely a better behavior anyway, since you can never reconstruct two compact blocks at the same height reasonable right now anyway.

Also fixed a rebase issue where a diff chunk ended up in the wrong commit.

Avoid holding cs_most_recent_block while calling ReadBlockFromDisk c1ae4fcf7d

in src/net_processing.cpp: in 5749a853b9 outdated

2868-                    CBlockHeaderAndShortTxIDs cmpctblock(block, state.fWantsCmpctWitness);
2869+
2870                     int nSendFlags = state.fWantsCmpctWitness ? 0 : SERIALIZE_TRANSACTION_NO_WITNESS;
2871-                    connman.PushMessage(pto, msgMaker.Make(nSendFlags, NetMsgType::CMPCTBLOCK, cmpctblock));
2872+
2873+                    LOCK(cs_most_recent_block);

ryanofsky commented at 5:50 pm on January 5, 2017:

Might be good to release cs_most_recent_block in the else case during ReadBlockFromDisk.

morcos commented at 6:19 pm on January 6, 2017: member

I’m the boy who cried ACK (4 times already) on this PR, so I’ll refrain from repeating it again, but your last changes look good to me and I feel like all open concerns are addressed.

I’m going to put it into action combined with #9441 on a few nodes.

morcos commented at 3:26 am on January 7, 2017: member

@TheBlueMatt What do you think about making the LogPrints say: sending header-and-ids 0000myluckyhash to peer=8888 instead of sending header-and-ids 0000myluckyhash to peer 8888

You only add one of those in this PR, the other LogPrint for header-and-ids does the same thing, but its a bit annoying that they are formatted differently than other net debug messages.

TheBlueMatt commented at 4:49 pm on January 7, 2017: member

@morcos did it separately in #9486.

in src/validationinterface.h: in c1ae4fcf7d outdated

 7@@ -8,6 +8,7 @@
 8 
 9 #include <boost/signals2/signal.hpp>
10 #include <boost/shared_ptr.hpp>
11+#include <memory>

laanwj commented at 12:47 pm on January 10, 2017:

This include is unnecessary

sipa commented at 8:04 pm on January 11, 2017:

I don’t think we should be relying on indirect includes, and this file does directly use shared_ptr.

TheBlueMatt commented at 4:31 pm on January 10, 2017: member

The include is only unnecessary because I’m sure boost/shared_ptr includes it. It is appropriate to add because we’re now using shared_ptrs in one of the functions.

On January 10, 2017 4:47:55 AM PST, “Wladimir J. van der Laan” notifications@github.com wrote:

laanwj commented on this pull request.

@@ -8,6 +8,7 @@

#include <boost/signals2/signal.hpp> #include <boost/shared_ptr.hpp> +#include

This include is unnecessary

– You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub: https://github.com/bitcoin/bitcoin/pull/9375#pullrequestreview-15907842

laanwj commented at 12:55 pm on January 11, 2017: member

Ok, fair enough.

laanwj commented at 12:59 pm on January 11, 2017: member

utACK c1ae4fc

in src/net_processing.cpp: in 9eaec08dd2 outdated

1606-            }
1607-            resp.txn[i] = block.vtx[req.indexes[i]];
1608-        }
1609-        int nSendFlags = State(pfrom->GetId())->fWantsCmpctWitness ? 0 : SERIALIZE_TRANSACTION_NO_WITNESS;
1610-        connman.PushMessage(pfrom, msgMaker.Make(nSendFlags, NetMsgType::BLOCKTXN, resp));
1611+        SendBlockTransactions(block, req, pfrom, connman);

sipa commented at 8:03 pm on January 11, 2017:

Should we unlock cs_main before this call?

TheBlueMatt commented at 10:49 pm on January 11, 2017:

We take cs_main again inside SendBlockTransactions, so I dont think its worth unlocking here.

sipa commented at 10:54 pm on January 11, 2017:

It probably doesn’t matter too much here, but it’s easier to reason about function that are only called with certain locks held or not held, and not both depending on the circumstances.

TheBlueMatt commented at 10:59 pm on January 11, 2017:

I hope to remove the cs_main in the function (and then not call it with cs_main) after #9419 + a few more changes.

theuni commented at 9:17 pm on January 11, 2017: member

utACK all except #9375 (review). (see IMO we need a more general-purpose solution for fencing here, otherwise we’re asking for future bugs.

Use a temp pindex to avoid a const_cast in ProcessNewBlockHeaders 0df777db6d

Call ActivateBestChain without cs_main/with most_recent_block

There is still a call to ActivateBestChain with cs_main if a peer
requests the block prior to it being validated, but this one is
more specifically-gated, so should be less of an issue.

962f7f054f

Add comment to describe callers to ActivateBestChain 73666ad059

TheBlueMatt commented at 8:17 pm on January 12, 2017: member

Address all of @sipa’s comments, and added a comment to (hopefully) appease @theuni.

morcos commented at 9:38 pm on January 12, 2017: member

OK, I’m going for number 5!

tested ACK c1ae4fc utACK 73666ad

To reemphasize, the caching in this PR provides a huge performance win.

theuni commented at 3:08 am on January 13, 2017: member

@TheBlueMatt Thanks for adding the comment. utACK 73666ad05932429c860efe74eb388d212c152fc4.

sdaftuar commented at 3:58 pm on January 13, 2017: member

utACK 73666ad

in src/net_processing.cpp: in 5749a853b9 outdated

753@@ -754,10 +754,11 @@ void PeerLogicValidation::SyncTransaction(const CTransaction& tx, const CBlockIn
754 
755 static CCriticalSection cs_most_recent_block;
756 static std::shared_ptr<const CBlock> most_recent_block;
757+static std::shared_ptr<CBlockHeaderAndShortTxIDs> most_recent_compact_block;

sipa commented at 6:48 pm on January 13, 2017:

Can this be a std::make_shared<const CBlockHeaderAndShortTxIDS> (slightly easier to reason about thread safety when shared_ptr objects are const).

TheBlueMatt commented at 8:50 pm on January 13, 2017:

Ok, done.

in src/net_processing.cpp: in 5749a853b9 outdated

2871-                    connman.PushMessage(pto, msgMaker.Make(nSendFlags, NetMsgType::CMPCTBLOCK, cmpctblock));
2872+
2873+                    LOCK(cs_most_recent_block);
2874+                    if (most_recent_block_hash == pBestIndex->GetBlockHash()) {
2875+                        if (state.fWantsCmpctWitness)
2876+                            connman.PushMessage(pto, msgMaker.Make(nSendFlags, NetMsgType::CMPCTBLOCK, *most_recent_compact_block));

sipa commented at 8:34 pm on January 13, 2017:

Idea for follow-up PR: cache the CSerializedNetMsg instead of the CBlockHeaderAndShortTxIDs (smaller and faster to process)?

sipa commented at 8:41 pm on January 13, 2017: member

utACK 73666ad05932429c860efe74eb388d212c152fc4

TheBlueMatt commented at 8:50 pm on January 13, 2017: member

Pushed one (hopefully last) commit which only adds two const’s to net_processing at @sipa’s request.

Make most_recent_compact_block a pointer to a const 02ee4eb263

in src/net_processing.cpp: in 38bcdf72fd outdated

758+static std::shared_ptr<const CBlockHeaderAndShortTxIDs> most_recent_compact_block;
759 static uint256 most_recent_block_hash;
760 
761 void PeerLogicValidation::NewPoWValidBlock(const CBlockIndex *pindex, const std::shared_ptr<const CBlock>& pblock) {
762-    std::shared_ptr<CBlockHeaderAndShortTxIDs> pcmpctblock = std::make_shared<CBlockHeaderAndShortTxIDs> (*pblock, true);
763+    std::shared_ptr<const CBlockHeaderAndShortTxIDs> pcmpctblock = std::make_shared<CBlockHeaderAndShortTxIDs> (*pblock, true);

sipa commented at 9:14 pm on January 13, 2017:

Also add const to the make_shared parameter?

TheBlueMatt commented at 9:28 pm on January 13, 2017:

Ok, did so and squashed it into the previous commit.

TheBlueMatt force-pushed on Jan 13, 2017

sipa merged this on Jan 13, 2017

sipa closed this on Jan 13, 2017

sipa referenced this in commit 3908fc4728 on Jan 13, 2017

in src/net_processing.cpp: in 02ee4eb263

968+                        {
969+                            LOCK(cs_most_recent_block);
970+                            a_recent_block = most_recent_block;
971+                        }
972+                        CValidationState dummy;
973+                        ActivateBestChain(dummy, Params(), a_recent_block);

rebroad commented at 2:21 pm on January 17, 2017:

I’m struggling to understand why this is really needed, unless it is here to cater for a re-org that goes back more than 30 days as the logic below allows blocks to be sent that are not in the main chain but are less than 30 days old.

gladcow referenced this in commit 0e096e7756 on Mar 8, 2018

gladcow referenced this in commit ee728dd709 on Mar 13, 2018

gladcow referenced this in commit 086f64731f on Mar 14, 2018

gladcow referenced this in commit d36a8179bf on Mar 15, 2018

gladcow referenced this in commit 22d0d3a7dc on Mar 15, 2018

gladcow referenced this in commit 2783896a39 on Mar 15, 2018

gladcow referenced this in commit 060cc80016 on Mar 15, 2018

gladcow referenced this in commit ba9f284b30 on Mar 24, 2018

gladcow referenced this in commit e7ae1ecfb2 on Apr 4, 2018

UdjinM6 referenced this in commit bc45a2f87a on Apr 11, 2018

andvgal referenced this in commit fd5c50bc2b on Jan 6, 2019

CryptoCentric referenced this in commit dd3fd51204 on Feb 28, 2019

MarcoFalke locked this on Sep 8, 2021

Relay compact block messages prior to full block connection #9375