p2p: Prevent block index fingerprinting by sending additional getheaders messages

dergoegge commented at 4:47 pm on March 15, 2022: member

The block index might contain stale blocks that are not part of the main chain. If a malicious peer is able to probe a node’s block index for certain stale blocks then it can use this information to fingerprint the node.

When receiving headers (either through a cmpctblock or headers messages) a node will send getheaders if the predecessor of the first header does not exist. This leaks information from the block index if the predecessor of the header is a stale block because no getheaders will be sent in that case revealing that the stale block exists in the index.

This PR prevents this fingerprinting by sending additional getheaders messages in cases where not doing so leaks the existence of stale blocks. To determine when additional messages should be send, we introduce the PeerManagerImpl::m_chain_tips_sets map which keeps track of seen chain tips per network, effectively creating a per network view of the node’s global block index. We only try to accept new headers if they connect to anything in our global index and they connect to our active chain or to a chain that was previously sent to us by a peer on the same network. We send a getheaders message should these conditions not be met.

fanquake added the label P2P on Mar 15, 2022

dergoegge force-pushed on Mar 15, 2022

DrahtBot commented at 6:50 pm on March 15, 2022: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type	Reviewers
Concept ACK	naumenkogs, ariard, fjahr, sdaftuar, stratospher
Stale ACK	jnewbery

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#26621 (refactor: Continue moving application data from CNode to Peer by dergoegge)
#26593 (tracing: Only prepare tracepoint arguments when actually tracing by 0xB10C)
#26378 (refactor: Pass reference to last header, not pointer by MarcoFalke)
#25968 (Optimizations & simplifications following #25717 by sipa)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

dergoegge force-pushed on Mar 15, 2022

naumenkogs commented at 1:29 pm on March 19, 2022: member

Concept ACK Neat and creative finding.

As I understood it, An attacker is currently able to see whether a node received a stale block previously, by sending headers on top of that stale block (does it require proof-of-work for this attack?). This PR prevents it by sending getheaders even if a block was received already (but was marked stale).

in src/net_processing.cpp:617 in 3a11265753 outdated

613@@ -614,7 +614,7 @@ class PeerManagerImpl final : public PeerManager
614      * and in best equivalent proof of work) than the best header chain we know
615      * about and we fully-validated them at some point.
616      */
617-    bool BlockRequestAllowed(const CBlockIndex* pindex) EXCLUSIVE_LOCKS_REQUIRED(cs_main);
618+    bool BlockRequestAllowed(const CBlockIndex* pindex, bool check_valid_scripts = true) EXCLUSIVE_LOCKS_REQUIRED(cs_main);

naumenkogs commented at 1:35 pm on March 19, 2022:

I’m not sure this is a good var name. This specific PR has nothing to do with valid scripts for example.

in src/net_processing.cpp:1458 in 3a11265753 outdated

1456 {
1457     AssertLockHeld(cs_main);
1458     if (m_chainman.ActiveChain().Contains(pindex)) return true;
1459-    return pindex->IsValid(BLOCK_VALID_SCRIPTS) && (pindexBestHeader != nullptr) &&
1460-           (pindexBestHeader->GetBlockTime() - pindex->GetBlockTime() < STALE_RELAY_AGE_LIMIT) &&
1461+    return (!check_valid_scripts || pindex->IsValid(BLOCK_VALID_SCRIPTS)) &&

naumenkogs commented at 2:23 pm on March 19, 2022:

Seems like a good refactor candidate, as these conditions are hard to follow and review.

dergoegge commented at 1:49 pm on March 21, 2022:

Good point, i added a refactor commit that also adds some comments.

in src/net_processing.cpp:2133 in 3a11265753 outdated

2128@@ -2129,7 +2129,8 @@ void PeerManagerImpl::ProcessHeadersMessage(CNode& pfrom, const Peer& peer,
2129         //   don't connect before giving DoS points
2130         // - Once a headers message is received that is valid and does connect,
2131         //   nUnconnectingHeaders gets reset back to 0.
2132-        if (!m_chainman.m_blockman.LookupBlockIndex(headers[0].hashPrevBlock) && nCount < MAX_BLOCKS_TO_ANNOUNCE) {
2133+        const CBlockIndex* prev_header_index = m_chainman.m_blockman.LookupBlockIndex(headers[0].hashPrevBlock);
2134+        if ((!prev_header_index || !BlockRequestAllowed(prev_header_index, false)) && nCount < MAX_BLOCKS_TO_ANNOUNCE) {

naumenkogs commented at 2:25 pm on March 19, 2022:

Here and below, we would benefit from explaining why it’s fine to getheaders if we have a parent, but it is a stale block

dergoegge commented at 1:51 pm on March 21, 2022:

I will actually think about this some more because i am not sure if there are no side effects to requesting headers that we already have in our index.

dergoegge force-pushed on Mar 21, 2022

dergoegge commented at 1:59 pm on March 21, 2022: member

Thank you @naumenkogs for the review!

does it require proof-of-work for this attack?

It does not, the header that the attacker uses can be completely invalid. The proof-of-work check happens later on in ChainstateManager::ProcessNewBlockHeaders. The test i included also uses headers that do not have any proof-of-work on them.

in src/net_processing.cpp:617 in 4e4150671d outdated

613@@ -614,7 +614,7 @@ class PeerManagerImpl final : public PeerManager
614      * and in best equivalent proof of work) than the best header chain we know
615      * about and we fully-validated them at some point.
616      */
617-    bool BlockRequestAllowed(const CBlockIndex* pindex) EXCLUSIVE_LOCKS_REQUIRED(cs_main);
618+    bool BlockRequestAllowed(const CBlockIndex* pindex, bool allow_potentially_invalid_headers = false) EXCLUSIVE_LOCKS_REQUIRED(cs_main);

0xB10C commented at 7:41 am on April 7, 2022:

0    bool BlockRequestAllowed(const CBlockIndex* pindex, const bool allow_potentially_invalid_headers = false) EXCLUSIVE_LOCKS_REQUIRED(cs_main);

in src/net_processing.cpp:1454 in 4e4150671d outdated

1450@@ -1451,13 +1451,38 @@ bool PeerManagerImpl::MaybePunishNodeForTx(NodeId nodeid, const TxValidationStat
1451     return false;
1452 }
1453 
1454-bool PeerManagerImpl::BlockRequestAllowed(const CBlockIndex* pindex)
1455+bool PeerManagerImpl::BlockRequestAllowed(const CBlockIndex* pindex, bool allow_potentially_invalid_headers)

0xB10C commented at 7:41 am on April 7, 2022:

0bool PeerManagerImpl::BlockRequestAllowed(const CBlockIndex* pindex, const bool allow_potentially_invalid_headers)

in src/net_processing.cpp:1462 in 4e4150671d outdated

1462+
1463+    if (!m_chainman.ActiveChain().Contains(pindex)) {
1464+        // This is a stale block and we restrict access to them under certain
1465+        // conditions to avoid leaking a fingerprint.
1466+
1467+        if (!allow_potentially_invalid_headers) {

0xB10C commented at 7:44 am on April 7, 2022:

As mentioned in the PR review club, the motivation for introducing allow_potentially_invalid_headers isn’t clear from the PR or commit.

0xB10C commented at 7:55 am on April 7, 2022: contributor

I found the wording of the first sentence of the second paragraph in the commit “[net processing] Prevent block index fingerprinting when receiving he…” to be confusing. It sounds like we now send (in)valid headers to prevent the attack.

This commit prevents malicious peers from performing such an attack by sending (in)valid headers that extend a stale chain on the victims node.

Maybe drop the “performing such an attack”

This commit prevents malicious peers from sending (in)valid headers that extend a stale chain on the victims node.

Maybe relevant for other reviewers: https://bitcoincore.reviews/24571

I’m planning to test this against a node on my reorg-Signet.

in src/net_processing.cpp:2158 in 4e4150671d outdated

2153@@ -2129,7 +2154,8 @@ void PeerManagerImpl::ProcessHeadersMessage(CNode& pfrom, const Peer& peer,
2154         //   don't connect before giving DoS points
2155         // - Once a headers message is received that is valid and does connect,
2156         //   nUnconnectingHeaders gets reset back to 0.
2157-        if (!m_chainman.m_blockman.LookupBlockIndex(headers[0].hashPrevBlock) && nCount < MAX_BLOCKS_TO_ANNOUNCE) {
2158+        const CBlockIndex* prev_header_index = m_chainman.m_blockman.LookupBlockIndex(headers[0].hashPrevBlock);
2159+        if ((!prev_header_index || !BlockRequestAllowed(prev_header_index, /* allow_potentially_invalid_headers= */ true)) && nCount < MAX_BLOCKS_TO_ANNOUNCE) {

fanquake commented at 10:27 am on April 7, 2022:

0        if ((!prev_header_index || !BlockRequestAllowed(prev_header_index, /*allow_potentially_invalid_headers=*/true)) && nCount < MAX_BLOCKS_TO_ANNOUNCE) {

Anywhere you are adding / touching named args, please use the style from the developer docs.

in src/net_processing.cpp:3535 in 4e4150671d outdated

3530@@ -3505,7 +3531,8 @@ void PeerManagerImpl::ProcessMessage(CNode& pfrom, const std::string& msg_type,
3531         {
3532         LOCK(cs_main);
3533 
3534-        if (!m_chainman.m_blockman.LookupBlockIndex(cmpctblock.header.hashPrevBlock)) {
3535+        const CBlockIndex* prev_header_index = m_chainman.m_blockman.LookupBlockIndex(cmpctblock.header.hashPrevBlock);
3536+        if (!prev_header_index || !BlockRequestAllowed(prev_header_index, /* allow_potentially_invalid_headers= */ true)) {

ariard commented at 5:06 pm on April 13, 2022:

Here and above you might add a LogPrint notifying a non-allowed header has been denied of request, like the others callsites of BlockRequestAllowed (“ignoring request from peer=%i for old blocks that isn’t in the main chain”). Nice to monitor eventual fingerprinting attacks attempts or other anomalies.

dergoegge commented at 11:37 am on April 27, 2022:

With the new approach the log message reflects whether the previous block existed or if the peer was not allowed to know about it (potential fingerprint).

ariard commented at 5:07 pm on April 13, 2022: member

Concept ACK.

Do you think such anti-fingerprinting protections should be extended to BIP157 messages request handling ? Even if there are few nodes offering such service now.

dergoegge commented at 4:13 pm on April 14, 2022: member

Thanks everyone for the review so far! I am working on addressing all your feedback. I will mark this as a draft for now and rework my approach because currently this messes with large reorgs in some instances. I am working on a functional test for these scenarios which i might PR separately as none of the current tests caught this. @ariard Good point bringing up BIP157 requests but luckily we already handle them safely, see: https://github.com/bitcoin/bitcoin/blob/b69fd5eaa99f84b62a49d7c7f48d8cee1227592a/src/net_processing.cpp#L2437

dergoegge marked this as a draft on Apr 14, 2022

DrahtBot added the label Needs rebase on Apr 20, 2022

fjahr commented at 1:36 pm on April 24, 2022: contributor

Concept ACK

dergoegge force-pushed on Apr 27, 2022

dergoegge commented at 11:41 am on April 27, 2022: member

I updated this PR with the new approach (see commits and new PR description) and it’s now ready for review again. Most of the review comments did not apply anymore so i marked them as resolved.

The previous approach can be found on this branch. The test for the reorg issue i mentioned before can be found here.

dergoegge marked this as ready for review on Apr 27, 2022

DrahtBot removed the label Needs rebase on Apr 27, 2022

in src/net_processing.cpp:656 in 2021f7c706 outdated

651@@ -640,6 +652,9 @@ class PeerManagerImpl final : public PeerManager
652     void UpdateBlockAvailability(NodeId nodeid, const uint256& hash) EXCLUSIVE_LOCKS_REQUIRED(cs_main);
653     bool CanDirectFetch() EXCLUSIVE_LOCKS_REQUIRED(cs_main);
654 
655+    void UpdateKnownChainTips(NodeId node_id, const CBlockIndex& new_index);
656+    bool IsIndexOnKnownChain(NodeId node_id, const CBlockIndex& index);

jnewbery commented at 10:09 am on May 3, 2022:

This function could be introduced in commit 0fdd0eb66a85189f2c32c33defb4322fd2a914dd ([net processing] Prevent fingerprinting when receiving header), rather than the previous commit.

in src/net_processing.cpp:2252 in 2021f7c706 outdated

2265+                if (nodestate->nUnconnectingHeaders % MAX_UNCONNECTING_HEADERS == 0) {
2266+                    Misbehaving(pfrom.GetId(), 20, strprintf("%d non-connecting headers", nodestate->nUnconnectingHeaders));
2267+                }
2268+            } else {
2269+                BlockValidationState state;
2270+                state.Invalid(BlockValidationResult::BLOCK_MISSING_PREV, "prev-blk-not-found");

jnewbery commented at 10:10 am on May 3, 2022:

It seems wrong to set a validation state in net_processing - these should be set in validation and acted on by the client. I think here you could just call Misbehaving() directly.

in src/net_processing.cpp:1086 in 2021f7c706 outdated

1082@@ -1068,6 +1083,54 @@ void PeerManagerImpl::UpdateBlockAvailability(NodeId nodeid, const uint256 &hash
1083     }
1084 }
1085 
1086+void PeerManagerImpl::UpdateKnownChainTips(NodeId node_id, const CBlockIndex& new_index)

jnewbery commented at 10:11 am on May 3, 2022:

Did you look at implementing this logic in validation? Carrying out logic on chain tips and ancestors seems like a job for validation rather than net_processing.

dergoegge commented at 2:26 pm on May 17, 2022:

Thanks for the suggestion, I moved this into validation

in src/net_processing.cpp:336 in 2021f7c706 outdated

331+    Mutex m_known_chain_tips_mutex;
332+    /**
333+     * Chain tips that this peer knows we have because they send them to us.
334+     * Limited to the number of chain tips in our global block index.
335+     */
336+    std::set<std::reference_wrapper<const CBlockIndex>, CompareCBlockIndexRef>

jnewbery commented at 10:13 am on May 3, 2022:

Would it be better to track chain tips per-network rather than per-peer? Is it a privacy leak if one peer on a network can determine that the node received a stale block header from a different peer on the same network?

dergoegge commented at 4:20 pm on May 9, 2022:

You are right, per-network would be enough, no need to keep this around for every peer. That would probably also require less changes to the tests.

naumenkogs commented at 9:11 am on May 18, 2022:

Does it mean that the attack will be still possible for a tor-only node with one bind=onion?

dergoegge commented at 10:20 am on May 19, 2022:

Could you elaborate how you think the attack would be possible in that case?

jnewbery commented at 10:18 am on May 3, 2022: contributor

Concept ACK. Nice find.

As an alternative approach, did you consider pruning stale tips from the block index once they’re more than a certain age (eg a month)? Or perhaps just making them unavailable from outside validation if no peer has informed you about them for some period of time.

dergoegge commented at 4:16 pm on May 9, 2022: member

Thanks for the review @jnewbery !

As an alternative approach, did you consider pruning stale tips from the block index once they’re more than a certain age (eg a month)?

I did and that would squash this class of fingerprinting once and for all but I decided not pursue this idea because we pass around CBlockIndex pointers in a lot of places and I was worried that we invalidate pointers by erasing them from the block index while some other component still holds a reference to them. It seems to me that the assumption “nothing ever gets deleted from the block index” is one of the reasons why we pass around CBlockIndex pointers so much and I did not want to mess with that assumption.

Or perhaps just making them unavailable from outside validation if no peer has informed you about them for some period of time.

Will take a look into going more in this direction.

DrahtBot added the label Needs rebase on May 13, 2022

dergoegge force-pushed on May 17, 2022

dergoegge commented at 2:47 pm on May 17, 2022: member

Rebased and addressed @jnewbery’s review.

We now keep track of one chain tip set per network instead of one per peer
The chain tip set logic has moved to validation.h/cpp

in src/validation.cpp:5234 in 7b847b85c2 outdated

5229+{
5230+    AssertLockHeld(cs_main);
5231+    const CBlockIndex* index{m_blockman.LookupBlockIndex(blockhash)};
5232+    if (!index) return false;
5233+
5234+    if (m_best_header->GetAncestor(index->nHeight) == index || m_best_header == index) return true;

maflcko commented at 2:57 pm on May 17, 2022:

0    if (m_best_header->GetAncestor(index->nHeight) == index) return true;

nit: Is this needed?

in src/validation.cpp:5249 in 7b847b85c2 outdated

5244+    const CBlockIndex* index{m_blockman.LookupBlockIndex(blockhash)};
5245+    if (!index) return false;
5246+
5247+    // `index` is in the chain tip set if one the tips has `index` as an ancestor.
5248+    return std::any_of(chain_tips.cbegin(), chain_tips.cend(), [&index, &m_blockman = m_blockman](const uint256& tip_hash) EXCLUSIVE_LOCKS_REQUIRED(cs_main) {
5249+        const CBlockIndex& tip_index{*Assume(m_blockman.LookupBlockIndex(tip_hash))};

maflcko commented at 2:59 pm on May 17, 2022:

0        const CBlockIndex& tip_index{*Assert(m_blockman.LookupBlockIndex(tip_hash))};

Otherwise you might be running into UB in production.

in src/validation.cpp:5262 in 7b847b85c2 outdated

5257+
5258+    const CBlockIndex* new_index{m_blockman.LookupBlockIndex(new_blockhash)};
5259+    if (!new_index) return;
5260+
5261+    const auto& last_tip_it = std::find_if(chain_tips.cbegin(), chain_tips.cend(), [&new_index, &m_blockman = m_blockman](const uint256& tip_hash) EXCLUSIVE_LOCKS_REQUIRED(cs_main) {
5262+        const CBlockIndex& tip_index{*Assume(m_blockman.LookupBlockIndex(tip_hash))};

maflcko commented at 2:59 pm on May 17, 2022:

Same

in src/net_processing.cpp:2612 in 7b847b85c2 outdated

2607@@ -2585,7 +2608,12 @@ void PeerManagerImpl::ProcessGetCFCheckPt(CNode& peer, CDataStream& vRecv)
2608 void PeerManagerImpl::ProcessBlock(CNode& node, const std::shared_ptr<const CBlock>& block, bool force_processing)
2609 {
2610     bool new_block{false};
2611-    m_chainman.ProcessNewBlock(block, force_processing, &new_block);
2612+    if (m_chainman.ProcessNewBlock(block, force_processing, &new_block)) {
2613+        const PeerRef peer{Assume(GetPeerRef(node.GetId()))};

maflcko commented at 3:01 pm on May 17, 2022:

Is peer unused?

DrahtBot removed the label Needs rebase on May 17, 2022

dergoegge force-pushed on May 17, 2022

dergoegge commented at 4:32 pm on May 17, 2022: member

Thanks for the review @MarcoFalke! I took all your suggestions.

in src/net.cpp:271 in 59a8a59bc4 outdated

266@@ -267,6 +267,15 @@ std::optional<CAddress> GetLocalAddrForPeer(CNode *pnode)
267     return std::nullopt;
268 }
269 
270+uint64_t GetUniqueNetworkID(const CConnman& connman, const CNode& node)
271+{
272+    auto local_socket_bytes = node.addrBind.GetAddrBytes();
273+    return connman.GetDeterministicRandomizer(RANDOMIZER_ID_UNIQUE_NETWORK_ID)

jnewbery commented at 12:54 pm on May 18, 2022:

(Doesn’t necessarily need to be done in this PR)

GetDeterministicRandomizer() doesn’t need to be a member of CConnman since it’s a stateless util function. It could be moved to src/random.cpp. Then, GetUniqueNetworkID() wouldn’t need to take a CConnman& argument.

dergoegge commented at 10:03 am on May 19, 2022:

GetDeterministicRandomizer() does make use of nSeed0 and nSeed1 to construct the hasher but I think we could also just be using a CSipHasher(0, 0) here, since these IDs don’t need to be unique per node. I left it as is to maintain current behavior.

in src/validation.cpp:5261 in 59a8a59bc4 outdated

5256+    AssertLockHeld(cs_main);
5257+
5258+    const CBlockIndex* new_index{m_blockman.LookupBlockIndex(new_blockhash)};
5259+    if (!new_index) return;
5260+
5261+    const auto& last_tip_it = std::find_if(chain_tips.cbegin(), chain_tips.cend(), [&new_index, &m_blockman = m_blockman](const uint256& tip_hash) EXCLUSIVE_LOCKS_REQUIRED(cs_main) {

jnewbery commented at 1:06 pm on May 18, 2022:

I don’t think performance is critical here, but it seems more efficient to pass over chain_tips once and do both the ancestor and descendant check on the same pass. Is there a reason you chose not to do that?

dergoegge commented at 10:04 am on May 19, 2022:

No reason, changed this to be done in one pass.

in src/net_processing.cpp:633 in 59a8a59bc4 outdated

628+     * our global block index.
629+     */
630+    std::map<uint64_t, std::set<uint256>> m_chain_tip_sets GUARDED_BY(cs_main);
631+
632+    /** Get the chain tip set for pfrom's network. */
633+    std::set<uint256>& GetChainTipSetForPeer(CNode& pfrom) EXCLUSIVE_LOCKS_REQUIRED(cs_main);

jnewbery commented at 1:07 pm on May 18, 2022:

Maybe name this GetChainTipSetForPeerNetwork()?

in src/validation.cpp:5249 in 59a8a59bc4 outdated

5243+
5244+    const CBlockIndex* index{m_blockman.LookupBlockIndex(blockhash)};
5245+    if (!index) return false;
5246+
5247+    // `index` is in the chain tip set if one the tips has `index` as an ancestor.
5248+    return std::any_of(chain_tips.cbegin(), chain_tips.cend(), [&index, &m_blockman = m_blockman](const uint256& tip_hash) EXCLUSIVE_LOCKS_REQUIRED(cs_main) {

jnewbery commented at 1:11 pm on May 18, 2022:

Perhaps:

0    return std::any_of(chain_tips.cbegin(), chain_tips.cend(), [&index, &blockman = m_blockman](const uint256& tip_hash) EXCLUSIVE_LOCKS_REQUIRED(cs_main) {

since blockman in the lambda is a local variable, not a member variable.

in src/net_processing.cpp:2336 in 59a8a59bc4 outdated

2222+                if (nodestate->nUnconnectingHeaders % MAX_UNCONNECTING_HEADERS == 0) {
2223+                    Misbehaving(pfrom.GetId(), 20, strprintf("%d non-connecting headers", nodestate->nUnconnectingHeaders));
2224+                }
2225+            } else {
2226+                // Same as MaybePunishNodeForBlock for BlockValidationResult::BLOCK_MISSING_PREV
2227+                Misbehaving(pfrom.GetId(), 10, "invalid header received");

jnewbery commented at 2:07 pm on May 18, 2022:

I understand that this is here to maintain existing behaviour. I think in a future PR we could remove this special casing and always handle this with the nUnconnectingHeaders method, even if nCount == MAX_BLOCKS_TO_ANNOUNCE

dergoegge commented at 10:10 am on May 19, 2022:

Agreed, but wanted to leave this for a future PR.

in src/net_processing.cpp:2186 in 59a8a59bc4 outdated

2192-        if (!m_chainman.m_blockman.LookupBlockIndex(headers[0].hashPrevBlock) && nCount < MAX_BLOCKS_TO_ANNOUNCE) {
2193-            nodestate->nUnconnectingHeaders++;
2194+        const CBlockIndex* prev_header_index = m_chainman.m_blockman.LookupBlockIndex(headers.front().hashPrevBlock);
2195+        bool prev_block_on_known_chain = prev_header_index &&
2196+                                         (m_chainman.IsBlockInMainOrBestChain(prev_header_index->GetBlockHash()) ||
2197+                                          m_chainman.IsBlockInChainTipSet(prev_header_index->GetBlockHash(), GetChainTipSetForPeer(pfrom)));

jnewbery commented at 2:16 pm on May 18, 2022:

Maybe this should be factored into its own function, instead of repeated twice.

In general, I think we should look at all calls to LookupBlockIndex() and check whether they result in externally observable behaviour changes, and consider changing those LookupBlockIndex() calls to use this function.

dergoegge commented at 10:12 am on May 19, 2022:

Put this into its own function LookupBlockIndexForPeer and replaced two more uses of LookupBlockIndex() that allowed for the same type of fingerprinting.

jnewbery commented at 2:17 pm on May 18, 2022: contributor

Concept and approach ACK. This is looking good!

dergoegge force-pushed on May 19, 2022

dergoegge commented at 10:17 am on May 19, 2022: member

Thanks @jnewbery for the review and for pointing out to me that the same fingerprinting issue exists for blocktxn requests. I added an additional commit for that as well as one more location (receiving invs for blocks).

I would encourage reviewers to have a look at all uses of LookupBlockIndex() to see if there are more places that allow for this type of fingerprinting. Any use of LookupBlockIndex() that results in externally observable behavior changes should be amended.

DrahtBot added the label Needs rebase on May 20, 2022

dergoegge force-pushed on May 23, 2022

DrahtBot removed the label Needs rebase on May 23, 2022

in src/net_processing.cpp:675 in ba6edfbdd4 outdated

670+     * Lookup a block index for a peer in a privacy preserving manner.
671+     *
672+     * The block index will only be returned if it exists in our global index
673+     * and if the peer is allowed to know about it.
674+     */
675+    const CBlockIndex* LookupBlockIndexForPeer(const CNode& pfrom, const uint256& hash)

jnewbery commented at 8:28 am on May 23, 2022:

0    const CBlockIndex* LookupBlockIndexForPeer(const CNode& node, const uint256& hash)

pfrom is the old hungarian-style naming where p stood for “pointer to”.

(same for the other new functions)

in src/net_processing.cpp:3160 in ba6edfbdd4 outdated

3155@@ -3110,7 +3156,7 @@ void PeerManagerImpl::ProcessMessage(CNode& pfrom, const std::string& msg_type,
3156             }
3157 
3158             if (inv.IsMsgBlk()) {
3159-                const bool fAlreadyHave = AlreadyHaveBlock(inv.hash);
3160+                const bool fAlreadyHave = AlreadyHaveBlock(pfrom, inv.hash);
3161                 LogPrint(BCLog::NET, "got inv: %s  %s peer=%d\n", inv.ToString(), fAlreadyHave ? "have" : "new", pfrom.GetId());

jnewbery commented at 9:25 am on May 23, 2022:

This log can now be misleading if we receive a block INV for a block we already have, but the peer shouldn’t know about, since it’ll say “new”.

dergoegge commented at 11:26 am on May 30, 2022:

Changed it to “maybe new”, is that better?

In this location we are not able to tell from the result of LookupBlockIndexForPeer whether the index did not exist or if the peer should not know about it. I could add logging LookupBlockIndexForPeer, what do you think about that?

jnewbery commented at 2:35 pm on May 30, 2022:

Changed it to “maybe new”, is that better?

That seems sufficient here.

I could add logging LookupBlockIndexForPeer, what do you think about that?

LookupBlockIndexForPeer() is a utility function which is called from several places, so I think that might be a bit noisy.

in src/net_processing.cpp:1858 in ba6edfbdd4 outdated

1852@@ -1815,9 +1853,9 @@ bool PeerManagerImpl::AlreadyHaveTx(const GenTxid& gtxid)
1853     return m_recent_rejects.contains(hash) || m_mempool.exists(gtxid);
1854 }
1855 
1856-bool PeerManagerImpl::AlreadyHaveBlock(const uint256& block_hash)
1857+bool PeerManagerImpl::AlreadyHaveBlock(const CNode& pfrom, const uint256& block_hash)
1858 {
1859-    return m_chainman.m_blockman.LookupBlockIndex(block_hash) != nullptr;
1860+    return LookupBlockIndexForPeer(pfrom, block_hash) != nullptr;

jnewbery commented at 9:25 am on May 23, 2022:

Should we just get rid of this function and inline this single line in the only place that AlreadyHaveBlock() is called?

dergoegge commented at 11:26 am on May 30, 2022:

Inlined the call to LookupBlockIndexForPeer.

jnewbery commented at 9:32 am on May 23, 2022: contributor

Code review ACK faf80f4f9cc5f05c877eaec661b9729792a14471

in src/net_processing.cpp:3308 in da1240c742 outdated

3304@@ -3305,7 +3305,7 @@ void PeerManagerImpl::ProcessMessage(CNode& pfrom, const std::string& msg_type,
3305         {
3306             LOCK(cs_main);
3307 
3308-            const CBlockIndex* pindex = m_chainman.m_blockman.LookupBlockIndex(req.blockhash);
3309+            const CBlockIndex* pindex{LookupBlockIndexForPeer(pfrom, req.blockhash)};

mzumsande commented at 4:01 pm on May 23, 2022:

How could one have used BLOCKTXN for fingerprinting considering that we’d not serve data for blocks with a height difference of <10 to our tip (MAX_BLOCKTXN_DEPTH) anyway?

dergoegge commented at 11:11 am on May 25, 2022:

Good that you made me double check, because this can actually not be used for fingerprinting but for a different reason than the one you mentioned.

If the requested block is older than MAX_BLOCKTXN_DEPTH blocks, we treat it as a normal getdata request for that block see: https://github.com/bitcoin/bitcoin/blob/8c721fff3a007db613b22464b40fe7cd6b1ead80/src/net_processing.cpp#L3278-L3287

So i thought the fingerprint would be: If the block exists in the nodes index, it will respond with blocktxn or block depending on the height. If the block does not exist, it won’t respond at all.

However, when we process block requests (ProcessGetBlockData()) we check for non main chain requests using BlockRequestAllowed(), so GETBLOCKTXN can not be abused for fingerprinting. https://github.com/bitcoin/bitcoin/blob/8c721fff3a007db613b22464b40fe7cd6b1ead80/src/net_processing.cpp#L1933-L1936

I will remove the commit.

jnewbery commented at 12:34 pm on May 25, 2022:

@mzumsande and I discussed this yesterday and we came to the same conclusion - that this could be used to determine what stale blocks the peer has that are older than 10 blocks but less than 30 days old. That’s not a fingerprint attack.

I think that we can probably just remove the whole logic below “If an older block is requested…”:

https://github.com/bitcoin/bitcoin/blob/b4f686952a60bbadc7ed2250651d0d6af0959f4d/src/net_processing.cpp#L3278-L3289)

and exit early if the block is deeper than 10 deep.

The rationale of using outbound bandwidth as DoS mitigation doesn’t seem very good to me, and if we return early because the block is older than 10 deep, then we don’t have an expensive disk read (we just look up the block index in the in-memory BlockMap and then return immediately). I think that can be done in a follow-up PR.

mzumsande commented at 4:07 pm on May 23, 2022: contributor

If I understand this correctly, the current approach is a cache m_chain_tip_sets which gets cleared each time the node stops and restarts. Since I would assume that many nodes would restart at least once in the timespan where enough alternative tips for fingerprinting exist, isn’t this effectively a limitation similar to the one-month one in BlockRequestAllowed but based on uptime instead of a fixed period, with the actual tip set database per network only having a real effect for extremely stable peers that have been running non-stop for many months or years?

IIRC, an earlier version of this PR basically enforced the 1-month limit for non-best-chain blocks of BlockRequestAllowed also for HEADERSetc. - if that approach had a problem with large reorgs, wouldn’t the same problem also apply to peers that happened to restart recently and lost their tips database (and therefore sometimes for reorgs much smaller than one month if a restart happened more recently)?

dergoegge commented at 5:15 pm on May 25, 2022: member

@mzumsande The problem with the one-month limit was that it would reject headers that connect to an old fork chain that is longer than 2000 (MAX_HEADERS_RESULTS) blocks. Example: block-index(2) With the “one month limit” approach, if block x is older than one month we reject any headers that build on top of x and re-request the 2000 headers that connect to c, so we can’t reorg to the chain of block x if needed. With the tip set approach we would also reject and re-request but when the re-requested headers arrive we add the tip of those to the tip set for that network, so further headers that connect to that tip would be not be rejected.

If a node restarts and then receives headers that connect to an old fork it would re-download the entire fork and update the tip set accordingly to allow for forks of arbitrary length.

mzumsande commented at 10:14 am on May 26, 2022: contributor

The problem with the one-month limit was that it would reject headers that connect to an old fork chain that is longer than 2000 (MAX_HEADERS_RESULTS) blocks.

Thanks a lot for the explanation, that wasn’t clear to me at all! Maybe also link this explanation in the OP so that it becomes clear why you chose the tip set approach over simpler alternatives (that wouldn’t work with large reorgs).

dergoegge force-pushed on May 30, 2022

in src/net_processing.cpp:3154 in 0f816ff2e6 outdated

3150@@ -3110,8 +3151,8 @@ void PeerManagerImpl::ProcessMessage(CNode& pfrom, const std::string& msg_type,
3151             }
3152 
3153             if (inv.IsMsgBlk()) {
3154-                const bool fAlreadyHave = AlreadyHaveBlock(inv.hash);
3155-                LogPrint(BCLog::NET, "got inv: %s  %s peer=%d\n", inv.ToString(), fAlreadyHave ? "have" : "new", pfrom.GetId());
3156+                const bool fAlreadyHave = LookupBlockIndexForPeer(pfrom, inv.hash) != nullptr;

jnewbery commented at 2:10 pm on May 30, 2022:

0                const bool fAlreadyHave{LookupBlockIndexForPeer(pfrom, inv.hash) != nullpt};

jnewbery commented at 2:12 pm on May 30, 2022: contributor

ACK 0f816ff2e60c5e440543ebb738db29bd285facb0

in src/net_processing.cpp:664 in 0f816ff2e6 outdated

659+     * Map of seen chain tip sets keyed by unique network ID.
660+     *
661+     * The size of each chain tip set is limited to the number of chain tips in
662+     * our global block index.
663+     */
664+    std::map<uint64_t, std::set<uint256>> m_chain_tip_sets GUARDED_BY(cs_main);

maflcko commented at 8:57 am on May 31, 2022:

Looks like this data structure is append-only for each new connection?

dergoegge commented at 9:58 am on May 31, 2022:

It is append only (just like the block index) but not every new connection will add something to it. If a connection from a new network is made, then a new std::set<uint256> is added to the map for that network. The std::set<uint256> itself is limited to the number of tips in the block index.

maflcko commented at 10:16 am on May 31, 2022:

I am mostly asking because I am worried about nodes running OOM after they cycled through connections for a few months

dergoegge commented at 11:38 am on May 31, 2022:

I had that same worry when writing this code but OOM errors should not be possible unless there is a bug in my implementation.

The logic for adding an entry to the m_chain_tip_sets map is the same logic that is used for the addr cache, so it can have at most as many entries as the addr cache (ie one for each network/bound address).

The chain tips sets are only modified through ChainstateManager::UpdateChainTipSet and an entry can only be added to the set if it already exists in the block index. See: https://github.com/dergoegge/bitcoin/blob/0f816ff2e60c5e440543ebb738db29bd285facb0/src/validation.cpp#L5259-L5260 If the new entry has one of the tips in the tip set as an ancestor then the existing tip is replaced, so there will never be more tips in each tip set than there are tips in the global index.

jnewbery commented at 12:44 pm on May 31, 2022:

Looks like this data structure is append-only for each new connection? @MarcoFalke note that this map is keyed by the unique network id, i.e. a combination of the peer’s network and the local bound address/port. For most nodes, there is only one unique network id (ipv4 with a single bound local address), but for nodes running tor/i2p/etc or with multiple -bind addresses, there may be a handful.

in src/net_processing.cpp:2650 in a66b8e2a0a outdated

2644@@ -2626,7 +2645,11 @@ void PeerManagerImpl::ProcessGetCFCheckPt(CNode& peer, CDataStream& vRecv)
2645 void PeerManagerImpl::ProcessBlock(CNode& node, const std::shared_ptr<const CBlock>& block, bool force_processing)
2646 {
2647     bool new_block{false};
2648-    m_chainman.ProcessNewBlock(block, force_processing, &new_block);
2649+    if (m_chainman.ProcessNewBlock(block, force_processing, &new_block)) {
2650+        LOCK(cs_main);
2651+        m_chainman.UpdateChainTipSet(block->GetHash(), GetChainTipSetForPeerNetwork(node));

sdaftuar commented at 2:23 pm on May 31, 2022:

Logically, I think this line should be in UpdatedBlockTip()? ProcessNewBlock() success does not imply that the new block was accepted as a tip, just that it passed enough checks to be stored to disk.

This is probably harmless because of the checks in UpdateChainTipSet() to ensure that we’re only adding appropriate things anyway, but someone reading the code in the future might wonder why something that operates on chain tips is here and not in UpdatedBlockTip().

dergoegge commented at 10:11 am on June 3, 2022:

UpdatedBlockTip() does not convey any information about the origin (the peer that sent us the block) of the new tip which makes it impossible to update the tip set without storing that information separately. mapBlockSource would work for that, if we wouldn’t remove elements from it in BlockChecked.

A slight improvement could be to move the UpdateChainTipSet() call a couple lines down into if(new_block)?

There are also two more calls to UpdateChainTipSet() (when receiving headers and compact blocks) that could not be moved into UpdatedBlockTip(), since UpdatedBlockTip() only fires for full blocks but we need to update the tip set if we only receive headers as well.

sdaftuar commented at 1:52 pm on June 3, 2022:

On reflection, perhaps the best thing is to leave this as-is and just add a comment mentioning that while this blockhash may not be a chain tip, it’s possible that a new header was added and so we invoke UpdateChainTipSet anyway to ensure that the header would be observable by the peer if a chain were to be built on top of it (and ancestors of existing tips are filtered anyway).

in src/validation.cpp:5235 in f8f0dd4afa outdated

5230+    AssertLockHeld(cs_main);
5231+
5232+    const CBlockIndex* index{m_blockman.LookupBlockIndex(blockhash)};
5233+    if (!index) return false;
5234+
5235+    if (m_best_header->GetAncestor(index->nHeight) == index) return true;

sdaftuar commented at 2:41 pm on May 31, 2022:

Rather than check m_best_header, I think we can permit lookup of any block header that has at least as much work as our tip. There may be multiple chain tips that have more work than our active tip, and there’s not much reason to prefer one to the others. Using m_best_header is implicitly relying on proof-of-work as an our anti-DoS measure, and so we can generalize that to any chain with more work than our tip.

dergoegge commented at 9:44 am on June 3, 2022:

This sounds good to me, however I am not sure if it should replace the m_best_header ancestor check or if it should be added as a third check.

Ancestors of m_best_header (or any header that has more work than the active tip) could have less work than the active tip and therefore simply doing the “more work” check would lead to IsBlockInMainOrBestChain returning false for them. Doing the ancestor check for all headers that have more work than the tip would be ideal but that does not seem easily doable atm.

Maybe we should use BlockRequestAllowed instead and get rid of IsBlockInMainOrBestChain. What do you think of that?

sdaftuar commented at 2:25 pm on June 3, 2022:

I was thinking about why we have this function, and I think it’s for two reasons:

We may have used our current active chain or m_best_header in generating a locator that went out in a getheaders message to this peer.
We want the common case of lookups into a high-work chain (ie where proof-of-work is used to prevent inexpensive fingerprinting) to work wherever possible, without an extra round trip.

For case 1), I mentioned elsewhere that I think we need to explicitly insert (or otherwise track) the chain tips we send in a locator into the peer’s network’s chain tip set, or else we could get a failure to process a response if we reorg in between (or if m_best_header updates to an alternate fork).

For case 2), I think we want to generally be permissive for anything that would be hard for an attacker to use to cheaply fingerprint us, but if we get everything else right then we shouldn’t need to catch every case. For instance I think if there’s some high-work header that no one on our particular network has given us and that we haven’t used in a locator, that there’s no reason we need to allow it to be observable immediately (though of course there’s also no harm in making it observable).

To that end, I think this code is fine as-is (perhaps a comment could be added
to further explain the thinking here). If we wanted to make slight improvements, we could consider things like looking at the work on the header, and if it has more work than our tip we can make it observable, or we could go further and track all chain tips that have more work than our tip (not just the one cached in m_best_header) and compare against their ancestors, etc. However, none of that seems necessary to me right now for reasonably efficient performance.

in src/net_processing.cpp:1827 in f8f0dd4afa outdated

1822+{
1823+    const CBlockIndex* index{m_chainman.m_blockman.LookupBlockIndex(hash)};
1824+    if (!index) return nullptr;
1825+
1826+    if (m_chainman.IsBlockInMainOrBestChain(hash) ||
1827+        m_chainman.IsBlockInChainTipSet(hash, GetChainTipSetForPeerNetwork(node))) {

sdaftuar commented at 3:11 pm on May 31, 2022:

If we add the criteria that a block is observable by a peer if it is an ancestor of pindexBestKnownBlock for the peer that is giving it to us, then I think we can go back to keeping the BlockRequestAllowed() rule that I believe was originally in this PR, so that old blocks that are off the main chain are generally unobservable.

This should allow us to process a large reorg from a peer while providing better fingerprinting protection, by not allowing any blocks that are sufficiently old and off the main chain from being observable by anyone (other than peer that is in the middle of giving such a chain to us). In particular this prevents fingerprinting a node over time – how useful that is, if we are using the same network address anyway is admittedly not very clear to me(!), but it just seems like an easier heuristic to understand that we never allow old blocks that are off the main chain to be observable.

If we did that, then I’m not sure we need any kind of per-network map for evaluating observability; the rule could be that an entry in our block index is observable by a given peer if:

it is on the main chain, or
it has at last as much work as our tip, or
it forks off the main chain recently (ie the BlockRequestAllowed test), or
it is an ancestor of the peer’s pindexBestKnownBlock

This seems to be sufficient for anti-fingerprinting while still allowing headers sync.

dergoegge commented at 6:22 pm on May 31, 2022:

I agree that using BlockRequestAllowed for this would be better as it comes with less complexity than what is in this PR. In practice, it would probably also always work out since we don’t really expect large reorgs (1000s of blocks). However, I don’t see how using pindexBestKnownBlock would solve the issues with that approach. pindexBestKnownBlock is of type const CBlockIndex*, so it can only point to something that already exists in our block index. We have to accept a header from a peer for us to set pindexBestKnownBlock.

The description in this gist should make it clear what the actual problem was. It is quite the edge case but I don’t know how to get around it without keeping track of which chain tip a peer is allowed to build on. In the gist’s example, pindexBestKnownBlock would still be set to block “z” for the peer that is sending us headers building on block “x”, so the ancestor check for it does not seem to work here.

In particular this prevents fingerprinting a node over time

We could also reset the chain tip set once per month (or every n blocks) to achieve this.

sdaftuar commented at 6:38 pm on May 31, 2022:

Ah right, if the peer’s best known block has more work than the last header we receive from them when we’re downloading headers towards their new tip, then we won’t reset it until we get to a header that has more work. So you’re right, what I wrote doesn’t work.

I think we could still get the job done with just a small amount of per-peer state. If we stored the last locator we sent a peer in a getheaders message, then in theory we should permit observability of any header referenced in the locator by the peer that we sent the message to. The downside to trying that now is that currently we don’t limit the number of simultaneous in-flight getheaders messages, so it is slightly messy to keep track of (potentially) multiple locators. But given that we currently generate our locators as being ancestors of some last header, perhaps it would be enough to essentially keep a set of chain tips per peer that represent the last header of some outstanding getheaders message, and then remove from that set after some timeout or after a getheaders response comes in that builds off such a tip?

Typically you’d expect the set to be empty or have one entry in it, but if somehow you triggered multiple outstanding getheaders messages, you’d be able to deal with it. (I actually have a branch I’m working on that would try to prevent sending multiple getheaders simultaneously to a peer, perhaps we can discuss offline sometime.)

If the approach I’m suggesting is too complex for now, I also think it would be ok to proceed along the lines of what you have, which seems conceptually to be a strict improvement, and we could revisit this logic down the road.

dergoegge commented at 12:34 pm on June 3, 2022:

perhaps it would be enough to essentially keep a set of chain tips per peer that represent the last header of some outstanding getheaders message

I like this idea and maybe some of the current logic in this PR could be reused for that (mainly the chain tip set logic). But its not just headers that allowed for this fingerprinting, there are two more uses of LookupBlockindex that are problematic. i.e. when receiving invs for blocks and receiving compact blocks, see here and here.

If the approach I’m suggesting is too complex for now, I also think it would be ok to proceed along the lines of what you have, which seems conceptually to be a strict improvement, and we could revisit this logic down the road.

Yea, i would prefer keeping it as is and maybe revisit later on if we find an even better solution.

sdaftuar commented at 1:04 pm on June 3, 2022:

Yea, i would prefer keeping it as is and maybe revisit later on if we find an even better solution.

Agreed, the more I think about this the more I’m concluding that the improvements I’m thinking of would be much easier if we got to a place where we’ve improved the way we send getheaders messages (so that we don’t have more than one in flight to a peer). I think that would help with the situations where we’re in the middle of headers sync and somehow receive an INV or a compact block, and have to reason about whether the header in question should be observable. (The easiest answer, I think, is that if we’re in the middle of headers sync we shouldn’t be issuing a new getheaders in the first place!)

I guess the only downside is that I do think the per-network visibility rules introduced here may not be necessary in the future, but I think that’s okay on the whole even if it turned out to be the case, as it would still be an improvement for now. I’ll try to focus the remainder of my review on this particular approach! Thanks for the discussion.

sdaftuar commented at 3:27 pm on May 31, 2022: member

Concept ACK (and nice find!)

I think we can make this better by going back to using the BlockRequestAllowed criteria, and coupling that with global and per-peer checks to see if a block header should be observable, which would eliminate the need for per-network maps that get reset when nodes go offline.

Also, I’m not entirely sure the idea of per-network visibility into our block index makes sense, as it opens the door to fingerprinting over time or being able to test whether a node has been online for a while or not.

in src/validation.cpp:5228 in a66b8e2a0a outdated

5223@@ -5224,3 +5224,32 @@ ChainstateManager::~ChainstateManager()
5224         i.clear();
5225     }
5226 }
5227+
5228+void ChainstateManager::UpdateChainTipSet(const uint256& new_blockhash, std::set<uint256>& chain_tips) const

sdaftuar commented at 1:45 pm on June 3, 2022:

Why is this function in ChainstateManager, and why are we passing in hashes rather than const CBlockIndex* objects? Operating on existing const CBlockIndex* entries (ie to find ancestors and look at values like the height) is permitted without holding cs_main, so nothing in here strikes me as needing to be in validation.cpp.

Instead I think we could just make this part of net_processing’s PeerManagerImpl, which makes sense to me as this is logic in support of something that doesn’t need to be viewable outside of our net processing code.

Also the chain_tips set could be a set of pointers, rather than a set of uint256’s, which is slightly better.

jnewbery commented at 2:42 pm on June 3, 2022:

I think it’d be much better if net processing wasn’t holding pointers into validation’s internal data. We’re moving in a direction where the validation engine is more modularized and isolated from the rest of the application (eg see #24303), so we should be reducing the places where other components hold pointers into validation’s data.

I think it might make sense eventually for validation to prune out stale block headers from the block index that are older than a certain age. That’s not currently possible because although the ChainstateManager owns the CBlockIndex objects (through BlockManager.m_block_index), other components may be holding raw pointers to that data.

sdaftuar commented at 2:55 pm on June 3, 2022:

I don’t think this logic makes sense here – I don’t think we’ll be removing the concept of what a CBlockIndex is from net_processing, certainly not anytime soon and probably not ever, particularly with regards to the fundamental data structure of what a headers chain is, and the idea of what proof-of-work is (which is used implicitly as anti-DoS in many places, and likely more in the future).

So I’d say that operations that exist only on CBlockIndex objects don’t need to be tied to validation, as this interface is exposed to our whole codebase right now.

To your point about pruning, it doesn’t seem plausible to me to even consider pruning until we do a lot more work on what parts of the block index are exposed, and in particular, to allow net_processing to communicate to the hypothetical block index pruner which block index entries shouldn’t be pruned (because they may be needed in a headers sync, for example). That is so far away that trying to design for that now seems impractical, and in the meantime we would be introducing additional code complexity in logic (like in this PR) where we don’t take advantage of the fact that these entries persist once inserted – which in turn raises questions around whether headers sync is guaranteed to work if a header were to be pruned from the index in the middle of that process.

jnewbery commented at 3:45 pm on June 3, 2022:

I think we’re just going to fundamentally disagree on this point. An object should never hold a raw pointer to another component’s internal data. It’s unsafe and leads to all kinds of data corruption, use-after-free, etc bugs.

I understand that Bitcoin Core has historically used lots of raw pointers to memory not owned by the object holding the pointer, and that many examples still exist. I don’t think that’s a good reason to add another instance. Projects accumulate technical debt by thousands of incremental changes. We should always try to avoid adding more unsafe code, unless there’s a very very good reason to go in the other direction.

To your point on code complexity, that’s also subjective. I’ve reviewed both versions of this PR (with the logic in net_processing and validation), and this version seems no more complex to me. In fact, keeping data as private as possible within components and defining interfaces to access the minimal information seems to me to reduce complexity over the long run.

sdaftuar commented at 6:22 pm on June 3, 2022:

I think you’re right that we’re going to disagree, but for a topic like this – of where in our code base it is permitted to access CBlockIndex* objects and their member functions, and where it’s permitted to use and store these objects while not holding a lock – the project needs to make a decision about what our architecture looks like, and arrive at a common understanding (even if some might disagree on what the optimal architecture would be). So towards that end, I will try to explain further why I think it’s illogical to use general arguments around “holding raw pointers to another component’s data” as a reason to have this code live in validation specifically for the functions introduced in this PR.

These functions – UpdateChainTipSet(), IsBlockInMainOrBestChain(), and IsBlockInChainTipSet() – seem to do the following:

look up block index entries in our blockmanager
look up our current chain tip (ActiveChain.Tip())
look up m_best_header (the most-work header we know of)
invoke public functions on block index entries (namely, GetAncestor())

Other than the block index and ActiveChain.Tip(), no chainstate-specific state is accessed and (importantly!) no consensus validation of any sort is performed in these functions. These functions do not change any state inside of validation either.

So to argue that these functions do not belong in net_processing, I think you’d have to argue that some of those operations should be private to the validation code. It is difficult for me to see for which of those that should be the case, because I don’t think net_processing would work at all without being able to look at the current tip (and see how much work it has, compared to the work of other headers a peer might be announcing), or look up block index entries, or know what the most work headers chain is (really this is a value that only exists for net_processing in the first place, so that we can save time when performing headers sync on startup). I’d add that the public functions of CBlockIndex would seem to exist precisely so that they could be invoked by other modules (otherwise they would not be public).

So the only remaining point that I think you are making which we could consider is whether caching CBlockIndex pointers after lookup and using them without holding cs_main is harmful. If that were possibly the case, then our validation interface callbacks – which pass const CBlockIndex* arguments without holding cs_main – would be broken. All of the places in our code where we look up a block index entry and then release cs_main and continue to operate on it – including in code touched by this PR (see ProcessHeadersMessage) – would also be broken. Never mind the countless other places in at least net_processing where we cache pointers to block index entries, such as in CNodeState.

Coming back to the particular functions introduced here: the data structures passed around to these functions (the sets of chain tips that are being processed) are not even a validation data structure. They are created in this PR and are owned by net_processing. The semantics of what goes in them is specified by net_processing. Having checks on them happen in validation is moving net_processing logic into validation – a layer violation.

In summary, this general philosophical claim that using CBlockIndex* pointers in net_processing is dangerous strikes me as unhelpful and out of sync with how our code is designed. And it’s hard for me to see what operations these particular functions do that we don’t want net_processing to be able to do generally, while it also seems like adding a net_processing specific function to validation is leaking implementation specifics outside their natural scope.

dongcarl commented at 8:46 pm on June 9, 2022:

Although this is undoubtedly an important convo to have, I don’t think dergoegge opened this PR to spend time on finding the perfect ownership design is for this vestigial part of our code. So let’s find a good place for this PR, then we can discuss architecture in a separate issue. How about the below:

UpdateChainTipSet should be moved to PeerManagerImpl (or even to a static function to really limit its scope) since it ostensibly only operates on PeerManagerImpl::m_chain_tip_sets. It should definitely not be part of anything’s public interface in a commonly-included header file.
While I totally agree that we should minimize our use of raw pointers especially across objects with no ownership relations, I also don’t think that using a uint256 is the best in-between substitute since it introduces some indirections. In the long run I don’t think we’ll end up using const CBlockIndex * or uint256 here, so I don’t think we should spend any more brain power on this in this PR. Let’s leave it as is and open a separate discussion about how to deal with CBlockIndex ownership.

dergoegge commented at 2:00 pm on June 10, 2022:

Thanks Carl for chiming in and for helping me decide how to move forward here.

I moved all the chain tip logic back into net_processing (static functions) and switched back to using const CBlockIndex* for the sets.

jnewbery commented at 10:56 am on June 14, 2022:

Thanks @dongcarl - I agree that this PR shouldn’t be held up on this discussion, and I’m certainly not going to stand in the way if people prefer the alternative approach.

However, I do think some points have been raised here that are incorrect:

1.

Coming back to the particular functions introduced here: the data structures passed around to these functions (the sets of chain tips that are being processed) are not even a validation data structure. They are created in this PR and are owned by net_processing.

UpdateChainTipSet should be moved to PeerManagerImpl (or even to a static function to really limit its scope) since it ostensibly only operates on PeerManagerImpl::m_chain_tip_sets.

UpdateChainTipSet() and UpdateChainTipSet() act on uint256& or stl containers of uint256s. Those are completely generic types that already exist in validation’s public interface. They’re not net-processing-specific, even if the only client component that uses these methods are net_processing.

2.

Other than the block index and ActiveChain.Tip(), no chainstate-specific state is accessed and (importantly!) no consensus validation of any sort is performed in these functions.

It’s absolutely fine for an object to expose public methods that use but don’t change internal state, which is exactly why the const keyword exists for methods. We already have const methods exposed by ChainstateManager, such as UpdateUncommittedBlockStructures() and GenerateCoinbaseCommitment(), which both use internal validation functions (eg GenerateCoinbaseCommitment()) and expose a public interface to a single client component (in that case, the miner). For IsBlockInMainOrBestChain() and IsBlockInChainTipSet() we use the internal state of the block storage module and m_block_index and expose a public interface to net_processing. Those methods are completely analogous, and this is exactly how a loosely-coupled project should be structured.

Let’s leave it as is and open a separate discussion about how to deal with CBlockIndex ownership.

Good idea! Please let me know if you open an issue/PR.

While I totally agree that we should minimize our use of raw pointers especially across objects with no ownership relations, I also don’t think that using a uint256 is the best in-between substitute since it introduces some indirections.

I’m not sure I understand what you mean by indirections here. Net processing needs a handle to look up and refer to blocks within validation. We could use pointers to addresses in validation’s memory, which isn’t safe. We could use uint256s which are guaranteed to be unique per block as long as sha256 isn’t broken, but are 256 bits instead of 64 bits. We could hash the block hash down to a smaller domain if memory is a concern (which isn’t the case here).

sipa commented at 3:34 pm on June 14, 2022:

I completely agree with the aim to move away from exposing CBlockIndex’s internals to all its users. It complicates reasoning about locking, makes it harder to move to different representations (which perhaps split concerns and mutable/immutable values better), makes it hard to even consider options like not keeping all headers in memory all the time, and just generally makes code hard to reason about.

However, I don’t think the option (as I understand what you’re envisioning is) of (a) making validation responsible for managing the internals/ownership of CBlockIndex (or whatever replaces its function), and (b) having the interface to its users to identify block header be uint256 block hashes… is the only possibility.

Specifically, what I’d envision is something like:

A separate module that manages/owns the set of block headers. While validation/blockstorage is probably the only one that needs the ability to modify the set/contents of block-related data currently held in CBlockIndex, there are lots of places in the code that need information about blocks and their relation. I think it would be cleaner to separate from validation.
Other modules (including validation, and net_processing, and wallet, …) refer to block headers using some sort of black box handler type. It may be a wrapper around a pointer to an internal object (like CBlockIndex*), or something that holds an index number in a table held by this block header management module. All operations (gathering information, changing state, finding common ancestors, …) on it however require calling the block header module. This has the advantage over raw CBlockIndex* points of getting the encapsulation needed for reasons listed above. But it also has the advantage over uint256s that it’s (a) smaller, most likely (b) more efficient (doesn’t need constant lookup in a map for every operations) and (c) can support a model where not all block headers are always get in memory but while still guaranteeing that something won’t disappear while another module is referencing it.

To be clear, I’m not arguing for this specific approach, but mostly giving it as an example that there are multiple possible avenues this can go in. And from the perspective of this potential outcome, I think moving block header relation logic to validation, and using uint256s as a way to reference them, is a step away from the goal rather than towards it. An approach where modules do keep working with CBlockIndex*es now, and if and when this refactor happens, those pointers are replaced with handle types and indirections are replaced with block header module function calls, would be much less of a detour.

jnewbery commented at 3:51 pm on June 14, 2022:

Thanks @sipa. Interesting. It’s not immediately obvious to me how the relationship between validation and a block header module would work (would validation be a client of the block header module with special write access?), but we can leave that discussion for another issue.

in src/net_processing.cpp:664 in a66b8e2a0a outdated

659+     * Map of seen chain tip sets keyed by unique network ID.
660+     *
661+     * The size of each chain tip set is limited to the number of chain tips in
662+     * our global block index.
663+     */
664+    std::map<uint64_t, std::set<uint256>> m_chain_tip_sets GUARDED_BY(cs_main);

sdaftuar commented at 1:48 pm on June 3, 2022:

It may be better to not guard this by cs_main, to avoid unnecessary lock contention.

dergoegge commented at 2:01 pm on June 10, 2022:

cs_main is already held in all the relevant locations so I thought I might as well use cs_main here instead of introducing another mutex (not even sure if that would be needed).

in src/validation.cpp:5228 in f8f0dd4afa outdated

5224@@ -5225,6 +5225,33 @@ ChainstateManager::~ChainstateManager()
5225     }
5226 }
5227 
5228+bool ChainstateManager::IsBlockInMainOrBestChain(const uint256& blockhash) const

sdaftuar commented at 1:55 pm on June 3, 2022:

Similar to my other comment, I think this function makes more sense in net_processing.cpp.

in src/net_processing.cpp:2248 in f8f0dd4afa outdated

2254-        // - Once a headers message is received that is valid and does connect,
2255-        //   nUnconnectingHeaders gets reset back to 0.
2256-        if (!m_chainman.m_blockman.LookupBlockIndex(headers[0].hashPrevBlock) && nCount < MAX_BLOCKS_TO_ANNOUNCE) {
2257-            nodestate->nUnconnectingHeaders++;
2258+        if (!LookupBlockIndexForPeer(pfrom, headers.front().hashPrevBlock)) {
2259             m_connman.PushMessage(&pfrom, msgMaker.Make(NetMsgType::GETHEADERS, m_chainman.ActiveChain().GetLocator(m_chainman.m_best_header), uint256()));

sdaftuar commented at 2:02 pm on June 3, 2022:

Since this getheaders message uses m_best_header for the locator, it’s possible that before we get the response, that m_best_header might change to be a different chain tip (if we learn of a more work headers chain from another peer in between).

In that situation, we might get a response from our peer that would appear to be trying to observe a header that is not permitted, and thus trigger another getheaders message, which would be unfortunate.

I think to avoid this, we need to explicitly add the chain tip we’re building a locator from to the chain tip set for a peer network whenever we send out a getheaders. (This would cover the cases where we send a getheaders based on our active chain tip as well, in the event that we reorged away from it.)

dergoegge commented at 2:01 pm on June 10, 2022:

Good point, added a call to UpdateChainTipSet whenever we sent out a getheaders. It’s probably a rather rare edge case but still worth handling imo.

in src/net_processing.cpp:2570 in f8f0dd4afa outdated

2253-        //   don't connect before giving DoS points
2254-        // - Once a headers message is received that is valid and does connect,
2255-        //   nUnconnectingHeaders gets reset back to 0.
2256-        if (!m_chainman.m_blockman.LookupBlockIndex(headers[0].hashPrevBlock) && nCount < MAX_BLOCKS_TO_ANNOUNCE) {
2257-            nodestate->nUnconnectingHeaders++;
2258+        if (!LookupBlockIndexForPeer(pfrom, headers.front().hashPrevBlock)) {

sdaftuar commented at 2:10 pm on June 3, 2022:

Gating the logic below only on whether the header lookup succeeds means, in particular, that if a peer sends us a full headers message that doesn’t connect to our block index, then we’ll issue a getheaders message in response to try again. While we also will assign DoS points, this is a behavior change that I’m not sure is necessary? In the event a peer is giving us full headers that don’t connect, they are likely broken, and going into an infinite loop of 2000 headers being dumped at us would be unfortunate (ie if it was with a peer that had the noban permission set).

I think it would be better to only request more headers if it looks like a BIP 130 unconnecting headers situation, and otherwise just assign DoS points.

dergoegge commented at 2:01 pm on June 10, 2022:

Good catch, i did not intent to introduce a behavior change here. Fixed!

DrahtBot added the label Needs rebase on Jun 8, 2022

dergoegge force-pushed on Jun 10, 2022

dergoegge commented at 2:05 pm on June 10, 2022: member

@sdaftuar Thanks for the thorough review and for challenging the approach! I rebased and addressed all your comments.

in src/net_processing.cpp:2733 in 4696d9f202 outdated

2729+        // not previously received on this peer's network and so we invoke
2730+        // UpdateChainTipSet to ensure that the header is observable by any
2731+        // peer on the same network. (Ancestors of existings tips are ignored
2732+        // by UpdateChainTipSet)
2733+        const CBlockIndex* block_index{nullptr};
2734+        WITH_LOCK(cs_main, block_index = m_chainman.m_blockman.LookupBlockIndex(block->GetHash()));

maflcko commented at 2:15 pm on June 10, 2022:

nit: Funny that this compiles (Edit, probably fixed by this pull request: https://github.com/bitcoin/bitcoin/pull/24931/files#r887670454)

If you decide to keep cs_main, would be good to remove this double lock:

0        block_index = m_chainman.m_blockman.LookupBlockIndex(block->GetHash());

If you decide to switch to a new mutex for GetChainTipSetForPeerNetwork, you can keep it.

dergoegge commented at 2:34 pm on June 10, 2022:

Good catch, very interesting that this compiles without a warning.

maflcko commented at 2:59 pm on June 10, 2022:

Ok, checked with #24931 merged:

0net_processing.cpp:2733:10: warning: cannot call function 'MaybeCheckNotHeld' while mutex 'cs_main' is held [-Wthread-safety-analysis]
1         WITH_LOCK(cs_main, block_index = m_chainman.m_blockman.LookupBlockIndex(block->GetHash()));
2         ^
3./sync.h:305:30: note: expanded from macro 'WITH_LOCK'
4#define WITH_LOCK(cs, code) (MaybeCheckNotHeld(cs), [&]() -> decltype(auto) { LOCK(cs); code; }())
5                             ^

dergoegge force-pushed on Jun 10, 2022

DrahtBot removed the label Needs rebase on Jun 10, 2022

in src/net_processing.cpp:2391 in 7f992f50af outdated

2317@@ -2267,6 +2318,8 @@ void PeerManagerImpl::ProcessHeadersMessage(CNode& pfrom, const Peer& peer,
2318 
2319         assert(pindexLast);
2320         UpdateBlockAvailability(pfrom.GetId(), pindexLast->GetBlockHash());
2321+        UpdateChainTipSet(GetChainTipSetForPeerNetwork(pfrom), pindexLast);

sdaftuar commented at 6:54 pm on June 13, 2022:

I noticed that we only add to the chain tip set if headers validation for the entire group of headers succeeds. This is an edge-case scenario, but if an (eg) inbound peer relayed us a set of headers that ended in a header which was “cached invalid”, then we wouldn’t ban the peer, and we also wouldn’t update the chain tip set for the headers that led up to the invalid one.

I think this could manifest itself if, say, there were a fork and some inbound peer was relaying a competing chain to us, but occasionally relaying valid blocks that fork away from the invalid ones too. Perhaps we can check to see if pindexLast is set (after invoking ProcessNewBlockHeaders) and if it is, then we update the chain tip set before returning?

dergoegge commented at 9:23 am on June 21, 2022:

Good catch, I did not think about this at all.

Added a commit to ensure that we always continue processing if pindexLast is not nullptr after m_chainman.ProcessNewBlockHeaders.

in src/net_processing.cpp:983 in be75d3d3d8 outdated

978+    // new fork to the tip set.
979+    chain_tips.insert(potential_new_tip);
980+}
981+
982+/** Check if a block is on the main chain or best (more work than the active tip) chain. */
983+static bool IsBlockInMainOrBestChain(const CBlockIndex& index, const CChain& active_chain,

jnewbery commented at 11:41 am on June 14, 2022:

Should this require cs_main? The only place it’s called has cs_main locked, and the chain could change while this function is being called if it’s not held.

jnewbery commented at 11:45 am on June 14, 2022:

I really don’t like the idea of passing around references to CChain objects in net_processing. Yes, I know they’re already exposed by m_chainman.ActiveChain(), but this is the first place that CChain is explicitly used as a type in a function signature in net_processing. That seems to be moving us in the wrong direction.

I’m also not sure what the benefit of making these functions static is. PeerManagerImpl is inside the unnamed namespace and so only has internal linkage. Making it static doesn’t limit its scope any more and only means you need to pass additional parameters in.

dergoegge commented at 9:20 am on June 21, 2022:

Moved the functions to PeerManagerImpl.

dergoegge commented at 9:20 am on June 21, 2022:

Yes it should, fixed!

in src/net_processing.cpp:2300 in be75d3d3d8 outdated

2308-        if (!m_chainman.m_blockman.LookupBlockIndex(headers[0].hashPrevBlock) && nCount < MAX_BLOCKS_TO_ANNOUNCE) {
2309-            nodestate->nUnconnectingHeaders++;
2310-            m_connman.PushMessage(&pfrom, msgMaker.Make(NetMsgType::GETHEADERS, m_chainman.ActiveChain().GetLocator(m_chainman.m_best_header), uint256()));
2311-            LogPrint(BCLog::NET, "received header %s: missing prev block %s, sending getheaders (%d) to end (peer=%d, nUnconnectingHeaders=%d)\n",
2312+        if (!LookupBlockIndexForPeer(pfrom, headers.front().hashPrevBlock)) {
2313+            LogPrint(BCLog::NET, "received header %s: %s: %s, sending getheaders (%d) to end (peer=%d, nUnconnectingHeaders=%d)\n",

jnewbery commented at 12:15 pm on June 14, 2022:

This log is wrong now that you’ve put the PushMessage() call inside the if (nCount < MAX_BLOCKS_TO_ANNOUNCE) block below (we’ll log that we’re sending getheaders even if we don’t send getheaders).

dergoegge commented at 9:20 am on June 21, 2022:

Changed the log message. It no longer prints “sending getheaders” but additionally prints nCount which indicates whether or not a getheaders was sent.

in src/net_processing.cpp:3712 in be75d3d3d8 outdated

3709+        if (!LookupBlockIndexForPeer(pfrom, cmpctblock.header.hashPrevBlock)) {
3710             // Doesn't connect (or is genesis), instead of DoSing in AcceptBlockHeader, request deeper headers
3711-            if (!m_chainman.ActiveChainstate().IsInitialBlockDownload())
3712+            if (!m_chainman.ActiveChainstate().IsInitialBlockDownload()) {
3713                 m_connman.PushMessage(&pfrom, msgMaker.Make(NetMsgType::GETHEADERS, m_chainman.ActiveChain().GetLocator(m_chainman.m_best_header), uint256()));
3714+                UpdateChainTipSet(GetChainTipSetForPeerNetwork(pfrom), m_chainman.m_best_header);

jnewbery commented at 12:20 pm on June 14, 2022:

Are we calling UpdateChainTipSet() any time a getheaders message is sent? If so, what do you think about factoring that out into a function, so that if more PushMessage(getheaders) are added, people don’t forget to add UpdateChainTipSet()?

sdaftuar commented at 1:18 pm on June 14, 2022:

I also think this is a good idea, but I wanted to mention that if you don’t feel like doing it here, I’d be happy to include this in a followup PR (I’m working on a branch that would improve our logic for sending getheaders messages in the first place, so that we wouldn’t have more than one in-flight to a peer at a time, and as part of that I’m consolidating the calls to PushMessage(getheaders) into a single function already).

dergoegge commented at 9:21 am on June 21, 2022:

Seemed easy enough, so i included it here.

in src/net_processing.cpp:3713 in f3ed724ac1 outdated

3215@@ -3222,8 +3216,8 @@ void PeerManagerImpl::ProcessMessage(CNode& pfrom, const std::string& msg_type,
3216             }
3217 
3218             if (inv.IsMsgBlk()) {
3219-                const bool fAlreadyHave = AlreadyHaveBlock(inv.hash);
3220-                LogPrint(BCLog::NET, "got inv: %s  %s peer=%d\n", inv.ToString(), fAlreadyHave ? "have" : "new", pfrom.GetId());
3221+                const bool fAlreadyHave{LookupBlockIndexForPeer(pfrom, inv.hash) != nullptr};
3222+                LogPrint(BCLog::NET, "got inv: %s  %s peer=%d\n", inv.ToString(), fAlreadyHave ? "have" : "maybe new", pfrom.GetId());
3223 
3224                 UpdateBlockAvailability(pfrom.GetId(), inv.hash);

sdaftuar commented at 2:52 pm on June 14, 2022:

This call to UpdateBlockAvailability means that if we get an INV for a block that should not be observable, then we’ll still set pindexBestKnownBlock for the peer to be the block index entry that we’re hiding.

I have been trying to figure out whether there is some way this can be leaked (because if so, that would be a fingerprint vector) but I have not yet come up with one that doesn’t require extensive proof-of-work. However I wanted to mention this in case someone else comes up with a way that we might leak the value.

naumenkogs commented at 9:49 am on August 30, 2022:

Good point! I haven’t found a way to actually exploit it either.

jamesob commented at 1:58 pm on June 15, 2022: member

Benchmarked this at the request of @sdaftuar; looks like there’s no degradation in performance.

ibd local range 667200 697200

commands index

bench name	command
ibd.local.range.667200.697200	`bitcoind -dbcache=10000 -debug=coindb -debug=bench -listen=0 -connect=0 -addnode=127.0.0.1:8888 -prune=9999999 -printtoconsole=0 -assumevalid=000000000000000000176c192f42ad13ab159fdb20198b87e7ba3c001e47b876`

#24571 vs. $mergebase (absolute)

bench name	x	#24571	$mergebase
ibd.local.range.667200.697200.total_secs	2	4929.4609 (± 1.3847)	4953.3385 (± 5.9733)
ibd.local.range.667200.697200.peak_rss_KiB	2	4808580.0000 (± 780.0000)	4809936.0000 (± 728.0000)
ibd.local.range.667200.697200.cpu_kernel_secs	2	234.0400 (± 0.3800)	235.5450 (± 0.2550)
ibd.local.range.667200.697200.cpu_user_secs	2	28716.0650 (± 1.6650)	28770.5150 (± 0.4050)

#24571 vs. $mergebase (relative)

bench name	x	#24571	$mergebase
ibd.local.range.667200.697200.total_secs	2	1	1.005
ibd.local.range.667200.697200.peak_rss_KiB	2	1	1.000
ibd.local.range.667200.697200.cpu_kernel_secs	2	1	1.006
ibd.local.range.667200.697200.cpu_user_secs	2	1	1.002

dergoegge force-pushed on Jun 21, 2022

dergoegge commented at 9:27 am on June 21, 2022: member

Thanks @jnewbery and @sdaftuar for the reviews and thanks @jamesob for the benchmark!

0xB10C commented at 12:31 pm on June 23, 2022: contributor

I’ve manually tested that a localhost, testnet node on 772ee2ccbfe4c57b12d4bdd1187e295a64ee2b95 sends a getheaders for all fake headers building on top of a testnet stale block sent to it via a headers message. Running the same manual test on master reveals which stale blocks the node knows.

I’ve not tested this for compact blocks (covered in functional tests) or block invs (not covered). Also not tested the PeerManagerImpl::m_chain_tips_sets per network logic (not covered).

DrahtBot added the label Needs rebase on Jun 27, 2022

dergoegge force-pushed on Jun 27, 2022

DrahtBot removed the label Needs rebase on Jun 27, 2022

in src/net_processing.cpp:2271 in fbf4497310 outdated

2267+            // The received headers chain could contain valid headers followed
2268+            // by invalid headers. We still want to continue processing the
2269+            // valid headers. `pindexLast` will point to block index of the
2270+            // last valid header.
2271+            if (!pindexLast) {
2272+                return;

sdaftuar commented at 9:14 pm on June 28, 2022:

Hmm, so this is a bigger change than I was trying to suggest; I only meant that we should update the chain tip set if pindexLast is non null, even if an invalid block is encountered. I’m not sure we should invoke all the logic that follows, because some of the processing after this includes sending new p2p messages (such as getheaders if the message was full), which wouldn’t make sense if we encountered an invalid header, even if we weren’t going to disconnect the peer.

I think it’s possible that some of the logic that follows could be invoked for the valid headers if we knew that we we weren’t about to disconnect this peer, but I feel like such a significant behavior change is out of scope for this PR.

My suggestion would be to drop this commit and instead just invoke UpdateChainTipSet for pindexLast (if non-null) before returning, after calling MaybePunishNodeForBlock?

dergoegge commented at 8:17 am on June 29, 2022:

You are right, i went a bit overboard with this. Changed it to just update the chain tip set like you suggested.

dergoegge force-pushed on Jun 29, 2022

DrahtBot added the label Needs rebase on Jul 4, 2022

dergoegge force-pushed on Jul 5, 2022

DrahtBot removed the label Needs rebase on Jul 5, 2022

DrahtBot added the label Needs rebase on Jul 19, 2022

0xB10C commented at 8:42 am on July 19, 2022: contributor

Also not tested the PeerManagerImpl::m_chain_tips_sets per network logic (not covered).

IIRC @dergoegge you’ve mentioned offline that we should be possible to test this (I don’t recall exactly how though). Do you think it’s worth adding a unit or functional test for this?

dergoegge force-pushed on Jul 19, 2022

dergoegge commented at 3:07 pm on July 19, 2022: member

Rebased. @0xB10C Thanks for the reminder! Yes, we could (and probably should) test that a header received on one network is only visible (via the fingerprinting technique) on that same network but not on others. We can simulate Tor connections in the functional tests using the -bind=...=onion option (the p2p_getaddr_caching.py test also does this). I actually think that the functional test I have here currently could be improved to better cover the use of m_chain_tips_sets in general (gonna push an update for that this week).

DrahtBot removed the label Needs rebase on Jul 19, 2022

dergoegge force-pushed on Jul 27, 2022

dergoegge commented at 4:31 pm on July 27, 2022: member

Updated the tests. They now better cover the use of the chain tip sets and simulate an onion connection using -bind=...=onion.

in src/net.h:1145 in a424f96cd9 outdated

1140@@ -1141,6 +1141,12 @@ class CConnman
1141     friend struct ConnmanTestMsg;
1142 };
1143 
1144+/**
1145+ * Get a unique identifier for a peer based on the connections network and the

naumenkogs commented at 6:25 am on August 30, 2022:

This might be confusing for the reader:

function name refers to Network ID
the comment refers to network + socket address

naumenkogs commented at 10:06 am on August 30, 2022:

I guess the issue is Network's ID vs peer's ID, or even the meaning of the word network (Internet is the network too…) (not 100% sure this is worth addressing)

dergoegge commented at 5:08 pm on September 9, 2022:

I have changed the comment up a bit, let me know if you think that was an improvement

in src/net.cpp:271 in a424f96cd9 outdated

265@@ -266,6 +266,18 @@ std::optional<CService> GetLocalAddrForPeer(CNode& node)
266     return std::nullopt;
267 }
268 
269+uint64_t GetUniqueNetworkID(const CConnman& connman, const CNode& node)
270+{
271+    auto local_socket_bytes = node.addrBind.GetAddrBytes();

naumenkogs commented at 6:33 am on August 30, 2022:

This could be const? (yeah i know i wrote this code :P)

dergoegge commented at 4:53 pm on September 9, 2022:

Done

in src/net_processing.cpp:1127 in 4af7fc8be1 outdated

1015@@ -996,6 +1016,33 @@ static bool CanServeWitnesses(const Peer& peer)
1016     return peer.m_their_services & NODE_WITNESS;
1017 }
1018 
1019+void PeerManagerImpl::UpdateChainTipSet(std::set<const CBlockIndex*>& chain_tips, const CBlockIndex* potential_new_tip)
1020+{
1021+    AssertLockHeld(cs_main);
1022+    if (!potential_new_tip) return;

naumenkogs commented at 6:38 am on August 30, 2022:

What’s the reason of allowing a nullptr here, but then terminating if it’s passed? Perhaps this check is better before the call, and here the parameter should be non-nullptr?

dergoegge commented at 5:36 pm on September 14, 2022:

Changed this to take a reference to CBlockIndex, since most of the call sites where guaranteed to pass a non-null index anyway.

in src/net_processing.cpp:1124 in 4af7fc8be1 outdated

1015@@ -996,6 +1016,33 @@ static bool CanServeWitnesses(const Peer& peer)
1016     return peer.m_their_services & NODE_WITNESS;
1017 }
1018 
1019+void PeerManagerImpl::UpdateChainTipSet(std::set<const CBlockIndex*>& chain_tips, const CBlockIndex* potential_new_tip)

naumenkogs commented at 6:43 am on August 30, 2022:

Why don’t pass the peer instead of chain_tips here, and call GetChainTipSetForPeerNetwork internally?

dergoegge commented at 5:36 pm on September 14, 2022:

Done!

in src/net_processing.cpp:724 in 509b073c46 outdated

720@@ -721,6 +721,26 @@ class PeerManagerImpl final : public PeerManager
721     void UpdateChainTipSet(std::set<const CBlockIndex*>& chain_tips, const CBlockIndex* potential_new_tip)
722         EXCLUSIVE_LOCKS_REQUIRED(cs_main);
723 
724+    /** Check if a block is on the main chain or best (more work than the active tip) chain. */

naumenkogs commented at 6:49 am on August 30, 2022:

unless best is some terminology, better should be more accurate?

naumenkogs commented at 6:50 am on August 30, 2022:

terminology i guess (m_best_header), feel free to ignore and close this comment

dergoegge commented at 4:53 pm on September 9, 2022:

#25717 actually introduced IsAncestorOfBestHeaderOrTip which essentially does the same thing that IsBlockInMainOrBestChain is doing, so i have dropped IsBlockInMainOrBestChain entirely.

in src/net_processing.cpp:1151 in 509b073c46 outdated

1070+    if (m_chainman.ActiveChain().Contains(&index)) return true;
1071+    if (m_chainman.m_best_header && m_chainman.m_best_header->GetAncestor(index.nHeight) == &index) return true;
1072+    return false;
1073+}
1074+
1075+bool PeerManagerImpl::IsBlockInChainTipSet(const CBlockIndex& index, const std::set<const CBlockIndex*>& chain_tips)

naumenkogs commented at 6:53 am on August 30, 2022:

Similarly to the previous commit, why don’t obtain GetChainTipSetForPeerNetwork internally?

in src/net_processing.cpp:2937 in 509b073c46 outdated

2603+            return;
2604         }
2605-        return;
2606     }
2607 
2608     // At this point, the headers connect to something in our block index.

naumenkogs commented at 7:13 am on August 30, 2022:

This comment is kinda outdated now (should add for a given peer/network or something?)?

dergoegge commented at 5:36 pm on September 14, 2022:

I think this comment is still accurate, at that point the headers will never not connect to something in the index.

in src/net_processing.cpp:3859 in a89e7e690f outdated

3855@@ -3856,7 +3856,7 @@ void PeerManagerImpl::ProcessMessage(CNode& pfrom, const std::string& msg_type,
3856         {
3857         LOCK(cs_main);
3858 
3859-        if (!m_chainman.m_blockman.LookupBlockIndex(cmpctblock.header.hashPrevBlock)) {
3860+        if (!LookupBlockIndexForPeer(pfrom, cmpctblock.header.hashPrevBlock)) {

naumenkogs commented at 9:36 am on August 30, 2022:

I might be not getting something… In previous commit, we MaybeSendGetHeaders if this condition is true. Here, we MaybeSendGetHeaders if the same condition is false. Seems strange, but maybe I’m missing the reasoning.

naumenkogs commented at 9:41 am on August 30, 2022:

Ah, I guess this branch assumes that this new header extends our best chain, and would try to build on it.

So the probe would be to send two different compact blocks: one on top of the stale block, another one not on top of it. After this change, it would make a difference only if the stale block is registered by the peer’s network id. Alright, makes sense. Feel free to resolve.

naumenkogs commented at 9:41 am on August 30, 2022:

Could benefit from a comment :)

dergoegge commented at 7:15 pm on September 12, 2022:

Added some comments here

in src/net_processing.cpp:3710 in 88615157f8 outdated

3358@@ -3365,8 +3359,8 @@ void PeerManagerImpl::ProcessMessage(CNode& pfrom, const std::string& msg_type,
3359             }
3360 
3361             if (inv.IsMsgBlk()) {
3362-                const bool fAlreadyHave = AlreadyHaveBlock(inv.hash);
3363-                LogPrint(BCLog::NET, "got inv: %s  %s peer=%d\n", inv.ToString(), fAlreadyHave ? "have" : "new", pfrom.GetId());
3364+                const bool fAlreadyHave{LookupBlockIndexForPeer(pfrom, inv.hash) != nullptr};

naumenkogs commented at 10:26 am on August 30, 2022:

Variable name should be updated to already_heard_from_peer or something

dergoegge commented at 4:57 pm on September 9, 2022:

You’re right, it’s not the greatest name but i think renaming it is out-of-scope here.

DrahtBot added the label Needs rebase on Aug 30, 2022

[net] Create GetUniqueNetworkID helper

This commit extracts the logic from the addr cache id generation into
its own function (GetUniqueNetworkID).

fbf1921300

dergoegge force-pushed on Sep 12, 2022

dergoegge commented at 7:21 pm on September 12, 2022: member

Rebased. @naumenkogs Thanks for the review! I have addressed some of your comments and will get to the others as soon as I can.

DrahtBot removed the label Needs rebase on Sep 12, 2022

[net processing] Keep track of seen chain tips per network

This commit introduces the `PeerManagerImpl::m_chain_tip_sets` map which
keeps track of seen chain tips per network. No DoS risk is introduced by
keeping these sets of chain tips since each set in the map will have at
most as many entries as there are chain tips in our global index.

86f19bdab7

[net processing] Prevent fingerprinting when receiving headers

The block index might contain stale blocks that are not part of the main
chain. If a malicious peer is able to probe a node's block index for
certain stale blocks then it can use this information to fingerprint the
node.

Before this commit such an attack is possible due to the logic when
receiving headers. A malicous peer can probe for a stale block in a
node's index by sending a headers message in which the first header
extends a stale chain and has invalid proof-of-work. Nodes behave
differently when receiving such a headers message, depending on whether
or not the stale block exists in its block index. If it exists, the node
will disconnect as the proof-of-work on the new header was invalid. If
it does not exist, the node will send a getheaders message in an attempt
to connect the two chains.

In this commit we prevent this attack by making use of the
`PeerManagerImpl::m_chain_tip_sets` map which we introduced in the
previous commit. We only try to accept new headers if they connect to
anything in our global index and they connect to our active chain or to
a chain that was previously sent to us by a peer on the same network. We
send a getheaders message should these conditions not be met.

b10569e8bb

[net processing] Prevent fingerprinting when receiving compact blocks

This commit prevents the same attack that was described in the previous
commit when receiving compact blocks.

06cc8ae077

[net processing] Prevent fingerprinting when receiving block invs 56ee277470

[test] Extend header fingerprinting tests d700342327

dergoegge force-pushed on Sep 14, 2022

in src/net.h:1171 in fbf1921300 outdated

1166@@ -1167,6 +1167,13 @@ class CConnman
1167     friend struct ConnmanTestMsg;
1168 };
1169 
1170+/**
1171+ * Get a unique identifier for the network (e.g. Tor, IPv4/6) of a connection.

naumenkogs commented at 10:47 am on September 16, 2022:

nit: From my reading of the first line of the comment, it looks like this is just a network type identifier (without the local socket). e.g., tor=0, ipv4=1, etc.

DrahtBot commented at 4:29 pm on December 8, 2022: contributor

🐙 This pull request conflicts with the target branch and needs rebase.

DrahtBot added the label Needs rebase on Dec 8, 2022

in src/net_processing.cpp:793 in 86f19bdab7 outdated

788+    /**
789+     * Update the chain tip set for a given peer and new block hash by
790+     * replacing an existing tip if it is an ancestor of the new block or by
791+     * adding a new tip if none of the existing chain tips were replaced.
792+     */
793+    void UpdateChainTipSet(const CNode& node, const CBlockIndex& potential_new_tip)

stratospher commented at 7:21 pm on March 3, 2023:

did i understand this correctly? - UpdateChainTipSet updates chain tip set based on: 1. the whole block/valid headers we received and were told of from other nodes on the same network 2. when we switch networks, chain tip set initially would only contain tip of active chain. 3. inbound and outbound connections on same network would have different chain tip set. is this intentional?

if we mine a block and tell other nodes on the network about it, should we update the chain tip set? (i haven’t checked if UpdateChainTipSet is called in all the locations.) are there other situations in which a block isn’t inserted into the chain tip set?

in src/net_processing.cpp:2056 in b10569e8bb outdated

2051+    AssertLockHeld(cs_main);
2052+
2053+    const CBlockIndex* index{m_chainman.m_blockman.LookupBlockIndex(hash)};
2054+    if (!index) return nullptr;
2055+
2056+    if (IsAncestorOfBestHeaderOrTip(index) ||

stratospher commented at 7:25 pm on March 3, 2023:

IsAncestorOfBestHeaderOrTip would already by contained in IsBlockInChainTipSet right? do we include IsAncestorOfBestHeaderOrTip check for performance gain instead of iterating over all the chain tips? or is there some other reason?

stratospher commented at 7:35 pm on March 3, 2023: contributor

Concept ACK. I’ve tested this out on a couple of reorg scenarios in regtest. Liked how test_header_leak_via_headers() checks whether stale header leaks should happen or not.

Summarising this PR in case it’s useful to other reviewers - if a node on 2 different networks have same stale block indices, it’s highly possible that it’s the same node (fingerprinting). behaviour on master :

node1 sends sequence of headers built on some stale block to node2.
if node2 knows stale block, node2 won’t send GETHEADERS
if node2 doesn’t know stale block, node2 sends GETHEADERS (of it’s active chain tip)
this behaviour difference can be used to identify the stale blocks a node has.

behaviour on PR:

node1 sends sequence of headers built on some stale block to node2.
if node2 knows stale block, node2 sends GETHEADERS (of it’s active chain tip)
if node2 doesn’t know stale block, node2 sends GETHEADERS (of it’s active chain tip)
presence/absence of stale blocks can’t be identified now.

dergoegge commented at 3:06 pm on March 19, 2023: member

@stratospher Thank you for having a look at this! There is an alternative approach to the one in this PR that I have been meaning to implement and I forgot to put this PR in draft mode.

We made some changes to the headers sync logic (https://github.com/bitcoin/bitcoin/pull/25454), which causes us to now only have one getheaders request in flight for the same peer. If we start storing the locator that is sent in the getheaders message and only allow headers in response that connect to the locator, then the issue described in this PR is avoided as well. There are some edge cases around headers announcements which are received without sending a getheaders prior but otherwise this approach would be much less complex.

dergoegge marked this as a draft on Mar 19, 2023

dergoegge commented at 3:21 pm on May 30, 2023: member

Closing for now, can be marked up for grabs.

Someone should look into implementing the alternative approach I have described here: #24571 (comment).

dergoegge closed this on May 30, 2023

fanquake added the label Up for grabs on May 30, 2023

bitcoin locked this on May 29, 2024

p2p: Prevent block index fingerprinting by sending additional getheaders messages #24571

Reviews

Conflicts

commands index

#24571 vs. $mergebase (absolute)

#24571 vs. $mergebase (relative)