Prioritize processing of peers based on their CPU usage

vasild commented at 12:56 pm on October 4, 2024: contributor

Please describe the feature you’d like to see added.

Currently, we process messages to/from all peers in a loop where every peer is processed once (has the same weight). The list of peers is shuffled before every loop.

Considering a scenario where we spend considerably more CPU time for some peers compared to others, does it make sense to de-prioritize CPU-hungry peers? This might be happening deliberately (a CPU DoS attack) or not.

For example: suppose we spent 5 CPU seconds processing each of the peers Bad and Demanding, and 1 CPU second processing each of the peers Normal and Light. Then on the next loop we can process just Normal and Light, so that they now account for 2 CPU seconds each, and skip Bad and Demanding. Do a few loops like this until everybody is at around 5 CPU seconds, and then process all of them again.
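A minimal sketch of that catch-up loop, to make the example concrete. The function name `peers_to_process` and the assumption that every turn costs exactly 1 CPU second are illustrative only; this is not how Bitcoin Core schedules peers today.

```python
def peers_to_process(cpu_used: dict[str, float]) -> list[str]:
    """Return the peers that get a turn in the next loop.

    Peers that have used less CPU than the heaviest peer are processed;
    once everybody has caught up, all peers are processed again.
    """
    if not cpu_used:
        return []
    heaviest = max(cpu_used.values())
    behind = [peer for peer, used in cpu_used.items() if used < heaviest]
    return behind if behind else list(cpu_used)


# CPU seconds spent so far, taken from the example above.
cpu_used = {"Bad": 5.0, "Demanding": 5.0, "Normal": 1.0, "Light": 1.0}

loops = 0
while peers_to_process(cpu_used) != list(cpu_used):
    for peer in peers_to_process(cpu_used):
        # Assumption: each turn costs the peer exactly 1 CPU second.
        cpu_used[peer] += 1.0
    loops += 1
```

Under these assumptions, Normal and Light are processed alone until they reach the 5 CPU seconds of the other two, after which all four peers are eligible again. A real implementation would need "around 5" to be a tolerance rather than an exact comparison.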

I am not sure how much sense this makes, but at least it seems worthy of brainstorming.

#30572 aims to address a problem of peers sending a lot of costly-to-validate-but-invalid transactions in an attempt to CPU DoS a node. To me it seems that whether those transactions are requested by the victim or are sent unsolicited is secondary. More importantly, willingly or not, some peers are eating the CPU and some not, so this is a broader issue.

Describe the solution you’d like

Aim to spend approximately the same amount of CPU time on every peer, or at least stay within some reasonable margin: e.g. if the difference between the lightest and the heaviest peer is more than 10x, trigger some protective mechanism.
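The margin check could look roughly like the sketch below. The name `exceeds_margin` and the `floor` parameter (which prevents a completely idle peer from making the ratio infinite) are assumptions added for illustration; what the "protective mechanism" does is left open, as in the proposal.

```python
def exceeds_margin(cpu_used: dict[str, float],
                   max_ratio: float = 10.0,
                   floor: float = 0.1) -> bool:
    """Return True if the heaviest peer used more than max_ratio times
    the CPU of the lightest peer.

    floor is a hypothetical lower bound (in CPU seconds) applied to the
    lightest peer so that a peer with zero usage does not always trigger.
    """
    if len(cpu_used) < 2:
        return False
    lightest = max(min(cpu_used.values()), floor)
    heaviest = max(cpu_used.values())
    return heaviest / lightest > max_ratio
```

With the default 10x margin, a peer at 11 CPU seconds next to a peer at 1 CPU second would trigger the check, while 10 versus 1 would not.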

Describe any alternatives you’ve considered

Drop unsolicited transactions. IMO that would not really resolve the DoS.

Please leave any additional context

I got this idea while reading https://delvingbitcoin.org/t/cpu-usage-of-peers/196/2

vasild added the label Feature on Oct 4, 2024

0xB10C commented at 9:43 pm on October 4, 2024: contributor

Interesting idea - I remember that we briefly chatted about this a while ago.

How does this behave during e.g. IBD when a “good” peer sends us a bunch of blocks and we spent a lot of CPU time validating them? Does something like this need to be excluded? I guess it’s similar for peers that constantly send us transactions we don’t know about yet (maybe because they are just better connected or they are broadcasting them) - not sure if we want to process their messages slower.

vasild commented at 5:17 am on October 5, 2024: contributor

Good questions that I do not have an answer to.

I guess this should start with planting some metrics - CPU time used by each peer in the last 1, 5 and 15 minutes and assessing what a “usual” situation looks like.
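As a rough illustration of such metrics, the sketch below keeps timestamped CPU samples per peer and sums them over 1, 5 and 15 minute windows. The class name `PeerCpuStats` and its methods are hypothetical; timestamps are passed in explicitly, and how the CPU time of a single message is measured is out of scope here.

```python
from collections import deque

WINDOWS = (60, 300, 900)  # 1, 5 and 15 minutes, in seconds


class PeerCpuStats:
    """Per-peer CPU accounting over sliding time windows (illustrative)."""

    def __init__(self) -> None:
        self._samples: deque[tuple[float, float]] = deque()  # (timestamp, cpu_seconds)

    def record(self, now: float, cpu_seconds: float) -> None:
        """Store one sample and drop samples older than the largest window."""
        self._samples.append((now, cpu_seconds))
        cutoff = now - max(WINDOWS)
        while self._samples and self._samples[0][0] <= cutoff:
            self._samples.popleft()

    def usage(self, now: float) -> dict[int, float]:
        """Return the CPU seconds used within each window ending at now."""
        return {
            window: sum(cpu for ts, cpu in self._samples if ts > now - window)
            for window in WINDOWS
        }
```

Collecting these numbers for a while on a normally running node would give a baseline for what a "usual" peer looks like before any policy is built on top.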

Specific to CPU DoS via invalid transactions: for peers that send us such transactions, maybe multiply the CPU time by a factor, e.g. 10x or 100x. Or the other way around - nullify CPU time spent on definitely-good-and-positive things like new blocks or new transactions that we accepted.
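Both adjustments can be expressed as a weight applied to the measured CPU time. The sketch below combines them in one function purely for illustration, although they are offered above as alternatives; the `Outcome` categories and the `INVALID_PENALTY` value are assumptions.

```python
from enum import Enum


class Outcome(Enum):
    ACCEPTED = "accepted"  # new block or transaction that we accepted
    INVALID = "invalid"    # transaction that failed validation
    OTHER = "other"        # anything else


INVALID_PENALTY = 10.0  # hypothetical factor; 10x or 100x were suggested


def charged_cpu(cpu_seconds: float, outcome: Outcome) -> float:
    """Return the CPU time to charge to a peer for one processed message."""
    if outcome is Outcome.ACCEPTED:
        return 0.0  # nullify CPU spent on definitely-good work
    if outcome is Outcome.INVALID:
        return cpu_seconds * INVALID_PENALTY  # penalize wasted validation
    return cpu_seconds
```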

laanwj added the label Brainstorming on Oct 5, 2024

ariard commented at 11:59 pm on October 13, 2024: none

I think you can have a look at the discussions in the old #21224 PR. At the time, a few ideas were discussed for wider mitigations of CPU usage as a denial-of-service vector. While unsolicited transactions are far less concerning than CPU usage as a denial-of-service vector, when #21224 was opened it was realized that one of the first steps towards in-depth DoS mitigations for DoSy transaction relay was to halt the processing of unrequested transactions.

See again what I was advocating at the time: “Currently, an attacker can open multiple inbound connections to a node and send expensive to validate, junk transactions. Once the canonical INV/GETDATA sequence is enforced on the network, a further protection would be to deprioritize bandwidth and validation resources allocation, or even to wither connections with such DoSy peers.” I think it’s still true, and one of the approaches is indeed to monitor the CPU usage of peers, yet any mitigation should ensure that the validated transactions still carry significant miner fees.

E.g., transactions A and B can have the same validation cost in terms of hashing and signature verification, yet transaction A can have an absolute fee of 10 000 sats and transaction B an absolute fee of only 1 000 sats.

yuvicc commented at 10:07 am on November 26, 2024: contributor

> Good questions that I do not have an answer to.
>
> I guess this should start with planting some metrics - CPU time used by each peer in the last 1, 5 and 15 minutes and assessing what a “usual” situation looks like.
>
> Specific to CPU DoS via invalid transactions: for peers that send us such transactions, maybe multiply the CPU time by a factor, e.g. 10x or 100x. Or the other way around - nullify CPU time spent on definitely-good-and-positive things like new blocks or new transactions that we accepted.

Building that feature in Bitcoin Core is not a good idea IMO, since you need some kind of shared state to keep the stats of each and every peer, which adds storage and memory overhead. There might be a better way around this.

vasild commented at 10:22 am on November 26, 2024: contributor

We already keep stats for each and every peer, displayed in the getpeerinfo RPC, e.g. bitcoin-cli getpeerinfo.

jonatack commented at 3:57 pm on November 26, 2024: member

Concept ACK on potentially adjusting peers based on their resource usage, though the devil may be in the details. If I’m not misremembering, this has been brought up as an idea worth exploring for several years.

rebroad commented at 8:22 pm on November 28, 2024: contributor

> We already keep stats for each and every peer, displayed in the getpeerinfo RPC, e.g. bitcoin-cli getpeerinfo.

Do those stats include CPU time though?

yuvicc commented at 9:17 am on November 30, 2024: contributor

> > We already keep stats for each and every peer, displayed in the getpeerinfo RPC, e.g. bitcoin-cli getpeerinfo.
>
> Do those stats include CPU time though?

We don’t keep the CPU time of each peer; see https://developer.bitcoin.org/reference/rpc/getpeerinfo.html?highlight=peerinfo

mzumsande commented at 3:45 pm on April 14, 2025: contributor

In many circumstances, using lots of our CPU is a desirable property and should be rewarded, not punished - also outside of IBD. Some of the most undesirable peers are spy nodes that never send us anything. But imagine a peer that is so well-connected and fast that it offers us many new transactions and blocks earlier than our other peers - wouldn’t we want to be connected to this peer, even if it takes up much more CPU than the other peers?

So I think we’d have to do one of two things:

1. Restrict the DoS score to known problematic activities (such as providing us with txns that we don’t accept, but that don’t result in a disconnection). This has limited benefits because it wouldn’t help with unknown DoS vectors.
2. Whitelist “good” activities, such as providing us txns and blocks that turn out to be valid, and not include these in the DoS metric. I don’t know how viable it is to find all of these good activities, but this would allow us to de-prioritize peers that DoS us with unknown methods.

vasild commented at 8:52 am on April 15, 2025: contributor

I agree. Maybe it would make sense to count “CPU wasted” for definitely wasted CPU (first point in the comment above) and “CPU useful” (second point). As an extension to #31672 (in that PR or in a followup).
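To illustrate how the two counters relate to the two options in the comment above, here is a small sketch. The activity labels, the `PeerCpu` class and both scoring methods are hypothetical and do not describe what #31672 implements.

```python
from dataclasses import dataclass

# Hypothetical activity labels, for illustration only.
KNOWN_BAD = {"tx_rejected"}
KNOWN_GOOD = {"tx_accepted", "block_accepted"}


@dataclass
class PeerCpu:
    wasted: float = 0.0        # CPU spent on known-bad activities
    useful: float = 0.0        # CPU spent on known-good activities
    unclassified: float = 0.0  # everything else

    def add(self, activity: str, cpu_seconds: float) -> None:
        """Attribute CPU time to one of the three buckets."""
        if activity in KNOWN_BAD:
            self.wasted += cpu_seconds
        elif activity in KNOWN_GOOD:
            self.useful += cpu_seconds
        else:
            self.unclassified += cpu_seconds

    def score_known_bad_only(self) -> float:
        """First option: count only known problematic activities."""
        return self.wasted

    def score_all_but_good(self) -> float:
        """Second option: count everything except known-good activities."""
        return self.wasted + self.unclassified
```

The difference shows up with activity that is in neither set: the first score ignores it, while the second one charges it to the peer, which is what would let unknown DoS vectors be de-prioritized.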

rebroad commented at 7:03 pm on July 28, 2025: contributor

> In many circumstances, using lots of our CPU is a desirable property and should be rewarded, not punished - also outside of IBD. Some of the most undesirable peers are spy nodes that never send us anything. But imagine a peer that is so well-connected and fast that it offers us many new transactions and blocks earlier than our other peers - wouldn’t we want to be connected to this peer, even if it takes up much more CPU than the other peers?
>
> So I think we’d have to do one of two things:
>
> 1. Restrict the DoS score to known problematic activities (such as providing us with txns that we don’t accept, but that don’t result in a disconnection). This has limited benefits because it wouldn’t help with unknown DoS vectors.
> 2. Whitelist “good” activities, such as providing us txns and blocks that turn out to be valid, and not include these in the DoS metric. I don’t know how viable it is to find all of these good activities, but this would allow us to de-prioritize peers that DoS us with unknown methods.

Absolutely. CPU usage alone is a useless metric. But if combined with TXs (from this peer) entering the mempool (for example), then it might become a bit more useful.
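One way to read this is as a ratio of accepted transactions to CPU time spent. The sketch below is only an illustration of that idea; the function names and the input format are assumptions.

```python
def usefulness(accepted_txs: int, cpu_seconds: float) -> float:
    """Accepted mempool transactions per CPU second spent on a peer."""
    if cpu_seconds <= 0:
        return 0.0  # no CPU spent, so nothing to rate
    return accepted_txs / cpu_seconds


def rank_peers(stats: dict[str, tuple[int, float]]) -> list[str]:
    """Order peers from most to least useful.

    stats maps a peer name to (accepted_txs, cpu_seconds).
    """
    return sorted(stats, key=lambda peer: usefulness(*stats[peer]), reverse=True)
```

Two peers that each cost 10 CPU seconds would then be treated very differently if one of them contributed 200 accepted transactions and the other none.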

Prioritize processing of peers based on their CPU usage #31033
