Rewrite DoS interface between validation and net

sdaftuar commented at 4:57 pm on January 10, 2019: member

This is a rebase of #11639 with some fixes for the last few comments which were not yet addressed.

The original PR text, with some strikethroughs of text that is no longer correct:

This cleans up an old main-carryover - it made sense that main could decide what DoS scores to assign things because the DoS scores were handled in a different part of main, but now validation is telling net_processing what DoS scores to assign to different things, which is utter nonsense. Instead, we replace CValidationState’s nDoS and CorruptionPossible with a general ValidationInvalidReason, which net_processing can handle as it sees fit. I keep the behavior changes here to a minimum, but in the future we can utilize these changes for other smarter behavior, such as disconnecting/preferring to rotate outbound peers based on them providing things which are invalid due to SOFT_FORK because we shouldn’t ban for such cases.
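A minimal sketch of the split described above, assuming a trimmed-down enum and state class. The names `ValidationState` and `MisbehaviorScoreFor`, the enum subset, and the scores are illustrative assumptions, not the PR's actual code. Validation only records why something failed; the networking side decides what to do about it.

```cpp
#include <cassert>
#include <string>

// Hypothetical, trimmed-down reason enum; the PR defines more values.
enum class ValidationInvalidReason {
    NONE,                    // not actually invalid
    CONSENSUS,               // invalid by consensus rules
    RECENT_CONSENSUS_CHANGE, // invalid by a rule that changed recently
    TX_NOT_STANDARD,         // fails local policy only
};

// Validation records *why* something failed, not how to punish the peer.
class ValidationState
{
    ValidationInvalidReason m_reason{ValidationInvalidReason::NONE};
    std::string m_reject_reason;

public:
    bool Invalid(ValidationInvalidReason reason, const std::string& reject_reason)
    {
        m_reason = reason;
        m_reject_reason = reject_reason;
        return false; // lets callers write `return state.Invalid(...)`
    }
    bool IsInvalid() const { return m_reason != ValidationInvalidReason::NONE; }
    ValidationInvalidReason GetReason() const { return m_reason; }
    const std::string& GetRejectReason() const { return m_reject_reason; }
};

// The networking side chooses the penalty. The scores here are illustrative.
int MisbehaviorScoreFor(const ValidationState& state)
{
    switch (state.GetReason()) {
    case ValidationInvalidReason::CONSENSUS:
        return 100;
    case ValidationInvalidReason::NONE:
    case ValidationInvalidReason::RECENT_CONSENSUS_CHANGE:
    case ValidationInvalidReason::TX_NOT_STANDARD:
        return 0;
    }
    return 0;
}
```

With this shape, changing how a given failure is punished only touches the networking-side switch, and validation code stays unaware of peers.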

This is somewhat complementary with, though obviously conflicts heavily with #11523, which added enums in place of DoS scores, as well as a few other cleanups (which are still relevant).

Compared with previous bans, the following changes are made:

- Txn with empty vin/vout or null prevouts move from 10 DoS points to 100.
- Loose transactions with a dependency loop now result in a ban instead of 10 DoS points.
- ~~BIP68-violation no longer results in a ban as it is SOFT_FORK.~~
- ~~Non-SegWit SigOp violation no longer results in a ban as it considers P2SH sigops and is thus SOFT_FORK.~~
- ~~Any script violation in a block no longer results in a ban as it may be the result of a SOFT_FORK. This should likely be fixed in the future by differentiating between them.~~
- Proof of work failure moves from 50 DoS points to a ban.
- Blocks with timestamps under MTP now result in a ban, blocks too far in the future continue to not result in a ban.
- Inclusion of non-final transactions in a block now results in a ban instead of 10 DoS points.

Note: The change to ban all peers for consensus violations is actually NOT the change I’d like to make – I’d prefer to only ban outbound peers in those situations. The current behavior is a bit of a mess, however, and so in the interests of advancing this PR I tried to keep the changes to a minimum. I plan to revisit the behavior in a followup PR.

EDIT: One reviewer suggested I add some additional context for this PR:

The goal of this work was to make net_processing aware of the actual reasons for validation failures, rather than just deal with opaque numbers instructing it to do something.

In the future, I’d like to make it so that we use more context to decide how to punish a peer. One example is to differentiate inbound and outbound peer misbehaviors. Another potential example is if we’d treat RECENT_CONSENSUS_CHANGE failures differently (ie after the next consensus change is implemented), and perhaps again we’d want to treat some peers differently than others.
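One way the context-dependent punishment mentioned above could look, as a sketch only. The `PeerAction` enum, the `ChoosePeerAction` function, and the specific policy (ban outbound peers for consensus failures, merely rotate outbound peers for recent-consensus-change failures, leave inbound peers alone) are assumptions chosen for illustration; the PR itself does not implement this.

```cpp
#include <cassert>

// Hypothetical subset of failure reasons.
enum class InvalidReason { CONSENSUS, RECENT_CONSENSUS_CHANGE };

// Hypothetical set of actions the networking layer could take.
enum class PeerAction { NONE, DISCONNECT, BAN };

// Illustrative policy: only outbound peers are acted upon, and failures
// that may stem from a recent rule change lead to rotation, not a ban.
PeerAction ChoosePeerAction(InvalidReason reason, bool is_outbound)
{
    if (!is_outbound) return PeerAction::NONE;
    switch (reason) {
    case InvalidReason::CONSENSUS:
        return PeerAction::BAN;
    case InvalidReason::RECENT_CONSENSUS_CHANGE:
        return PeerAction::DISCONNECT;
    }
    return PeerAction::NONE;
}
```

A policy like this is only expressible once the networking code sees the failure reason rather than a pre-computed score.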

laanwj added the label Validation on Jan 10, 2019

DrahtBot commented at 6:31 pm on January 10, 2019: member

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#15681 ([mempool] Allow one extra single-ancestor transaction per package by TheBlueMatt)
#15505 ([p2p] Request NOTFOUND transactions immediately from other outbound peers, when possible by sdaftuar)
#15253 (Net: Consistently log outgoing INV messages by Empact)
#13868 (Remove unused fScriptChecks parameter from CheckInputs by Empact)
#13525 (Report reason inputs are nonstandard from AreInputsStandard by Empact)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

practicalswift commented at 6:34 pm on January 10, 2019: contributor

Concept ACK

laanwj added this to the "Blockers" column in a project

sipa commented at 6:17 pm on January 11, 2019: member

Concept ACK; I’ll review for changes in behavior for specific validation reasons later.

laanwj commented at 2:23 pm on January 14, 2019: member

Concept ACK

in src/consensus/validation.h:77 in 94874ddfdf outdated

@@ -30,32 +74,24 @@ class CValidationState {
         MODE_INVALID, //!< network rule violation (DoS value may be set)
         MODE_ERROR,   //!< run-time error
     } mode;
-    int nDoS;
+    ValidationInvalidReason reason;

jnewbery commented at 7:52 pm on January 14, 2019:

nit: change to m_reason and avoid all the non-shadowing naming tricks below (reasonIn and _reason)

sdaftuar commented at 9:40 pm on January 31, 2019:

Fixed in latest commit.

in src/consensus/validation.h:61 in 94874ddfdf outdated

+     * activation, or witness may have been malleated (which includes
+     * non-standard witnesses).
+     */
+    TX_WITNESS_MUTATED,
+    /**
+     * Tx already in mempool or conflicts with a tx in the chain

jnewbery commented at 8:03 pm on January 14, 2019:

nit: s/conflicts with a tx in the chain/conflicts with a confirmed transaction. Same comment below for “exists in the mempool or on chain”

sdaftuar commented at 9:40 pm on January 31, 2019:

I don’t really think “confirmed transaction” is any clearer than “tx in the chain” – if anything, the latter seems more specific to me, as “confirmed” is a concept that only makes sense in the context of the chain that you’re on, which “tx in the chain” is more explicit about.

I’m going to leave this comment intact, pending other opinions.

sipa commented at 8:56 pm on February 8, 2019:

I think the current wording is fine.

in src/validation.cpp:3344 in 94874ddfdf outdated

@@ -3333,7 +3341,9 @@ static bool ContextualCheckBlock(const CBlock& block, CValidationState& state, c
     // the block hash, so we couldn't mark the block as permanently
     // failed).
     if (GetBlockWeight(block) > MAX_BLOCK_WEIGHT) {
-        return state.DoS(100, false, REJECT_INVALID, "bad-blk-weight", false, strprintf("%s : weight limit failed", __func__));
+        // We can call this a consensus failure as any data-providers who provided

jnewbery commented at 10:09 pm on January 14, 2019:

This seems entirely obvious and not requiring a comment to me, which makes me think there’s some subtlety I’ve missed. Is this just saying that if we receive a block with witness data, it should be valid-according-to-BIP141?

Pedantic nit: I’d also avoid talking about ‘data-providers’ in validation.cpp. After this PR, validation should be unconcerned with data-providers and only be validating blocks based on the block data.

sdaftuar commented at 1:23 am on January 15, 2019:

I believe this comment is contrasting a CONSENSUS failure from a RECENT_CONSENSUS_CHANGE – I think in @TheBlueMatt’s original PR he had some validation failures marked as RECENT_CONSENSUS_CHANGE, but eventually we decided to switch them all out (and reserve RECENT_CONSENSUS_CHANGE as something we might do in the future).

I think I agree with you philosophically that validation ought not be very concerned with ‘data providers’, but I think the ValidationReasons interface is also driven by the needs of net_processing, so sometimes we may need to explain reasons that maybe don’t make sense in a totally neutral validation library because our application requires it. RECENT_CONSENSUS_CHANGE is one such possible example (though we’re not using it in this PR and I am not sure we ever will); the BLOCK_INVALID_HEADER enum I added is another (net_processing needs to be able to distinguish some headers failures from others, in order to maintain the current ban behavior).

Anyway I’ll update this comment to be clearer.

ajtowns commented at 5:31 am on January 15, 2019:

I think this comment is justifying upgrading the (at the time recent) segwit test from a RECENT_CONSENSUS_CHANGE to just CONSENSUS_CHANGE, the reason being that either you’ve got an old client that didn’t provide segwit data – in which case this test won’t trigger because the bad-blk-length test will already have failed – or it is providing segwit data but doing it wrong, in which case there’s no reason to use the more forgiving RECENT version. So I think just dropping the comment (now that segwit isn’t recent) is fine, fwiw.

in src/net_processing.cpp:982 in 94874ddfdf outdated

+        // building off an invalid or missing block -- are punished regardless
+        // (see below).
+        return !via_compact_block;
+    case ValidationInvalidReason::BLOCK_INVALID_HEADER:
+    case ValidationInvalidReason::BLOCK_CHECKPOINT:
+    case ValidationInvalidReason::BLOCK_INVALID_PREV:

jnewbery commented at 10:42 pm on January 14, 2019:

It’s unclear to me whether peers should always be punished for BLOCK_INVALID_PREV. For example, if the previous block was invalid because of RECENT_CONSENSUS_CHANGE and the peer wasn’t punished, should it be punished for relaying this descendant block?

Should compact block peers be punished for relaying the block if its parent is invalid? My reading of https://github.com/bitcoin/bips/blob/master/bip-0152.mediawiki#pre-validation-relay-and-consistency-considerations is that the answer is no.

Same question for MaybePunishNode() below.

ajtowns commented at 5:56 am on January 15, 2019:

If miners have mostly upgraded then building on top of a RECENT_CONSENSUS_CHANGE block should be rare enough for this not to be a huge problem.

If not, and we want to cope with a moderately controversial consensus upgrade, then we probably want to track whether blocks failed due to RECENT_CONSENSUS_CHANGE and mark their children as also failing due to RECENT_CONSENSUS_CHANGE (after checking PoW at least). Working all that out doesn’t seem necessary for this patchset to me though.

sdaftuar commented at 6:07 pm on January 15, 2019:

With respect to the issue you’re bringing up, I believe the behavior in this PR matches existing behavior, in which case I’d prefer to defer improvement to a future PR. If I’m missing some way that we’ve made things different or worse though let me know.

As for BIP 152:

A node MUST NOT send a cmpctblock message without having validated that the header properly commits to each transaction in the block, and properly builds on top of the existing, fully-validated chain with a valid proof-of-work either as a part of the current most-work valid chain, or building directly on top of it. A node MAY send a cmpctblock before validating that each transaction in the block validly spends existing UTXO set entries.
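The distinction drawn in the BIP 152 text can be sketched as follows. This is a simplified illustration, assuming a hypothetical `BlockInvalidReason` subset and a hypothetical `ShouldPunishBlockSource` helper: failures a compact-block relayer was allowed not to have checked are forgiven when the block arrived via compact block, while header-level failures are punished either way.

```cpp
#include <cassert>

// Hypothetical subset of block-level failure reasons.
enum class BlockInvalidReason {
    CONSENSUS,      // full validation failed (e.g. a bad transaction)
    INVALID_HEADER, // the header itself is invalid (e.g. bad proof of work)
    INVALID_PREV,   // builds on a block already known to be invalid
};

// Returns true if the peer that supplied the block should be punished.
bool ShouldPunishBlockSource(BlockInvalidReason reason, bool via_compact_block)
{
    switch (reason) {
    case BlockInvalidReason::CONSENSUS:
        // A compact-block relayer may forward before validating that each
        // transaction spends existing UTXO entries, so it is not punished.
        return !via_compact_block;
    case BlockInvalidReason::INVALID_HEADER:
    case BlockInvalidReason::INVALID_PREV:
        // These must be checked before relaying, whatever the relay path.
        return true;
    }
    return false;
}
```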

jnewbery commented at 10:44 pm on January 14, 2019: member

I’ve reviewed a5415e85c. A few nits/questions inline.

in src/net_processing.cpp:1532 in 94874ddfdf outdated

@@ -1453,6 +1529,7 @@ bool static ProcessHeadersMessage(CNode *pfrom, CConnman *connman, const std::ve
                 // etc), and not just the duplicate-invalid case.
                 pfrom->fDisconnect = true;
             }
+            MaybePunishNode(pfrom->GetId(), state, /*via_compact_block*/ false, "invalid header received");

sdaftuar commented at 1:29 am on January 15, 2019:

Self-review: I think adding this line here may be a bug. At any rate, there is a serious confusion between the hacky punish_invalid bool in the existing code and the introduction of MaybePunishNode in this PR that ought to be cleaned up.

ajtowns commented at 2:48 pm on January 15, 2019:

This replaces the Misbehaving(.., "invalid header received"); from earlier; shouldn’t be introducing a bug (unless the move to below the if introduces one)?

sdaftuar commented at 5:54 pm on January 15, 2019:

I believe there is an unintended behavior change here – previously, “duplicate invalid” headers were not assigned DoS points. We added a bunch of logic (just above this line of code) to punish outbound peers for providing invalid headers.

After the rewrite in this PR, CACHED_INVALID is a bannable offense from any peer (other than in HB compact block relay).

I’ll rework this…

ryanofsky commented at 6:03 pm on February 12, 2019:

re: #15141 (review)

I’ll rework this…

This is resolved now?

in src/consensus/validation.h:83 in 6ee2c4551d outdated

@@ -117,14 +107,7 @@ class CValidationState {
     bool IsError() const {
         return mode == MODE_ERROR;
     }
-    bool CorruptionPossible() const {
-        return corruptionPossible;
-    }

ajtowns commented at 6:07 am on January 15, 2019:

Maybe having inline bool CorruptionPossible() const { return reason == BLOCK_MUTATED; } would make for nicer code elsewhere?
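A sketch of the suggested convenience wrapper, assuming a stripped-down state class (`StateSketch` and `SetReason` are hypothetical names). Note the assumption baked in: only `BLOCK_MUTATED` counts as possible corruption here, which is narrower than the old flag if it was also set for mutated witnesses.

```cpp
#include <cassert>

// Hypothetical subset of failure reasons.
enum class ValidationInvalidReason { NONE, CONSENSUS, BLOCK_MUTATED };

class StateSketch
{
    ValidationInvalidReason m_reason{ValidationInvalidReason::NONE};

public:
    void SetReason(ValidationInvalidReason reason) { m_reason = reason; }
    ValidationInvalidReason GetReason() const { return m_reason; }

    // Derived from the reason instead of being stored as a separate flag,
    // so the two can never disagree.
    bool CorruptionPossible() const
    {
        return m_reason == ValidationInvalidReason::BLOCK_MUTATED;
    }
};
```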

in src/validation.cpp:906 in 6ee2c4551d outdated

@@ -889,7 +889,6 @@ static bool AcceptToMemoryPoolWorker(const CChainParams& chainparams, CTxMemPool
                 // Only the witness is missing, so the transaction itself may be fine.
                 state.Invalid(ValidationInvalidReason::TX_WITNESS_MUTATED, false,
                           state.GetRejectCode(), state.GetRejectReason(), state.GetDebugMessage());
-                state.SetCorruptionPossible();

ajtowns commented at 12:57 pm on January 15, 2019:

This change isn’t a clean refactor – !state.CorruptionPossible() would have returned false after this, but its replacement in this commit (ie, state.GetReason() != BLOCK_MUTATED) will return true. I think this is okay though, since CorruptionPossible() is only checked for block updates, and this just deals with mempool tx’s, and the uses of state.CorruptionPossible() that this would have affected were already changed to state.GetReason() != TX_WITNESS_MUTATED in an earlier commit.

in src/blockencodings.cpp:206 in 963699d131 outdated

@@ -203,7 +203,7 @@ ReadStatus PartiallyDownloadedBlock::FillBlock(CBlock& block, const std::vector<
         // but that is expensive, and CheckBlock caches a block's
         // "checked-status" (in the CBlock?). CBlock should be able to
         // check its own merkle root and cache that check.
-        if (state.CorruptionPossible())
+        if (state.GetReason() == ValidationInvalidReason::BLOCK_MUTATED)

ajtowns commented at 2:18 pm on January 15, 2019:

Seems like this change could be squashed into “Remove references to CValidationState’s DoS and CorruptionPossible” ?

ajtowns commented at 2:40 pm on January 15, 2019: member

Looks good to me; haven’t fully looked through “Use new reason-based DoS/disconnect logic instead of state.nDoS” though.

in src/consensus/tx_verify.cpp:163 in a5415e85ca outdated

@@ -160,24 +160,24 @@ bool CheckTransaction(const CTransaction& tx, CValidationState &state, bool fChe
 {
     // Basic checks that don't depend on any context
     if (tx.vin.empty())
-        return state.DoS(10, false, REJECT_INVALID, "bad-txns-vin-empty");
+        return state.DoS(10, ValidationInvalidReason::CONSENSUS, false, REJECT_INVALID, "bad-txns-vin-empty");

ryanofsky commented at 7:30 pm on January 15, 2019:

In commit “Add useful-for-dos “reason” field to CValidationState” (a5415e85caaf2f5a77d6bae9574bb6d21139ee34)

Note: word-diff is useful here to review new function arguments:

git log -p -n1 -U0 --word-diff-regex=. a5415e85caaf2f5a77d6bae9574bb6d21139ee34

in src/validation.cpp:724 in a5415e85ca outdated

@@ -716,27 +716,22 @@ static bool AcceptToMemoryPoolWorker(const CChainParams& chainparams, CTxMemPool
                               fSpendsCoinbase, nSigOpsCost, lp);
         unsigned int nSize = entry.GetTxSize();

-        // Check that the transaction doesn't have an excessive number of

ryanofsky commented at 7:36 pm on January 15, 2019:

In commit “Add useful-for-dos “reason” field to CValidationState” (a5415e85caaf2f5a77d6bae9574bb6d21139ee34):

Why remove this comment?

kallewoof commented at 5:45 am on March 5, 2019:

Good question.

Sjors commented at 1:37 pm on March 6, 2019:

Note that MAX_BLOCK_SIGOPS has been renamed / replaced by MAX_STANDARD_TX_SIGOPS_COST as part of SegWit in 2b1f6f9ccf36f1e0a2c9d99154e1642f796d7c2b. In addition to this comment, MAX_BLOCK_SIGOPS is also still mentioned in the functional test framework. But that doesn’t explain why the comment can be removed.

sdaftuar commented at 3:16 pm on March 7, 2019:

I think this comment is not very helpful. It was originally added in #4150, and in the review on that PR people complained that the phrasing in this comment is confusing (“invalid rather than merely non-standard” - huh?).

If reviewers prefer it, then I can just improve this comment rather than delete it – it just seems to me like the code reads just fine on its own.

ajtowns commented at 8:11 am on April 1, 2019:

MAX_BLOCK_SIGOPS is replaced by MAX_BLOCK_SIGOPS_COST (which is just multiplied by the witness scale factor of 4) as part of segwit. MAX_STANDARD_TX_SIGOPS_COST is just a separate rule at the relay/mempool level saying “you have to use at least 5 tx’s to hit the sigop limit”.

I agree that the comment’s just confusing given how we understand “invalid” (breaks consensus rules) vs “non-standard” (not interesting for mempool/relay, but not too strongly punished either).

in src/validation.cpp:1994 in a5415e85ca outdated

1987@@ -1988,11 +1988,17 @@ bool CChainState::ConnectBlock(const CBlock& block, CValidationState& state, CBl
1988         {
1989             CAmount txfee = 0;
1990             if (!Consensus::CheckTxInputs(tx, state, view, pindex->nHeight, txfee)) {
1991+                if (state.GetReason() == ValidationInvalidReason::TX_MISSING_INPUTS) {
1992+                    // CheckTxInputs may return MISSING_INPUTS but we can't return that, as
1993+                    // it's not defined for a block, so we reset the reason flag to CONSENSUS here.
1994+                    state.DoS(state.GetDoS(), ValidationInvalidReason::CONSENSUS, false,

ryanofsky commented at 7:44 pm on January 15, 2019:

In commit “Add useful-for-dos “reason” field to CValidationState” (a5415e85caaf2f5a77d6bae9574bb6d21139ee34)

This seems like it is doubling the state.nDoS level, in addition to updating the reason enum:

https://github.com/bitcoin/bitcoin/blob/a5415e85caaf2f5a77d6bae9574bb6d21139ee34/src/consensus/validation.h#L96

Would suggest replacing this change with something more straightforward like state.UpdateReason(ValidationInvalidReason::CONSENSUS)
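The doubling concern can be illustrated with a small sketch. It assumes that DoS() adds `level` to a running score, as the comment above implies; `LegacyStateSketch` and its members are hypothetical stand-ins, and `UpdateReason` is the reviewer's proposed name rather than an existing method.

```cpp
#include <cassert>

// Hypothetical subset of failure reasons.
enum class Reason { NONE, TX_MISSING_INPUTS, CONSENSUS };

class LegacyStateSketch
{
    int m_dos{0};
    Reason m_reason{Reason::NONE};

public:
    // Assumed legacy behavior: the score accumulates across calls.
    bool DoS(int level, Reason reason)
    {
        m_dos += level;
        m_reason = reason;
        return false;
    }
    // Proposed alternative: change the reason and leave the score alone.
    void UpdateReason(Reason reason) { m_reason = reason; }

    int GetDoS() const { return m_dos; }
    Reason GetReason() const { return m_reason; }
};
```

Under that assumption, calling `state.DoS(state.GetDoS(), ...)` on a state that already holds 100 points leaves it at 200, whereas `UpdateReason` keeps it at 100.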

in src/validation.cpp:2018 in a5415e85ca outdated

+                if (state.GetReason() == ValidationInvalidReason::TX_NOT_STANDARD) {
+                    // CheckInputs may return NOT_STANDARD for extra flags we passed,
+                    // but we can't return that, as it's not defined for a block, so
+                    // we reset the reason flag to CONSENSUS here.
+                    // (note that this may not be the case until we add additional
+                    // soft-fork flags to our script flags, in which case we  need to

ryanofsky commented at 7:45 pm on January 15, 2019:

In commit “Add useful-for-dos “reason” field to CValidationState” (a5415e85caaf2f5a77d6bae9574bb6d21139ee34)

Extra space on this line

in src/validation.cpp:2042 in a5415e85ca outdated

+                    // we reset the reason flag to CONSENSUS here.
+                    // (note that this may not be the case until we add additional
+                    // soft-fork flags to our script flags, in which case we  need to
+                    // be careful to differentiate RECENT_CONSENSUS_CHANGE and
+                    // CONSENSUS)
+                    state.DoS(state.GetDoS(), ValidationInvalidReason::CONSENSUS, false,

ryanofsky commented at 7:47 pm on January 15, 2019:

In commit “Add useful-for-dos “reason” field to CValidationState” (a5415e85caaf2f5a77d6bae9574bb6d21139ee34)

This also seems to double state.nDoS.

in src/validation.cpp:1989 in a5415e85ca outdated

1987@@ -1988,11 +1988,17 @@ bool CChainState::ConnectBlock(const CBlock& block, CValidationState& state, CBl
1988         {
1989             CAmount txfee = 0;
1990             if (!Consensus::CheckTxInputs(tx, state, view, pindex->nHeight, txfee)) {
1991+                if (state.GetReason() == ValidationInvalidReason::TX_MISSING_INPUTS) {
1992+                    // CheckTxInputs may return MISSING_INPUTS but we can't return that, as
1993+                    // it's not defined for a block, so we reset the reason flag to CONSENSUS here.

ryanofsky commented at 7:52 pm on January 15, 2019:

In commit “Add useful-for-dos “reason” field to CValidationState” (a5415e85caaf2f5a77d6bae9574bb6d21139ee34)

Is there a check for the requirement that MISSING_INPUTS is not used for a block? I would expect to see an assert(reason != MISSING_INPUTS) or assert(ValidForBlock(reason)) or something like that somewhere.
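A sketch of the kind of guard being asked for. `ValidForBlock` is the reviewer's suggested name; its body, the enum subset, and the `NormalizeForBlock` helper are assumptions made for illustration. Transaction-only reasons are mapped to CONSENSUS before a block-level result is reported, and an assertion documents the invariant.

```cpp
#include <cassert>

// Hypothetical subset of failure reasons.
enum class ValidationInvalidReason {
    NONE,
    CONSENSUS,
    TX_MISSING_INPUTS, // only meaningful for loose transactions
    TX_NOT_STANDARD,   // only meaningful for loose transactions
    BLOCK_MUTATED,
};

// True if `reason` may be reported as the outcome of block validation.
bool ValidForBlock(ValidationInvalidReason reason)
{
    switch (reason) {
    case ValidationInvalidReason::TX_MISSING_INPUTS:
    case ValidationInvalidReason::TX_NOT_STANDARD:
        return false;
    case ValidationInvalidReason::NONE:
    case ValidationInvalidReason::CONSENSUS:
    case ValidationInvalidReason::BLOCK_MUTATED:
        return true;
    }
    return false;
}

// Inside a block, a transaction-only failure is a consensus failure.
ValidationInvalidReason NormalizeForBlock(ValidationInvalidReason reason)
{
    if (!ValidForBlock(reason)) reason = ValidationInvalidReason::CONSENSUS;
    assert(ValidForBlock(reason)); // invariant for all block-level results
    return reason;
}
```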

in src/validation.cpp:3106 in a5415e85ca outdated

+            return state.DoS(100, ValidationInvalidReason::CONSENSUS, false, REJECT_INVALID, "bad-cb-multiple", false, "more than one coinbase");

     // Check transactions
-    for (const auto& tx : block.vtx)
-        if (!CheckTransaction(*tx, state, true))
-            return state.Invalid(false, state.GetRejectCode(), state.GetRejectReason(),

ryanofsky commented at 7:57 pm on January 15, 2019:

In commit “Add useful-for-dos “reason” field to CValidationState” (a5415e85caaf2f5a77d6bae9574bb6d21139ee34)

Note: I guess this line used to set state.corruptionPossible = false but no longer does.

https://github.com/bitcoin/bitcoin/blob/cebe910718ae4f099f292736192a4e725ad02b94/src/consensus/validation.h#L54-L58

New way seems better.

Sjors commented at 1:28 pm on March 6, 2019:

So if I understand this correctly:

state.Invalid() just calls state.DoS() with level=0 and corruptionIn=false (default).
CheckTransaction() can currently fail in various ways, calling:
- state.DoS with:
  - level 10 or 100: why isn’t this higher level a problem?
  - corruptionIn not specified (so defaults to false)

sdaftuar commented at 3:04 pm on March 7, 2019:

level 10 or 100: why isn’t this higher level a problem?

@sjors I don’t understand your question – can you rephrase?

ajtowns commented at 9:29 am on April 1, 2019:

If I understand correctly: the higher level (ie, changing the 10’s to 100’s in 96cedc8d0c0e3ad279bc2223a7fc3185b17ebde5 - the “clean up banning levels” commit) isn’t a problem because these failures are all consensus ones, so any reasonable implementation shouldn’t be making them. (Except for immature coinbase and missing inputs at the mempool level which are downgraded elsewhere)

in src/net_processing.cpp:1233 in 6bdc4491e0 outdated

         if (it != mapBlockSource.end() && State(it->second.first) && state.GetRejectCode() > 0 && state.GetRejectCode() < REJECT_INTERNAL) {
             CBlockReject reject = {(unsigned char)state.GetRejectCode(), state.GetRejectReason().substr(0, MAX_REJECT_MESSAGE_LENGTH), hash};
             State(it->second.first)->rejects.push_back(reject);
-            if (nDoS > 0 && it->second.second)
-                Misbehaving(it->second.first, nDoS);
+            MaybePunishNode(/*nodeid=*/ it->second.first, state, /*via_compact_block=*/ !it->second.second);

ryanofsky commented at 8:21 pm on January 15, 2019:

In commit “Use state reason field to check for collisions in cmpctblocks” (963699d1316f6b14c98a4624f766393379db85e1)

Since the mapBlockSource bool is now being passed as !via_compact_block, it seems like the field description should mention something about setting it based on whether the source was a compact or full block:

https://github.com/bitcoin/bitcoin/blob/6bdc4491e06433eb380ca3b8bc3e7c15f06aee8b/src/net_processing.cpp#L104-L105

ryanofsky commented at 9:27 pm on January 15, 2019: member

Started reviewing this, but IMO, the way this PR is structured makes it difficult to verify that it doesn’t unintentionally change behavior.

I think a nicer way to write this would be to have one commit adding empty ValidationInvalidReason, MayResultInDisconnect, and MaybePunishNode definitions, and adding a state.Invalid() overload taking an optional ValidationInvalidReason argument. Then have a sequence of small followup commits which each add a few enum values at a time, passing them through state.Invalid() and translating them into Misbehaving() calls, where each commit is self-contained and deals with related reasons so it is easy to spot and understand changes in behavior.

If this is a bad idea, or too much work, I’d be ok with trying to review this PR as it is, but I wanted to suggest something to be able to have more confidence in it, and to maybe make it easier to find other reviewers.

a5415e85caaf2f5a77d6bae9574bb6d21139ee34 Add useful-for-dos “reason” field to CValidationState (1/8)
33213ad4ed9c5cef893285a7880ca708fb86a4ff Add functions to convert CValidationInterface’s reason to DoS info (2/8)
6bdc4491e06433eb380ca3b8bc3e7c15f06aee8b Use new reason-based DoS/disconnect logic instead of state.nDoS (3/8)
963699d1316f6b14c98a4624f766393379db85e1 Use state reason field to check for collisions in cmpctblocks (4/8)
06e4247ede5d052c9680f9bacee6ec52b83cc097 Prep for scripted-diff by removing some \ns which annoy sed. (5/8)
81318965979971dbcf04df5a216ee1a687d8173f scripted-diff: Remove DoS calls to CValidationState (6/8)
6ee2c4551d055dd3c7bf28ed4bde7c566d75dfef Remove references to CValidationState’s DoS and CorruptionPossible (7/8)
94874ddfdf4ae9c4b10f3f91d5e3280e2c24c371 Update some comments in validation.cpp as we arent doing DoS there (8/8)

in src/test/txvalidation_tests.cpp:56 in a642744cc5 outdated

@@ -53,9 +53,8 @@ BOOST_FIXTURE_TEST_CASE(tx_mempool_reject_coinbase, TestChain100Setup)
     BOOST_CHECK(state.IsInvalid());
     BOOST_CHECK_EQUAL(state.GetRejectReason(), "coinbase");

-    int nDoS;
-    BOOST_CHECK_EQUAL(state.IsInvalid(nDoS), true);
-    BOOST_CHECK_EQUAL(nDoS, 100);
+    BOOST_CHECK_EQUAL(state.IsInvalid(), true);

ajtowns commented at 1:19 pm on January 18, 2019:

We checked state.IsInvalid() a couple of lines earlier, so this addition is redundant.

ajtowns commented at 1:42 pm on January 18, 2019: member

Started reviewing this, but IMO, the way this PR is structured makes it difficult to verify that it doesn’t unintentionally change behavior.

FWIW, I’ve had a go at redoing the patchset to try to make the (potential) functionality changes more clear: https://github.com/ajtowns/bitcoin/commits/201901-dosreasons

This has (I think) all the behaviour changes first:

d9451de0d0 drop obsolete comment
acdb469525 [refactor] stateDummy -> orphan_state
5cd7a4d338 [refactor] Use maybepunish etc
0d1d471eac [refactor] drop IsInvalid(nDoSOut)
a0776a5d8a set nDoS rather than bumping it
e0cff4e133 Clean up banning levels

before introducing the new reason field, along with checks that the implied DoS value for each reason matches the actual DoS values presented/used:

89e8dea284 [refactor] Add useful-for-dos "reason" field to CValidationState

which then allows dropping the instance variables:

27089e55be [refactor] Drop redundant nDoS, corruptionPossible, SetCorruptionPossible

Then the code is changed to use reasons directly:

15d9023106 LookupBlockIndex -> CACHED_INVALID
519fb78934 CorruptionPossible -> TX_WITNESS_MUTATED
221d17f332 CorruptionPossible -> BLOCK_MUTATED
32747d0746 [refactor] Use Reasons directly instead of DoS codes

And the now obsolete DoS/etc stuff is dropped:

4d110a59c6 [refactor] Prep for scripted-diff by removing some \ns which annoy sed.
9a89a47257 scripted-diff: Remove DoS calls to CValidationState
327591b016 [refactor] Drop unused state.DoS(), state.GetDoS(), state.CorruptionPossible()

That leaves a couple more things:

94cf0deffb [refactor] Update some comments in validation.cpp as we arent doing DoS there
04c6b24a66 [refactor] swap if/else order
96f0ee075d remaining commits vs 94874ddfdf4ae9c4b10f3f91d5e3280e2c24c371

but finally ends up with the same code as this PR (minus the latest commit anyway).

Anyway I think this approach might be easier to review? It could also allow splitting the PR into two – one making the changes to DoS behaviour but not changing the way DoS works; followed by a second PR that actually adds the Reasons and refactors but doesn’t change behaviour.

(Proof of concept only: bunches of these commits should probably be combined, commit messages need improvement, and I think I lost a bunch of authorship info)

EDIT:

I’ve added an extra commit prior to the DoS->Invalid refactor, namely “5b15205883 Allow use of state.Invalid() for all reasons” that avoids assertions that Invalid() is only used for DoS-level-0 problems failing.

That just leaves one test failure in the intermediate commits; feature_block.py fails after changing the DoS levels but before adding the “reason” code. I think this is due to lowering bad-txns-inputs-missingorspent from 100 to 0 with the tests still expecting a disconnect when that happens in a block. Adding the reasons fixes this because that includes code to update that problem from 0/TX_MISSING_INPUTS to 100/CONSENSUS when it affects a block rather than a loose transaction. (And similarly, the tests are adjusted to expect disconnects due to premature coinbase spends, but that functionality only occurs as part of the 0/TX to 100/CONS step.)

sdaftuar commented at 2:46 pm on January 18, 2019: member

Thanks all for the review so far!

I’d started taking a stab at rewriting this; I’ll continue with my approach to see how it ends up but @ajtowns thank you for your help – @ryanofsky if you have any thoughts on @ajtowns’s rework please let me know, happy to adapt his breakdown and include here if that approach looks good.

ryanofsky commented at 3:12 pm on January 18, 2019: member

@ryanofsky if you have any thoughts on @ajtowns’s rework please let me know

Took a quick look, and I think ajtowns’s refactor is great. It’s a slightly different approach than I suggested in that the 32747d0746d91a8f63e39cedfb232f8c36b33bc6 commit which starts using reason codes is done all at once instead of incrementally as reasons are added, so it requires a little bit of grepping to verify, but this is easy to do and I think it’s a huge improvement.

I think it would be best to use ajtowns’s branch here, unless you’ve done a lot of work on your own already or see problems I’m missing.

naumenkogs commented at 7:16 pm on January 21, 2019: member

Concept ACK, I will take a closer look once the code is updated per comments above I guess.

jnewbery removed this from the "Blockers" column in a project

sdaftuar force-pushed on Jan 24, 2019

sdaftuar commented at 6:26 pm on January 24, 2019: member

I have redone this along the lines of @ajtowns branch, and cleaned up each commit (I think!) so that each one should be logically correct, pass tests, etc.

I’ve saved the original version of this PR here: https://github.com/sdaftuar/bitcoin/commits/15141.original

The diff between the two is pretty small (just some formatting changes that were getting tedious to resolve, and I removed a couple lines that some reviewers had commented on as being unnecessary):

diff --git a/src/consensus/tx_verify.cpp b/src/consensus/tx_verify.cpp
index fb04c1c0abf..a7b31ff7c56 100644
--- a/src/consensus/tx_verify.cpp
+++ b/src/consensus/tx_verify.cpp
@@ -221,8 +221,7 @@ bool Consensus::CheckTxInputs(const CTransaction& tx, CValidationState& state, c
 
         // If prev is coinbase, check that it's matured
         if (coin.IsCoinBase() && nSpendHeight - coin.nHeight < COINBASE_MATURITY) {
-            return state.Invalid(ValidationInvalidReason::TX_MISSING_INPUTS, false,
-                REJECT_INVALID, "bad-txns-premature-spend-of-coinbase",
+            return state.Invalid(ValidationInvalidReason::TX_MISSING_INPUTS, false, REJECT_INVALID, "bad-txns-premature-spend-of-coinbase",
                 strprintf("tried to spend coinbase at depth %d", nSpendHeight - coin.nHeight));
         }
 
diff --git a/src/consensus/validation.h b/src/consensus/validation.h
index daf8b9b87cc..09a5630a4f3 100644
--- a/src/consensus/validation.h
+++ b/src/consensus/validation.h
@@ -81,8 +81,8 @@ private:
 public:
     CValidationState() : mode(MODE_VALID), reason(ValidationInvalidReason::NONE), chRejectCode(0) {}
     bool Invalid(ValidationInvalidReason reasonIn, bool ret = false,
-             unsigned int chRejectCodeIn=0, const std::string &strRejectReasonIn="",
-             const std::string &strDebugMessageIn="") {
+            unsigned int chRejectCodeIn=0, const std::string &strRejectReasonIn="",
+            const std::string &strDebugMessageIn="") {
         reason = reasonIn;
         chRejectCode = chRejectCodeIn;
         strRejectReason = strRejectReasonIn;
diff --git a/src/test/txvalidation_tests.cpp b/src/test/txvalidation_tests.cpp
index 00fd7fef12a..aa30129361f 100644
--- a/src/test/txvalidation_tests.cpp
+++ b/src/test/txvalidation_tests.cpp
@@ -52,8 +52,6 @@ BOOST_FIXTURE_TEST_CASE(tx_mempool_reject_coinbase, TestChain100Setup)
     // Check that the validation state reflects the unsuccessful attempt.
     BOOST_CHECK(state.IsInvalid());
     BOOST_CHECK_EQUAL(state.GetRejectReason(), "coinbase");
-
-    BOOST_CHECK_EQUAL(state.IsInvalid(), true);
     BOOST_CHECK(state.GetReason() == ValidationInvalidReason::CONSENSUS);
 }
 
diff --git a/src/validation.cpp b/src/validation.cpp
index e1f562ffbfd..20759bf96e7 100644
--- a/src/validation.cpp
+++ b/src/validation.cpp
@@ -888,7 +888,7 @@ static bool AcceptToMemoryPoolWorker(const CChainParams& chainparams, CTxMemPool
                 !CheckInputs(tx, stateDummy, view, true, scriptVerifyFlags & ~SCRIPT_VERIFY_CLEANSTACK, true, false, txdata)) {
                 // Only the witness is missing, so the transaction itself may be fine.
                 state.Invalid(ValidationInvalidReason::TX_WITNESS_MUTATED, false,
-                          state.GetRejectCode(), state.GetRejectReason(), state.GetDebugMessage());
+                        state.GetRejectCode(), state.GetRejectReason(), state.GetDebugMessage());
             }
             return false; // state filled in by CheckInputs
         }
@@ -1980,7 +1980,7 @@ bool CChainState::ConnectBlock(const CBlock& block, CValidationState& state, CBl
                     // CheckTxInputs may return MISSING_INPUTS but we can't return that, as
                     // it's not defined for a block, so we reset the reason flag to CONSENSUS here.
                     state.Invalid(ValidationInvalidReason::CONSENSUS, false,
-                              state.GetRejectCode(), state.GetRejectReason(), state.GetDebugMessage());
+                            state.GetRejectCode(), state.GetRejectReason(), state.GetDebugMessage());
                 }
                 return error("%s: Consensus::CheckTxInputs: %s, %s", __func__, tx.GetHash().ToString(), FormatStateMessage(state));
             }
@@ -3341,8 +3341,6 @@ static bool ContextualCheckBlock(const CBlock& block, CValidationState& state, c
     // the block hash, so we couldn't mark the block as permanently
     // failed).
     if (GetBlockWeight(block) > MAX_BLOCK_WEIGHT) {
-        // We can call this a consensus failure as any data-providers who provided
-        // us with witness data can be expected to support SegWit validation.
         return state.Invalid(ValidationInvalidReason::CONSENSUS, false, REJECT_INVALID, "bad-blk-weight", strprintf("%s : weight limit failed", __func__));
     }

Also if this version is not actually easier to review I’m happy to go back to the original or try another approach.

jnewbery added this to the "Blockers" column in a project

in src/validation.cpp:3123 in 94c2cdb880 outdated

3120-        if (!CheckTransaction(*tx, state, true))
3121-            return state.Invalid(false, state.GetRejectCode(), state.GetRejectReason(),
3122-                                 strprintf("Transaction check failed (tx hash %s) %s", tx->GetHash().ToString(), state.GetDebugMessage()));
3123+    for (const auto& tx : block.vtx) {
3124+        if (!CheckTransaction(*tx, state, true)) {
3125+            LogPrintf("Transaction check failed (tx hash %s) %s\n", tx->GetHash().ToString(), state.GetDebugMessage());

ryanofsky commented at 10:22 pm on January 24, 2019:

In commit “Check transactions just logs a message” (94c2cdb88049af5283a7c1f52ea6e52ac2946686)

Could you update the commit message to say whether this commit changes behavior at all, and what the motivation is? At first glance it seems like this probably doesn’t change behavior, and the only motivation is to simplify code. But I could easily be missing something.

sdaftuar commented at 4:50 pm on January 29, 2019:

Done

ryanofsky commented at 7:35 pm on March 5, 2019:

re: #15141 (review)

Done

Thanks, new commit is “Remove redundant state.Invalid() call after CheckTransaction()” (59ff8e67c2c62ec11d76d3d1b54dc4829363ad5e)

in src/net_processing.cpp:826 in 8226bed419 outdated

821+}
822+
823+static bool MaybePunishNode(NodeId nodeid, const CValidationState& state, bool via_compact_block, const std::string& message = "") {
824+    int nDoS = state.GetDoS();
825+    if (nDoS > 0 && !via_compact_block) {
826+         LOCK(cs_main);

ryanofsky commented at 10:31 pm on January 24, 2019:

In commit “[refactor] Use maybepunish etc” (8226bed4191a50129ac6fdbcb8fad5e1c6b7cacd)

Note: This acquires lock recursively in PeerLogicValidation::BlockChecked. Seems fine, but just wanted to note it wasn’t happening before.
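For readers unfamiliar with recursive locking, here is a minimal sketch of the pattern being described. It assumes the lock behaves like a `std::recursive_mutex`; the names `g_main_mutex`, `InnerPunish`, and `OuterBlockChecked` are hypothetical stand-ins and not code from this PR.

```cpp
#include <mutex>

// Stand-in for cs_main. A recursive mutex is assumed here so that
// re-locking from the same thread is well-defined.
static std::recursive_mutex g_main_mutex;

// Hypothetical inner helper that takes the lock itself,
// in the way MaybePunishNode does in the snippet above.
inline int InnerPunish(int score)
{
    std::lock_guard<std::recursive_mutex> lock(g_main_mutex);
    return score;
}

// Hypothetical caller that already holds the lock when it calls the helper,
// in the way BlockChecked is described as doing.
inline int OuterBlockChecked(int score)
{
    std::lock_guard<std::recursive_mutex> lock(g_main_mutex);
    return InnerPunish(score); // second acquisition on the same thread
}
```

With a non-recursive mutex the inner acquisition would be undefined behavior, which is why the change is worth noting even though it is harmless here.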

in src/validation.cpp:1434 in e534b0b78b outdated

1431-                    // invalid in new blocks, e.g. an invalid P2SH. We DoS ban
1432-                    // such nodes as they are not following the protocol. That
1433-                    // said during an upgrade careful thought should be taken
1434-                    // as to the correct behavior - we may want to continue
1435-                    // peering with non-upgraded nodes even after soft-fork
1436-                    // super-majority signaling has occurred.

ryanofsky commented at 10:46 pm on January 24, 2019:

In commit “[refactor] Update some comments in validation.cpp as we arent doing DoS there” (e534b0b78bec49750421b5f52012b857df197e24)

Why remove this comment entirely?

sdaftuar commented at 4:51 pm on January 29, 2019:

Updated with a new comment.

ryanofsky commented at 7:36 pm on March 5, 2019:

re: #15141 (review)

Updated with a new comment.

Thanks, new commit is “[refactor] Update some comments in validation.cpp as we arent doing DoS there” (9b7978efe3d127fa7833d6561a9d053c6820dc1b)

ryanofsky commented at 10:49 pm on January 24, 2019: member

Started review (will update list below with progress).

59ff8e67c2c62ec11d76d3d1b54dc4829363ad5e Remove redundant state.Invalid() call after CheckTransaction() (1/27)
e7edc15f86fcd247d92273cb7ea094d59e1faefa drop obsolete comment (2/27)
f3fd64cc049a2056b2fcbed136ff9e487db57a25 [refactor] stateDummy -> orphan_state (3/27)
4159f7ca7b449195f0e8a3f67a4045409c703e9d [refactor] Use maybepunish etc (4/27)
bfa94c76c607c741e1eca7ffdd1bef99271ea37d Update comment to reference MaybePunishNode (5/27)
70906c5b410027919e7eda2932093be3e69b18f3 [refactor] drop IsInvalid(nDoSOut) (6/27)
858bae6104ba7d16a637920d7802bb4be4c64994 set nDoS rather than bumping it (7/27)
96cedc8d0c0e3ad279bc2223a7fc3185b17ebde5 Clean up banning levels (8/27)
6e27c500f254ffb009cb54a06bb9861585c0d126 === end of functionality changes (9/27)
ac3873e2a92457995f7e5a9e5fc24352af360c6b [refactor] Add useful-for-dos “reason” field to CValidationState (10/27)
bee1d4f5e29c8c447ac47a608240b38216750072 TX_MISSING_INPUTS now has a DoS score of 0 (11/27)
95d7de9ab85f41aafed8a3cccbd09481497a5cb8 ==== start switch to reasons (12/27)
6e3332a76f227b6dd6068513807934fd1b3d936b [refactor] Drop redundant nDoS, corruptionPossible, SetCorruptionPossible (13/27)
c558ebaa6d02154eaf762a28d2a7e954acee0661 LookupBlockIndex -> CACHED_INVALID (14/27)
8ed1801e06bcac1ce5dce594a9bc548db3b54fc2 CorruptionPossible -> TX_WITNESS_MUTATED (15/27)
3048533275227e67ce22931c6360513bddbd1767 CorruptionPossible -> BLOCK_MUTATED (16/27)
346699322ca820c5d95c255386df3ce1fb1f3d11 [refactor] Use Reasons directly instead of DoS codes (17/27)
9dd6fc18658b36b63b9f264676ac484879597b83 Fix handling of invalid headers (18/27)
5e85f548fb305dde9de0a4f9a309ef1ff9f5b764 ==== drop nDoS info (19/27)
e5f43f3239dc60680646b79f65e9cde745abd760 Allow use of state.Invalid() for all reasons (20/27)
5507feabe7bfc8fe599c0505cb64bf33ddb0ded6 [refactor] Prep for scripted-diff by removing some \ns which annoy sed. (21/27)
c664daf1530f58feb1a1fccd2e5ed80563389126 scripted-diff: Remove DoS calls to CValidationState (22/27)
8e4590e522d4903d970cdaafb95e4fdfccf792fb [refactor] Drop unused state.DoS(), state.GetDoS(), state.CorruptionPossible() (23/27)
f494f78a1a8b82cef5e908588ec362296dba2188 ==== cleanup (24/27)
9b7978efe3d127fa7833d6561a9d053c6820dc1b [refactor] Update some comments in validation.cpp as we arent doing DoS there (25/27)
99d96897b2eb7300cc09389edaf2537f4d45f95b [refactor] swap if/else order (26/27)
7682566acd7b04aee9425772823e77f230268ad8 nit: reason -> m_reason (27/27)

sdaftuar force-pushed on Jan 29, 2019

sdaftuar commented at 4:52 pm on January 29, 2019: member

I addressed @ryanofsky’s comments so far (which rewrote the git history, since one of the commit messages changed, so I also squashed in a comment change as well). Previous version of this PR is now here: https://github.com/sdaftuar/bitcoin/commits/15141.1.

DrahtBot added the label Needs rebase on Feb 8, 2019

sdaftuar commented at 5:57 pm on February 8, 2019: member

This needs a simple rebase, but can I get concept ACK/NACK from more reviewers on whether the reworked form of this PR (which broke things up into many more commits) is preferable compared to the original formulation?

sipa commented at 6:43 pm on February 8, 2019: member

I haven’t reviewed the last few commits yet (only up to “[refactor] Use Reasons directly instead of DoS codes”), but so far the structure is very clear. Concept ACK on that.

sdaftuar force-pushed on Feb 8, 2019

sdaftuar commented at 7:31 pm on February 8, 2019: member

Thanks @sipa. Rebased. Prior version is here: 15141.2

DrahtBot removed the label Needs rebase on Feb 8, 2019

sipa commented at 8:19 pm on February 8, 2019: member

One overall comment: it seems there is a subset of ValidationInvalidReasons that are valid for transactions, and another subset that is valid for blocks. Perhaps it’s useful to have functions to test whether one belongs to those sets, and invoke those functions in assertions after validation returns in their respective contexts. That seems a bit more future-proof than just having comments of the form “CheckTxInputs may return MISSING_INPUTS but we can’t return that”. It would also make me a bit more comfortable with changes to checks from CorruptionPossible() to testing for a specific invalidity reason (assuming we know TX_WITNESS_MUTATED is the only tx-valid corruptionpossible one, and BLOCK_MUTATED the only block-valid corruptionpossible one).
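A minimal sketch of what such membership checks could look like. The enum values are the ones named in this thread (the real enum has more); the helper names `IsTransactionReason`, `IsBlockReason`, and `AssertBlockReason`, and the assignment of each value to a set, are assumptions for illustration and not the PR's code.

```cpp
#include <cassert>

// Subset of the reasons mentioned in this thread.
enum class ValidationInvalidReason {
    NONE,
    CONSENSUS,
    CACHED_INVALID,
    BLOCK_MUTATED,
    TX_MISSING_INPUTS,
    TX_WITNESS_MUTATED,
};

// Hypothetical helper: reasons that may be reported for a transaction.
constexpr bool IsTransactionReason(ValidationInvalidReason r)
{
    return r == ValidationInvalidReason::NONE ||
           r == ValidationInvalidReason::CONSENSUS ||
           r == ValidationInvalidReason::TX_MISSING_INPUTS ||
           r == ValidationInvalidReason::TX_WITNESS_MUTATED;
}

// Hypothetical helper: reasons that may be reported for a block.
constexpr bool IsBlockReason(ValidationInvalidReason r)
{
    return r == ValidationInvalidReason::NONE ||
           r == ValidationInvalidReason::CONSENSUS ||
           r == ValidationInvalidReason::CACHED_INVALID ||
           r == ValidationInvalidReason::BLOCK_MUTATED;
}

// Hypothetical call site: check the invariant once block validation returns.
inline void AssertBlockReason(ValidationInvalidReason r)
{
    assert(IsBlockReason(r));
}
```

Under this partition, a `TX_MISSING_INPUTS` reason leaking out of block validation would trip the assertion instead of relying on a comment to document the constraint.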

in src/net_processing.cpp:359 in 7682566acd outdated

355+
356+    //! Whether this peer is a manual connection
357+    bool m_is_manual_connection;
358+
359+    CNodeState(CAddress addrIn, std::string addrNameIn, bool is_inbound, bool is_manual) :
360+        address(addrIn), name(addrNameIn), m_is_inbound(is_inbound),

sipa commented at 8:58 pm on February 8, 2019:

Nit: you can use name(std::move(addrNameIn)) here to avoid a copy.
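To illustrate the nit, here is a reduced sketch. `NodeStateSketch` is a hypothetical stand-in for `CNodeState` that keeps only the member under discussion.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for CNodeState, reduced to a single member.
struct NodeStateSketch {
    std::string name;

    // The parameter is taken by value, so it is already a private copy
    // (or was itself move-constructed from a temporary). Moving it into
    // the member avoids a second copy of the string's contents.
    explicit NodeStateSketch(std::string addrNameIn) : name(std::move(addrNameIn)) {}
};
```

Without `std::move`, `name(addrNameIn)` copy-constructs the member from the named parameter, since a named parameter is an lvalue inside the constructor.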

sdaftuar commented at 4:58 pm on March 2, 2019:

Fixed.

in src/net_processing.cpp:1018 in 7682566acd outdated

1018+            }
1019+
1020+            // Disconnect outbound (but not inbound) peers if on an invalid chain.
1021+            // Exempt HB compact block peers and manual connections.
1022+            if (!via_compact_block && !node_state->m_is_inbound && !node_state->m_is_manual_connection) {
1023+                Misbehaving(nodeid, 100, message);

sipa commented at 9:05 pm on February 8, 2019:

The comment says “disconnect”, but the DoS score will also cause a ban here. Is that intentional? (it seems it’s retaining existing behavior, so I assume it is).

sdaftuar commented at 3:28 pm on February 10, 2019:

This is actually a behavior change from existing behavior, but hopefully a relatively harmless one. Here’s the relevant snippet from master:

https://github.com/bitcoin/bitcoin/blob/2945492424934fa360f86b116184ee8e34f19d0a/src/net_processing.cpp#L1552-L1585

It’s a bit hard to decipher because of the multiple layers going on here, but basically punish_duplicate_invalid is only set to true for outbound and non-manual peers relaying us headers outside of HB compact block mode, and nDoS is set to 0 for the cached-invalid case:

https://github.com/bitcoin/bitcoin/blob/2945492424934fa360f86b116184ee8e34f19d0a/src/validation.cpp#L3341-L3357

So this does indeed result in a ban rather than just a disconnect for an outbound peer that announces an invalid header. If there’s no reason that this is materially worse than other ban behaviors that exist, then I think I’d prefer to stick with this behavior change for now, and try to improve ban-behavior globally after this PR (which should become much easier, now that banning is contained to one place in the code in a more understandable way).

I can update the comment though to make this clearer.
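As a rough sketch of the decision described above: the predicate mirrors the condition in the snippet under review, while the function names and the threshold constant are hypothetical. The threshold value of 100 is an assumption about the default ban score.

```cpp
// Assumed default ban threshold: a score at or above it leads to a ban.
constexpr int kBanThresholdSketch = 100;

// Hypothetical predicate mirroring the reviewed condition: only outbound,
// non-manual peers outside HB compact block mode are punished.
constexpr bool ShouldPunishInvalidHeader(bool via_compact_block, bool is_inbound, bool is_manual)
{
    return !via_compact_block && !is_inbound && !is_manual;
}

// Hypothetical helper: a score of 100 reaches the threshold, so the peer
// is banned rather than only disconnected.
constexpr bool ScoreCausesBan(int score)
{
    return score >= kBanThresholdSketch;
}
```

Under these assumptions, `Misbehaving(nodeid, 100, message)` for a qualifying peer is a ban, which is the behavior change discussed in this thread.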

ryanofsky commented at 10:02 pm on February 11, 2019:

re: #15141 (review)

Note: Thread pertains to commit “Fix handling of invalid headers” (9dd6fc18658b36b63b9f264676ac484879597b83)

TheBlueMatt commented at 7:21 pm on February 21, 2019:

This appears to be correct to me. Obviously should update the comment to note this.

sdaftuar commented at 5:01 pm on March 2, 2019:

Fixed comment in latest commit.

MarcoFalke deleted a comment on Feb 10, 2019

in src/consensus/tx_verify.cpp:163 in 96cedc8d0c outdated

159@@ -160,9 +160,9 @@ bool CheckTransaction(const CTransaction& tx, CValidationState &state, bool fChe
160 {
161     // Basic checks that don't depend on any context
162     if (tx.vin.empty())
163-        return state.DoS(10, false, REJECT_INVALID, "bad-txns-vin-empty");
164+        return state.DoS(100, false, REJECT_INVALID, "bad-txns-vin-empty");

ryanofsky commented at 8:55 pm on February 11, 2019: