In #6931 we introduced a possible consensus breaking change by misunderstanding how completely BIP 34 obviated the need for BIP 30. Unfixed, this could break consensus after block height about 1.9M. Explained in code comment.
h/t @sdaftuar
1881 | + // The second case is block 176,684 which has an indicated block height of 1882 | + // 490,897. Unfortunately this issue was not discovered until about 2 weeks 1883 | + // before block 490,897 so there was not much opportunity to address this 1884 | + // case other than to carefully analyze it and determine it would not be a 1885 | + // problem. Block 490,897 was, in fact, mined with a different coinbase 1886 | + // than 176,684, but it is important to note that even if it hadn't
Maybe put "block" in front of blocks being named by number. (and everywhere else)
I was reading it as the the height in coinbase aka the number 176,684 in the block's coinbase at height 490,897.
1863 | + // the BIP34 height of 227,931 which have a scriptSig with a valid-looking 1864 | + // BIP34 height for a later block height. These coinbases run the risk of 1865 | + // being duplicated when we reach that later block height. An exhaustive 1866 | + // search of all mainnet coinbases before the BIP34 height which have a 1867 | + // scriptSig indicating a later block height reveals 2 cases where the later 1868 | + // block height is before 1,983,702.
Is the significance of 1,938,702 just that it is the third lowest height found encoded in the old coinbases? The sentence doesn't say where 1,938,702 comes from, and this is written like it should be obvious what the number means.
If it is just the third lowest height found in the search, I think it would be clearer to say something like "An exhaustive search of all blocks before the BIP34 height found that the lowest height values indicated in coinbase scriptSigs were: 209,921, 490,897, and 1,983,702."
1857 | @@ -1858,12 +1858,58 @@ bool CChainState::ConnectBlock(const CBlock& block, CValidationState& state, CBl 1858 | // before the first had been spent. Since those coinbases are sufficiently buried its no longer possible to create further 1859 | // duplicate transactions descending from the known pairs either. 1860 | // If we're on the known chain at height greater than where BIP34 activated, we can save the db accesses needed for the BIP30 check. 1861 | + 1862 | + // The exception to this logic is if there are coinbases from blocks before
Rather than start your new comment explaining where the previous comment is incorrect, I think it makes sense to remove/reword the paragraph above.
I also think it would make the whole comment clearer if you explicitly introduced terminology at the top of the comment. Something like:
- block height: the height of the block in the blockchain
- coinbase height: the height indicated in the coinbase transaction's scriptSig under BIP34 rules.
All blocks with block height >= 227,931 must have coinbase height equal to block height. This is a consensus rule.
rather than sprinkling the rest of the comment with multiple terms for the same concept: a scriptSig with a valid-looking BIP34 height..., scriptSig indicating..., indicated block height, coinbase with indicated height.
I think it is useful to understand the basic logic that BIP 34 implies BIP 30 is met and doesn't need to be checked. But then understand why it's not quite that simple. It also matches the historical evolution of the code. So I left this comment as correction to the previous one.
concept ACK 9ab7fe0b3f50b2cc3514975b26eb7c63134bdd40.
I'm currently running my own analysis of pre-BIP34 blocks to make sure that it agrees with what you found. It would be good to have a gist showing those results with code so other people can easily validate for themselves.
In general, I'm not sure whether code comments are appropriate for such a long and nuanced description. It certainly breaks up the flow of the ConnectBlock(). However, more verbose is generally better than less, and I don't have a better suggestion of where to put this level of comment that would stay maintained with the code.
1866 | + // search of all mainnet coinbases before the BIP34 height which have a 1867 | + // scriptSig indicating a later block height reveals 2 cases where the later 1868 | + // block height is before 1,983,702. 1869 | + 1870 | + // There are many cases which indicate block height 1,983,702 or later, so 1871 | + // we simply remove the optimiztion to skip BIP30 checking for blocks at
typo: optimization
Concept ACK.
code-level utACK, modulo comment comments.
Concept ACK. Though, I have never checked if any of the heights or hashes make sense. Wouldn't it help review if #6931 or this pull actually included the code that was used to determine those magic numbers?
My analysis matches yours exactly. The pre-BIP34 blocks with coinbase height > block height are (in ascending order):
block_height | coinbase_height 209920 | 209921 176684 | 490897 164384 | 1983702 169895 | 3708179 <---- I expect to be dead before this height ....
so you're right that the only heights that we should care about are 209921, 490897 and 1983702.
If anyone wants to re-run the analysis, the gist with my code is here: https://gist.github.com/jnewbery/df0a98f3d2fea52e487001bf2b9ef1fd
(this is quite slow since it's hitting the RPC interface 3 times for each block. It'd be faster to deserialize block files directly, patch bitcoind to print this data itself, or use BlockSci or a similar block analysis tool)
code utACK, haven't actually read through the whole comment.
I agree with the above analysis that block 1983702 is the lowest block height that we need to worry about. We could soft fork in something prior to that block to guarantee that the txids will be different, e.g. block height in the nLockTime field.
Comment updated to hopefully be clearer. Code unchanged.
For reference here is how I double checked the possible violations. I patched bitcoind with this snippet and then ran with -reindex-chainstate.
@@ -1724,6 +1724,16 @@ static bool ConnectBlock(const CBlock& block, CValidationState& state, CBlockInd
}
}
}
+ if (block.vtx[0]->vin[0].scriptSig.size() >= 4 && block.vtx[0]->vin[0].scriptSig[0] == 3) {
+ std::vector<unsigned char> vch;
+ for (int i=1;i<4;i++)
+ vch.push_back( block.vtx[0]->vin[0].scriptSig[i]);
+ CScriptNum height(vch,false);
+ int height_indicated = height.getint();
+ if (height_indicated > pindex->nHeight) {
+ fprintf(stdout,"Potential future duplicate: block %d has indicated height %d\n",pindex->nHeight,height_indicated);
+ }
+ }
}
// Start enforcing BIP68 (sequence locks) and BIP112 (CHECKSEQUENCEVERIFY) using versionbits logic.
Tested ACK 3f3b991
Looks sensible to me. Although I think we should definitely make this a chain parameter, or at least define a constant, instead of hardcoding this magic number:
Optional TODO: Have BIP30 checking activate at a per chain block height.
I don't think we're in enough of a hurry here to skip that.
@laanwj I'm happy to define it as a constant locally, but it seems like adding it as a chain parameter just adds unnecessary cruft to the code. This is such an obscure (and irrelevant for decades) element of the consensus that it seems better to keep it entirely contained to one spot in the code base, so it doesn't occupy mental energy in other places. That said, will defer to your judgement...
EDIT: Or if we think testnet3 will get longer than 1.9 million blocks, then I suppose that's an argument for per chain so we don't do unnecessary calculations on testnet3.
Nono I'm OK with that, but if we actively argue against making it a chain parameter we should remove the TODO as well, otherwise it's inviting someone else to make that change. My only point is that there is no hurry to merge this, so no reason to leave unfinished TODOs in the code.
updated to a constant and removed optional TODO, I believe its better to keep this mess contained to one section of the code
utACK 5b8b387752e8c493a8e87795ae6ddb78b45b1a5d
Difference is that there is now a constexpr int BIP34_IMPLIES_BIP30_LIMIT and the comment has been updated.
Travis failed because of a timeout. I'll kick it.
ACK