Update nChainTx to 64bit type #29258

issue ajtowns openend this issue on January 17, 2024
  1. ajtowns commented at 7:42 am on January 17, 2024: contributor

    https://github.com/bitcoin/bitcoin/blob/8106b268cde8e97a7c330afdda39b6bb55e5574a/src/chain.h#L178-L186

    Hey, it’s 2024! nChainTx seems to currently be 953368191 (height 826100), which is hex 38d3 3e7f, so we’re only 22% of the way to an overflow, and would still need another 200k blocks (~ four years) full of 60 byte transactions to get there.

  2. ajtowns commented at 7:51 am on January 17, 2024: contributor
  3. willcl-ark added the label Validation on Jan 24, 2024
  4. russeree commented at 4:24 pm on January 27, 2024: contributor
    Would this be a hard fork because it would be loosening the conditions for validation?
  5. mzumsande commented at 4:29 pm on January 29, 2024: contributor

    Would this be a hard fork because it would be loosening the conditions for validation?

    I don’t think so, since I don’t think the value of nChainTx is used in validation (or do you see a spot where it is?). It’s used for logging (progress estimation), and whether it is 0 is used in HaveNumChainTxs. The latter could become problematic in case of an overflow, since if the unsigned int would overflow exactly to zero (which is a bit unlikely given that there are a few thousand txns in each block but ceratainly possible) the node would crash in https://github.com/bitcoin/bitcoin/blob/478ac185be49feffd1231366c07e06068683f941/src/validation.cpp#L3017.

  6. russeree commented at 4:44 pm on January 29, 2024: contributor
    ^^^ The above was the line that I was referencing. Thank you.
  7. Sjors commented at 11:26 am on February 6, 2024: member

    There was also the idea of dropping this number entirely: #13875 (comment)

    Though see also #13875 (comment)

  8. maflcko commented at 11:44 am on February 6, 2024: member

    Though see also #13875 (comment)

    This is nTx, not nChainTx.

  9. maflcko commented at 12:09 pm on February 6, 2024: member

    For reference, I removed nChainTx from consensus in #29349. The next steps would be to add a member to CChain to account for the sum of nTx, and update it when SetTip is called. Then, update the RPC, wallet and estimate functions to derive the nChainTx they need from the nChainTx at the tip, walking the relevant portion of the chain.

    This is possible, but I am not sure if it is worth it, because:

    • It is more code overall.
    • The RPC, wallet, and logging code is slower when it has to walk the chain.

    The benefit mainly seems to be to reduce the memory footprint? On 64-bit platforms, sizeof(CBlockIndex) is 144. Adding 4 bytes or removing 4 bytes is less than 10% of cost or savings. Ref: https://godbolt.org/z/rKEPYTh4j

    Removing nTx is even harder, so less worth it. Especially, given that it could be reduced to 16 bits in the memory layout, if needed.

    This leads to an alternative solution to not increase the memory footprint: Move the 32+32 bits used for nTx and nChainTx into a single 64-bit blob, where the 16 bit are for nTx and the remaining for nChainTx. But, again, the overall heap memory usage shouldn’t change much, either way.

  10. ryanofsky commented at 8:08 pm on February 6, 2024: contributor

    This leads to an alternative solution to not increase the memory footprint: Move the 32+32 bits used for nTx and nChainTx into a single 64-bit blob, where the 16 bit are for nTx and the remaining for nChainTx. But, again, the overall heap memory usage shouldn’t change much, either way.

    This does seem like an nice and easy solution. Would be good to incorporate in #29331

    As a tangent, I was thinking along the same lines last week before implementing #29370 and came up with worse solution, taking advantage of the fact that when nChainTx is set, you can derive nTx by subtracting nChainTx - pprev->nChainTx. In case it’s interesting:

     0class CBlockIndex
     1{
     2public:
     3    CBlockIndex* pprev{nullptr};
     4
     5    // Indicate whether num_tx_value contains total number of transactions in
     6    // this block, or total number of transactions in the chain up and including
     7    // this block.
     8    enum NumTxType { UNSET = 0, BLOCK_TX = 1, CHAIN_TX = 2 };
     9    NumTxType num_tx_type : 4 {UNSET};
    10    uint64_t num_tx_value : 60;
    11
    12    uint64_t NumBlockTx()
    13    {
    14        if (num_tx_type == BLOCK_TX) return num_tx_value;
    15        if (num_tx_type == CHAIN_TX) {
    16            assert(pprev && pprev->num_tx_type == CHAIN_TX);
    17            return num_tx_value - pprev->num_tx_value;
    18        }
    19        return 0;
    20    }
    21
    22    uint64_t NumChainTx()
    23    {
    24        return num_tx_type == CHAIN_TX ? num_tx_value : 0;
    25    }
    26};
    
  11. maflcko commented at 12:32 pm on February 7, 2024: member

    This does seem like an nice and easy solution. Would be good to incorporate in #29331

    Not sure if it so easy. On 32-bit platforms, nTx currently starts at byte 60, so making it 16-bit will still eat 32-bit with padding. So forcing a 64-bit blob of both values is needed, but that requires changing all code that uses either field. Not sure if worth it for a +-5% memory difference?

  12. ryanofsky commented at 3:59 pm on February 8, 2024: contributor

    Not sure if it so easy. On 32-bit platforms, nTx currently starts at byte 60, so making it 16-bit will still eat 32-bit with padding.

    I didn’t test on a 32-bit platform, but aren’t bitfields a good solution to this? The following seems to work for me, increasing nChainTx range without increasing CBlockIndex size on x86_64 (144 bytes):

     0--- a/src/chain.h
     1+++ b/src/chain.h
     2@@ -178,7 +178,7 @@ public:
     3     //! Note: this value is faked during UTXO snapshot load to ensure that
     4     //! LoadBlockIndex() will load index entries for blocks that we lack data for.
     5     //! [@sa](/bitcoin-bitcoin/contributor/sa/) ActivateSnapshot
     6-    unsigned int nTx{0};
     7+    uint16_t nTx : 16 {0};
     8 
     9     //! (memory only) Number of transactions in the chain up to and including this block.
    10     //! This value will be non-zero only if and only if transactions for this block and all its parents are available.
    11@@ -188,7 +188,7 @@ public:
    12     //! have the underlying block data available during snapshot load.
    13     //! [@sa](/bitcoin-bitcoin/contributor/sa/) AssumeutxoData
    14     //! [@sa](/bitcoin-bitcoin/contributor/sa/) ActivateSnapshot
    15-    unsigned int nChainTx{0};
    16+    uint64_t nChainTx : 48 {0};
    17 
    18     //! Verification status of this block. See enum BlockStatus
    19     //!
    20@@ -412,7 +412,12 @@ public:
    21 
    22         READWRITE(VARINT_MODE(obj.nHeight, VarIntMode::NONNEGATIVE_SIGNED));
    23         READWRITE(VARINT(obj.nStatus));
    24-        READWRITE(VARINT(obj.nTx));
    25+
    26+        unsigned int num_tx;
    27+        SER_WRITE(obj, num_tx = obj.nTx);
    28+        READWRITE(VARINT(num_tx));
    29+        SER_READ(obj, obj.nTx = num_tx);
    30+
    31         if (obj.nStatus & (BLOCK_HAVE_DATA | BLOCK_HAVE_UNDO)) READWRITE(VARINT_MODE(obj.nFile, VarIntMode::NONNEGATIVE_SIGNED));
    32         if (obj.nStatus & BLOCK_HAVE_DATA) READWRITE(VARINT(obj.nDataPos));
    33         if (obj.nStatus & BLOCK_HAVE_UNDO) READWRITE(VARINT(obj.nUndoPos));
    34--- a/src/validation.cpp
    35+++ b/src/validation.cpp
    36@@ -5119,7 +5119,7 @@ double GuessVerificationProgress(const ChainTxData& data, const CBlockIndex *pin
    37 
    38     double fTxTotal;
    39 
    40-    if (pindex->nChainTx <= data.nTxCount) {
    41+    if (static_cast<int64_t>(pindex->nChainTx) <= data.nTxCount) {
    42         fTxTotal = data.nTxCount + (nNow - data.nTime) * data.dTxRate;
    43     } else {
    44         fTxTotal = pindex->nChainTx + (nNow - pindex->GetBlockTime()) * data.dTxRate;
    
  13. maflcko commented at 4:42 pm on February 8, 2024: member

    bitfields

    Jup, that didn’t work for me. Or maybe I did something wrong. See https://godbolt.org/z/z6EoWKbjY

    0   60:0-15 |   int16_t nTx
    1   64:0-47 |   int64_t nChainTx
    

    (nTx eats byte 60-64, even though it is only a two byte bitfield)

  14. ryanofsky commented at 5:13 pm on February 8, 2024: contributor

    Jup, that didn’t work for me. Or maybe I did something wrong. See https://godbolt.org/z/z6EoWKbjY

    It’s not that surprising that the compiler adds padding so the struct member with uint64_t type is inside a field aligned to 8 bytes. But if you move the two bitfield members to begin on an 8 byte boundary (like between the nDataPos and nUndoPos fields, or to the beginning of the struct, it does get packed without extra padding). So I think the diff above is an easy way to avoid consuming extra memory on 64 bit platforms, even though we could go further with more changes to get the same optimization on 32 bit.

  15. ajtowns commented at 5:33 pm on February 8, 2024: contributor

    bitfields

    Jup, that didn’t work for me. Or maybe I did something wrong. See https://godbolt.org/z/z6EoWKbjY

    0   60:0-15 |   int16_t nTx
    1   64:0-47 |   int64_t nChainTx
    

    (nTx eats byte 60-64, even though it is only a two byte bitfield)

    I think the problem here is you have an odd number of 32-bit pointers prior to nTx, so it’s misaligned to be combined with nChainTx. So you end up with nTx [4B-6B padding] nChainTx nStatus [4B padding]. Moving nStatus prior to nTx gives an even number of 32-bit objects prior to nTx, so drops the size back down.

  16. ryanofsky commented at 8:01 pm on March 8, 2024: contributor

    When updating this should also update the type of the AssumeutxoData::nChainTx field:

    https://github.com/bitcoin/bitcoin/blob/1cd2e29870e4ad3b7c57b1d567df0e6df56572b0/src/kernel/chainparams.h#L57


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-01 10:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me