signed overflow in coinstats index #26362

issue ajtowns openend this issue on October 21, 2022
  1. ajtowns commented at 11:52 am on October 21, 2022: contributor

    https://github.com/bitcoin/bitcoin/blob/6d4048468430d9d1fe5e7c5fcda13708879d1083/src/index/coinstatsindex.h#L33-L35 https://github.com/bitcoin/bitcoin/blob/6d4048468430d9d1fe5e7c5fcda13708879d1083/src/kernel/coinstats.h#L54-L59

    The values:

    • m_total_prevout_spent_amount
    • m_total_new_outputs_ex_coinbase_amount
    • m_total_coinbase_amount

    are all subject to overflowing an int64_t (ie, CAmount). This is visible in practice as of signet block 112516; with total_prevout_spent rolling over to -92230771076.81494784 (from from 92232270378.54920553 BTC at block 112515). With debug builds, this will cause a crash due to the -ftrapv compiler option.

    Making these just be per-block totals instead of cumulative totals should fix this in practice. I think that would require a full index rebuild, though?

    Even then I think it would still be technically possible to overflow m_block_prevout_spent_amount or m_total_new_outputs_ex_coinbase_amount with a single block: you can fit over 15,000 65-byte transactions spending to padding DROP TRUE in a block, and if you start with a utxo containing 6.15M BTC and chain it through 15,000 txs, then that will overflow a signed 64 bit counter. That’s not reproducible on regtest, though, since the halving schedule there limits the total supply to ~15,000 regtest-BTC, which means you need about 6.15M transactions to trigger the same overflow.

    Since we don’t actually expose the cumulative totals; I observed the values above by patching gettxoutsetinfo in rpc/blockchain to add:

    0/*line ~877*/    {RPCResult::Type::STR_AMOUNT, "total_prevout_spent", "Total amount of all prevouts spent up to this block"},
    1/*line ~981*/    block_info.pushKV("total_prevout_spent", ValueFromAmount(stats.total_prevout_spent_amount));
    
  2. ajtowns added the label Bug on Oct 21, 2022
  3. maflcko added the label Brainstorming on Oct 21, 2022
  4. maflcko added the label UTXO Db and Indexes on Oct 21, 2022
  5. maflcko added the label RPC/REST/ZMQ on Oct 21, 2022
  6. ajtowns commented at 12:13 pm on October 21, 2022: contributor
    cc @fjahr
  7. mzumsande commented at 5:48 pm on October 21, 2022: contributor
    This can also be reproduced with the undefined sanitizer. Keeping running totals but using a user-defined datatype that is larger than int64_t could be an alternative fix. Though that would also require the index to be rebuilt - I can’t think of a way to fix this without a breaking change to the index.
  8. maflcko commented at 7:07 am on October 22, 2022: member
    Is it important to report this number? Why not remove it from the RPC result?
  9. fjahr commented at 7:58 pm on October 23, 2022: contributor

    Since we don’t actually expose the cumulative totals

    FWIW, they would be possible to get the cumulative numbers with #19792. The reason to use them rather than the per block values, was some of the following review feedback:

    • Cumulative seems to be easier to use downstream #19521 (comment)
    • It’s easier to go from cumulative to per-block rather than the other way around #19521 (comment)

    I haven’t gotten much feedback on use of the index lately and the use of the cumulative numbers was only in combination with not-yet-merged #19792 so I think removing the cumulative numbers should not be causing too many complaints. Pinging @PierreRochard

    Is it important to report this number? Why not remove it from the RPC result?

    IIRC users that were interested in auditing the coin supply in detail were very interested in this, so I would much rather like to fix it than remove it.

    I can’t think of a way to fix this without a breaking change to the index.

    Me neither as of now but I had some ideas on “gracefully” upgrading indices because there were so many ideas floating around for coinstatsindex two years ago. Will check if that is valuable in any way.

  10. maflcko commented at 7:24 am on October 24, 2022: member

    IIRC users that were interested in auditing the coin supply in detail were very interested in this, so I would much rather like to fix it than remove it.

    Wouldn’t it be easier to audit the supply if the equivalent metrics are reported: fee, subsidy, unspendable, unclaimed. At least per-block those should all be less than MAX_MONEY.

  11. ajtowns commented at 12:54 pm on October 24, 2022: contributor

    I don’t think a total/per-block cumulative prevout_spent_amount is very interesting for auditing purposes – it’s easy to produce large values just by cycling funds to yourself. I think the interesting identities are sum(coinbase-fees) + sum(unclaimed) = sum(subsidy) = spendable + unspendable and each of those values and each of the sums are bounded by MAX_MONEY (though sum(coinbase) and sum(fees) independently would be unbounded).

    Note that total_out in getblockstats has the same edge-case. Adding it was suggested at #10757 (comment)

  12. fjahr commented at 0:30 am on October 31, 2022: contributor

    I have a draft for changing the cumulative values to be per block. So far I have not gotten any feedback but I think this should be ok to change as the feedback in the original PR that I linked above has not making it sound like a hard requirement to have cumulative values. I will clean up the PR and test it more thoroughly in the next couple of days but I tested it on signet already and should fix the issue there.

    For the approach to upgrade: my current thinking is that it is probably best to build the new version of the index in a new folder and give the user a notice in the logs that they can delete the old index folder. While many users may have this useless data on their disk for a while because they ignore the logs, it seems like the cleanest approach to me. Of course the old index folder can also be deleted automatically but I am not sure this is desirable. I am thinking of users that may jump between different versions of core, maybe the latest release and master. When switching between versions before and after the upgrade, this approach should produce the least side effects. Worst case I think they delete the old index after the reading of the log message but then when they switch back to an older version they see the old index being rebuilt. But if the new index is built in the same folder, it would be much worse and the index would be corrupted leading to very confusing issues for these users. I am sure something could be done to mitigate that but it seems like much more complex solution that is not absolutely necessary.

    Happy to hear your feedback, especially on the approach to upgrade!

  13. luke-jr commented at 7:29 pm on November 5, 2022: member

    Upgrade approach looks okay to me for this, but maybe the index format should be rethought so new fields can be added without repeating it again.

    Also I’m wondering if maybe there should be a blockhash<->id mapping shared by all the indexes…

  14. achow101 commented at 4:02 pm on July 6, 2024: member
    Just ran into this.
  15. fjahr commented at 4:04 pm on July 6, 2024: contributor
    I will reopen my fix, I just got lost on the the upgrade mechanics for indices and then lost focus.
  16. fjahr commented at 10:47 am on July 17, 2024: contributor
    #30469 is a new try at fixing this

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-09-28 22:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me