txindex: use 5-byte siphash keys to optimize disk usage #35531

pull andrewtoth wants to merge 4 commits into bitcoin:master from andrewtoth:txindex_optimization changing 5 files +225 −22
  1. andrewtoth commented at 10:33 PM on June 14, 2026: contributor

    The current txindex uses the full 32-byte txid as keys, which takes up about 66 GB of disk space today on mainnet. Using a 5-byte key prefix instead drops the disk usage to 32 GB - cutting the size to less than half.

    Using the full 32-bytes is unnecessary since a 5-byte salted siphash will produce collisions in about 1 in 1.1 trillion. Some collisions will occur, but the penalty is just an extra disk read, deserialization and hash. The tx position can be appended to the key instead of used as a value, and a LevelDB iterator can seek to the prefix and then scan for the correct tx. This is an almost identical approach to txospenderindex.

    If a tx is not found with this method, we fallback to looking up the legacy entry. With this method a user with an existing db can opt to erase the indexes/txindex folder and reindex, or keep the current index and new entries will be appended with the smaller footprint.

    The time to index was faster on my machine with this method, 1h32m vs current 1h50m. Lookups are roughly the same, around 0.2ms per lookup with getrawtransaction. When testing on mainnet, I got ~860k 2-way collisions, and 1 3-way collision that worst case could cause an extra 2 false positives when reading.

  2. DrahtBot commented at 10:33 PM on June 14, 2026: contributor

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--006a51241073e994b41acfe9ec718e94-->

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/35531.

    <!--021abf342d371248e50ceaed478a90ca-->

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK l0rinc, optout21, theStack, sedited

    If your review is incorrectly listed, please copy-paste <code>&lt;!--meta-tag:bot-skip--&gt;</code> into the comment that the bot should ignore.

    <!--174a7506f384e20aa4161008e828411d-->

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #34729 (Reduce log noise by ajtowns)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

    <!--5faf32d7da4f0f540f40219e4f7537a3-->

    LLM Linter (✨ experimental)

    Possible places where named args for integral literals may be used (e.g. func(x, /*named_arg=*/0) in C++, and func(x, named_arg=0) in Python):

    • TxIndex(interfaces::MakeChain(m_node), 1_MiB, true) in src/test/txindex_tests.cpp

    <sup>2026-06-19 00:29:31</sup>

  3. andrewtoth renamed this:
    txindex: use siphash keys to optimize disk usage
    txindex: use 8-byte siphash keys to optimize disk usage
    on Jun 14, 2026
  4. sipa commented at 10:43 PM on June 14, 2026: member

    With 1.376e9 transactions in total, the chance of having at least one collision if each is given a 64-bit random identifier, is around 5%.

  5. andrewtoth commented at 10:46 PM on June 14, 2026: contributor

    @sipa Yes, apologies if I was not clear in the PR description, but collisions are handled with this change. When testing I had 2 sets of 2 txs that collided.

  6. sipa commented at 10:51 PM on June 14, 2026: member

    Oh, I see! Sorry, I saw the number of 1 in 18.4 quintillion and jumped to conclusions.

    Neat. You're essentially treating the database as a set of (txid siphash, tx data) pairs, rather than as a (txid siphash) -> (tx data) map, so it functions as a multimap instead.

  7. andrewtoth commented at 10:55 PM on June 14, 2026: contributor

    Thanks! Yes, collisions are rare enough that this will not have a noticeable read penalty. A collision may incur an extra disk read, deserialization and sha256 hash.

  8. sipa commented at 11:53 PM on June 14, 2026: member

    If existence of collisions isn't the criterion to judge this by, but the expected read amplification from those collisions, we can possibly go even lower?

    With 5-byte salted hashes, the expected amplication factor is under 1.002 per read, up to 2 billion txids, and would save a few extra gigabytes. That might even be a bigger speed win than the loss from that amplication. 4-byte hashes would give an amplication that's probably too much.

  9. andrewtoth force-pushed on Jun 15, 2026
  10. andrewtoth renamed this:
    txindex: use 8-byte siphash keys to optimize disk usage
    txindex: use 5-byte siphash keys to optimize disk usage
    on Jun 15, 2026
  11. andrewtoth commented at 2:30 AM on June 15, 2026: contributor

    Updated to use 5-byte siphash. The reindex was faster by 8 minutes, and the db size is now 32 GB. I got 860k 2-way collisions now though, instead of 2 before, and I got 1 3-way collision. The read penalty was not really measurable from my machine though, likely because I am using a fast laptop with fast directly connected storage.

  12. l0rinc commented at 10:40 AM on June 15, 2026: contributor

    Concept ACK, will play with this after we're done with the compactions

  13. in src/index/txindex.cpp:53 in e3f35ee417
      48 | +    return TxHashKeyPrefix{
      49 | +        static_cast<uint8_t>(siphash >> 56),
      50 | +        static_cast<uint8_t>(siphash >> 48),
      51 | +        static_cast<uint8_t>(siphash >> 40),
      52 | +        static_cast<uint8_t>(siphash >> 32),
      53 | +        static_cast<uint8_t>(siphash >> 24),
    


    optout21 commented at 11:05 AM on June 15, 2026:

    e3f35ee txindex: use 5-byte siphash keys to optimize disk usage:

    Can the const offsets be given in hex? Structure shows better through the hex numbers 18, 20, 28, 30, 38.


    optout21 commented at 11:08 AM on June 15, 2026:

    e3f35ee txindex: use 5-byte siphash keys to optimize disk usage:

    I may be shooting in the dark, but could it be that doing the right shifts incrementally (i.e., first by 24, then 4 times by 8; placed in reverse order), is slightly more efficient? (shifts with smaller size; a micro-optimization).


    andrewtoth commented at 1:03 AM on June 16, 2026:

    Done.


    andrewtoth commented at 1:04 AM on June 16, 2026:

    Hmm not sure that could really make a difference that would be visible to a user or to benchmarks here?


    optout21 commented at 11:55 AM on June 16, 2026:

    I was able to measure only minimal differences in performance. I microbenchmarked only the shift operations. The result was 3% speedup, or 0.000486 microsec per iteration (1518827 microsec vs 1470169, for 100000000 iterations). I think this is negligible.

    Please set this thread to Resolved.

    The two versions compared were:

                buf[0] = static_cast<uint8_t>(siphash >> 0x38);
                buf[1] = static_cast<uint8_t>(siphash >> 0x30);
                buf[2] = static_cast<uint8_t>(siphash >> 0x28);
                buf[3] = static_cast<uint8_t>(siphash >> 0x20);
                buf[4] = static_cast<uint8_t>(siphash >> 0x18);
    
                siphash >>= 24;
                buf[4] = static_cast<uint8_t>(siphash);
                siphash >>= 8;
                buf[3] = static_cast<uint8_t>(siphash);
                siphash >>= 8;
                buf[2] = static_cast<uint8_t>(siphash);
                siphash >>= 8;
                buf[1] = static_cast<uint8_t>(siphash);
                siphash >>= 8;
                buf[0] = static_cast<uint8_t>(siphash);
    

    sipa commented at 12:12 PM on June 16, 2026:

    If the CPU performance of constructing the hash is at all relevant (I don't know), we could consider a faster hash function (like #35215).


    andrewtoth commented at 12:36 PM on June 16, 2026:

    we could consider a faster hash function (like #35215).

    The unfortunate thing about this use case is that we would have to decide on this before a release, since changing the hash function after the fact would require a reindex.


    sipa commented at 1:26 PM on June 16, 2026:

    Indeed, changing it is painful once released.

    Just to assess whether that's worth investigating at all, would someone benchmark with and without the simplified siphash there?


    andrewtoth commented at 1:41 PM on June 16, 2026:

    I was looking to do that :). But, the hash function there is optimized for COutPoint, which has an extra 4 bytes after the uint256. @l0rinc is that sufficiently faster if we just pass a constant as the extra, or is there a more optimized version we can run that omits the extra field?


    optout21 commented at 1:48 PM on June 16, 2026:

    Yes, #35215 was optimization for the extra vout 8 bytes, which is not the case here. A speedup could be obtained with a hash that internally works with 5-byte-only values, but on 64-bit architecture that's not really faster than operations with 8-byte values, so I don't see an easy win here.

    What could be measured as a boundary data point is just taking 5 bytes of the TXID without any extra hashing/salting, and if the speedup is significant, considered.


    sipa commented at 1:50 PM on June 16, 2026:

    I think you have it backwards, @optout21.

    We're discussing replacing the SipHash function used to compute the 5-byte value; it's not using it as an input.

    The input we need here is the txid. In #35215 the input is a COutPoint.


    l0rinc commented at 1:52 PM on June 16, 2026:

    @l0rinc is that sufficiently faster if we just pass a constant as the extra, or is there a more optimized version we can run that omits the extra field?

    We can add a versions without the extra of course. I will help with benchmarking this after the compaction work is behind us - unless you think this is more urgent for some reasonn.


    andrewtoth commented at 1:53 PM on June 16, 2026:

    A tangent but this now got me thinking, if it works well for txindex, we could extend this method to chainstate. siphash the CoinEntry as key prefix, append the Coin to the key, and seek to the prefix and scan for the right outpoint.


    sipa commented at 1:54 PM on June 16, 2026:

    @andrewtoth I think taking the same design approach into account, we can drop 1 extra SipHash round from the construction in #35215 if there is no extra uint32_t to add, so 4 instead of 5 rounds (compared to 14 rounds with traditional SipHash-2-4).


    l0rinc commented at 1:56 PM on June 16, 2026:

    Yes, that's what I mean. I don't mind adding it to #35215 if you think it's a good idea.


    andrewtoth commented at 2:06 PM on June 16, 2026:

    seek to the prefix and scan for the right outpoint.

    Actually that won't work, because we don't have the outpoint we can reconstruct like we can the txid from the transaction we read.


    optout21 commented at 3:48 PM on June 16, 2026:

    We're discussing replacing the SipHash function used to compute the 5-byte value

    Yes, I didn't mean otherwise. My point was (maybe not clearly expressed) that internally SipHash works with 8-byte values, which is a waste, if in the end only a 5-byte hash is needed. A custom version working with 5-byte values is conceivable, but since 64-bit CPUs are optimized for 64 bit width, it probably wouldn't be faster.

    My other point was that to get an upper bound on the speedup possible through tweaking the hash, it's possible to measure an oversimplified solution, where no hash is used at all, but 5 bytes are taken directly from the 32-byte TXID (which is itself a hash). I'm not saying that this would be an acceptable solution (thought it might), but it would be faster than any optimized hash (to get the 5 bytes).


    sipa commented at 4:14 PM on June 16, 2026:

    @optout21 I see what you mean now, but that is not what we're talking about.

    SipHash, and all variants of it, internally work with 64-bit values, that's not going to change. It would be an entirely distinct hash function, which needs separate analysis, to change that. All designs just truncate the final 64-bit value to 40-bit.

    What is being discussed now is independent from the 8/5 byte question. The idea is that, since this PR bites the bullet in changing the data layout to be hash-based, we might as well pick an efficient hash function. Right now, this PR uses normal SipHash-2-4. @l0rinc's PR linked above introduces a more experimental (and custom) variant of SipHash-1-3 with jumboblocks and without padding. Using that same variant here would mean a construction that only needs 4 SipHash rounds (all operating on four internal 64-bit values) rather than 14 SipHash rounds. This same change would be possible even if this PR used 8-byte ids.



    andrewtoth commented at 11:33 PM on June 18, 2026:

    Ran some benchmarks, 2 runs each of master, siphash24, jumbo siphash13, and just taking the first 5 bytes (unsafe, but the best we can expect to get).

    Siphash24 is 13 minutes faster than master, and siphash13 is another 90 seconds faster than that. The raw first 5 byte variant was only 11s faster than siphash13, so we're very close to the theoretical limit.

    I have a very fast CPU, so a slower machine might show a bigger speedup with siphash13. An extra 90 seconds though doesn't seem like a deal breaker for me though. If we can get the siphash13 in before this is released we should definitely take it though.

    Variant min sync avg sync max sync avg (h:m:s) size (GiB) vs master
    master (full 32B txid) 6286s 6355s 6424s 1h45m55s 65.85
    normal (5B SipHash-2-4) 5550s 5578s 5607s 1h32m58s 32.18 −12.2% time
    jumbo (5B SipHash-1-3) 5437s 5488s 5538s 1h31m27s 32.07 −13.7% time
    raw5 (first 5B of txid) 5417s 5476s 5536s 1h31m16s 32.04 −13.8% time
  14. optout21 commented at 11:37 AM on June 15, 2026: contributor

    Concept ACK (e3f35ee4171875124104b404464151f9b3da1566)

    The reduction of index size is very positive!

    The approach for graceful DB update is interesting, it's nice that reindex is not forced.

    Is there a benchmark that exposes the effect of this change?

    Out of curiosity, what was the number of bytes used before switching to 5? 8? And the resulting size? (unfortunately the original version & values were not preserved in the description).

  15. andrewtoth force-pushed on Jun 16, 2026
  16. andrewtoth commented at 1:11 AM on June 16, 2026: contributor

    @optout21 the initial run was using an 8-byte siphash. The db was 36 GB and it took 1h40m.

    Is there a benchmark that exposes the effect of this change?

    The goal was to shrink the db size on disk, but of course if performance was negatively affected it might not be worth it. There are two relevant metrics - syncing the index and reading entries. It seems the former gets a nice bump from this as well. I am assuming it's mostly from the fact that we write a lot less data to disk.

    Benchmarking the sync speed can be done by deleting the /indexes/txindex directory in the datadir, and then restarting and waiting for the txindex is enabled at height log line. This can be compared to the txindex thread start log line to get the delta.

    Benchmarking the read speed can be done with the following script using apache bench:

    TXID="<txid>"
    printf '{"jsonrpc":"1.0","id":"ab","method":"getrawtransaction","params":["%s",0]}\n' "$TXID" > /tmp/data.json
    ab -n 10000 -c 1 -k -A user:password -p /tmp/data.json -T application/json http://127.0.0.1:8332/
    
  17. andrewtoth force-pushed on Jun 16, 2026
  18. DrahtBot added the label CI failed on Jun 16, 2026
  19. DrahtBot commented at 1:19 AM on June 16, 2026: contributor

    <!--85328a0da195eb286784d51f73fa0af9-->

    🚧 At least one of the CI tasks failed. <sub>Task iwyu: https://github.com/bitcoin/bitcoin/actions/runs/27586700495/job/81558529020</sub> <sub>LLM reason (✨ experimental): CI failed because the IWYU (include-what-you-use) check detected missing includes and forced a formatting/patch to src/index/txindex.cpp, causing the job to exit non-zero.</sub>

    <details><summary>Hints</summary>

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

    </details>

  20. refactor: extract read_tx in TxIndex::FindTx
    Non-functional refactor that extracts reading the tx from a disk position to a lambda,
    so it can be reused by other disk positions in the next commit.
    44495886f8
  21. andrewtoth force-pushed on Jun 16, 2026
  22. DrahtBot removed the label CI failed on Jun 16, 2026
  23. theStack commented at 2:57 PM on June 16, 2026: contributor

    Concept ACK

  24. mzumsande commented at 5:18 PM on June 16, 2026: contributor

    In case of a downgrade, the behavior is not ideal. Nothing will break and there will be no warnings or errors, but some transactions won't be returned after getrawtransaction queries even though they exist, while others will be returned normally - depending on which version indexed them.

    So I think that this would need to be documented well. Alternatively, an upgrade similar to how it was done with coinstatsindex in 30.0 might be cleaner - especially since the txindex syncs very fast in comparison.

  25. andrewtoth commented at 10:11 PM on June 16, 2026: contributor

    @mzumsande good observation.

    We could make ReadBestBlock/WriteBestBlock virtual, and override them in txindex to use a new locator, say Bv2, that is used to track the latest block. If Bv2 isn't found, lookup B and start from there. That way a downgrade will ignore newer hashed entries and resync from where the latest legacy entries are.

  26. in src/index/txindex_key.h:19 in 94d7377eda
      14 | +#include <cstddef>
      15 | +#include <cstdint>
      16 | +#include <ios>
      17 | +
      18 | +namespace txindex {
      19 | +constexpr uint8_t DB_TXINDEX_HASHED{'T'};
    


    optout21 commented at 9:22 AM on June 17, 2026:

    94d7377 txindex: use 5-byte siphash keys to optimize disk usage:

    Nit: The 'T' could be confused with the legacy 't', discussing/debugging/etc. in a mixed legacy-new DB environment, maybe a different letter could be picked, to reduce the risk of confusion.


    andrewtoth commented at 12:19 AM on June 19, 2026:

    Updated to 'x'.

  27. optout21 commented at 9:25 AM on June 17, 2026: contributor

    LGTM! Reviewed code, tested lightly locally, including reindex (unpgrade and downgrade). Not ack'ing now, as I can see the pending points:

    • Optimized hash from #35215. This is not a must, can do without (but if it lands earlier, it should be taken; if not, can be done later.)
    • Discussion about downgrade scenario.
  28. sedited commented at 9:51 AM on June 17, 2026: contributor

    Concept ACK

    As for the fallback, I'm curious how much slower this makes transaction querying. Do you think a forced migration to the new DB would be too expensive?

  29. andrewtoth commented at 2:13 PM on June 18, 2026: contributor

    @sedited I don't think the new queries are noticeably slower. It's one more read of a non-existent entry.

    We could do a migration like coinstatsindex in v30, where we keep the old db and reindex in a new directory as @mzumsande suggested. I assumed it would be better to have a graceful upgrade, but I suppose every user will want to reindex to reap the disk savings. In that case, we can write release notes to tell users to wipe their old txindex directory if they are not planning on downgrading? This path will also make the code changes simpler since we don't have to support upgrade and downgrade.

    What does everyone think - support graceful upgrade/downgrade, or just index into a new directory?

  30. l0rinc commented at 2:21 PM on June 18, 2026: contributor

    What does everyone think - support graceful upgrade/downgrade, or just index into a new directory?

    Wouldn't on-demand migration provide the possibility to do both, as you mentioned? Start immediately and the system will fall back to amortized O(1) migration, or delete everything and do an O(n) migration? (note that I still haven't reviewed it in detail, only responding to the question)

  31. mzumsande commented at 2:36 PM on June 18, 2026: contributor

    What does everyone think - support graceful upgrade/downgrade, or just index into a new directory?

    The exact same procedure would probably not be a good idea - if we'd keep the old index with its 66GB around by default, users who don't do anything manually (which are probably most) would experience a increase in disk space because they would have both. That was not a problem for coinstatsindex because it is much smaller.

  32. optout21 commented at 2:53 PM on June 18, 2026: contributor

    The graceful upgrade without forced reindex is very user-friendly (at the price of hybrid data in the DB, and logic to try to read both). The downgrade being the less frequent use case, I think it's acceptable to require a reindex in that case. However, the question is how to prevent the old code from using the existing DB. Maybe at graceful upgrade rename the index DB (but keep its content), so the old code will not find it. Just an idea.

  33. sedited commented at 7:16 PM on June 18, 2026: contributor

    What does everyone think - support graceful upgrade/downgrade, or just index into a new directory?

    I think what you have now is fine tbh. I asked the question before because I was curious to hear some of your reasoning.

  34. andrewtoth force-pushed on Jun 19, 2026
  35. txindex: use a new block locator for downgrade safety
    The hashed txindex entries cannot be found by older nodes. Record sync
    progress under a new locator key so a downgraded node will not rely on
    entries indexed by upgraded nodes, and instead continue syncing from the
    legacy locator.
    a7a599a593
  36. txindex: use 5-byte siphash keys to optimize disk usage
    Use a 5-byte salted siphash to key txindex entries,
    instead of the full 32-byte txid. Store the tx position
    after the hash in the key, so an iterator can scan
    through any collisions and return the correct tx.
    
    Fallback to the legacy key lookup if the tx is not found.
    716a1f03af
  37. tests: cover txindex hash prefix collisions and legacy lookups 171418eb14
  38. andrewtoth force-pushed on Jun 19, 2026
  39. DrahtBot added the label CI failed on Jun 19, 2026
  40. DrahtBot commented at 12:29 AM on June 19, 2026: contributor

    <!--85328a0da195eb286784d51f73fa0af9-->

    🚧 At least one of the CI tasks failed. <sub>Task No wallet: https://github.com/bitcoin/bitcoin/actions/runs/27797474238/job/82260299474</sub> <sub>LLM reason (✨ experimental): CI failed due to a Clang -Werror compile error: deleting BaseIndex::DB with virtual functions but a non-virtual destructor (-Wdelete-non-abstract-non-virtual-dtor).</sub>

    <details><summary>Hints</summary>

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

    </details>

  41. andrewtoth commented at 12:36 AM on June 19, 2026: contributor

    Thanks for all your responses. Seems we're all in agreement about keeping graceful upgrades and downgrades.

    I pushed a commit 736410a1739fe8fd5c7c4140929d115ace6475ee that updates the block locator for txindex, and reads the legacy block locator if the new version is not yet written. This lets us continue to use the legacy index when upgrading, and an older node will ignore hashed entries when downgrading and resync from where the legacy locator left off.

    If a user wipes their txindex and resyncs with the new hashed entries, then downgrades, they will reindex legacy again and have both indexes in one db. I think that's ok though, we should just mention in the release notes to wipe the db only if you don't plan on downgrading.

  42. DrahtBot removed the label CI failed on Jun 19, 2026

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-06-19 10:51 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me