Reducing in-RAM block index size (reviving #24760 with measurements) #35612

ptrinh opened this issue on June 27, 2026
  1. ptrinh commented at 12:36 AM on June 27, 2026: none

    Following up on #24760 ("The BlockIndex/BlockMap should not live in memory all the time"), which was closed as stale in 2024 with "feel free to open a new issue". In that thread the main open question was empirical: is the block index actually a meaningful share of memory, and is it worth the complexity? Here are some measurements, plus a motivation that is sharper than the default-node case discussed in 2024: memory-constrained / low-power nodes.

    Measurements

    BlockMap = std::unordered_map<uint256, CBlockIndex> at current mainnet height (~955k entries).

    Microbench (real unordered_map, default allocator, 955k entries, peak RSS):

    entry layout                                    sizeof   BlockMap RSS   bytes/entry   delta
    current CBlockIndex                             144 B    231 MB         ~254 B        -
    drop cached header (re-read from BlockTreeDB)   96 B     173 MB         ~190 B        -58 MB
    DB-backed essentials only                       40 B     114 MB         ~126 B        -117 MB

    (The ~254 B/entry includes real map node + bucket overhead, matching the ~216-224 B figure from #24760. It is notably larger than sizeof-based math, so measuring matters.)
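    A minimal sketch of the arithmetic behind the table, for anyone who wants to re-derive the per-entry figures. It assumes the RSS column is in MiB (1024 * 1024 bytes), which is an assumption on my part that reproduces the ~254 B figure, and it uses the rounded 955k entry count, so the last row can land a byte or so off the table. The function names are illustrative only.

```python
MIB = 1024 * 1024
ENTRIES = 955_000  # approximate mainnet entry count quoted above

# (layout, sizeof in bytes, measured BlockMap RSS), copied from the table.
# Assumption: the RSS values are MiB, not decimal MB.
ROWS = [
    ("current CBlockIndex", 144, 231),
    ("drop cached header", 96, 173),
    ("DB-backed essentials only", 40, 114),
]


def bytes_per_entry(rss_mib, entries=ENTRIES):
    """Resident bytes attributable to one map entry."""
    return rss_mib * MIB / entries


def summarize(rows=ROWS, entries=ENTRIES):
    """Per-layout footprint, overhead beyond sizeof, and delta vs the first row."""
    base_rss = rows[0][2]
    out = []
    for name, sizeof_b, rss_mib in rows:
        per_entry = bytes_per_entry(rss_mib, entries)
        out.append({
            "layout": name,
            "bytes_per_entry": round(per_entry),
            # Map node, bucket, and allocator overhead that sizeof does not show.
            "overhead_per_entry": round(per_entry - sizeof_b),
            "delta_mib": rss_mib - base_rss,
        })
    return out


if __name__ == "__main__":
    for row in summarize():
        print(row)
```

    The overhead column is the gap between measured bytes/entry and sizeof, which is the reason sizeof-based estimates undercount.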

    Real-node confirmation (a synced mainnet node, restart, RSS sampled through the load phases): the fresh process goes from ~120 MB to a settled ~340 MB across the ~6 s block-index load ("Loading block index" -> "Loaded best chain"), i.e. ~220 MB of resident BlockMap, consistent with the microbench. ("Using 2.0 MiB for block index database" confirms this is the in-RAM map, not LevelDB cache.) It grows with chain height.
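    For anyone wanting to repeat the real-node check, here is a rough sketch of one way to sample RSS during startup. It is not the script used for the numbers above; it assumes Linux (it reads `/proc/<pid>/status`), and the sample count and interval are arbitrary placeholders.

```python
import re
import time


def parse_vmrss_kib(status_text):
    """Return VmRSS in KiB from the text of /proc/<pid>/status, or None if absent."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def sample_rss_mib(pid, samples=30, interval_s=0.5):
    """Poll a process's resident set size in MiB; Linux-only (reads procfs)."""
    readings = []
    for _ in range(samples):
        with open(f"/proc/{pid}/status") as f:
            rss_kib = parse_vmrss_kib(f.read())
        if rss_kib is not None:
            readings.append(rss_kib / 1024)
        time.sleep(interval_s)
    return readings
```

    Start sampling right after launching the node and compare the readings before and after the block-index load messages in the debug log.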

    Why this matters more for constrained nodes

    On a default node, dbcache + mempool dominate and the block index is a minor share, which is roughly the conclusion of the 2024 thread. But on a deliberately lean low-power node (-blocksonly, no txindex/blockfilterindex, minimal -dbcache), those consumers shrink to near-nothing and the block index becomes the dominant non-reclaimable floor - roughly 220 MB out of a 300-400 MB total. The chainstate is mmap'd and reclaimable; the BlockMap is not. For a 512 MB-class device, 58-117 MB is 11-23% of total RAM.

    Possible directions (in increasing scope)

    1. Lazy-load the cached block header (nVersion, hashMerkleRoot, nTime, nBits, nNonce; 48 B) and re-read from BlockTreeDB on demand, keeping a small recent-window cache for hot paths (MTP, header relay). These fields are already persisted on disk, and are read through a bounded surface (GetBlockHeader() plus ~34 direct accesses). Saves ~58 MB. Lowest risk.
    2. sipa's suggestion in #24760: decouple CBlockIndex::pprev, pull prev from DB/cache and link by hash (looping pprev is rare outside deep reorgs; height arithmetic covers the common case). Enables a DB-backed, much smaller resident index. Saves ~117 MB or more. Larger change.
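    To make direction 1 concrete, here is a toy sketch of the "small recent-window cache" idea: a bounded LRU in front of an on-demand loader. It is not Bitcoin Core code. `RecentHeaderCache`, the `load_from_db` callback (a stand-in for a BlockTreeDB read), and the default capacity of 2016 are all hypothetical, and locking is left out entirely.

```python
from collections import OrderedDict


class RecentHeaderCache:
    """Bounded LRU cache; misses fall back to a caller-supplied loader."""

    def __init__(self, load_from_db, capacity=2016):
        # load_from_db is a hypothetical stand-in for a BlockTreeDB header read.
        self._load = load_from_db
        self._capacity = capacity
        self._entries = OrderedDict()
        self.db_reads = 0  # counts how often the slow path was taken

    def get(self, block_hash):
        if block_hash in self._entries:
            # Hit: mark as most recently used and return the cached header.
            self._entries.move_to_end(block_hash)
            return self._entries[block_hash]
        # Miss: read from the backing store and remember the result.
        header = self._load(block_hash)
        self.db_reads += 1
        self._entries[block_hash] = header
        if len(self._entries) > self._capacity:
            # Evict the least recently used entry to keep memory bounded.
            self._entries.popitem(last=False)
        return header
```

    The `db_reads` counter is there because the open question is how often hot paths would miss; a real prototype would need that measurement before choosing a window size.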

    Caveats and open questions

    • The savings figures are from a faithful microbench + a real-node baseline confirmation, not yet a full modified-bitcoind before/after. Happy to build that prototype if there is appetite.
    • Tradeoff is RAM vs occasional disk reads (header serve, deep-reorg traversal) and added complexity in foundational code. Whether that tradeoff is worth it for the constrained-node use case is the real question.
    • Is there interest in this with the low-power framing, and if so which direction is preferred before someone invests in a full implementation?
  2. maflcko commented at 7:34 AM on June 27, 2026: member

    For a 512 MB-class device, 58-117 MB is 11-23% of total RAM.

    Are 512MB devices common or even supported? I had the impression that 1024MB was the minimum. Even compilation recommends 1.5 GB (https://github.com/bitcoin/bitcoin/blob/master/doc/build-unix.md#memory-requirements)

    • Tradeoff is RAM vs occasional disk reads (header serve, deep-reorg traversal) and added complexity in foundational code. Whether that tradeoff is worth it for the constrained-node use case is the real question.

    Hmm, is the unordered_map in different memory pages? If the historic entries (loaded after a restart) are next to each other in memory pages, I'd presume that you can achieve the same with swap already today, without any code changes?

  3. maflcko added the label Resource usage on Jun 27, 2026
  4. ptrinh commented at 7:56 AM on June 27, 2026: none

    Good points, thanks.

    On 512MB: fair, I overstated that - ~1GB is the realistic floor and the build docs back that up. A few reframes that don't hinge on 512MB:

    • The value isn't a specific spec, it's lowering the cost/barrier to running a full node. Being able to repurpose older or cheaper low-RAM hardware - which gets more attractive as RAM prices climb - lets more people run a node cheaply, and more reachable nodes is a decentralization benefit for the network. Lower-end support is a means to that, not the goal in itself.
    • The BlockMap is also monotonic: it grows ~13 MB/year (~250 B/entry) and is never reclaimed, so it's a slowly-worsening non-reclaimable floor for every node, not just small ones. At a 1GB floor it's still ~6-12%.

    Not claiming it's urgent - just that the motivation holds beyond tiny devices.
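    The growth and share figures above come from back-of-the-envelope arithmetic along these lines. This is a sketch under stated assumptions: a nominal one block per 10 minutes, the rounded ~250 B/entry footprint, and savings expressed in MiB against RAM sizes in MiB.

```python
BLOCKS_PER_YEAR = 6 * 24 * 365  # nominal: one block per 10 minutes
BYTES_PER_ENTRY = 250           # rounded per-entry footprint quoted above
MIB = 1024 * 1024


def yearly_growth_mib(bytes_per_entry=BYTES_PER_ENTRY):
    """Approximate BlockMap growth per year, in MiB."""
    return BLOCKS_PER_YEAR * bytes_per_entry / MIB


def share_of_ram(saving_mib, ram_mib):
    """A saving expressed as a percentage of total RAM."""
    return 100 * saving_mib / ram_mib


if __name__ == "__main__":
    print(f"growth: {yearly_growth_mib():.1f} MiB/year")
    for ram in (512, 1024):
        low, high = share_of_ram(58, ram), share_of_ram(117, ram)
        print(f"{ram} MiB RAM: {low:.1f}% to {high:.1f}%")
```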

    On swap: I don't think swap gets you there, for a few reasons:

    • The entries aren't page-contiguous by recency. BlockMap is an unordered_map with one heap-allocated node per entry, and LoadBlockIndexGuts inserts them in BlockTreeDB key order (~block hash), so hot (recent) and cold (historic) entries end up interleaved across pages. Almost every page holds some hot entry, so there are few/no all-cold pages for the kernel to evict.
    • It's anonymous memory, so reclaim requires swap specifically - the kernel can't just drop the pages. Many constrained/SBC deployments run swapless, and where swap exists, faulting back in on access adds latency.
    • The directions here move that data to file-backed leveldb, where it's already persisted: the kernel can then drop those clean pages for free under pressure, no swap required. That's the part swap can't replicate.

    So the page interleaving is sort of the crux - it's exactly what a code change (segregating or DB-backing cold entries) would address and what swap can't. Whether that's worth the complexity is still the open question; I just wanted to cover why "use swap" doesn't already get it for free.

  5. maflcko commented at 9:59 AM on June 27, 2026: member
    • LoadBlockIndexGuts inserts them in BlockTreeDB key order (~block hash)

    Ok, I see. In theory the BlockMap could be backed by an append-only hive (https://en.cppreference.com/cpp/container/hive) where insertion is roughly height-based, but of course it won't work on swapless systems.

    I understand the current temporary RAM price climb, but I wonder if there are real users asking for such optimizations. There is a chance that the RAM prices will fall again over the next years and an additional ~13 MB/year doesn't sound that expensive anyway.

  6. ptrinh commented at 11:30 AM on June 27, 2026: none

    Agreed on all of that, and the hive angle is nice - height-ordered allocation would make the cold entries page-adjacent so swap could evict them, though as you note that does not help the swapless case.

    To follow up on your earlier 512MB question: I am not arguing 512MB is a comfortable default - you are right that the realistic floor is ~1GB. The case I have in mind is deliberate memory budgeting on a shared host rather than rare hardware. Concretely: capping a node VM/container at ~512MB on a Proxmox box or NAS so it co-exists with other services, or fitting a cheaper VPS tier. That is my own setup, and it lowers both hardware and maintenance cost versus a dedicated larger box. Raspberry Pi Zero-class devices are the embedded end of the same spectrum. IBD on such a device is impractical, but the common pattern is to sync on a fast machine and copy the datadir over, so the constrained box only ever runs steady state - which is exactly where this RAM sits. So it is a real deployment pattern, even if I cannot claim broad demand beyond that.

    That said, I take your cost/benefit point. Two clarifications rather than a push:

    • The main lever is the one-time ~58-117 MB off the current ~220 MB baseline, not the ~13 MB/year growth - I mentioned the yearly figure only to note it is monotonic.
    • What it frees is non-reclaimable (anon) memory, which is the part that actually drives OOM under pressure, versus the reclaimable chainstate mmap. So it helps tight-memory stability a bit more than the raw MB suggests, though I agree that is niche.

    If RAM prices normalize and nobody is hitting this in practice, I am fine leaving it as a documented measurement/analysis for if/when it becomes more pressing, rather than pushing a foundational change without demonstrated demand. Thanks for digging into it.

  7. BLUE-STEEL3 commented at 5:11 PM on June 27, 2026: none

    I agree that once you run with -blocksonly, no txindex/blockfilterindex, and a small dbcache, the BlockMap becomes one of the largest non-reclaimable memory consumers. The ~220 MB figure at current chain height is significant on 512 MB-class devices.

    Preferred approach:

    I think we should start with the lower-risk option: lazy-loading the cached block header fields (nVersion, hashMerkleRoot, nTime, nBits, nNonce). This gives a meaningful reduction (~58 MB according to your numbers) while keeping the change relatively contained. It also lets us measure the real-world cost of occasional disk reads before considering bigger changes.

    If this direction shows good results, then exploring sipa’s suggestion (decoupling pprev and making the index more DB-backed) would be the logical next step for further savings.

    Before moving to implementation, it would be useful to see:

    • A breakdown of how often those header fields are accessed outside of GetBlockHeader() during normal operation.
    • A proposed design for the small “recent window” cache (size, eviction policy, and locking strategy).
    • A prototype (even behind a compile-time flag) with before/after memory numbers on a constrained setup.

    Would you be interested in putting together a draft implementation plan or a small prototype for the first approach? I’d be happy to help review or test it.


  8. bitcoin blocked a user on Jun 27, 2026
