Hello everyone. I have a proposal to improve the memory usage in Bitcoin Core. I hope I can win your attention and interest for this :-)
Introduction
When the bitcoin blockchain was small, it made sense to load the block index to memory. It acts as an optimization where the block index is basically a handy tool to know whether blocks exist, and learn certain characteristics about them, such as their error state, number of transactions, header, etc.
However, the problem with the block index is that it’s unbounded AND grows in memory; strictly. Loading bitcoin core already occupies tons of memory and takes lots of time on small machines. We’re over 2 GB of memory at times, and that’s growing. Running bitcoin core on smaller machines is even impossible (such as Raspberry Pi Zero) because of that (without using swap memory and similar workarounds, which have their problems); and there’s no reason for that other than memory restrictions. Bitcoin transaction and block processing is simple and isn’t really that demanding.
From a probabilistic point of view, the higher the number of blocks, the less likely we need all the block index entries all the time after a block is synced.
The criticality of this issue depends on how one looks at it; maybe “This is fine” (pun intended). To me, unbounded growing data in memory is not the right way to go, and sooner or later this will become a real problem, especially that bitcoin is aimed to decentralization by allowing everyone to run a node. The logic is very simple: The more memory bitcoin core uses, the less people will want to run a node (all the time).
I would like to propose a solution (and work on it) that makes the memory usage bounded.
I would appreciate your opinions on the matter primarily to learn what you think and secondarily to avoid expending efforts that aren’t appreciated in the bitcoin dev community. Especially that I’m not really innovating here (e.g., upgrading consensus), but I’m attempting to solve a technical issue that I solved before elsewhere.
The solution
TLDR: LRU cache for the block index.
Long answer: There is already an LRU cache in leveldb controlled using the command line argument -dbcache
, which distributes memory over different parts on initialization. However, this isn’t used for the block index and its dependents. Depending on performance metrics, we can either simply read the block index directly from leveldb through a proper interface (and use its LRU cache), or we can write our own LRU cache that lives between leveldb and txdb or even between txdb and that block index interface directly (I have written a concurrent transactional LRU cache before, but that’s another discussion). If the one from leveldb performs well enough, maybe that’s good enough, and the task becomes much simpler. This can be done in multiple steps over time and not all at once.
Effects
The block manager (singleton) is the responsible owner of the block index. Bitcoin Core uses this map to directly locate the block index by simply doing a find
, since it’s just an unordered map/hash map.
0typedef std::unordered_map<uint256, CBlockIndex*, BlockHasher> BlockMap;
Let’s discuss the effects of this proposal
First effect: CBlockIndex by value, not by reference
The correct way I see to do this is to have some interface instead of accessing the BlockMap directly, which will act indirectly a value-based map (as opposed to pointers). In its simplest form, it’s implemented to just use the same map. The side effect of this will be that modifying the block index of a block in any way has to be done through a proper read-modify-write operation. While this seems too elaborate compared to the pointer stuff done now, we have to acknowledge this is what we do anyway since cs_main
lock is held during any modification/addition of a block index entry. The only difference is that because of the isolation resulting from this solution, there might be a little performance drawback because of copying back and forth. However, compilers may be able to optimize this if the block index entry lives only in memory. It’s a long shot, though.
Second effect: No more linking blocks by pointer
Currently, blocks are linked together using pointers. Every block index has a pointer that links to the previous block in memory. By simply using the CBlockIndex::pprev
member, we get the previous block’s block index. This has to change to a function that pulls prev
from the database/cache. The linking should not be done by block index but by block hashes. This should not have significant effect on performance assuming we don’t consistently loop over the pprev
to reach a block in the history. In fact, the only reason to do such a thing is a (long) reorg. Usually the best way to loop to history over the mainchain is using block height arithmetic. The original bitcoin core used to do pprev looping a lot (for example even to create a CBlockLocator
starting from the tip), but right now this isn’t the case anymore since block heights of the main chain are registered in CChain
.
More side effects: Many things have to go to the DB
The problem is bigger than just BlockMap
. Other things will be affected as well, such as CChain
. The fact that there are many dependents on BlockMap
pointers is part of the reason why the memory consumption is so high. All this has to go and be replaced by proper indexing in the key/value database at hand; i.e., leveldb.
Steps to complete this task
- Create the interfaces that will replace everything that uses
CBlockIndex
pointers, still with a global BlockMap through an interface, not throughstd::unordered_map
directly. - Test the new design and ensure that the isolation is successful.
- Get rid of BlockMap and use the database (and whatever caching layer above it) to store the block index
- Profit! Enjoy a much lower memory footprint, and more decentralization by having smaller hardware run full-nodes with less work.
Let me know what you think, and hopefully I didn’t oversee a anything in my analysis.