Add a pool for locked memory chunks, replacing LockedPageManager
This is something I’ve been wanting to do for a long time. The current approach to preventing sensitive information from being swapped out, locking objects in place wherever they happen to sit on the stack or heap, causes a lot of mlock/munlock system call churn, slowing down any handling of keys.
Not only that, but locked memory is a limited resource, and using a lot of it bogs down the system by increasing overall swappiness, so the previous approach of locking every page that may contain any key information (but also other information) is wasteful.
Thus replace it with a consolidated pool of locked memory, so that chunks of “secure” memory can be allocated and freed without any system calls, and there is as little memory overhead as possible (for example, administrative structures are not themselves in locked memory). The pool consists of one or more arenas, which divide a contiguous memory range into chunks. Arenas are allocated in units of 256 kB (configurable). If all current arenas are full, allocate a new one. Arenas are directly allocated from the OS with the appropriate memory page allocation API. No arenas are ever freed unless the program exits.
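To make the arena idea concrete, here is a minimal, hedged sketch (names and details are illustrative, not the code in this patch; error handling, alignment and the multi-arena pool layer are omitted): one mmap/mlock pair per arena, after which chunks are handed out from a free map with no further system calls.

```cpp
#include <cstddef>
#include <map>
#include <sys/mman.h> // mmap/mlock on POSIX; Windows would use VirtualAlloc/VirtualLock

// A first-fit arena over one contiguous locked region. Chunk bookkeeping
// lives in ordinary (unlocked) memory, so only the payload consumes the
// locked-memory budget.
class Arena {
public:
    explicit Arena(size_t size = 256 * 1024) : size_(size) {
        base_ = static_cast<char*>(mmap(nullptr, size_, PROT_READ | PROT_WRITE,
                                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0));
        locked_ = (mlock(base_, size_) == 0); // failure is non-fatal, just reported
        free_chunks_[base_] = size_;          // start with one big free chunk
    }
    ~Arena() { munlock(base_, size_); munmap(base_, size_); }

    // Hand out a chunk without any further system calls.
    void* alloc(size_t n) {
        for (auto it = free_chunks_.begin(); it != free_chunks_.end(); ++it) {
            if (it->second >= n) {
                char* p = it->first;
                size_t remaining = it->second - n;
                free_chunks_.erase(it);
                if (remaining > 0) free_chunks_[p + n] = remaining;
                used_chunks_[p] = n;
                return p;
            }
        }
        return nullptr; // full: the pool would allocate another arena
    }

    // Return a chunk to the free map (real code would also coalesce neighbors).
    void free(void* p) {
        auto it = used_chunks_.find(static_cast<char*>(p));
        if (it == used_chunks_.end()) return;
        free_chunks_[it->first] = it->second;
        used_chunks_.erase(it);
    }

private:
    char* base_;
    size_t size_;
    bool locked_;
    std::map<char*, size_t> free_chunks_; // address -> size
    std::map<char*, size_t> used_chunks_;
};
```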
- I’ve kept the arena implementation itself basic, for easy review. If this ever turns out to be a bottleneck we can consider adding free-lists per chunk size, per-arena locking and other typical C heap optimizations, but I don’t want to overdesign this for no good reason. Let’s agree it’s a lot better than what we have now. Unit tests have been added.
- To keep a handle on how much locked memory we’re using I’ve added an RPC call `getmemoryinfo` that returns statistics from the LockedPoolManager (a sketch of the kind of statistics it could expose follows this list). This can also be used to check whether locking actually succeeded: to prevent sudden crashes with low ulimits it is not fatal if locking fails, but it is useful to be able to see whether all key data is indeed in unswappable memory.
- This is also a more portable and future-proof API. Some OSes may not be able to pin e.g. stack or heap pages in place, but do have an API to (de)allocate pinned or locked memory pages.
Review notes
- Please review the wallet commits carefully. Especially where arrays have been switched to vectors, check that no `sizeof(vectortype)` remains in the memcpy/memcmp usage (ick!), and that `.data()` or `&vec[x]` is used as appropriate instead of `&vec`, which would overwrite the vector structure. See the example below.
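For illustration, the patterns to watch for (buffer names here are hypothetical):

```cpp
#include <cstring>
#include <vector>

void example(const unsigned char* src, size_t len) {
    std::vector<unsigned char> key(len);

    // Wrong: &key is the address of the vector object itself; writing through
    // it clobbers the vector's internal pointer/size fields.
    // memcpy(&key, src, len);

    // Wrong: sizeof(key) is the size of the std::vector control structure
    // (typically 24 bytes on a 64-bit platform), not the size of the data.
    // memcpy(key.data(), src, sizeof(key));

    // Correct: use .data() (or &key[0]) and the element count.
    memcpy(key.data(), src, key.size());
}
```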
Measurements
Immediately after startup, loading a fairly large wallet.
Amount of memory locked (`cat /proc/$PID/status | grep VmLck`), current master:

```
VmLck:     1376 kB
```

With this patch:

```
VmLck:      512 kB
```

Syscall count (`strace -cf`) over a whole run (start + immediate shutdown), current master:

```
  0.00    0.000328           0     10541           mlock
  0.00    0.000114           0     10541           munlock
```

With this patch:

```
  0.00    0.000000           0         2           mlock
  0.00    0.000000           0         2           munlock
```