This PR will no longer be updated because the project has been split into multiple pull requests. I will still use it to keep track of the project's PRs:
| Content | PR | Status | Review Club |
|---|---|---|---|
| Add Muhash and SHA256Writer (unused) | #19055 | Merged | notes |
| Add Muhash Python implementation (unused) | #19105 | Merged | notes |
| Add hash_type NONE (Muhash code independent) | #19328 | Merged | - |
| Add hash_type MUHASH | #19145 | Merged | ? |
| Add ASM optimizations for MuHash | #19181 | Open | ? |
| Add CoinStatsIndex | #19521 | Merged | ? |
| Parallelize gettxoutsetinfo | | | |
| Remove legacy UTXO set hash | | (future) | |
Since it's starting to get confusing, I am trying to visualize the dependencies of the open PRs in this dependency graph:

    +--------------+   +-------------+   +----------+
    |#19328        |   |#19105       |   |#19055    |
    |hash_type NONE|   |Muhash Python|   |Muhash CPP|
    +----+-----+---+   +------+------+   +-+----+---+
         ^     ^              ^            ^    ^
         |     |              |            |    |
         |     |    +---------+------+     |    |
         |     +----+#19145          +-----+    |
         |          |hash_type MUHASH|          |
         |          +----------------+          |
         |                               +------+---+
    +----+--------+                      |#19181    |
    |#19521       |                      |MuHash ASM|
    |No hash Index|                      +----------+
    +-------------+

This implements an index of coin statistics with the goal of making the response time of the gettxoutsetinfo RPC call dramatically faster. Currently, this RPC scans the full UTXO set every time it is called, which makes it hard to use for users who want to continually check the coin supply or compare UTXO set hashes between different nodes. It is especially challenging in periods when multiple blocks are mined in quick succession, even on relatively fast machines.
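To illustrate why an index helps, here is a minimal Python sketch of the idea, not the code in this PR: the statistics for a block are derived from the previous block's statistics plus the outputs the block creates and spends, so no scan of the whole UTXO set is needed. `CoinStats`, `apply_block`, `full_scan` and `ToyCoinStatsIndex` are hypothetical names, and only two of the statistics (output count and total amount) are modelled.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class CoinStats:
    """Hypothetical subset of the statistics (amounts in satoshis)."""
    txouts: int = 0
    total_amount: int = 0


def full_scan(utxo_amounts: Iterable[int]) -> CoinStats:
    """What happens without an index: walk every unspent output."""
    amounts = list(utxo_amounts)
    return CoinStats(txouts=len(amounts), total_amount=sum(amounts))


def apply_block(prev: CoinStats, created: List[int], spent: List[int]) -> CoinStats:
    """Derive a block's stats from the previous stats and the block's delta."""
    return CoinStats(
        txouts=prev.txouts + len(created) - len(spent),
        total_amount=prev.total_amount + sum(created) - sum(spent),
    )


class ToyCoinStatsIndex:
    """Toy index keyed by block height (an assumption made for this sketch)."""

    def __init__(self) -> None:
        self.by_height: Dict[int, CoinStats] = {}

    def connect_block(self, height: int, created: List[int], spent: List[int]) -> None:
        # Blocks are assumed to be connected in height order, starting at 0.
        prev = self.by_height.get(height - 1, CoinStats())
        self.by_height[height] = apply_block(prev, created, spent)

    def lookup(self, height: int) -> CoinStats:
        # Answering a query is a dictionary lookup, not a scan.
        return self.by_height[height]


index = ToyCoinStatsIndex()
index.connect_block(0, created=[50], spent=[])
index.connect_block(1, created=[50, 30, 20], spent=[50])
assert index.lookup(1) == full_scan([50, 30, 20])
```

The cost of keeping the index up to date scales with the size of each block, while the cost of a scan scales with the size of the whole UTXO set.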
Implementation overview:
- The current serialization of the UTXO set for the purpose of hashing is changed based on sipa's concept proposed on the mailing list in 2017 and in #10434
- The hashing algorithm for the UTXO set is changed to Muhash, which was also implemented by sipa
- CoinStatsIndex is added which keeps an index of all values that would require `gettxoutsetinfo` to scan the UTXO set
- CoinStatsIndex can be activated through the flag `-coinstatsindex`
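The property that makes Muhash suitable for an index is that elements can be added and removed in any order without rehashing the whole set. The following is a toy Python sketch of that idea, not sipa's implementation. `ToyMuHash` and `element_to_num` are hypothetical names, and SHAKE256 is used here as a stand-in for the way the real code expands each element to 3072 bits, so the digests will not match Bitcoin Core's. The modulus is, to my knowledge, the prime used by MuHash3072.

```python
import hashlib

# Prime modulus of the 3072-bit group (believed to match MuHash3072).
MODULUS = 2**3072 - 1103717


def element_to_num(data: bytes) -> int:
    """Map bytes to a group element.

    Assumption: SHAKE256 replaces the real expansion step, so this toy
    produces different values than Bitcoin Core.
    """
    return int.from_bytes(hashlib.shake_256(data).digest(384), "little") % MODULUS


class ToyMuHash:
    """Multiplicative set hash: numerator / denominator modulo a prime."""

    def __init__(self) -> None:
        self.numerator = 1
        self.denominator = 1

    def insert(self, data: bytes) -> None:
        # Adding an element multiplies it into the numerator.
        self.numerator = self.numerator * element_to_num(data) % MODULUS

    def remove(self, data: bytes) -> None:
        # Removing an element multiplies it into the denominator.
        self.denominator = self.denominator * element_to_num(data) % MODULUS

    def digest(self) -> bytes:
        # The expensive modular inverse is only needed when finalizing.
        inverse = pow(self.denominator, -1, MODULUS)  # requires Python 3.8+
        value = self.numerator * inverse % MODULUS
        return hashlib.sha256(value.to_bytes(384, "little")).digest()


# Insertion order does not matter.
a = ToyMuHash()
a.insert(b"coin-1")
a.insert(b"coin-2")
b = ToyMuHash()
b.insert(b"coin-2")
b.insert(b"coin-1")
assert a.digest() == b.digest()

# Removing a spent coin gives the same hash as never having added it.
a.remove(b"coin-2")
c = ToyMuHash()
c.insert(b"coin-1")
assert a.digest() == c.digest()
```

Because multiplication modulo a prime is commutative and invertible, a block can be applied to the running hash by inserting its created coins and removing its spent coins.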
Todos/Open questions:
- The transactions count is currently not implemented as it seems not knowable ex-post without running a full IBD to build up the index. I am looking for a solution to this. Ideas are:
  - Require `txindex` and take the transactions count from it
  - Remove the transactions count or mark it as unreliable in a different way if coinstatsindex is enabled
- The transactions count question is also interesting because it seems to be the only obstacle to providing access to historical coin statistics, which may be a nice follow-up feature if transaction counts can be ignored somehow
- More benchmarking to evaluate potential switch from Muhash to ECMH as the hashing algorithm
- IBD benchmarking with the index enabled
- Tests could be improved
This is an extension to my rolling UTXO set hash proposal from last year.
Edit: 3c5c1ca should probably be squashed into the prior commits but I wanted to leave sipa's commits unchanged for the start.