[IBD] batch block reads/writes during `AutoFile` serialization

l0rinc commented at 2:34 pm on December 21, 2024: contributor

This change is part of [IBD] - Tracking PR for speeding up Initial Block Download

Summary

We can serialize the blocks and undos to any Stream which implements the appropriate read/write methods. AutoFile is one of these, writing the results “directly” to disk (through the OS file cache). Batching these in memory first and reading/writing these to disk is measurably faster (likely because of fewer native fread calls or less locking, as observed by Martinus in a similar change).

Unlocking new optimization opportunities

Buffered writes will also enable batched obfuscation calculations (implemented in #31144) - especially since currently we need to copy the write input’s std::span to do the obfuscation on it, and batching enables doing the operations on the internal buffer directly.

Measurements (micro benchmarks, full IBDs and reindexes)

Microbenchmarks for [Read|Write]BlockBench show a ~30%/168% speedup with macOS/Clang, and ~19%/24% with Linux/GCC (the follow-up XOR batching improves these further):

Before:

ns/op	op/s	err%	total	benchmark
2,271,441.67	440.25	0.1%	11.00	`ReadBlockBench`
5,149,564.31	194.19	0.8%	10.95	`WriteBlockBench`

After:

ns/op	op/s	err%	total	benchmark
1,738,683.04	575.15	0.2%	11.04	`ReadBlockBench`
3,052,658.88	327.58	1.0%	10.91	`WriteBlockBench`

Before:

ns/op	op/s	err%	ins/op	cyc/op	IPC	bra/op	miss%	total	benchmark
6,895,987.11	145.01	0.0%	71,055,269.86	23,977,374.37	2.963	5,074,828.78	0.4%	22.00	`ReadBlockBench`
5,152,973.58	194.06	2.2%	19,350,886.41	8,784,539.75	2.203	3,079,335.21	0.4%	23.18	`WriteBlockBench`

After:

ns/op	op/s	err%	ins/op	cyc/op	IPC	bra/op	miss%	total	benchmark
5,771,882.71	173.25	0.0%	65,741,889.82	20,453,232.33	3.214	3,971,321.75	0.3%	22.01	`ReadBlockBench`
4,145,681.13	241.21	4.0%	15,337,596.85	5,732,186.47	2.676	2,239,662.64	0.1%	23.94	`WriteBlockBench`

2 full IBD runs against master (compiled with GCC where the gains seem more modest) for 888888 blocks (seeded from real nodes) indicates a ~7% total speedup.

 0COMMITS="d2b72b13699cf460ffbcb1028bcf5f3b07d3b73a 652b4e3de5c5e09fb812abe265f4a8946fa96b54"; \
 1STOP_HEIGHT=888888; DBCACHE=1000; \
 2C_COMPILER=gcc; CXX_COMPILER=g++; \
 3BASE_DIR="/mnt/my_storage"; DATA_DIR="$BASE_DIR/BitcoinData"; LOG_DIR="$BASE_DIR/logs"; \
 4(for c in $COMMITS; do git fetch origin $c -q && git log -1 --pretty=format:'%h %s' $c || exit 1; done) && \
 5hyperfine \
 6  --sort 'command' \
 7  --runs 2 \
 8  --export-json "$BASE_DIR/ibd-${COMMITS// /-}-$STOP_HEIGHT-$DBCACHE-$C_COMPILER.json" \
 9  --parameter-list COMMIT ${COMMITS// /,} \
10  --prepare "killall bitcoind; rm -rf $DATA_DIR/*; git checkout {COMMIT}; git clean -fxd; git reset --hard; \
11    cmake -B build -DCMAKE_BUILD_TYPE=Release -DENABLE_WALLET=OFF -DCMAKE_C_COMPILER=$C_COMPILER -DCMAKE_CXX_COMPILER=$CXX_COMPILER && \
12    cmake --build build -j$(nproc) --target bitcoind && \
13    ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=1 -printtoconsole=0; sleep 100" \
14  --cleanup "cp $DATA_DIR/debug.log $LOG_DIR/debug-{COMMIT}-$(date +%s).log" \
15  "COMPILER=$C_COMPILER COMMIT=${COMMIT:0:10} ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP_HEIGHT -dbcache=$DBCACHE -blocksonly -printtoconsole=0"
16d2b72b1369 refactor: rename leftover WriteBlockBench
17652b4e3de5 optimization: Bulk serialization writes in `WriteBlockUndo` and `WriteBlock`
18Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = d2b72b13699cf460ffbcb1028bcf5f3b07d3b73a)
19  Time (mean ± σ):     41528.104 s ± 354.003 s    [User: 44324.407 s, System: 3074.829 s]
20  Range (min … max):   41277.786 s … 41778.421 s    2 runs
21 
22Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = 652b4e3de5c5e09fb812abe265f4a8946fa96b54)
23  Time (mean ± σ):     38771.457 s ± 441.941 s    [User: 41930.651 s, System: 3222.664 s]
24  Range (min … max):   38458.957 s … 39083.957 s    2 runs
25 
26Relative speed comparison
27        1.07 ±  0.02  COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = d2b72b13699cf460ffbcb1028bcf5f3b07d3b73a)
28        1.00          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = 652b4e3de5c5e09fb812abe265f4a8946fa96b54)

DrahtBot commented at 2:34 pm on December 21, 2024: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31551.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	maflcko, ryanofsky, hodlinator, achow101

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#32043 ([IBD] Tracking PR for speeding up Initial Block Download by l0rinc)
#31144 ([IBD] multi-byte block obfuscation by l0rinc)
#29700 (kernel, refactor: return error status on all fatal errors by ryanofsky)
#29307 (util: explicitly close all AutoFiles that have been written by vasild)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

l0rinc force-pushed on Dec 21, 2024

l0rinc renamed this:
~~optimization: bulk reads(27%)/writes(290%) in [undo]block [de]serialization~~
optimization: bulk reads(27%)/writes(290%) in [undo]block [de]serialization, 6% faster IBD
on Dec 21, 2024

l0rinc force-pushed on Dec 21, 2024

DrahtBot added the label Needs rebase on Jan 22, 2025

l0rinc force-pushed on Jan 22, 2025

DrahtBot removed the label Needs rebase on Jan 22, 2025

l0rinc force-pushed on Jan 22, 2025

l0rinc marked this as ready for review on Jan 22, 2025

l0rinc force-pushed on Jan 22, 2025

DrahtBot commented at 11:14 pm on January 22, 2025: contributor

🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/36027778098

Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.
An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

DrahtBot added the label CI failed on Jan 22, 2025

l0rinc force-pushed on Jan 22, 2025

DrahtBot removed the label CI failed on Jan 23, 2025

l0rinc marked this as a draft on Jan 23, 2025

l0rinc marked this as ready for review on Jan 23, 2025

in src/node/blockstorage.cpp:1029 in 0e9fb2ed33 outdated

1026+
1027     // Open history file to read
1028     AutoFile filein{OpenBlockFile(pos, true)};
1029     if (filein.IsNull()) {
1030-        LogError("%s: OpenBlockFile failed for %s\n", __func__, pos.ToString());
1031+        LogError("OpenBlockFile failed for %s", pos.ToString());

maflcko commented at 1:58 pm on January 24, 2025:

I am not a fan of mixing logging changes into a commit that does something completely different (claimed optimization).

Also, not everyone has srclocation logging enabled, so the log strings should still be unique.

Looks like the prior pull did something similar and now there are two LogError("OpenUndoFile failed"); in two different contexts. So when removing the __func__, it would be good to add back any context that was removed (i.e. whether it was opened for reading or writing).

l0rinc commented at 2:30 pm on January 24, 2025:

Thank you, excellent observation. I have separated the comment updates to a final commit and fixed the previous PR’s comments as well. Done

hodlinator commented at 2:04 pm on April 3, 2025:

(Was just about to advocate for the original version of this change, but agree it’s good to be able to tell errors apart).

l0rinc force-pushed on Jan 24, 2025

in src/node/blockstorage.cpp:690 in f8b1b218b4 outdated

688-    HashVerifier verifier{filein}; // Use HashVerifier as reserializing may lose data, c.f. commit d342424301013ec47dc146a4beb49d5c9319d80a
689     try {
690+        // Read block
691+        filein >> undo_size;
692+        if (undo_size > MAX_SIZE) {
693+            LogError("%s: Refusing to read undo data of size: %d", __func__, undo_size);

maflcko commented at 3:42 pm on January 24, 2025:

nit in f8b1b218b4f8533128b337168b6769e2e4c30c02: I think for new logs, __func__ is not needed. Refusing to read undo data of size looks unique as well.

l0rinc commented at 4:26 pm on January 24, 2025:

Done

in src/node/blockstorage.cpp:682 in f8b1b218b4 outdated

712         LogError("%s: Deserialize or I/O error - %s at %s\n", __func__, e.what(), pos.ToString());
713         return false;
714     }
715 
716-    // Verify checksum
717-    if (hashChecksum != verifier.GetHash()) {

maflcko commented at 3:46 pm on January 24, 2025:

nit in https://github.com/bitcoin/bitcoin/commit/f8b1b218b4f8533128b337168b6769e2e4c30c02: Any reason to re-order this? Should be fine to keep the scopes as-is, since they are so small.

l0rinc commented at 4:09 pm on January 24, 2025:

You mean by leaving a hashChecksum reference outside the try, initialize it inside and validate it outside as well? I can do that (or bring them inside in a separate commit), but I didn’t like that. Can you show me with a diff with the version that you like?

maflcko commented at 4:17 pm on January 24, 2025:

All good, I see it wouldn’t make the diff smaller. Feel free to resolve.

in src/node/blockstorage.cpp:966 in f72c2dd3a0 outdated

962@@ -963,23 +963,34 @@ bool BlockManager::WriteBlockUndo(const CBlockUndo& blockundo, BlockValidationSt
963             return false;
964         }
965         // Open history file to append
966-        AutoFile fileout{OpenUndoFile(pos)};
967+        AutoFile fileout{m_undo_file_seq.Open(pos, false), {}}; // We'll obfuscate ourselves

maflcko commented at 3:54 pm on January 24, 2025:

in f72c2dd3a0c5005bb527587dc5576a19dc6a790d: I wonder if it would be more encapsulated to add a new AutoFile::write_large(std::vector<std::byte>&&) method which consumes a large byte blob. This would allow to keep the xor internal to AutoFile, reducing the complexity here and allowing other places to use it as well?

l0rinc commented at 4:10 pm on January 24, 2025:

Interesting idea - what do others think?

l0rinc commented at 7:16 pm on January 26, 2025:

Done in a separate commit, the result is a lot cleaner, thanks for the suggestion

in src/node/blockstorage.cpp:682 in 2a5c52ac4e outdated

678@@ -679,7 +679,7 @@ bool BlockManager::ReadBlockUndo(CBlockUndo& blockundo, const CBlockIndex& index
679     // Open history file to read
680     AutoFile filein{OpenUndoFile(pos, true)};
681     if (filein.IsNull()) {
682-        LogError("OpenUndoFile failed for %s", pos.ToString());
683+        LogError("%s: OpenUndoFile failed for %s", __func__, pos.ToString());

maflcko commented at 3:57 pm on January 24, 2025:

nit in 2a5c52ac4e0c098dc6dcf354d9e00fc5c78b6dfe: An alternative would be to say LogError("OpenUndoFile failed for %s while reading", _);

l0rinc commented at 4:26 pm on January 24, 2025:

Done

maflcko commented at 4:04 pm on January 24, 2025: member

2 full IBD runs against master for 870k blocks (seeded from real nodes) indicates a 6% total speedup. Note that this is only measurable with full IBD, not with reindex(-chainstate) since the gain comes from block reading/writing.

I don’t understand this. -reindex(-chainstate) requires the read of blocks, so why would it not be measurable? See also #28226 which this pull is inspired by(?) and which claims a speedup in reindex-chainstate.

l0rinc force-pushed on Jan 24, 2025

l0rinc commented at 4:30 pm on January 24, 2025: contributor

which claims a speedup in reindex-chainstate

So the 9% improvement in unserializing seems to translate to roughly 1.2% total runtime improvement in -reindex-chainstate on my machine.

I can measure several runs to see if it makes any difference (to be able to claim anything reliable about a ~1% potential speedup), but the reads are only 27% faster, while the writes are 2.9x faster, so I’m not surprised that reindex doesn’t shine compared to IBD. But I’m currently rerunning them with the rebased PR to make sure my measurements were realistic. If you have any concrete measurement you’d like me to do, feel free to specify them.

l0rinc force-pushed on Jan 26, 2025

l0rinc renamed this:
~~optimization: bulk reads(27%)/writes(290%) in [undo]block [de]serialization, 6% faster IBD~~
optimization: bulk reads(32%)/writes(298%) in [undo]block [de]serialization, ~6% faster IBD
on Jan 27, 2025

l0rinc commented at 4:48 pm on January 27, 2025: contributor

which this pull is inspired by(?)

I did bump into it, but basically reinvented it (not the first time) while investigating why XOR is slow. I’ll add @martinus as a coauthor as well, thanks for the hint @maflcko, he deserves it.

-reindex(-chainstate) requires the read of blocks, so why would it not be measurable?

Thanks for the push, I have redone all the measurements: reindex-chainstate is very stable (see the updated PR description for details), two runs indicate a 4% speedup (with GCC, will try with Clang as well since it indicated a bigger speedup in the microbenchmarks):

0Summary
1  COMMIT=97b4a50c714c801e08bc02648a9c259f284069c2 ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=880000 -dbcache=30000 -printtoconsole=0 -reindex-chainstate -connect=0 ran
2    1.04 ± 0.00 times faster than COMMIT=5acf12bafeb126f2190b3f401f95199e0eea90c9 ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=880000 -dbcache=30000 -printtoconsole=0 -reindex-chainstate -connect=0

While IBD is closer to the user’s experience (having actual block- and undo writes, which is important since they were modified here), it’s also less stable, so I ran 3 rounds of each, indicating a 6% speedup (bringing reindex and IBD speeds closer together):

0Summary
1  COMMIT=0e9fb2ed330960c0e4cd36b077c64ac7d0f84240 ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=880000 -dbcache=30000 -printtoconsole=0 ran
2    1.06 ± 0.02 times faster than COMMIT=5acf12bafeb126f2190b3f401f95199e0eea90c9 ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=880000 -dbcache=30000 -printtoconsole=0

l0rinc force-pushed on Jan 27, 2025

l0rinc renamed this:
~~optimization: bulk reads(32%)/writes(298%) in [undo]block [de]serialization, ~6% faster IBD~~
[IBD] batch `AutoFile` reads/writes during serialization
on Mar 12, 2025

l0rinc renamed this:
~~[IBD] batch `AutoFile` reads/writes during serialization~~
[IBD] batch block reads/writes during `AutoFile` serialization
on Mar 12, 2025

ryanofsky commented at 9:40 pm on March 13, 2025: contributor

Code review d700b14491a27a6f4f9069dde86fe914784346ca

This approach seems reaonable but maybe it would be good to have a more general solution to providing buffered reads and writes instead of having these ad-hoc std::vector<uint8_t> / SpanReader / DataStream / write_large calls.

Ideally instead of:

0std::vector<uint8_t> mem(blk_size);
1filein >> Span{mem};
2SpanReader(mem) >> TX_WITH_WITNESS(block);
3// ...
4fileout.write_large(DataStream(block_size) << TX_WITH_WITNESS(block));

Code could look more like:

0BufferedReader{fileIn, blk_size} >> TX_WITH_WITNESS(block);
1// ...
2BufferedWriter{fileout, block_size} << TX_WITH_WITNESS(block);

But I guess as long as the buffers are only going to be used once and thrown away in this case (and not be fixed-size ring buffers) having buffer classes wouldn’t substantively improve this code much aside from making it look nicer. Buffer stream classes would seem more useful in cases where more fields are serialized and sizes are not known in advance.

Just wanted to suggest this in case there are other places where serialization is slow and buffering could speed it up. We might want a more general solution if that is the case.

l0rinc force-pushed on Mar 14, 2025

l0rinc commented at 2:20 pm on March 14, 2025: contributor

Thanks for the review @ryanofsky

would be good to have a more general solution to providing buffered reads and writes

That’s what I had originally (with a fixed-size buffer, so was even more general than your suggestion), making the call-site-diff trivial: #31539. It was suggested by @theuni that it may be simpler to serialize to buffer directly. It does seem simpler to reason about this one, especially since they basically had the same speed. What do you think?

What other places are candidates for similar buffering?

ryanofsky commented at 3:33 pm on March 14, 2025: contributor

No strong opinion. I guess I’d be a little disappointed if you initially had a general solution that made it easy to enable buffering any place and then were told that instead of using that, you should add kludges to this code and implement ad-hoc buffering here. I don’t think buffering is difficult to reason about and don’t think buffering is the type of thing that needs to be deployed on a case-by-case basis, especially since performance of buffering should mostly depend on where data is being written, not what data is being written. So if there is evidence that XORed file i/o is faster with buffering one place, probably we should expect XORed file i/o with buffering to be faster other places as well.

I have no problems with current version of this PR though, and I can understand reasons why there might be pushback against implementation in #31539 since since the BufferedReadOnlyFile class implemented there duplicated functionality of AutoFile, didn’t support writing, and didn’t work with any type of stream. So it could be faulted not just for introducing a new abstraction, but for the abstraction not being usable in more places and not being compatible with current stream classes.

Would maybe suggest adding a little “Design considerations” or “Alternatives considered” section to end of PR description to link to #31539 and say how current design was chosen, because it would have been useful to see and know about the alternative when reviewing this PR.

l0rinc commented at 4:20 pm on March 14, 2025: contributor

since the BufferedReadOnlyFile class implemented there duplicated functionality of AutoFile, didn’t support writing, and didn’t work with any type of stream

There was a writer as well, but it resulted in some weird CI failures and given this alternative I just closed the PR.

disappointed if you initially had a general solution that made it easy to enable buffering any place and then were told that instead of using that

No, the suggested approach was better, that’s why I kept this one. I’d be curious what other reviewers think, but I don’t mind re-generalizing it (now or in a follow-up, when we find other usages).

XORed file i/o is faster with buffering one place, probably we should expect XORed file i/o with buffering to be faster other places as well

It’s not just the xor though, many small writes seemed to be slower in general, it seems faster if we buffer instead of the OS (likely because of locality, the file is a global resource that needs locking).

don’t think buffering is the type of thing that needs to be deployed on a case-by-case basis

I can get behind that, can you point me to other usages where we could use this?

ryanofsky commented at 5:01 pm on March 14, 2025: contributor

I can get behind that, can you point me to other usages where we could use this?

I don’t know, but if I were looking I’d just grep for places where AutoFile class is used. I don’t have any real concerns here and I’m fine with the current approach. To the extent I do have concerns they are about readability and maintainability of code not performance, so I like an approach that makes it trivial to turn buffering on and off, which is I why thought BufferedReader / BufferedWriter stream adapters suggested in #31551 (comment) might be a good idea. But when I posted that I didn’t know a similar idea had already been tried before, so wouldn’t suggest changing the approach again at this point without feedback from the other reviewers.

l0rinc force-pushed on Mar 14, 2025

l0rinc commented at 11:09 pm on March 14, 2025: contributor

Thanks again for the hint @ryanofsky, I went through all changes again and reworked the batching - it’s so much better this way and it includes your idea, @theuni’s and @maflcko’s.

Additionally I fixed a few things after rebase:

There was a leftover SaveBlock from the previous refactor;
Commits and docs still contained build/src/benc/bench_bitcoin - changed them to build/bin/bench_bitcoin;
BLOCK_SERIALIZATION_HEADER_SIZE is used inline in the examples now and is too long - simplified it in a separate commit;
Added BufferedFileW and BufferedFileR, encapsulating the buffering (and the new AutoFile::write_large) so that we can have a similar usage to what you suggested;
I have also extracted the low-risk refactorings (which are needed for the batching later) to a separate commit to minimize the diffs in the risky commits.

The performance is same as before, the commits should be a lot simpler to review.

DrahtBot added the label Needs rebase on Mar 20, 2025

l0rinc force-pushed on Mar 20, 2025

DrahtBot added the label CI failed on Mar 20, 2025

DrahtBot commented at 1:35 pm on March 20, 2025: contributor

🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/39110927855

Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.
An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

in src/streams.h:485 in cf019b73a7 outdated

480+    explicit BufferedFileW(AutoFile& file, const uint32_t buffer_size)
481+        : m_file(file), m_buffer_size{buffer_size}, m_buf{buffer_size} {}
482+
483+    ~BufferedFileW()
484+    {
485+        Assert(m_buf.size() <= m_buffer_size);

maflcko commented at 2:25 pm on March 20, 2025:

This could be an assume, given that the program can continue normally, albeit it is a pessimization or possible OOM on low memory machines?

l0rinc commented at 4:48 pm on March 20, 2025:

This could be an assume

I’m fine with both, thanks, changed reader & writer.

This could be an assume, given that the program can continue normally, albeit it is a pessimization or possible OOM on low memory machines?

You mean reading the whole block into memory could be the last straw? Can you please detail the pessimization scenario?

maflcko commented at 5:37 pm on March 20, 2025:

I just mean that the assert gives a hard upper bound, but the assume is a bit weaker on release builds, so if the code is used in a place where the upper bound is too small, the assume-version would continue to work on machines with enough memory and only crash “when needed” (aka when running oom).

DrahtBot removed the label Needs rebase on Mar 20, 2025

l0rinc force-pushed on Mar 20, 2025

DrahtBot removed the label CI failed on Mar 20, 2025

ryanofsky commented at 8:00 pm on March 24, 2025: contributor

Code review 2c073b90ae977af1ac01164a894edd978529fdc6. I think the new BufferedFile approach implemented is reasonable and a little nicer than the previous approach which made lower-level calls in block storage code to allocate and pass buffers. But the need to hardcode and compute sizes in 8782d6ed09bbb947426c1ed54ad4d5188326764d and 2c073b90ae977af1ac01164a894edd978529fdc6 still seems to make the storage code more awkward and fragile, and require earlier refactoring changes that make the PR more complicated overall.

If it would be possible, and if it would ok with you and other reviewers, I would prefer if it we could make minimal changes to the blockstorage code and instead add BufferedReader/BufferedWriter streams that only add buffering without making other changes. A diff of what this could look like is:

  0--- a/src/node/blockstorage.cpp
  1+++ b/src/node/blockstorage.cpp
  2@@ -660,11 +660,12 @@ bool BlockManager::ReadBlockUndo(CBlockUndo& blockundo, const CBlockIndex& index
  3     const FlatFilePos pos{WITH_LOCK(::cs_main, return index.GetUndoPos())};
  4 
  5     // Open history file to read
  6-    AutoFile filein{OpenUndoFile(pos, true)};
  7-    if (filein.IsNull()) {
  8+    AutoFile file{OpenUndoFile(pos, true)};
  9+    if (file.IsNull()) {
 10         LogError("OpenUndoFile failed for %s", pos.ToString());
 11         return false;
 12     }
 13+    BufferedReader filein{file};
 14 
 15     // Read block
 16     uint256 hashChecksum;
 17@@ -937,11 +938,12 @@ bool BlockManager::WriteBlockUndo(const CBlockUndo& blockundo, BlockValidationSt
 18             return false;
 19         }
 20         // Open history file to append
 21-        AutoFile fileout{OpenUndoFile(pos)};
 22-        if (fileout.IsNull()) {
 23+        AutoFile file{OpenUndoFile(pos)};
 24+        if (file.IsNull()) {
 25             LogError("OpenUndoFile failed");
 26             return FatalError(m_opts.notifications, state, _("Failed to write undo data."));
 27         }
 28+        BufferedWriter fileout{file};
 29 
 30         // Write index header
 31         fileout << GetParams().MessageStart() << blockundo_size;
 32@@ -986,11 +988,12 @@ bool BlockManager::ReadBlock(CBlock& block, const FlatFilePos& pos) const
 33     block.SetNull();
 34 
 35     // Open history file to read
 36-    AutoFile filein{OpenBlockFile(pos, true)};
 37-    if (filein.IsNull()) {
 38+    AutoFile file{OpenBlockFile(pos, true)};
 39+    if (file.IsNull()) {
 40         LogError("%s: OpenBlockFile failed for %s\n", __func__, pos.ToString());
 41         return false;
 42     }
 43+    BufferedReader filein{file};
 44 
 45     // Read block
 46     try {
 47@@ -1039,11 +1042,12 @@ bool BlockManager::ReadRawBlock(std::vector<uint8_t>& block, const FlatFilePos&
 48         return false;
 49     }
 50     hpos.nPos -= 8; // Seek back 8 bytes for meta header
 51-    AutoFile filein{OpenBlockFile(hpos, true)};
 52-    if (filein.IsNull()) {
 53+    AutoFile file{OpenBlockFile(hpos, true)};
 54+    if (file.IsNull()) {
 55         LogError("%s: OpenBlockFile failed for %s\n", __func__, pos.ToString());
 56         return false;
 57     }
 58+    BufferedReader filein{file};
 59 
 60     try {
 61         MessageStartChars blk_start;
 62@@ -1082,12 +1086,13 @@ FlatFilePos BlockManager::WriteBlock(const CBlock& block, int nHeight)
 63         LogError("FindNextBlockPos failed");
 64         return FlatFilePos();
 65     }
 66-    AutoFile fileout{OpenBlockFile(pos)};
 67-    if (fileout.IsNull()) {
 68+    AutoFile file{OpenBlockFile(pos)};
 69+    if (file.IsNull()) {
 70         LogError("OpenBlockFile failed");
 71         m_opts.notifications.fatalError(_("Failed to write block."));
 72         return FlatFilePos();
 73     }
 74+    BufferedWriter fileout{file};
 75 
 76     // Write index header
 77     fileout << GetParams().MessageStart() << block_size;
 78--- a/src/streams.cpp
 79+++ b/src/streams.cpp
 80@@ -64,6 +64,15 @@ void AutoFile::read(std::span<std::byte> dst)
 81     }
 82 }
 83 
 84+void AutoFile::read_buffer(std::span<std::byte>& dst)
 85+{
 86+    size_t size{detail_fread(dst)};
 87+    if (size != dst.size() && !feof()) {
 88+        throw std::ios_base::failure("AutoFile::read_buffer: fread failed");
 89+    }
 90+    dst = dst.first(size);
 91+}
 92+
 93 void AutoFile::ignore(size_t nSize)
 94 {
 95     if (!m_file) throw std::ios_base::failure("AutoFile::ignore: file handle is nullptr");
 96@@ -102,6 +111,16 @@ void AutoFile::write(std::span<const std::byte> src)
 97     }
 98 }
 99 
100+void AutoFile::write_buffer(std::span<std::byte> src)
101+{
102+    if (!m_file) throw std::ios_base::failure("AutoFile::write_buffer: file handle is nullptr");
103+    util::Xor(src, m_xor, *m_position); // obfuscate in-place
104+    if (std::fwrite(src.data(), 1, src.size(), m_file) != src.size()) {
105+        throw std::ios_base::failure("AutoFile::write_buffer: write failed");
106+    }
107+    if (m_position) *m_position += src.size();
108+}
109+
110 bool AutoFile::Commit()
111 {
112     return ::FileCommit(m_file);
113--- a/src/streams.h
114+++ b/src/streams.h
115@@ -465,6 +465,11 @@ public:
116         ::Unserialize(*this, obj);
117         return *this;
118     }
119+
120+    //! Read into a mutable buffer more flexibly than read(), shortening the buffer if reached EOF.
121+    void read_buffer(std::span<std::byte>& dst);
122+    //! Write a mutable buffer more efficiently than write(), XORing the buffer in-place.
123+    void write_buffer(std::span<std::byte> src);
124 };
125 
126 /** Wrapper around an AutoFile& that implements a ring buffer to
127@@ -614,4 +619,72 @@ public:
128     }
129 };
130 
131+template<typename S>
132+class BufferedReader
133+{
134+    S& m_stream;
135+    std::vector<std::byte> m_buffer;
136+    size_t m_pos; //!< Next m_buffer position to read.
137+public:
138+    BufferedReader(S& stream, size_t size=65536) : m_stream{stream}, m_buffer{size}, m_pos{size} {}
139+    void read(std::span<std::byte> dst)
140+    {
141+        while (!dst.empty()) {
142+            if (m_pos == m_buffer.size()) {
143+                std::span<std::byte> data{m_buffer};
144+                m_stream.read_buffer(data);
145+                m_buffer.resize(data.size()); // shrink on eof
146+                m_pos = 0;
147+            }
148+            if (m_buffer.size() == 0) throw std::ios_base::failure("BufferedReader::read: end of file");
149+            size_t n = std::min(dst.size(), m_buffer.size() - m_pos);
150+            std::copy(m_buffer.begin() + m_pos, m_buffer.begin() + m_pos + n, dst.begin());
151+            m_pos += n;
152+            dst = dst.subspan(n);
153+        }
154+    }
155+
156+    template <typename T>
157+    BufferedReader& operator>>(T&& obj)
158+    {
159+        ::Unserialize(*this, obj);
160+        return *this;
161+    }
162+};
163+
164+template<typename S>
165+class BufferedWriter
166+{
167+    S& m_stream;
168+    std::vector<std::byte> m_buffer;
169+    size_t m_pos; //!< Next m_buffer position to write.
170+public:
171+    BufferedWriter(S& stream, size_t size=65536) : m_stream{stream}, m_buffer{size}, m_pos{0} {}
172+    ~BufferedWriter() { flush(); }
173+
174+    void write(std::span<const std::byte> src)
175+    {
176+        while (!src.empty()) {
177+            size_t n = std::min(src.size(), m_buffer.size() - m_pos);
178+            std::copy(src.begin(), src.begin() + n, m_buffer.begin() + m_pos);
179+            m_pos += n;
180+            if (m_pos == m_buffer.size()) flush();
181+            src = src.subspan(n);
182+        }
183+    }
184+
185+    void flush()
186+    {
187+         m_stream.write_buffer(std::span{m_buffer}.first(m_pos));
188+         m_pos = 0;
189+    }
190+
191+    template <typename T>
192+    BufferedWriter& operator<<(const T& obj)
193+    {
194+        ::Serialize(*this, obj);
195+        return *this;
196+    }
197+};
198+
199 #endif // BITCOIN_STREAMS_H

This is untested but compiles. If it’s workable I think it would be a better direction to go in, both to keep blockstorage code simpler, but also to provide better support for buffered reads and writes in the stream API in case there are other parts of code that could benefit from them.

I am also ok with with current approach and happy to review if there are issues with this or it’s just not the direction we want to go.

l0rinc commented at 9:00 pm on March 24, 2025: contributor

@ryanofsky, thanks again for the review. Your suggestion seems to point back to the initial implementation of this PR, i.e. https://github.com/bitcoin/bitcoin/pull/31539/commits, where we didn’t know the full size of the block, and read and wrote in chunks. This current alternative was explicitly suggested by @theuni - and while it’s a bit messy to always read the size, the resulting code is faster and seem to have fewer moving parts (e.g. read doesn’t mutate the parameter, write doesn’t copy the write buffer, doesn’t loop inside on buffers of different size, etc), it’s why I went this way. Maybe there’s a middle ground by allocating max block size, ignoring the actual size - which would potentially simplify both approaches - would that be better than the current solutions? I’ll investigate tomorrow…

ryanofsky commented at 9:37 pm on March 24, 2025: contributor

@ryanofsky, thanks again for the review. Your suggestion seems to point back to the initial implementation of this PR, i.e. https://github.com/bitcoin/bitcoin/pull/31539/commits

Absolutely yes. There are a few things I would change about that PR but overall it seems a lot cleaner and simpler to me than this PR. If it doesn’t perform as well as this one that is surprising, because I thought the performance improvement here came from buffering, not from choosing buffers with magic sizes. But if you and other reviewers agree the current approach is better than the previous approach implemented in #31539, I guess this is the way to go, and I don’t mind reviewing it as-is.

l0rinc force-pushed on Mar 26, 2025

l0rinc commented at 2:54 pm on March 26, 2025: contributor

@ryanofsky, I’ve experimented with your version, turns out I’m mostly fine with both solutions. https://github.com/bitcoin/bitcoin/compare/2c073b90ae977af1ac01164a894edd978529fdc6..06f37689a3b79f2f65ac95d8ae54a407ed2d4502

I have based this version on the patch you’ve posted - with minor differences.

Removed all introduced undo_size seek backs - except in ReadRawBlock, which already steps back to get access to the size. I have cleaned it up in a separate commit - we don’t need buffering there since we’re not serializing anything there, just reading the whole block directly already.
read_buffer/write_buffer introduces non-trivial Duck Typing (i.e. now users have to know if the want to call void write(std::span<const std::byte> src); or void write_buffer(std::span<std::byte> src);), I don’t particularly like it, but we can live with it.
std::copy can be simplified to std::copy_n
the new AutoFile::write_buffer can now be used in AutoFile::write to reduce duplication
Changed read_buffer to return the size instead of mutating the parameter, this enabled us to avoid resizing the buffer for eof in BufferedReader (the benchmarks caught the slowdown)
Used DataStream as buffers
Added tests for the buffered writers - for the read/write methods and separately for the serialization part

The ReadBlockBench/WriteBlockBench speedups changed slightly from 24%/31% to 19%/24% on Linux/GCC, and from 32%/298% on Mac to 30%/ 194% To see how real performance is affected, I’m rerunning a few full IBDs to measure the performance change after the buffering changes (reindex won’t suffice here).

l0rinc force-pushed on Mar 28, 2025

l0rinc commented at 10:23 am on March 28, 2025: contributor

@maflcko / @ryanofsky, this implementation seems to fail in the same weird way as https://github.com/bitcoin/bitcoin/actions/runs/12918565902/job/36027337187?pr=31539 - with 100% tests passing. The last PR indicates it’s the batch writing part - could it be an OOM because the sanitizer overhead in DataStream#resize(1 << 20)? WriteBlockBench indicates that even a buffer size of 1 << 16 is half the speed (Linux/GCC speed is the same) - but I’ll try it, let’s see if it passes CI at least.

maflcko commented at 10:48 am on March 28, 2025: member

Does CI pass locally before and after? Measuring locally, what is the memory usage of the CI before and after?

It seems odd that using exact block sizes (this pull initially) passes, whereas using a 1<<20 approximation fails due to OOM.

l0rinc commented at 7:47 pm on March 30, 2025: contributor

It seems odd that using exact block sizes (this pull initially) passes, whereas using a 1«20 approximation fails due to OOM

Yes, I thought of the same, but the other PR had the same issue. I’m not familiar enough with the intricacies of ASan + LSan + UBSan + integer, no depends, USDT to have a good enough explanation - besides the instrumentation overhead causing it to overflow.

What is the memory usage of the CI before and after?

I have measured a valgrind/massif IBD until the be benchmarked block #413567, indicating that there isn’t a significant memory usage change - not sure about the CI usage, that seems messy to measure.

 0COMMITS="06f37689a3b79f2f65ac95d8ae54a407ed2d4502 8fe73a102f5b8406f166d7ac4b1a36bba84de7c4"; \
 1STOP_HEIGHT=413567; DBCACHE=1000; \
 2C_COMPILER=gcc; CXX_COMPILER=g++; \
 3BASE_DIR="/mnt/my_storage"; DATA_DIR="$BASE_DIR/BitcoinData"; LOG_DIR="$BASE_DIR/logs"; \
 4for COMMIT in $COMMITS; do \
 5  git fetch --all -q && git fetch origin $COMMIT && git log -1 --oneline $COMMIT && \
 6  killall bitcoind; rm -rf $DATA_DIR/* && git checkout $COMMIT && git clean -fxd && git reset --hard && \
 7  cmake -B build -DCMAKE_BUILD_TYPE=Release -DENABLE_WALLET=OFF -DCMAKE_C_COMPILER=$C_COMPILER -DCMAKE_CXX_COMPILER=$CXX_COMPILER && \
 8  cmake --build build -j$(nproc) --target bitcoind && \
 9  ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=1 -printtoconsole=0 && \
10  valgrind --tool=massif --time-unit=ms \
11    --massif-out-file="$LOG_DIR/massif-$COMMIT-$STOP_HEIGHT-$DBCACHE.out" \
12    ./build/bin/bitcoind -datadir=$DATA_DIR \
13    -stopatheight=$STOP_HEIGHT \
14    -dbcache=$DBCACHE \
15    -blocksonly=1 \
16    -printtoconsole=0 && \
17  cp $DATA_DIR/debug.log "$LOG_DIR/debug-$COMMIT-$STOP_HEIGHT-$DBCACHE.log" && \
18  ms_print "$LOG_DIR/massif-$COMMIT-$STOP_HEIGHT-$DBCACHE.out" | tee "$LOG_DIR/massif-$COMMIT-$STOP_HEIGHT-$DBCACHE.txt"; \
19done

06f37689a3 - size_t size = 1 « 20

 0    GB
 11.322^              #
 2     |              #     :     :               @      :    :               :
 3     |             @#     :     :    :          @:     :    :    :     :    :
 4     |            :@#    ::    ::    :    ::    @:     :   ::    :    ::   ::
 5     |            :@#   @::    ::   ::    :     @:    ::   ::    :    ::   @:
 6     |            :@#   @::    ::   ::    :     @:    ::   ::   ::    ::   @:
 7     |            :@#   @::    ::   ::   ::     @:   :::   ::   ::   :::   @:
 8     |            :@#   @::  ::::  :::   ::     @:  @:::  :::  :::   :::  :@:
 9     |          :@:@#  :@::  : ::  :::  :::     @:  @::: ::::  :::   :::  :@:
10     |         ::@:@#  :@::  : ::  :::  :::     @:  @::: ::::  :::   ::: ::@:
11     |        @::@:@#  :@::  : :: @:::  :::    :@:  @::: ::::  :::  @::: ::@:
12     |        @::@:@# ::@:: :: :: @:::  :::    :@:  @::: :::: :::: :@::: ::@::
13     |       @@::@:@# ::@:: :: :: @:::  :::    :@: :@::::::::@:::: :@::: ::@::
14     |     ::@@::@:@# ::@:: :: :::@::: :::: @  :@: :@::::::::@:::: :@::::::@::
15     |    :: @@::@:@#:::@::::: :::@::: :::: @  :@: :@::::::::@:::: :@::::::@::
16     |    :: @@::@:@#:::@::::: :::@:::::::: @:::@:::@::::::::@::::::@::::::@::
17     |   ::: @@::@:@#:::@::::: :::@:::::::: @: :@:::@::::::::@::::::@::::::@::
18     | ::::: @@::@:@#:::@::::: :::@:::::::: @: :@:::@::::::::@::::::@::::::@::
19     | ::::: @@::@:@#:::@::::: :::@:::::::: @: :@:::@::::::::@::::::@::::::@::
20     | ::::: @@::@:@#:::@::::: :::@:::::::: @: :@:::@::::::::@::::::@::::::@::
21   0 +----------------------------------------------------------------------->h
22     0                                                                   4.449

vs

8fe73a102f - size_t size = 1 « 16

 0    GB
 11.336^                 #
 2     |           @     #                   :   :           :     :
 3     |           @     #    :     :        :   :     :     :     :    :    :
 4     |          @@     #    :     :     :  :   :     :    ::    @:    :    :
 5     |          @@    :#    :    ::     :  :  ::     :   :::   :@:   ::   ::
 6     |          @@    :#    :    ::     :  :  ::    @:   :::   :@:   ::   ::
 7     |          @@    :#   @:    ::     :  :  ::    @:   :::  ::@:  :::   ::
 8     |        @@@@   ::#   @:   :::   :::  : :::  ::@:  @:::  ::@:  :::   ::
 9     |        @ @@   ::#  :@:   :::   : :  : :::  ::@:  @:::  ::@:  :::  :::
10     |      @@@ @@   ::#  :@:  ::::   : : :: :::  ::@:  @::: :::@: ::::  :::
11     |      @@@ @@ ::::#  :@:  ::::   : : :: :::  ::@: :@::: :::@: :::: :::: :
12     |      @@@ @@ : ::# ::@:  ::::  @: : ::@:::  ::@: :@:::::::@:::::: :::: :
13     |    ::@@@ @@ : ::# ::@:  ::::  @: : ::@::: :::@:::@:::::::@::::::@:::: :
14     |   :: @@@ @@:: ::#:::@:  ::::  @: ::::@::: :::@:::@:::::::@::::::@:::: :
15     |   :: @@@ @@:: ::#:::@:  ::::  @: ::::@:::@:::@:::@:::::::@::::::@::::::
16     |  ::: @@@ @@:: ::#:::@:@@::::::@: ::::@:::@:::@:::@:::::::@::::::@::::::
17     | :::: @@@ @@:: ::#:::@:@ ::::: @: ::::@:::@:::@:::@:::::::@::::::@::::::
18     | :::: @@@ @@:: ::#:::@:@ ::::: @: ::::@:::@:::@:::@:::::::@::::::@::::::
19     | :::: @@@ @@:: ::#:::@:@ ::::: @: ::::@:::@:::@:::@:::::::@::::::@::::::
20     | :::: @@@ @@:: ::#:::@:@ ::::: @: ::::@:::@:::@:::@:::::::@::::::@::::::
21   0 +----------------------------------------------------------------------->h
22     0                                                                   4.111

Does CI pass locally before and after?

It’s passing here as well now with 1 << 16 instead of 1 << 20.

in src/node/blockstorage.cpp:672 in fbe701bf32 outdated

665@@ -666,24 +666,26 @@ bool BlockManager::ReadBlockUndo(CBlockUndo& blockundo, const CBlockIndex& index
666         return false;
667     }
668 
669-    // Read block
670-    uint256 hashChecksum;
671-    HashVerifier verifier{filein}; // Use HashVerifier as reserializing may lose data, c.f. commit d342424301013ec47dc146a4beb49d5c9319d80a
672     try {
673+        // Read block
674+        HashVerifier verifier{filein}; // Use HashVerifier, as reserializing may lose data, c.f. commit d3424243

ryanofsky commented at 0:11 am on April 2, 2025:

In commit “refactor: collect block read operations into try block” (fbe701bf32a33766d06bec58e4c0fb70e94f60b1)

It doesn’t seem great to move HashVerifier constructor and GetHash call inside the try block to catch std::exceptions when we never expect these methods to throw them. Can commit description be a little clearer about why this is necessary? Or is it not necessary but just being done as a kind of defensive programming? I guess it seems ok either way but would be good to be clear about the specfic reason in the commit description.

l0rinc commented at 8:43 am on April 2, 2025:

I know that try/catch is often used as narrowly as possible, as if it were an additional return value, but catching an error type means that we shouldn’t really care where it’s coming from. That enables us to separate the happy path (uninterrupted flow in the try) from the error cases (the catches).

We can wrap only the method that can currently throw, but that would break the logical flow, it would mix the main algorithm with the error handling. I know we often do that, but I wanted to untangle it here (especially since it was needed for the previous implementation, now it’s just a cleanup).

Edit: added a short line mentioning separation of concerns to the commit message.

ryanofsky commented at 2:03 pm on April 2, 2025:

re: #31551 (review)

Edit: added a short line mentioning separation of concerns to the commit message.

Thanks! Should “In preparation for upcoming changes” be dropped from the commit description then? That was the part that confused me. I was trying to figure out what was motivating this change and I couldn’t see any connection between catching HashVerifier errors and buffering (either implementation)

maflcko commented at 6:51 am on April 10, 2025:

in 467d2b380f4a780b9e17fe64d08b22e61f4a2554:

I don’t understand the commit message. It says “HashVerifier was included in the try/catch to include the undo_size serialization there as well”. However, I don’t see any undo_size symbol and git grep '\<undo_size\>' seems to agree.

l0rinc commented at 3:00 pm on April 10, 2025:

Good catch, it’s a leftover from https://github.com/bitcoin/bitcoin/commit/f72c2dd3a0c5005bb527587dc5576a19dc6a790d#diff-114c2880ec1ff2c5293ac65ceda0637bf92c05745b74b58410585a549464a33fR977-R989

in src/node/blockstorage.h:78 in 4d90db8b96 outdated

74@@ -75,10 +75,10 @@ static const unsigned int UNDOFILE_CHUNK_SIZE = 0x100000; // 1 MiB
75 static const unsigned int MAX_BLOCKFILE_SIZE = 0x8000000; // 128 MiB
76 
77 /** Size of header written by WriteBlock before a serialized CBlock (8 bytes) */
78-static constexpr size_t BLOCK_SERIALIZATION_HEADER_SIZE{std::tuple_size_v<MessageStartChars> + sizeof(unsigned int)};
79+static constexpr size_t HEADER_BYTE_SIZE{std::tuple_size_v<MessageStartChars> + sizeof(unsigned int)};

ryanofsky commented at 0:22 am on April 2, 2025:

In commit “refactor: shorten BLOCK_SERIALIZATION_HEADER_SIZE” (4d90db8b960bf1ea969bedfebfdceccfabf6f9a5)

Maybe consider a more descriptive name like node::STORAGE_HEADER_BYTES or node::BLOCKFILE_HEADER_BYTES because the name node::HEADER_BYTE_SIZE is very generic, there’s no indication it is linked to blocks or storage, and code using it is less clear.

l0rinc commented at 11:02 am on April 2, 2025:

Changed to ::BLOCK_HEADER_BYTES in code and commit messages (it’s not a BLOCKFILE header, right?)

in src/node/blockstorage.cpp:1038 in fc97b81996 outdated

1034@@ -1035,11 +1035,11 @@ bool BlockManager::ReadRawBlock(std::vector<uint8_t>& block, const FlatFilePos&
1035     FlatFilePos hpos = pos;
1036     // If nPos is less than 8 the pos is null and we don't have the block data
1037     // Return early to prevent undefined behavior of unsigned int underflow
1038-    if (hpos.nPos < 8) {
1039+    if (hpos.nPos < HEADER_BYTE_SIZE) {

ryanofsky commented at 0:25 am on April 2, 2025:

In commit “refactor: replace raw values in ReadRawBlock with HEADER_BYTE_SIZE” (fc97b819969bb6c31245a8e2e0b5987881329967)

Right now we have comments in the code saying that HEADER_BYTE_SIZE is 8, but maybe it would be good to replace comment with a static_assert enforcing that. Could give more confidence in this code and prevent breakage in case size calculation is updated in the future.

l0rinc commented at 11:03 am on April 2, 2025:

The comment just referred to the hard-coded value, I left it like that for extra info - but I’ll just remove the duplication - no need to assert.

in src/streams.h:628 in 29678d79e7 outdated

617@@ -614,4 +618,41 @@ class BufferedFile
618     }
619 };
620 
621+template <typename S>
622+class BufferedReader

ryanofsky commented at 1:31 am on April 2, 2025:

In commit “optimization: Bulk serialization reads in UndoRead, ReadBlock” (29678d79e7e127bf2694f83194f47e0bfd9849b9)

Implementation of this class seems reasonable, though I think some improvements would be possible, if interested:

I think it would be a little better to replace while loop here with direct read approach in BufferedReadOnlyFile::read from #31539 that skips reading into the buffer for the initial read and for large reads after that. I don’t think changing this should matter for this PR since the reads here should be small, but the other approach seems strictly better in general since it avoids pointless double copying and looping in the case of large reads (larger than the buffer size).
The DataStream m_buffer and size_t m_size fields here contain redundant size and position information that is not needed by this class, introducing unnecessary complexity and overhead and potential for bugs. It would be better if this class just had vector+position members as in #31551#pullrequestreview-2711550079, or if it just had a DataStream member since a DataStream object itself consists of a vector+position value.

l0rinc commented at 11:03 am on April 2, 2025:

Meh, DataStream is indeed forced here, simplified it, what you think of this? Reverted the implementation mostly to the version in #31539, as requested.

ryanofsky commented at 2:50 pm on April 2, 2025:

re: #31551 (review)

Reverted the implementation mostly to the version in #31539, as requested.

Thanks, new implementation seems to take care of both issues and also be a broader simpilfication.

in src/streams.h:670 in 8fe73a102f outdated

665+    size_t m_pos;
666+    size_t m_size;
667+
668+    void flush()
669+    {
670+        Assume(m_pos != 0);

ryanofsky commented at 1:41 am on April 2, 2025:

In commit “optimization: Bulk serialization writes in WriteBlockUndo and WriteBlock” (8fe73a102f5b8406f166d7ac4b1a36bba84de7c4)

This assume doesn’t seem good. Should just opening a stream and not writing data abort the program?

l0rinc commented at 11:04 am on April 2, 2025:

It would seem fishy to me if we’d do that currently, but made flush public and changed the assume to a guard

in src/streams.h:668 in 8fe73a102f outdated

663+    S& m_stream;
664+    DataStream m_buffer;
665+    size_t m_pos;
666+    size_t m_size;
667+
668+    void flush()

ryanofsky commented at 1:45 am on April 2, 2025:

In commit “optimization: Bulk serialization writes in WriteBlockUndo and WriteBlock” (8fe73a102f5b8406f166d7ac4b1a36bba84de7c4)

I think this should be a public method and callers should be able control when data is flushed without needing to destroy the object. Implementing buffering without ability to flush would seem like an unusual choice.

l0rinc commented at 11:04 am on April 2, 2025:

Done

in src/streams.h:666 in 8fe73a102f outdated

661+class BufferedWriter
662+{
663+    S& m_stream;
664+    DataStream m_buffer;
665+    size_t m_pos;
666+    size_t m_size;

ryanofsky commented at 1:55 am on April 2, 2025:

In commit “optimization: Bulk serialization writes in WriteBlockUndo and WriteBlock” (8fe73a102f5b8406f166d7ac4b1a36bba84de7c4)

Again m_size and m_pos are redundant because DataStream already contains a size and position. I didn’t look into whether it’s possible to use DataStream directly here, but it would be nice to actually use the state it carries or just replace it with a plain std::vector and position like #31551#pullrequestreview-2711550079 so the class is not tracking unused state that adds complexity and could cause bugs.

l0rinc commented at 11:04 am on April 2, 2025:

Replaced with vector and pos, thanks

ryanofsky approved

ryanofsky commented at 2:35 am on April 2, 2025: contributor

Code review ACK 8fe73a102f5b8406f166d7ac4b1a36bba84de7c4, since the code seems right and I didn’t see any issues, but IIUC something unexpected happens if (1«20) buffer sizes are used as described #31551 (comment) so that sounds like something that needs to be figured out.

Would be curious to know if other reviewers like the current approach, since a number of approaches have been tried, and this moves closer to some earlier approaches that maybe other reviewers didn’t like. I like the current approach because I was pushing for it, but I only suggested it because I didn’t know something like it had been tried before, so maybe there are drawbacks I don’t see here or this is just not appealing. I’d be happy with whatever approach can get accepted if not this one.

re: #31551 (comment)

read_buffer/write_buffer introduces non-trivial Duck Typing (i.e. now users have to know if the want to call void write(std::span<const std::byte> src); or void write_buffer(std::span<std::byte> src);), I don’t particularly like it, but we can live with it.

Would be curious what this means. The two write methods have different behavior because one of them mutates the buffer and the other one doesn’t. It would seem both write methods have different use-cases, and same for the read methods.

l0rinc force-pushed on Apr 2, 2025

l0rinc commented at 11:21 am on April 2, 2025: contributor

but IIUC something unexpected happens if (1«20) buffer sizes are used

Seems like a simple OOM caused by instrumentation overhead - @maflcko, can you help us out here, does this need more investigation?

read_buffer/write_buffer introduces non-trivial Duck Typing

would be curious what this means. The two write methods have different behavior because one of them mutates the buffer and the other one doesn’t. It would seem both write methods have different use-cases, and same for the read methods.

Yes, I mean we don’t have segregated interfaces which predestine the usages that are allowed (e.g. how a bufferedfile could have a buffered-reader and a buffered-writer interface to only allow a subset of the functionality). Here AutoFile doesn’t implement a Bufferable interface, and the BufferedReader doesn’t accept a explicit BufferedReader(Bufferable& stream, so there’s an implicit contract (incidental interface), that you can only use BufferedReader for a stream if they happen to implement read_buffer, because it’s used inside (i.e. compile-time duck typing). So if somebody wants to make any other stream BufferedReader compatible, they will have to look at the actual implementation details since there isn’t an interface guiding them. It’s fine, I don’t mind, just mentioned that it’s a slight complication in my opinion.

Rebased the PR in a separate push to hopefully resolve the unrelated Windows CI failure.

l0rinc requested review from ryanofsky on Apr 2, 2025

l0rinc requested review from maflcko on Apr 2, 2025

maflcko commented at 2:02 pm on April 2, 2025: member

but IIUC something unexpected happens if (1«20) buffer sizes are used

Seems like a simple OOM caused by instrumentation overhead - @maflcko, can you help us out here, does this need more investigation?

It is not a blocker, it just makes me curious that static buffer sizes cause it and dynamic ones don’t. If you want to look more into this, you could try IBD (or AU-IBD) again with asan enabled?

in src/node/blockstorage.cpp:999 in 5d3cd74ac8 outdated

 995@@ -995,7 +996,7 @@ bool BlockManager::ReadBlock(CBlock& block, const FlatFilePos& pos) const
 996 
 997     try {
 998         // Read block
 999-        filein >> TX_WITH_WITNESS(block);
1000+        BufferedReader{filein} >> TX_WITH_WITNESS(block);

ryanofsky commented at 2:31 pm on April 2, 2025:

In commit “optimization: Bulk serialization reads in UndoRead, ReadBlock” (5d3cd74ac8f6acd381a11fe469d37d3c8b1630e1)

There’s no bug here but it seems potentially unsafe to construct a buffered reader as a temporary wrapper around a stream, because the reader can read extra data and leave the underlying stream at an unpredictable position.

Buffered writers don’t have this problem because they flush when they are destroyed, and readers could perhaps avoid the problem in a similar way by rewinding when destroyed. But it seems like a more straightforward way to avoid this problem and make buffered readers safer would to be give them ownership of the underlying stream and require it to moved so there are errors or warnings if it is used incorrectly.

 0--- a/src/node/blockstorage.cpp
 1+++ b/src/node/blockstorage.cpp
 2@@ -665,7 +665,7 @@ bool BlockManager::ReadBlockUndo(CBlockUndo& blockundo, const CBlockIndex& index
 3         LogError("OpenUndoFile failed for %s while reading", pos.ToString());
 4         return false;
 5     }
 6-    BufferedReader filein{file};
 7+    BufferedReader filein{std::move(file)};
 8 
 9     try {
10         // Read block
11@@ -989,15 +989,16 @@ bool BlockManager::ReadBlock(CBlock& block, const FlatFilePos& pos) const
12     block.SetNull();
13 
14     // Open history file to read
15-    AutoFile filein{OpenBlockFile(pos, true)};
16-    if (filein.IsNull()) {
17+    AutoFile file{OpenBlockFile(pos, true)};
18+    if (file.IsNull()) {
19         LogError("%s: OpenBlockFile failed for %s", __func__, pos.ToString());
20         return false;
21     }
22+    BufferedReader filein{std::move(file)};
23 
24     try {
25         // Read block
26-        BufferedReader{filein} >> TX_WITH_WITNESS(block);
27+        filein >> TX_WITH_WITNESS(block);
28     } catch (const std::exception& e) {
29         LogError("%s: Deserialize or I/O error - %s at %s", __func__, e.what(), pos.ToString());
30         return false;
31--- a/src/streams.h
32+++ b/src/streams.h
33@@ -630,7 +630,8 @@ class BufferedReader
34     size_t m_buf_pos;
35 
36 public:
37-    explicit BufferedReader(S& stream, size_t size = 1 << 16) : m_src{stream}, m_buf(size), m_buf_pos{size} {}
38+    explicit BufferedReader(S&& stream, size_t size = 1 << 16)
39+    requires std::is_rvalue_reference_v<S&&> : m_src{stream}, m_buf(size), m_buf_pos{size} {}
40 
41     void read(std::span<std::byte> dst)
42     {

Would seem good to make this change, but not important

l0rinc commented at 8:17 pm on April 2, 2025:

Excellent observation, I haven’t thought of that since - as you’ve mentioned - I wasn’t thinking of a general purpose buffered reader originally.

in src/streams.cpp:67 in 5d3cd74ac8 outdated

63@@ -64,6 +64,16 @@ void AutoFile::read(std::span<std::byte> dst)
64     }
65 }
66 
67+size_t AutoFile::read_buffer(std::span<std::byte> dst)

ryanofsky commented at 2:41 pm on April 2, 2025:

In commit “optimization: Bulk serialization reads in UndoRead, ReadBlock” (5d3cd74ac8f6acd381a11fe469d37d3c8b1630e1)

Looks like read_buffer method is unused so would be great to drop and simplify this commit.

l0rinc commented at 8:16 pm on April 2, 2025:

Ooops, letftover after all this refactoring back and forth :)

in src/streams.h:628 in 5d3cd74ac8 outdated

619@@ -614,4 +620,39 @@ class BufferedFile
620     }
621 };
622 
623+template <typename S>
624+class BufferedReader

ryanofsky commented at 3:13 pm on April 2, 2025:

In commit “optimization: Bulk serialization reads in UndoRead, ReadBlock” (5d3cd74ac8f6acd381a11fe469d37d3c8b1630e1)

Could consider adding a comment like:

0/**
1 * Stream wrapper that buffers reads from an underlying stream to improve
2 * deserialization performance. Requires underlying stream to support read() and
3 * detail_fread() calls to support fixed-size and variable-sized reads,
4 * respectively (this interface could be improved and formalized in the future.)
5 */

l0rinc commented at 8:16 pm on April 2, 2025:

Added something similar to both methods, thank you

in src/streams.h:636 in 5d3cd74ac8 outdated

631+    explicit BufferedReader(S& stream, size_t size = 1 << 16) : m_src{stream}, m_buf(size), m_buf_pos{size} {}
632+
633+    void read(std::span<std::byte> dst)
634+    {
635+        if (m_buf_pos < m_buf.size()) {
636+            const auto available{Assume(std::min(dst.size(), m_buf.size() - m_buf_pos))};

ryanofsky commented at 3:24 pm on April 2, 2025:

In commit “optimization: Bulk serialization reads in UndoRead, ReadBlock” (5d3cd74ac8f6acd381a11fe469d37d3c8b1630e1)

The Assume here will abort the program if dst is empty, which doesn’t seem great when this case could logically be handled by just not reading anything. Treating this as an error is also inconsistent with other stream read methods, and not enforced consistently in this method because if pos == size, the Assume will be skipped. Would suggest just dropping the Assume and not making a special case out of this.

l0rinc commented at 8:16 pm on April 2, 2025:

I have an even better idea:

0        if (const auto available{std::min(dst.size(), m_buf.size() - m_buf_pos)}) {

in src/streams.h:642 in 5d3cd74ac8 outdated

637+            std::copy_n(m_buf.begin() + m_buf_pos, available, dst.begin());
638+            m_buf_pos += available;
639+            dst = dst.subspan(available);
640+        }
641+        if (!dst.empty()) {
642+            Assume(m_buf_pos == m_buf.size());

ryanofsky commented at 3:30 pm on April 2, 2025:

In commit “optimization: Bulk serialization reads in UndoRead, ReadBlock” (5d3cd74ac8f6acd381a11fe469d37d3c8b1630e1)

Is there a reason assume not assert is used here? If this condition is false it would seem like a serious internal error that would be good not ignore, and I don’t think there is reason to think assert would be actually worse for performance, assuming the speedup in this PR just comes from making fewer read calls, not from this code being on a hot path. Would also suggest using assert since it’s already used in stream code and this would allow dropping the check.h dependency.

l0rinc commented at 8:16 pm on April 2, 2025:

Ok, changed to assert

ryanofsky approved

ryanofsky commented at 5:03 pm on April 2, 2025: contributor

Code review ACK 71610707a8b69eeef0c5345f78aac40c38fc3b88. Mostly small changes since last review, biggest one was changing the read method to be simpler and avoid looping.

re: #31551 (comment)

Yes, I mean we don’t have segregated interfaces which predestine the usages that are allowed

Understood now thanks. IMO, using duck typing and not defining formal interfaces here is a more of a benefit than a drawback, since it can let this code evolve easily without requiring wider refactoring, and let it be generic without implementing functionality that’s not actually used. But this is a tradeoff.

l0rinc force-pushed on Apr 2, 2025

l0rinc commented at 9:34 pm on April 2, 2025: contributor

BufferedReader takes ownership of the underlying stream now;
read_buffer was removed;
In the tests made sure file_size and max_read_length are never empty and that test state doesn’t propagate between comparisons;
Fixed writeer typo;
~~Added back BufferedWriter size = 1 << 20 - I have a hunch of why it failed, let’s see if it passes now;~~ I have to figure this out before it’s merged

https://github.com/bitcoin/bitcoin/compare/71610707a8b69eeef0c5345f78aac40c38fc3b88..85b0c6d309cea981fda75e8cf268a144ae472d8d

https://github.com/bitcoin/bitcoin/compare/aef99ea38d85d7e56e64bbd0809a7298f21e7b8a..652b4e3de5c5e09fb812abe265f4a8946fa96b54

Rebased again in separate commit since QT5 wasn’t found after the QT6 merge.

DrahtBot added the label CI failed on Apr 3, 2025

in src/streams.h:636 in 652b4e3de5 outdated

630+    S& m_src;
631+    BufferData m_buf;
632+    size_t m_buf_pos;
633+
634+public:
635+    explicit BufferedReader(S&& stream, size_t size = 1 << 16)

hodlinator commented at 8:15 am on April 3, 2025:

nit: 1 << 16 ~= 65KiB. Have a feeling this is a common size of buffers operated on by syscalls, might be nice with a brief comment/reference.

hodlinator commented at 8:35 pm on April 4, 2025:

nit: Could add a comment explaining why we take ownership of stream, a shorter version of #31551 (review) like:

0//! Takes ownership of stream since otherwise we might leave it at an unexpected position due to buffering when we go out of scope.

l0rinc commented at 12:14 pm on April 6, 2025:

Valid question - it wasn’t calculated but rather measured - see #31539:

Testing with various buffer sizes showed that 16 KiB provides the best performance.

This was the value which produces the same performance as reading the whole block into memory (lower values are slower, bigger ones have the same speed).

For writes however this value was measure to be1<<20 (the current 1<<16 there is slower than writing the whole buffer in one go), but for some reason that I’m still investigating, that OOMs the CI, so I’m using the same value for both now.

For reference, the existing BufferedFile uses a buffer of 8MB.

l0rinc commented at 12:15 pm on April 6, 2025:

Added something similar

hodlinator commented at 1:05 pm on April 8, 2025:

Could comment on chosen defaults in commit messages?

l0rinc commented at 2:00 pm on April 8, 2025:

Will do if I retouch, thanks

l0rinc commented at 3:02 pm on April 10, 2025:

Done

hodlinator commented at 10:17 pm on April 10, 2025:

Typo + make it a tad bit less absolute?

0- The buffer sizes were selected by so that the buffered reader produces the same performance as reading the whole block into memory (lower values are slower, bigger ones have the same speed).
1+ The buffer sizes were selected so that the buffered reader produces the same performance as reading the whole block into memory on measured setups (lower values are slower, bigger ones have the same speed).

I guess it would be good for us reviewers to at least play around with the buffer sizes too to see if at least the microbenchmarks show anything surprising.

l0rinc commented at 9:47 pm on April 13, 2025:

Reworded, thanks. Let me know what values you’re getting if you try out different buffer sizes.

hodlinator commented at 10:06 am on April 15, 2025:

Included local results from increasing size of write buffer from 1 << 16 to 1 << 20 in latest review. Only increases ops/s from 511.77 to 514.57, so less than 1%.

in src/test/streams_tests.cpp:562 in 652b4e3de5 outdated

554@@ -553,6 +555,135 @@ BOOST_AUTO_TEST_CASE(streams_buffered_file_rand)
555     fs::remove(streams_test_filename);
556 }
557 
558+BOOST_AUTO_TEST_CASE(buffered_reader_matches_autofile_random_content)
559+{
560+    for (int rep{0}; rep < 10; ++rep) {
561+        const size_t file_size{1 + m_rng.randrange<size_t>(1 << 17)};
562+        const size_t buf_size{1 + m_rng.randrange(file_size)};

hodlinator commented at 8:24 am on April 3, 2025:

What’s your reasoning for making this a unit test over a fuzz-test? Randomness + loop makes it feel fuzz-adjacent.

l0rinc commented at 12:14 pm on April 6, 2025:

The version you started reviewing still contained a loop, but the latest doesn’t. This is indeed a differential test (comparing against a simpler reference implementation), but it’s not exploratory, I made sure it results in full coverage, so I don’t see it as a fuzz, just a randomized unit test. Also, since it’s an important part of the suite, wanted to make sure it’s always executed with the test suite.

in src/test/streams_tests.cpp:567 in 652b4e3de5 outdated

562+        const size_t buf_size{1 + m_rng.randrange(file_size)};
563+        const FlatFilePos pos{0, 0};
564+
565+        const FlatFileSeq test_file{m_args.GetDataDirBase(), "buffered_file_test_random", node::BLOCKFILE_CHUNK_SIZE};
566+        const std::vector obfuscation{m_rng.randbytes<std::byte>(8)};
567+        AutoFile{test_file.Open(pos, false), obfuscation}.write(m_rng.randbytes<std::byte>(file_size));

hodlinator commented at 8:37 am on April 3, 2025:

nit: Could name bool parameter at least on first call:

0        AutoFile{test_file.Open(pos, /*read_only=*/false), obfuscation}.write(m_rng.randbytes<std::byte>(file_size));

l0rinc commented at 12:14 pm on April 6, 2025:

Done

in src/streams.h:470 in 652b4e3de5 outdated

464@@ -465,8 +465,13 @@ class AutoFile
465         ::Unserialize(*this, obj);
466         return *this;
467     }
468+
469+    //! Write a mutable buffer more efficiently than write(), obfuscating the buffer in-place.
470+    void write_buffer(std::span<std::byte> src);

hodlinator commented at 8:42 am on April 3, 2025:

nit: Slight preference for making the in/out aspect more prominent through Doxygen syntax:

0    //! [@param](/bitcoin-bitcoin/contributor/param/)[in,out] src  Obfuscated in-place for efficiency, before being written to the file.
1    void write_buffer(std::span<std::byte> src);

l0rinc commented at 12:14 pm on April 6, 2025:

Do you consider it an out because the input is mutated? We’re not using that anywhere, it’s an implementation detail that I don’t think we want to guarantee

hodlinator commented at 8:03 am on April 8, 2025:

I consider it a loaded foot-gun for future users of the API. Not something we want to guarantee, but rather warn about. Current comment is okay, just a bit less eye-catching in my view.

in src/streams.h:628 in 652b4e3de5 outdated

623+ * Wrapper that buffers reads from an underlying stream.
624+ * Requires underlying stream to support read() and detail_fread() calls
625+ * to support fixed-size and variable-sized reads, respectively.
626+ */
627+template <typename S>
628+class BufferedReader

hodlinator commented at 8:56 am on April 3, 2025:

0template <typename S, size_t N = 1 << 16>
1class BufferedReader

I think it makes more sense to have the buffer size decided at compile time for now, to make heap/stack allocation of m_buf optional for the user (though std::array). See suggestion commit 4de03f51ab7943f9102b4572d1818ac081af7da4.

(Personally I would even be fine with not having the size configurable at all; constexpr size_t N = 1 << 16 inside the class).

The main drawback I can see from making it statically defined is that it disrupts the tests as they are written now. Let me know if you want me to assist with figuring something reasonable out there.

l0rinc commented at 12:14 pm on April 6, 2025:

It’s a good idea in general, but - as you’ve also mentioned at the end - it would limit its usability - and I couldn’t measure any speed gain, seems to behave exactly the same way (probably skips the stack).

Since we’re not really changing the size of the underlying vector (only narrowing, which shouldn’t result in a new allocation) I don’t see any drawback of the current approach.

It could be argued that allocating on the stack could be faster and cause less fragmentation, but we’re only doing this once per run and that’s an insignificant portion of runtime.

I like the suggestion, I’m just not used to allocating this much heap space (e.g. can it cause a stack overflow on any OS? Do we need unique_ptr here to force stack allocation? No idea…)

What do other reviewers think, is it worth investigating more? (cc: @ryanofsky, @maflcko?)

ryanofsky commented at 1:27 pm on April 7, 2025:

re: #31551 (review)

I don’t think this is a great use-case for std::array because when you are doing I/O It is usually fair to assume the cost of allocating the buffer is a lot smaller than the cost of actually doing the I/O. Buffers can also take up a relatively large amount of stack space and increase risk of stack overflow. Since larger buffer sizes often increase performance, this introduces a tradeoff between making I/O faster and risking crashes, which is not very appealing.

I also don’t think in general parameters that are tuned for performance should be compile time parameters as opposed to runtime parameters. Requiring them to be specified at compile time makes code more rigid because it can’t easily adapt to runtime conditions, also it can blow up the size of compiled program because templates will be expanded multiple times if different buffer sizes are ever used. And the normal benefits that you would get from choosing compile-time over runtime execution do not apply in these cases: there is no performance advantage and no extra correctness checking that can be done.

hodlinator commented at 8:39 am on April 8, 2025:

This article: https://www.embeddedrelated.com/showarticle/1330.php (referenced previously #30377 (review)) indicates that the stack size at least can end up being 132KiB on Raspberry Pi 3 (seems to have sold with 1GB of RAM). My Raspberry 4 is using 8MiB stack sizes (8 GB RAM). Seems Windows often ends up with 1MB stacks though. This makes me agree it’s better to use the heap in this specific case instead of guzzling up 1 << 16 = 64KiB.

As a general rule for smaller amounts, using stack memory should be considered as bumping the stack pointer is always faster than allocation+deallocation. It should tend toward better cache-locality due to stack being used for adjacent things rather than heap-objects in random places. Agree that the difference can be completely crowded out by I/O though.
The amount of template expansions from this and the amount of code it contains compared to STL usage is minimal.
In theory it would be cool to have dynamic sizes at runtime, but in practice we tend towards settling on values at compile time, as is currently done in this PR.

(AutoFile::write() uses std::array<std::byte, 4096> which I guess is fine as it’s <10KiB and guaranteed to be in a leaf of the call-tree).

l0rinc force-pushed on Apr 3, 2025

in src/streams.h:473 in aaceecad2b outdated

464@@ -465,8 +465,13 @@ class AutoFile
465         ::Unserialize(*this, obj);
466         return *this;
467     }
468+
469+    //! Write a mutable buffer more efficiently than write(), obfuscating the buffer in-place.
470+    void write_buffer(std::span<std::byte> src);
471 };
472 
473+using BufferData = std::vector<std::byte>;

hodlinator commented at 1:30 pm on April 3, 2025:

nit: This is prime naming real-estate being put into use, if we do it, maybe it could use less yoda-speak?

Option A)

Harmonize with DataStream:

0using DataBuffer = std::vector<std::byte>;

Option B)

There is the concept of ByteVector as part of other concepts (see ToByteVector() and ByteVectorHash). They use unsigned char but could be overloaded to support std::byte as well. Calling this ByteVector would be more consistent with pre-existing code.

0using ByteVector = std::vector<std::byte>;

(I have a feeling there might already be some typedef/using declaration for the unsigned char variant, but couldn’t find it).

Option C)

Avoid introducing this type, this layer of indirection obfuscates the code slightly until internalized.

l0rinc commented at 12:14 pm on April 6, 2025:

maybe it could use less yoda-speak

The ByteVector similarity, avoid I wanted, since std::byte it is not, as mentioned you have. Harmonizing DataBuffer with DataStream, like I do.

DrahtBot removed the label CI failed on Apr 3, 2025

in src/node/blockstorage.h:76 in aaceecad2b outdated

74@@ -75,10 +75,10 @@ static const unsigned int UNDOFILE_CHUNK_SIZE = 0x100000; // 1 MiB
75 static const unsigned int MAX_BLOCKFILE_SIZE = 0x8000000; // 128 MiB
76

hodlinator commented at 2:01 pm on April 3, 2025:

(Inline comment in random spot to avoid spamming main thread).

PR description

0- [@martinus](/bitcoin-bitcoin/contributor/martinus/)
1+ martinus

https://github.com/bitcoin/bitcoin/blob/aaceecad2ba85ac7513d1e2973960bdae65ab0fc/CONTRIBUTING.md?plain=1#L170-L174

l0rinc commented at 12:14 pm on April 6, 2025:

Done

l0rinc force-pushed on Apr 3, 2025

DrahtBot added the label CI failed on Apr 3, 2025

l0rinc commented at 7:57 pm on April 3, 2025: contributor

Sorry for the extra pushes, it seems there is some inconsistency on Windows regarding file flushing and deletion behavior - so I’ve forced flushing in tests via braces and added read-underflow checks for each to make sure that scenario is covered. PR is ready for review again!

in src/node/blockstorage.cpp:948 in fe6b0e425c outdated

947+        AutoFile file{OpenUndoFile(pos)};
948+        if (file.IsNull()) {
949+            LogError("OpenUndoFile failed for %s while writing", pos.ToString());
950             return FatalError(m_opts.notifications, state, _("Failed to write undo data."));
951         }
952+        BufferedWriter fileout{file};

hodlinator commented at 10:13 pm on April 3, 2025:

Observed /home/hodlinator/.bitcoin/testnet4/blocks/rev00001.dat first being opened by AutoFile above for writing and then opened a second time inside the flush call below on testnet4. This seems potentially dangerous on master, and is made even more dangerous by BufferedWriter flushing later in the destructor. Would add scope in early commit.

 0        {
 1            // Open history file to append
 2            AutoFile file{OpenUndoFile(pos)};
 3            if (file.IsNull()) {
 4                LogError("OpenUndoFile failed for %s while writing", pos.ToString());
 5                return FatalError(m_opts.notifications, state, _("Failed to write undo data."));
 6            }
 7
 8            BufferedWriter fileout{file};
 9
10            // Write index header
11            fileout << GetParams().MessageStart() << blockundo_size;
12            pos.nPos += BLOCK_HEADER_BYTES;
13            {
14                // Calculate checksum
15                HashWriter hasher{};
16                hasher << block.pprev->GetBlockHash() << blockundo;
17                // Write undo data & checksum
18                fileout << blockundo << hasher.GetHash();
19            }
20        } // Make sure file with buffering goes out of scope in case we call FlushUndoFile below.

l0rinc commented at 12:14 pm on April 6, 2025:

Excellent find, added a separate commit with you as a co-author

DrahtBot removed the label CI failed on Apr 4, 2025

in src/streams.h:652 in fe6b0e425c outdated

646+        if (dst.size()) {
647+            assert(m_buf_pos == m_buf.size());
648+            m_src.read(dst);
649+
650+            m_buf_pos = 0;
651+            m_buf.resize(m_src.detail_fread(m_buf));

hodlinator commented at 12:26 pm on April 4, 2025:

Would prefer if we wouldn’t do 2 [f]read/syscalls here. The second one is speculative, we might never use it, so defeats part of the point of buffering. See suggestion commit 4de03f51ab7943f9102b4572d1818ac081af7da4. Not thrilled with how my version adds a dependency on m_src.feof() though.

The aforementioned commit changes m_buf_pos into a span m_buf_used which is a step backward going by #31551 (review). I tried implementing it with just m_buf_pos and ended up needing to use std::copy_backward and it made for a bigger solution than span, see a8a82d5c1d0589aa24c30d54e1f52f07689d7f8c.

l0rinc commented at 12:14 pm on April 6, 2025:

Hmm, I need some more convincing here. I can’t measure any speed gain and don’t really see why basically repeating the logic from the first condition in the second one is a better overall solution. So the current seems fast as simple, I need a good reason for making it more complicated.

ryanofsky commented at 1:37 pm on April 7, 2025:

re: #31551 (review)

I can see benefits of both approaches. l0rinc’s approach seems a little simpler and more suited to current use-case where data size and buffer size are unlikely to line up. hodlinator’s suggested change might work better in other cases where there is a good chance data and buffer sizes will line up, and there might actually be a benefit to avoiding a final read.

I think I’d just go with current simpler solution, though I suspect there maybe some refactored version of hodlinators change that is simpler and has the best of both worlds.

hodlinator commented at 12:51 pm on April 8, 2025:

Can’t think of a simpler version of my variant unfortunately. It results in fewer syscalls at the cost of a longer function. I can stomach the current PR version.

l0rinc commented at 7:28 pm on April 4, 2025: contributor

Redid the IBDs (on an HDD this time) - also showing a ~7% average speedup

 0COMMITS="d2b72b13699cf460ffbcb1028bcf5f3b07d3b73a 652b4e3de5c5e09fb812abe265f4a8946fa96b54"; \
 1STOP_HEIGHT=888888; DBCACHE=1000; \
 2C_COMPILER=gcc; CXX_COMPILER=g++; \
 3BASE_DIR="/mnt/my_storage"; DATA_DIR="$BASE_DIR/BitcoinData"; LOG_DIR="$BASE_DIR/logs"; \
 4(for c in $COMMITS; do git fetch origin $c -q && git log -1 --pretty=format:'%h %s' $c || exit 1; done) && \
 5hyperfine \
 6  --sort 'command' \
 7  --runs 2 \
 8  --export-json "$BASE_DIR/ibd-${COMMITS// /-}-$STOP_HEIGHT-$DBCACHE-$C_COMPILER.json" \
 9  --parameter-list COMMIT ${COMMITS// /,} \
10  --prepare "killall bitcoind; rm -rf $DATA_DIR/*; git checkout {COMMIT}; git clean -fxd; git reset --hard; \
11    cmake -B build -DCMAKE_BUILD_TYPE=Release -DENABLE_WALLET=OFF -DCMAKE_C_COMPILER=$C_COMPILER -DCMAKE_CXX_COMPILER=$CXX_COMPILER && \
12    cmake --build build -j$(nproc) --target bitcoind && \
13    ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=1 -printtoconsole=0; sleep 100" \
14  --cleanup "cp $DATA_DIR/debug.log $LOG_DIR/debug-{COMMIT}-$(date +%s).log" \
15  "COMPILER=$C_COMPILER COMMIT=${COMMIT:0:10} ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP_HEIGHT -dbcache=$DBCACHE -blocksonly -printtoconsole=0"
16
17d2b72b1369 refactor: rename leftover WriteBlockBench
18652b4e3de5 optimization: Bulk serialization writes in `WriteBlockUndo` and `WriteBlock`
19
20Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = d2b72b13699cf460ffbcb1028bcf5f3b07d3b73a)
21  Time (mean ± σ):     41528.104 s ± 354.003 s    [User: 44324.407 s, System: 3074.829 s]
22  Range (min … max):   41277.786 s … 41778.421 s    2 runs
23 
24Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = 652b4e3de5c5e09fb812abe265f4a8946fa96b54)
25  Time (mean ± σ):     38771.457 s ± 441.941 s    [User: 41930.651 s, System: 3222.664 s]
26  Range (min … max):   38458.957 s … 39083.957 s    2 runs
27 
28Relative speed comparison
29        1.07 ±  0.02  COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = d2b72b13699cf460ffbcb1028bcf5f3b07d3b73a)
30        1.00          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = 652b4e3de5c5e09fb812abe265f4a8946fa96b54)

in src/streams.cpp:104 in fe6b0e425c outdated

105 }
106 
107+void AutoFile::write_buffer(std::span<std::byte> src)
108+{
109+    if (!m_file) throw std::ios_base::failure("AutoFile::write_buffer: file handle is nullptr");
110+    util::Xor(src, m_xor, *m_position); // obfuscate in-place

hodlinator commented at 8:08 pm on April 4, 2025:

Shouldn’t be dereferencing here at the same time as we have an if-guard for it at the end of the function.

l0rinc commented at 12:15 pm on April 6, 2025:

Good find, changed it to:

0    if (m_xor.size() && m_position) util::Xor(src, m_xor, *m_position); // obfuscate in-place

Added you as co-author.

in src/streams.h:677 in fe6b0e425c outdated

671+    S& m_dst;
672+    BufferData m_buf;
673+    size_t m_buf_pos{0};
674+
675+public:
676+    explicit BufferedWriter(S& stream, size_t size = 1 << 16) : m_dst{stream}, m_buf(size) {}

hodlinator commented at 8:41 pm on April 4, 2025:

If we don’t have a need for dynamically sized buffers, BufferedWriter::m_buf could be an array without any further changes except from smaller constructor.

0    std::array<std::byte, N> m_buf;
1    size_t m_buf_pos{0};
2
3public:
4    explicit BufferedWriter(S& stream) : m_dst{stream} {}

Suggestion commit 4de03f51ab7943f9102b4572d1818ac081af7da4 for full version.

l0rinc commented at 12:15 pm on April 6, 2025:

Please see my concerns above

hodlinator commented at 9:58 pm on April 4, 2025: contributor

Code review fe6b0e425c88c42fc501acef86a642bc19feca9f

refactor: rename leftover WriteBlockBench

The benchmark was referencing the old name of the method

c77e3107b8

l0rinc force-pushed on Apr 6, 2025

l0rinc commented at 12:21 pm on April 6, 2025: contributor

Pushed a new version (with rebase because of recent CI failures):

AutoFile/BufferedWriter scope reduction in WriteBlockUndo to make sure the file is closed before it’s reopened - as observed by @hodlinator. Based on previous similar test failures this would especially be problematic on Windows;
Guard m_position dereference in new AutoFile::write_buffer (current usages have always had it, but it’s more consistent this way);
BufferData to DataBuffer renamed I have;
Primitive parameters in new tests now have comments.

Thanks for the comments, ready for further reviews.

in src/streams.cpp:104 in 8a24dfaa34 outdated

105 }
106 
107+void AutoFile::write_buffer(std::span<std::byte> src)
108+{
109+    if (!m_file) throw std::ios_base::failure("AutoFile::write_buffer: file handle is nullptr");
110+    if (m_xor.size() && m_position) util::Xor(src, m_xor, *m_position); // obfuscate in-place

ryanofsky commented at 1:51 pm on April 7, 2025:

In commit “optimization: Bulk serialization writes in WriteBlockUndo and WriteBlock” (8a24dfaa344f0cb99e54f7cb1c6694b372ab930e)

The new version of this seems a bit fragile and unsafe because if m_xor is set and m_position is unset, this will just skip the Xor step and proceed as if nothing is wrong, instead of throwing an error.

Would suggest deleting line 90 above if (!m_position.has_value()) throw std::ios_base::failure("AutoFile::write: position unknown"); and throwing that error here if (m_xor.size() && !m_position)

l0rinc commented at 4:38 pm on April 7, 2025:

You’re right again, it’s safer this way, thank you! https://github.com/bitcoin/bitcoin/compare/c28221545bf23492ee3d32b9c64eca5fc38804eb..1f28d22940805492fbec5a9eb37a5ab767f8add1#diff-6d04aca71dd4079e13471cd3c1699c3e55314fb04c5c6de062ffbdcc8db42024

in src/node/blockstorage.cpp:962 in c28221545b outdated

972+                HashWriter hasher{};
973+                hasher << block.pprev->GetBlockHash() << blockundo;
974+                // Write undo data & checksum
975+                fileout << blockundo << hasher.GetHash();
976+            }
977+            // Make sure buffer goes out of scope before we call FlushUndoFile

ryanofsky commented at 2:04 pm on April 7, 2025:

In commit “Narrow scope of undofile write to avoid possible race condition” (c28221545bf23492ee3d32b9c64eca5fc38804eb)

I don’t like the idea of introducing a separate bugfix commit for a bug introduced in the immediately preceding commit. I think it would be better to avoid adding the bug by making this change in the immediately preceding commit. It also might be clearer to avoid the bug by introducing an explicit fileout.flush() call instead of introducing a new scope and relying on the destructor to flush. This is just a thought so feel free to ignore.

Another option might be to move introduction of the new fileout scope into an earlier commit like commit “refactor: collect block read operations into try block” (467d2b380f4a780b9e17fe64d08b22e61f4a2554). This could make sense because even in the previous version of the code it makes sense logicallly to want to close the undo file before calling FlushUndoFile.

l0rinc commented at 4:37 pm on April 7, 2025:

introducing a separate bugfix commit for a bug introduced in the immediately preceding commit

Technically the bug was there before as well, AutoFile{OpenUndoFile(pos)} was still in scope when FlushUndoFile(pos.nFile) was called, which could theoretically lead to file handle conflicts (had something similar in tests, but only after Buffering) or other unexpected behavior - may not be as dangerous after Buffering, as @hodlinator also mentioned. Moved the commit before the buffering change, kept the braces with comment and added explicit fclose/flush calls as well for clarity.

ryanofsky approved

ryanofsky commented at 2:17 pm on April 7, 2025: contributor

Code review ACK c28221545bf23492ee3d32b9c64eca5fc38804eb. Lots of nice changes like code simplifications, documentation & naming improvements since the previous review but most significant change is new commit c28221545bf23492ee3d32b9c64eca5fc38804eb fixing a potential bug found by @hodlinator. Left some suggestions, but none are critical.

re: #31551 (comment)

Sorry for the extra pushes, it seems there is some inconsistency on Windows regarding file flushing and deletion behavior - so I’ve forced flushing in tests via braces and added read-underflow checks for each to make sure that scenario is covered. PR is ready for review again!

Could you explain in a little more detail what these failures were and how they were fixed? Could you also explain what exactly the (1«20) failures are, where they happen, and what’s the best theory about why they happen when previous approach using bigger buffer sizes did not seem to have a problem?

I don’t think any of these failures are very concerning, but I also don’t really understand them and previous PR discussion is hard to decipher so it it would be nice to have a clear description of the problems that have happened in one place.

l0rinc force-pushed on Apr 7, 2025

l0rinc commented at 4:38 pm on April 7, 2025: contributor

Could you explain in a little more detail what these failures were and how they were fixed?

It was a test-only failure, I had an unnamed AutoFile{}.write() call to directly write the test content to a file in the buffered reader test. Since AutoFile was anonymous, it was closed immediately on POSIX systems but in Windows it lingered around without the last flush, but the readers have opened it up for testing already. See https://github.com/bitcoin/bitcoin/compare/38b7e8e5ced36cddd7e87fa852ac38daf906501a..fe6b0e425c88c42fc501acef86a642bc19feca9f#diff-73552341d85aec344497b47f1c2aa53a7e01a415030e40f56b1c454aef5f209dR567-R571

Could you also explain what exactly the (1«20) failures are, where they happen, and what’s the best theory about why they happen when previous approach using bigger buffer sizes did not seem to have a problem?

I have pushed it to a separate PR in my own repo and noticed that 1<<20 passes on a single thread (i.e. --jobs 1), it only fails when ci-only jobs are enabled and when we’re running on multiple threads, in which case it OOMs - but passes for lower values.

why they happen when previous approach using bigger buffer sizes did not seem to have a problem?

It likely didn’t fail for the previous approach since block413567.raw is less than 1<<20 and the build is passing for smaller values as well, we likely hit the limit with 1<<20 on multiple threads (in the same way as 1<<30 fails on a single thread).

The latest push contains:

Migrated failure("AutoFile::write: position unknown") from AutoFile::write to AutoFile::write_buffer, guarded similarly behind a !m_xor.empty() (changed message slightly);
Added explicit fileout.flush() to BlockManager::WriteBlockUndo next to the scope reduction to be more explicit and moved the scope introduction commit (which only does fileout.fclose()) next to the refactoring commits.

hodlinator approved

hodlinator commented at 1:20 pm on April 8, 2025: contributor

ACK 1f28d22940805492fbec5a9eb37a5ab767f8add1

Concept

Good to add read-buffering and switch to a bigger write buffer to help fine-tune IBD.

(Also fixes potential issue in BlockManager::WriteBlockUndo with flushing a file which is still open for write using another handle).

Microbenchmark results for Clang 19.1.7

0₿ build/bin/bench_bitcoin -min-time=30000 -filter=<bench_name>
1
2|    op/s |  err% |         ins/op |        bra/op | benchmark
3|--------:|------:|---------------:|--------------:|:----------
4|  348.16 |  0.7% |  17,892,592.79 |  2,786,616.55 | `WriteBlockBench` (with only rename-commit)
5|  527.23 |  0.8% |   9,938,390.94 |  1,059,982.61 | `WriteBlockBench` (with PR)
6|  153.67 |  0.0% |  70,042,685.48 |  4,703,506.75 | `ReadBlockBench` (with only rename-commit)
7|  179.84 |  0.0% |  63,495,559.95 |  3,329,391.01 | `ReadBlockBench` (with PR)

51% op/s improvement for WriteBlockBench. 17% for ReadBlockBench.

DrahtBot requested review from ryanofsky on Apr 8, 2025

ryanofsky approved

ryanofsky commented at 3:48 pm on April 8, 2025: contributor

Code review ACK 1f28d22940805492fbec5a9eb37a5ab767f8add1. Only change since last review is moving the undo close-before-flush fix to an earlier commit and making it more explicit.

re: #31551 (comment)

Thanks for describing the previous failures. I still don’t really understand them and I guess I am not too concerned but would appreciate more of an explanation.

For the windows failure it seems like https://github.com/bitcoin/bitcoin/compare/38b7e8e5ced36cddd7e87fa852ac38daf906501a..fe6b0e425c88c42fc501acef86a642bc19feca9f#diff-73552341d85aec344497b47f1c2aa53a7e01a415030e40f56b1c454aef5f209dR567-R571 is not changing any test code but just dropping for (int rep{0}; rep < 10; ++rep) loop around the test. I don’t know how that changes things since there are no variables outside the loop and it looks like all files are closed before they are used again, both before and after that diff.

For the (1«20) OOM failure I just don’t even know the basics of when it happens and why it is expected to happen. From your reference to block413567.raw it sounds like it is happening in a benchmark? Normally I would not expect a benchmark to leak memory so is this happening in some special build with instrumentation that does leak memory, or just uses a lot of memory?

l0rinc commented at 4:11 pm on April 8, 2025: contributor

is not changing any test code

Github merged a few consecutive pushes, this is the smallest diff which showed the test fix: https://github.com/bitcoin/bitcoin/compare/aaceecad2ba85ac7513d1e2973960bdae65ab0fc..38b7e8e5ced36cddd7e87fa852ac38daf906501a?w=1#diff-73552341d85aec344497b47f1c2aa53a7e01a415030e40f56b1c454aef5f209dR568-R571 i.e. narrowing the scope of the Autofile{...}.write(...) test call.

0// Write out the file with random content
1{
2    AutoFile{test_file.Open(pos, false), obfuscation}.write(m_rng.randbytes<std::byte>(file_size));
3}
4BOOST_CHECK_EQUAL(fs::file_size(test_file.FileName(pos)), file_size);

Which assures that the same test file can be opened for reading now.

For the (1«20) OOM failure I just don’t even know the basics of when it happens and why it is expected to happen

That’s what bothers me as well, the build just fails silently. And if I make it single threaded to deduce which test causes it, it passes.

Normally I would not expect a benchmark to leak memory

We’re running the benchmarks as tests as well in https://github.com/bitcoin/bitcoin/blob/master/src/bench/CMakeLists.txt#L83. I don’t expect it to leak, just use extra instrumented memory on multiple threads which happens to add up, apparently. I’m trying to narrow down the culprit, but whenever I remove a few tests to see if the problem is with the other half, the build passes. And sometimes it takes 4 hours to run a build…

ryanofsky commented at 4:14 pm on April 8, 2025: contributor

extra instrumented memory

Thanks but this is the part I don’t understand. Instrumented how? Which CI build is failing? Is there a link to the failure?

l0rinc commented at 4:23 pm on April 8, 2025: contributor

Instrumented how?

It’s the ASan + LSan + UBSan + integer, no depends, USDT build, so the sanitizers need to instrument the code to detect violations, which bloats the resource requirements - I read that AddressSanitizer typically doubles the memory overhead. The build doesn’t fail for uninstumented code or for smaller values.

Which CI build is failing?

My mistake, I though this was easy to find in https://github.com/l0rinc/bitcoin/pull/2 - see https://github.com/l0rinc/bitcoin/actions/runs/14329738280/job/40162805114?pr=2 terminating with a container sigterm:

…………………………………………………………………………. 13/318 - p2p_node_network_limited.py –v1transport passed, duration: 247 s

14/318 - p2p_node_network_limited.py –v2transport passed, duration: 250 s …..

………………………………………………………………. Error: Process completed with exit code 143.

maflcko commented at 7:07 am on April 9, 2025: member

For the (1«20) OOM failure I just don’t even know the basics of when it happens and why it is expected to happen. From your reference to block413567.raw it sounds like it is happening in a benchmark? Normally I would not expect a benchmark to leak memory so is this happening in some special build with instrumentation that does leak memory, or just uses a lot of memory?

Yeah, I don’t think the issue is in the benchmarks. The output from above (14/318 - p2p_node_network_limited.py --v2transport passed, duration: 250 s) is a functional test, which are run after the benchmarks (and unit tests) passed, so it seems more likely that the issue is while running the functional tests under asan.

in src/node/blockstorage.cpp:960 in da7574f140 outdated

969+                HashWriter hasher{};
970+                hasher << block.pprev->GetBlockHash() << blockundo;
971+                // Write undo data & checksum
972+                fileout << blockundo << hasher.GetHash();
973+            }
974+            fileout.fclose(); // Make sure `AutoFile` goes out of scope before we call `FlushUndoFile`

maflcko commented at 7:30 am on April 10, 2025:

in da7574f140fafc0eb3ea80fd5f40e8442afffef1:

Interesting commit. I wonder why this worked at all previously, because the scope of the autofile was always beyond FlushUndoFile.

However, FlushUndoFile logs a warning when the flush does not work. Now, a flush error in this line is silently ignored.

l0rinc commented at 3:02 pm on April 10, 2025:

This only applies to the intermediary commit (the final buffered call flushes only), and the last fclose(file) call in FlatFileSeq::Flush doesn’t check for errors either, so there was already a “silent ignore” of potential file closing errors in the original code.

I’ll handle the close here temporarily (the last commit reverts it, but at least this commit is more complete) and will let https://github.com/bitcoin/bitcoin/pull/29307/files#diff-114c2880ec1ff2c5293ac65ceda0637bf92c05745b74b58410585a549464a33f handle the existing ones.

maflcko commented at 6:31 am on April 11, 2025:

This only applies to the intermediary commit (the final buffered call flushes only), and the last fclose(file) call in FlatFileSeq::Flush doesn’t check for errors either, so there was already a “silent ignore” of potential file closing errors in the original code.

I am not sure. Here, you are adding an explicit fclose, which is responsible for the flushing, so any flush error will be silently ignored (in this commit). The fclose in Flush seems acceptable to ignore, because the flush was already done (and checked). (If it wasn’t acceptable to ignore, it would be good to fix it along with an explanation why the fix is needed)

To answer my other questions: Lack of the commit probably doesn’t mean data corruption in the happy path, but if there was a power loss or crash of the process after FlushUndoFile, but before the autofile went out of scope, there could be data loss/corruption. However, this should be no different from just data loss/corruption before anything was written, so this seems almost like a refactor commit.

in src/node/blockstorage.h:78 in 3c4c6e8edc outdated

74@@ -75,10 +75,10 @@ static const unsigned int UNDOFILE_CHUNK_SIZE = 0x100000; // 1 MiB
75 static const unsigned int MAX_BLOCKFILE_SIZE = 0x8000000; // 128 MiB
76 
77 /** Size of header written by WriteBlock before a serialized CBlock (8 bytes) */
78-static constexpr size_t BLOCK_SERIALIZATION_HEADER_SIZE{std::tuple_size_v<MessageStartChars> + sizeof(unsigned int)};
79+static constexpr size_t BLOCK_HEADER_BYTES{std::tuple_size_v<MessageStartChars> + sizeof(unsigned int)};

maflcko commented at 7:52 am on April 10, 2025:

in 3c4c6e8edc0fb4918838cb66811b423096db635b:

Not sure about this commit:

It would be easier to review if this was a scripted diff, because it spans several translation units and the scripted diff is smaller than the code changes.
Also, I am not sure the new name is correct. “Block serialization” here refers to writing block and undo data and this is the size of the header that proceeds those. Whereas “block header” usually refers to the first few bytes of a serialized block itself. So it may be better to drop this commit, or use a different wording, like “STORAGE_HEADER_SIZE” or “SER_HEADER_SIZE” (if you want to clarify that this is about serialization/storage) or “INDEX(_HEADER)_SIZE” or “SEPARATOR_SIZE” (if you want to go after the existing code comments) or “LEN_PREFIX_SIZE” (if you want to stress that the prefix includes the length)

l0rinc commented at 3:02 pm on April 10, 2025:

Added scripted diff to STORAGE_HEADER_SIZE (with guard to make sure it doesn’t exist before)

hodlinator commented at 9:57 pm on April 10, 2025:

Good point about BLOCK_HEADER_BYTES being a misnomer, quite different from CBlockHeader.

node::STORAGE_HEADER_SIZE feels awfully generic to me, would prefer the original name in that case, or slightly shortened: node::BLOCK_SER_HEADER_SIZE. Not a blocker though.

(I liked the _BYTES suffix to make the unit crystal clear, planning to incorporate it in a WIP change).

l0rinc commented at 9:45 pm on April 13, 2025:

Renamed to _BYTES suffix

in src/node/blockstorage.cpp:681 in c92ed996f1 outdated

677@@ -678,11 +678,11 @@ bool BlockManager::ReadBlockUndo(CBlockUndo& blockundo, const CBlockIndex& index
678 
679         // Verify checksum
680         if (hashChecksum != verifier.GetHash()) {
681-            LogError("%s: Checksum mismatch at %s\n", __func__, pos.ToString());
682+            LogError("%s: Checksum mismatch at %s", __func__, pos.ToString());

maflcko commented at 8:00 am on April 10, 2025:

nit in c92ed996f1b81d5c05f80bd857565be73a4b0174:

Given that the function names were recently changed anyway, and given that there exist a log setting to enable source function name logging, it could make sense to remove __func__ and use a unique and descriptive error message here.

Maybe Checksum mismatch at %s while reading block undo?

l0rinc commented at 3:02 pm on April 10, 2025:

Done

cc: @hodlinator

hodlinator commented at 9:58 pm on April 10, 2025:

Nice approach for making the log messages unique. :+1:

in src/node/blockstorage.cpp:685 in c92ed996f1 outdated

682+            LogError("%s: Checksum mismatch at %s", __func__, pos.ToString());
683             return false;
684         }
685     } catch (const std::exception& e) {
686-        LogError("%s: Deserialize or I/O error - %s at %s\n", __func__, e.what(), pos.ToString());
687+        LogError("%s: Deserialize or I/O error - %s at %s", __func__, e.what(), pos.ToString());

maflcko commented at 8:00 am on April 10, 2025:

same

l0rinc commented at 3:02 pm on April 10, 2025:

Done

in src/node/blockstorage.cpp:996 in c92ed996f1 outdated

992@@ -993,27 +993,27 @@ bool BlockManager::ReadBlock(CBlock& block, const FlatFilePos& pos) const
993     // Open history file to read
994     AutoFile filein{OpenBlockFile(pos, true)};
995     if (filein.IsNull()) {
996-        LogError("%s: OpenBlockFile failed for %s\n", __func__, pos.ToString());
997+        LogError("%s: OpenBlockFile failed for %s", __func__, pos.ToString());

maflcko commented at 8:01 am on April 10, 2025:

remove func and add “while reading”, like above, for consistency?

l0rinc commented at 3:02 pm on April 10, 2025:

Done

in src/node/blockstorage.cpp:1004 in c92ed996f1 outdated

1001     try {
1002         // Read block
1003         filein >> TX_WITH_WITNESS(block);
1004     } catch (const std::exception& e) {
1005-        LogError("%s: Deserialize or I/O error - %s at %s\n", __func__, e.what(), pos.ToString());
1006+        LogError("%s: Deserialize or I/O error - %s at %s", __func__, e.what(), pos.ToString());

maflcko commented at 8:02 am on April 10, 2025:

etc …

l0rinc commented at 3:01 pm on April 10, 2025:

Done

in src/node/blockstorage.cpp:1035 in 2a798b5f47 outdated

1035@@ -1036,15 +1036,12 @@ bool BlockManager::ReadBlock(CBlock& block, const CBlockIndex& index) const
1036 
1037 bool BlockManager::ReadRawBlock(std::vector<uint8_t>& block, const FlatFilePos& pos) const
1038 {
1039-    FlatFilePos hpos = pos;
1040-    // If nPos is less than 8 the pos is null and we don't have the block data

maflcko commented at 8:10 am on April 10, 2025:

2a798b5f478d54f3f2333fd282e5e238148842e4: Any reason to silently remove the comment? It would be good to explain it in the commit message. I guess this error should only happen when there is a race? An alternative would be to expand the error message itself?

l0rinc commented at 3:01 pm on April 10, 2025:

Based on https://github.com/bitcoin/bitcoin/commit/da338aada7943c392013c36c542af621fbc6edd1 (and the comment and following code) it seems to me this is a simple parameter validation to make sure we can safely step back to be able to read the header - i.e. pos.nPos - BLOCK_HEADER_BYTES. And also, since it repeated the constant 8 and mentioned pos is null (which didn’t make sense to me in this context), and it seemed we can better explain it with code, it’s why I’ve deleted. Added this explanation to commit message as well. @andrewtoth, do you think we should document it better?

maflcko commented at 7:34 am on April 11, 2025:

pos is null (which didn’t make sense to me in this context)

My guess is that “pos is null” refers to the fact that pos.IsNull() (equal to a default constructed) or that nFile and nPos were set to 0 (after pruning).

However, for those cases, the caller is encouraged to check it first. This should work almost always, except for some rare races, see #30410.

So it just seems a bit odd to remove the comment, but keep the one about unsigned overflow.

unsigned overflow is the symptom to avoid, but not the real reason for the check.

Also, as an unrelated side note, unsigned overflow is not undefined behavior in the language, just undesired behavior in this context.

l0rinc commented at 4:57 pm on April 13, 2025:

Reworded, added it back

maflcko commented at 8:56 am on April 14, 2025:

I don’t think the new wording is right. pos.IsNull() does not hit when nPos=0.

To quote myself:

My guess is that “pos is null” refers to the fact that pos.IsNull() (equal to a default constructed) -> or <- that nFile and nPos were set to 0 (after pruning).

l0rinc commented at 9:58 am on April 14, 2025:

Updated it again, let me know if this clarifies it now:

0// If `nPos` is less than `STORAGE_HEADER_BYTES`, we can't read the header that precedes the block data
1// This would cause an unsigned integer underflow when trying to position the file cursor
2// This can happen after pruning or default constructed positions
3LogError("Failed for %s while reading raw block storage header", pos.ToString());

maflcko commented at 11:26 am on April 14, 2025:

I guess one could consider a follow-up to set nPos=-1 when unsetting while pruning, so that this confusion is avoided.

in src/node/blockstorage.cpp:1044 in 2a798b5f47 outdated

1045         LogError("%s failed for %s", __func__, pos.ToString());
1046         return false;
1047     }
1048-    hpos.nPos -= 8; // Seek back 8 bytes for meta header
1049-    AutoFile filein{OpenBlockFile(hpos, true)};
1050+    AutoFile filein{OpenBlockFile({pos.nFile, static_cast<unsigned int>(pos.nPos - BLOCK_HEADER_BYTES)}, true)};

maflcko commented at 8:14 am on April 10, 2025:

2a798b5f478d54f3f2333fd282e5e238148842e4: Would be nice to avoid the cast here? It may be possible by changing the type of the constexpr symbol to uint32_t (instead of using size_t, which may be a larger unsigned type, but if it required uint64_t, it would be broken anyway)?

maflcko commented at 8:15 am on April 10, 2025:

Also, while touching, it would be nice to use named args for all integral or bool literals

l0rinc commented at 3:01 pm on April 10, 2025:

Changed to uint32_t BLOCK_HEADER_BYTES and uint32_t UNDO_DATA_DISK_OVERHEAD in the refactoring commit
Added default args - and made /*fReadOnly=*/false args explicit while here.

in src/test/streams_tests.cpp:13 in 082e2b9828 outdated

 8 #include <test/util/setup_common.h>
 9 #include <util/fs.h>
10 #include <util/strencodings.h>
11 
12 #include <boost/test/unit_test.hpp>
13+#include <node/blockstorage.h>

maflcko commented at 9:28 am on April 10, 2025:

nit in 082e2b9828494153317fbc540a9e595498766f00: This is not a boost header, so should go in the other section

l0rinc commented at 3:01 pm on April 10, 2025:

Done

in src/streams.h:643 in 082e2b9828 outdated

635+        : m_src{stream}, m_buf(size), m_buf_pos{size} {}
636+
637+    void read(std::span<std::byte> dst)
638+    {
639+        if (const auto available{std::min(dst.size(), m_buf.size() - m_buf_pos)}) {
640+            std::copy_n(m_buf.begin() + m_buf_pos, available, dst.begin());

maflcko commented at 9:32 am on April 10, 2025:

style nit: Not that it matters, but the call to begin() and addition can be avoided by just using iterators directly:

 0diff --git a/src/streams.h b/src/streams.h
 1index 2f82c389c0..62afb40dbb 100644
 2--- a/src/streams.h
 3+++ b/src/streams.h
 4@@ -626,26 +626,26 @@ class BufferedReader
 5 {
 6     S& m_src;
 7     DataBuffer m_buf;
 8-    size_t m_buf_pos;
 9+    DataBuffer::iterator m_buf_pos;
10 
11 public:
12     //! Requires stream ownership to prevent leaving the stream at an unexpected position after buffered reads.
13     explicit BufferedReader(S&& stream, size_t size = 1 << 16)
14         requires std::is_rvalue_reference_v<S&&>
15-        : m_src{stream}, m_buf(size), m_buf_pos{size} {}
16+        : m_src{stream}, m_buf(size), m_buf_pos{m_buf.end()} {}
17 
18     void read(std::span<std::byte> dst)
19     {
20-        if (const auto available{std::min(dst.size(), m_buf.size() - m_buf_pos)}) {
21-            std::copy_n(m_buf.begin() + m_buf_pos, available, dst.begin());
22+        if (const auto available{std::min<size_t>(dst.size(), m_buf.end() - m_buf_pos)}) {
23+            std::copy_n(m_buf_pos, available, dst.begin());
24             m_buf_pos += available;
25             dst = dst.subspan(available);
26         }
27         if (dst.size()) {
28-            assert(m_buf_pos == m_buf.size());
29+            assert(m_buf_pos == m_buf.end());
30             m_src.read(dst);
31 
32-            m_buf_pos = 0;
33+            m_buf_pos = m_buf.begin();
34             m_buf.resize(m_src.detail_fread(m_buf));
35         }
36     }

l0rinc commented at 3:01 pm on April 10, 2025:

It’s a good idea, but for consistency, I would have to change the writer as well and would likely need std::distance there, which I’d like to avoid - so for consistency will leave it as is. Let me know if you disagree.

in src/test/streams_tests.cpp:605 in 082e2b9828 outdated

600+    {
601+        DataBuffer excess_byte{1};
602+        BOOST_CHECK_EXCEPTION(buffered_reader.read(excess_byte), std::ios_base::failure, HasReason{"end of file"});
603+    }
604+
605+    try { fs::remove(test_file.FileName(pos)); } catch (...) {}

maflcko commented at 9:42 am on April 10, 2025:

Is there a reason for the try-catch, when the other tests don’t do it? seems odd to silence an exception, when it could indicate an error?

l0rinc commented at 3:00 pm on April 10, 2025:

Done, thanks

in src/streams.h:633 in 082e2b9828 outdated

628+    DataBuffer m_buf;
629+    size_t m_buf_pos;
630+
631+public:
632+    //! Requires stream ownership to prevent leaving the stream at an unexpected position after buffered reads.
633+    explicit BufferedReader(S&& stream, size_t size = 1 << 16)

maflcko commented at 9:49 am on April 10, 2025:

missing LIFETIMEBOUND for stream? Otherwise, there won’t be any Wdandling warnings at all for wrong code

maflcko commented at 9:56 am on April 10, 2025:

(I can’t edit my comment due to a GitHub error, but Wdandling should be Wdangling)

l0rinc commented at 3:00 pm on April 10, 2025:

Wasn’t sure about this - the semantics of LIFETIMEBOUND with rvalue references wasn’t clear to me -, but if https://godbolt.org/z/ffGjK9Y6P is correct, you’re right, thanks for the suggestion!

Added it as separate commit to apply it to BufferedFile (and BufferedWriter) as well:

BufferedWriter(S& stream LIFETIMEBOUND, size_t size = 1 << 16)
BufferedReader(S&& stream LIFETIMEBOUND, size_t size = 1 << 16)
BufferedFile(AutoFile& file LIFETIMEBOUND, uint64_t nBufSize, uint64_t nRewindIn)

maflcko commented at 9:51 am on April 10, 2025: member

Will take a look at the last commit later.

lgtm ACK 1f28d22940805492fbec5a9eb37a5ab767f8add1 🦈

Signature:

0untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
1RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
2trusted comment: lgtm ACK 1f28d22940805492fbec5a9eb37a5ab767f8add1 🦈
3jkwshUoCwdvu65CyGBzPw+32BRLBXNdyIarM/LG971L2uHq1nzFe+o7rE27IfcR9DibMN1VOul36+Ah1yIfAAg==

l0rinc force-pushed on Apr 10, 2025

l0rinc commented at 3:02 pm on April 10, 2025: contributor

it seems more likely that the issue is while running the functional tests under asan

I have narrowed it down to these tests: https://github.com/l0rinc/bitcoin/pull/2 (removed as many tests as possible, while the build still fails https://github.com/l0rinc/bitcoin/pull/2/files) - not sure how to continue.

Interestingly the ASAN build sometimes fails for 1 << 16 as well (see tracking issue: https://github.com/bitcoin/bitcoin/actions/runs/14355144302/job/40242755698?pr=32043#logs)

l0rinc force-pushed on Apr 10, 2025

in src/node/blockstorage.cpp:963 in bae92bb717 outdated

972+                fileout << blockundo << hasher.GetHash();
973+            }
974+
975+            // Make sure `AutoFile` goes out of scope before we call `FlushUndoFile`
976+            if (fileout.fclose()) {
977+                LogError("WriteBlockUndo: fclose failed\n");

maflcko commented at 6:38 am on April 11, 2025:

review note in bae92bb717693d0bb2602645832b243a0d17b35a: The trailing \n isn’t needed, but this doesn’t matter, because the LogError is removed in the last commit again.

l0rinc commented at 4:57 pm on April 13, 2025:

Done, since I was pushing anyway

in src/streams.h:531 in afc72bbe27 outdated

527@@ -523,7 +528,7 @@ class BufferedFile
528     }
529 
530 public:
531-    BufferedFile(AutoFile& file, uint64_t nBufSize, uint64_t nRewindIn)
532+    BufferedFile(AutoFile& file LIFETIMEBOUND, uint64_t nBufSize, uint64_t nRewindIn)

maflcko commented at 7:13 am on April 11, 2025:

nit in 1b4d187ded: Not sure why this is added, albeit it should be harmless. My understanding is that in C++, a mutable reference argument can not bind to a temporary. So the rules that would be enforced by LIFETIMEBOUND here are already enforced by the language.

l0rinc commented at 4:57 pm on April 13, 2025:

Could very well be, but it would still provide documentation against usages as:

0BufferedFile* createBufferedFile(fs::path streams_test_filename) {
1    AutoFile file{fsbridge::fopen(streams_test_filename, "w+b")};
2    return new BufferedFile(file, 10, 10);
3}

maflcko commented at 9:04 am on April 14, 2025:

Seems fine, but the commit history now looks odd. You are adding, removing, and then re-adding it.

(Personally, I run git range-diff on every push I do, or at least once I am done with pushes to a pull request. This ensures that all pushes ended up being correct)

l0rinc commented at 10:13 am on April 14, 2025:

yeah, hasty rebase mistake, fixed, thanks for your patience

in src/node/blockstorage.cpp:1005 in 824d03c592 outdated

1002+    AutoFile file{OpenBlockFile(pos, /*fReadOnly=*/true)};
1003+    if (file.IsNull()) {
1004         LogError("OpenBlockFile failed for %s while reading block", pos.ToString());
1005         return false;
1006     }
1007+    BufferedReader filein{std::move(file)};

maflcko commented at 8:31 am on April 11, 2025:

in 824d03c592b78cbc2abf2f03116467651f9388f6:

It is possible to achieve the same speed-up here, by using the existing SpanReader and ReadRawBlock with less code:

 0diff --git a/src/node/blockstorage.cpp b/src/node/blockstorage.cpp
 1index 9680415182..ff6b96af1f 100644
 2--- a/src/node/blockstorage.cpp
 3+++ b/src/node/blockstorage.cpp
 4@@ -995,18 +995,12 @@ bool BlockManager::WriteBlockUndo(const CBlockUndo& blockundo, BlockValidationSt
 5 bool BlockManager::ReadBlock(CBlock& block, const FlatFilePos& pos) const
 6 {
 7     block.SetNull();
 8-
 9-    // Open history file to read
10-    AutoFile file{OpenBlockFile(pos, /*fReadOnly=*/true)};
11-    if (file.IsNull()) {
12-        LogError("OpenBlockFile failed for %s while reading block", pos.ToString());
13+    std::vector<uint8_t> dat;
14+    if (!ReadRawBlock(dat, pos)) {
15         return false;
16     }
17-    BufferedReader filein{std::move(file)};
18-
19     try {
20-        // Read block
21-        filein >> TX_WITH_WITNESS(block);
22+        SpanReader{dat} >> TX_WITH_WITNESS(block);
23     } catch (const std::exception& e) {
24         LogError("Deserialize or I/O error - %s at %s while reading block", e.what(), pos.ToString());
25         return false;

It is even half a percent faster on my machine, but that is probably just noise.

Style-wise, what I like about re-using ReadRawBlock, is that it also clearly documents that ReadRawBlock is a logically subset of ReadBlock and does less checks and less work.

l0rinc commented at 4:57 pm on April 13, 2025:

This will use slightly more memory this way, but I love the idea (I can also measure a 0.4% speedup - not that it matters), let’s see what @ryanofsky thinks (he recommended smaller buffers)

maflcko commented at 9:09 am on April 14, 2025:

This will use slightly more memory this way

Yes. Though, the commit message seems to indicate the opposite?

Delegating ReadBlock to ReadRawBlock avoids the buffer overhead …

l0rinc commented at 10:15 am on April 14, 2025:

I meant the speed and complexity overhead - rephrased for clarity, thanks for the feedback.

in src/streams.h:470 in afc72bbe27 outdated

464@@ -465,6 +465,9 @@ class AutoFile
465         ::Unserialize(*this, obj);
466         return *this;
467     }
468+
469+    //! Write a mutable buffer more efficiently than write(), obfuscating the buffer in-place.
470+    void write_buffer(std::span<std::byte> src);

maflcko commented at 9:18 am on April 11, 2025:

nit in afc72bbe27f3b682fc45f949bfeb538233dbffa5: I don’t think this is part of the normal “stream subset”, so should be moved to a different section

l0rinc commented at 4:57 pm on April 13, 2025:

Moved it before read, let me know if this is better.

maflcko commented at 9:26 am on April 11, 2025: member

re-ACK afc72bbe27f3b682fc45f949bfeb538233dbffa5 🤽

Signature:

0untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
1RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
2trusted comment: re-ACK afc72bbe27f3b682fc45f949bfeb538233dbffa5 🤽
3OhqG1qS5V2TCDZFXMUBMBYBW1EH9niVMUiegitPGCMYyLov9Cpv2oB8N2+9yFP+HFr1pGh9w7qTC6HajnxznDw==

DrahtBot requested review from ryanofsky on Apr 11, 2025

DrahtBot requested review from hodlinator on Apr 11, 2025

l0rinc force-pushed on Apr 13, 2025

refactor: collect block read operations into try block

Reorganized error handling in block-related operations by grouping related operations together within the same scope.

In `ReadBlockUndo()` and `ReadBlock()`, moved all deserialization operations, comments and checksum verification inside a single try/catch block for cleaner error handling.
In `WriteBlockUndo()`, consolidated hash calculation and data writing operations within a common block to better express their logical relationship.

3197155f91

Narrow scope of undofile write to avoid possible resource management issue

`AutoFile{OpenUndoFile(pos)}` was still in scope when `FlushUndoFile(pos.nFile)` was called, which could lead to file handle conflicts or other unexpected behavior.

Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com>
Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>

6640dd52c9

scripted-diff: shorten BLOCK_SERIALIZATION_HEADER_SIZE constant

Renames the constant to be less verbose and better reflect its purpose:
it represents the size of the storage header that precedes serialized block data on disk,
not to be confused with a block's own header.

-BEGIN VERIFY SCRIPT-
git grep -q "STORAGE_HEADER_BYTES" $(git ls-files) && echo "Error: Target name STORAGE_HEADER_BYTES already exists in the codebase" && exit 1
sed -i 's/BLOCK_SERIALIZATION_HEADER_SIZE/STORAGE_HEADER_BYTES/g' $(git grep -l 'BLOCK_SERIALIZATION_HEADER_SIZE')
-END VERIFY SCRIPT-

a4de160492

log: unify error messages for (read/write)[undo]block

Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>

67fcc64802

l0rinc force-pushed on Apr 13, 2025

maflcko commented at 9:07 am on April 14, 2025: member

re-ACK d6e1b3849d98a6def4d8fc70c2c4ea71d0705211 💻

Signature:

0untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
1RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
2trusted comment: re-ACK d6e1b3849d98a6def4d8fc70c2c4ea71d0705211 💻
3NUJD8/Jp9vcO7VClfi+Zi8t3QYLlphjBP06VDeb9fKEClmCIr8yUefztOy/XqjGoBZByn9spNVKP6xZPXeuHDw==

refactor: clear up blockstorage/streams in preparation for optimization

Made every OpenBlockFile#fReadOnly value explicit.

Replaced hard-coded values in ReadRawBlock with STORAGE_HEADER_BYTES.
Changed `STORAGE_HEADER_BYTES` and `UNDO_DATA_DISK_OVERHEAD` to `uint32_t` to avoid casts.

Also added `LIFETIMEBOUND` to the `AutoFile` parameter of `BufferedFile`, which stores a reference to the underlying `AutoFile`, allowing Clang to emit warnings if the referenced `AutoFile` might be destroyed while `BufferedFile` still exists.
Without this attribute, code with lifetime violations wouldn't trigger compiler warnings.

Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>

056cb3c0d2

optimization: bulk serialization reads in `UndoRead`, `ReadBlock`

The obfuscation (XOR) operations are currently done byte-by-byte during serialization. Buffering the reads will enable batching the obfuscation operations later.

Different operating systems handle file caching differently, so reading larger batches (and processing them from memory) is measurably faster, likely because of fewer native fread calls and reduced lock contention.

Note that `ReadRawBlock` doesn't need buffering since it already reads the whole block directly.
Unlike `ReadBlockUndo`, the new `ReadBlock` implementation delegates to `ReadRawBlock`, which uses more memory than a buffered alternative but results in slightly simpler code and a small performance increase (~0.4%). This approach also clearly documents that `ReadRawBlock` is a logical subset of `ReadBlock` functionality.

The current implementation, which iterates over a fixed-size buffer, provides a more general alternative to Cory Fields' solution of reading the entire block size in advance.

Buffer sizes were selected based on benchmarking to ensure the buffered reader produces performance similar to reading the whole block into memory. Smaller buffers were slower, while larger ones showed diminishing returns.

------

> macOS Sequoia 15.3.1
> C++ compiler .......................... Clang 19.1.7
> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ && cmake --build build -j$(nproc) && build/bin/bench_bitcoin -filter='ReadBlockBench' -min-time=10000

Before:

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|        2,271,441.67 |              440.25 |    0.1% |     11.00 | `ReadBlockBench`

After:

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|        1,738,971.29 |              575.05 |    0.2% |     10.97 | `ReadBlockBench`

------

> Ubuntu 24.04.2 LTS
> C++ compiler .......................... GNU 13.3.0
> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ && cmake --build build -j$(nproc) && build/bin/bench_bitcoin -filter='ReadBlockBench' -min-time=20000

Before:

|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|        6,895,987.11 |              145.01 |    0.0% |   71,055,269.86 |   23,977,374.37 |  2.963 |   5,074,828.78 |    0.4% |     22.00 | `ReadBlockBench`

After:

|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|        5,771,882.71 |              173.25 |    0.0% |   65,741,889.82 |   20,453,232.33 |  3.214 |   3,971,321.75 |    0.3% |     22.01 | `ReadBlockBench`

Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>
Co-authored-by: Ryan Ofsky <ryan@ofsky.org>
Co-authored-by: Martin Leitner-Ankerl <martin.ankerl@gmail.com>
Co-authored-by: Cory Fields <cory-nospam-@coryfields.com>

520965e293

optimization: bulk serialization writes in `WriteBlockUndo` and `WriteBlock`

Similarly to the serialization reads optimization, buffered writes will enable batched XOR calculations.
This is especially beneficial since the current implementation requires copying the write input's `std::span` to perform obfuscation.
Batching allows us to apply XOR operations on the internal buffer instead, reducing unnecessary data copying and improving performance.

------

> macOS Sequoia 15.3.1
> C++ compiler .......................... Clang 19.1.7
> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ && cmake --build build -j$(nproc) && build/bin/bench_bitcoin -filter='WriteBlockBench' -min-time=10000

Before:

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|        5,149,564.31 |              194.19 |    0.8% |     10.95 | `WriteBlockBench`

After:

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|        2,990,564.63 |              334.39 |    1.5% |     11.27 | `WriteBlockBench`

------

> Ubuntu 24.04.2 LTS
> C++ compiler .......................... GNU 13.3.0
> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ && cmake --build build -j$(nproc) && build/bin/bench_bitcoin -filter='WriteBlockBench' -min-time=20000

Before:

|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|        5,152,973.58 |              194.06 |    2.2% |   19,350,886.41 |    8,784,539.75 |  2.203 |   3,079,335.21 |    0.4% |     23.18 | `WriteBlockBench`

After:

|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|        4,145,681.13 |              241.21 |    4.0% |   15,337,596.85 |    5,732,186.47 |  2.676 |   2,239,662.64 |    0.1% |     23.94 | `WriteBlockBench`

Co-authored-by: Ryan Ofsky <ryan@ofsky.org>
Co-authored-by: Cory Fields <cory-nospam-@coryfields.com>

8d801e3efb

l0rinc force-pushed on Apr 14, 2025

maflcko commented at 11:27 am on April 14, 2025: member

re-ACK 8d801e3efbf1e3b1f9a0060b777788f271cb21c9 🐦

Signature:

0untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
1RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
2trusted comment: re-ACK 8d801e3efbf1e3b1f9a0060b777788f271cb21c9 🐦
3NXHa0klOHq/bzEVdcT6vdDlYLPxRxRc77E8smjyF4BSUtIqR+eLTkbH+cNzcad3hULKH2NuDTpoe//j2/N/7Cw==

in src/node/blockstorage.cpp:936 in 056cb3c0d2 outdated

932@@ -933,7 +933,7 @@ bool BlockManager::WriteBlockUndo(const CBlockUndo& blockundo, BlockValidationSt
933     // Write undo information to disk
934     if (block.GetUndoPos().IsNull()) {
935         FlatFilePos pos;
936-        const unsigned int blockundo_size{static_cast<unsigned int>(GetSerializeSize(blockundo))};
937+        const auto blockundo_size{static_cast<uint32_t>(GetSerializeSize(blockundo))};

ryanofsky commented at 4:10 pm on April 14, 2025:

In commit “refactor: clear up blockstorage/streams in preparation for optimization” (056cb3c0d2efb86447b7ff7788c206e2e7d72c12)

Introducing uint32_t in this commit seems like a regression to me. Maybe I am missing something but it seems like previously code was only using size_t and unsigned types, and now it is still using both of these types and uint32_t is being added as a third type for no clear reason. (GetSerializeSize still returns size_t and FlatFilePos and related functions still use unsigned) Would suggest just keeping previous types since it seemed better, and in long run just switching to size_t everywhere.

Or feel free it ignore this comment if I am missing something, since this is not important.

l0rinc commented at 5:33 pm on April 14, 2025:

This was changed as a consequence of adjusting the constants to avoid a few casts in #31551 (review) But I’ll be the first to admit that I’m not exactly sure why we’re using both unsigned int and uint32_t (the latter seems more precise to me, but I know you prefer the more general ones). If it’s not a big deal, I’d like to tackle this in a follow-up PR, if needed, we’ve been editing and ACK-ig this PR for quite a few rounds now.

maflcko commented at 8:39 am on April 16, 2025:

in long run just switching to size_t everywhere.

I don’t think this is going to happen. size_t is unsigned, but there are many places that use signed types to denote sizes. Also, in structs or arrays, shorter fixed-size unsigned types are used sometimes, because they are sufficient an allow for better struct packing and array memory usage.

ryanofsky approved

ryanofsky commented at 4:34 pm on April 14, 2025: contributor

Code review ACK 8d801e3efbf1e3b1f9a0060b777788f271cb21c9. Most notable change is switching from BufferedReader to ReadRawBlock for block reads, which makes sense, and there are also various cleanups in blockstorage and test code.

hodlinator approved

hodlinator commented at 10:12 am on April 15, 2025: contributor

re-ACK 8d801e3efbf1e3b1f9a0060b777788f271cb21c9

Changes since previous review (containing concept remarks) / 1f28d22940805492fbec5a9eb37a5ab767f8add1:

Call ReadRawBlock used inside ReadBlock instead of buffering.
Removed try/catch exception censorship in tests.
Return error when fclose() fails to flush in intermediate commit.
Added LIFETIMEBOUND.
Switched to scripted diff for constant rename. Thanks for reviving _BYTES suffix!
Improved log messages and made them more unique.
Int-type/casting minutia.

Microbenchmark results using GCC 14.2.1

Optimization to use ReadRawBlock inside ReadBlock:

0|    op/s |  err% |         ins/op |         cyc/op |    IPC |        bra/op | benchmark
1|--------:|------:|---------------:|---------------:|-------:|--------------:|:----------
2|  149.35 |  0.1% |  72,571,626.29 |  24,212,461.78 |  2.997 |  5,095,459.91 | `ReadBlockBench` non-optimized
3|  182.00 |  0.1% |  64,420,332.52 |  20,028,281.73 |  3.216 |  3,634,906.46 | `ReadBlockBench` optimized

WriteBlock:

0|    op/s |  err% |         ins/op |        cyc/op |    IPC |        bra/op | benchmark
1|--------:|------:|---------------:|--------------:|-------:|--------------:|:----------
2|  314.96 |  0.4% |  19,677,668.56 |  8,910,214.00 |  2.208 |  3,034,914.07 | `WriteBlockBench` non-optimized
3|  511.77 |  0.7% |  12,649,108.06 |  4,473,937.05 |  2.827 |  1,658,127.69 | `WriteBlockBench` optimized
4|  514.57 |  0.3% |  12,633,747.44 |  4,521,294.66 |  2.794 |  1,655,747.80 | `WriteBlockBench` (optimized, buffersize 1 << 20)

in src/node/blockstorage.cpp:962 in 6640dd52c9 outdated

971+                // Write undo data & checksum
972+                fileout << blockundo << hasher.GetHash();
973+            }
974+
975+            // Make sure `AutoFile` goes out of scope before we call `FlushUndoFile`
976+            if (fileout.fclose()) {

TheCharlatan commented at 9:19 pm on April 16, 2025:

In commit 6640dd52c9fcb85d77f081780c02ee37b8089091

Introducing this if statement seems to be ephemeral, but even so, I’m not sure if its effects are really as intended. Previously fclose failing was just ignored. Is it really safe to add this early return now instead of just letting the Autofile destructor do its thing? Maybe I am missing something, but this will have effects on when ConnectBlock might fail, and I’m not sure if that is intended. There is also no similar check when we are writing blocks.

I saw the discussion in #31551 (review), and #31551 (review) , but I still don’t quite follow why a failed close is dangerous here, and not elsewhere.

l0rinc commented at 9:34 pm on April 16, 2025:

It’s an intermediary commit, the final code contains a flush only: https://github.com/bitcoin/bitcoin/pull/31551/files#diff-114c2880ec1ff2c5293ac65ceda0637bf92c05745b74b58410585a549464a33fR963

TheCharlatan commented at 9:53 pm on April 16, 2025:

So why is it added at all?

l0rinc commented at 9:55 pm on April 16, 2025:

To make the commit self-contained, as described in:

I’ll handle the close here temporarily (the last commit reverts it, but at least this commit is more complete) and will let #29307 (files) handle the existing ones.

TheCharlatan commented at 7:31 am on April 17, 2025:

Yes, I read that, and did not understand how this makes it more self-contained, or complete. The commit message also said nothing about adding a new error condition.

l0rinc commented at 7:37 am on April 17, 2025:

It’s closed now uniformly in https://github.com/bitcoin/bitcoin/pull/29307/files#diff-114c2880ec1ff2c5293ac65ceda0637bf92c05745b74b58410585a549464a33fR965-R972 - does that address your concern?

TheCharlatan commented at 8:46 am on April 17, 2025:

Not really, but I guess we can continue over there.

achow101 commented at 9:26 pm on April 16, 2025: member

ACK 8d801e3efbf1e3b1f9a0060b777788f271cb21c9

achow101 merged this on Apr 16, 2025

achow101 closed this on Apr 16, 2025

l0rinc deleted the branch on Apr 17, 2025

l0rinc commented at 7:53 am on April 17, 2025: contributor

Thank you all for helping me get this over the finish line @theuni, @ryanofsky, @maflcko, @hodlinator, @achow101! It went over quite a few iterations until we found the version that we all liked.

TheCharlatan commented at 8:45 am on April 17, 2025: contributor

Post-merge ACK

[IBD] batch block reads/writes during AutoFile serialization #31551

Summary

Unlocking new optimization opportunities

Measurements (micro benchmarks, full IBDs and reindexes)

Code Coverage & Benchmarks

Reviews

Conflicts

Option A)

Option B)

Option C)

Concept

Microbenchmark results for Clang 19.1.7

Microbenchmark results using GCC 14.2.1

[IBD] batch block reads/writes during `AutoFile` serialization #31551