I think the main reason it is there in the first place is to avoid the seek when possible. It basically functions as a cache approximating the file size. It will overshoot by a small amount (depending on the size of whatever is being written that doesn't currently fit), but it's mostly accurate, and I think it's done this way for performance reasons. After all, only the pre-allocation part overshoots; the ftruncate call adjusts the file size to the expected value (…except it doesn't – see #17892).
We could of course remove the offset completely, but I think it would be an unnecessary performance loss with very little gain.
In practice, we end up over-pre-allocating up to the maximum possible size of an entry written into the respective files; for rev files, I assume this is usually in the few-kB range, and for blk files, a couple of MB, since blocks are around that size. Analysis below.
===
Given:
- required = bytes needed
- position = current position inside the file
- chunk_size = the size of each chunk in the file (1 MB for rev files)
- allocation = current file allocation (pre-allocated space)
- filesize = current file size, i.e. whatever we truncated it to (identical to allocation on all systems except OSX; that discrepancy is in part mitigated by the fix here)

old_chunks = (position + chunk_size - 1) / chunk_size
new_chunks = (position + required + chunk_size - 1) / chunk_size

(Integer division; the "+ chunk_size - 1" makes these round up to whole chunks.)
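As a concrete (made-up) example: with chunk_size = 1048576 (1 MB), position = 2621440 (2.5 MB) and required = 700000, we get old_chunks = 3670015 / 1048576 = 3 and new_chunks = 4370015 / 1048576 = 4, so this write crosses into a new chunk.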
For the case where new_chunks > old_chunks, there is a discrepancy between position and filesize of between 0 and chunk_size - 1 bytes (not greater, because the code never allocates more chunks than needed).
- The offset is set to position, and thus the assumed file size is in the range [filesize - chunk_size + 1 .. filesize].
- The length is set to new_chunks * chunk_size - position, i.e. at most position + required + chunk_size - 1 - position = required + chunk_size - 1 (see the sketch below for both computations).
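Putting this together, here is a minimal self-contained sketch of the computation as I read it (my own simplified model, not the actual code; the Range struct and the name AllocateChunks are made up):

```cpp
#include <cstdint>

// Simplified model of the chunked pre-allocation described above.
// length == 0 means the write still fits into the already-allocated chunks.
struct Range { uint64_t offset; uint64_t length; };

Range AllocateChunks(uint64_t position, uint64_t required, uint64_t chunk_size)
{
    // The "+ chunk_size - 1" turns integer division into a round-up.
    const uint64_t old_chunks = (position + chunk_size - 1) / chunk_size;
    const uint64_t new_chunks = (position + required + chunk_size - 1) / chunk_size;

    if (new_chunks <= old_chunks) return {position, 0}; // nothing to allocate

    // offset = position may trail the real (pre-allocated) file size by up to
    // chunk_size - 1 bytes; length is at most required + chunk_size - 1.
    return {position, new_chunks * chunk_size - position};
}
```

With the example numbers from above, this returns offset = 2621440 and length = 4 * 1048576 - 2621440 = 1572864 bytes, within the required + chunk_size - 1 = 1748575 bound.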
Results:
- Expectation: pre-allocate required + chunk_size - 1 free bytes to get a file of size position + required + chunk_size - 1 bytes, then truncate to that size.
- Result: pre-allocate required + chunk_size - 1 free bytes to get a file of size filesize + required + chunk_size - 1 bytes, then truncate to position + required + chunk_size - 1 bytes (this is currently not functional on APFS).
- I.e. we end up with up to chunk_size more bytes in the pre-allocation stage.
- For the rev case, we end up over-pre-allocating up to 1 extra MB of space, and we keep bumping into this (fairly randomly within 0..1024*1024, though in practice the extra is capped at the size of the biggest entry we write into rev files; the simulation below illustrates this).
- For the blk case, we could over-pre-allocate up to 16 MB, but since blocks are capped at a few MB, it ends up being a few MB in practice.
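To sanity-check the rev case, here is a quick simulation of the same arithmetic (a made-up model: entry sizes, iteration count and seed are arbitrary, and this is not the real write path):

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <random>

int main()
{
    const uint64_t chunk_size = 1 << 20; // 1 MB, as for rev files
    uint64_t position = 0;  // logical write position
    uint64_t filesize = 0;  // pre-allocated size (a whole number of chunks)
    uint64_t max_extra = 0; // worst filesize - position seen at allocation time

    std::mt19937_64 rng{42};
    // Made-up entry sizes; real undo data varies, but a few kB is plausible.
    std::uniform_int_distribution<uint64_t> entry_size{100, 5000};

    for (int i = 0; i < 1000000; ++i) {
        const uint64_t required = entry_size(rng);
        const uint64_t old_chunks = (position + chunk_size - 1) / chunk_size;
        const uint64_t new_chunks = (position + required + chunk_size - 1) / chunk_size;
        if (new_chunks > old_chunks) {
            // The discrepancy at this point is the over-pre-allocation.
            max_extra = std::max(max_extra, filesize - position);
            filesize = new_chunks * chunk_size;
        }
        position += required;
    }
    std::printf("max extra: %llu bytes (largest possible entry: 5000)\n",
                static_cast<unsigned long long>(max_extra));
}
```

In this model the reported maximum stays below the largest entry size (5000 bytes here) rather than anywhere near the full 1 MB chunk, which matches the cap described above: an allocation only triggers once position + required crosses past the last allocated chunk boundary, so the leftover discrepancy is always smaller than required.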
If you spot any errors, let me know and I’ll update.