I think the main reason it is there in the first place is to avoid the seek when possible. It basically functions as a cache approximating the file size. It will overshoot by a small amount (depending on the size of whatever is being written that doesn't currently fit), but it's mostly accurate, and I think it's done this way for performance reasons. After all, only the pre-allocation part overshoots; the ftruncate call adjusts the file size to the expected value (…except it doesn't – see #17892).
We could of course remove the offset completely, but I think it would be an unnecessary performance loss with very little gain.
In practice, we end up over-pre-allocating up to the maximum possible size of an entry written into the respective files; for rev files, I assume this is usually in the few-kB range, and for blk files, a couple of MB, since blocks are around that size. Analysis below.
===
Given:
- required = bytes needed
- position = current position inside the file
- chunk_size = the size of each chunk in the file (1 MB for rev files)
- allocation = current file allocation (pre-allocated space)
- filesize = current file size, i.e. whatever we truncated it to (identical to allocation on all systems except OSX; that discrepancy is in part mitigated by the fix here)

old_chunks = (position + chunk_size - 1) / chunk_size
new_chunks = (position + required + chunk_size - 1) / chunk_size

(Integer division; the "+ chunk_size - 1" makes these round up to whole chunks.)
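As a concrete (made-up) example: with chunk_size = 1048576 (1 MB), position = 2621440 (2.5 MB) and required = 700000, we get old_chunks = 3670015 / 1048576 = 3 and new_chunks = 4370015 / 1048576 = 4, so this write crosses into a new chunk.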
For the case where new_chunks > old_chunks, there is a discrepancy between position and filesize of between 0 and chunk_size - 1 bytes (not greater, because the code never allocates more chunks than needed).
- The offset is set to position, and thus the assumed file size is in the range [filesize - chunk_size + 1 .. filesize].
- The length is set to new_chunks * chunk_size - position, i.e. at most position + required + chunk_size - 1 - position = required + chunk_size - 1 (see the sketch below for both computations).
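Putting this together, here is a minimal self-contained sketch of the computation as I read it (my own simplified model, not the actual code; the Range struct and the name AllocateChunks are made up):

```cpp
#include <cstdint>

// Simplified model of the chunked pre-allocation described above.
// length == 0 means the write still fits into the already-allocated chunks.
struct Range { uint64_t offset; uint64_t length; };

Range AllocateChunks(uint64_t position, uint64_t required, uint64_t chunk_size)
{
    // The "+ chunk_size - 1" turns integer division into a round-up.
    const uint64_t old_chunks = (position + chunk_size - 1) / chunk_size;
    const uint64_t new_chunks = (position + required + chunk_size - 1) / chunk_size;

    if (new_chunks <= old_chunks) return {position, 0}; // nothing to allocate

    // offset = position may trail the real (pre-allocated) file size by up to
    // chunk_size - 1 bytes; length is at most required + chunk_size - 1.
    return {position, new_chunks * chunk_size - position};
}
```

With the example numbers from above, this returns offset = 2621440 and length = 4 * 1048576 - 2621440 = 1572864 bytes, within the required + chunk_size - 1 = 1748575 bound.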
Results:
- Expectation: pre-allocate required + chunk_size - 1 free bytes to get a file of size position + required + chunk_size - 1 bytes, then truncate to that size.
- Result: pre-allocate required + chunk_size - 1 free bytes to get a file of size filesize + required + chunk_size - 1 bytes, then truncate to position + required + chunk_size - 1 bytes (this is currently not functional on APFS).
- I.e. we end up with up to chunk_size more bytes in the pre-allocation stage.
- For the rev case, we end up over-pre-allocating up to 1 extra MB of space, and we keep bumping into this (fairly randomly within 0..1024*1024, though in practice the extra is capped at the size of the biggest entry we write into rev files; the simulation below illustrates this).
- For the blk case, we could over-pre-allocate up to 16 MB, but since blocks are capped at a few MB, it ends up being a few MB in practice.
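To sanity-check the rev case, here is a quick simulation of the same arithmetic (a made-up model: entry sizes, iteration count and seed are arbitrary, and this is not the real write path):

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <random>

int main()
{
    const uint64_t chunk_size = 1 << 20; // 1 MB, as for rev files
    uint64_t position = 0;  // logical write position
    uint64_t filesize = 0;  // pre-allocated size (a whole number of chunks)
    uint64_t max_extra = 0; // worst filesize - position seen at allocation time

    std::mt19937_64 rng{42};
    // Made-up entry sizes; real undo data varies, but a few kB is plausible.
    std::uniform_int_distribution<uint64_t> entry_size{100, 5000};

    for (int i = 0; i < 1000000; ++i) {
        const uint64_t required = entry_size(rng);
        const uint64_t old_chunks = (position + chunk_size - 1) / chunk_size;
        const uint64_t new_chunks = (position + required + chunk_size - 1) / chunk_size;
        if (new_chunks > old_chunks) {
            // The discrepancy at this point is the over-pre-allocation.
            max_extra = std::max(max_extra, filesize - position);
            filesize = new_chunks * chunk_size;
        }
        position += required;
    }
    std::printf("max extra: %llu bytes (largest possible entry: 5000)\n",
                static_cast<unsigned long long>(max_extra));
}
```

In this model the reported maximum stays below the largest entry size (5000 bytes here) rather than anywhere near the full 1 MB chunk, which matches the cap described above: an allocation only triggers once position + required crosses past the last allocated chunk boundary, so the leftover discrepancy is always smaller than required.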
If you spot any errors, let me know and I’ll update.