maflcko
commented at 8:07 am on July 8, 2023:
member
Currently the *.dat files in the blocksdir store the data received from remote peers as-is. This may be problematic when a program other than Bitcoin Core tries to interpret them by accident. For example, an anti-virus program or other program may scan them and move them into quarantine, or delete them, or corrupt them. This may cause Bitcoin Core to fail a reorg, or fail to reply to block requests (via P2P, RPC, REST, …).
Fix this, similar to #6650, by rolling a random XOR pattern over the dat files when writing or reading them.
Obviously this can only protect against programs that accidentally and unintentionally are trying to mess with the dat files. Any program that intentionally wants to mess with the dat files can still trivially do so.
The XOR pattern is only applied when the blocksdir is freshly created, and there is an option to disable it (on creation), so that people can disable it, if needed.
DrahtBot
commented at 8:07 am on July 8, 2023:
contributor
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
luke-jr
commented at 6:02 pm on July 9, 2023:
member
AFAIK there is software which reads these files intentionally. IMO they can be expected to adapt, but having the option to disable it seems like a good idea in case they aren’t ready right away.
TheCharlatan
commented at 8:30 am on July 10, 2023:
contributor
Strong Concept ACK.
maflcko force-pushed
on Jul 10, 2023
maflcko
commented at 10:21 am on July 10, 2023:
member
Thanks for the feedback so far. I’ve kept the option to disable, but made it a simple bool instead of allowing the user to pick the XOR-key.
If someone is interested, a new commit can be added to upgrade the linearize scripts to read the XOR-key.
DrahtBot added the label
Needs rebase
on Jul 11, 2023
maflcko force-pushed
on Jul 12, 2023
DrahtBot removed the label
Needs rebase
on Jul 12, 2023
dergoegge
commented at 12:19 pm on July 13, 2023:
member
Concept ACK
DrahtBot added the label
Needs rebase
on Jul 31, 2023
maflcko force-pushed
on Aug 1, 2023
maflcko
commented at 2:06 pm on August 1, 2023:
member
Rebased on the latest #https://github.com/bitcoin/bitcoin/pull/28060
DrahtBot removed the label
Needs rebase
on Aug 1, 2023
fanquake referenced this in commit
ceda819886
on Aug 1, 2023
DrahtBot removed the label
CI failed
on Aug 1, 2023
in
test/functional/feature_reindex.py:44
in
f0d2c3f73eoutdated
39@@ -39,9 +40,18 @@ def out_of_order(self):
40 # we're generating them rather than getting them from peers), so to
41 # test out-of-order handling, swap blocks 1 and 2 on disk.
42 blk0 = self.nodes[0].blocks_path / "blk00000.dat"
43+ with open(self.nodes[0].blocks_path / "xor.key", "rb") as xor_f:
44+ xor_dat = deser_string(xor_f)
I guess it may be simpler to serialize and deserialize the xor key as a raw byte span (without the size indicator). This should make it even easier to read in external scripts, for example in vanilla python.
Edit: And readers can assume it to be constant size. For example:
darosior
commented at 7:23 am on August 2, 2023:
member
Concept ACK.
maflcko marked this as ready for review
on Aug 2, 2023
maflcko force-pushed
on Aug 2, 2023
maflcko force-pushed
on Aug 2, 2023
maflcko
commented at 8:25 am on August 2, 2023:
member
Rebased, and made every commit a refactor, except for the one that has release notes and test changes :)
willcl-ark
commented at 1:53 pm on August 2, 2023:
member
Concept ACK.
I’ve not reviewed the code in detail yet, but this is a nice followup to #28060.
It seems to work well so far. I wrote a quick-n-dirty python script to manually xor my data dir using the same 8 byte key from this pull, and Bitcoin Core at commit fa3c1d230a didn’t have any problems reading the xored blocks or key afterwards.
sidhujag referenced this in commit
3a522e929e
on Aug 9, 2023
DrahtBot added the label
CI failed
on Sep 10, 2023
maflcko force-pushed
on Sep 11, 2023
kristapsk
commented at 11:29 am on September 11, 2023:
contributor
Concept ACK
DrahtBot removed the label
CI failed
on Sep 11, 2023
fanquake referenced this in commit
8209e86eeb
on Sep 14, 2023
DrahtBot added the label
Needs rebase
on Sep 14, 2023
maflcko force-pushed
on Sep 14, 2023
DrahtBot removed the label
Needs rebase
on Sep 14, 2023
DrahtBot added the label
CI failed
on Sep 14, 2023
maflcko force-pushed
on Sep 14, 2023
maflcko
commented at 3:14 pm on September 14, 2023:
member
Given that most of this is refactoring, which makes sense on its own, I’ve split that up into #28483 and turned this pull into a draft for now.
maflcko marked this as a draft
on Sep 14, 2023
DrahtBot removed the label
CI failed
on Sep 14, 2023
fanquake referenced this in commit
c9f288244b
on Sep 26, 2023
DrahtBot added the label
Needs rebase
on Sep 26, 2023
maflcko marked this as ready for review
on Sep 28, 2023
maflcko force-pushed
on Sep 28, 2023
DrahtBot added the label
CI failed
on Sep 28, 2023
DrahtBot removed the label
Needs rebase
on Sep 28, 2023
DrahtBot removed the label
CI failed
on Sep 28, 2023
maflcko force-pushed
on Oct 4, 2023
pablomartin4btc
commented at 7:55 pm on October 26, 2023:
member
tACKfa9f3f70011842e2c4358d15ab4639734d976f5c
02023-10-26T18:07:06Z init message: Verifying blocks…
12023-10-26T18:07:06Z Verifying last 6 blocks at level 3
22023-10-26T18:07:06Z Verification progress: 0%
32023-10-26T18:07:06Z ERROR: ReadBlockFromDisk: Deserialize or I/O error - ReadCompactSize(): size too large: iostream error at FlatFilePos(nFile=5, nPos=62189381)
42023-10-26T18:07:06Z Verification error: ReadBlockFromDisk failed at 160000, hash=0000003ca3c99aff040f2563c2ad8f8ec88bd0fd6b8f0895cfaf1ef90353a62c
52023-10-26T18:07:06Z : Corrupted block database detected.
6Please restart with -reindex or -reindex-chainstate to recover.
7: Corrupted block database detected.
8Please restart with -reindex or -reindex-chainstate to recover.
92023-10-26T18:07:06Z Aborted block database rebuild. Exiting.
102023-10-26T18:07:06Z Shutdown: In progress...
112023-10-26T18:07:06Z scheduler thread exit
122023-10-26T18:07:06Z Flushed fee estimates to fee_estimates.dat.
132023-10-26T18:07:06Z Shutdown: done
2023-10-26T18:12:03Z Using xor key for *.dat files: '0000000000000000'
Now, considering the possibility of the XOR key file getting corrupted (for example, accidentally opening it with an app that offers to ‘fix’ a corruption) or being deleted, especially on a fresh run when a random XOR key is generated (which I’ve also tested), wouldn’t it be better to save the key in the .conf or settings?
Before having that thought, I was considering proposing to give the option to the user to specify the filename, but that would add complexity, and I’m not sure if it’s worth it.
DrahtBot requested review from sipa
on Oct 26, 2023
DrahtBot requested review from jamesob
on Oct 26, 2023
DrahtBot requested review from willcl-ark
on Oct 26, 2023
DrahtBot requested review from dergoegge
on Oct 26, 2023
DrahtBot requested review from darosior
on Oct 26, 2023
DrahtBot requested review from TheCharlatan
on Oct 26, 2023
maflcko
commented at 8:23 am on October 27, 2023:
member
wouldn’t it be better to save the key in the .conf or settings?
The key is not a configuration or setting that is supposed to be modified by a user or that can be changed at all. Once it is persisted to storage, it is set in stone for all (future) dat files.
DrahtBot added the label
Needs rebase
on Oct 30, 2023
maflcko force-pushed
on Oct 30, 2023
DrahtBot removed the label
Needs rebase
on Oct 30, 2023
pablomartin4btc
commented at 4:48 pm on October 30, 2023:
member
re ACKfa6b180beb20d64962c2b5440eeacb31ed889a81
maflcko force-pushed
on Nov 22, 2023
lrs955
commented at 12:31 pm on November 23, 2023:
none
why not save the xor key inside the leveldb block database?
wouldn’t it be better to save the key in the .conf or settings?
The key is not a configuration or setting that is supposed to be modified by a user or that can be changed at all. Once it is persisted to storage, it is set in stone for all (future) dat files.
maflcko
commented at 12:34 pm on November 23, 2023:
member
why not save the xor key inside the leveldb block database?
Just seems more complicated with no benefit?
lrs955
commented at 6:17 pm on November 23, 2023:
none
why not save the xor key inside the leveldb block database?
Just seems more complicated with no benefit?
I understand.
Maybe im wrong but isn’t also the xor pattern of the chainstate saved in the database itself? Different leveldb database but still in a key a database.
DrahtBot added the label
CI failed
on Dec 13, 2023
DrahtBot removed the label
CI failed
on Dec 14, 2023
DrahtBot added the label
Needs rebase
on Dec 14, 2023
maflcko force-pushed
on Mar 22, 2024
DrahtBot removed the label
Needs rebase
on Mar 22, 2024
willcl-ark
commented at 2:15 pm on March 22, 2024:
member
While you’re re-touching, I think it could be worth adding an entry to doc/files.md too, especially as losing/not backing-up this file results in requiring a full reindex?
maflcko
commented at 3:03 pm on March 22, 2024:
member
especially as losing/not backing-up this file results in requiring a full reindex?
I don’t think this is true. The xor key can trivially be restored by xor-ing the first 8 bytes of the first block file with the first 8 bytes of the first block file of a datadir created with the xor disabled. This works, because the first block written should be the genesis block, always.
In python:
0>>> bytes(a ^ b for a,b in zip(open('blk00000.dat', 'rb').read()[:8], open('clean-blk00000.dat', 'rb').read()[:8])).hex()
1'228b70bf9c1abb3f'2>>> open('xor.key', 'rb').read().hex() # double-check3'228b70bf9c1abb3f'
maflcko force-pushed
on Mar 22, 2024
maflcko
commented at 4:12 pm on March 22, 2024:
member
doc/files.md
Done
in
src/node/blockstorage.cpp:1164
in
faaf3cf535outdated
1169+ if (!fs::is_empty(opts.blocks_dir) || opts.xor_key == false) std::fill(xor_key.begin(), xor_key.end(), std::byte{});
1170+ const fs::path xor_key_path{opts.blocks_dir / "xor.dat"};
1171+ if (fs::exists(xor_key_path)) {
1172+ // A pre-existing xor key file has priority.
1173+ AutoFile xor_key_file{fsbridge::fopen(xor_key_path, "rb")};
1174+ xor_key_file >> xor_key;
Jjust noting here that if the user specifies -blocksxor=01234567 and a file-based key already exists in the datadir, the key from the file is silently used.
This is the correct behaviour IMO, but wonder if we should log that the user-set option has been overridden?
As @maflcko pointed out to me previously, even if a user mistakenly thought they were using a custom xor key set as a CL option and removed the key file, they could still restore it. So I don’t see any dangers with this current behaviour.
The option is boolean, so I guess it could throw here. In any case, the setting won’t have an effect on an existing blocksdir, even on -reindex. Maybe the manpage can be adjusted to clarify this?
hodlinator
commented at 10:24 pm on July 16, 2024:
willcl-ark
commented at 9:38 am on May 1, 2024:
member
ACKfaaf3cf535999c2c9a84337414c8b35435384862
Nice change with the same rationale as the previous mempool XOR changes (preventing anti-virus falsely flagging/modifying block files).
One non-blocking comment left as a result of manual testing.
DrahtBot requested review from pablomartin4btc
on May 1, 2024
in
test/functional/feature_reindex.py:42
in
faaf3cf535outdated
38@@ -39,9 +39,19 @@ def out_of_order(self):
39 # we're generating them rather than getting them from peers), so to
40 # test out-of-order handling, swap blocks 1 and 2 on disk.
41 blk0 = self.nodes[0].blocks_path / "blk00000.dat"
42+ with open(self.nodes[0].blocks_path / "xor.dat", "rb") as xor_f:
TheCharlatan
commented at 3:05 pm on May 11, 2024:
Just a comment: This test implicitly checks if the blockfiles are XORed. Would a simple additional check on the undo data make sense?
in
src/node/blockstorage.cpp:1188
in
faaf3cf535outdated
1183+ strprintf("The blocksdir XOR-key can not be disabled when a random key was already stored! "
1184+ "Stored key: '%s', stored path: '%s'.",
1185+ HexStr(xor_key), fs::PathToString(xor_key_path)),
1186+ };
1187+ }
1188+ LogInfo("Using xor key for *.dat files: '%s'\n", HexStr(xor_key));
TheCharlatan
commented at 7:24 pm on May 11, 2024:
Nit:
This prints as:
0Using xor key for *.dat files: '0000000000000000'`
But the log lines informing the user on data obfuscation in the other DBs use:
0Using obfuscation key for /home/drgrid/.bitcoin/blocks/index: 00000000000000001Using obfuscation key for /home/drgrid/.bitcoin/chainstate: a3b154cf1d9714d7
I think it would be nice to be consistent here, both by listing the full directory and using the same text.
I’d say DB obfuscation is different from *.dat obfuscation, but I’ve made the messages a bit more similar. Thanks!
in
src/node/blockstorage.cpp:1169
in
faaf3cf535outdated
1164+ std::array<std::byte, 8> xor_key{};
1165+
1166+ // Initialize random fresh key, ...
1167+ FastRandomContext{}.fillrand(xor_key);
1168+ // ... but allow legacy or user-disabled key.
1169+ if (!fs::is_empty(opts.blocks_dir) || opts.xor_key == false) std::fill(xor_key.begin(), xor_key.end(), std::byte{});
TheCharlatan
commented at 7:48 pm on May 11, 2024:
Nit: I found this a bit hard to follow and a bit unfortunate that a new random key would be generated every time whether it were used or not. How about:
0@@ -1166,4 +1165,0 @@ static auto InitBlocksdirXorKey(const BlockManager::Options& opts)
1- // Initialize random fresh key, ...
2- FastRandomContext{}.fillrand(xor_key);
3- // ... but allow legacy or user-disabled key.
4- if (!fs::is_empty(opts.blocks_dir) || opts.xor_key == false) std::fill(xor_key.begin(), xor_key.end(), std::byte{});
5@@ -1171 +1167,8 @@ static auto InitBlocksdirXorKey(const BlockManager::Options& opts)
6- if (fs::exists(xor_key_path)) {
7+ bool key_exists = fs::exists(xor_key_path);
8+
9+ if (!key_exists && opts.xor_key && fs::is_empty(opts.blocks_dir)) {
10+ // Initialize random fresh key, ...
11+ FastRandomContext{}.fillrand(xor_key);
12+ }
13+
14+ if (key_exists) {
Thanks, but actually it doesn’t work when running on Windows. Added a new commit with a workaround.
hodlinator
commented at 10:31 pm on July 16, 2024:
contributor
Concept ACKfae8e0a9825d107431f3f33e4f70fe54d02ab3e3
(Haven’t had time to look at the tests yet).
Appreciate the robustness of InitBlocksdirXorKey() after some testing of removing the blocks/xor.dat file, re-running bitcoind. hexdumping it, removing the whole blocks/ directory, rerunning, hexdumping again etc.
maflcko force-pushed
on Jul 17, 2024
in
src/node/blockstorage.cpp:1164
in
fa986c918aoutdated
1159+{
1160+ // Bytes are serialized without length indicator, so this is also the exact
1161+ // size of the XOR-key file.
1162+ std::array<std::byte, 8> xor_key{};
1163+
1164+ if (fs::is_empty(opts.blocks_dir) && opts.use_xor) {
Make sure to run all tests locally, according to the documentation.
The failure may happen due to a number of reasons, for example:
Possibly due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the
affected test.
An intermittent issue.
Leave a comment here, if you need help tracking down a confusing failure.
in
src/node/blockstorage.cpp:1155
in
fab2287c6eoutdated
1150+ // Bytes are serialized without length indicator, so this is also the exact
1151+ // size of the XOR-key file.
1152+ std::array<std::byte, 8> xor_key{};
1153+
1154+ if (opts.use_xor && fs::is_empty(opts.blocks_dir)) {
1155+ // Only use random fresh key when the boolean option is set and when on
TheCharlatan
commented at 10:37 am on July 26, 2024:
This is a metadata mirror of the GitHub repository
bitcoin/bitcoin.
This site is not affiliated with GitHub.
Content is generated from a GitHub metadata backup.
generated: 2024-12-22 00:12 UTC
This site is hosted by @0xB10C More mirrored repositories can be found on mirror.b10c.me