mempool: Persist with XOR #28207

pull maflcko wants to merge 1 commits into bitcoin:master from maflcko:2308-xor-memepool- changing 9 files +49 −15
  1. maflcko commented at 9:09 am on August 3, 2023: member

    Currently the mempool.dat file stores data received from remote peers as-is. This may be problematic when a program other than Bitcoin Core tries to interpret them by accident. For example, an anti-virus program or other program may scan the file and move it into quarantine, or delete it, or corrupt it.

    While the local wallet is expected to re-submit any pending transactions, unrelated transactions may be missing from the mempool after a restart. This may cause fee estimates to be off, or may cause block relay to be slower.

    Fix this, similar to #6650, by rolling a random XOR pattern over the dat file when writing or reading it.

    Obviously this can only protect against programs that accidentally and unintentionally are trying to mess with the dat file. Any program that intentionally wants to mess with the dat file can still trivially do so.

  2. DrahtBot commented at 9:09 am on August 3, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #28438 (Use serialization parameters for CTransaction by ajtowns)
    • #28132 (policy: Enable full-rbf by default by petertodd)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label Mempool on Aug 3, 2023
  4. maflcko commented at 9:12 am on August 3, 2023: member

    Currently opened as draft to wait for initial feedback. There is no option to disable this feature, because I am not aware of anyone reading the mempool.dat, is there? (The getrawmempool RPC is the recommended way to get the mempool, and using the savemempool RPC instead seems like an edge-case?)

  5. Sjors commented at 9:21 am on August 3, 2023: member

    Concept ACK.

    It may be difficult to find out if anyone relies on parsing mempool.dat. If it’s not too hard, we might as well add an option (default 1) and deprecate it in a few releases. That also makes it easier to toggle between master and the last release (by setting it to 0).

    You may also want to use a bit flag instead of increasing the version. The added complexity of that could be an argument to not make this optional.

  6. maflcko commented at 9:31 am on August 3, 2023: member

    I don’t think bit flags make sense in this context, because the mempool.dat frequently changes and is at most expected to be read by the previous version. However, your suggestion to use a setting makes sense. The setting could have several options, like “write version 1 mempool.dat”, and “use 0 for all random numbers” (if needed), and “use prng”.

    Edit: Hmm, I think just having a boolean option to say “write version 1 mempool.dat” is enough. If there are any users that need “use 0 for all random numbers”, they can create a feature request after they read the release notes.

  7. in src/kernel/mempool_persist.cpp:39 in fae683bc57 outdated
    34@@ -35,14 +35,15 @@ using fsbridge::FopenFn;
    35 
    36 namespace kernel {
    37 
    38-static const uint64_t MEMPOOL_DUMP_VERSION = 1;
    39+static const uint64_t MEMPOOL_DUMP_VERSION{2};
    


    ismaelsadeeq commented at 9:49 am on August 3, 2023:
    Will previous versions be able to read the mempool dump version 2? mempool.dat suppose to be both backward and forward-compatible between versions

    maflcko commented at 9:59 am on August 3, 2023:
    See my previous comment: #28207 (comment)

    maflcko commented at 10:00 am on August 3, 2023:
    No existing software that can only read version 1, can thus not read version 2.

    ismaelsadeeq commented at 1:04 pm on August 3, 2023:
    Okay, I was asking because I see a test for that in mempool_compatibility.py, probably that’s why CI is failing, mempool.dat future versions are to be backwards incompatible then we can remove the mempool_compatibility.py test along with this?

    maflcko commented at 7:19 am on August 4, 2023:
    Thanks, fixed CI
  8. DrahtBot added the label CI failed on Aug 3, 2023
  9. 0xB10C commented at 6:28 pm on August 3, 2023: contributor

    Concept ACK. I opened issue #16721 a while ago but closed it as it didn’t get much attention back then.

    I am not aware of anyone reading the mempool.dat, is there?

    I’ve written a mempool.dat parser a few years ago for fun. However, as you said, the RPCs are powerful enough and if someone really really wants to read the file, they can implement XOR functionality. Similar to block0000.dat files, these files are not something considered an interface for others to rely on.

  10. maflcko commented at 5:41 am on August 4, 2023: member
    Just for context: For testing I used -datacarriersize=9999999 and then put the following nulldata raw() descriptor into the mempool.dat: raw(6a4458354f2150254041505b345c505a58353428505e2937434329377d2445494341522d5354414e444152442d414e544956495255532d544553542d46494c452124482b482a). On current master I got several hits by different virus scanners. On this pull, all virus scanners were green and didn’t put the mempool.dat in the quarantine.
  11. maflcko force-pushed on Aug 4, 2023
  12. maflcko force-pushed on Aug 4, 2023
  13. maflcko marked this as ready for review on Aug 4, 2023
  14. maflcko force-pushed on Aug 4, 2023
  15. maflcko commented at 12:24 pm on August 4, 2023: member
    Added some code to make the Windows CI pass
  16. DrahtBot removed the label CI failed on Aug 4, 2023
  17. in src/init.cpp:462 in fa3276d785 outdated
    455@@ -456,6 +456,11 @@ void SetupServerArgs(ArgsManager& argsman)
    456     argsman.AddArg("-par=<n>", strprintf("Set the number of script verification threads (%u to %d, 0 = auto, <0 = leave that many cores free, default: %d)",
    457         -GetNumCores(), MAX_SCRIPTCHECK_THREADS, DEFAULT_SCRIPTCHECK_THREADS), ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);
    458     argsman.AddArg("-persistmempool", strprintf("Whether to save the mempool on shutdown and load on restart (default: %u)", DEFAULT_PERSIST_MEMPOOL), ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);
    459+    argsman.AddArg("-persistmempoolv1",
    460+                   strprintf("Whether a mempool.dat file created by -persistmempool or the savemempool RPC will be written in the legacy format "
    461+                             "(version 1) or the current format (version 2). This temporary option will be removed in the future. (default: %u)",
    462+                             DEFAULT_PERSIST_MEMPOOL),
    


    darosior commented at 5:44 am on August 5, 2023:
    0                             DEFAULT_PERSIST_V1_DAT),
    

    maflcko commented at 10:13 am on August 6, 2023:
    Thanks, good catch. Fixed typo.
  18. in src/kernel/mempool_options.h:27 in fa3276d785 outdated
    23@@ -24,6 +24,8 @@ static constexpr unsigned int DEFAULT_BLOCKSONLY_MAX_MEMPOOL_SIZE_MB{5};
    24 static constexpr unsigned int DEFAULT_MEMPOOL_EXPIRY_HOURS{336};
    25 /** Default for -mempoolfullrbf, if the transaction replaceability signaling is ignored */
    26 static constexpr bool DEFAULT_MEMPOOL_FULL_RBF{false};
    27+/** Whether to fall back to legacy V1 serialization when writing mempool.dat */
    28+static constexpr bool DEFAULT_PERSIST_V1_DAT{false};
    


    darosior commented at 5:45 am on August 5, 2023:
    Might as well set it to true, as long as someone who was relying on the previous behaviour can simply set -persistmempoolv1.

    maflcko commented at 10:14 am on August 6, 2023:
    v1=false means v2=true, which is what you are asking for and what this pull request is doing, no?

    darosior commented at 6:14 am on August 7, 2023:
    Right, overlooked.
  19. darosior commented at 5:47 am on August 5, 2023: member
    Concept and approach ACK. I don’t think there is harm in making it the default behaviour immediately.
  20. luke-jr commented at 4:02 pm on August 5, 2023: member
    Concept ACK, I don’t think we need to concern with external programs reading this. If they do exist, they can adapt easily.
  21. luke-jr commented at 4:02 pm on August 5, 2023: member
    (Although downgrading might be something we want to support?)
  22. maflcko commented at 8:51 am on August 6, 2023: member

    (Although downgrading might be something we want to support?)

    It is. You can simply set -persistmempoolv1=1.

  23. maflcko force-pushed on Aug 6, 2023
  24. glozow commented at 3:06 pm on August 7, 2023: member
    concept ACK
  25. RandyMcMillan commented at 3:41 pm on August 7, 2023: contributor
    Concept ACK
  26. maflcko force-pushed on Aug 15, 2023
  27. DrahtBot added the label CI failed on Aug 15, 2023
  28. maflcko force-pushed on Aug 16, 2023
  29. DrahtBot removed the label CI failed on Aug 16, 2023
  30. in src/kernel/mempool_persist.cpp:40 in fac48d2dc5 outdated
    33@@ -34,14 +34,15 @@ using fsbridge::FopenFn;
    34 
    35 namespace kernel {
    36 
    37-static const uint64_t MEMPOOL_DUMP_VERSION = 1;
    38+static const uint64_t MEMPOOL_DUMP_VERSION{2};
    39 
    


    ismaelsadeeq commented at 12:43 pm on August 23, 2023:

    Can we define the lambda in this scope and reuse it in dump and load.

    0const auto xor_open{[](const char* mode, const fs::path& load_path, std::vector<std::byte> xor_key, FopenFn mockable_fopen_function) {
    1    return CAutoFile{mockable_fopen_function(load_path, mode), SER_DISK, CLIENT_VERSION, xor_key};
    2}};
    

    had remove the reference capture, will this make the lambda func have some undesirable behavior?


    maflcko commented at 2:11 pm on August 23, 2023:
    Does the overall diff result in less code, or easier to understand code?

    ismaelsadeeq commented at 3:01 pm on August 23, 2023:
    not less code, just a bit dry.

    maflcko commented at 3:03 pm on August 23, 2023:
    Ok thanks, I’ll wait for one more reviewer to ask for it and then switch over.

    pablomartin4btc commented at 9:05 pm on October 17, 2023:
    nit: I agree with @ismaelsadeeq about defining the lambda only once at the top, instead of currently twice (at LoadMempool and at DumpMempool, the downside is that you have to pass all the arguments on all 4 calls (CAutoFile file{xor_open(...)};), other than that I think there’s no undesirable behaviour on removing the &.

    maflcko commented at 9:26 am on October 20, 2023:
    Ok, I may or may not address this style nit if I have to re-touch, to avoid invalidating ACKs.

    maflcko commented at 10:32 am on November 8, 2023:
    The lambda is gone, so I think this is solved now.

    ismaelsadeeq commented at 10:48 am on November 8, 2023:
    Yes resolve, will review soon ✅
  31. in src/kernel/mempool_persist.cpp:46 in fac48d2dc5 outdated
    43 
    44-    FILE* filestr{opts.mockable_fopen_function(load_path, "rb")};
    45-    CAutoFile file(filestr, SER_DISK, CLIENT_VERSION);
    46+    std::vector<std::byte> xor_key;
    47+    const auto xor_open{[&] { return CAutoFile{opts.mockable_fopen_function(load_path, "rb"), SER_DISK, CLIENT_VERSION, xor_key}; }};
    48+    CAutoFile file{xor_open()};
    


    ismaelsadeeq commented at 12:55 pm on August 23, 2023:
    0     CAutoFile file{xor_open("rb", load_path, xor_key, opts.mockable_fopen_function)};
    
  32. in src/kernel/mempool_persist.cpp:62 in fac48d2dc5 outdated
    57@@ -57,9 +58,19 @@ bool LoadMempool(CTxMemPool& pool, const fs::path& load_path, Chainstate& active
    58     try {
    59         uint64_t version;
    60         file >> version;
    61-        if (version != MEMPOOL_DUMP_VERSION) {
    62+        if (version == 1) {
    63+            // No XOR-key present
    


    ismaelsadeeq commented at 12:55 pm on August 23, 2023:

    nit

    0            // mempool.dat dumped with version 1 serialization no XOR-key present.
    

    maflcko commented at 2:11 pm on August 23, 2023:
    I think if (version == 1) already documents that the read version was 1, so I don’t think we need a comment to say that the version was 1.
  33. in src/kernel/mempool_persist.cpp:71 in fac48d2dc5 outdated
    66+        } else {
    67             return false;
    68         }
    69+        const auto pos{std::ftell(file.Get())};
    70+        file.fclose();
    71+        CAutoFile file{xor_open()};
    


    ismaelsadeeq commented at 12:55 pm on August 23, 2023:
    0        CAutoFile file{xor_open("rb", load_path, xor_key, opts.mockable_fopen_function)};
    
  34. in src/kernel/mempool_persist.cpp:170 in fac48d2dc5 outdated
    171-        CAutoFile file(filestr, SER_DISK, CLIENT_VERSION);
    172+    std::vector<std::byte> xor_key(8);
    173+    const auto xor_open{[&](const char* mode) {
    174+        return CAutoFile{mockable_fopen_function(dump_path + ".new", mode), SER_DISK, CLIENT_VERSION, xor_key};
    175+    }};
    176+    CAutoFile file{xor_open("wb")};
    


    ismaelsadeeq commented at 12:57 pm on August 23, 2023:
    0    const std::path new_dump_path = dump_path + ".new";
    1    CAutoFile file{xor_open("wb", new_dump_path, xor_key, mockable_fopen_function)};
    
  35. in src/kernel/mempool_persist.cpp:184 in fac48d2dc5 outdated
    186+        if (!pool.m_persist_v1_dat) {
    187+            FastRandomContext{}.fillrand(xor_key);
    188+            file << xor_key;
    189+        }
    190+        file.fclose();
    191+        CAutoFile file{xor_open("ab")};
    


    ismaelsadeeq commented at 12:57 pm on August 23, 2023:
    0        CAutoFile file{xor_open("ab", new_dump_path, xor_key, mockable_fopen_function)};
    
  36. ismaelsadeeq commented at 1:12 pm on August 23, 2023: member

    light code-review ACK fac48d2dc59773f1f48bc0703af82c99dd70d0a0

    Idk why but I try testing the behavior on master 5aa67eb3655a0023f0cf115176fc8d5bac53cdcd with the steps you provided here but no anti virus quarantined the mempool.dat file, I used Intego, bitfinder.

    When we want to drop support for writing in the legacy format (v1), mempool_compatibilty.py functional test should be deleted along the CL I think.

    Do we also want fee_estimates.dat to be persisted with XOR?

  37. maflcko commented at 2:07 pm on August 23, 2023: member

    but no anti virus quarantined the mempool.dat file, I used Intego, bitfinder.

    The majority won’t detect the eicar test virus I used. I used an online service to scan the mempool.dat and IIRC 4/50 were red (or so), but I don’t recall which.

    Do we also want fee_estimates.dat to be persisted with XOR?

    Why? IIUC correctly, it only stores integers and floating point number calculated locally, or am I missing something? peers.dat may be a better choice for XOR?

  38. DrahtBot added the label Needs rebase on Aug 29, 2023
  39. maflcko force-pushed on Aug 29, 2023
  40. DrahtBot removed the label Needs rebase on Aug 29, 2023
  41. DrahtBot added the label Needs rebase on Sep 14, 2023
  42. maflcko force-pushed on Sep 14, 2023
  43. DrahtBot removed the label Needs rebase on Sep 14, 2023
  44. maflcko force-pushed on Sep 14, 2023
  45. DrahtBot added the label CI failed on Sep 14, 2023
  46. DrahtBot removed the label CI failed on Sep 20, 2023
  47. maflcko force-pushed on Oct 4, 2023
  48. hernanmarino approved
  49. DrahtBot requested review from luke-jr on Oct 17, 2023
  50. DrahtBot requested review from 0xB10C on Oct 17, 2023
  51. DrahtBot requested review from darosior on Oct 17, 2023
  52. DrahtBot requested review from Sjors on Oct 17, 2023
  53. DrahtBot requested review from glozow on Oct 17, 2023
  54. DrahtBot requested review from ismaelsadeeq on Oct 17, 2023
  55. pablomartin4btc commented at 9:11 pm on October 17, 2023: member
    cr ACK eeee4a5779ea20d859cff2e411ad46dd52384f1d
  56. fanquake assigned glozow on Oct 20, 2023
  57. in src/kernel/mempool_persist.cpp:71 in eeee4a5779 outdated
    67+        } else {
    68             return false;
    69         }
    70+        const auto pos{std::ftell(file.Get())};
    71+        file.fclose();
    72+        CAutoFile file{xor_open()};
    


    achow101 commented at 9:56 pm on November 6, 2023:

    In eeee4a5779ea20d859cff2e411ad46dd52384f1d “mempool: persist with XOR”

    Closing and reopening the file once we have the xor seems kinda clunky. Why not have a function on CAutoFile that lets us set the xor key after the fact?


    maflcko commented at 10:10 am on November 8, 2023:
    Nice. Good idea. This drops a bunch of duplicate and dead code.
  58. maflcko force-pushed on Nov 8, 2023
  59. achow101 commented at 7:20 pm on November 8, 2023: member
    ACK fa520848da6d718a07368b42b1a44bd2515e6e5a
  60. DrahtBot requested review from hernanmarino on Nov 8, 2023
  61. DrahtBot requested review from pablomartin4btc on Nov 8, 2023
  62. darosior approved
  63. darosior commented at 11:33 am on November 9, 2023: member
    utACK fa520848da6d718a07368b42b1a44bd2515e6e5a
  64. DrahtBot removed review request from hernanmarino on Nov 9, 2023
  65. DrahtBot requested review from hernanmarino on Nov 9, 2023
  66. in test/functional/mempool_compatibility.py:62 in fa520848da outdated
    58@@ -59,7 +59,7 @@ def run_test(self):
    59         old_node_mempool.rename(new_node_mempool)
    60 
    61         self.log.info("Start new node and verify mempool contains the tx")
    62-        self.start_node(1)
    63+        self.start_node(1, extra_args=["-persistmempoolv1=1"])
    


    glozow commented at 1:42 pm on November 9, 2023:
    Should the mempool_compatibility.py test be updated? (the comment “200100 # Last release with previous mempool format” isn’t accurate anymore)

    maflcko commented at 6:46 pm on November 9, 2023:
    Thx, done
  67. in src/kernel/mempool_persist.cpp:39 in fa520848da outdated
    34@@ -34,14 +35,13 @@ using fsbridge::FopenFn;
    35 
    36 namespace kernel {
    37 
    38-static const uint64_t MEMPOOL_DUMP_VERSION = 1;
    39+static const uint64_t MEMPOOL_DUMP_VERSION{2};
    


    glozow commented at 1:45 pm on November 9, 2023:
    nit: would be nice to name 1 as MEMPOOL_DUMP_VERSION_V1 or MEMPOOL_DUMP_VERSION_NO_XOR

    maflcko commented at 6:46 pm on November 9, 2023:
    Thx, done
  68. glozow commented at 1:59 pm on November 9, 2023: member
    ACK fa520848da Code LGTM, tried a few roundtrips of stop/start/savemempool/importmempool with combinations of the options. I originally thought -persistmempool=0 should imply something for -persistmempoolv1 but makes sense that it applies to savemempool when we’re not persisting.
  69. DrahtBot removed review request from hernanmarino on Nov 9, 2023
  70. DrahtBot requested review from hernanmarino on Nov 9, 2023
  71. DrahtBot removed review request from hernanmarino on Nov 9, 2023
  72. DrahtBot requested review from hernanmarino on Nov 9, 2023
  73. mempool: persist with XOR fa6b053b5c
  74. maflcko force-pushed on Nov 9, 2023
  75. glozow commented at 5:41 pm on November 10, 2023: member
    reACK fa6b053b5c964fb35935fa994cb782c0731a56f8
  76. DrahtBot removed review request from hernanmarino on Nov 10, 2023
  77. DrahtBot requested review from darosior on Nov 10, 2023
  78. DrahtBot requested review from hernanmarino on Nov 10, 2023
  79. DrahtBot requested review from achow101 on Nov 10, 2023
  80. in src/init.cpp:461 in fa6b053b5c
    457@@ -458,6 +458,11 @@ void SetupServerArgs(ArgsManager& argsman)
    458     argsman.AddArg("-par=<n>", strprintf("Set the number of script verification threads (%u to %d, 0 = auto, <0 = leave that many cores free, default: %d)",
    459         -GetNumCores(), MAX_SCRIPTCHECK_THREADS, DEFAULT_SCRIPTCHECK_THREADS), ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);
    460     argsman.AddArg("-persistmempool", strprintf("Whether to save the mempool on shutdown and load on restart (default: %u)", DEFAULT_PERSIST_MEMPOOL), ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);
    461+    argsman.AddArg("-persistmempoolv1",
    


    ismaelsadeeq commented at 1:55 pm on November 13, 2023:
    This PR is an improvement, this patch allows you to read mempool.dat dumped using the legacy format. Whats the a rationale to temporary retention of persistmempoolv1, and when can we expect to remove this option?

    maflcko commented at 2:06 pm on November 13, 2023:
    Should be fine to remove in the second next release after this is merged. I presume the read code (// Leave XOR-key empty ) can stay forever/longer, because it is just one line of code, so no strong opinion.

    ismaelsadeeq commented at 2:20 pm on November 13, 2023:
    AFAIU we are persisting with XOR to “protect against programs that accidentally and unintentionally are trying to mess with the dat file” so I think it will be better if we completely prevent dumping without XOR.
  81. ismaelsadeeq commented at 1:56 pm on November 13, 2023: member

    ACK fa6b053b5c964fb35935fa994cb782c0731a56f8

    Used this PR to successfully read a mempool.dat saved in the legacy format. Additionally, the persistmempoolv1 option downgraded and allowed dumping using the legacy format.

    Code looks good to me.

  82. DrahtBot removed review request from hernanmarino on Nov 13, 2023
  83. DrahtBot requested review from hernanmarino on Nov 13, 2023
  84. DrahtBot removed review request from hernanmarino on Nov 13, 2023
  85. DrahtBot requested review from hernanmarino on Nov 13, 2023
  86. DrahtBot removed review request from hernanmarino on Nov 13, 2023
  87. DrahtBot requested review from hernanmarino on Nov 13, 2023
  88. DrahtBot removed review request from hernanmarino on Nov 13, 2023
  89. DrahtBot requested review from hernanmarino on Nov 13, 2023
  90. achow101 commented at 4:16 pm on November 13, 2023: member
    re-ACK fa6b053b5c964fb35935fa994cb782c0731a56f8
  91. DrahtBot removed review request from achow101 on Nov 13, 2023
  92. DrahtBot removed review request from hernanmarino on Nov 13, 2023
  93. DrahtBot requested review from hernanmarino on Nov 13, 2023
  94. achow101 merged this on Nov 13, 2023
  95. achow101 closed this on Nov 13, 2023

  96. maflcko deleted the branch on Nov 13, 2023
  97. DCMTOKEN commented at 9:26 am on July 9, 2024: none
    Ok I agree
  98. bitcoin locked this on Jul 9, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-22 15:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me