util: Teach AutoFile how to XOR #28060

pull MarcoFalke wants to merge 6 commits into bitcoin:master from MarcoFalke:2307-xor-file- changing 8 files +191 −77
  1. MarcoFalke commented at 10:39 am on July 10, 2023: member

    This allows AutoFile to roll an XOR pattern while reading or writing to the underlying file.

    This is needed for #28052, but can also be used in any other place.

    Also, there are tests, so I’ve split this up from the larger pull to make review easier, hopefully.

  2. DrahtBot commented at 10:39 am on July 10, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK Crypt-iQ, willcl-ark, jamesob
    Concept ACK sipa

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #26606 (wallet: Implement independent BDB parser by achow101)
    • #26596 (wallet: Migrate legacy wallets to descriptor wallets without requiring BDB by achow101)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot renamed this:
    streams: Add XorFile
    streams: Add XorFile
    on Jul 10, 2023
  4. MarcoFalke renamed this:
    streams: Add XorFile
    util: Add XorFile
    on Jul 10, 2023
  5. DrahtBot renamed this:
    util: Add XorFile
    util: Add XorFile
    on Jul 10, 2023
  6. DrahtBot added the label Utils/log/libs on Jul 10, 2023
  7. bitcoinfinancier approved
  8. in src/streams.h:526 in fa22763558 outdated
    527@@ -526,14 +528,16 @@ class AutoFile
    528      */
    529     bool IsNull() const         { return (file == nullptr); }
    530 
    531+    /** Implementation detail, only used internally. */
    532+    std::size_t detail_fread(Span<std::byte> dst);
    


    jamesob commented at 8:08 pm on July 11, 2023:

    fa2276355889ff7bcad95aa978863107c36a89f7

    Minor: new function is called detail_fread but commit and commit message mention detail_read.


    MarcoFalke commented at 8:01 am on July 12, 2023:
    Thanks, fixed commit message
  9. in src/streams.h:527 in fa9325a145 outdated
    522+    void ignore(size_t num_bytes);
    523+    void write(Span<const std::byte> src);
    524+
    525+    template <typename T>
    526+    XorFile& operator<<(const T& obj)
    527+    {
    


    jamesob commented at 8:31 pm on July 11, 2023:

    fa9325a14584be1c5aa6b4cdefbdaa69b27e402d

    Should these serialization methods have similar if (!m_file) guards as AutoFile does?


    MarcoFalke commented at 8:02 am on July 12, 2023:
    I’d say no, so I went ahead and removed them everywhere. Refer to the commit message for rationale.
  10. in src/streams.cpp:17 in fa9325a145 outdated
     9+
    10+std::size_t XorFile::detail_fread(Span<std::byte> dst)
    11+{
    12+    if (!m_file) throw std::ios_base::failure("XorFile::read: file handle is nullptr");
    13+    const auto init_pos{std::ftell(m_file)};
    14+    if (init_pos < 0) return 0;
    


    jamesob commented at 8:34 pm on July 11, 2023:

    fa9325a14584be1c5aa6b4cdefbdaa69b27e402d

    Why not throw a std::ios_base::failure() here? From what I can tell, we’d never expect to get back a negative error value.


    MarcoFalke commented at 1:35 pm on July 12, 2023:
    Yeah, looks like you are right. Thanks, done.
  11. in src/streams.cpp:43 in fa9325a145 outdated
    38+}
    39+
    40+void XorFile::write(Span<const std::byte> src)
    41+{
    42+    if (!m_file) throw std::ios_base::failure("XorFile::write: file handle is nullptr");
    43+    std::array<std::byte, 4096> buf;
    


    jamesob commented at 8:43 pm on July 11, 2023:

    fa9325a14584be1c5aa6b4cdefbdaa69b27e402d

    Might be interesting to see how statically allocating a buffer (say, as a member of XorFile) and reusing it might affect performance; may avoid some reallocations. Not sure how frequently this is actually called though. I expect this probably won’t make a difference though, because I guess XorFile::write() will be called on the order of once per block.


    MarcoFalke commented at 8:04 am on July 12, 2023:
    No idea, but my blind guess would be that calling std::fwrite takes longer than anything else in this function. :thinking:

    Crypt-iQ commented at 8:57 pm on July 18, 2023:
    I haven’t run benchmarks against statically allocating a buffer, but it may be useful since AutoFile::write is called many times per block. Basically every time the pattern CAutoFile << x is used which happens for every element of the block header, each transaction input, output, etc. I also ran a small unit test through perf where there were many writes and found that std::ftell dominated with std::fwrite close behind.

    MarcoFalke commented at 9:01 pm on July 18, 2023:
    It may be good to have a heat-map (Percent of time spent in each line) for this function for IBD, or a unit test that writes a block.

    MarcoFalke commented at 2:51 pm on July 19, 2023:

    I also ran a small unit test through perf where there were many writes and found that std::ftell dominated with std::fwrite close behind

    For me fwrite dominated over ftell, but the most time was spent on XOR:

     0ROUTINE ====================== AutoFile::write in src/streams.cpp
     1    49    100 Total samples (flat / cumulative)
     2     .      .   40:         nSize -= nNow;
     3     .      .   41:     }
     4     .      .   42: }
     5     .      .   43: 
     6     .      .   44: void AutoFile::write(Span<const std::byte> src)
     7---
     8     .      .   45: {
     9     .      .   46:     if (!m_file) throw std::ios_base::failure("AutoFile::write: file handle is nullptr");
    10     .      .   47:     if (m_xor.empty()) {
    11     .      .   48:         if (std::fwrite(src.data(), 1, src.size(), m_file) != src.size()) {
    12     .      .   49:             throw std::ios_base::failure("AutoFile::write: write failed");
    13     .      .   50:         }
    14     .      .   51:     } else {
    15     .      .   52:         std::array<std::byte, 4096> buf;
    16     .      .   53:         while (src.size() > 0) {
    17     .      .   54:             auto buf_now{Span{buf}.first(std::min<size_t>(src.size(), buf.size()))};
    18     .      2   55:             std::copy(src.begin(), src.begin() + buf_now.size(), buf_now.begin());
    19     .     16   56:             const auto current_pos{std::ftell(m_file)};
    20    49     49   57:             util::Xor(buf_now, m_xor, current_pos);
    21     .     33   58:             if (current_pos < 0 || std::fwrite(buf_now.data(), 1, buf_now.size(), m_file) != buf_now.size()) {
    22     .      .   59:                 throw std::ios_base::failure{"XorFile::write: failed"};
    23     .      .   60:             }
    24     .      .   61:             src = src.subspan(buf_now.size());
    25     .      .   62:         }
    26     .      .   63:     }
    27     .      .   64: }
    28---
    

    Crypt-iQ commented at 3:24 pm on July 19, 2023:
    I think it must have been a fluke because when I synced up to a little past block ~200K, fwrite was in the lead and ftell and xor were about tied at ~3% each of the total time spent. I wrote a test here https://github.com/Crypt-iQ/bitcoin/commit/bf924d6d2d4020e799796c0751ac64c892ae8e6b that requires the benchmarks to be compiled but is not a benchmark. I made a patch in the following commit (https://github.com/Crypt-iQ/bitcoin/commit/be07dbc6f26638f7ec187e42144ccc5dab90c0a7) that reduced the number of ftell calls to 1 for AutoFile and it gave a noticeable performance improvement. I think it’s ok to track the file position this way, but I might be wrong

    MarcoFalke commented at 3:34 pm on July 19, 2023:

    I guess it depends on the data written. My benchmark was for 1MB written. With 33 bytes written, ftell is the dominant:

     0ROUTINE ====================== AutoFile::write in src/streams.cpp
     1     2     28 Total samples (flat / cumulative)
     2     .      .   40:         nSize -= nNow;
     3     .      .   41:     }
     4     .      .   42: }
     5     .      .   43: 
     6     .      .   44: void AutoFile::write(Span<const std::byte> src)
     7---
     8     .      .   45: {
     9     .      .   46:     if (!m_file) throw std::ios_base::failure("AutoFile::write: file handle is nullptr");
    10     .      .   47:     if (m_xor.empty()) {
    11     .      .   48:         if (std::fwrite(src.data(), 1, src.size(), m_file) != src.size()) {
    12     .      .   49:             throw std::ios_base::failure("AutoFile::write: write failed");
    13     .      .   50:         }
    14     .      .   51:     } else {
    15     .      .   52:         std::array<std::byte, 4096> buf;
    16     .      .   53:         while (src.size() > 0) {
    17     .      .   54:             auto buf_now{Span{buf}.first(std::min<size_t>(src.size(), buf.size()))};
    18     .      .   55:             std::copy(src.begin(), src.begin() + buf_now.size(), buf_now.begin());
    19     .     23   56:             const auto current_pos{std::ftell(m_file)};
    20     1      1   57:             util::Xor(buf_now, m_xor, current_pos);
    21     1      4   58:             if (current_pos < 0 || std::fwrite(buf_now.data(), 1, buf_now.size(), m_file) != buf_now.size()) {
    22     .      .   59:                 throw std::ios_base::failure{"XorFile::write: failed"};
    23     .      .   60:             }
    24     .      .   61:             src = src.subspan(buf_now.size());
    25     .      .   62:         }
    26     .      .   63:     }
    27     .      .   64: }
    28---
    

    MarcoFalke commented at 3:56 pm on July 19, 2023:
    In any case the bottleneck doesn’t seem to be std::array<std::byte, 4096> buf;, so I think this thread can be closed?

    Crypt-iQ commented at 4:00 pm on July 19, 2023:
    yeah I didn’t see allocations come up at all in perf
  12. jamesob commented at 8:50 pm on July 11, 2023: member

    Concept ACK

    In general looks good, but I think you could really cut down on the amount of code here if you just made XorFile a subclass of CAutoFile, and made whatever small adjustments and modernizations necessary to CAutoFile to get that to work.

    E.g. XorFile::ignore() seems like essentially the same code as CAutoFile::ignore(), just with some std:: prefixes. Seems like code we don’t want to maintain two copies of.

  13. MarcoFalke renamed this:
    util: Add XorFile
    util: Teach AutoFile how to XOR
    on Jul 12, 2023
  14. MarcoFalke force-pushed on Jul 12, 2023
  15. Extract util::Xor, Add key_offset option, Add bench 9999a49b32
  16. refactor: Remove redundant file check from AutoFile shift operators
    The shift operators will call the write or read member function, which
    already does the check. Also, call sites are free to directly call
    ::(Un)Serialize(s, obj) to skip this check, so removing it increases
    consistency.
    fafe2ca0ce
  17. MarcoFalke commented at 8:00 am on July 12, 2023: member

    I think you could really cut down on the amount of code here

    Thanks, rewritten from scratch to reduce the lines of code.

  18. doc: Remove comments that just repeat what the code does
    No need to artificially bloat the code and waste space.
    fa8d227d58
  19. MarcoFalke force-pushed on Jul 12, 2023
  20. MarcoFalke force-pushed on Jul 12, 2023
  21. jamesob approved
  22. jamesob commented at 4:40 pm on July 12, 2023: member

    ACK fadb32556f64a72377c82be71a207195e6fcf68b (jamesob/ackr/28060.2.MarcoFalke.util_add_xorfile)

    Looks good! More concise change now that we’re just adding XOR capability to existing classes.

     0-----BEGIN PGP SIGNED MESSAGE-----
     1Hash: SHA512
     2
     3ACK fadb32556f64a72377c82be71a207195e6fcf68b ([`jamesob/ackr/28060.2.MarcoFalke.util_add_xorfile`](https://github.com/jamesob/bitcoin/tree/ackr/28060.2.MarcoFalke.util_add_xorfile))
     4
     5Looks good! More concise change now that we're just adding XOR capability to existing classes.
     6-----BEGIN PGP SIGNATURE-----
     7
     8iQIzBAEBCgAdFiEEGNRVI1NPYuZCSIrGepNdrbLETwUFAmSu17QACgkQepNdrbLE
     9TwWM2Q/5AQGR5Uv5aFQSPvjtd9ZbcPyYvDzwSjkFImCnCrVGeTlHux6rsS9SzL8y
    10gybawiRHUTHjLCO8j5ItvM7/stmyq4UiTk8QCOQL39dGJgBVjV+/pQnCqPlMSU1d
    11KtObCvXlVc5jWgr7OdHRDtDb5oZi1jc+bcr5ReVCggTdpcvjHxkxLSjRXEFQoeQi
    12kY/wa5yvids1kyCotiScB+zvc3FwVKj3/lWpPWYkpXdTxmqAjpmgweQlBqq90a2a
    13I5D0oI1fyerVzJrYLi5EQfY8PQZBhUDjJSzyMTpxr0EqXSRGWCDl5Zm4m0Mxo+8Q
    14B+Wymo/Dvf196ugDBbUWnQM/dG4tqQRZAB3U1VHwBUemb+fZgDRBeeyAs6/aIHik
    15YwUR8amJM08ClAJUfSaNXL/OouMqOoMds5tXwMM+Z/2msAgpdXx7c6ifGFaMQm5S
    16voElavRD55OLFn9UYKZHusF3EmeTN094pWkwMCeuFnnW9+Rpar5m0nvNuH7T/viI
    17xdFM2V/ypzkDtyDg21ZrR+OcqlGIsj0x8BB7rzs1JaxeoyO6Zshj7RW1JCPTJtBq
    18P2KnEiD8DoFL9gTuL3TtnT/5oIJ3Quqb8A8GnU4amzjOcUyg9mg3CwMuWuh17odO
    19LIckwWfQZlo7wzzOUXwE4i/Xj0ClbyNPgzroOzcXz12WdSsAQWw=
    20=SQKV
    21-----END PGP SIGNATURE-----
    
    0Tested on Linux-6.2.10-arch1-1-x86_64-with-glibc2.37
    1
    2Configured with ./configure 'BDB_LIBS=-L/home/james/src/bitcoin/db4/lib -ldb_cxx-4.8' BDB_CFLAGS=-I/home/james/src/bitcoin/db4/include 'CXXFLAGS=-fPIE -pipe -O2 -g -Wthread-safety-analysis -Wall -Werror=sign-compare ' --disable-gui-tests --disable-fuzz-binary --enable-wallet --enable-debug --with-daemon --enable-natpmp-default CXX=/usr/bin/clang++ CC=/usr/bin/clang --disable-shared --with-pic --enable-benchmark=no --enable-module-recovery --enable-module-schnorrsig PYTHONPATH= --disable-shared --with-pic --enable-benchmark=no --enable-module-recovery --disable-module-ecdh --no-create --no-recursion
    3
    4Compiled with /usr/bin/ccache /usr/bin/clang++ -std=c++17 -mavx -mavx2 -mpclmul -fPIE -pipe -O2 -g -Wthread-safety-analysis -Wall -Werror=sign-compare  -O0 -g3 -ftrapv -fdebug-prefix-map=$(abs_top_srcdir)=.  -Wstack-protector -fstack-protector-all -fcf-protection=full -fstack-clash-protection -msse4.1 -msse4.2 -msse4 -msha  i
    5
    6Compiler version: clang version 15.0.7
    
  23. sipa commented at 5:10 pm on July 12, 2023: member

    (Slight tangent, not necessarily for this PR)

    I was wondering if it wouldn’t be better to XOR with a proper (but non-cryptographic) RNG output instead of a fixed byte pattern, as it’d avoid repetitive patterns remaining in the file, or weak keys that could accidentally be selected.

    E.g. Xoshiro256++ is extremely fast (multiple GiB per second), see https://prng.di.unimi.it/. However, it’s nontrivial to seek in, which would be needed here. I got a bit carried away and wrote code to do that, but it’s a bit slower and more complex than what I’d like (~9 microseconds for seeking anywhere within the first 128 MiB of output, ~14 microseconds for the first 4 GiB).

    An alternative is concatenating multiple Xoshiro256++ outputs together, e.g. 4 KiB from Xoshiro256++(key=SipHash(key || 0)), 4 KiB from Xoshiro++(key=SipHash(key || 1)), … etc; that’d offer faster, simpler seeking.

  24. MarcoFalke commented at 5:28 pm on July 12, 2023: member

    Interesting. Right now, it is trivial to undo the XOR in python with something like:

    0        def util_xor(data):
    1            with open("xor.key", "rb") as xor_f:
    2                key = deser_string(xor_f)
    3            for i in range(len(data)):
    4                data[i] ^= key[i % len(key)]
    5            return bytes(data)
    

    If it requires users to instead implement a PRNG in python, they (who want to read the dat files for whatever reason) would most likely disable the XOR feature. (Probably an edge-case, so not sure if it matters, but I wanted to mention it)

  25. jamesob commented at 5:31 pm on July 12, 2023: member
    If the motivating reason for this feature is to prevent systematized, spurious detection of things in blocks (whether via antivirus software or human), isn’t it sufficient to keep the XOR key very simple? Especially, as Marco says, anything more sophisticated would complicate consuming blockfiles from external software.
  26. sipa commented at 5:48 pm on July 12, 2023: member

    It’s fair that this is perhaps not a real concern right now; I don’t know. I find it slightly unsatisfying that we can’t break any 8-byte repetitive pattern (see e.g. the picture here for a visual illustration), while computationally speaking doing so should not be any concern. I bring it up here and now, because trying to support both RNG and constant XORing later would be more code than just RNG.

    Regarding external software… I’m not sure. It wouldn’t surprise me that literally everyone who is interested in reading these files will stick to supporting only non-XOR regardless of whether it’s with a constant or an RNG. If not, can easily have Python code at least to read through the XORing (which we may want for functional tests anyway), but for other languages things would involve more complexity.

  27. MarcoFalke commented at 6:05 pm on July 12, 2023: member
    I could write the 128MiB XOR data to the key file in full. Complexity would be 2x, because each std::fread will be done twice (once to read the XOR key section, and once to read the XOR’d data section)?
  28. in src/streams.h:498 in faa0ab757d outdated
    505-        if (file) {
    506-            retval = ::fclose(file);
    507-            file = nullptr;
    508-        }
    509-        return retval;
    510+        if (std::FILE * rel{release()}) return std::fclose(rel);
    


    sipa commented at 6:14 pm on July 12, 2023:

    This reads to me like a multiplication between std::FILE and rel.

    How about if (auto rel{release()}) return std::fclose(rel); ?


    MarcoFalke commented at 6:32 pm on July 12, 2023:
    Yeah, it is a clang-format-16 bug. Someone should try to fix or report it.

    MarcoFalke commented at 10:10 am on July 13, 2023:
  29. sipa commented at 6:17 pm on July 12, 2023: member
    Code review ACK
  30. refactor: Modernize AutoFile
    * Add m_ prefix to the std::FILE member variable
    * Add std namespace where possible, to avoid confusion with member
      functions of the same name.
    * Add AutoFile::feof() member function, to be used in place of
      std::feof(AutoFile::Get())
    * Simplify fclose() in terms of release()
    * Fix typo in the error message in the ignore member function.
    fa7724bc9d
  31. Add AutoFile::detail_fread member function
    New code can call the method without having first to retrieve the raw
    FILE* pointer via Get().
    
    Also, move implementation to the cpp file. Can be reviewed with:
    --color-moved=dimmed-zebra  --color-moved-ws=ignore-all-space
    000019e158
  32. MarcoFalke force-pushed on Jul 13, 2023
  33. MarcoFalke commented at 1:03 pm on July 13, 2023: member

    I find it slightly unsatisfying that we can’t break any 8-byte repetitive pattern.

    I just find it difficult to think about without a use-case. The only thing it may affect is that it makes transparent on-disk compression harder, but 8-byte XOR already makes that hard, so I don’t think this is an argument (for or against) either?

  34. jamesob approved
  35. jamesob commented at 3:54 pm on July 13, 2023: member

    reACK faebf00

    Trivial rephrasing to avoid FILE release looking like a multiplication:

     0 diff --git a/src/streams.h b/src/streams.h
     1-index 03df20b5db..fc2fe4d9f8 100644
     2+index 03df20b5db..5ff952be76 100644
     3 --- a/src/streams.h
     4 +++ b/src/streams.h
     5 @@ -13,6 +13,7 @@
     6@@ -315,7 +315,7 @@
     7 -            file = nullptr;
     8 -        }
     9 -        return retval;
    10-+        if (std::FILE * rel{release()}) return std::fclose(rel);
    11++        if (auto rel{release()}) return std::fclose(rel);
    12 +        return 0;
    13      }
    
  36. in src/streams.cpp:56 in faebf00dbf outdated
    51+    } else {
    52+        std::array<std::byte, 4096> buf;
    53+        while (src.size() > 0) {
    54+            auto buf_now{Span{buf}.first(std::min<size_t>(src.size(), buf.size()))};
    55+            std::copy(src.begin(), src.begin() + buf_now.size(), buf_now.begin());
    56+            const auto current_pos{std::ftell(m_file)};
    


    Crypt-iQ commented at 9:03 pm on July 18, 2023:
    I think you can move this std::ftell to right before the loop and then use the result from std::fwrite to increment current_pos each loop iteration since from https://en.cppreference.com/w/cpp/io/c/fwrite: “The file position indicator for the stream is advanced by the number of characters written.”. I ran some tests and it gives a performance improvement if src.size() is ever > 4096 bytes (though I’m not sure that is ever the case currently with block serialization hitting this function incrementally?)

    MarcoFalke commented at 2:52 pm on July 19, 2023:
    I am hoping to nuke ftell completely, but that would require removing the Get member function

    Crypt-iQ commented at 3:28 pm on July 19, 2023:
    I think the calls to ftell in WriteBlockToDisk and UndoWriteToDisk can be removed since the file position is known before the call and the number of bytes written is also known in the success case (in the failure case the writes will throw), so it should just be possible to increment the initial file position used in OpenBlockFile / OpenUndoFile ?

    MarcoFalke commented at 3:37 pm on July 19, 2023:
    I mean I am happy to make the change, but it seems brittle if someone calls std::fseek after the constructor?

    Crypt-iQ commented at 3:45 pm on July 19, 2023:
    Hm good point, I didn’t think about that – it does seem like those functions are pretty contained since they open the file at the start and then close it at exit, but it does seem brittle

    MarcoFalke commented at 4:12 pm on July 19, 2023:
    Maybe I’ll pick up #28006 again and remove Get?

    Crypt-iQ commented at 4:30 pm on July 19, 2023:
    happy to review if you do

    MarcoFalke commented at 6:42 am on July 21, 2023:
    (Just for context for the follow-up) I am not sure how “clean” the code in #28006 is. Maybe an easier alternative might be to remove all calls to Get and instead wrap the calls that acted on FILE* into a lambda (which acts on FILE*) and pass the lambda to an Exec() member function. The Exec() function would then also reset the cached ftell state and force a call to ftell the next time it is needed. On top of requiring that the constructor takes a FILE*&& (taking ownership), this should be a relatively easy to review complete fix.
  37. streams: Teach AutoFile how to XOR fa633aa690
  38. MarcoFalke force-pushed on Jul 19, 2023
  39. MarcoFalke commented at 4:27 pm on July 19, 2023: member
    Pushed a small change to call std::ftell less often if a large data span is written. In a follow-up, changes can be made to call it less often when small data spans are written.
  40. Crypt-iQ commented at 7:56 pm on July 20, 2023: contributor
    crACK fa633aa
  41. DrahtBot requested review from jamesob on Jul 20, 2023
  42. MarcoFalke requested review from josibake on Jul 21, 2023
  43. willcl-ark commented at 9:23 am on August 1, 2023: contributor

    Lightly tested ACK fa633aa690

    Sidenote: reading the discussions on performance I decided to try and modify the Xor function to use AVX2 SIMD (mainly so I could learn more about it myself) and managed to get something working, but couldn’t get it to run faster than the current implementation in this pull.

    0|             ns/byte |              byte/s |    err% |     total | benchmark
    1|--------------------:|--------------------:|--------:|----------:|:----------
    2|                9.71 |      103,017,293.91 |    0.1% |      0.45 | `Xor4MBNoSIMD`
    3|               10.17 |       98,347,322.00 |    0.3% |      0.47 | `Xor4MBSIMD`
    

    Of course there’s a strong likelihood I did not implement it properly here… But it also seems likely that it’s probably not worth the hassle as compilers seem to be quite good at optimising simpler code to take advantage of this stuff these days AFAIU.

  44. MarcoFalke commented at 9:29 am on August 1, 2023: member
    [ @willcl-ark You could try to inspect the binary after compilation to see how the compiler optimized it (and what difference there is to your version) ]
  45. MarcoFalke commented at 2:04 pm on August 1, 2023: member
    @jamesob Mind doing a re-ACK (trivial). Otherwise I wonder if anything is left to be done here? Is this waiting on someone’s NACK or ACK?
  46. jamesob approved
  47. jamesob commented at 2:53 pm on August 1, 2023: member

    reACK fa633aa6906f3b130b691568bcd20b2b76bb1cbb (jamesob/ackr/28060.4.MarcoFalke.util_add_xorfile)

    Tiny diff since last review; moving ftell out of a loop.

     06:  faebf00dbf ! 6:  fa633aa690 streams: Teach AutoFile how to XOR
     1    @@ src/streams.cpp: void AutoFile::ignore(size_t nSize)
     2     +            throw std::ios_base::failure("AutoFile::write: write failed");
     3     +        }
     4     +    } else {
     5    ++        auto current_pos{std::ftell(m_file)};
     6    ++        if (current_pos < 0) throw std::ios_base::failure("AutoFile::write: ftell failed");
     7     +        std::array<std::byte, 4096> buf;
     8     +        while (src.size() > 0) {
     9     +            auto buf_now{Span{buf}.first(std::min<size_t>(src.size(), buf.size()))};
    10     +            std::copy(src.begin(), src.begin() + buf_now.size(), buf_now.begin());
    11    -+            const auto current_pos{std::ftell(m_file)};
    12     +            util::Xor(buf_now, m_xor, current_pos);
    13    -+            if (current_pos < 0 || std::fwrite(buf_now.data(), 1, buf_now.size(), m_file) != buf_now.size()) {
    14    ++            if (std::fwrite(buf_now.data(), 1, buf_now.size(), m_file) != buf_now.size()) {
    15     +                throw std::ios_base::failure{"XorFile::write: failed"};
    16     +            }
    17     +            src = src.subspan(buf_now.size());
    18    ++            current_pos += buf_now.size();
    19     +        }
    20          }
    21      }
    
     0-----BEGIN PGP SIGNED MESSAGE-----
     1Hash: SHA512
     2
     3reACK fa633aa6906f3b130b691568bcd20b2b76bb1cbb ([`jamesob/ackr/28060.4.MarcoFalke.util_add_xorfile`](https://github.com/jamesob/bitcoin/tree/ackr/28060.4.MarcoFalke.util_add_xorfile))
     4
     5Tiny diff since last review; moving `ftell` out of a loop.
     6
     7
     8<details><summary>Range-diff</summary>
     9
    106:  faebf00dbf ! 6:  fa633aa690 streams: Teach AutoFile how to XOR
    11    @@ src/streams.cpp: void AutoFile::ignore(size_t nSize)
    12     +            throw std::ios_base::failure("AutoFile::write: write failed");
    13     +        }
    14     +    } else {
    15    ++        auto current_pos{std::ftell(m_file)};
    16    ++        if (current_pos < 0) throw std::ios_base::failure("AutoFile::write: ftell failed");
    17     +        std::array<std::byte, 4096> buf;
    18     +        while (src.size() > 0) {
    19     +            auto buf_now{Span{buf}.first(std::min<size_t>(src.size(), buf.size()))};
    20     +            std::copy(src.begin(), src.begin() + buf_now.size(), buf_now.begin());
    21    -+            const auto current_pos{std::ftell(m_file)};
    22     +            util::Xor(buf_now, m_xor, current_pos);
    23    -+            if (current_pos < 0 || std::fwrite(buf_now.data(), 1, buf_now.size(), m_file) != buf_now.size()) {
    24    ++            if (std::fwrite(buf_now.data(), 1, buf_now.size(), m_file) != buf_now.size()) {
    25     +                throw std::ios_base::failure{"XorFile::write: failed"};
    26     +            }
    27     +            src = src.subspan(buf_now.size());
    28    ++            current_pos += buf_now.size();
    29     +        }
    30          }
    31      }
    32
    33</details>
    34-----BEGIN PGP SIGNATURE-----
    35
    36iQIzBAEBCgAdFiEEGNRVI1NPYuZCSIrGepNdrbLETwUFAmTJHG8ACgkQepNdrbLE
    37TwUnxQ/+OXTeOMe54Moioc+djEr3OAHJqw+9Yg5mzNt+SZYb45lxkkAECzL42z+6
    38C32oarHZRt3pK3ZAQgKrU2TXNoNnn2gxvCJ5SOcjCafARkZCf4LgZzFdEMFJXQCN
    39VONFBo7kZb9A7wpEp2fU8cLg5/ASOeW19VLUxHpo9FI5XAKlUEUHT1JbJ3jfiYfW
    40LusvOYrleegDMZlo4zrrqTBymUVYzRjMYJb7cGHu2dl11xVXWOcHNiYiHRmk7g3O
    41QTAd5jWM55+BGGCht7F3cmhEtdk8mfLznPg6+GFNFgKTNyV2dxYbbQerUCsEGkOO
    42xv5Omo5lIigVafi9MNjOJ6z4auWHmjQV8Uid1IwAVs1FJEtwA5WAQivTa8pie8sA
    43wLOgRZhjMABfz+pDO+5WrLMqCgagGQyG276CiEUtMo7NvXdt6lE4YsQ0d1v/BLAZ
    44LGZYufdn5KX6PxyLVaHluFiEKJE164OZjWSNyyqePZBNJqyElYE9UrQ5uEPoCMXs
    45UrvxK7IhGIwvkUBcRH68W8z1+RQ53d4jWNZfmoCLZJzhKDAv06m6xbF4HCCarHi7
    463QtL9N7dBlNUBSMgZwTcQt7LZmg+QMYgyr4WvML84H1pv8dlIfx5hQ3uH0AJ+Jrl
    47Pcma+vHwhAa5UQSwHH5N11ANnvw1IXIftC6eHBcQApfWUmAwmKc=
    48=73oh
    49-----END PGP SIGNATURE-----
    
    0Tested on Linux-6.4.6-arch1-1-x86_64-with-glibc2.37
    1
    2Configured with ./configure 'BDB_LIBS=-L/home/james/src/bitcoin/db4/lib -ldb_cxx-4.8' BDB_CFLAGS=-I/home/james/src/bitcoin/db4/include 'CXXFLAGS=-fPIE -pipe -O2 -g -Wthread-safety-analysis -Wall -Werror=sign-compare ' --disable-gui-tests --disable-fuzz-binary --enable-wallet --enable-debug --with-daemon --enable-natpmp-default CXX=/usr/bin/clang++ CC=/usr/bin/clang --disable-shared --with-pic --enable-benchmark=no --enable-module-recovery --disable-module-ecdh PYTHONPATH= --disable-shared --with-pic --enable-benchmark=no --enable-module-recovery --disable-module-ecdh --no-create --no-recursion
    3
    4Compiled with /usr/bin/ccache /usr/bin/clang++ -std=c++17 -mavx -mavx2 -mpclmul -fPIE -pipe -O2 -g -Wthread-safety-analysis -Wall -Werror=sign-compare  -O0 -g3 -ftrapv -fdebug-prefix-map=$(abs_top_srcdir)=.  -Wstack-protector -fstack-protector-all -fcf-protection=full -fstack-clash-protection -msse4.1 -msse4.2 -msse4 -msha  i
    5
    6Compiler version: clang version 15.0.7
    
  48. fanquake merged this on Aug 1, 2023
  49. fanquake closed this on Aug 1, 2023

  50. MarcoFalke deleted the branch on Aug 1, 2023
  51. willcl-ark commented at 4:53 pm on August 1, 2023: contributor

    [ @willcl-ark You could try to inspect the binary after compilation to see how the compiler optimized it (and what difference there is to your version) ]

    I did take a rough look on my machine (and on godbolt), but I don’t think I have the necessary super-powers needed to really dissect why my (very naive and likely badly implemented) impl. didn’t pay off.

    From what I can see, it doesn’t appear that clang16 is auto-SIMD-ing your version, but my SIMD version looks quite instruction-heavy by comparison 🤷🏼

    Increasing the bench to 100MB does see the SIMD version going marginally faster, but nothing to get excited about…:

     0$ ./src/bench/bench_bitcoin --filter="Xor.*"
     1Warning, results might be unstable:
     2* CPU frequency scaling enabled: CPU 0 between 800.0 and 3,800.0 MHz
     3* CPU governor is 'powersave' but should be 'performance'
     4* Turbo is enabled, CPU frequency will fluctuate
     5
     6Recommendations
     7* Use 'pyperf system tune' before benchmarking. See https://github.com/psf/pyperf
     8
     9|             ns/byte |              byte/s |    err% |     total | benchmark
    10|--------------------:|--------------------:|--------:|----------:|:----------
    11|                8.92 |      112,111,802.67 |    0.5% |      0.01 | `Xor`
    12|                9.74 |      102,639,832.40 |    0.0% |     11.24 | `Xor100MBNoSIMD`
    13|                8.62 |      116,041,297.12 |    0.1% |      9.94 | `Xor100MBSIMD`
    

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-01 10:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me