Try to use posix_fadvise with CBufferedFile #14485

pull luke-jr wants to merge 1 commits into bitcoin:master from luke-jr:fadvise changing 3 files +54 −4
  1. luke-jr commented at 12:19 pm on October 15, 2018: member
    Resurrecting #12491, since it seemed fine and desirable…
  2. meshcollider added the label Resource usage on Oct 15, 2018
  3. meshcollider commented at 12:33 pm on October 15, 2018: contributor
    Concept ACK, would be good to hear from @eklitzke why the original was closed though
  4. in src/util.h:89 in 5ff402f58a outdated
    84+FILE* AdviseSequential(FILE *file);
    85+
    86+//! Close a file and return the result of fclose(). On POSIX systems that
    87+//! support it, advise the kernel to remove the file contents from the page
    88+//! cache (which can help on memory-constrained systems).
    89+int CloseAndDiscard(FILE *file);
    


    laanwj commented at 10:15 am on October 18, 2018:
    “discard” sounds to me as if writes to the file will be discarded, too maybe CloseAndUncache?
  5. laanwj commented at 10:16 am on October 18, 2018: member
    Concept ACK though I’d really like to see benchmarks for changes like this, to know if this has any impact in practice worth making the risk/complication of the change.
  6. jgarzik commented at 1:37 pm on October 18, 2018: contributor

    Probably worthing having – rough concept ACK – however:

    • The double-lseek will be slower than fstat() on some networked or detachable filesystems.

    • If close-and-uncache is used on just-written-to files, this action is a no-op unless a sync (fdatasync etc.) preceded it.

    • Definitely worth benchmarking on multiple filesystems to see the positive or negative value of this change.

  7. DrahtBot added the label Needs rebase on Nov 5, 2018
  8. bitcoin deleted a comment on Feb 7, 2019
  9. luke-jr force-pushed on Feb 11, 2019
  10. luke-jr commented at 3:33 pm on February 11, 2019: member
    Rebased and addressed @laanwj’s function rename request.
  11. DrahtBot removed the label Needs rebase on Feb 11, 2019
  12. in src/util/system.cpp:1144 in f2041fe9f5 outdated
    1146+        if (end == -1)
    1147+            goto close;
    1148+        posix_fadvise(fd, 0, end, POSIX_FADV_DONTNEED);
    1149+    }
    1150+#endif
    1151+ close:
    


    practicalswift commented at 10:57 am on February 13, 2019:
    Move close: inside #if … #endif?

    luke-jr commented at 5:17 pm on October 13, 2019:
    Replaced goto with nested ifs.
  13. DrahtBot commented at 7:11 am on March 23, 2019: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #26705 (clang-tidy: Fix modernize-use-default-member-init in headers and force to check all headers by hebasto)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  14. luke-jr force-pushed on Apr 13, 2019
  15. DrahtBot added the label Needs rebase on May 2, 2019
  16. DrahtBot commented at 12:54 pm on September 30, 2019: contributor
  17. DrahtBot added the label Up for grabs on Sep 30, 2019
  18. DrahtBot closed this on Sep 30, 2019

  19. meshcollider reopened this on Oct 13, 2019

  20. luke-jr force-pushed on Oct 13, 2019
  21. DrahtBot removed the label Needs rebase on Oct 13, 2019
  22. luke-jr force-pushed on Oct 13, 2019
  23. luke-jr commented at 10:24 pm on October 13, 2019: member
    Rebased
  24. maflcko removed the label Up for grabs on Oct 16, 2019
  25. in src/util/system.cpp:37 in b2c2917889 outdated
    24-#undef _POSIX_C_SOURCE
    25-#endif
    26-
    27-#define _POSIX_C_SOURCE 200112L
    28-
    29-#endif // __linux__
    


    vasild commented at 2:15 pm on March 27, 2020:

    I think we shouldn’t remove this. We are setting _POSIX_C_SOURCE in order to ask the system to provide the functions. feature_test_macros(7) explains that indeed we are supposed to set it ourselves (not just observe its default value). And, of course, set it before including the system headers.

    Further, we should consider OSes that will not provide the functions we need (posix_fadvise() in this case) even if we define _POSIX_C_SOURCE to 200112 (e.g. MacOS). So, a configure-time check is warranted, similarly to what #18437 does.

  26. in src/util/system.cpp:1124 in b2c2917889 outdated
    1119+    off_t end = lseek(fd, 0, SEEK_END);
    1120+    if (end != -1) {
    1121+        posix_fadvise(fd, start, end - start, POSIX_FADV_WILLNEED);
    1122+        posix_fadvise(fd, start, end - start, POSIX_FADV_SEQUENTIAL);
    1123+    }
    1124+    lseek(fd, start, SEEK_SET);
    


    vasild commented at 2:19 pm on March 27, 2020:

    We can avoid seeking to the end of the file and back just to get the file size. Supplying 0 as the 3rd argument will suffice. From posix_fadvise(2):

    … for len bytes (or until the end of the file if len is 0)

  27. in src/util/system.cpp:1255 in b2c2917889 outdated
    1131+    // Ignore any errors up to and including the posix_fadvise call since it's
    1132+    // advisory.
    1133+    if (file != nullptr) {
    1134+        const int fd = fileno(file);
    1135+        if (fd != -1) {
    1136+            const off_t end = lseek(fd, 0, SEEK_END);
    


    vasild commented at 2:24 pm on March 27, 2020:
    nit: I like the const, but it is missing from variables in AdviseSequential(), maybe add it there too?
  28. in src/util/system.cpp:1256 in b2c2917889 outdated
    1132+    // advisory.
    1133+    if (file != nullptr) {
    1134+        const int fd = fileno(file);
    1135+        if (fd != -1) {
    1136+            const off_t end = lseek(fd, 0, SEEK_END);
    1137+            if (end != (off_t)-1) {
    


    vasild commented at 2:34 pm on March 27, 2020:
    The typecast is not needed as off_t is “extended signed integral”.

    luke-jr commented at 9:05 pm on May 7, 2020:
    Prefer to stick to the manpage.
  29. vasild commented at 2:49 pm on March 27, 2020: contributor
    My gut feel tells me that this is good, but experience tells me that it needs some performance testing to demonstrate that it really brings a boost. Sometimes such changes have surprisingly adverse effects or no effects and end up being just code clutter.
  30. DrahtBot added the label Needs rebase on Apr 23, 2020
  31. luke-jr commented at 9:09 pm on May 7, 2020: member
    Rebased & comments addressed
  32. luke-jr force-pushed on May 7, 2020
  33. DrahtBot removed the label Needs rebase on May 7, 2020
  34. luke-jr force-pushed on May 12, 2020
  35. DrahtBot added the label Needs rebase on Feb 1, 2021
  36. practicalswift commented at 3:22 pm on February 1, 2021: contributor

    Adding to the three previous requests for benchmarks:

    Do we have any benchmarks showing the practical gains in swiftness? :)

  37. luke-jr force-pushed on Feb 6, 2022
  38. DrahtBot removed the label Needs rebase on Feb 6, 2022
  39. DrahtBot added the label Needs rebase on Feb 21, 2022
  40. Try to use posix_fadvise with CBufferedFile
    This primarily affects blocks when bitcoin is launched with -reindex, as
    that causes the block files to be loaded as CBufferedFile objects one at
    a time as the reindex progresses.
    
    Co-Authored-By: Luke Dashjr <luke-jr+git@utopios.org>
    3f2c08b820
  41. luke-jr force-pushed on Feb 22, 2022
  42. DrahtBot removed the label Needs rebase on Feb 23, 2022
  43. achow101 added the label Up for grabs on Oct 12, 2022
  44. achow101 added the label Needs Benchmark on Oct 12, 2022
  45. aureleoules commented at 6:33 pm on November 4, 2022: member
    This change seems to slow down IBD. I benchmarked by syncing a fresh node both on master and this PR (rebased on master) until block height=495000 with -prune=550 and -connect=localhost (syncing from existing local full node). Master took 5h27mins and this PR took 5h40mins. #14485 (comment)
  46. vasild commented at 10:04 am on November 7, 2022: contributor

    This is about 4% difference. Could it be within noise/variance? I mean if you run 3 times master and it takes e.g. 5h27min, 5h03min, 5h42min then you know the result from the PR is within noise. I think that if this makes some impact (positive or negative) it should be observable on smaller tests too, e.g. something that runs 5 or 10 minutes. Then you can repeat a few times and see how much is the variance for both master and the PR.

    Is such a workload disk bound? If it is, then does the full node (the source of the sync) compete for disk resources too (that would slow down the syncing node and bring noise)? Was the full node also running the PR?

  47. aureleoules commented at 12:15 pm on November 7, 2022: member

    @vasild you’re right, sorry about my last benchmark. I didn’t lock the CPU frequency which probably caused a lot of noise.

    I re-ran some benchmarks with CPU frequency locked. I rebased the PR locally on master. The local node used to sync the benchmarked nodes runs on this PR.

     0Command being timed: "./bitcoind_pr -datadir=/tmp/btc -prune=10000 -stopatheight=400000 -connect=localhost -server=0 -printtoconsole=0"
     1        User time (seconds): 6765.53
     2        System time (seconds): 707.99
     3        Percent of CPU this job got: 137%
     4        Elapsed (wall clock) time (h:mm:ss or m:ss): 1:30:34
     5        Average shared text size (kbytes): 0
     6        Average unshared data size (kbytes): 0
     7        Average stack size (kbytes): 0
     8        Average total size (kbytes): 0
     9        Maximum resident set size (kbytes): 2742480
    10        Average resident set size (kbytes): 0
    11        Major (requiring I/O) page faults: 1858
    12        Minor (reclaiming a frame) page faults: 2802209
    13        Voluntary context switches: 19966530
    14        Involuntary context switches: 15899
    15        Swaps: 0
    16        File system inputs: 473456
    17        File system outputs: 333039544
    18        Socket messages sent: 0
    19        Socket messages received: 0
    20        Signals delivered: 0
    21        Page size (bytes): 4096
    22        Exit status: 0
    
     0Command being timed: "./bitcoin/src/bitcoind -datadir=/tmp/btc -prune=10000 -stopatheight=400000 -connect=localhost -server=0 -printtoconsole=0"
     1        User time (seconds): 6944.92
     2        System time (seconds): 707.66
     3        Percent of CPU this job got: 137%
     4        Elapsed (wall clock) time (h:mm:ss or m:ss): 1:32:33
     5        Average shared text size (kbytes): 0
     6        Average unshared data size (kbytes): 0
     7        Average stack size (kbytes): 0
     8        Average total size (kbytes): 0
     9        Maximum resident set size (kbytes): 2733784
    10        Average resident set size (kbytes): 0
    11        Major (requiring I/O) page faults: 2161
    12        Minor (reclaiming a frame) page faults: 2674124
    13        Voluntary context switches: 20103862
    14        Involuntary context switches: 18039
    15        Swaps: 0
    16        File system inputs: 536328
    17        File system outputs: 334433416
    18        Socket messages sent: 0
    19        Socket messages received: 0
    20        Signals delivered: 0
    21        Page size (bytes): 4096
    22        Exit status: 0
    

    The results seem to be similar, so ACK 3f2c08b8202e6a98cc55ff101e274281b54c64f1.

  48. maflcko removed the label Up for grabs on Feb 1, 2023
  49. DrahtBot added the label Needs rebase on Feb 1, 2023
  50. DrahtBot commented at 11:40 am on February 1, 2023: contributor

    🐙 This pull request conflicts with the target branch and needs rebase.

  51. achow101 requested review from jamesob on Apr 25, 2023
  52. DrahtBot commented at 1:45 am on July 24, 2023: contributor

    There hasn’t been much activity lately and the patch still needs rebase. What is the status here?

    • Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
    • Is it no longer relevant? ➡️ Please close.
    • Did the author lose interest or time to work on this? ➡️ Please close it and mark it ‘Up for grabs’ with the label, so that it can be picked up in the future.
  53. achow101 commented at 4:05 pm on September 20, 2023: member
    This PR does not seem to have conceptual support. Please leave a comment if you would like this to be reopened.
  54. achow101 closed this on Sep 20, 2023

  55. luke-jr referenced this in commit 79e4722e62 on Jun 13, 2024
  56. bitcoin locked this on Sep 19, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-26 12:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me