compressor: use a prevector in CompressScript serialization [ZAP1] #18847

pull jb55 wants to merge 1 commits into bitcoin:master from jb55:2020-05-compresscript-prevector changing 4 files +27 −13
  1. jb55 commented at 0:47 am on May 2, 2020: member

    This function was doing millions of unnecessary heap allocations during IBD.

    I’m start to catalog unnecessary heap allocations as a pet project of mine: as-zero-as-possible-alloc IBD. This is one small step.

    before: May01-174536

    after: May01-174610

    should I type alias this? I type aliased it

    This is a part of the Zero Allocations Project #18849 (ZAP1). This code came up as a place where many allocations occur.

  2. DrahtBot added the label Tests on May 2, 2020
  3. Empact commented at 8:58 am on May 2, 2020: member
  4. in src/compressor.h:15 in b6856e4998 outdated
     9@@ -10,8 +10,9 @@
    10 #include <script/script.h>
    11 #include <serialize.h>
    12 #include <span.h>
    13+#include <prevector.h>
    14 
    15-bool CompressScript(const CScript& script, std::vector<unsigned char> &out);
    16+bool CompressScript(const CScript& script, prevector<33, unsigned char> &out);
    


    Empact commented at 9:01 am on May 2, 2020:
  5. sipa commented at 7:20 am on May 3, 2020: member
    Any reason not to do this for decompression?
  6. jb55 commented at 3:53 pm on May 3, 2020: member

    Any reason not to do this for decompression?

    nope, I could do that as well.

  7. jb55 force-pushed on May 15, 2020
  8. jb55 force-pushed on May 15, 2020
  9. jb55 force-pushed on May 15, 2020
  10. jb55 force-pushed on May 15, 2020
  11. in src/compressor.h:19 in 14cd3d9ec3 outdated
    15-bool CompressScript(const CScript& script, std::vector<unsigned char> &out);
    16+/**
    17+ * This saves us from making many heap allocations when serializaing
    18+ * and deserializaing compressed scripts
    19+ */
    20+typedef prevector<33, unsigned char> CCompressedScript;
    


    Empact commented at 6:36 pm on May 15, 2020:

    nit: How about using CompressedScript = prevector<33, unsigned char>;

    “Do not prefix class names with C.” https://github.com/bitcoin/bitcoin/blob/master/doc/developer-notes.md#coding-style-c


    jb55 commented at 6:40 pm on May 15, 2020:
    ok I was just following code I saw in the codebase, will update

    Empact commented at 6:50 pm on May 15, 2020:
    Yeah the codebase is a mix of styles because we don’t update code unless we have a need to touch it - updating the codebase wholesale would introduce risk, conflicts between pull requests, and make it more difficult to properly scrutinize changes to the code over time.
  12. jb55 force-pushed on May 15, 2020
  13. jb55 force-pushed on May 15, 2020
  14. jb55 force-pushed on May 15, 2020
  15. in src/compressor.h:82 in d03a694c43 outdated
    78+            auto size = GetSpecialScriptSize(nSize);
    79+            // if size > sizeof(CompressedScript) this would incur a
    80+            // heap allocation in the CompressedScript prevector.
    81+            // bump the CompressedScript prevector size if this ever
    82+            // happens.
    83+            assert(size <= sizeof(CompressedScript));
    


    JeremyRubin commented at 8:20 pm on May 15, 2020:

    This line is a little frightening to me.

    Is this something we know at compile time and can static_assert? Otherwise we should just resize and take the performance hit here.

    Performance degradation is much better than crash fail :)


    jb55 commented at 9:35 pm on May 15, 2020:
    I was hoping this assert would trigger in the tests if GetSpecialScriptSize was ever changed, but I just tried chaging the 32 value to 34 and the assert wasn’t triggered. so now I’m wondering if there should be better test coverage here.

    jb55 commented at 9:36 pm on May 15, 2020:
    oh nm I just realized this is horribly wrong, not sure why I’m doing sizeof here. I’m going to just delete this assert
  16. jb55 force-pushed on May 15, 2020
  17. jb55 force-pushed on May 15, 2020
  18. in src/compressor.h:23 in 98c06886a5 outdated
    15-bool CompressScript(const CScript& script, std::vector<unsigned char> &out);
    16+/**
    17+ * This saves us from making many heap allocations when serializaing
    18+ * and deserializaing compressed scripts
    19+ */
    20+using CompressedScript = prevector<33, unsigned char>;
    


    JeremyRubin commented at 10:01 pm on May 15, 2020:
    Can you explain where 33 comes from?

    JeremyRubin commented at 10:01 pm on May 15, 2020:
    (ideally in this comment)
  19. jb55 commented at 10:12 pm on May 15, 2020: member

    Can you explain where 33 comes from?

    oh yeah good idea

  20. jb55 force-pushed on May 15, 2020
  21. jb55 force-pushed on May 15, 2020
  22. compressor: use a prevector in compressed script serialization
    Use a prevector for stack allocation instead of heap allocation during
    script compression and decompression. These functions were doing
    millions of unnecessary heap allocations during IBD.
    
    We introduce a CompressedScript type alias for this prevector. It is
    size 33 as that is the maximum size of a compressed script.
    
    Fix the DecompressScript header to match the variable name from
    compressor.cpp
    
    Signed-off-by: William Casarin <jb55@jb55.com>
    83a425d25a
  23. jb55 force-pushed on May 15, 2020
  24. jb55 renamed this:
    compressor: use prevector in CompressScript serialization
    compressor: use a prevector in CompressScript serialization
    on May 15, 2020
  25. DrahtBot commented at 0:26 am on May 16, 2020: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    No conflicts as of last run.

  26. jb55 renamed this:
    compressor: use a prevector in CompressScript serialization
    compressor: use a prevector in CompressScript serialization [ZAP1]
    on May 16, 2020
  27. elichai commented at 9:21 am on May 17, 2020: contributor
    tACK 83a425d25af033086744c1c8c892015014ed46bd
  28. jb55 commented at 1:23 am on May 21, 2020: member

    doing some tests I’m seeing a very tiny improvement in tmpfs/memdisk reindex up to height 200k:

    but could just be luck :shrug: would be interesting to test this to even higher heights.

    0n       Δn=6      Δn+1      ms       
    11       -3135    -581       219706    zeroalloc-mem-200k
    22       -2554    -981       220287    zeroalloc-mem-200k
    33       -1573    -100       221268    zeroalloc-mem-200k
    44       -1473    -88        221368    master-mem-200k
    55       -1385    -1385      221456    master-mem-200k
    66                           222841    master-mem-200k
    

    I also traced the number of calls to CompressScript: around 57248988 calls at height 200k

    I used

  29. jb55 commented at 7:33 pm on April 27, 2021: member
    friendly yearly bump
  30. sipa commented at 6:25 pm on April 28, 2021: member
    utACK 83a425d25af033086744c1c8c892015014ed46bd
  31. MarcoFalke merged this on Apr 28, 2021
  32. MarcoFalke closed this on Apr 28, 2021

  33. sidhujag referenced this in commit a47fad5669 on Apr 29, 2021
  34. gwillen referenced this in commit 10ee57280a on Jun 1, 2022
  35. DrahtBot locked this on Aug 18, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-01-21 09:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me