refactor: use string_view for passing string literals to Parse{Hash,Hex} #28172

pull jonatack wants to merge 2 commits into bitcoin:master from jonatack:2023-07-avoid-string-copies-in-ParseHex-functions changing 3 files +13 −12
  1. jonatack commented at 7:54 PM on July 27, 2023: member

    as string_view is optimized to be trivially copiable, whereas the current code creates a std::string copy at each call.

    These utility methods are called by quite a few RPCs and tests, as well as by each other.

    $ git grep "ParseHashV\|ParseHashO\|ParseHexV\|ParseHexO" | wc -l
    61
    

    Also remove an out-of-date external link.

  2. DrahtBot commented at 7:54 PM on July 27, 2023: contributor

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--006a51241073e994b41acfe9ec718e94-->

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    <!--021abf342d371248e50ceaed478a90ca-->

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK maflcko, brunoerg, ns-xvrn, achow101

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    <!--174a7506f384e20aa4161008e828411d-->

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #28134 (refactor: deduplicate AmountFromValue() functions by jonatack)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. in src/rpc/util.cpp:76 in a42464ef1b outdated
      72 | @@ -73,7 +73,7 @@ CAmount AmountFromValue(const UniValue& value, int decimals)
      73 |      return amount;
      74 |  }
      75 |  
      76 | -uint256 ParseHashV(const UniValue& v, std::string strName)
      77 | +uint256 ParseHashV(const UniValue& v, const std::string& strName)
    


    maflcko commented at 6:09 AM on July 28, 2023:

    The names are short, so a copy or reference should make no difference. Also, a copy is still done when the input is a c-string, which is the case here for all call sites.


    jonatack commented at 2:53 AM on August 2, 2023:

    It looks like passing std::strings by reference to const (as done in ParseHashStr() in core_io.h) would be much better than passing string literals (for which by-value or by-reference-to-const doesn't matter, as you rightly point out).

    <details><summary>test</summary><p>

    // string-passing.cpp
    //
    // $ clang++ -std=c++17 string-passing.cpp && ./a.out
    // foo(string) took 436096 cycles
    // bar(string) took 156324 cycles
    // foo("string literal") took 1464669 cycles
    // bar("string literal") took 1463397 cycles
    
    #include <ctime>
    #include <iostream>
    #include <string>
    
    void foo(std::string s) { return; }
    void bar(const std::string& s) { return; }
    
    int main() {
        std::string s = "blockhash";
    
        auto start = clock();
        for (int it = 0; it < 100000000; ++it) {
            foo(s);
        }
        std::cout << "foo(string) took " << (clock() - start) << " cycles" << std::endl;
    
        start = clock();
        for (int it = 0; it < 100000000; ++it) {
            bar(s);
        }
        std::cout << "bar(string) took " << (clock() - start) << " cycles" << std::endl;
    
        start = clock();
        for (int it = 0; it < 100000000; ++it) {
            foo("blockhash");
        }
        std::cout << "foo(\"string literal\") took " << (clock() - start) << " cycles" << std::endl;
    
        start = clock();
        for (int it = 0; it < 100000000; ++it) {
            bar("blockhash");
        }
        std::cout << "bar(\"string literal\") took " << (clock() - start) << " cycles" << std::endl;
    
        return 0;
    }
    

    </p></details>

    Replacing the blockhash, hash, and txid string literals by a constant would do it (and the parameter 1 ones look like they should be txid as well).


    jonatack commented at 4:27 PM on August 2, 2023:

    Pushed an update to use string_view instead.

  4. maflcko changes_requested
  5. maflcko commented at 6:10 AM on July 28, 2023: member

    Not sure. This has no effect on the compiled binary and is only changing the style of the code.

  6. fanquake commented at 8:48 AM on August 1, 2023: member

    Also unsure. Can you better-explain the changes here/respond to the review comment?

  7. jonatack force-pushed on Aug 2, 2023
  8. jonatack renamed this:
    rpc, util: avoid string copies in ParseHash/ParseHex functions
    rpc, util: use string_view for passing string literals to Parse{Hash,Hex}
    on Aug 2, 2023
  9. DrahtBot renamed this:
    rpc, util: use string_view for passing string literals to Parse{Hash,Hex}
    rpc, util: use string_view for passing string literals to Parse{Hash,Hex}
    on Aug 2, 2023
  10. jonatack force-pushed on Aug 2, 2023
  11. DrahtBot added the label CI failed on Aug 2, 2023
  12. jonatack commented at 6:00 PM on August 2, 2023: member

    Updated to pass string literals to the Parse{Hash,Hex} functions by value as string_view rather than std::string. Thank you for the helpful feedback!

    <details><summary>microbench</summary><p>

    // passing-string-literals.cpp
    //
    // $ g++ -std=c++17 passing-string-literals.cpp && ./a.out
    // StringByValue("string literal") took 1426 cycles
    // StringByConstRef("string literal") took 1412 cycles
    // StringViewByValue("string literal") took 612 cycles
    
    #include <ctime>
    #include <iostream>
    #include <string>
    #include <string_view>
    
    void StringByValue(std::string s) { return; }
    void StringByConstRef(const std::string& s) { return; }
    void StringViewByValue(std::string_view s) { return; }
    
    int main() {
        const int iters = 100000; // try also with `iters` values of, say, 1 or 100
    
        auto start = clock();
        for (int it = 0; it < iters; ++it) {
            StringByValue("blockhash");
        }
        std::cout << "StringByValue(\"string literal\") took " << (clock() - start) << " cycles" << std::endl;
    
        start = clock();
        for (int it = 0; it < iters; ++it) {
            StringByConstRef("blockhash");
        }
        std::cout << "StringByConstRef(\"string literal\") took " << (clock() - start) << " cycles" << std::endl;
    
        start = clock();
        for (int it = 0; it < iters; ++it) {
            StringViewByValue("blockhash");
        }
        std::cout << "StringViewByValue(\"string literal\") took " << (clock() - start) << " cycles" << std::endl;
    
        return 0;
    }
    

    </p></details>

  13. DrahtBot removed the label CI failed on Aug 2, 2023
  14. in src/rpc/util.cpp:77 in be3a44ae04 outdated
      73 | @@ -73,29 +74,29 @@ CAmount AmountFromValue(const UniValue& value, int decimals)
      74 |      return amount;
      75 |  }
      76 |  
      77 | -uint256 ParseHashV(const UniValue& v, std::string strName)
      78 | +uint256 ParseHashV(const UniValue& v, std::string_view strName)
    


    maflcko commented at 6:15 AM on August 3, 2023:
    uint256 ParseHashV(const UniValue& v, std::string_view name)
    

    nit, while touching all lines where this is used


    jonatack commented at 6:36 PM on August 3, 2023:

    done

  15. in src/rpc/util.cpp:90 in be3a44ae04 outdated
      89 | +uint256 ParseHashO(const UniValue& o, std::string_view strKey)
      90 |  {
      91 |      return ParseHashV(o.find_value(strKey), strKey);
      92 |  }
      93 | -std::vector<unsigned char> ParseHexV(const UniValue& v, std::string strName)
      94 | +std::vector<unsigned char> ParseHexV(const UniValue& v, std::string_view strName)
    


    maflcko commented at 6:15 AM on August 3, 2023:

    same


    jonatack commented at 6:37 PM on August 3, 2023:

    done

  16. maflcko approved
  17. maflcko commented at 6:26 AM on August 3, 2023: member

    string_view is dangerous, because it can lead to use-after-free, but here it seems fine. lgtm.

    I think you can remove the 4 links to random blog posts, because everyone is able to type std::string_view into a search engine, if they want to. Also, I don't think the benchmark is useful in this context and can be removed. If we cared about an RPC call taking a few femtoseconds less, there are much lower hanging fruits to pick. A benchmark would be useful if this showed an end-to-end improvement, but it seems more likely that the compiler already optimized all of this down to the same number of cycles when the functions are called in a complex RPC method.

  18. DrahtBot renamed this:
    rpc, util: use string_view for passing string literals to Parse{Hash,Hex}
    refactor: use string_view for passing string literals to Parse{Hash,Hex}
    on Aug 3, 2023
  19. DrahtBot added the label Refactoring on Aug 3, 2023
  20. jonatack force-pushed on Aug 3, 2023
  21. jonatack commented at 6:42 PM on August 3, 2023: member

    Updated the pull description and repushed to take @MarcoFalke's review feedback (thanks!)

  22. bitcoin deleted a comment on Aug 4, 2023
  23. maflcko commented at 5:38 AM on August 4, 2023: member

    lgtm ACK b94581a8acecafe5ffff15692485a5572d1db57c

  24. DrahtBot added the label Needs rebase on Aug 24, 2023
  25. refactor: use string_view to pass string literals to Parse{Hash,Hex}
    as string_view is optimized to be trivially copiable, and in these use cases we
    only perform read operations on the passed object.
    
    These utility methods are called by quite a few RPCs and tests, as well as by each other.
    
    $ git grep "ParseHashV\|ParseHashO\|ParseHexV\|ParseHexO" | wc -l
    61
    7d494a48dd
  26. doc: remove out-of-date external link in src/util/strencodings.h bb91131d54
  27. jonatack force-pushed on Aug 25, 2023
  28. jonatack commented at 10:45 PM on August 25, 2023: member

    Rebased per git range-diff c9273f6 b94581a bb91131 for an include header from the merge of #28230. Should be trivial to re-ACK.

  29. DrahtBot removed the label Needs rebase on Aug 25, 2023
  30. jonatack requested review from maflcko on Aug 28, 2023
  31. maflcko commented at 2:00 PM on August 28, 2023: member

    lgtm ACK bb91131d545d986aab81c4bb13676c4520169259

  32. DrahtBot removed review request from maflcko on Aug 28, 2023
  33. in src/rpc/util.cpp:78 in bb91131d54
      74 | @@ -74,29 +75,29 @@ CAmount AmountFromValue(const UniValue& value, int decimals)
      75 |      return amount;
      76 |  }
      77 |  
      78 | -uint256 ParseHashV(const UniValue& v, std::string strName)
      79 | +uint256 ParseHashV(const UniValue& v, std::string_view name)
    


    brunoerg commented at 6:23 PM on September 15, 2023:

    What is the difference of changing std::string to std::string_view in comparision of changing it to const std::string& name? The overhead of a reference to the entire object?


    maflcko commented at 9:36 AM on September 17, 2023:

    const std::string& requires that the referred-to memory exists inside a std::string in the memory somewhere.

    std::string_view does not have this requirement, because it allows to refer to anything char-span-like.


    brunoerg commented at 11:48 AM on September 18, 2023:

    Cool, thanks!


    jonatack commented at 5:57 PM on September 19, 2023:

    @brunoerg If relevant to your question, check out the microbench results above in #28172 (comment).

    // StringByValue("string literal") took 1426 cycles
    // StringByConstRef("string literal") took 1412 cycles
    // StringViewByValue("string literal") took 612 cycles
    
  34. brunoerg approved
  35. brunoerg commented at 11:48 AM on September 18, 2023: contributor

    crACK bb91131d545d986aab81c4bb13676c4520169259

  36. nsvrn commented at 10:47 PM on October 13, 2023: contributor

    Tested locally and reviewed the code changes, looks good.

    ACK https://github.com/bitcoin/bitcoin/commit/bb91131d545d986aab81c4bb13676c4520169259

  37. achow101 commented at 7:43 PM on November 2, 2023: member

    ACK bb91131d545d986aab81c4bb13676c4520169259

  38. achow101 merged this on Nov 2, 2023
  39. achow101 closed this on Nov 2, 2023

  40. jonatack deleted the branch on Nov 2, 2023
  41. bitcoin locked this on Nov 1, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-14 21:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me