Minor optimizations to bech32::Decode(); add tests. #12881

pull murrayn wants to merge 1 commits into bitcoin:master from murrayn:bech32_decode changing 2 files +6 −3
  1. murrayn commented at 10:39 AM on April 4, 2018: contributor

    Just a few minor optimizations to bech32::Decode():

    1. optimize the order and logic of the conditionals
    2. get rid of subsequent '(c < 33 || c > 126)' check which is redundant (already performed above)
    3. add a couple more bech32 tests (mixed-case)
  2. fanquake added the label Refactoring on Apr 4, 2018
  3. in src/bech32.cpp:159 in b65a5a3bfd outdated
     154 | @@ -161,26 +155,28 @@ std::pair<std::string, data> Decode(const std::string& str) {
     155 |      for (size_t i = 0; i < str.size(); ++i) {
     156 |          unsigned char c = str[i];
     157 |          if (c < 33 || c > 126) return {};
     158 | -        if (c >= 'a' && c <= 'z') lower = true;
     159 | -        if (c >= 'A' && c <= 'Z') upper = true;
     160 | +        // Mixing upper and lowercase is not OK.
     161 | +        if (!lower && c >= 'a' && c <= 'z') { if(upper) return {}; lower = true; }
    


    practicalswift commented at 11:05 AM on April 4, 2018:

    Nit: if (upper) instead of if(upper)


    murrayn commented at 11:32 AM on April 4, 2018:

    Not even a nit. Should be a compiler error IMO. I must be tired. :-)

  4. in src/bech32.cpp:175 in b65a5a3bfd outdated
     169 |      }
     170 |      data values(str.size() - 1 - pos);
     171 |      for (size_t i = 0; i < str.size() - 1 - pos; ++i) {
     172 |          unsigned char c = str[i + pos + 1];
     173 | -        int8_t rev = (c < 33 || c > 126) ? -1 : CHARSET_REV[c];
     174 | +        int8_t rev = CHARSET_REV[c];
    


    practicalswift commented at 11:10 AM on April 4, 2018:

    Add assert(c >= 33 && c <= 126); on the line before int8_t rev = CHARSET_REV[c]; to make assumption explicit?


    murrayn commented at 3:23 AM on April 5, 2018:

    @practicalswift If I were writing Decode() from scratch, I don't think it would occur to me to add an assert() there. Not convinced it adds any value.


    laanwj commented at 11:05 AM on April 26, 2018:

    Remember that we have asserts enabled on release notes, better not add them in inner loops especially not if the goal is to 'tighten up' anything.

  5. practicalswift commented at 11:13 AM on April 4, 2018: contributor

    LowerCase() and std::tolower() are not equivalent. std::tolower() takes the currently installed C locale into account.

  6. murrayn commented at 11:46 AM on April 4, 2018: contributor

    Had a feeling there was a reason for LowerCase().

  7. in src/bech32.cpp:165 in 1bc88b4a66 outdated
     160 | @@ -161,18 +161,20 @@ std::pair<std::string, data> Decode(const std::string& str) {
     161 |      for (size_t i = 0; i < str.size(); ++i) {
     162 |          unsigned char c = str[i];
     163 |          if (c < 33 || c > 126) return {};
     164 | -        if (c >= 'a' && c <= 'z') lower = true;
     165 | -        if (c >= 'A' && c <= 'Z') upper = true;
     166 | +        // Mixing upper and lowercase is not OK.
     167 | +        if (!lower && c >= 'a' && c <= 'z') { if (upper) return {}; lower = true; }
    


    sipa commented at 12:44 AM on April 5, 2018:

    Please follow the coding style: any if that has anything but just a single-statement then body must use braces and indentation.

  8. fanquake commented at 1:06 AM on April 5, 2018: member

    @murrayn After fixing any nits, please also squash your commits.

  9. in src/bech32.cpp:165 in 163d794add outdated
     160 | @@ -161,18 +161,25 @@ std::pair<std::string, data> Decode(const std::string& str) {
     161 |      for (size_t i = 0; i < str.size(); ++i) {
     162 |          unsigned char c = str[i];
     163 |          if (c < 33 || c > 126) return {};
     164 | -        if (c >= 'a' && c <= 'z') lower = true;
     165 | -        if (c >= 'A' && c <= 'Z') upper = true;
    


    promag commented at 1:32 AM on April 5, 2018:

    else here would improve a bit.


    promag commented at 1:43 AM on April 5, 2018:

    Maybe better:

    if (c >= 'a') { if (c <= 'z') lower = true; }
    else if (c >= 'A') { if (c <= 'Z') upper = true; }
    

    murrayn commented at 3:29 AM on April 5, 2018:

    @promag Do you have an example in mind of a most common case?


    promag commented at 8:35 AM on April 5, 2018:

    The success case?


    murrayn commented at 10:25 AM on April 5, 2018:

    OK, in that case (let's assume a string consisting only of lowercase or uppercase letters) the existing code will do three comparisons and one assignment, per character. The proposed code will do three comparisons per character, with the added benefit of returning earlier in the case of a malformed string. Not sure why you would "suspect this makes the most common case worst".


    murrayn commented at 11:28 AM on April 5, 2018:

    Further to this, if we're going to assume the most common case is the success case (which is reasonable), it would probably be good to move the initial (c < 33 || c > 126) check to an ultimate "else if" check to the original code.


    promag commented at 8:41 PM on April 5, 2018:

    @murrayn probably, but can you post benchmark results?

  10. promag commented at 1:47 AM on April 5, 2018: member

    Please provide benchmark results. I suspect this makes the most common case worst. Maybe the suggestion below would improve performance for the most common case. An edge case doesn't have to perform better if the implementation makes the remaining cases worst.

  11. murrayn force-pushed on Apr 5, 2018
  12. laanwj commented at 7:25 AM on April 5, 2018: member

    LowerCase() and std::tolower() are not equivalent. std::tolower() takes the currently installed C locale into account.

    Thanks for catching this. We should be extremely careful to not introduce locale dependencies in the low-level string parsing functions. We've had serious problems with those in the past. This can result in country-specific bugs...

  13. sipa commented at 11:50 AM on April 5, 2018: member

    Please don't overthink this. Decoding addresses is hardly relevant (I don't think anyone would notice if they were 100x slower). My goal when writing this was more clarity and simplicity than speed, though I'm obviously not opposed to performance improvements if they don't conflict with those goals.

    I am interested in whether the additional branches don't make performance worse though. My gut feeling is that they impact performance more than comparisons.

  14. murrayn commented at 2:10 AM on April 6, 2018: contributor

    @sipa Thanks for the feedback. I've reworked the code again to reflect your input.

  15. MarcoFalke commented at 7:38 PM on April 11, 2018: member

    To get rid of the merge commit, please squash your commits according to https://github.com/bitcoin/bitcoin/blob/master/CONTRIBUTING.md#squashing-commits

  16. Tighten up bech32::Decode(); add tests. 60f61f9952
  17. murrayn force-pushed on Apr 13, 2018
  18. murrayn commented at 9:22 AM on April 20, 2018: contributor

    Not sure if this PR is stalled due to the earlier comment about benchmarks...I didn't think benchmarks would be as interesting after my most recent commit, in which the code was more straightforward. Just in case, I have benchmarked:

    if (c >= 'a' && c <= 'z') lower = 1;
    else if (c >= 'A' && c <= 'Z') upper = 1;
    else if (c < 33 || c > 126) return 0;

    versus

    if (c < 33 || c > 126) return 0;
    if (c >= 'a' && c <= 'z') lower = 1;
    if (c >= 'A' && c <= 'Z') upper = 1;

    and my results show the former is significantly faster; however, with -O2 compiler optimization enabled they benchmark identically, which isn't surprising.

  19. promag commented at 10:26 AM on April 26, 2018: member

    utACK 60f61f9.

    Change is slightly better because success checks come first, which is probably the most common case.

  20. MarcoFalke commented at 11:47 AM on April 26, 2018: member

    Please also adjust the OP

  21. murrayn renamed this:
    Tighten up bech32::Decode(); add tests.
    Minor optimizations to bech32::Decode(); add tests.
    on Apr 28, 2018
  22. laanwj commented at 10:08 AM on May 15, 2018: member

    utACK 60f61f99529f54f85c847d61122c70c0358ebecc

  23. laanwj merged this on May 15, 2018
  24. laanwj closed this on May 15, 2018

  25. laanwj referenced this in commit 1d4662f5dc on May 15, 2018
  26. laanwj referenced this in commit 5779dc4f76 on Jun 7, 2018
  27. PastaPastaPasta referenced this in commit 41e71ce854 on Jun 17, 2020
  28. PastaPastaPasta referenced this in commit 28b02f42e1 on Jul 2, 2020
  29. UdjinM6 referenced this in commit ee1aa96286 on May 21, 2021
  30. UdjinM6 referenced this in commit 86ae5fcb64 on May 25, 2021
  31. MarcoFalke locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-16 18:15 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me