Let validateaddress locate error in Bech32 address #16807

pull meshcollider wants to merge 8 commits into bitcoin:master from meshcollider:201909_bech32_error_detection changing 8 files +607 −63
  1. meshcollider commented at 1:14 pm on September 5, 2019: contributor

    Addresses (partially) #16779 - no GUI change in this PR

    Adds a LocateError function the bech32 library, which is then called by validateaddress RPC, (and then eventually from a GUI tool too, future work). I think modifying validateaddress is nicer than adding a separate RPC for this. Includes tests.

    Based on https://github.com/sipa/bech32/blob/master/ecc/javascript/bech32_ecc.js Credit to sipa for that code

  2. meshcollider added the label Wallet on Sep 5, 2019
  3. meshcollider added the label RPC/REST/ZMQ on Sep 5, 2019
  4. meshcollider requested review from sipa on Sep 5, 2019
  5. emilengler commented at 2:48 pm on September 5, 2019: contributor
    See #16779 Concept ACK
  6. DrahtBot commented at 4:17 pm on September 5, 2019: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #23545 (scripted-diff: Use clang-tidy syntax for C++ named arguments [WIP; TOO MANY CONFLICTS] by MarcoFalke)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  7. in src/bech32.cpp:508 in 9d91c4a046 outdated
    381+    int ind = CheckCharacters(str);
    382+    if (ind != -1) {
    383+        error = "Invalid character";
    384+        return ind;
    385+    }
    386+    if (str.size() > 90) {
    


    promag commented at 10:18 pm on September 5, 2019:

    9d91c4a046d990d7f648328dfed982ad38c87ff1

    Should be first test?

  8. in src/bech32.cpp:476 in 9d91c4a046 outdated
    416+        int length = str.size() - 1;
    417+        uint32_t syn = Syndrome(residue);
    418+        int s0 = syn & 0x3FF;
    419+        int s1 = (syn >> 10) & 0x3FF;
    420+        int s2 = syn >> 20;
    421+        int l_s0 = GF1024_LOG[s0];
    


    promag commented at 10:23 pm on September 5, 2019:

    9d91c4a046d990d7f648328dfed982ad38c87ff1

    Could add at the top:

    0static_assert(sizeof(GF1024_LOG) / sizeof(GF1024_LOG[0]) == 2014, "GF1024_LOG length should be 1024");
    

    Same for GF1024_EXP.

  9. in src/bech32.h:29 in 9d91c4a046 outdated
    24@@ -25,6 +25,9 @@ std::string Encode(const std::string& hrp, const std::vector<uint8_t>& values);
    25 /** Decode a Bech32 string. Returns (hrp, data). Empty hrp means failure. */
    26 std::pair<std::string, std::vector<uint8_t>> Decode(const std::string& str);
    27 
    28+/** Return the position of an error in a Bech32 string. If no errors are present, return -1. */
    29+int LocateError(const std::string& str, std::string& error);
    


    promag commented at 10:25 pm on September 5, 2019:

    9d91c4a046d990d7f648328dfed982ad38c87ff1

    Could add some tests to src/test/bech32_tests.cpp?

  10. in src/bech32.cpp:436 in 9d91c4a046 outdated
    431+                    return 0;
    432+                }
    433+                return str.size() - p1 - (p1 >= values.size() ? 2 : 1);
    434+            }
    435+        }
    436+        for (int p1 = 0; p1 < length; p1++) {
    


    promag commented at 10:26 pm on September 5, 2019:

    9d91c4a046d990d7f648328dfed982ad38c87ff1

    nit, ++p1.

  11. in src/rpc/misc.cpp:43 in f877a9b4af outdated
    39@@ -39,6 +40,8 @@ static UniValue validateaddress(const JSONRPCRequest& request)
    40             "  \"iswitness\" : true|false,     (boolean) If the address is a witness address\n"
    41             "  \"witness_version\" : version   (numeric, optional) The version number of the witness program\n"
    42             "  \"witness_program\" : \"hex\"     (string, optional) The hex value of the witness program\n"
    43+            "  \"error\" : \"message\"           (string, optional) The error message if the address is an invalid bech32 address\n"
    


    promag commented at 10:43 pm on September 5, 2019:

    f877a9b4af7da6632430eff898b957107514e6de

    Could include only when prefix matches CChainParams::Bech32HRP?


    meshcollider commented at 11:23 pm on September 5, 2019:
    I thought about this but I wasn’t sure, because one of the checks is that the HRP is present and valid too. You’re probably right, that would be better for the user.
  12. promag commented at 10:44 pm on September 5, 2019: member
    Concept ACK, light review, haven’t compared with reference code.
  13. sipa commented at 10:46 pm on September 5, 2019: member
    Would it be possible to add functions to key_io that provide this functionality for all addresses (with just basic error messages for anything but bech32)? That way the RPC code doesn’t need to directly invoke the bech32 module.
  14. promag commented at 10:55 pm on September 5, 2019: member

    +1 @sipa suggestion, something like

    0CTxDestination DecodeDestination(const std::string& str, std::string& error);
    1CTxDestination DecodeDestination(const std::string& str)
    2{
    3    const std::string& error;
    4    return DecodeDestination(str, error);
    5}
    

    And maybe also embed the error index in the message for now.

  15. meshcollider commented at 5:14 am on September 6, 2019: contributor
    I’ve added boost tests, fixed the above nits, and added a final commit with one implementation of the above idea. I can drop the final commit if it’s not a good approach, but this is the cleanest way I could come up with
  16. laanwj added the label Feature on Sep 30, 2019
  17. DrahtBot added the label Needs rebase on Oct 18, 2019
  18. meshcollider commented at 10:45 am on October 21, 2019: contributor
    Rebased
  19. meshcollider removed the label Needs rebase on Oct 21, 2019
  20. DrahtBot added the label Needs rebase on Oct 28, 2019
  21. Sjors commented at 9:34 am on October 29, 2019: member

    Concept ACK. I would prefer to make address_type an optional hint, and instead guess the address type.

    Distinguishing between base58 and bech32 could be done by counting how many characters of the address are part of the base58/bech32 character set. You can then use error_index to indicate the position of a character that doesn’t belong in the set.

  22. in src/rpc/misc.cpp:35 in 19e9def690 outdated
    28@@ -28,6 +29,7 @@ static UniValue validateaddress(const JSONRPCRequest& request)
    29                 "\nReturn information about the given bitcoin address.\n",
    30                 {
    31                     {"address", RPCArg::Type::STR, RPCArg::Optional::NO, "The bitcoin address to validate"},
    32+                    {"address_type", RPCArg::Type::STR, RPCArg::Optional::OMITTED_NAMED_ARG, "The address type provided, used to detect errors. Options are \"legacy\", \"p2sh-segwit\", and \"bech32\"."},
    


    luke-jr commented at 1:04 am on November 13, 2019:
    Why can’t this be detected from the input?

    meshcollider commented at 7:59 pm on November 22, 2019:
    Because if we assume it has errors in it, then we can’t rely on checks like whether all the characters are bech32 or base58 or whatever. But I like Sjors’ suggestion of counting the characters to find the most likely. I’ll try that when I rebase this.

    luke-jr commented at 3:04 pm on February 13, 2020:
    FWIW, for Knots I am just checking for the Bech32HRP to match

    Sjors commented at 3:33 pm on February 13, 2020:
    That sounds simple enough.

  23. in test/functional/rpc_validateaddress.py:1 in 19e9def690 outdated
    0@@ -0,0 +1,130 @@
    1+#!/usr/bin/env python3
    


    luke-jr commented at 2:45 am on November 13, 2019:
    File needs +x permission
  24. luke-jr commented at 3:02 pm on February 13, 2020: member
    Once this is merged, I have the GUI changes ready to go…
  25. meshcollider removed the label Needs rebase on Feb 19, 2020
  26. DrahtBot added the label Needs rebase on Mar 4, 2020
  27. luke-jr referenced this in commit 8071e0918b on Mar 5, 2020
  28. luke-jr referenced this in commit 7ad94ca9a0 on Mar 6, 2020
  29. gmaxwell commented at 10:22 pm on March 8, 2020: contributor
    Why is this limited to only returning a single error position? Fairly simple code can give up-to two, and it’s not hard to imagine fancier code that can sometimes return more than that.
  30. gmaxwell commented at 8:43 pm on March 15, 2020: contributor
  31. sipa commented at 4:30 am on March 16, 2020: member

    Concept ACK to adding Bech32 error location to validateaddress, but echoing earlier comments:

    • Why only one error? The reference code in https://github.com/sipa/bech32 finds two, and I don’t think just returning one makes things meaningfully simpler. Even if not implemented, the interface should probably return an array of error positions.
    • It feels unnecessary to pass an address type. Just only run the bech32 error location code when the HRP matches (and treat as base58 otherwise).
  32. meshcollider commented at 6:29 am on April 7, 2020: contributor
    Fair points, I’ll modify to return more than one error. Re. address type, we discussed it a while back in IRC, I can’t remember if there was a reason we didn’t base it solely on HRP or not, but I am happy to do it that way if you want because I can’t see a real downside now. I still have to fix the current error and rebase too.
  33. luke-jr referenced this in commit 8f8cfa64a4 on Jun 9, 2020
  34. in test/functional/rpc_validateaddress.py:112 in 54e107add4 outdated
    107+
    108+        # Legacy address with one error
    109+        address = "muehQnn2P5NhiRCGg9HewAWrXLRy2d7TZE"
    110+        res = self.nodes[0].validateaddress(address, "legacy")
    111+        assert_equal(res['isvalid'], False)
    112+        assert_equal(res['error'], "Invalid checksum for Base58 address")
    


    adamjonas commented at 4:33 pm on June 22, 2020:
    This legacy address produces the error “Invalid Base58 character in address”.
  35. in test/functional/rpc_validateaddress.py:125 in 54e107add4 outdated
    120+
    121+        # Legacy address with incorrect prefix for current network
    122+        address = "1BvBMSEYstWetqTFn5Au4m4GFg7xJaNVN2"
    123+        res = self.nodes[0].validateaddress(address, "legacy")
    124+        assert_equal(res['isvalid'], False)
    125+        assert_equal(res['error'], "Invalid prefix for Base58 address")
    


    adamjonas commented at 4:33 pm on June 22, 2020:
    Same here - this address produces the error “Invalid Base58 character in address”.
  36. adamjonas commented at 4:35 pm on June 22, 2020: member
    A couple of touch-ups to the tests needed when you rebase.
  37. fanquake added the label Waiting for author on Sep 15, 2020
  38. luke-jr referenced this in commit c362513f6d on Dec 9, 2020
  39. Sjors commented at 12:27 pm on January 6, 2021: member
    Needs rebase
  40. felipsoarez commented at 1:26 pm on January 6, 2021: none
    Concept ACK, light review, haven’t compared with reference code.
  41. meshcollider removed the label Needs rebase on Apr 20, 2021
  42. meshcollider removed the label Waiting for author on Apr 20, 2021
  43. meshcollider commented at 8:03 am on April 20, 2021: contributor
    Rebased + reworked to include Bech32m and to allow multiple error indices to be returned as discussed above.
  44. meshcollider commented at 8:47 am on April 20, 2021: contributor
    Rebased again on 21736 to fix CI
  45. MarcoFalke commented at 10:09 am on April 20, 2021: member

    From ci:

    AssertionError: not(Invalid HRP or Base58 character in address == Invalid address format)

  46. in src/key_io.cpp:290 in 56b90c86d7 outdated
    285+        std::vector<unsigned char> data;
    286+        if (!DecodeBase58(str, data, 21)) {
    287+            return "Invalid HRP or Base58 character in address";
    288+        }
    289+        if (!DecodeBase58Check(str, data, 21)) {
    290+            return "Invalid checksum for Base58 address";
    


    Sjors commented at 7:42 am on April 22, 2021:
    56b90c86d7d4c1a353cc365126a7900b5bb61da9 I’m having a hard time reaching this error condition (when running this commit). E.g. just swapping a random character tends to result in Invalid HRP or Base58 character in address rather than a checksum error. It might be useful to introduce rpc_validate_address.py earlier and add some base58 examples to it.
  47. in src/rpc/misc.cpp:49 in 56b90c86d7 outdated
    45@@ -46,6 +46,7 @@ static RPCHelpMan validateaddress()
    46                         {RPCResult::Type::NUM, "witness_version", /* optional */ true, "The version number of the witness program"},
    47                         {RPCResult::Type::STR_HEX, "witness_program", /* optional */ true, "The hex value of the witness program"},
    48                         {RPCResult::Type::STR, "error", /* optional */ true, "Error message, if any"},
    49+                        {RPCResult::Type::STR, "error_locations", /* optional */ true, "Indices of errors locations in address, if known (e.g. Bech32 errors)"},
    


    Sjors commented at 7:44 am on April 22, 2021:
    56b90c86d7d4c1a353cc365126a7900b5bb61da9 nit: it might be better to introduce this result field, as well as std::vector<int>& error_locations) in the next commit.
  48. Sjors commented at 8:15 am on April 22, 2021: member

    3bc568d looks mostly good and works.

    I don’t fully understand the moon math involved in the error position detection, but it seems to work and it’s run if and only if the regular decoding fails.

  49. in test/functional/rpc_validateaddress.py:19 in 3bc568d675 outdated
    14+        # Valid address
    15+        address = "bcrt1q049ldschfnwystcqnsvyfpj23mpsg3jcedq9xv"
    16+        res = self.nodes[0].validateaddress(address)
    17+        assert_equal(res['isvalid'], True)
    18+        assert 'error' not in res
    19+        assert 'error_locations' not in res
    


    achow101 commented at 10:17 pm on May 11, 2021:

    In 3bc568d67537e1725ea0d7472412fe9e4331d01c “Add functional test for validateaddress”

    A lot of this test is repetitive, would be nice to add a check function (maybe self.assert_valid and self.assert_invalid) that you can just give it all of the different scenarios rather than copying the same 5 lines.

  50. achow101 commented at 10:32 pm on May 11, 2021: member
    I’m not sure that I agree with having separate functions for locating the errors that callers need to call. Rather I think it would be better to modify the decode functions themselves to locate the errors too when it fails to decode. This way we aren’t duplicating code and we don’t have to try to figure out where decode failed after the fact.
  51. luke-jr referenced this in commit 86f44b46e6 on Jun 27, 2021
  52. luke-jr referenced this in commit bd1f5bd793 on Jun 27, 2021
  53. luke-jr referenced this in commit abc3e57161 on Jun 27, 2021
  54. luke-jr referenced this in commit f1a9c3cffe on Jun 27, 2021
  55. meshcollider commented at 1:45 am on October 1, 2021: contributor
    Pushed a substantial rebase + rework. Addresses all comments and suggestions above. Notably this now makes DecodeAddress return the error locations rather than creating a separate function as @achow101 suggested.
  56. achow101 commented at 7:51 pm on October 1, 2021: member

    ACK 1414005c5b24f13bb4da850c62c7c3cdc6436ad1

    Reviewed everything except the actual bech32(m) error location code as I currently don’t understand the math there. Also tested manually with a few addresses.

  57. in src/bech32.cpp:106 in 1e5709ef4a outdated
    101+    148, 260, 658, 385, 287, 262, 204, 126, 586, 1004, 236, 165, 854, 411,
    102+    932, 560, 19, 215, 1002, 775, 653, 928, 901, 964, 884, 798, 839, 786,
    103+    433, 610, 116, 855, 180, 479, 910, 1014, 599, 915, 905, 306, 516, 731,
    104+    626, 978, 825, 344, 605, 654, 209
    105+};
    106+static_assert(sizeof(GF1024_EXP) / sizeof(GF1024_EXP[0]) == 1023, "GF1024_EXP length should be 1023");
    


    sipa commented at 7:53 pm on October 1, 2021:
    In C++17 you can use std::size(GF1024_EXP) == 1023.
  58. in src/bech32.cpp:469 in 1e5709ef4a outdated
    464+        }
    465+        values[i - pos - 1] = rev;
    466+    }
    467+
    468+    // Base the polymod value off the witness version
    469+    Encoding encoding = values[0] > 0 ? Encoding::BECH32M : Encoding::BECH32;
    


    sipa commented at 7:56 pm on October 1, 2021:

    What if there is a typo in the witness version character?

    BIP350 actually spells out a strategy for locating errors in this case: https://github.com/bitcoin/bips/blob/master/bip-0350.mediawiki#addresses-for-segregated-witness-outputs

  59. in src/rpc/misc.cpp:54 in 9c0406a611 outdated
    50@@ -51,6 +51,10 @@ static RPCHelpMan validateaddress()
    51                         {RPCResult::Type::NUM, "witness_version", /* optional */ true, "The version number of the witness program"},
    52                         {RPCResult::Type::STR_HEX, "witness_program", /* optional */ true, "The hex value of the witness program"},
    53                         {RPCResult::Type::STR, "error", /* optional */ true, "Error message, if any"},
    54+                        {RPCResult::Type::ARR, "error_locations", "Indices of error locations in address, if known (e.g. Bech32 errors)",
    


    sipa commented at 7:57 pm on October 1, 2021:
    Maybe say “likely error locations”; if you have too many errors, there is no telling where they are.
  60. meshcollider commented at 0:55 am on October 2, 2021: contributor
    Addressed @sipa’s comments, thanks!
  61. DrahtBot added the label Needs rebase on Oct 11, 2021
  62. meshcollider commented at 12:36 pm on October 11, 2021: contributor
    Rebased 974227bb4576288795984085387ae0802fc3066e -> befad5eef168a193b142c4e6a1d0f120a85aae65 to fix conflict with #22794
  63. meshcollider removed the label Needs rebase on Oct 11, 2021
  64. meshcollider commented at 12:40 pm on October 11, 2021: contributor
    Also added a new commit 9048249bf003891573e60bf9b8c8ddd4a4cefa1a (“Add lots of comments to Bech32”) which, as the name suggests, adds a lot of explanatory comments to the Bech32 code to assist reviewers and better document the behaviour - up til this point, there was no documentation of how this process works or how to verify the constants.
  65. DrahtBot commented at 12:40 pm on October 11, 2021: member
    Btw, if you leave the “Needs rebase” label for me to remove, I will also remove all my previous rebase comments, keeping the discussion thread clean.
  66. meshcollider deleted a comment on Oct 11, 2021
  67. meshcollider requested review from sipa on Oct 11, 2021
  68. meshcollider commented at 10:42 pm on October 11, 2021: contributor
    @gmaxwell you may also like to review 🙂
  69. More detailed error checking for base58 addresses 0b06e720c0
  70. Add Bech32 error location function b62b67e06c
  71. Add error_locations to validateaddress RPC 02a7bdee42
  72. Add boost tests for bech32 error detection c4979f77c1
  73. Refactor and add more tests for validateaddress 42d6a029e5
  74. Add release notes for validateaddress Bech32 error detection 2eb5792ec7
  75. Add lots of comments to Bech32 5599813b80
  76. Modify copyright header on Bech32 code 88cc481092
  77. MarcoFalke added the label Needs rebase on Oct 12, 2021
  78. DrahtBot removed the label Needs rebase on Oct 12, 2021
  79. in src/bech32.cpp:322 in b62b67e06c outdated
    317+    for (int i = from; i < to; i++) {
    318+        vec.push_back(i);
    319+    }
    320+}
    321+
    322+/** Return index of first invalid character in a Bech32 string. */
    


    ryanofsky commented at 5:14 pm on October 20, 2021:

    In commit “Add Bech32 error location function” (b62b67e06cc406fdad68da4c091168fb5f11c1d4)

    Comment is out of date

  80. in src/bech32.cpp:466 in b62b67e06c outdated
    461+        if (rev == -1) {
    462+            error_locations.push_back(i);
    463+            return "Invalid Base 32 character";
    464+        }
    465+        values[i - pos - 1] = rev;
    466+    }
    


    ryanofsky commented at 5:21 pm on October 20, 2021:

    In commit “Add Bech32 error location function” (b62b67e06cc406fdad68da4c091168fb5f11c1d4)

    These lines could be moved into a function and deduplicated with lines 415-424 above.

    Maybe not worth it because a PR that affects address decoding is more dangerous than a PR that just changes error detection.

  81. in src/bech32.cpp:39 in 5599813b80 outdated
    34+// The defining polynomial of the extension is x^2 + 9x + 23
    35+// Let (e) be a primitive element of GF(1024), that is, a generator of the field.
    36+// Every non-zero element of the field can then be represented as (e)^k for some power k.
    37+// The array GF1024_EXP contains all these powers of (e) - GF1024_EXP[k] = (e)^k in GF(1024).
    38+// Conversely, GF1024_LOG contains the discrete logarithms of these powers, so
    39+// GF1024_LOG[GF1024_EXP[k]] == k
    


    ryanofsky commented at 7:19 pm on October 20, 2021:

    In commit “Add lots of comments to Bech32” (5599813b80e53a1539c66625b4320ab1b4fb4848)

    Could write a unit test or constexpr compile time test to check this


    meshcollider commented at 8:07 am on October 21, 2021:
    This would be simple to add yep 👍

    sipa commented at 10:37 pm on November 16, 2021:
    Even nicer would be to have the tables be generated at compile time using a constexpr algorithm…
  82. in src/key_io.cpp:105 in 0b06e720c0 outdated
    101@@ -98,15 +102,27 @@ CTxDestination DecodeDestination(const std::string& str, const CChainParams& par
    102             return ScriptHash(hash);
    103         }
    104 
    105-        // Set potential error message.
    106-        // This message may be changed if the address can also be interpreted as a Bech32 address.
    107-        error_str = "Invalid prefix for Base58-encoded address";
    108+        if (!std::equal(script_prefix.begin(), script_prefix.end(), data.begin()) &&
    


    ryanofsky commented at 10:10 pm on October 20, 2021:

    In commit “More detailed error checking for base58 addresses” (0b06e720c0182dee8b560d2e8d3891b036f63ea7)

    Is there something which guarantees data size is greater or equal to script_prefix and pubkey_prefix sizes? If not then, it seems like this can read uninitialized memory.

    Would suggest either adding assert’s to make it clear these cases won’t happen, or else adding size checks to the if statement to handle these cases.


    sipa commented at 9:50 pm on November 16, 2021:
    @ryanofsky Agree, I think this needs a length check.

    MarcoFalke commented at 1:14 pm on November 25, 2021:

    Error: attempt to subscript a past-the-end iterator 1 step from its current position, which falls outside its dereferenceable range.

    https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=41364

  83. in src/key_io.cpp:115 in 0b06e720c0 outdated
    113+        }
    114+        return CNoDestination();
    115+    } else if (!is_bech32) {
    116+        // Try Base58 decoding without the checksum, using a much larger max length
    117+        if (!DecodeBase58(str, data, 100)) {
    118+            error_str = "Invalid HRP or Base58 character in address";
    


    ryanofsky commented at 10:19 pm on October 20, 2021:

    In commit “More detailed error checking for base58 addresses” (0b06e720c0182dee8b560d2e8d3891b036f63ea7)

    IMO previous error message “Invalid address format” seems more understandable if you aren’t an address format guru than “Invalid HRP or Base58 character in address”. Maybe consider keeping the existing message or switching to something in between like “Not a valid BECH32 or Base58 encoding”

  84. in src/bech32.cpp:478 in b62b67e06c outdated
    473+        if (residue != 0) {
    474+            uint32_t syn = Syndrome(residue);
    475+            int s0 = syn & 0x3FF;
    476+            int s1 = (syn >> 10) & 0x3FF;
    477+            int s2 = syn >> 20;
    478+            int l_s0 = GF1024_LOG[s0];
    


    ryanofsky commented at 11:09 pm on October 20, 2021:

    In commit “Add Bech32 error location function” (b62b67e06cc406fdad68da4c091168fb5f11c1d4)

    This code doesn’t seem particularly performance critical. Maybe it would be safer to use bounds checked array lookups instead of direct pointer accesses.


    sipa commented at 10:34 pm on November 16, 2021:
    @ryanofsky Alternatively, would it help if there was an (additional) & 0x3FF inside the []?

    ryanofsky commented at 1:20 am on November 23, 2021:

    re: #16807 (review)

    @ryanofsky Alternatively, would it help if there was an (additional) & 0x3FF inside the []?

    I think most ideal thing would be for it to use something like std::array::at and throw an exception in case of a bug. Alternative of adding & 0x3FF to lookups could help avoid undefined behavior, but also clutter the code and mask potential bugs. What’s best just depends on what your priorities are. No strong opinions from me!

  85. in src/test/bech32_tests.cpp:73 in c4979f77c1 outdated
    67@@ -68,10 +68,36 @@ BOOST_AUTO_TEST_CASE(bech32_testvectors_invalid)
    68         "1qzzfhee",
    69         "a12UEL5L",
    70         "A12uEL5L",
    71+        "abcdef1qpzrz9x8gf2tvdw0s3jn54khce6mua7lmqqqxw",
    72     };
    73+    static const std::pair<std::string, int> ERRORS[] = {
    


    ryanofsky commented at 11:18 pm on October 20, 2021:

    In commit “Add boost tests for bech32 error detection” (c4979f77c1264f0099d1dfa278b1d9c18340b5f9)

    Could be good to static_assert that ERRORS and CASES arrays are the same size.

  86. in src/test/bech32_tests.cpp:99 in c4979f77c1 outdated
    94         BOOST_CHECK(dec.encoding == bech32::Encoding::INVALID);
    95+        std::vector<int> error_locations;
    96+        std::string error = bech32::LocateErrors(str, error_locations);
    97+        BOOST_CHECK_EQUAL(err.first, error);
    98+        if (err.second == -1) BOOST_CHECK(error_locations.empty());
    99+        else BOOST_CHECK_EQUAL(err.second, error_locations[0]);
    


    ryanofsky commented at 2:12 am on October 21, 2021:

    In commit “Add boost tests for bech32 error detection” (c4979f77c1264f0099d1dfa278b1d9c18340b5f9)

    Could be good to BOOST_CHECK_EQUAL(error_locations.size(), 1)).

    Could also be good to add cases with size > 1 here, but less important since python tests cover some of these.

    Same suggestions could apply to bech32m test below

  87. in src/bech32.cpp:574 in 5599813b80 outdated
    570+                // Compute the error position p1 as l_s1 - l_s0 = p1 (mod 1023)
    571+                size_t p1 = (l_s1 - l_s0 + 1023) % 1023; // the +1023 ensures it is positive
    572+                // Now because s0 = e1*(e)^(997*p1), we get e1 = s0/((e)^(997*p1)). Remember that (e)^1023 = 1,
    573+                // so 1/((e)^997) = (e)^(1023-997).
    574                 int l_e1 = l_s0 + (1023 - 997) * p1;
    575+                // Finally, some sanity checks on the result:
    


    ryanofsky commented at 2:29 am on October 21, 2021:

    In commit “Add lots of comments to Bech32” (5599813b80e53a1539c66625b4320ab1b4fb4848)

    If check below is supposed to be a sanity check, shouldn’t there be an assert, or a log, or a push_back(-1), or some other indication if it fails?

    Same comment applies to other sanity checks and continue statements below. The descriptions here seem like they could be clearer about which of these conditions are possible to hit and which should be impossible.

  88. in src/bech32.cpp:530 in b62b67e06c outdated
    525+
    526+        if (error_locations.empty() || (!possible_errors.empty() && possible_errors.size() < error_locations.size())) {
    527+            error_locations = std::move(possible_errors);
    528+        }
    529+    }
    530+    return "Invalid checksum";
    


    ryanofsky commented at 2:49 am on October 21, 2021:

    In commit “Add Bech32 error location function” (b62b67e06cc406fdad68da4c091168fb5f11c1d4)

    Might be nice if this mentioned the encoding that was detected. I.e. add an optional<Encoding> error_encoding variable, and set:

    0error_locations = std::move(possible_errors);
    1error_encoding = encoding;
    

    and return:

    0return error_encoding == BECH32M ? "Invalid bech32m checksum" : error_encoding == BECH32 ? "Invalid bech32 checksum" : "Invalid checksum"
    
  89. in doc/release-notes-16807.md:4 in 88cc481092
    0@@ -0,0 +1,6 @@
    1+Updated RPCs
    2+------------
    3+
    4+- The `validateaddress` RPC now optionally returns an `error_locations` array, with the indices of
    


    ryanofsky commented at 2:57 am on October 21, 2021:

    In commit “Add release notes for validateaddress Bech32 error detection” (2eb5792ec7bbeaf7138420b6c85c5cd0a0404946)

    Maybe mention that validateaddress returns new more specific error strings if an address can’t be decoded.

    Also “optionally returns” here seems a little misleading since this isn’t an option the caller can control. Would maybe just say it returns error locations whenever it’s possible to determine them.

  90. ryanofsky approved
  91. ryanofsky commented at 3:12 am on October 21, 2021: member

    Code review ACK 88cc4810926e4f5af6757ee1b0eed61abda3d746 with caveat that I only checked the new LocateErrors code to try to verify it didn’t have unsafe or unexpected operations or loop forever or crash. Did not try to verify behavior corresponds to the spec. In the worst case bugs here should just affect error messages not actual decoding of addresses so this seemed ok.

    I left various review suggestions, but feel to ignore them or just use ones that seem useful.

    I do think it should be possible to add more test coverage. I think it would be nice to have a fuzz test that would encode a random address, decode it and verify the address was the same, introduce random errors, and verify the errors were detected at the right locations. Also maybe there could be unit tests or constexpr checks that verify the contents of the new GF1024_EXP / GF1024_LOG arrays.

  92. meshcollider commented at 8:10 am on October 21, 2021: contributor
    Thanks for the detailed review and suggestions @ryanofsky. I’ll definitely either incorporate them here or in a follow-up :)
  93. laanwj added this to the "Blockers" column in a project

  94. in src/key_io.h:26 in 88cc481092
    22@@ -23,7 +23,7 @@ std::string EncodeExtPubKey(const CExtPubKey& extpubkey);
    23 
    24 std::string EncodeDestination(const CTxDestination& dest);
    25 CTxDestination DecodeDestination(const std::string& str);
    26-CTxDestination DecodeDestination(const std::string& str, std::string& error_msg);
    27+CTxDestination DecodeDestination(const std::string& str, std::string& error_msg, std::vector<int>* error_locations = nullptr);
    


    promag commented at 8:08 am on October 25, 2021:
    Consider another DecodeDestination overload to avoid pointer to vector: https://github.com/promag/bitcoin/commit/94ca6b403e221f12830ea50158547f1142c67230

    ryanofsky commented at 1:24 pm on October 25, 2021:
    Suggestion seems fine, but no reasoning is given, and it’s not clearly an improvement. Google’s “Use non-const pointers to represent optional outputs” which is what current code is doing does make it easier to distinguish output arguments from input arguments at call sites.

    laanwj commented at 11:11 am on November 22, 2021:
    Agree with @ryanofsky here. References are hard to read as output arguments. If you’re going to make a separate function declaration, I’d prefer a function that returns it as part of the return value. E.g. as a pair or tuple. But this is fine as-is.
  95. in src/bech32.cpp:315 in b62b67e06c outdated
    310 inline unsigned char LowerCase(unsigned char c)
    311 {
    312     return (c >= 'A' && c <= 'Z') ? (c - 'A') + 'a' : c;
    313 }
    314 
    315+void push_range(int from, int to, std::vector<int>& vec)
    


    sipa commented at 10:12 pm on November 16, 2021:

    Instead of this function, it may be worth just using std::iota.

    The function is equivalent to:

    0void push_range(int from, int to, std::vector<int>& vec)
    1{
    2    vec.resize(to - from);
    3    std::iota(vec.begin(), vec.end(), from);
    4}
    

    But I suggest just doing that directly in the call sites.

  96. in src/bech32.cpp:469 in b62b67e06c outdated
    464+        }
    465+        values[i - pos - 1] = rev;
    466+    }
    467+
    468+    // We attempt error detection with both bech32 and bech32m, and choose the one with the fewest errors
    469+    // We can't simply use the segwit version, because that may be one of the errors
    


    sipa commented at 10:33 pm on November 16, 2021:
    Well, you can reject (and not return) candidates for which the checksum type mismatches the witness version (bech32 for 0, bech32m for 1+) AND the character that identifies the witness version is not one of the error locations.
  97. w0xlt approved
  98. w0xlt commented at 7:39 am on November 21, 2021: contributor

    tACK 88cc481

    The command below returns the “error_locations” field on the PR branch.

    0./src/bitcoin-cli validateaddress bc1qctxsy54qjhk0dl5ea0pd2xtth2qq9n78wlk39m
    

    Master branch:

    0{
    1  "isvalid": false,
    2  "error": "Invalid address format"
    3}
    

    PR Branch:

    0{
    1  "isvalid": false,
    2  "error_locations": [
    3    29
    4  ],
    5  "error": "Invalid checksum"
    6}
    
  99. laanwj commented at 12:17 pm on November 22, 2021: member
    Code review and manually tested ACK 88cc4810926e4f5af6757ee1b0eed61abda3d746 Thanks for adding all the bech32 comments.
  100. laanwj merged this on Nov 22, 2021
  101. laanwj closed this on Nov 22, 2021

  102. laanwj removed this from the "Blockers" column in a project

  103. sipa commented at 1:38 pm on November 22, 2021: member
    These are some review comments I had queued up. I’ll go through the rest still, and perhaps open a PR for any serious ones I still find.
  104. sidhujag referenced this in commit 6b1aa00ff1 on Nov 22, 2021
  105. meshcollider commented at 0:09 am on November 23, 2021: contributor
    Thanks for the review comments (and the merge)! I will open a follow-up PR today.
  106. meshcollider deleted the branch on Nov 23, 2021
  107. sidhujag referenced this in commit d0a559def5 on Nov 23, 2021
  108. laanwj referenced this in commit 22feb7fee9 on Dec 6, 2021
  109. sidhujag referenced this in commit 67b14190f6 on Dec 7, 2021
  110. DrahtBot locked this on Nov 25, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-08 22:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me