test: Support decoding segwit address in address_to_scriptpubkey() #27269

pull ismaelsadeeq wants to merge 2 commits into bitcoin:master from ismaelsadeeq:add-bech32-testnet-address-to-scriptpubkey changing 2 files +34 −1
  1. ismaelsadeeq commented at 4:52 PM on March 16, 2023: member

    rpc_scantxoutset.py sendtodestination only sends to legacy addresses and scriptPubkeys because wallet.py address_to_scriptpubkey does not support conversion of segwit address.

    This update enables address_to_scriptpubkey to support the conversion of testnet segwit addresses to scriptPubkeys.

    This change will enable rpc_scantxoutset.py ScantxoutsetTest to have more test coverage by adding more sendtodestination calls with bech32 and bech32m testnet addresses, then test the bech32 and bech32m derivation subsets UTXO amount in Test extended key derivation.

    I will add the test coverage in a subsequent Pull request.

  2. DrahtBot commented at 4:52 PM on March 16, 2023: contributor

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--021abf342d371248e50ceaed478a90ca-->

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK josibake, theStack, willcl-ark

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

  3. DrahtBot added the label Tests on Mar 16, 2023
  4. willcl-ark commented at 9:43 AM on March 17, 2023: member

    Concept ACK

    Thanks for working on this. Increasing the address to scriptpubkey decoding tests to include bech32 seems desirable. I think this would mean that we could remove the todo on L427 of wallet.py too?

    The way the equivalent base58 function test_base58encodedecode() is implemented is slightly different from the new test_bech32_decode():

    1. It checks two functions simultaneously, base58_to_bytes() and byte_to_base58(), whereas your first commit only adds bech32_to_bytes() and not the equivalent byte_to_bech32().
    2. The test vectors provided to the new test are raw address strings (and raw bytes) vs hex-encoded bytestrings in the base58 version.

    I think for completion it could be worth adding the equivalent byte_to_bech32() function and matching the test vector format used in the base58 checks, seeing as you are here making this change now.

    Woudl be curious what others think on this though...

  5. ismaelsadeeq commented at 12:10 PM on March 17, 2023: member

    Yes, I agree the todo can be removed since there are no other address types to add.

    • I think there is no need to add byte_to_bech32(), there is an existing encode_segwit_address() function that does the same thing.
    • I can use bech32_to_bytes() and encode_segwit_address() with hex-encoded byte string and version as test vectors in test_bech32_decode() just like in test_base58encodedecode().
  6. ismaelsadeeq force-pushed on Mar 17, 2023
  7. ismaelsadeeq commented at 12:41 PM on March 17, 2023: member

    Thanks for reviewing, I made the update.

  8. in test/functional/test_framework/address.py:24 in afd7ea6726 outdated
      19 | @@ -20,8 +20,8 @@
      20 |      sha256,
      21 |      taproot_construct,
      22 |  )
      23 | -from .segwit_addr import encode_segwit_address
      24 |  from .util import assert_equal
      25 | +from test_framework.segwit_addr import (encode_segwit_address, bech32_decode, convertbits)
    


    josibake commented at 9:31 AM on March 20, 2023:

    per the functional test README:

    - If more than one name from a module is needed, use lexicographically sorted multi-line imports
      in order to reduce the possibility of potential merge conflicts.
    

    so this should be

    from test_framework.segwit_addr import (
        bech32_decode,
        convertbits,
        encode_segwit_address,
    )
    

    josibake commented at 10:23 AM on March 20, 2023:

    also, if you end up taking my suggestion to use decode_segwit_address, you can remove the bech32_decode, convertbits imports


    ismaelsadeeq commented at 11:26 AM on March 20, 2023:

    Yes, I agree with that, also I will use decode_segwit_address. removing the imports.

  9. in test/functional/test_framework/address.py:165 in afd7ea6726 outdated
     157 | @@ -158,6 +158,11 @@ def check_script(script):
     158 |          return script
     159 |      assert False
     160 |  
     161 | +def bech32_to_bytes(address):
     162 | +    encoding, hrp, data = bech32_decode(address)
     163 | +    payload = bytes(convertbits(data[1:], 5, 8, False))
     164 | +    version = data[0]
     165 | +    return version, payload
    


    josibake commented at 10:19 AM on March 20, 2023:

    since you are already using the encode_segwit_address function, it might be better to also use the decode_segwit_address function here instead of duplicating the logic.

    something like:

        hrp = address[:2]
        version, payload = decode_segwit_address(hrp, address)
        return version, bytearray(payload)
    

    ismaelsadeeq commented at 11:29 AM on March 20, 2023:

    Thanks for pointing that out :100:

  10. in test/functional/test_framework/address.py:198 in afd7ea6726 outdated
     193 | +        check_bech32_decode(bytes.fromhex('39cf8ebd95134f431c39db0220770bd127f5dd3cc103c988b7dcd577ae34e354'), 1)
     194 | +        check_bech32_decode(bytes.fromhex('708244006d27c757f6f1fc6f853b6ec26268b727866f7ce632886e34eb5839a3'), 1)
     195 | +        check_bech32_decode(bytes.fromhex('616211ab00dffe0adcb6ce258d6d3fd8cbd901e2'), 0)
     196 | +        check_bech32_decode(bytes.fromhex('b6a7c98b482d7fb21c9fa8e65692a0890410ff22'), 0)
     197 | +        check_bech32_decode(bytes.fromhex('f0c2109cb1008cfa7b5a09cc56f7267cd8e50929'), 0)
     198 | +
    


    josibake commented at 10:19 AM on March 20, 2023:

    styling nit: remove the extra line at the end of the file

  11. in test/functional/test_framework/address.py:192 in afd7ea6726 outdated
     180 | @@ -176,3 +181,18 @@ def check_base58(data, version):
     181 |          check_base58(bytes.fromhex('0041c1eaf111802559bad61b60d62b1f897c63928a'), 0)
     182 |          check_base58(bytes.fromhex('000041c1eaf111802559bad61b60d62b1f897c63928a'), 0)
     183 |          check_base58(bytes.fromhex('00000041c1eaf111802559bad61b60d62b1f897c63928a'), 0)
     184 | +
    


    josibake commented at 10:22 AM on March 20, 2023:

    styling nit: two blank lines between functions, per PEP8 guidlines


    ismaelsadeeq commented at 11:47 AM on March 20, 2023:

    functions in the codebase have one blank line in between.


    josibake commented at 11:52 AM on March 20, 2023:

    This is because a lot of code was written before we introduced a style guide, or because it was missed during review. If something is not covered by the style guide, then we should try to be consistent with the surrounding code. If something is covered by the style guide, we should be consistent with the style guide, even if that is not consistent with the surrounding code. This helps nudge the codebase towards the style guide over time.


    ismaelsadeeq commented at 11:58 AM on March 20, 2023:

    Thanks


    willcl-ark commented at 11:34 AM on March 24, 2023:

    I may be mistaken but I believe that actually this should have 1 line gap, as it's a member function? But its OK, I wouldn't change and invalidate the ACKS you've got already! (I configured my editor to show PEP8 diagnostics so I can view them directly as I'm editing).

  12. in test/functional/test_framework/address.py:165 in afd7ea6726 outdated
     157 | @@ -158,6 +158,11 @@ def check_script(script):
     158 |          return script
     159 |      assert False
     160 |  
     161 | +def bech32_to_bytes(address):
    


    josibake commented at 10:22 AM on March 20, 2023:

    styling nit: two blank lines between functions, per PEP8 guidelines


    ismaelsadeeq commented at 11:28 AM on March 20, 2023:

    I think for consistency in the code base it's okay. There is only one blank line between functions in address.py


    josibake commented at 11:49 AM on March 20, 2023:

    The goal is to have the code base consistent with our style guide, not consistent with itself. This means any new code should adhere to the style guide and, if possible, old code should be formatted consistent with the style guide if it's already being touched as part of a PR.


    ismaelsadeeq commented at 11:52 AM on March 20, 2023:

    I understand.

  13. in test/functional/test_framework/wallet.py:54 in f0426fe104 outdated
      50 | @@ -50,6 +51,7 @@
      51 |      key_to_p2wpkh_script,
      52 |      keyhash_to_p2pkh_script,
      53 |      scripthash_to_p2sh_script,
      54 | +    program_to_witness_script
    


    josibake commented at 10:26 AM on March 20, 2023:

    styling nit: should be lexicographically sorted and add a comma at the end of each line (even if it is the last one)


    ismaelsadeeq commented at 11:36 AM on March 20, 2023:

    Thanks. :+1:

  14. josibake commented at 10:32 AM on March 20, 2023: member

    Concept ACK

    Thanks for tackling a TODO! I'm guessing there are other places in the functional tests where we would want coverage for more than just base58 addresses, so definitely a worthwhile change. Overall, looks good! I left a few suggestions and style comments in my review. Mainly:

    • I think for consistency you can use the decode_segwit_address function from segwit_addr.py. This avoids having the logic for decoding bech32 addresses in multiple places.
    • Even though a lot of these files don't adhere to PEP8 guidelines, we want to make sure any new code introduced does follow the guidelines. What I typically do is run flake8 on the file I plan to work on before making any changes and then run flake8 again after my changes to make sure all of my new changes follow PEP8 guidelines
  15. ismaelsadeeq force-pushed on Mar 20, 2023
  16. in test/functional/test_framework/address.py:25 in 479e3e7ef5 outdated
      19 | @@ -20,8 +20,11 @@
      20 |      sha256,
      21 |      taproot_construct,
      22 |  )
      23 | -from .segwit_addr import encode_segwit_address
      24 |  from .util import assert_equal
      25 | +from test_framework.segwit_addr import (
      26 | +    encode_segwit_address,
    


    josibake commented at 11:55 AM on March 20, 2023:

    style nit: these should be lexicographically sorted


    ismaelsadeeq commented at 11:59 AM on March 20, 2023:

    Okay :+1:

  17. ismaelsadeeq force-pushed on Mar 20, 2023
  18. josibake commented at 12:08 PM on March 20, 2023: member

    ACK https://github.com/bitcoin/bitcoin/commit/9aac917c6e7fae48ceae5fe786c412a583e46d3e

    Thanks for incorporating the feedback and addressing the style comments! Looks good.

  19. in test/functional/test_framework/wallet.py:419 in 9aac917c6e outdated
     415 | @@ -414,11 +416,13 @@ def getnewdestination(address_type='bech32m'):
     416 |  
     417 |  def address_to_scriptpubkey(address):
     418 |      """Converts a given address to the corresponding output script (scriptPubKey)."""
     419 | +    if address[:4] == 'tb1q': # testnet segwit
    


    josibake commented at 12:05 PM on March 21, 2023:

    thinking about this a bit more, I think we should have a more robust way of checking this. iirc, taproot would be tb1p on testnet, so this would not work for segwit v1 addresses.

    Maybe instead you could do something like version, payload = bech32_to_bytes and then check the version. If the version is None, then move on to payload, version = base58_to_byte etc.


    ismaelsadeeq commented at 11:31 AM on March 22, 2023:

    Thanks for the thorough review Josi. I will fix that. This means I have to check in bech32_to_bytes version is None or not so as not to call bytearray(payload) with a None value.

  20. ismaelsadeeq force-pushed on Mar 22, 2023
  21. in test/functional/test_framework/address.py:169 in becdc604b5 outdated
     161 | @@ -159,6 +162,14 @@ def check_script(script):
     162 |      assert False
     163 |  
     164 |  
     165 | +def bech32_to_bytes(address):
     166 | +    hrp = address[:2]
     167 | +    version, payload = decode_segwit_address(hrp,address)
     168 | +    if version is None:
     169 | +        return None, None
    


    josibake commented at 12:50 PM on March 22, 2023:

    there should be a space between function arguments and I'd return a tuple here to stay consistent with the other functions in the test_framework/segwit_addr.py file:

        version, payload = decode_segwit_address(hrp, address)
        if version is None:
            return (None, None)
    

    ismaelsadeeq commented at 1:02 PM on March 22, 2023:

    Thanks :+1:

  22. ismaelsadeeq force-pushed on Mar 22, 2023
  23. josibake commented at 1:07 PM on March 22, 2023: member
  24. fanquake requested review from theStack on Mar 22, 2023
  25. in test/functional/test_framework/address.py:166 in fabd16645f outdated
     161 | @@ -159,6 +162,14 @@ def check_script(script):
     162 |      assert False
     163 |  
     164 |  
     165 | +def bech32_to_bytes(address):
     166 | +    hrp = address[:2]
    


    theStack commented at 12:55 AM on March 23, 2023:

    This unfortunately doesn't lead to the correct result for regtest segwit addresses, as those have the four-character HRP 'bcrt'. To make it work for addresses on all networks, I'd suggest to:

    1. finding the HRP by extracting the string before the bech32 separator character '1' (either the find() or .split() method on strings should be handy for this)
    2. assert that the extracted HRP matches our supported segwit addresses (i.e. one of bc, tb, bcrt).

    ismaelsadeeq commented at 8:44 AM on March 23, 2023:

    Thank you for your valuable feedback.

    • I will use .split('1') on the address to extract the hrp, and also assert to verify if the hrp is valid as you noted.
    • I will have to modify address_to_scriptpubkey to check for base58 address first before calling bech32_to_bytes or it will assert false.

    I hope this addresses the issue


    josibake commented at 9:00 AM on March 23, 2023:

    I'm not sure I understand your second point, @ismaelsadeeq . The order shouldn't matter in address_to_scriptpubkey. It should be sufficient to properly extract the HRP and check that it is a bech32.


    ismaelsadeeq commented at 9:37 AM on March 23, 2023:

    bech32_to_bytes will be updated to handle all segwit addresses and check if hrp is valid.

    def bech32_to_bytes(address):
        hrp = address.split('1')[0]
        assert hrp in ['bc', 'tb', 'bcrt'], f"hrp should be 'bc', 'tb', or 'bcrt', not '{hrp}'"
        version, payload = decode_segwit_address(hrp, address)
        if version is None:
            return (None, None)
        return version, bytearray(payload)
    

    So calling address_to_scriptpubkey with a base58 address will assert to false, because the hrp is invalid.

    def address_to_scriptpubkey(address):
        """Converts a given address to the corresponding output script (scriptPubKey)."""
        version, payload = bech32_to_bytes(address)
        if version is not None:
            return program_to_witness_script(version, payload) # testnet segwit scriptpubkey
        payload, version = base58_to_byte(address)
        if version == 111:  # testnet pubkey hash
            return keyhash_to_p2pkh_script(payload)
        elif version == 196:  # testnet script hash
            return scripthash_to_p2sh_script(payload)
        # TODO: also support other address formats
        else:
            assert False
    

    address_to_scriptpubkey will be updated to check whether an address is in base58 before calling bech32_to_bytes

    def address_to_scriptpubkey(address):
        """Converts a given address to the corresponding output script (scriptPubKey)."""
        prefix = address[0]
        if prefix in ['m','n','2']:
            payload, version = base58_to_byte(address)
            if version == 111:  # testnet pubkey hash
                return keyhash_to_p2pkh_script(payload)
            elif version == 196:  # testnet script hash
                return scripthash_to_p2sh_script(payload)
    
        version, payload = bech32_to_bytes(address)
        return program_to_witness_script(version, payload) #segwit scriptpubkey
    

    josibake commented at 9:43 AM on March 23, 2023:

    ah, i see. I think id approach this in a different way: in bech32_to_bytes, if the HRP doesn't match a bech32 HRP, return (None, None).

    that way, address_to_scriptpubkey can continue and try base58 next. if the address is indeed malformed (meaning not bech32 or base58), eventually address_to_scriptpubkey will return false. @theStack I think this is more in line with what you meant by assert?


    ismaelsadeeq commented at 10:44 AM on March 23, 2023:

    That will work also, thanks @josibake


    theStack commented at 12:19 PM on March 23, 2023:

    @josibake: yes, agree that considering on what we need in address_to_scriptpubkey returning (None, None) is better than asserting. LGTM!

  26. theStack commented at 1:02 AM on March 23, 2023: contributor

    Concept ACK

    Thanks for tackling this, and warm welcome as a new contributor! With this change, we don't have to issue getaddressinfo or validateaddress RPCs anymore to convert from address to scriptPubKey.

    To ensure that the helper address_to_scriptpubkey works as expected, I think it would be good to call it least once with a segwit address in a functional test. One possible candidate is e.g. replacing the RPC call in this line: https://github.com/bitcoin/bitcoin/blob/4c6b7d330a4e80460dcce19b1a0a47d77a0b99ea/test/functional/wallet_fast_rescan.py#L61 This reveals that the added decoding support currently doesn't work as expected, see review comment below.

  27. test: test_bech32_decode in address.py
    Adds bech32_to_bytes() which can decode a bech32 address and return the
    version as an `int` and the payload in bytes.
    
    bech32_to_bytes() is used by the test_bech32_decode unit test to test
    decoding of segwit addresses.
    aac8793c7a
  28. test: add bech32 decoding support to address_to_scriptpubkey()
    This permits functional tests to decode bech32 addresses to scriptpubkeys.
    d178082996
  29. ismaelsadeeq force-pushed on Mar 23, 2023
  30. josibake commented at 11:23 AM on March 23, 2023: member

    ACK https://github.com/bitcoin/bitcoin/pull/27269/commits/d178082996dc3000f42816f89afcf3fa4d31e159

    I tested this by making sure the unittests run and also modifying test/functional/wallet_fast_rescan.py to use address_to_scriptpubkey, per @theStack 's suggestion. Works great!

    I'd love to see a follow-up PR targeting areas where we are a) only testing base58 addresses but should be testing bech32 as well and b) anywhere we are making a getaddressinfo RPC call to get the spk to instead use address_to_scriptpubkey

  31. fanquake requested review from theStack on Mar 23, 2023
  32. theStack approved
  33. theStack commented at 12:20 PM on March 23, 2023: contributor

    ACK d178082996dc3000f42816f89afcf3fa4d31e159 ✔️

    I'd love to see a follow-up PR targeting areas where we are a) only testing base58 addresses but should be testing bech32 as well and b) anywhere we are making a getaddressinfo RPC call to get the spk to instead use address_to_scriptpubkey

    👍 If anyone wants to work on this, feel free to tag me as reviewer.

  34. josibake commented at 10:35 AM on March 24, 2023: member

    review-beg: @willcl-ark

  35. in test/functional/test_framework/address.py:167 in aac8793c7a outdated
     161 | @@ -159,6 +162,16 @@ def check_script(script):
     162 |      assert False
     163 |  
     164 |  
     165 | +def bech32_to_bytes(address):
     166 | +    hrp = address.split('1')[0]
     167 | +    if hrp not in ['bc', 'tb', 'bcrt']:
    


    willcl-ark commented at 11:29 AM on March 24, 2023:

    I think this is totally fine as is here for now, but in the future we may want to put these all in a list/enum

  36. in test/functional/test_framework/address.py:196 in aac8793c7a outdated
     188 | @@ -176,3 +189,18 @@ def check_base58(data, version):
     189 |          check_base58(bytes.fromhex('0041c1eaf111802559bad61b60d62b1f897c63928a'), 0)
     190 |          check_base58(bytes.fromhex('000041c1eaf111802559bad61b60d62b1f897c63928a'), 0)
     191 |          check_base58(bytes.fromhex('00000041c1eaf111802559bad61b60d62b1f897c63928a'), 0)
     192 | +
     193 | +
     194 | +    def test_bech32_decode(self):
     195 | +        def check_bech32_decode(payload, version):
     196 | +            hrp = "tb"
    


    willcl-ark commented at 11:56 AM on March 24, 2023:

    Instead of hardcoding the hrp here, why not have check_bech32_decode() take hrp as a parameter, then we can add bc1 test vectors instead of being limited to tb1?

    edit: I don't consider this a blocker or needing reponse before merging, but will be nice to consider in the future.

  37. willcl-ark approved
  38. willcl-ark commented at 11:58 AM on March 24, 2023: member

    ACK d17808299

    I also tested two of the BIP173 Test Vectors:

    check_bech32_decode(bytes.fromhex('0020000000c4a5cad46221b2a187905e5266362b99d5e91c6ce24d165dab93e86433'), 1)
    check_bech32_decode(bytes.fromhex('00201863143c14c5166804bd19203356da136c985678cd4d27a1b8c6329604903262'), 1)
    

    ...which also passed. In doing so I noticed the hardcoding of human readable part hrp in address.py#L196 was limiting which test vectors I could use, and I think un-hardcoding this would be great to do as a followup.

  39. fanquake merged this on Mar 24, 2023
  40. fanquake closed this on Mar 24, 2023

  41. sidhujag referenced this in commit a9716df3e4 on Mar 24, 2023
  42. fanquake referenced this in commit c0311b1dda on Mar 29, 2023
  43. bitcoin locked this on Mar 23, 2024
  44. ismaelsadeeq deleted the branch on Jun 27, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-22 09:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me