BIP93: terminology, typo, and phrasing fixups #2052

BenWestgate wants to merge 5 commits into bitcoin:master from BenWestgate:bip93-fix-typos-and-terms, changing 1 file (+128 −65)
  1. BenWestgate commented at 2:37 am on December 10, 2025: none

    Various terminology fixes, typo fixes and phrase fixups from #2040 and #2023. @apoelstra said:

    In the interest of moving forward I would kinda like Ben to make a new PR with the non-HRP changes, which it seems like everyone agrees with and would reduce the size of the diff of this one.

    #2040 (comment)

    This is that PR.

  2. Change master seed to secret in most places, ''t'' to ''k'' and other term fixes 4b6ffb554b
  3. Replace deleted linebreak, delete vestigal oxford commas aa06616060
  4. in bip-0093.mediawiki:22 in aa06616060 outdated
    22-Secret data can be split into up to 31 shares.
    23-A minimum threshold of shares, which can be between 1 and 9, is needed to recover the secret, whereas without sufficient shares, no information about the secret is recoverable.
    24+This document proposes a checksummed base32 format, "codex32", and a standard for backing up and restoring the master seed of a
    25+[https://github.com/bitcoin/bips/blob/master/bip-0032.mediawiki BIP-0032] hierarchical deterministic wallet using it.
    26+It includes an encoding format, a BCH error-correcting checksum, and optional Shamir's secret sharing algorithms for share generation and secret recovery.
    27+Secret data can be encoded directly, or split into up to 31 shares.
    


    BenWestgate commented at 1:42 pm on December 10, 2025:
    I moved SSS down a sentence and called it optional so users don’t assume it’s required.

    apoelstra commented at 3:31 pm on December 10, 2025:
    Yeah, this made me double-take but we were already trying to do that with our misleading “threshold from 1 to 9” description. Your text is better.
  5. in bip-0093.mediawiki:23 in aa06616060 outdated
    23-A minimum threshold of shares, which can be between 1 and 9, is needed to recover the secret, whereas without sufficient shares, no information about the secret is recoverable.
    24+This document proposes a checksummed base32 format, "codex32", and a standard for backing up and restoring the master seed of a
    25+[https://github.com/bitcoin/bips/blob/master/bip-0032.mediawiki BIP-0032] hierarchical deterministic wallet using it.
    26+It includes an encoding format, a BCH error-correcting checksum, and optional Shamir's secret sharing algorithms for share generation and secret recovery.
    27+Secret data can be encoded directly, or split into up to 31 shares.
    28+A minimum threshold of shares, which can be between 2 and 9, is needed to recover the secret, whereas without sufficient shares, no information about the secret is recoverable.
    


    BenWestgate commented at 1:42 pm on December 10, 2025:
    closes #2023
  6. in bip-0093.mediawiki:83 in aa06616060 outdated
    78@@ -75,6 +79,8 @@ It reuses the base-32 character set from BIP-0173, and consists of:
    79 ** A checksum which consists of 13 bech32 characters as described below.
    80 
    81 As with bech32 strings, a codex32 string MUST be entirely uppercase or entirely lowercase.
    82+Note that per BIP-0173, the lowercase form is used when determining a character's value for checksum purposes.
    83+In particular, given an all uppercase codex32 string, we still use lowercase <code>ms</code> as the human-readable part during checksum construction.
    


    BenWestgate commented at 1:43 pm on December 10, 2025:
    moved these sentences from test vectors to definition
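
A minimal sketch of the case rule in the two added sentences, assuming a caller wants to reject mixed case and then work with the lowercase form for checksum purposes. `normalize_case` is a hypothetical helper name for illustration; it is not part of the BIP's reference code.

```python
def normalize_case(codex: str) -> str:
    """Reject mixed-case input and return the lowercase form used for checksumming."""
    # A string is acceptable only if it is already all-lowercase or all-uppercase.
    if codex != codex.lower() and codex != codex.upper():
        raise ValueError("codex32 strings must not mix upper and lower case")
    # Per BIP-0173, character values (and the "ms" prefix) are taken in lowercase.
    return codex.lower()
```

With this, an all-uppercase string such as the ones in the test vectors is checksummed exactly as its lowercase equivalent would be.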
  7. in bip-0093.mediawiki:174 in aa06616060 outdated
    172+
    173+def ms32_encode(data):
    174+    combined = data + ms32_create_checksum(data)
    175+    return "ms" + "1" + ''.join([CHARSET[d] for d in combined])
    176+
    177+def ms32_decode(codex):
    


    BenWestgate commented at 1:45 pm on December 10, 2025:
    This function would save implementers a few minutes; it summarizes the validity rules (except for incomplete groups) in code form.

    apoelstra commented at 3:32 pm on December 10, 2025:
    Where did it come from? It looks correct to me but I didn’t look too carefully at it.

    BenWestgate commented at 4:20 pm on December 10, 2025:
    Me. It’s from my local copy of my python-codex32 package, and it passes the test vectors. I’ll send CI when I publish it. We could reject incomplete groups here rather than wait for convertbits or your sanity_check. The library code will differ slightly, though: it raises descriptive error classes, like your Rust code does, instead of returning None.

    apoelstra commented at 4:27 pm on December 10, 2025:

    In my view it’s okay for the reference code to be more accepting than a production implementation.

    But I’d also be happy tightening it up. Let’s defer to a later PR though since there’s enough happening here as-is.


    BenWestgate commented at 5:04 pm on December 10, 2025:
    yes, we never know if a base32 native secret wants to use this specification in which case incomplete groups would not apply to it.

    BenWestgate commented at 7:11 am on December 12, 2025:

    It looks correct to me but I didn’t look too carefully at it.

    https://github.com/bitcoin/bips/pull/2052/changes/6763349720f52cbb09cd5e706887328ab668fc67

    Fixed a bug and made the validity condition cleaner.

  8. in bip-0093.mediawiki:201 in aa06616060 outdated
    203 * All shares have the same threshold value, the same identifier, and the same length.
    204 * All of the share index values are distinct.
    205-* The number of codex32 shares is exactly equal to the (common) threshold value.
    206-
    207-If all the above conditions are satisfied, the <code>ms32_recover</code> function will return a codex32 secret when its argument is the list of codex32 shares with each share represented as a list of integers representing the characters converted using the bech32 character table from BIP-0173.
    208+* The number of shares is exactly equal to the (common) threshold value.
    


    BenWestgate commented at 1:47 pm on December 10, 2025:
    throughout the text I reduced the use of “codex32” as an adjective to reduce jargon and avoid UI designers assuming they can’t just say “share” or “secret”.
  9. in bip-0093.mediawiki:245 in aa06616060 outdated
    243 </source>
    244 
    245 ===Generating Shares===
    246 
    247-If we already have ''t'' valid codex32 strings such that:
    248+If we already have ''k'' valid codex32 strings such that:
    


    BenWestgate commented at 1:49 pm on December 10, 2025:
    I prefer k and that’s what the SSS wikipedia article uses but t is also correct. The codex32 book and test vectors used k so that’s why I got rid of t for consistency.
  10. in bip-0093.mediawiki:293 in aa06616060 outdated
    314 
    315-The codex32 secret and the ''t''-1 codex32 shares form a set of ''t'' valid codex32 strings from which additional shares can be derived as described above.
    316+The codex32 secret and the ''k''-1 codex32 shares form a set of ''k'' valid initial codex32 strings from which additional shares can be derived as described above.
    317 
    318-===Long codex32 Strings===
    319+===Long codex32===
    


    BenWestgate commented at 1:51 pm on December 10, 2025:
    I renamed the format “Long codex32” to match “codex32”; it felt weird to make “strings” part of the proper noun. I also updated test vector 5 to match.
  11. in bip-0093.mediawiki:359 in aa06616060
    356+*** We do not define how to choose the identifier, beyond noting that it SHOULD be distinct for every master seed and share set the user may need to disambiguate.
    357+** The share index "s".
    358+** A conversion of the 16-to-64-byte BIP-0032 HD master seed to bech32:
    359+*** Start with the bits of the master seed, most significant bit per byte first.
    360+*** Re-arrange those bits into groups of 5, and pad with arbitrary bits at the end if needed.
    361+*** Translate those bits to characters using the bech32 character table from BIP-0173.
    


    BenWestgate commented at 1:55 pm on December 10, 2025:

    This duplicates L281-285 in “For an Existing Secret”. We might want to cite this section and delete the lines up there, or vice versa.

    Also looks like I forgot the checksum here?


    apoelstra commented at 3:40 pm on December 10, 2025:
    Yeah, I vote we delete the above section and link down here. In “for an existing secret” I think we want to add a line about length. Specifically, if your seed data is 46 bytes or fewer, use codex32; otherwise use long codex32 (ref).

    BenWestgate commented at 4:41 pm on December 10, 2025:

    We could also move this section after the Unshared Secret section.

    Everyone needs the master seed standard to import/export seeds but not all users want the SSS stuff. So “Existing secret” is lower priority. If we define master seed encoding first L281-285 can be replaced with a link.

    There’s a lot of overlap between “Fresh” and “Existing” and it risks reader auto-pilot.

    For example my reviewer on BIP85 app completely missed that initial share indices were supposed to be chosen alphabetically. (There is not supposed to be a “z” initial share for example.)

    A refactor to de-duplicate is possible but I haven’t thought about it.


    apoelstra commented at 5:08 pm on December 10, 2025:
    I’m fine with the existing duplicated text. We can refactor in a followup if you want. Especially if the goal is to rearrange text in order of priority.

    BenWestgate commented at 6:48 pm on December 10, 2025:

    I moved the new “Master seed format” to be below “Unshared Secret”. The “direct encoding of … master seed” sentence could hop down from “Unshared” leaving it HRP pure (no diff when #2040 is rebased on this.)

    Looks better to me. I left “existing secret” as in https://github.com/bitcoin/bips/commit/aa06616060d1af376092664d2cb6c728a21067e9.


    BenWestgate commented at 7:11 pm on December 10, 2025:

    In “for an existing secret” I think we want to add a line about length. Specifically, if your seed data is 46 bytes or fewer, use codex32; otherwise use long codex32 (ref).

    It says:

    Generate a valid checksum in accordance with the Checksum section

    And the Checksum section says:

    def ms32_create_checksum(data):
        if len(data) > 80:                       # See Long codex32 Strings
            return ms32_create_long_checksum(data)
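
A rough sketch of why the `len(data) > 80` check lines up with the "46 bytes or fewer" rule mentioned earlier in this thread. It assumes `data` holds six header characters (threshold, four identifier characters, share index) followed by the payload; the names `HEADER_CHARS`, `data_length`, and `uses_long_checksum` are made up for illustration.

```python
HEADER_CHARS = 6  # assumed layout: threshold (1) + identifier (4) + share index (1)

def data_length(seed_bytes: int) -> int:
    """Number of 5-bit characters checksummed for a seed of the given byte length."""
    payload_chars = (seed_bytes * 8 + 4) // 5  # ceiling of bits / 5, padding included
    return HEADER_CHARS + payload_chars

def uses_long_checksum(seed_bytes: int) -> bool:
    """Mirror the quoted threshold: more than 80 data characters selects the long checksum."""
    return data_length(seed_bytes) > 80

# 46 bytes -> 74 payload characters -> 80 data characters (regular checksum).
# 47 bytes -> 76 payload characters -> 82 data characters (long checksum).
```

Under these assumptions, a 16-byte seed yields a 48-character string and a 64-byte seed a 127-character string once the `ms1` prefix and checksum are added, matching the length bounds in the `ms32_decode` hunk.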
    
  12. in bip-0093.mediawiki:472 in aa06616060 outdated
    470 Share with index <code>C</code>: <code>MS12NAMECACDEFGHJKLMNPQRSTUVWXYZ023FTR2GDZMPY6PN</code>
    471 
    472 * Derived share with index <code>D</code>: <code>MS12NAMEDLL4F8JLH4E5VDVULDLFXU2JHDNLSM97XVENRXEG</code>
    473-* Secret share with index <code>S</code>: <code>MS12NAMES6XQGUZTTXKEQNJSJZV4JV3NZ5K3KWGSPHUH6EVW</code>
    474-* Master secret (hex): <code>d1808e096b35b209ca12132b264662a5</code>
    475+* Recovered secret seed with index <code>S</code>: <code>MS12NAMES6XQGUZTTXKEQNJSJZV4JV3NZ5K3KWGSPHUH6EVW</code>
    


    BenWestgate commented at 2:04 pm on December 10, 2025:

    If the book does away with “secret seed” we may drop it here and stick to “secret” or “codex32 secret”. See: https://github.com/BlockstreamResearch/codex32/issues/72 This is the best terminology for now, plus “secret seed” books are in print. I’m fine with using this term in BIP93.

    For distinction, I’m deliberately not calling interpolated secrets “codex32-encoded master seeds” because they aren’t encoded from seeds, they’re decoded into seeds.


    apoelstra commented at 3:41 pm on December 10, 2025:
    This seems reasonable to me.
  13. in bip-0093.mediawiki:485 in aa06616060 outdated
    543+identifier (bech32): <code>0C8V</code>
    544+
    545+payload (bech32): <code>M32ZXFGUHPCHTLUPZRY9X8GF2TVDW0S3JN54KHCE6MUA7LQPZYGSFJD6AN074RXVCEMLH8WU3TK925ACDEFGHJKLMNPQRSTUVWXY06F</code>
    546 
    547-* Secret share with index <code>S</code>: <code>MS100C8VSM32ZXFGUHPCHTLUPZRY9X8GF2TVDW0S3JN54KHCE6MUA7LQPZYGSFJD6AN074RXVCEMLH8WU3TK925ACDEFGHJKLMNPQRSTUVWXY06FHPV80UNDVARHRAK</code>
    548-* Master secret (hex): <code>dc5423251cb87175ff8110c8531d0952d8d73e1194e95b5f19d6f9df7c01111104c9baecdfea8cccc677fb9ddc8aec5553b86e528bcadfdcc201c17c638c47e9</code>
    


    BenWestgate commented at 2:13 pm on December 10, 2025:
    The example text (assembling from random characters and appending a checksum) did not match the data shown. So I did this piecewise assembly to match the text exactly, as per the discussion in #2040 (review)

    apoelstra commented at 3:44 pm on December 10, 2025:
    This looks fine to me. I don’t want to hang up this PR on terminology but I find it a little bit confusing that we use the terms “master seed” and “secret seed” to refer to the same data with different encodings. I wonder if we should change “Master seed (hex)” to “Secret seed (hex)” and change “Secret seed” to “Secret seed (codex32)”.

    BenWestgate commented at 4:55 pm on December 10, 2025:

    Master seeds aren’t bijective with their codex32-encodings due to threshold, identifier and arbitrary padding so I wouldn’t use the same word.

    As with up here I’m okay with just calling it “secret” or “codex32 secret”, this case is not a “codex32-encoded” existing seed:

    This example shows generating a new 512-bit master seed using “random” codex32 characters and appending a checksum.

    And codex32 characters should be bech32 characters?

    I see you’ve put (codex32) in parentheses. Is that the preference when giving data, as opposed to (bech32) as master vector 1 used:

    codex32 secret (bech32):

    “codex32 secret (bech32)” or “Secret (bech32)” looks better than “Secret seed (codex32)”


    apoelstra commented at 5:10 pm on December 10, 2025:
    I’m happy with “codex32 secret (bech32)”. It maybe implies that the checksum is the bech32 checksum but I think it’s clear from context what’s meant. And it captures both that the character set is bech32 vs hex, and that the “codex32 secret” has more data than the bare seed.

    BenWestgate commented at 6:58 pm on December 10, 2025:

    Nowhere do we label a result string (bech32). (hex) seems to be used everywhere because it looks more like bech32 than the reverse.

    Checksum is 150% too long to be Bech32.

    So I have Vector 5 like this:

    k value (bech32): 0

    identifier (bech32): 0C8V

    payload (bech32): M32ZXFGUHPCHTLUPZRY9X8GF2TVDW0S3JN54KHCE6MUA7LQPZYGSFJD6AN074RXVCEMLH8WU3TK925ACDEFGHJKLMNPQRSTUVWXY06F

    • checksum: HPV80UNDVARHRAK
    • codex32 secret: MS100C8VSM32ZXFGUHPCHTLUPZRY9X8GF2TVDW0S3JN54KHCE6MUA7LQPZYGSFJD6AN074RXVCEMLH8WU3TK925ACDEFGHJKLMNPQRSTUVWXY06FHPV80UNDVARHRAK
    • Master seed (hex): dc5423251cb87175ff8110c8531d0952d8d73e1194e95b5f19d6f9df7c01111104c9baecdfea8cccc677fb9ddc8aec5553b86e528bcadfdcc201c17c638c47e9
    • master node xprv: xprv9s21ZrQH143K4UYT4rP3TZVKKbmRVmfRqTx9mG2xCy2JYipZbkLV8rwvBXsUbEv9KQiUD7oED1Wyi9evZzUn2rqK9skRgPkNaAzyw3YrpJN

    Do we need to change Vector 2 as well or are you okay there:

    • Derived share with index D: MS12NAMEDLL4F8JLH4E5VDVULDLFXU2JHDNLSM97XVENRXEG
    • Recovered secret seed with index S: MS12NAMES6XQGUZTTXKEQNJSJZV4JV3NZ5K3KWGSPHUH6EVW
    • Master seed (hex): d1808e096b35b209ca12132b264662a5

    apoelstra commented at 8:10 pm on December 10, 2025:
    Vector 2 seems fine to me.
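
To illustrate the point in this thread that master seeds are not bijective with their encodings, here is a small sketch that varies only the padding bits. It uses the vector 2 master seed quoted in the thread; `payload_with_padding` is a hypothetical helper, and the threshold, identifier, and checksum are deliberately left out.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"  # bech32 character table from BIP-0173

def payload_with_padding(seed: bytes, pad: int) -> str:
    """Encode seed as bech32 characters, filling the trailing pad bits with `pad`."""
    pad_bits = -(len(seed) * 8) % 5          # bits needed to reach a multiple of 5
    if not 0 <= pad < (1 << pad_bits):
        raise ValueError("pad does not fit in the available padding bits")
    value = (int.from_bytes(seed, "big") << pad_bits) | pad
    n_chars = (len(seed) * 8 + pad_bits) // 5
    # Read the value five bits at a time, most significant group first.
    return "".join(CHARSET[(value >> 5 * (n_chars - 1 - i)) & 31] for i in range(n_chars))

seed = bytes.fromhex("d1808e096b35b209ca12132b264662a5")
# A 128-bit seed leaves 2 padding bits, so four payloads map to the same seed.
variants = [payload_with_padding(seed, pad) for pad in range(4)]
```

The four variants share their first 25 characters and differ only in the last one. The payload of the recovered secret in vector 2 corresponds to padding bits `10`, not zero padding.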
  14. BenWestgate commented at 2:16 pm on December 10, 2025: none
    Short rationale for changes that may not be immediately obvious.
  15. in bip-0093.mediawiki:62 in aa06616060 outdated
    58@@ -59,6 +59,10 @@ However, BIP-0039 has no error-correcting ability, cannot sensibly be extended t
    59 
    60 ==Specification==
    61 
    62+We first describe the general checksummed base32<ref>'''Why use base32 at all?''' The lack of mixed case makes it more
    


    apoelstra commented at 3:28 pm on December 10, 2025:

    In 4b6ffb554be33c054b4028b75794e7d667ee5dc9:

    I think you’re missing a . after the word base32.


    BenWestgate commented at 5:01 pm on December 10, 2025:

    After base32 is a reference link that came along from BIP-0173. Our reasons for base32 slightly differ from BIP-0173’s so I edited the ref but the regular text reads:

    We first describe the general checksummed base32 format called ‘‘codex32’’ and then define a BIP-0032 master seed encoding using it.


    apoelstra commented at 5:11 pm on December 10, 2025:

    Never mind, I was misreading the markdown and didn’t understand when the <ref> was closed.

    For some reason I don’t have a “resolve” button here, but your text is correct.


    BenWestgate commented at 7:02 pm on December 10, 2025:
    I’ll click it. I’ll leave the other resolved convos open in case someone else wants to review this week.
  16. in bip-0093.mediawiki:131 in aa06616060
    124@@ -119,6 +125,10 @@ def ms32_create_checksum(data):
    125     polymod = ms32_polymod(values + [0] * 13) ^ MS32_CONST
    126     return [(polymod >> 5 * (12 - i)) & 31 for i in range(13)]
    127 </source>
    128+This implements a [https://en.wikipedia.org/wiki/BCH_code BCH code] that
    129+guarantees detection of '''any error affecting at most 8 characters'''
    130+and has less than a 3 in 10<sup>20</sup> chance of failing to detect more
    131+errors.
    


    apoelstra commented at 3:29 pm on December 10, 2025:

    In 4b6ffb554be33c054b4028b75794e7d667ee5dc9:

    We should say “random errors” or even “uniformly random errors”. It’s important that it’s possible (and not too hard) to construct adversarial error patterns of 9 or more characters that still pass the checksum.

  17. in bip-0093.mediawiki:190 in aa06616060
    188+    if not (hrp == "ms" and codex[pos+1].isdigit()) or codex[pos+1] == "0" and codex[pos+6] != "s":
    189+        return None
    190+    data = [CHARSET.index(x) for x in codex[pos+1:]]
    191+    if not ms32_verify_checksum(data):
    192+        return None
    193+    return data[:-13 if len(data) < 94 else -15]  # See Long codex32 Strings</source>
    


    apoelstra commented at 3:33 pm on December 10, 2025:

    In 4b6ffb554be33c054b4028b75794e7d667ee5dc9

    Can you move </source> to its own line?

  18. in bip-0093.mediawiki:330 in aa06616060
    323@@ -286,19 +324,40 @@ def ms32_create_long_checksum(data):
    324     polymod = ms32_long_polymod(values + [0] * 15) ^ MS32_LONG_CONST
    325     return [(polymod >> 5 * (14 - i)) & 31 for i in range(15)]
    326 </source>
    327+This implements a [https://en.wikipedia.org/wiki/BCH_code BCH code] that
    328+guarantees detection of '''any error affecting at most 8 characters'''
    329+and has less than a 3 in 10<sup>23</sup> chance of failing to detect more
    330+errors.
    


    apoelstra commented at 3:36 pm on December 10, 2025:

    In 4b6ffb554be33c054b4028b75794e7d667ee5dc9:

    Same as above (should change “errors” to “random errors” or “uniformly random errors”)
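
One way to sanity-check the two quoted figures, assuming a string hit by uniformly random errors passes an n-character checksum with probability about 2^(-5n). That model is an assumption made here for illustration; the thread does not state how the figures were derived.

```python
BITS_PER_CHAR = 5

# Regular checksum: 13 characters; long checksum: 15 characters.
short_miss = 2.0 ** -(BITS_PER_CHAR * 13)  # about 2.7e-20
long_miss = 2.0 ** -(BITS_PER_CHAR * 15)   # about 2.6e-23

print(f"{short_miss:.2e} {long_miss:.2e}")
```

Both values sit just under the "3 in 10^20" and "3 in 10^23" bounds in the diff. The estimate only makes sense for random errors, which is why the wording change matters: it says nothing about adversarially chosen error patterns.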

  19. apoelstra commented at 3:45 pm on December 10, 2025: contributor
    Done reviewing aa06616060d1af376092664d2cb6c728a21067e9. Looks great, thanks! My comments are all just nits (except the “random errors” thing which I think is important).
  20. jonatack added the label Proposed BIP modification on Dec 10, 2025
  21. jonatack added the label Pending acceptance on Dec 10, 2025
  22. errors->random errors, fix newlines, vector5: secret seed->codex32 secret
    reduced the heading level of checksum and error correction to make the table of contents easier to parse.
    
    Moved Master seed Encoding to be below Unshared Secret.
    7b83e6f2d4
  23. in bip-0093.mediawiki:87 in 7b83e6f2d4 outdated
    83+In particular, given an all uppercase codex32 string, we still use lowercase <code>ms</code> as the human-readable part during checksum construction.
    84 For presentation, lowercase is usually preferable, but uppercase SHOULD be used for handwritten codex32 strings.
    85 If a codex32 string is encoded in a QR code, it SHOULD use the uppercase form, as this is encoded more compactly.
    86 
    87-===Checksum===
    88+====Checksum====
    


    BenWestgate commented at 7:47 pm on December 10, 2025:
    ToC was getting hard on my eyes with so many H3 so I indented Checksum and Error Correction so there’s only 4 H3 in a row afterwards: Unshared Secret, Master seed format, Recover Secret, Generate Shares.
  24. in bip-0093.mediawiki:208 in 7b83e6f2d4 outdated
    206+** The share index "s".
    207+** A conversion of the 16-to-64-byte BIP-0032 HD master seed to bech32:
    208+*** Start with the bits of the master seed, most significant bit per byte first.
    209+*** Re-arrange those bits into groups of 5, and pad with arbitrary bits at the end if needed.
    210+*** Translate those bits to characters using the bech32 character table from BIP-0173.
    211+** A valid checksum in accordance with the Checksum section.
    


    BenWestgate commented at 7:48 pm on December 10, 2025:
    Checksum was added. The rest is unchanged from its old location.
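
The three conversion bullets in this hunk can be sketched as follows. This is an illustrative reading, not the BIP's reference code: `seed_to_payload` and `payload_to_seed` are hypothetical names, and zero padding is just one of the arbitrary choices the text allows.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"  # bech32 character table from BIP-0173

def seed_to_payload(seed: bytes) -> str:
    """Bits MSB-first, regrouped into 5-bit chunks, zero-padded at the end."""
    bits = "".join(f"{byte:08b}" for byte in seed)
    bits += "0" * (-len(bits) % 5)  # padding is arbitrary; zeros chosen here
    return "".join(CHARSET[int(bits[i:i + 5], 2)] for i in range(0, len(bits), 5))

def payload_to_seed(payload: str) -> bytes:
    """Inverse direction: drop the trailing padding bits and regroup into bytes."""
    bits = "".join(f"{CHARSET.index(c):05b}" for c in payload.lower())
    usable = len(bits) - len(bits) % 8
    return bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))

seed = bytes.fromhex("d1808e096b35b209ca12132b264662a5")  # master seed from test vector 2
payload = seed_to_payload(seed)
```

Decoding the 26 payload characters of the vector 2 secret (`6XQGUZTTXKEQNJSJZV4JV3NZ5K`) with `payload_to_seed` gives this seed back, even though its final character differs from the zero-padded encoding.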
  25. BIP93: change codex32 characters to bech32 characters d926cc9076
  26. Fix hrp length off by 1 bug. Refactor validity condition to read easier. 6763349720
  27. in bip-0093.mediawiki:536 in d926cc9076 outdated
    532@@ -476,13 +533,20 @@ Note that the choice to append four zero bits was arbitrary, and any of the foll
    533 
    534 ===Test vector 5===
    535 
    536-This example shows generating a new 512-bit master seed using "random" codex32 characters and appending a checksum.
    537+This example shows generating a new 512-bit master seed using "random" bech32 characters and appending a checksum.
    


    BenWestgate commented at 7:58 pm on December 10, 2025:
    caught this. There’s no such thing as “codex32” characters because we use the bech32 character set.
  28. in bip-0093.mediawiki:184 in 6763349720
    180+    if pos < 2 or not (48 <= len(codex) <= 127):
    181         return None
    182     if not all(x in CHARSET for x in codex[pos+1:]):
    183         return None
    184-    hrp = codex[:pos]
    185-    if not (hrp == "ms" and codex[pos+1].isdigit()) or codex[pos+1] == "0" and codex[pos+6] != "s":
    


    BenWestgate commented at 7:18 am on December 12, 2025:
    Easier to read without parentheses.
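
For readers puzzling over the removed line: Python parses `not A or B and C` as `(not A) or (B and C)`. The sketch below restates the removed condition as separate checks, under the assumption that `pos` is the index of the `1` separator. The function names are hypothetical, and the replacement condition from commit 6763349720 is not shown in this excerpt, so this is not a reconstruction of it.

```python
def header_invalid_removed(codex: str, pos: int) -> bool:
    """The removed condition, copied as written in the diff."""
    hrp = codex[:pos]
    return not (hrp == "ms" and codex[pos+1].isdigit()) or codex[pos+1] == "0" and codex[pos+6] != "s"

def header_invalid_explicit(codex: str, pos: int) -> bool:
    """The same logic, one rule per branch."""
    if codex[:pos] != "ms":               # human-readable part must be "ms"
        return True
    if not codex[pos + 1].isdigit():      # threshold character must be a digit
        return True
    # A threshold of "0" is only accepted together with share index "s".
    return codex[pos + 1] == "0" and codex[pos + 6] != "s"

samples = [
    "ms12namedll4f8jlh4e5vdvuldlfxu2jhdnlsm97xvenrxeg",
    "ms10testsqqqqqqqq",
    "ms10testaqqqqqqqq",
    "mt12namesqqqqqqqq",
    "ms1qtestsqqqqqqqq",
]
results = [header_invalid_explicit(s, 2) for s in samples]
```

Both functions agree on every sample; only the third, fourth, and fifth strings are flagged as invalid.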

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bips. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-12-14 08:10 UTC