BIP85: Add Codex32 as application 93' #1958

BenWestgate wants to merge 2 commits into bitcoin:master from BenWestgate:codex32, changing 1 file (+239 −2)
  1. BenWestgate commented at 11:40 pm on September 7, 2025: none

    This allows wallets to derive codex32 secrets and codex32 shares from BIP-0032 master keys.

    Summary of changes

    Rationale

    • Mirrors the existing BIP-85 application for BIP-39.
    • Codex32 offers error correction, hand verification, identifiers, and secret sharing improvements vs BIP-39.
    • Deterministic generation produces auditable backups by avoiding reliance on local RNG, helping users who distrust device entropy.

    Specification

    • Adds Application 93’ to BIP-0085 using derivation path:
    m/83696968'/93'/{hrp || threshold}'/{byte_length}'/{index}'
    
    • Uses the BIP-85 DRNG
    • Unspecified identifiers default to child’s BIP-32 master seed fingerprint

    Tests

    Reference tests and new vectors will be included in the reference bipsea implementation: https://github.com/benwestgate/bipsea/compare/master...BenWestgate:bipsea:master

    Mailing List Discussion: https://groups.google.com/g/bitcoindev/c/--lHTAtq0Qc

    Status

    Ready for conceptual and approach review. This change is additive and does not modify existing BIP-85 behavior.

  2. jonatack added the label Proposed BIP modification on Sep 8, 2025
  3. jonatack added the label Pending acceptance on Sep 8, 2025
  4. BenWestgate force-pushed on Sep 8, 2025
  5. BenWestgate marked this as a draft on Sep 8, 2025
  6. doc: Add codex32 application (93') to BIP-0085 6defa98018
  7. BenWestgate force-pushed on Sep 8, 2025
  8. BenWestgate marked this as ready for review on Sep 8, 2025
  9. BenWestgate renamed this:
    Add Codex32 (BIP-0093) as application 93' to BIP-0085
    BIP85: Add Codex32 application 93'
    on Sep 9, 2025
  10. BenWestgate renamed this:
    BIP85: Add Codex32 application 93'
    BIP85: Add Codex32 as application 93'
    on Sep 9, 2025
  11. akarve commented at 6:47 pm on September 10, 2025: contributor
    Documenting recent discussions: @BenWestgate Please see my mailing list comments to your thread with suggestions and simplifications (path, byte extraction, idx, etc.). Regarding 1.4.0 the main thing is we want to warrant full compatibility (all features) up to the prior version and (just saw you reopened 68) a PR to the 1.3.0 client is probably the easiest way to achieve that. Lmk if anything is unclear.
  12. BenWestgate commented at 3:24 pm on September 12, 2025: none

    Documenting recent discussions: @BenWestgate Please see my mailing list comments to your thread with suggestions and simplifications (path, byte extraction, idx, etc.). Regarding 1.4.0 the main thing is we want to warrant full compatibility (all features) up to the prior version and (just saw you reopened 68) a PR to the 1.3.0 client is probably the easiest way to achieve that. Lmk if anything is unclear.

    It seems you’d like to consolidate some of the paths. There are a few ways to do this; if you have a favorite, or one that immediately stands out as obvious, let me know.

    I think: `m/83696968'/93'/{hrp}'/{cat({n} {threshold} {byte_length})}'/{index}'`

    hrp is alone as it’s unknown how many future human-readable prefixes there may be.

    n will always be 1 through 31, t either 0 or 2 through 9, and byte_length 16 through 64. So we can decimal-concatenate them, with the max value being: 31 9 64 -> 31964'
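A minimal sketch of the decimal concatenation described above. The function names `encode_params` and `decode_params` are hypothetical and not part of any BIP; the sketch assumes threshold is always written as one digit and byte_length as two, which holds for the ranges given.

```python
def encode_params(n: int, threshold: int, byte_length: int) -> int:
    """Concatenate n, threshold and byte_length as decimal digits (illustrative only)."""
    if not 1 <= n <= 31:
        raise ValueError("n must be 1..31")
    if threshold != 0 and not 2 <= threshold <= 9:
        raise ValueError("threshold must be 0 or 2..9")
    if not 16 <= byte_length <= 64:
        raise ValueError("byte_length must be 16..64")
    # threshold is always one digit and byte_length always two digits,
    # so the fields can be split apart again without separators.
    return int(f"{n}{threshold}{byte_length}")


def decode_params(value: int) -> tuple[int, int, int]:
    """Invert encode_params: last two digits, then one digit, then the rest."""
    byte_length = value % 100
    threshold = (value // 100) % 10
    n = value // 1000
    return n, threshold, byte_length
```

The largest value, 31964, is far below 2^31, so it fits in a hardened BIP-32 path component.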

    I’m thinking the identifier could be the bech32 encoding of the bip85 index, as the purpose of incrementing the index is to get new seeds, and BIP93 says “…the identifier SHOULD be distinct for all master seeds the user may need to disambiguate.”

    index = 0 -> identifier = qqqq, index = 1 -> identifier = qqqp, and so on. A particular identifier can be selected by converting it to an integer {index}. Once index reaches 32^4, it can fall back to the default BIP-0032 fingerprint.
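The index-to-identifier mapping above can be sketched as a big-endian base-32 conversion over the bech32 alphabet. The helper names are hypothetical, and returning `None` to signal "fall back to the fingerprint" is an assumption made for illustration.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"  # bech32 alphabet
ID_LEN = 4
MAX_INDEX = 32 ** ID_LEN  # number of distinct 4-character identifiers


def index_to_identifier(index: int):
    """Return the 4-character identifier for index, or None once index >= 32^4."""
    if index < 0:
        raise ValueError("index must be non-negative")
    if index >= MAX_INDEX:
        return None  # caller falls back to the BIP-32 fingerprint
    chars = []
    for _ in range(ID_LEN):
        index, digit = divmod(index, 32)
        chars.append(CHARSET[digit])
    return "".join(reversed(chars))  # most significant character first


def identifier_to_index(identifier: str) -> int:
    """Convert a bech32 identifier back to the integer index that selects it."""
    value = 0
    for ch in identifier.lower():
        value = value * 32 + CHARSET.index(ch)
    return value
```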

    On byte extraction: I agree we should draw byte_length bytes and pad to a multiple of 5 bits with a CRC. The polynomial (1 << crc_len) | 3 is optimal for 1-4 bits. Output share indices can still use the current read-one-byte-at-a-time method.
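A rough sketch of CRC padding with the generator `(1 << crc_len) | 3`. The bit order (most significant bit first), the zero initial register, and the helper names are all assumptions made for illustration; the thread does not fix these details.

```python
def bytes_to_bits(data: bytes) -> list[int]:
    """Expand bytes into a list of bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def crc_bits(bits: list[int], crc_len: int) -> list[int]:
    """Remainder of bits * x^crc_len modulo the generator (1 << crc_len) | 3."""
    if crc_len == 0:
        return []
    mask = (1 << crc_len) - 1
    poly_low = 3 & mask  # generator with its top bit removed
    reg = 0
    for bit in bits:
        top = (reg >> (crc_len - 1)) & 1
        reg = (reg << 1) & mask
        if top ^ bit:
            reg ^= poly_low
    return [(reg >> shift) & 1 for shift in range(crc_len - 1, -1, -1)]


def pad_with_crc(data: bytes) -> list[int]:
    """Return 5-bit symbols: the data bits followed by 0-4 CRC padding bits."""
    bits = bytes_to_bits(data)
    crc_len = -len(bits) % 5  # bits needed to reach a multiple of 5
    bits = bits + crc_bits(bits, crc_len)
    return [int("".join(map(str, bits[i:i + 5])), 2) for i in range(0, len(bits), 5)]
```

With these assumptions, 16 bytes need 2 padding bits (26 symbols), 32 bytes need 4 (52 symbols), and 20 bytes need none.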

  13. jonatack commented at 4:17 pm on September 17, 2025: member
    Pinging @scgbckbone (who has been active on BIP85 review) for feedback.
  14. scgbckbone commented at 11:48 am on October 13, 2025: contributor

    Seems to me this is well over the BIP-85 application scope. As I understand it, BIP85 generates “a thing” from “a thing”. Your application is generating “multiple things” from “a thing”.

    Why are you generating multiple initial shares via BIP85?

    What I imagined a BIP85 application should look like after reading BIP93:

    1. way to generate secret share s from BIP-32 root seed (so that you can load other wallets with derived entropy). Something like this: m/83696968’/93’/{b93_index mapped to int -> s in this case}’/{byte_length}’/{index}'
    2. way to generate any non-secret share from BIP32 root seed. This, as per rationale, would allow users to generate 2nd (and only 2nd) share deterministically via BIP85, and not via RNG. All other shares should be derived according to BIP93 via interpolation. m/83696968’/93’/{b93_index mapped to int -> not s in this case}’/{byte_length}’/{index}'

    ** maybe even threshold should be part of the BIP32 derivation path, BUT I think not as it has no effect on the actual secret generated (it only affects checksum)

    Assuming I’m not wrong in my “speculation”, why not just use m/83696968’/128169’/{num_bytes}’/{index}’ to generate deterministic bytes from BIP-32 root seed for any share?
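For reference, a minimal sketch of how the existing BIP-85 hex application (128169') turns a derived child key into bytes. The BIP-32 derivation itself is omitted: `child_private_key` is assumed to be the 32-byte private key already derived at the path, and the function names are hypothetical.

```python
import hashlib
import hmac


def hex_app_path(num_bytes: int, index: int) -> str:
    """Build the BIP-85 hex application path for the given length and index."""
    if not 16 <= num_bytes <= 64:
        raise ValueError("num_bytes must be 16..64")
    return f"m/83696968'/128169'/{num_bytes}'/{index}'"


def bip85_entropy(child_private_key: bytes) -> bytes:
    """BIP-85 entropy: HMAC-SHA512 keyed with "bip-entropy-from-k" over the child key."""
    return hmac.new(b"bip-entropy-from-k", child_private_key, hashlib.sha512).digest()


def hex_app_bytes(child_private_key: bytes, num_bytes: int) -> bytes:
    """Truncate the 64-byte entropy to the first num_bytes bytes."""
    return bip85_entropy(child_private_key)[:num_bytes]
```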

  15. jonatack added the label PR Author action required on Oct 14, 2025
  16. Merge branch 'master' into codex32 b71c42cbae
  17. BenWestgate commented at 5:11 pm on October 19, 2025: none

    Thank you for the great feedback @scgbckbone. I’ll explain the rationale behind your questions first.

    …BIP85 generates “a thing” from “a thing”.

    True, but that thing can be structured. For example, BIP39 derives an entire mnemonic, not one word at a time. Here, the “thing” is a complete codex32 backup as there’s no recoverable seed without at least {threshold} shares.

    Why are you generating multiple initial shares via BIP85 ?

    Determinism. We want to eliminate ambiguity about which initial share indices were derived by BIP85 to make BIP85 child seed recovery easier. Example:

        bip85 = Bip85(master_root_xprv)
        # 1. generate secret share "s" from root seed
        secret1 = bip85.derive_codex32(t=3, share_idx='s')
        # 2. generate any `t` non-"s" shares from root seed, interpolate according to BIP93
        secret2 = Codex32String.interpolate_at(
            [
                bip85.derive_codex32(t=3, share_idx='a'),
                bip85.derive_codex32(t=3, share_idx='c'),
                bip85.derive_codex32(t=3, share_idx='d'),
            ],
            target="s"
        )
        secret3 = Codex32String.recover(
            [
                bip85.derive_codex32(t=3, share_idx='x'),
                bip85.derive_codex32(t=3, share_idx='y'),
                bip85.derive_codex32(t=3, share_idx='z'),
            ],
            target="s"
        )
        derived_secrets = [secret1, secret2, secret3]
        identifiers = set()
        master_seeds = set()
        for secret in derived_secrets:
            identifiers.add(secret.identifier)
            master_seeds.add(secret.data)

        # fewer identifiers than seeds means different seeds share one identifier
        if len(identifiers) < len(master_seeds):
            raise ValueError("BIP93: identifier SHOULD be distinct for every master seed the user may need to disambiguate")
    

    For the same BIP85 root key, each {threshold} set of initial {share_idx} BIP85 derived shares recovers a different secret; and the 's' derivation yet another. That’s a bad property: these codex32 sets share the same header ms13<identifier> making them hard to disambiguate. Mismatched sets recover wrong seeds.

    ** maybe even threshold should be part of the BIP32 derivation path, BUT I think not as it has no effect on the actual secret generated (it only affects checksum)

    For secret sharing, the {threshold} must be in the derivation path. Otherwise ms12testa... and ms13testa... share entropy payloads even though they’re distinct backup sets, a security vulnerability if both are used.
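To make the header comparison concrete, here is a small sketch that splits the leading fields of a codex32 string (human-readable part, threshold, identifier, share index). It performs no checksum validation, and `Header` and `parse_header` are hypothetical names used only for illustration.

```python
from typing import NamedTuple

CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"  # bech32 alphabet


class Header(NamedTuple):
    hrp: str
    threshold: int
    identifier: str
    share_index: str


def parse_header(codex32: str) -> Header:
    """Split the leading fields of a codex32 string; payload and checksum are ignored."""
    text = codex32.lower()
    # "1" is not in the bech32 alphabet, so the last "1" is the separator.
    hrp, sep, data = text.rpartition("1")
    if not sep or len(data) < 6:
        raise ValueError("too short for a codex32 header")
    if data[0] not in "023456789":
        raise ValueError("threshold must be 0 or 2..9")
    identifier, share_index = data[1:5], data[5]
    if any(ch not in CHARSET for ch in identifier + share_index):
        raise ValueError("invalid bech32 character in header")
    return Header(hrp, int(data[0]), identifier, share_index)
```

Applied to the two prefixes above, `parse_header("ms12testa")` and `parse_header("ms13testa")` differ only in the threshold field, which is why the threshold has to feed into the derivation for the share payloads to differ.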

    For unshared secrets, the threshold has no effect, so it’d be ideal to ignore it when not secret sharing. That way, knowing the BIP85 index and root key uniquely identifies the seed, regardless of threshold, consistent with other BIP85 applications.

    …why not just use m/83696968’/128169’/{num_bytes}’/{index}’ to generate deterministic bytes from BIP-32 root seed for any share ?

    Then why not use that for BIP39 or any other application too? Let users convert deterministic bytes into mnemonics or codex32 strings as they wish. The point of a BIP85 application is to standardize how that entropy is consumed into a specific deterministic format.

    Based on feedback from you and @akarve:

    Simplified proposal

    Derivation: m/83696968'/93'/{header}'/{byte_length}'/{index}'

    • where {header} is an int encoding of <hrp>, <hrp>1<k>, or <hrp>1<k><identifier> (TBD).

    Simplifications:

    • {share_idx} and {num_shares} can be eliminated
    • {identifier} can be implicit, but if user-defined, {index} should feed into it to keep output identifiers distinct per master seed
    • “Existing master seed” derivation rule is removed; we only generate fresh seeds.
      • Users can discard an initial share and interpolate if they have an existing master seed they wish to share.
    • BIP93 interpolation and relabeling identifiers left to users.
    • Default identifier = BIP32 fingerprint of derived seed.

    Example of the simplified form:

        bip85 = Bip85(master_root_xprv)
        # 1. generate k=0 secret share "s" from root seed
        secret1 = bip85.derive_codex32(k=0)
        # 2. generate `k` fixed non-"s" shares from root seed, interpolate according to BIP93
        shares = bip85.derive_codex32(k=2)
        secret2 = Codex32String.interpolate_at(shares, target="s")
        shares = bip85.derive_codex32(k=3)
        secret3 = Codex32String.interpolate_at(shares, target="s")
        derived_secrets = [secret1, secret2, secret3]
        identifiers = set()
        master_seeds = set()
        for secret in derived_secrets:
            identifiers.add(secret.identifier)
            master_seeds.add(secret.data)

        assert len(identifiers) == len(master_seeds)
        # header is distinct for each master seed the user may need to disambiguate

        assert len(master_seeds) == 1
        # same master seed for same {index}, regardless of k
    

    Seems over the BIP-85 scope.

    The version brings it back within scope:

    • k=0: derive 1 codex32 secret.
    • k=2: derive 2 codex32 shares (‘a’ and ‘c’) → recover same secret.
    • k=3: derive 3 codex32 shares (‘a’, ‘c’, and ‘d’) → recover same secret.

    Identifier defaults to derived seed’s BIP32 fingerprint. Incrementing {index} yields new seeds, with new identifiers automatically.
  18. jonatack removed the label PR Author action required on Oct 19, 2025
  19. scgbckbone commented at 1:53 am on October 27, 2025: contributor

    Then why not use that for BIP39 or any other application too? Let users convert deterministic bytes into mnemonics or codex32 strings as they wish. The point of a BIP85 application is to standardize how that entropy is consumed into a specific deterministic format.

    agreed, rest my case here…

    For example, BIP39 derives an entire mnemonic, not one word at a time.

    this is a bad comparison, as 12/24 words represent an encoding of 16/32 bytes of entropy, while your approach creates multiple shares. Same as if I would create multiple 12-word seeds from 16 bytes of entropy.

    this is imo ok (using code from your snippets without ever running it or reviewing it). My understanding is that each line in the snippet below generates just one share?

        secret_share = bip85.derive_codex32(t=3, share_idx='s')
        share_a = bip85.derive_codex32(t=3, share_idx='a')
        share_c = bip85.derive_codex32(t=3, share_idx='c')
        share_d = bip85.derive_codex32(t=3, share_idx='d')
    

    what I have an issue with is this, where you are just generating multiple shares (somehow):

        # 2. generate `k` fixed non-"s" shares from root seed, interpolate according to BIP93
        shares = bip85.derive_codex32(k=2)
    

    The version brings it back within scope:

    only k=0 is within the scope (imho)

    Don’t get me wrong, I’m not intending to block this BIP update. The updated version is much better. I’m only trying to figure out why is this needed & whether there is any advantage in what you’re doing vs. what I’m doing. Here is my pseudo-code, to try to prove the point that nothing other than simple “one share generation” is needed here & the rest can be left to BIP-93 interpolation:

        CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
        secret_share = <32 bytes secret loaded in HWW>
        id = "cash"
        threshold = 3
        num_shares = 4
        # use BIP85 to deterministically generate share "l" (or any other, up to the specific wallet implementation)
        share_l = bip85(secret_share).derive_codex32(t=threshold, id=id, share_idx='l')
        shares = [share_l]
        for i in range(num_shares - 1):  # -1 as share 'l' was already generated
            shares.append(bip93.interpolate([secret_share, share_l], CHARSET[i]))
    

    The above pseudo-code always generates the same shares.

  20. BenWestgate commented at 1:29 pm on October 28, 2025: none

    While your approach creates multiple shares. Same as if I would create multiple 12 words seeds from 16 bytes of entropy.

    I would not say these are quite the same. Both the initial k shares and 12-word seeds need entropy to create, but a set of k shares does represent a single seed. Fewer than k shares cannot recover a seed or define a specific backup set of shares. All other bip85 applications derive a specific secret at each bip85 index so that it can be recovered later. If the app generates “loose” shares, this is not possible: bip85 does not know which initial shares, if any, generated a master seed. And were that desired, users can discard some output.

    only k=0 is within scope (imho)

    This can’t generate shares because {share_idx} must be “S” for k=0. Of course, I like the simplicity of this small scope but it leaves users and wallets tasked with generating the randomness needed to securely secret share that BIP85 derived codex32 secret.

    If the purpose of BIP85 is deterministic randomness from bip32 keychains, why suddenly stop short of providing the additional randomness needed to do SSS for an SSS-aware format?

    Path: m/83696968'/93'/{hrp}'/{byte_length}'/{index}'

    We could drop the bip93 interpolation and directly encode threshold strings at fixed indices, using the default bip32 fingerprint identifier. BIP85 app output would be:

    • k=0: a codex32 secret.
    • k=2: same codex32 secret and share “A”.
    • k=3: same codex32 secret, a new share “A”, and share “C”.
    • k=4: same codex32 secret, new shares “A” and “C”, and share “D”.
    • etc…

    Now each {bip85 index} derives a single seed, as with bip39, but we also give deterministic entropy for secret sharing, leaving wallets to interpolate any remaining shares.
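A small sketch of which strings this variant would output for a given threshold. The function name `output_indices` and the constant `SHARE_ORDER` are hypothetical; the share order is assumed to be the alphabetical bech32 letters with 's' skipped, and indices are written in lowercase.

```python
# Assumed order for fresh shares: alphabetical bech32 letters, skipping 's'.
SHARE_ORDER = "acdefghjklmnpqrtuvwxyz"


def output_indices(k: int) -> list[str]:
    """Share indices emitted for threshold k: the secret plus k - 1 fixed shares."""
    if k == 0:
        return ["s"]  # unshared secret only
    if not 2 <= k <= 9:
        raise ValueError("threshold must be 0 or 2..9")
    return ["s"] + list(SHARE_ORDER[:k - 1])
```

Because the secret 's' plus k - 1 shares already gives k points, a wallet could interpolate every remaining share from this output.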

    My understanding is that each line in below snippet, generates just one share?

    Yes, it was an example of part 2 of your proposal:

    2. way to generate any non-secret share from BIP32 root seed. This … would allow users to generate share[s] deterministically via BIP85, and not via RNG.

    The disadvantages I saw with it for the same {bip85 index}:

    1. Different arbitrary {share_idx} combinations of bip85 derived shares recover different seeds, which is unexpected bip85 behavior and less interoperable.
      • Solution: fix the indices, only output:
        a. “S”, [“A”, “C”] or, [“A”, “C”, “D”] for k=0, k=2 and k=3 respectively. Or
        b. “S”, [“S”, “A”] or, [“S”, “A”, “C”], etc
      • Whichever is preferred: a is more usable by humans and avoids mistakenly writing down the secret; b avoids our bip85 app needing interpolation logic, as it outputs direct entropy encodings.
    2. Different thresholds recover different seeds.
      • Fix: Remove k from the derivation path and always encode the seed as the first {byte_length} bytes from the DRNG
        • Or use HMAC(derived_entropy, identifier) and truncate.
    3. {threshold} doesn’t affect share payloads so they might be reused across backups.
      • Seeds may be “reshared” but shares should be fresh.
      • e.g. if a user creates a 2-of-n and a 3-of-n backup and accidentally uses the same {bip85 index}, then both backups have the same share ‘A’ payload.
      • Fix: After generating the seed, reseed the DRNG with HMAC(derived_entropy, k) and then derive k - 1 independent share payloads with {byte_length} reads from the DRNG.
        • Could also reseed with HMAC(derived_entropy, k + identifier) which would support resharing an existing master seed at the same threshold with a unique identifier.
        • Or use HMAC(derived_entropy, k + identifier + share_idx) directly and truncate for each share payload.
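A minimal sketch of the last option, HMAC(derived_entropy, k + identifier + share_idx) truncated to the payload length. The hash function (SHA-512) and the ASCII concatenation of the message fields are assumptions made for illustration; the thread does not specify either, and `share_payload` is a hypothetical name.

```python
import hashlib
import hmac


def share_payload(derived_entropy: bytes, k: int, identifier: str,
                  share_idx: str, byte_length: int) -> bytes:
    """Derive one share payload bound to threshold, identifier and share index."""
    if not 16 <= byte_length <= 64:
        raise ValueError("byte_length must be 16..64")
    # Assumed message encoding: ASCII concatenation of k, identifier, share_idx.
    message = f"{k}{identifier}{share_idx}".encode("ascii")
    digest = hmac.new(derived_entropy, message, hashlib.sha512).digest()
    return digest[:byte_length]
```

Under these assumptions a 2-of-n and a 3-of-n backup derived from the same {bip85 index} get different share 'a' payloads, which addresses problem 3 above.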

    I’m only trying to figure out why is this needed…

    To support deterministic generation of initial strings for users/wallets intending to do SSS, we should output a threshold quantity of strings to avoid these 3 interoperability and recovery problems.

    However, this isn’t necessary for a minimum viable PR, as k=0 is useful on its own. I can break this into two PRs: one for codex32 secrets and another for k >= 2, which outputs a codex32 secret and k-1 codex32 shares.

    How should I proceed?


This is a metadata mirror of the GitHub repository bitcoin/bips. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-11-01 22:10 UTC
