BIP Draft: Multilingual mnemonic display and input conventions #2200

osem23 wants to merge 3 commits into bitcoin:master from osem23:multilingual-mnemonic-bip-v2, changing 1 file (+284 −0)
  1. osem23 commented at 11:25 AM on June 23, 2026: none

    This adds a Specification BIP draft, "Multilingual mnemonic display and input rules" (resubmission of the previously-closed #2192, updated).

    A display wordlist is a 2048-entry list in a target language, index-parallel to the canonical English BIP-39 wordlist. PBKDF2 runs only on the canonical English mnemonic; native-language renderings are a display and input layer with no new cryptographic surface, and every seed produced under the convention is restorable in any BIP-39 wallet via its English form.
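As a rough illustration of the flow described in this paragraph, the Python sketch below resolves display tokens to English words by index and passes only the English mnemonic to PBKDF2. The three-entry wordlists, the German-looking display tokens, and the function names are placeholders invented for this example (a real list has 2048 entries), and BIP-39 checksum validation is omitted. The PBKDF2 parameters are the ones BIP-39 defines.

```python
import hashlib
import unicodedata

# Toy index-parallel lists, invented for illustration; real lists have 2048 entries.
ENGLISH = ["abandon", "ability", "able"]
DISPLAY = ["aufgeben", "fähigkeit", "fähig"]  # hypothetical display tokens

def display_to_english(display_mnemonic):
    """Resolve each display token to the English word at the same index."""
    index_of = {unicodedata.normalize("NFC", w): i for i, w in enumerate(DISPLAY)}
    tokens = unicodedata.normalize("NFC", display_mnemonic).split()
    return " ".join(ENGLISH[index_of[t]] for t in tokens)

def bip39_seed(english_mnemonic, passphrase=""):
    """BIP-39 seed: PBKDF2-HMAC-SHA512, 2048 iterations, 64-byte output."""
    password = unicodedata.normalize("NFKD", english_mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)

# The display mnemonic never reaches PBKDF2; only its English resolution does.
english = display_to_english("fähig aufgeben fähigkeit")
seed = bip39_seed(english)
```

Because the display layer only selects indices, the seed obtained this way is the same one an English-only wallet derives from the resolved mnemonic.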

    The preamble follows the BIP 3 format (Authors, Assigned, Discussion; no Discussions-To/Comments-*). I have not self-assigned a BIP number.

    Discussion

    Reference implementation (MIT): https://github.com/osem23/bip39-wordlists-tzur — 30 index-paired display wordlists with bidirectional mappings, the 10 canonical BIP-39 wordlists preserved byte-for-byte for spec comparison, a reference validator enforcing every MUST clause, reference decoders in Python, JavaScript, and Swift producing byte-identical seeds, and per-language conformance test vectors across the five BIP-39 entropy lengths.

    Shipped in production by the TZUR Wallet suite (iPhone and Windows).

    License: BSD-2-Clause (document), MIT (reference implementation).

  2. Add Informational BIP: Multilingual mnemonic display and input conventions
    A display wordlist is a 2048-entry list in a target language, index-parallel
    to the canonical English BIP-39 wordlist. PBKDF2 runs only on the canonical
    English mnemonic; native-language renderings are a display and input layer with
    no new cryptographic surface, and every seed produced under the convention is
    restorable in any BIP-39 wallet via its English form.
    
    Preamble follows the BIP 3 format. No BIP number self-assigned.
    c0ec345f2f
  3. jonatack added the label New BIP on Jun 23, 2026
  4. jonatack renamed this:
    Add Informational BIP: Multilingual mnemonic display and input conventions
    BIP Draft: Multilingual mnemonic display and input conventions
    on Jun 23, 2026
  5. in bip-multilingual-mnemonic.md:12 in c0ec345f2f outdated
       7 | +  Type: Informational
       8 | +  Assigned: ?
       9 | +  License: BSD-2-Clause
      10 | +  Discussion: 2026-06-13: https://groups.google.com/g/bitcoindev/c/Rwo7P5pTA0c
      11 | +              2026-06-23: https://delvingbitcoin.org/t/bip39-native-language-display-wordlists-mapped-to-canonical-english/2637
      12 | +```
    


    danielabrozzoni commented at 2:27 PM on June 24, 2026:

    The preamble should contain

    Requires: 39
    

    osem23 commented at 4:51 AM on June 25, 2026:

    Done, added Requires: 39 to the preamble.

  6. in bip-multilingual-mnemonic.md:4 in c0ec345f2f
       0 | @@ -0,0 +1,282 @@
       1 | +```
       2 | +  BIP: ?
       3 | +  Layer: Applications
       4 | +  Title: Multilingual mnemonic display and input conventions
    


    danielabrozzoni commented at 3:07 PM on June 24, 2026:

    Unfortunately the title should be at most 50 characters, and this is 51 😅


    osem23 commented at 4:51 AM on June 25, 2026:

    Fixed. It's now "Multilingual mnemonic display and input rules" (45 chars).

  7. in bip-multilingual-mnemonic.md:7 in c0ec345f2f
       0 | @@ -0,0 +1,282 @@
       1 | +```
       2 | +  BIP: ?
       3 | +  Layer: Applications
       4 | +  Title: Multilingual mnemonic display and input conventions
       5 | +  Authors: Daniel Osemberg <ceo@blocksight.live>
       6 | +  Status: Draft
       7 | +  Type: Informational
    


    danielabrozzoni commented at 3:09 PM on June 24, 2026:

    osem23 commented at 4:51 AM on June 25, 2026:

    Agreed, set to Type: Specification.

  8. danielabrozzoni commented at 3:19 PM on June 24, 2026: member

    Only gave a first very quick pass, will do another one soon :)

  9. jonatack commented at 6:00 PM on June 24, 2026: member

    This draft appears to be mostly AI generated?

    Edit: am looking at the document history in https://github.com/osem23/bip39-wordlists-tzur/commits/main/docs/BIP-multilingual-mnemonics.md

  10. osem23 commented at 8:34 PM on June 24, 2026: none

    Yes, I used AI as a writing tool, and I'm not going to pretend otherwise. I'm proud of it. But "AI generated" isn't really the question for a spec. The question is whether it's correct and implementable. I stand behind every line and I understand every line. If any specific clause reads as wrong, vague, or unnecessary, point at it and I'll fix or defend it. That's the feedback I can actually use, and this spec is built to be checked rather than trusted.

  11. osem23 commented at 8:35 PM on June 24, 2026: none

    > Only gave a first very quick pass, will do another one soon :)

    Thanks for the pass, all three are good catches.

  12. Address review: title length, Specification type, Requires 39
    Per danielabrozzoni's review on PR #2200:
    - Title trimmed to 50-char limit (now 45): "...display and input rules"
    - Type changed Informational -> Specification (BIP-3: implementable
      with compliant implementations; has validator, decoders, vectors)
    - Add Requires: 39, placed after Discussion per BIP-3 field order
    
    Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
    4bf395ea9f
  13. in bip-multilingual-mnemonic.md:28 in 4bf395ea9f outdated
      23 | +This document does **not** replace BIP-39, does not deprecate any existing BIP-39 wordlist, and does not change the canonical seed-derivation flow. It defines only a display and backup layer that sits above an unchanged BIP-39 core. The following points hold throughout this specification:
      24 | +
      25 | +- **English BIP-39 remains canonical.** The English BIP-39 mnemonic is the only mnemonic fed to PBKDF2-HMAC-SHA512, and the only artifact that determines the derived seed and cross-wallet compatibility. This document does not alter BIP-39 entropy, checksum, Unicode normalization, or PBKDF2 rules.
      26 | +- **Localized wordlists are a display and backup layer only.** A display wordlist is never the password input to PBKDF2. It exists so a user can read and write their backup in their own language.
      27 | +- **The mapping is by word index.** The display token at index `i` corresponds to the English BIP-39 word at index `i`, and to nothing else. There is no per-language entropy, checksum, or key derivation.
      28 | +- **The localized mnemonic is always reversible to the canonical English mnemonic.** The bidirectional mapping is bijective across all 2048 entries (§Display wordlist requirements), so a conformant display mnemonic resolves back to exactly one English BIP-39 mnemonic, deterministically.
    


    murchandamus commented at 9:38 PM on June 24, 2026:

    Was that supposed to be a link to the Display wordlist section?


    osem23 commented at 4:51 AM on June 25, 2026:

    Yes, it points to the Display wordlist requirements section. I use the §section style throughout instead of anchor links. Happy to switch these to real anchors if the editors prefer.

  14. in bip-multilingual-mnemonic.md:84 in 4bf395ea9f outdated
      79 | +
      80 | +A wallet that accepts a display mnemonic on restore tokenizes it on whitespace before lookup:
      81 | +
      82 | +1. Tokenize on Unicode whitespace (characters with the Unicode `White_Space` property) plus the ideographic space (`U+3000`) used by the official Japanese BIP-39 mnemonic.
      83 | +2. Normalize every token and the display wordlist to the same Unicode form (NFC) before comparison. Mismatched normalization between input and wordlist causes silent lookup failures on precomposed/decomposed accent pairs. NFC, and the NFKD that BIP-39 applies before PBKDF2, are both safe: they never merge two distinct entries in a conformant wordlist (there are zero NFKD collisions across the reference wordlists).
      84 | +3. If a wallet applies any *lossy* fold to input as a convenience — stripping diacritics, case-folding, or similar — and that fold maps a token to more than one wordlist entry, the wallet MUST reject the token and ask the user to disambiguate. It MUST NOT silently pick one entry. Distinct entries can collapse under accent stripping (for example Vietnamese `được` and `đuốc`, or Swedish `läger` and `lager`), and an arbitrary pick selects the wrong index and derives the wrong seed. Lossy folds are not required by this convention; a wallet that performs none is always conformant. Per-language collision counts are reported by the reference validator and documented in `validation/encoding-notes.md`.
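
The three input-parsing steps in the quoted hunk could be sketched in Python roughly as follows. This is a minimal sketch, not the reference decoder: the names `tokenize`, `fold`, `resolve`, and `AmbiguousToken` are invented here, the fold shown (case-fold plus stripping combining marks) is just one possible lossy fold, and accepting an exact NFC match before trying the fold is a design assumption of this example.

```python
import re
import unicodedata

# Code points with the Unicode White_Space property (U+3000 is among them).
_WS = "\u0009-\u000D\u0020\u0085\u00A0\u1680\u2000-\u200A\u2028\u2029\u202F\u205F\u3000"
_SPLIT = re.compile("[" + _WS + "]+")

class AmbiguousToken(ValueError):
    """Raised when a lossy fold matches more than one wordlist entry."""

def tokenize(text):
    """Step 1: split on White_Space code points, dropping empty pieces."""
    return [t for t in _SPLIT.split(text) if t]

def fold(token):
    """Optional lossy fold: case-fold, then strip combining marks."""
    decomposed = unicodedata.normalize("NFD", token.casefold())
    return "".join(c for c in decomposed if not unicodedata.combining(c))

def resolve(token, wordlist):
    """Steps 2 and 3: NFC-compare first, then fold, rejecting ambiguous folds."""
    token = unicodedata.normalize("NFC", token)
    entries = [unicodedata.normalize("NFC", w) for w in wordlist]
    if token in entries:
        return entries.index(token)
    matches = [i for i, w in enumerate(entries) if fold(w) == fold(token)]
    if len(matches) > 1:
        raise AmbiguousToken(token)  # never pick one entry silently
    if not matches:
        raise KeyError(token)
    return matches[0]
```

With a toy list `["lager", "läger", "hund"]`, the decomposed input `"la\u0308ger"` resolves to index 1 after NFC, while `"LAGER"` only matches through the fold, hits two entries, and is rejected. A stricter wallet could also refuse exact input whose fold collides with another entry.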
    


    murchandamus commented at 9:49 PM on June 24, 2026:

    Maybe you are implying that already, but would it be possible to enforce at word list creation time that no word matches another per list if diacritics were stripped, case was folded or similar? Has that been done for the proposed lists?

    E.g., this was done for the French wordlist, where "special French characters "é-è" are considered equal to "e", for example "museau" and "musée" can not be together".


    osem23 commented at 4:51 AM on June 25, 2026:

    Right now I enforce this at input time, not at construction time the way the French list did. §Input parsing MUST 3: if a lossy fold (diacritic strip, case fold) maps a token to more than one entry, the wallet must reject and ask the user, never auto-pick. The validator already reports per-language collision counts under those folds.

    I didn't make it a construction-time MUST because a mechanically-seeded list can't always satisfy it without curation, which is the quality tension you raise in your top comment. I can add it as a construction-time SHOULD with the per-list collision report surfaced, and make it a MUST for any list that claims a curated tier. The input-time disambiguation MUST keeps wallets safe in the meantime.

  15. in bip-multilingual-mnemonic.md:112 in 4bf395ea9f outdated
     107 | +
     108 | +Every wordlist MUST clause above is mechanically enforceable. A reference validator at `validation/validate_all.py` in the reference registry checks each: exactly 2048 entries per file, UTF-8 encoding without BOM, absence of duplicates, absence of leading or trailing whitespace, absence of embedded whitespace under the full Unicode `White_Space` property, absence of hyphen or dash codepoints inside any entry, NFC form for TZUR Original wordlists and for the native-side fields of mappings, test vectors, and compound-entry datasets, and round-trip consistency of the bidirectional mapping against the canonical English wordlist. SHOULD-clause metrics (4-character prefix uniqueness, native-speaker review status, wordlist identifier triple) are not enforced as errors by the validator and are tracked separately in the registry's construction notes and the per-mapping JSON metadata.
     109 | +
     110 | +### Multi-word native concepts
     111 | +
     112 | +Some languages express a single BIP-39 concept only as a multi-word native term: Hebrew `רופא שיניים` (dentist), Turkish `hindistan cevizi` (coconut), Indonesian `kebun binatang` (zoo), Vietnamese multi-syllable words that use native word-spacing. Requirement 4 forbids embedded whitespace, so a conformant wordlist stores such entries as a single glued orthographic token (e.g., `רופאשיניים`, `hindistancevizi`, `kebunbinatang`). This is a structural consequence of the tokenization rule, not an independent requirement.
    


    murchandamus commented at 9:53 PM on June 24, 2026:

    I was wondering how so many languages had been created at inception. So the wordlists were created by translating the English words to the target languages?


    osem23 commented at 4:51 AM on June 25, 2026:

    Yes. Generated by translation, then validated rather than trusted: structural checks, back-translation and forward-translation each with an LLM verdict, multilingual sentence-embedding similarity, Wiktionary cross-reference, and a blind LLM top-8 pass. Process and per-language results are in docs/CONSTRUCTION.md and docs/V2_VALIDATION.md. It isn't a substitute for native-speaker review, which is why the lists are explicitly supersedable.
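
For readers who want a feel for what the structural checks involve, here is a small Python sketch of a few of the wordlist-level rules quoted in the hunk (entry count, BOM, duplicates, whitespace, hyphen or dash code points, NFC form). It is not the registry's `validation/validate_all.py`; the function name, the error strings, and the `expected_count` parameter are assumptions made for this example, and the round-trip mapping check is left out.

```python
import unicodedata

# Code points with the Unicode White_Space property.
WHITE_SPACE = set(
    [chr(c) for c in range(0x09, 0x0E)]
    + [chr(c) for c in range(0x2000, 0x200B)]
    + list("\u0020\u0085\u00A0\u1680\u2028\u2029\u202F\u205F\u3000")
)

def check_wordlist(raw, expected_count=2048):
    """Return a list of violations found in the file bytes; empty means none."""
    errors = []
    if raw.startswith(b"\xef\xbb\xbf"):
        errors.append("file starts with a UTF-8 BOM")
    entries = raw.decode("utf-8").split("\n")
    if entries and entries[-1] == "":
        entries.pop()  # tolerate a single trailing newline
    if len(entries) != expected_count:
        errors.append(f"expected {expected_count} entries, found {len(entries)}")
    if len(set(entries)) != len(entries):
        errors.append("duplicate entries")
    for i, word in enumerate(entries):
        if any(ch in WHITE_SPACE for ch in word):
            errors.append(f"entry {i}: contains whitespace")
        # Unicode general category Pd covers hyphen-minus and the dash family.
        if any(unicodedata.category(ch) == "Pd" for ch in word):
            errors.append(f"entry {i}: contains a hyphen or dash")
        if unicodedata.normalize("NFC", word) != word:
            errors.append(f"entry {i}: not in NFC form")
    return errors
```

Under these checks a multi-word entry such as `kebun binatang` is flagged for embedded whitespace, while the glued form `kebunbinatang` passes.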

  16. in bip-multilingual-mnemonic.md:255 in 4bf395ea9f
     250 | +
     251 | +The specific MUST clauses each address a concrete failure mode. Embedded whitespace inside an entry breaks the paper-backup round trip because mnemonic tokenization is whitespace-based; a multi-word entry fragments into two tokens that the wallet cannot resolve, and the seed becomes unrecoverable from text backup. The bijective mapping requirement ensures that translation in either direction is unambiguous. The NFC storage requirement prevents precomposed/decomposed accent mismatches from causing silent lookup failures on restore.
     252 | +
     253 | +The 4-character prefix uniqueness recommendation from the original BIP-39 specification is achievable for English and most Latin-script languages but structurally infeasible for several scripts where word stems and limited short-prefix variety dominate. Requiring it would exclude those languages or force authorship of artificial vocabulary. Treating it as a SHOULD with informational reporting per language preserves the autocomplete benefit where feasible without excluding scripts where it is not.
     254 | +
     255 | +Native-speaker review is recommended (SHOULD) rather than required (MUST) because its absence is a UX risk, not a cryptographic risk. The worst case is a poorly-chosen native word that a future PR can correct; no funds are at stake.
    


    murchandamus commented at 10:05 PM on June 24, 2026:

    I don’t follow here. If people had started using the original native words to record their backup, changing the poorly-chosen word would invalidate their backup.


    osem23 commented at 4:51 AM on June 25, 2026:

    You're right, that line was wrong and I removed it (6608dcb). Published lists are frozen. A correction is a new versioned list, never a mutation of a published one, so an existing backup is never invalidated: it resolves against the exact version that produced it, pinned by SHA-256, with the canonical English mnemonic as the safety net.
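
The idea of resolving a backup against a list version pinned by SHA-256 can be sketched as follows. This assumes the digest is taken over the raw wordlist file bytes and that the identifier has the shape (language, version, hex digest); both are assumptions of this example rather than details confirmed in the thread, and the two-entry lists are toy data.

```python
import hashlib

def wordlist_id(language, version, raw):
    """Build a (language, version, sha256-hex) identifier from the file bytes."""
    return (language, version, hashlib.sha256(raw).hexdigest())

def matches_pin(raw, pinned):
    """True only if these bytes are exactly the list version named by the pin."""
    language, version, _ = pinned
    return wordlist_id(language, version, raw) == pinned

# Toy data: a published list stays frozen, a correction ships as a new version.
v1 = "hund\nkatze\n".encode("utf-8")
v2 = "hund\nmieze\n".encode("utf-8")
pin = wordlist_id("de", "1.0", v1)
```

A backup that records `pin` keeps resolving against `v1` even after `v2` is published, because `matches_pin(v2, pin)` is false and a wallet would not substitute the newer list.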

  17. in bip-multilingual-mnemonic.md:259 in 4bf395ea9f outdated
     254 | +
     255 | +Native-speaker review is recommended (SHOULD) rather than required (MUST) because its absence is a UX risk, not a cryptographic risk. The worst case is a poorly-chosen native word that a future PR can correct; no funds are at stake.
     256 | +
     257 | +The 9 non-English canonical BIP-39 wordlists are alphabetized independent word selections, not translations of the English list, so they cannot serve as a display layer over an English mnemonic without the user facing semantically unrelated tokens at each index. This convention does not replace those wordlists; it sits parallel to them and fills the role they do not fill.
     258 | +
     259 | +This convention does not eliminate the cross-wallet restore problem for display-only backups; it bounds the problem and defines wallet-level obligations (§Backup and portability policy) that mitigate it. The user-facing safety net is the canonical English mnemonic, which every conformant wallet exposes in any flow that shows a display mnemonic. A backup that includes the canonical English mnemonic is restorable in any BIP-39 wallet without depending on the receiving wallet's wordlist support.
    


    murchandamus commented at 10:06 PM on June 24, 2026:

    If the users have to end up recording both the display-words and the English words, how does this solve the issue that non-English speakers are significantly more likely to make mistakes recording the English words?


    osem23 commented at 4:51 AM on June 25, 2026:

    MUST 1 is an availability obligation on the wallet, not a requirement to record a second English copy. A user can back up in the display language only, and then there is no English transcription step and therefore no English transcription error, which is exactly the failure this removes. English stays viewable and exportable as the portability guarantee and safety net, surfaced and labeled. I clarified this in the text (6608dcb).

  18. in bip-multilingual-mnemonic.md:267 in 4bf395ea9f outdated
     262 | +
     263 | +## Security Considerations
     264 | +
     265 | +- **PBKDF2 input is invariant under this convention.** Only the canonical English mnemonic reaches PBKDF2-HMAC-SHA512. An implementation that feeds the display mnemonic directly to PBKDF2 is non-conformant and produces incompatible seeds. The conformance test vectors in the reference registry exercise the resolve-to-English path for every supported language.
     266 | +- **Strict single-wordlist tokenization.** On restore, every token in the display mnemonic MUST resolve within a single display wordlist. Wallets MUST NOT silently accept mnemonics whose tokens span multiple wordlists, partial-match across wordlists, or fall through to the canonical English wordlist when a display token is unrecognized. Mixed-wordlist input is malformed and is rejected.
     267 | +- **Only the canonical English mnemonic guarantees cross-wallet recovery.** A user whose wallet supports a display wordlist can always recover the seed in any BIP-39 wallet by entering the canonical English mnemonic. A user who backs up only the display mnemonic and then needs to restore in a wallet that does not support the same display wordlist cannot recover without the mapping. The normative wallet-level obligations that follow from this property are defined in §Backup and portability policy above.
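
The strict single-wordlist rule in the quoted hunk might look like this in Python. It is a sketch under stated assumptions: `resolve_strict` and the list names are invented, the wordlists are three-entry toys, and rejecting input that resolves fully in more than one list (rather than asking the user which language was meant) is a choice made for this example only.

```python
import unicodedata

def resolve_strict(tokens, wordlists):
    """Return (list_name, indices) if all tokens resolve in exactly one list.

    wordlists maps a hypothetical list name to its entries. There is no
    per-token fallback: input that only resolves by mixing lists is rejected.
    """
    tokens = [unicodedata.normalize("NFC", t) for t in tokens]
    hits = []
    for name, entries in wordlists.items():
        index_of = {unicodedata.normalize("NFC", w): i for i, w in enumerate(entries)}
        if all(t in index_of for t in tokens):
            hits.append((name, [index_of[t] for t in tokens]))
    if len(hits) != 1:
        raise ValueError("mnemonic does not resolve within a single wordlist")
    return hits[0]
```

Given an English toy list and a display toy list, a mnemonic written entirely in display tokens resolves to that list's indices, while one that mixes a display token with an English word raises instead of falling through to English.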
    


    murchandamus commented at 10:12 PM on June 24, 2026:

    I was somewhat excited by your idea at first, but this approach seems to undermine a big portion of the potential utility of this BIP. If the wordlists are not intended to be stable, I am not sure I see the point.


    osem23 commented at 4:51 AM on June 25, 2026:

    Agreed, and they are stable. The registry pins v1.0 with the SHA-256 as the load-bearing identifier; lists are frozen per version and never mutated in place. The Rationale line that implied otherwise was the bug, and I fixed it (6608dcb). Stability is the point, the same way it is for BIP-39 itself.

  19. murchandamus commented at 10:38 PM on June 24, 2026: member

    I gave this a quick first read. I like the idea of normalizing to the English wordlist under the hood as it directly mitigates one of the worst issues with BIP39’s portability.

    That said, the approach to the additional languages feels unappealing to me: producing initial lists by mechanically translating the English word list is bound to cause a number of issues such as the described concerns with terms composed of multiple words and diacritics, which would persist especially for wordlists that don’t get review before publication. As such wordlists would have room for improvement, it implies that there would soon be multiple wordlists for some languages which would cause even more confusion on top of BIP39 language lists vs display language lists. It seems worthwhile to try and pursue more stable higher quality lists from the get-go, so that more languages would only ever have a single wordlist to converge on.

    Given the numerous pull requests we’ve had to the BIPs repository where people tried to add more wordlists to BIP39, I would like to suggest only shipping a framework for more languages to be added instead of shipping with placeholder language lists, and to leave the creation of wordlists to the respective language communities.

    Since you are creating a new mnemonic scheme that is essentially a breaking change to BIP39 for every language but English, I would alternatively propose that you go further and create a new scheme that is not backwards compatible with BIP39 but instead addresses all issues with BIP39:

    • use the indices of the words to generate the seed instead of hashing text
    • encode a version
    • use a better checksum
    • if possible encode information about the output script pattern used
    • maybe create a generic encoding of data with words that then is used to encode a seed in a second BIP

    Preferably such a scheme would also use a different number of words so that it cannot be mixed up with BIP39.

  20. jonatack commented at 11:27 PM on June 24, 2026: member

    > Yes, I used AI as a writing tool, and I'm not going to pretend otherwise. I'm proud of it. But "AI generated" isn't really the question for a spec. The question is whether it's correct and implementable. I stand behind every line and I understand every line. If any specific clause reads as wrong, vague, or unnecessary, point at it and I'll fix or defend it. That's the feedback I can actually use, and this spec is built to be checked rather than trusted.

    Thank you for clarifying. My goal isn't to stigmatize and I'm still trying to figure out the best way to handle LLM-generated submissions. I think it's mildly preferable to state upfront to readers and reviewers when the content is mostly LLM output, and to what extent, out of respect for their time. Some may indeed not see any issue. Others may not wish to spend scarce review cycles doing human review of LLM output, or may prefer to delegate review of LLM output out to LLMs, because human review is a scarce and expensive resource. The idea is to respect the community's time and help them allocate it well.

  21. Address review: list stability, English-availability, framework framing
    - Remove the incorrect "future PR can correct, no funds at stake" line.
      Corrections are new versioned lists; published lists are frozen; backups
      resolve against the pinned version (SHA-256).
    - Clarify Backup MUST 1 is an availability obligation on the wallet, not a
      requirement that the user record a second English copy.
    - State explicitly that the BIP specifies a framework and blesses no
      individual wordlist as canonical; list creation belongs to language
      communities.
    6608dcb931
  22. osem23 commented at 4:51 AM on June 25, 2026: none

    Thanks for the careful read. We agree on the core: normalizing to English under the hood is the win.

    On mechanical translation and "multiple lists per language", I think we're closer than it reads. The BIP ships no wordlists into this repo and blesses none as canonical. It specifies the framework: construction, mapping, and input rules, plus a conformance profile where every wordlist-level MUST maps to an executable check. The 30 lists live in a separate registry as a bootstrap corpus, supersedable by native-speaker review. I made that explicit in 6608dcb. So "ship a framework, leave creation to the communities" is the intended end state, not a conflict with it.

    On why I shipped a starting corpus and not just an empty framework: it expands practical BIP-39 coverage from 10 languages to ~30 today. The 10 canonical lists cover roughly a third of people by native language. The other two thirds, about 5 billion native speakers, have no list at all. A working corpus, even one communities later refine, lets wallets onboard those users now instead of waiting for 20 separate community list efforts to each reach completion. That reach, opening Bitcoin self-custody to people in their own language, is the whole point of the proposal.

    On "multiple lists cause confusion": that's what the (language, version, SHA-256) triple is for. Two lists for one language are two versions, and each backup names the one that produced it. BIP-39 today carries no version identifier at all, so this is strictly more robust, not less.

    On stability: I'm committing to immutability-by-version. A published list is frozen, corrections are new versions, no existing backup is invalidated. (You caught a Rationale line that said the opposite; fixed in 6608dcb.)

    On going further to a new, non-backwards-compatible scheme (indices to seed, version byte, stronger checksum, script-type encoding, distinct word count): I think that's worth doing, but it's a BIP-39 successor and a different document. This proposal's entire value is zero new cryptographic surface and universal restore in the installed base today, including English-only wallets. Folding a successor in forfeits exactly that, and helps none of those ~5 billion speakers now. I'd support a successor effort on its own track and would contribute, but I'd keep the two separate so this one stays deployable.

  23. osem23 commented at 9:38 AM on June 25, 2026: none

    This is not something I originally set out to work on.

    My main work is BlockSight.Live, a free Bitcoin explorer. I have worked in the Bitcoin ATM industry in Israel for the last 3+ years and have seen thousands of regular users interact with Bitcoin.

    My main goal has always been to build useful tools for Bitcoiners.

    While building a Bitcoin wallet with the native explorer integrated into it, I encountered the BIP39 language issue directly, and it bothered me.

    Users in Israel still generally have to write down their seed words in English. For many people, that is not natural, and I think it creates a real backup and recovery risk.

    I am not trying to change BIP39 itself.

    I am trying to explore whether this can be made better while keeping English BIP39 as the canonical base.


This is a metadata mirror of the GitHub repository bitcoin/bips. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-06-25 13:10 UTC
