Mnemonic sentences instead of words proposed as forwards- and backwards-compatible expansion to BIP39, itself as Bitcoin Improvement Proposal.
BIP Draft: Formosa — Themed mnemonic sentences for generating deterministic keys #2108
pull Yuri-SVB wants to merge 11 commits into bitcoin:master from Yuri-SVB:master changing 1 file +422 −0
Yuri-SVB commented at 8:16 PM on February 28, 2026: none
-
ea51d9b4b1
Formosa as BIP
Mnemonic *sentences* instead of words proposed as forwards- and backwards-compatible expansion to BIP39, itself as Bitcoin Improvement Proposal.
-
in bip.mediawiki:4 in ea51d9b4b1 outdated
0 | @@ -0,0 +1,224 @@
1 | +<pre>
2 | + BIP: ?
3 | + Layer: Applications
4 | + Title: Formosa --- Themed mnemonic sentences for generating deterministic keys
murchandamus commented at 7:24 PM on March 2, 2026:
Title is limited to 50 characters
Yuri-SVB commented at 9:44 PM on March 23, 2026:
New title: Encoding seed as themed mnemonic sentences
in bip.mediawiki:14 in ea51d9b4b1
9 | + Status: Draft
10 | + Type: Standards Track
11 | + Created: 2021-12-10
12 | + License: BSD-2-Clause
13 | + Requires: BIP-0032, BIP-0039
14 | + Post-History: https://www.toptal.com/cryptocurrency/formosa-crypto-wallet-management
murchandamus commented at 7:27 PM on March 2, 2026:
As we are now following BIP3 for the BIP Process, the Preamble is formatted slightly differently:
  Authors: Yuri S Villas Boas <yuri@t3infosecurity.com>
           André Fidencio Gonçalves <andre7c4@gmail.com>
  Status: Draft
  Type: Specification
  Assigned: ?
  License: BSD-2-Clause
  Requires: 32, 39
  Discussion: https://gnusha.org/pi/bitcoindev/jQqInjh7VTC5byefTzENidJjigvRqf5Y7UvbrWjKPJykvhdlLETeglGE3zoAiVAxUyAXU8uWHsHEjJ0MHqqPTy4prgaIhgMyIrD9c6ZUuE0=@pm.me/#t
              https://gnusha.org/pi/bitcoindev/F4cs-RJRQYBXhjoS9fc_cUc93yLrkQS5DNQAeFRHrLEQ5bScCjKSnaqN-IcXb16fxqO053muqFCx8_GzzKN5XCGCIHD9Ir1_baI5voKYfOo=@pm.me/
              https://www.toptal.com/cryptocurrency/formosa-crypto-wallet-management
murchandamus commented at 7:31 PM on March 2, 2026: member
Hi Yuri, thank you for your submission. I see that your proposal was posted to the mailing list in 2023. Since then, we deployed BIP3 as a new BIP Process, so a few formatting changes are needed in the preamble. I would also suggest that you add a link to the prior discussion to the Discussion header.
At first glance, your document appears to be missing a Specification, a Rationale, and a Backwards Compatibility section. Please refer to BIP3 for more information.
murchandamus added the label New BIP on Mar 4, 2026
murchandamus added the label PR Author action required on Mar 4, 2026
murchandamus renamed this:
Formosa as BIP
BIP Draft: Formosa — Themed mnemonic sentences for generating deterministic keys
on Mar 16, 2026
murchandamus commented at 6:11 PM on March 16, 2026: member
Hi @Yuri-SVB, I haven’t given this document a full review yet, because the initial submission has some formatting issues. If you are still working on this, please update your submission to meet the formatting requirements.
Yuri-SVB commented at 4:35 PM on March 23, 2026: none
Hello, Murchandamus. Thank you for your attention, and thank you for remembering my earlier attempt from 3 years ago! I believe the requirements are met now.
3166be9419
Update bip.mediawiki
Co-authored-by: Mark "Murch" Erhardt <murch@murch.one>
738dac9c16
Update bip.mediawiki
Satisfying requirement of title in fewer than 50 characters.
Yuri-SVB commented at 5:50 PM on March 28, 2026: none
> Hi @Yuri-SVB, I haven’t given this document a full review yet, because the initial submission has some formatting issues. If you are still working on this, please update your submission to meet the formatting requirements.
Hello, Murch! Could you confirm that all the remaining formatting requirements are met? Thank you!
murchandamus commented at 5:05 AM on March 29, 2026: member
Hey Yuri, sorry for not getting around to looking at this yet. The preamble looks much better. I’m afraid I’m gonna be afk next week, so I will not be able to give this a full read until I’m back the week after.
murchandamus removed the label PR Author action required on Mar 29, 2026
Yuri-SVB commented at 4:54 PM on April 1, 2026: none
> Hey Yuri, sorry for not getting around to looking at this yet. The preamble looks much better. I’m afraid I’m gonna be afk next week, so I will not be able to give this a full read until I’m back the week after.
Hello, Murch. No problem. I hope this compilation of references on Formosa (as I call this BIP39 expansion) can be of help.
in bip.mediawiki:93 in 738dac9c16
88 | +| 128 | 4 | 132 | 4 | 24 | 12 |
89 | +| 160 | 5 | 165 | 5 | 30 | 15 |
90 | +| 192 | 6 | 198 | 6 | 36 | 18 |
91 | +| 224 | 7 | 231 | 7 | 42 | 21 |
92 | +| 256 | 8 | 264 | 8 | 48 | 24 |
93 | +</pre>
murchandamus commented at 7:41 PM on April 14, 2026:
I’m not completely opposed to a text-only presentation, but wanted to point out that Mediawiki does include table formatting, and most readers of the BIP would probably see the rendered version. When using table formatting, I think it would be possible to skip the abbreviations in the table labels and it would still fit.
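The rows of the table under review follow from simple arithmetic, which can be sketched as below. This is an illustrative sketch, not code from the PR: the function name `formosa_table_row` is hypothetical, and the column meanings (initial entropy bits, checksum bits, total bits, number of sentences, mnemonic words for a 6-word theme, mnemonic words for BIP-0039) are taken from the spelled-out labels mentioned later in this thread.

```python
def formosa_table_row(entropy_bits, words_per_sentence=6):
    """Return one table row for a given entropy length.

    Hypothetical helper: columns are (entropy bits, checksum bits,
    total bits, 33-bit sentences, words in a 6-word theme,
    words in BIP-0039).
    """
    if entropy_bits % 32 != 0:
        raise ValueError("entropy length must be a multiple of 32 bits")
    checksum_bits = entropy_bits // 32          # one checksum bit per 32 entropy bits
    total_bits = entropy_bits + checksum_bits   # always a multiple of 33
    sentences = total_bits // 33                # each sentence encodes 33 bits
    return (
        entropy_bits,
        checksum_bits,
        total_bits,
        sentences,
        sentences * words_per_sentence,         # themed mnemonic length
        sentences * 3,                          # BIP-0039: three 11-bit words per sentence
    )


for bits in (128, 160, 192, 224, 256):
    print(formosa_table_row(bits))
```

Because the total bit count is `33 * entropy_bits / 32`, every supported entropy length splits into a whole number of 33-bit sentences.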
Yuri-SVB commented at 12:40 AM on April 27, 2026:
in bip.mediawiki:79 in 738dac9c16
74 | +BIP-0039 is a special case where each sentence contains three 11-bit fields
75 | +indexing a single 2048-word list (3 x 11 = 33).
76 | +
77 | +The following table describes the relation between the initial entropy
78 | +length (ENT), the checksum length (CS), the number of 33-bit sentences (S),
79 | +and the length of the generated mnemonic sentence (MS) in words. The word
murchandamus commented at 7:46 PM on April 14, 2026:
It’s slightly confusing that you speak about multiple sentences that together compose a single mnemonic sentence. Perhaps it would be better to use distinct terms, i.e., to use a different term for sentences or for the mnemonic sentence. I’m not convinced it’s the right suggestion, but perhaps, S sentences make one “mnemonic story” with MS words?
Yuri-SVB commented at 12:39 AM on April 27, 2026:
in bip.mediawiki:101 in 738dac9c16
96 | +
97 | +# Initialize an empty sentence array with one slot per category.
98 | +# For each category in the theme's ''filling order'':
99 | +## Extract <code>BIT_LENGTH</code> bits from the current position in the bit stream.
100 | +## Interpret them as an unsigned integer index.
101 | +## If the category is ''led by'' another category, look up the appropriate sub-list from the leading category's mapping using the already-selected leading word. Otherwise, use the category's total word list.
murchandamus commented at 8:22 PM on April 14, 2026:
I don’t understand what you mean with “if the category is led by”
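One way to read the "led by" step in the quoted algorithm is sketched below. Everything here is an assumption made for illustration: the two-category `TOY_THEME`, its words, and the dictionary layout (`TOTAL_LIST`, `MAPPING`) are hypothetical and are not the schema of the reference implementation. The sketch only shows that a led category picks its word from a sub-list selected by the word already chosen for its leader.

```python
# Hypothetical toy theme: SUBJECT is "led by" VERB, so the sub-list used for
# SUBJECT depends on which VERB word was selected first.
TOY_THEME = {
    "FILLING_ORDER": ["VERB", "SUBJECT"],   # order in which bits are consumed
    "NATURAL_ORDER": ["SUBJECT", "VERB"],   # order in which words are displayed
    "VERB": {
        "BIT_LENGTH": 1,
        "LED_BY": None,
        "TOTAL_LIST": ["guard", "unveil"],
        # leading word -> {led category -> sub-list of 2**BIT_LENGTH words}
        "MAPPING": {
            "guard": {"SUBJECT": ["knight", "wolf"]},
            "unveil": {"SUBJECT": ["king", "wizard"]},
        },
    },
    "SUBJECT": {"BIT_LENGTH": 1, "LED_BY": "VERB"},
}


def encode_sentence(bits, theme):
    """Map a bit string to one sentence, following the steps quoted above."""
    expected = sum(theme[c]["BIT_LENGTH"] for c in theme["FILLING_ORDER"])
    if len(bits) != expected:
        raise ValueError("expected %d bits, got %d" % (expected, len(bits)))
    chosen = {}
    position = 0
    for category in theme["FILLING_ORDER"]:
        spec = theme[category]
        width = spec["BIT_LENGTH"]
        index = int(bits[position:position + width], 2)  # unsigned integer index
        position += width
        leader = spec["LED_BY"]
        if leader is None:
            words = spec["TOTAL_LIST"]                    # leaderless case
        else:
            # led case: sub-list keyed by the leader's already-selected word
            words = theme[leader]["MAPPING"][chosen[leader]][category]
        chosen[category] = words[index]
    return [chosen[c] for c in theme["NATURAL_ORDER"]]


print(encode_sentence("10", TOY_THEME))
```

The filling order has to place every leader before the categories it leads, otherwise the sub-list lookup would have no selected word to key on.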
Yuri-SVB commented at 12:38 AM on April 27, 2026:
in bip.mediawiki:129 in 738dac9c16
124 | + - the wordlist is sorted which allows for more efficient lookup of the code words
125 | + (i.e. implementations can use binary search instead of linear search)
126 | +
127 | +d) first-letters uniqueness
128 | + - the wordlist is created in such a way that it's enough to type the first two
129 | + letters to unambiguously identify the word
murchandamus commented at 8:29 PM on April 14, 2026:
This formatting is odd. Did you intend to make those code blocks?
<img width="754" height="499" alt="Image" src="https://github.com/user-attachments/assets/c67d7b88-2640-494f-a005-68d13d1afc99" />
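The two wordlist properties quoted in this hunk (sorted order and first-two-letters uniqueness) can be checked mechanically. The following is a minimal sketch under stated assumptions: `SAMPLE_WORDS` is a made-up list, not one of the Formosa wordlists, and the helper names are hypothetical.

```python
from bisect import bisect_left

# Hypothetical sample list; real theme wordlists are much longer.
SAMPLE_WORDS = ["archer", "bishop", "castle", "dragon", "elf", "king", "queen", "wizard"]


def is_sorted(words):
    """True if the list is strictly ascending, which enables binary search."""
    return all(a < b for a, b in zip(words, words[1:]))


def has_unique_prefixes(words, length=2):
    """True if the first `length` letters identify every word unambiguously."""
    return len({word[:length] for word in words}) == len(words)


def index_of(words, word):
    """Binary-search lookup of a word's index in a sorted list."""
    position = bisect_left(words, word)
    if position < len(words) and words[position] == word:
        return position
    raise KeyError(word)


def complete(words, prefix):
    """Expand a typed prefix to the single word it identifies."""
    matches = [word for word in words if word.startswith(prefix)]
    if len(matches) != 1:
        raise KeyError(prefix)
    return matches[0]


print(is_sorted(SAMPLE_WORDS), has_unique_prefixes(SAMPLE_WORDS))
print(index_of(SAMPLE_WORDS, "king"), complete(SAMPLE_WORDS, "qu"))
```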
Yuri-SVB commented at 12:38 AM on April 27, 2026:
murchandamus commented at 8:36 PM on April 14, 2026: member
This reads already pretty well, although the specification could be presented in a more technical manner. It seems a bit light on the Rationale. It would be preferable if there were a Backwards Compatibility section instead of the mention in the Abstract.
I think an example of a Formosa-encoded seed could help illustrate what you are trying to do. I was firmly expecting to see one until I got to the end.
murchandamus added the label PR Author action required on Apr 14, 2026
f5b0a1e942
Formosa: address PR #2108 review feedback
Restructure the draft to follow BIP-3 conventions and resolve the issues raised by reviewers in https://github.com/bitcoin/bips/pull/2108:
- Introduce explicit Specification section with a Terminology subsection that distinguishes 'word', 'category', 'theme', 'sentence' and 'mnemonic' / 'mnemonic story', removing the ambiguity of using 'sentence' at two different scales.
- Replace the unclear 'if the category is led by another category' wording with an explicit LED_BY field description and a step-by-step algorithm that covers both the leaderless and led cases.
- Reflow the theme-property list (previously a/b/c/d/e split by an intervening paragraph) into a single numbered list so it renders as a list rather than as code blocks.
- Add a dedicated Rationale section covering the 33-bit sentence size, themed sentences, free-form theme schema, the LED_BY mechanism, the re-encoding-through-BIP-39 design, and why custom themes are discouraged.
- Add a dedicated Backwards Compatibility section describing compatibility at the mnemonic, entropy, and seed levels.
- Add a worked Example section showing a 128-bit entropy being encoded into a 4-sentence mnemonic story under a small illustrative theme, including bit splitting, FILLING_ORDER vs NATURAL_ORDER, and the LED_BY lookup.
- Tighten the Abstract and Motivation; clarify that BIP-39 is itself a Formosa theme.
ac185147e0
Formosa: spell out abbreviated table labels
Reviewer on PR #2108 asked for no abbreviations in table labels. Replace:
- ENT / CS / S / MS column headers with 'Initial entropy bits', 'Checksum bits', 'Total bits', 'Number of sentences', 'Mnemonic words (6-word theme)' and 'Mnemonic words (BIP-0039)'.
- 'List size / Bits / Chars to identify / Density (bits/char)' with 'Wordlist size / Bits per word / Characters to identify / Density (bits per character)'.
- ADJ. with ADJECTIVE in the example bit-assignment diagram, and the surrounding narrative ENT/MS uses with the spelled-out forms.
The accompanying formulas now use the expanded names too, so the algorithm description and the table column headers stay consistent.
621fa45042
Formosa: rebuild Example on the real medieval_fantasy theme
Replace the previous hypothetical 5-category example with one that mirrors the medieval_fantasy theme actually shipped at https://github.com/Yuri-SVB/formosa/tree/master/src/mnemonic/themes, including:
- the real 6 categories with their actual BIT_LENGTHs (VERB=5, SUBJECT=6, OBJECT=6, ADJECTIVE=5, WILDCARD=6, PLACE=5, summing to 33);
- the real FILLING_ORDER and NATURAL_ORDER;
- the real lead tree (VERB → SUBJECT; SUBJECT → OBJECT and WILDCARD; OBJECT → ADJECTIVE; WILDCARD → PLACE), showing that a single leader can have several dependent categories;
- a 33-bit block whose decoded indices (28, 32, 63, 27, 46, 29) pick existing words and existing sub-list entries: VERB[28]=unveil, SUBJECT_under_unveil[32]=king, OBJECT_under_king[63]=wine, ADJECTIVE_under_wine[27]=sweet, WILDCARD_under_king[46]=queen, PLACE_under_queen[29]=throne_room, yielding the sentence 'king unveil sweet wine queen throne_room'.
This keeps the worked example faithful to the reference implementation rather than to a fabricated theme, so that anyone can reproduce the encoding by parsing medieval_fantasy.json.
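The bit arithmetic behind this example can be reproduced without the theme file. The sketch below uses only the bit lengths, indices, and words stated in the commit message; the filling order (VERB, SUBJECT, OBJECT, ADJECTIVE, WILDCARD, PLACE) is an assumption inferred from the order in which the indices are listed, the natural order is read off the resulting sentence, and the function names are hypothetical.

```python
# Bit lengths as stated in the commit message; the order is an assumed
# FILLING_ORDER (it matches the order in which the indices are listed).
FILLING_ORDER = [
    ("VERB", 5), ("SUBJECT", 6), ("OBJECT", 6),
    ("ADJECTIVE", 5), ("WILDCARD", 6), ("PLACE", 5),
]
# Natural order inferred from 'king unveil sweet wine queen throne_room'.
NATURAL_ORDER = ["SUBJECT", "VERB", "ADJECTIVE", "OBJECT", "WILDCARD", "PLACE"]

INDICES = {"VERB": 28, "SUBJECT": 32, "OBJECT": 63,
           "ADJECTIVE": 27, "WILDCARD": 46, "PLACE": 29}
WORDS = {"VERB": "unveil", "SUBJECT": "king", "OBJECT": "wine",
         "ADJECTIVE": "sweet", "WILDCARD": "queen", "PLACE": "throne_room"}


def pack_block(indices):
    """Concatenate per-category indices into one 33-bit block (as a bit string)."""
    parts = []
    for name, width in FILLING_ORDER:
        if not 0 <= indices[name] < 2 ** width:
            raise ValueError("index out of range for " + name)
        parts.append(format(indices[name], "0{}b".format(width)))
    return "".join(parts)


def unpack_block(block):
    """Split a 33-bit block back into per-category indices."""
    indices, position = {}, 0
    for name, width in FILLING_ORDER:
        indices[name] = int(block[position:position + width], 2)
        position += width
    return indices


block = pack_block(INDICES)
print(len(block), block)
print(" ".join(WORDS[name] for name in NATURAL_ORDER))
```

Packing and unpacking are inverses of each other, which is the property that lets a sentence be decoded back to the same 33 bits.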
2d87a3cbe5
Formosa: explain LED_BY as a primitive next-word predictor
Add a paragraph to the LED_BY rationale clarifying that a Formosa theme behaves as a primitive language model (next-word predictor): each LED_BY relation skews the conditional distribution over the next word so that probability mass falls only on the 2^BIT_LENGTH words compatible with the already-chosen leader, and zero elsewhere. The theme designer plays the role of training data, hand-curating which combinations are semantically coherent. This framing makes explicit why themes produce sentences that 'sound right' while still covering all 2^33 bit patterns of a sentence.
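The next-word-predictor framing can be written down as a conditional distribution. The notation below is introduced here for illustration only: $L_c(\ell)$ denotes the sub-list of category $c$ under leading word $\ell$, $b_c$ is the category's BIT_LENGTH, and uniformity within the sub-list is an assumption that holds when the encoded bits are uniformly random.

```latex
P(w \mid \ell) =
\begin{cases}
  2^{-b_c} & \text{if } w \in L_c(\ell), \\
  0        & \text{otherwise,}
\end{cases}
\qquad |L_c(\ell)| = 2^{b_c}.

% With the medieval_fantasy bit lengths (5, 6, 6, 5, 6, 5) the number of
% distinct sentences is
2^{5} \cdot 2^{6} \cdot 2^{6} \cdot 2^{5} \cdot 2^{6} \cdot 2^{5} = 2^{33}.
```

Since every sub-list has exactly $2^{b_c}$ entries regardless of the leader, the product of choices is the same along every path of the lead tree, which is why all $2^{33}$ bit patterns remain reachable.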
000a7401d9
Cite the companion project Mooncake (https://github.com/T3-Infosec/mooncake)
which builds on this property by rendering each Formosa category as an on-screen table whose rows and columns are permuted per input session. Combined with the randomized-indexation property, an attacker watching only the screen still learns nothing without also recovering the press sequence.
Add a Rationale paragraph explaining a further benefit of splitting the vocabulary into several short wordlists (32-128 entries each): such tables fit on a mobile-device screen and admit input via on-screen lookup, which a single 2048-word list does not. The randomized indexation:
- defeats pure key-logging (keystrokes alone don't reveal words; the attacker also needs the session permutation),
- raises the bar for shoulder surfing (same as key-logging: only the keys AND the session's permutation together suffice; either alone is uninformative).
This gives an operational, security-focused argument for the many-small-lists design that complements the existing memorization and information-density arguments.
Formosa: document Mooncake's volume-key input on mobile
Add a paragraph to the Mooncake rationale describing the proposed mobile input mechanism: reuse of the volume-up / volume-down keys as a two-button binary selector. Because every Formosa category is sized 2^BIT_LENGTH and the on-screen table is laid out in rows, sub-rows and columns whose counts are powers of two, narrowing to a single cell takes exactly BIT_LENGTH presses (5 for a 32-entry category, 6 for 64, 7 for 128). The per-category press count is invariant, and therefore uninformative, and equals the number of entropy bits encoded; the 'one bit per press' bound matches the existing side-channel argument.
Add three concrete reasons why volume-key input on mobile resists visual shoulder surfing better than an on-screen keyboard:
- Subtler input motions: a single finger pressing a side rocker, much harder to read from a distance than multi-finger taps on a glass keyboard.
- Easy occlusion with the second hand: both volume keys are on one edge of the device, so the free hand (or the holding hand's thumb) can cover them without obscuring the screen for the user.
- Pocket input via headphone volume buttons: because the protocol is purely binary, headphone volume controls are sufficient, letting the user keep the buttons in a pocket while operating it by feel and removing the input motion from the observer's field of view entirely.
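The two-button binary selection described in this commit message can be illustrated with a short simulation. This is only a sketch and not Mooncake's code: it flattens the rows / sub-rows / columns layout into a single list that is halved on each press, and the names `shuffled_table`, `presses_for`, and `select_by_presses` as well as the "up" / "down" convention are assumptions.

```python
import random


def shuffled_table(words, rng):
    """Per-session permutation of a category's words (assumed flat layout)."""
    table = list(words)
    rng.shuffle(table)
    return table


def presses_for(table, word):
    """Press sequence the user would enter to reach `word` in this session's table."""
    size = len(table)
    if size < 2 or size & (size - 1):
        raise ValueError("table size must be a power of two")
    width = size.bit_length() - 1                      # equals BIT_LENGTH
    bits = format(table.index(word), "0{}b".format(width))
    return ["down" if bit == "1" else "up" for bit in bits]


def select_by_presses(table, presses):
    """Halve the candidate range on every press; 'up' keeps the first half."""
    low, high = 0, len(table)
    for press in presses:
        middle = (low + high) // 2
        if press == "up":
            high = middle
        else:
            low = middle
    if high - low != 1:
        raise ValueError("press sequence does not narrow to a single cell")
    return table[low]


words = ["word%02d" % i for i in range(32)]            # hypothetical 32-entry category
table = shuffled_table(words, random.Random(2108))     # fixed seed for reproducibility
presses = presses_for(table, "word07")
print(len(presses), select_by_presses(table, presses))
```

In this model the press sequence identifies a position in the permuted table rather than a word, so an observer needs both the presses and the session's permutation to recover the word.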
murchandamus removed the label PR Author action required on Apr 27, 2026
38c7dfd754
Update bip.mediawiki
Fixed typo from "dektop" to "desktop"
Fixed agreement of number from "Those of a mobile device" to "Those of mobile devices"
in bip.mediawiki:51 in 38c7dfd754
46 | +the mental associations that aid long-term retention.
47 | +
48 | +Formosa builds upon BIP-0039 by organizing mnemonic words into themed sentences
49 | +with syntactic roles (e.g., subject, verb, adjective, object, place). Each
50 | +sentence draws vocabulary from a coherent semantic domain --- medieval fantasy,
51 | +science fiction, nature, finance, or any custom theme --- enabling the user to
murchandamus commented at 4:27 PM on April 29, 2026:
The triple hyphen doesn’t get rendered as a special character in Mediawiki markup, so perhaps just use em dashes:
sentence draws vocabulary from a coherent semantic domain — medieval fantasy,
science fiction, nature, finance, or any custom theme — enabling the user to
in bip.mediawiki:4 in 38c7dfd754 outdated
0 | @@ -0,0 +1,422 @@
1 | +<pre>
2 | + BIP: ?
3 | + Layer: Applications
4 | + Title: Encoding seed as themed mnemonic sentences
murchandamus commented at 4:44 PM on April 29, 2026:
“Formosa” has better memorability. E.g., the following has 50 characters:
Title: Formosa—Seed encoding per themed mnemonic stories
Yuri-SVB commented at 10:50 PM on April 29, 2026:
Thank you! 'Formosa' alludes to 'format', since it's a format for passwords / entropy arrays.
murchandamus commented at 4:47 PM on April 29, 2026: member
Good improvements, this reads great. I’m gonna look into a number assignment. It would probably be good if some wallet developers that have worked with BIP39 reviewed it, too.
murchandamus added the label Needs number assignment on Apr 29, 2026
923faa4880
Update bip.mediawiki
Substituted triple hyphen for —
Co-authored-by: Murch <murch@murch.one>
08df954e5f
Update bip.mediawiki
Updated title to mention Formosa and be more self-explanatory.
Co-authored-by: Murch <murch@murch.one>
Yuri-SVB commented at 10:52 PM on April 29, 2026: none
> Good improvements, this reads great. I’m gonna look into a number assignment. It would probably be good if some wallet developers that have worked with BIP39 reviewed it, too.
Do you have someone in mind? Would you like me to invite a wallet developer?
codeswot commented at 2:59 AM on May 5, 2026: none
I was one of the first people to try out Formosa, and I have worked with it to some extent in other apps, notably Mooncake, a wallet app with side-channel attack protection (I used Formosa there and loved it). I also ported Formosa from Python to Dart at some point. While my involvement makes me biased, I am providing this review from the perspective of an implementer to verify the protocol's stability and compatibility.
BIP-39 Compatibility: The Formosa mapping function is a reversible transformation. It is a strict 1:1 mapping of bits to sentence structures. I can confirm that a seed generated via Formosa can be exported to any standard BIP-39 wallet (e.g., Trezor, Ledger, Coldcard) without modification. The mnemonic sentence acts as an encoding layer for the underlying BIP-39 word list, not an alternative cryptographic standard.
Entropy Density: The proposal maintains full 128-bit security by strictly mapping the sentence structures to the defined bit-length. There is no reduction in entropy; the 'themed' words are simply a semantic overlay for the underlying integer values.
Defense-in-Depth: The inclusion of the Mooncake module in our reference implementation demonstrates that the Formosa format allows for UI-level side-channel mitigations (specifically against shoulder surfing and screen capture) that are difficult to implement with standard, non-structured word lists.
I have verified that the implementation treats the mnemonic as a deterministic derivation of the seed. I am happy to provide test vectors or answer any questions the maintainers have regarding the wallet-import behavior.
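The "encoding layer" relationship claimed above can be illustrated at the bit level. The sketch below assumes that Formosa reuses BIP-39's checksum rule (the first ENT/32 bits of the SHA-256 of the entropy), which is consistent with the table in the draft but is not quoted in this thread; the function names are hypothetical. It splits the checksummed entropy into 33-bit blocks and shows that each block corresponds to exactly three 11-bit BIP-39 word indices, and back.

```python
import hashlib


def entropy_to_blocks(entropy):
    """Append the BIP-39 checksum and split the bit stream into 33-bit blocks."""
    entropy_bits = len(entropy) * 8
    if entropy_bits not in (128, 160, 192, 224, 256):
        raise ValueError("unsupported entropy length")
    checksum_bits = entropy_bits // 32                   # at most 8 bits
    digest = hashlib.sha256(entropy).digest()
    stream = "".join(format(byte, "08b") for byte in entropy)
    stream += format(digest[0], "08b")[:checksum_bits]   # leading bits of SHA-256
    return [stream[i:i + 33] for i in range(0, len(stream), 33)]


def block_to_bip39_indices(block):
    """One 33-bit block read as three 11-bit indices into a 2048-word list."""
    return [int(block[i:i + 11], 2) for i in range(0, 33, 11)]


def bip39_indices_to_block(indices):
    """Inverse of block_to_bip39_indices."""
    return "".join(format(index, "011b") for index in indices)


blocks = entropy_to_blocks(bytes(16))                    # 128 zero bits
indices = [i for block in blocks for i in block_to_bip39_indices(block)]
print(len(blocks), indices)
```

For all-zero 128-bit entropy the indices are eleven zeros followed by 3, matching the well-known BIP-39 test mnemonic that repeats "abandon" eleven times and ends in "about".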
This is a metadata mirror of the GitHub repository bitcoin/bips. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-05-09 19:10 UTC