BIP39 changes

slush0 commented at 12:33 pm on February 7, 2014: contributor

Hello,

BIP39 is already implemented in Trezor, Multibit HD and bitcoinj 0.11. Other projects also agreed to eventually implement it (Electrum). Please merge the latest version of BIP39 and change status from Draft to Accepted.

Thanks, Marek

PBKDF2 uses 2048 rounds 8d639fd794

Restructured, fixed some grammar/spelling errors. 8f8044b12d

Merge pull request #1 from ebfull/master

Restructured, fixed some grammar/spelling errors.

a3b576f6bd

Draft -> Accepted 78eaa26d9a

BIP 39 marked as Accepted 8eddd5b5f4

Fixed conflict

Conflicts:
	README.mediawiki

65442116b4

Fixed table formatting 5b98706b36

gmaxwell commented at 2:47 pm on February 7, 2014: contributor

PBKDF2 iteration counts this low are kind of snake-oily— not high enough to prevent millions of attempts per second and precomputation attacks. The low count was more acceptable when the design wasn’t being promoted as a “brain wallet”. To both recommend this as a brain wallet and reduce the iteration count further is concerning. The lack of a delegatable structure has also arisen as an issue with other wallet encoding formats.

I’m afraid that I couldn’t recommend this design as a best practice. I consider the current design irresponsible and unsafe.

prusnak commented at 3:04 pm on February 7, 2014: contributor

@gmaxwell: I don’t think that the low iteration count is a problem. The thing is that every passphrase is accepted and will generate a valid BIP32 root node (plausible deniability). An attacker would need to take this root node, descend to a node which contains a possible address, and check the blockchain to see whether it has been used already. That last operation is much more expensive than what a higher PBKDF2 iteration count, or scrypt instead of PBKDF2, would add.

Moreover user can send various small amounts of bitcoins to fake passphrases in order to confuse a possible attacker and protect the intended passphrase.
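A minimal sketch of why every passphrase yields a usable seed. The thread itself only confirms PBKDF2 with 2048 rounds; the HMAC-SHA512 PRF, the `mnemonic` salt prefix, the 64-byte output and the NFKD step are taken from the BIP39 text as later published and should be read as assumptions here. The function name and the sample sentence are illustrative only.

```python
import hashlib
import unicodedata


def mnemonic_to_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """Stretch a mnemonic sentence plus optional passphrase into a 64-byte seed."""
    # Assumption: NFKD normalization and the "mnemonic" salt prefix, as in the
    # published BIP39 text; this thread only fixes the round count at 2048.
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)


# Any passphrase gives *some* seed; nothing marks one of them as the real one.
sentence = "illustrative sentence standing in for a generated mnemonic"
seed_a = mnemonic_to_seed(sentence, "real passphrase")
seed_b = mnemonic_to_seed(sentence, "decoy passphrase")
```

Whether a given seed is the interesting one can only be decided by deriving addresses from it and looking for them on the blockchain, which is the extra cost described above.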

slush0 commented at 3:09 pm on February 7, 2014: contributor

Another point is that BIP39 is not supposed to be a classic “brainwallet”; source of entropy is a computer’s RNG, not a human mind, so the BIP39 sentence is already very strong, plus the passphrase described by @prusnak.

For these reasons I don’t think your concern is valid for expected usecases of bip39.

gmaxwell commented at 3:10 pm on February 7, 2014: contributor

Yes, doing the point multiply to get the pubkey is more costly— but my laptop can do around 170,000 G*s operations per second. Specialized hardware (e.g. retasked FPGAs) could be much, much faster. We’ve already seen in practice that brain wallets have frequently resulted in users getting robbed (google: brainwallet stolen). I would not personally support a BIP promoting brainwallets— which the revised text of the BIP currently does as of this pull request— unless it had ludicrous strengthening.

Moreover user can send various small amounts of bitcoins to fake passphrases in order to confuse a possible attacker

… Really?

prusnak commented at 3:12 pm on February 7, 2014: contributor

@gmaxwell Really

slush0 commented at 3:13 pm on February 7, 2014: contributor

Specialized hardware (e.g. retasked FPGAs) could be much, much faster.

The entropy of bip39 mnemonic is 128-256 bits. Bruteforcing that is impractical.
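A rough back-of-the-envelope check of that claim, assuming the mnemonic really carries 128 bits of RNG entropy. The 170,000 guesses per second figure is the laptop rate quoted earlier in the thread; the 10^12 rate is a purely hypothetical stand-in for specialized hardware.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_exhaust(bits: int, guesses_per_second: float) -> float:
    """Years needed to try every value of a `bits`-bit search space."""
    return (2 ** bits) / guesses_per_second / SECONDS_PER_YEAR


laptop_years = years_to_exhaust(128, 170_000)   # rate quoted in the thread
fast_years = years_to_exhaust(128, 1e12)        # hypothetical hardware rate

print(f"laptop: {laptop_years:.2e} years, fast hardware: {fast_years:.2e} years")
```

The estimate says nothing about sentences chosen by a human, whose search space is far smaller than 2^128.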

petertodd commented at 3:15 pm on February 7, 2014: contributor

Not only are you promoting brainwallets, but for the users who don’t use BIP39 as a brainwallet you’re risking their funds by not including a checksum. Even a very simple checksum, just 8 bits, would dramatically reduce the chance of this happening, and can be implemented on slow devices like a Trezor, yet you’re not doing that.

Like it or not you’re going to get people hurt.

petertodd commented at 3:16 pm on February 7, 2014: contributor

The entropy of bip39 mnemonic is 128-256 bits. Bruteforcing that is impractical.

The entropy of bip39 used as a brainwallet is not 128-256 bits…

slush0 commented at 3:22 pm on February 7, 2014: contributor

@gmaxwell @petertodd In the whole bip39 the “brainwallet” is mentioned only once with the meaning that you may remember the sentence generated by the computer.

In our proposal the mnemonic is supposed to be generated by the algorithm described here: https://github.com/trezor/bips/blob/master/bip-0039.mediawiki#generating-the-mnemonic . This creates a mnemonic with very high entropy and with a checksum; that’s how we implement it in Trezor.

The reason why generating the mnemonic is an “optional” part of the proposal is that some developers want to implement the mnemonic generator in a different way (but still with high entropy) while still achieving bip39 compatibility.
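The generation step referred to in this comment can be sketched roughly as follows. This assumes the scheme in the BIP39 text as published: the first ENT/32 bits of SHA-256(entropy) are appended as a checksum and the result is cut into 11-bit word indices. The draft under discussion may have differed in detail, and the wordlist lookup is left out so the sketch stays self-contained.

```python
import hashlib
from typing import List


def entropy_to_word_indices(entropy: bytes) -> List[int]:
    """Map 128-256 bits of entropy to 11-bit wordlist indices with a checksum."""
    ent_bits = len(entropy) * 8
    if ent_bits % 32 != 0 or not 128 <= ent_bits <= 256:
        raise ValueError("entropy must be 128-256 bits, in multiples of 32")
    cs_bits = ent_bits // 32  # one checksum bit per 32 bits of entropy
    digest = hashlib.sha256(entropy).digest()
    checksum = int.from_bytes(digest, "big") >> (256 - cs_bits)
    # Append the checksum bits to the entropy, then slice into 11-bit groups.
    value = (int.from_bytes(entropy, "big") << cs_bits) | checksum
    total_bits = ent_bits + cs_bits
    word_count = total_bits // 11
    return [(value >> (total_bits - 11 * (i + 1))) & 0x7FF for i in range(word_count)]


indices = entropy_to_word_indices(bytes(16))  # 128 zero bits, for illustration
```

In real use the entropy would come from a cryptographic RNG such as `os.urandom(16)`, and each index would select a word from a 2048-entry list.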

petertodd commented at 3:39 pm on February 7, 2014: contributor

Which is the problem, “bip39 compatibility” will end up getting achieved by letting invalid checksums be ignored, users will accidentally enter in seeds that are invalid, and they’re not going to notice. Like I said before on the email list, if you’re stuck with the UTF8 strings requirement - which I think is nuts anyway - brute force seeds so that only a small % are valid. In this case you can “soft-fork” the standard by further restricting what is valid, and deprecating old seeds as quickly as possible.

slush0 commented at 4:32 pm on February 7, 2014: contributor

We at Trezor solve “bip39 compatibility” by displaying a red warning when the mnemonic has an invalid checksum (or words are not in the known wordlist). The user can agree to use that sentence as the seed, but is informed that something might be wrong, which is generally good practice.

I mean - this is purely a problem of dialogs in the application and not a problem of the algorithm itself, which was created as a consensus across many developer teams. As I said, this is already implemented in various software and people are already using it in real applications. For that reason it is impractical to re-write the draft.

There was also a long discussion (started more than six months ago), so I’m a bit puzzled that you raised such concerns right now.

gmaxwell commented at 4:52 pm on February 7, 2014: contributor

@slush0 At some point the design changed: originally an application could reject user-formed “brainwallet” strings; under the revised design this isn’t possible because the proposal now consists of a completely independent “brainwallet” scheme with no constraints at all on the string. Because of this I can’t see how you’re arguing this is “purely a problem of dialogs in the application”.

As it stands there is no advice to include such dialogs, much less any requirement to do so, any consideration in the design to make doing so easy, or even any acknowledgement that a concern and a solution exist. Effectively your response is telling me that to make the proposal safe an implementer must undertake a number of additional steps which are not even hinted at in the proposal. I hope you can see why I consider this concerning.

“We rushed ahead and deployed this in applications” is absolutely not a reason to endorse a BIP as accepted that has a flawed security model which will, as has been amply demonstrated by the catastrophe of “brain wallets”, put users’ funds at risk.

If the proposal was revised (reverted!) to include a specification for decoding and checking the strings and mandating that conforming implementations perform this check and at least warn the user as you describe, I would still consider this a mildly unfortunate conclusion (as we already know that users’ risk assessment on brainwallets is usually miscalibrated), but far better and no longer reprehensible. This could maintain compatibility with “plausible deniability” schemes by only applying the test to the (say) first 128 bits worth of input.

slush0 commented at 4:58 pm on February 7, 2014: contributor

@gmaxwell Yes, we changed things as some developers on bitcoin-dev mailing list raised concerns about hardcoded wordlist. All this has been widely discussed on the list so this is not any “surprise”. Actually the change was the way to find any consensus. Again, there is a checksum, it is simply not enforced; that’s the red box described above.

Yes, we rushed over the past year to find a consensus between developers. Obviously it is not possible to satisfy everybody, but as you see, the draft is implemented by independent teams in other software, so it’s not that I rushed anything. The fact that you didn’t follow the discussion on the topic is not my fault.

gmaxwell commented at 5:17 pm on February 7, 2014: contributor

@slush0 “that’s the red box described above.” is not part of the proposal. If it were part of the proposal I would have far less concern. Effectively you are admitting that this proposal is unsafe if implemented as specified.

I did follow the discussion on bitcoin-development— and I saw several people express concern that the proposed modifications were unsafe. Accordingly, I was surprised to see this pull request.

Can you explain to me the opposition to specifying the behavior which you admit above is necessary for safe usage?

slush0 commented at 5:22 pm on February 7, 2014: contributor

Can you explain to me your unwillingness to specify the behavior which you admit above is necessary for safe usage?

If it will be enough for you, then it is ok for me as well. This is your first constructive comment since the beginning of the discussion.

gmaxwell commented at 5:36 pm on February 7, 2014: contributor

I apologize for offending you, it wasn’t my intent.

If I thought I were raising a new concern I would have led with remedies. Solutions were proposed by Peter Todd and Adam Back and went completely without response from the proposing team, and concerns were expressed by others. (And Peter was clear that his opposition at the time was strong enough for him to NAK the proposal.)

Specifying a verification scheme, even if it’s bypassable in conforming implementations, would be acceptable to me. As would the condition check proposed in various forms at various times and places by Sipa, Peter Todd, and Adam Back (which has the benefit of dictionary independence but with some computational cost at generation time). I’ll go ask those other folks to comment too, though, because I don’t want to send you down a goose chase only to get to another roadblock.

prusnak commented at 5:37 pm on February 7, 2014: contributor

@gmaxwell Do you think adding the following paragraph to “From mnemonic to seed” section would solve the raised problem? (Feel free to extend and/or rephrase).

“Although using a mnemonic not generated by the algorithm described in the “Generating the mnemonic” section is possible, this is not advised, and software should compute the checksum of the mnemonic sentence using the wordlist and issue a warning if it is invalid.”
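A hedged sketch of what such a check-and-warn step might look like. It assumes the checksum layout of the published BIP39 text (one checksum bit per 32 entropy bits, taken from SHA-256 of the entropy); `accept_mnemonic` and its `wordlist` argument are hypothetical names, not part of any existing wallet API.

```python
import hashlib
import warnings
from typing import List, Sequence


def checksum_ok(indices: Sequence[int]) -> bool:
    """Verify the trailing checksum bits of a sequence of 11-bit word indices."""
    if len(indices) not in (12, 15, 18, 21, 24):
        return False
    total_bits = len(indices) * 11
    cs_bits = total_bits // 33          # 1 checksum bit per 32 entropy bits
    ent_bits = total_bits - cs_bits
    value = 0
    for idx in indices:
        value = (value << 11) | idx
    entropy = (value >> cs_bits).to_bytes(ent_bits // 8, "big")
    expected = hashlib.sha256(entropy).digest()[0] >> (8 - cs_bits)
    return (value & ((1 << cs_bits) - 1)) == expected


def accept_mnemonic(sentence: str, wordlist: List[str]) -> bool:
    """Return True if the sentence passes; otherwise warn and return False."""
    try:
        indices = [wordlist.index(word) for word in sentence.split()]
    except ValueError:
        warnings.warn("mnemonic contains words outside the known wordlist")
        return False
    if not checksum_ok(indices):
        warnings.warn("mnemonic checksum is invalid")
        return False
    return True
```

A `False` result is where an application would show its warning dialog; whether the user may then override it is a policy decision outside this sketch.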

luke-jr commented at 5:59 pm on February 7, 2014: member

I don’t think this is (correctly used) merely a “brainwallet”, since it is 1) computer-generated (entropy should be the same as any other HD wallet seed), and 2) represents a true wallet (not just an address). It’d definitely be better if software had a way to (and was required to) verify the input was computer-generated - is there a way to at least ensure this is possible in the future, even if not implemented in the initial releases?

Side idea, probably way too late unless people really like it: can the seed (and checksum) be encoded as a pattern, rather than the words themselves?

christophebiocca commented at 6:23 pm on February 7, 2014: none

@slush0 @prusnak

I think enforcing the checksum (at least with a warning, if not outright refusal to proceed) is exactly how we steer users away from using brainwallets. If they’re willing to calculate the checksum by hand and still not use a good random generator, we can’t help them.

I’ll also reiterate my previous suggestion to include the Trezor wordlist as a recommended default right in the BIP itself. While I understand you’re trying to accommodate the desire for other wordlists, there’s no harm in at least saying “use this one for maximum compatibility”. Hopefully most/all wallets will support it.

maaku commented at 6:42 pm on February 7, 2014: contributor

Such a low, non-user settable iteration count for PBKDF2 is extremely worrying. It should be set as high as possible without introducing unacceptable delay in the user interface. I expect that even on a hardware constrained device like the Trezor this would be a much larger value, maybe in the 10k range. Additionally, it is something which could be compactly encoded in the mnemonic itself (e.g. by storing the base-2 logarithm minus the minimum). This would future-proof the standard against future hardware advances.
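One way the "base-2 logarithm minus the minimum" encoding could look, as a sketch only. The floor of 2^11 = 2048 rounds and the 4-bit field width are hypothetical choices made for illustration; nothing in the proposal fixes them.

```python
MIN_LOG2 = 11    # hypothetical floor: 2**11 = 2048 rounds
FIELD_BITS = 4   # hypothetical field width, covering 2**11 .. 2**26


def encode_iterations(iterations: int) -> int:
    """Encode a power-of-two iteration count as a small integer field."""
    if iterations <= 0 or iterations & (iterations - 1) != 0:
        raise ValueError("iteration count must be a power of two")
    code = (iterations.bit_length() - 1) - MIN_LOG2
    if not 0 <= code < (1 << FIELD_BITS):
        raise ValueError("iteration count outside the encodable range")
    return code


def decode_iterations(code: int) -> int:
    """Recover the iteration count from the encoded field."""
    if not 0 <= code < (1 << FIELD_BITS):
        raise ValueError("code outside the field range")
    return 1 << (code + MIN_LOG2)
```

With these assumed parameters, four extra bits in the mnemonic would let the round count grow by a factor of 2^15 without changing the format.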

The lack of details on checksum validation is a non-starter. If I were creating or curating a wallet app, I would not allow creation/exports of private keys using this BIP. Users must have a simple, offline mechanism for checking the validity of a remembered private key, especially if this is being recommended for brainwallet use.

It is trivially easy to create two UTF-8 strings which are semantically identical but hash differently. With Trezor you control the input stack, but other wallet developers are left up to the whims of the operating system / user software. This will become a problem for users whose localized wordlists involve compatibility characters: is é U+00E9 or U+0065 U+0301? Both are valid Unicode, both hash differently, and both look identical to the user. There is a solution to this problem almost as old as Unicode: the normalization forms of Unicode Standard Annex 15 (http://unicode.org/reports/tr15/). Why are you not first converting to a normalized form before hashing?
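The é example can be reproduced in a few lines. SHA-256 is used here only as a stand-in for whatever hash or KDF consumes the string; the point is that the two encodings differ at the byte level until they are normalized.

```python
import hashlib
import unicodedata

composed = "\u00e9"        # é as a single code point
decomposed = "e\u0301"     # e followed by a combining acute accent


def digest(text: str) -> str:
    """Hash the UTF-8 bytes of a string (SHA-256 as a stand-in)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


raw_match = digest(composed) == digest(decomposed)
normalized_match = (
    digest(unicodedata.normalize("NFKD", composed))
    == digest(unicodedata.normalize("NFKD", decomposed))
)
print(raw_match, normalized_match)
```

Without normalization the two inputs produce different digests; after normalizing both to the same form they agree.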

christophebiocca commented at 7:15 pm on February 7, 2014: none

Such a low, non-user settable iteration count for PBKDF2 is extremely worrying. It should be set as high as possible without introducing unacceptable delay in the user interface.

If the entropy is bad, you’ve already lost. These aren’t password hashes, they’re meant to be a way to crunch a high-entropy string back into a fixed size seed. Honestly, it might as well be a single SHA256 call, to make people understand that it’s not secure when used with user-chosen strings.

Users must have a simple, offline mechanism for checking the validity of a remembered private key

They do, as long as the implementation they use to check it supports the same word list. The one advantage of this over the original bidirectional mapping technique is that once I’ve written down my mnemonic, and I’ve verified that it is correct (using the checksum), I can import that passphrase into any future wallet that supports BIP39, even as they migrate to other wordlists or longer strings.

This will become a problem for users whose localized wordlists involve compatibility characters: is é U+00E9 or U+0065 U+0301

Assuming a wallet developer does use a custom word list with unicode characters, can’t the list itself define the canonical representation of the word? It won’t save you when the importing wallet doesn’t support the word list, of course, but that’s a given.

It seems like everyone is fixating on the fact that this BIP allows for unsafe implementations to still technically be compliant. Yet when the stricter original proposal was written, it was impossible to get wallet developers to agree to implement it.

My concern is not whether wallets with terrible security practices can call themselves “bip39 compliant” or not. All I want to make sure of is that someone writing a new wallet, reading the proposal, can write a secure implementation without having to read between the lines.

prusnak commented at 7:20 pm on February 7, 2014: contributor

@maaku “Such a low, non-user settable iteration count for PBKDF2 is extremely worrying.”

Wrong. Read my comment above: #17 (comment)

@maaku “I expect that even on a hardware constrained device like the Trezor this would be a much larger value, maybe in the 10k range.”

Wrong again. 1k takes around 1 second on Trezor. 4k takes ~4 seconds.

gmaxwell commented at 7:26 pm on February 7, 2014: contributor

::sigh:: the slow KDF is the reason for delegatable designs, but I’m certainly not about to argue that you should have that kind of major redesign. I should have called out delegation very early… but when there was ~no risk of brainwallet use there wasn’t really a need for a KDF at all except for standard good practice and to make adhoc manual threshold encryption by splitting up the key more secure.

The constraints on trezor performance make it infeasible to make any user produced entropy reasonably secure, so I suspect the focus should be on the minor improvements necessary to make that kind of misuse unlikely.

As far as the language goes, I think that’s probably the right direction but inadequate detail to actually achieve the good behavior in practice. I feel bad that I’ve left you feeling like the goalposts are being moved out from under you, and so I’m hoping that some actual (fast) resolution can come from the thoughts being expressed here, and I’m consciously trying to avoid adding more yea/nay while other people are still chiming in.

slush0 commented at 7:31 pm on February 7, 2014: contributor

Regarding Unicode normalization: which of these four forms is the most common? https://en.wikipedia.org/wiki/Unicode_equivalence We can mention a specific normalization form in the BIP, making these conversions clear.

maaku commented at 7:35 pm on February 7, 2014: contributor

@christophebiocca:

If the entropy is bad, you’ve already lost. These aren’t password hashes, they’re meant to be a way to crunch a high-entropy string back into a fixed size seed. Honestly, it might as well be a single SHA256 call, to make people understand that it’s not secure when used with user-chosen strings.

Then find a way to make the checksum mandatory and fail strings which do not pass. If you don’t do that, then human nature being what it is people will use it to construct their own unsafe private keys. And they will have their funds stolen.

They do, as long as the implementation they use to check it supports the same word list.

The word list is not a part of this standard. What you’re telling me is that to perform the verification I need to use something implementation-defined. That’s very much defeating the point of having a standard, and worrying. I’d much rather there was a universal, wordlist-independent way to check validity.

It seems like everyone is fixating on the fact that this BIP allows for unsafe implementations to still technically be compliant.

Yes, yes we are. Because it is our responsibility to make sure that standards ensure safety.

The Unicode problem is trivially fixable: mandate that strings are converted into a normalized form before further processing. NFKD is probably the form you’d want, but I’d recommend reading the entire annex and making a determination yourself.

This is a one-line fix to the standard, and doesn’t affect your Trezor implementation at all, assuming you’re using an English (ASCII) word list.

maaku commented at 7:39 pm on February 7, 2014: contributor

Regarding Unicode normalization: which of these four forms is the most common? https://en.wikipedia.org/wiki/Unicode_equivalence We can mention a specific normalization form in the BIP, making these conversions clear.

Normalized form C is the most common, as that is what is mandated for Internet traffic (I’m not sure if there’s an actual RFC to this point, or if it is just convention, but it’s what everyone uses).

However for this particular application you probably want one of the compatibility forms, since those are many-to-one mappings that further reduce the possibility of bad Unicode input resulting in the wrong key being imported. I think NFKD would be the safest choice, but it’d be nice if someone else read the annex and made an independent determination.
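The difference between a canonical form and a compatibility form shows up with compatibility characters such as ligatures. A small sketch, with `normalize_mnemonic` as an illustrative helper name rather than anything defined by the proposal:

```python
import unicodedata


def normalize_mnemonic(sentence: str) -> bytes:
    """Normalize to NFKD and encode as UTF-8 before any hashing."""
    return unicodedata.normalize("NFKD", sentence).encode("utf-8")


ligature = "\ufb01sh"   # "fish" typed with the single-character fi ligature U+FB01
plain = "fish"

nfc_same = unicodedata.normalize("NFC", ligature) == unicodedata.normalize("NFC", plain)
nfkd_same = normalize_mnemonic(ligature) == normalize_mnemonic(plain)
print(nfc_same, nfkd_same)
```

NFC keeps the ligature as a distinct character, so the two spellings stay different; NFKD folds it into plain letters, which is the many-to-one behavior described above.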

Added bip39 english wordlist ce1862ac6b

Removed reference to brainwallet 6a3bb51e3f

Usage of UTF-8 NFKD 041f51c2ff

slush0 commented at 8:09 pm on February 7, 2014: contributor

Ok, after digging into normalization forms, NFKD really seems to be the best choice, so I specified it in the paper.

I also removed the reference to “brainwallet” (which was clearly a misunderstanding, because we didn’t understand that term as a sentence necessarily generated by a human) and added an explanation that this proposal aims to encode computer-generated entropy to human-readable data and back, not to serve brainwallet purposes.

christophebiocca commented at 8:13 pm on February 7, 2014: none

@slush0 This addresses my concerns with the standard. Someone reading and implementing bip39 can now write an entirely secure and compatible implementation without having to dig around for auxiliary information.

Thanks.

gary-rowe commented at 9:03 pm on February 7, 2014: none

@slush0 Would you mind raising an issue in Bitcoinj to cover the normalization requirement? I think that Ken Sedgwick should know about the normalization requirement, if he isn’t tracking this repo.

This link will probably help him: http://docs.oracle.com/javase/tutorial/i18n/text/normalizerapi.html

maaku commented at 9:18 pm on February 7, 2014: contributor

This is much better, but please consider adopting something like Peter/Adam’s checksum grinding proposal, or separating the checksum from the mnemonic. I would rather there was no KDF and the checksum enforced than standardization on an English wordlist.

slush0 commented at 9:42 pm on February 7, 2014: contributor

We’ve been considering proposals other than a KDF; however, we’ve found that using PBKDF2 is a more straightforward and standardized choice that provides hardening and plausible deniability in one step. We also tried to keep the transformation procedure as simple as possible, which generally increases the chance of bip39 being implemented in various clients.

mikehearn commented at 10:13 pm on February 7, 2014: contributor

I think this thread raises some important questions about the BIP process, but it’s probably a discussion for another time/place.

ksedgwic commented at 10:42 pm on February 7, 2014: none

NFKD looks fine to me.

sipa commented at 2:27 pm on February 8, 2014: member

Given that BIP39 apparently changed from a method of encoding entropy directly, into a KDF + checksum sort of mechanism, I would really like to bring up this earlier proposal of mine again: https://bitcointalk.org/index.php?topic=102349.0.

Advantages:

- It has a checksum built into the derivation mechanism, which cannot be bypassed.
- Only requires a word list in generator implementations, not for derivation or checking (so doesn’t need to be part of the standard).
- Supports variable iteration count (in exponential steps), without needing extra bits to encode it.
- Generating (and attacking) is harder than checking/deriving - you only need to do it once anyway.

To support low-power hardware such as the Trezor, it would need to support difficulties down to 2^12, I presume, but that’s fine. At least we can end up with a standard that in 5 years isn’t limited by constraints of low-power hardware that existed at some point in the past.

Here is a proposal:

A mnemonic is valid (in combination with a passphrase) if either:
- After 64 iterations of HMAC-SHA512, the first 6 bits are zero. The seed in that case is the 63rd iteration’s output.
- After 128 iterations, the first 7 bits are zero. The seed in that case is the 127th iteration’s output.
- …
- After 65536 iterations, the first 16 bits are zero. The seed in that case is the 65535th iteration’s output.
For generation, don’t add a checksum directly to the entropy. Instead, iterate several times until entropy is found for which the mnemonic, after 64 steps of HMAC, results in an output which has 6 zero bits (or equivalent for the requested difficulty).
- This on average requires 4096 HMAC-SHA512 steps to generate a mnemonic, but only needs 64 to verify/derive.
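The proposal above can be sketched as follows. The text does not say how the HMAC iterations are chained or keyed, so this sketch assumes the passphrase is the HMAC key and each step hashes the previous output, starting from the mnemonic bytes. A hex string stands in for the mnemonic so that no wordlist is needed. All names are illustrative.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

MIN_LOG2, MAX_LOG2 = 6, 16   # 64 .. 65536 iterations, as listed above


def _step(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha512).digest()


def _zero_prefix(output: bytes, bits: int) -> bool:
    """True if the first `bits` bits of a 512-bit output are zero."""
    return int.from_bytes(output, "big") >> (512 - bits) == 0


def derive(mnemonic: str, passphrase: str = "",
           max_log2: int = MAX_LOG2) -> Optional[Tuple[int, bytes]]:
    """Return (difficulty, seed) if the mnemonic is valid, else None."""
    key = passphrase.encode("utf-8")
    prev, cur = b"", mnemonic.encode("utf-8")
    for n in range(1, (1 << max_log2) + 1):
        prev, cur = cur, _step(key, cur)
        k = n.bit_length() - 1
        # At n = 2**k iterations the output needs k leading zero bits;
        # the seed is then the output of iteration n - 1.
        if k >= MIN_LOG2 and n == (1 << k) and _zero_prefix(cur, k):
            return k, prev
    return None


def generate(passphrase: str = "") -> str:
    """Grind random entropy until it is valid at the lowest difficulty."""
    while True:
        candidate = os.urandom(16).hex()   # stand-in for a word encoding
        if derive(candidate, passphrase, max_log2=MIN_LOG2) is not None:
            return candidate
```

Generation at the lowest difficulty needs about 64 candidates of 64 steps each, matching the 4096-step average mentioned above, while checking a valid mnemonic takes only 64 steps.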

Thoughts?

prusnak commented at 2:38 pm on February 8, 2014: contributor

@sipa How do you achieve plausible deniability with your scheme?

sipa commented at 3:01 pm on February 8, 2014: member

@prusnak That’s indeed a concern I missed.

I’m not sure, however, whether the current proposal does provide that either (when used for Bitcoin wallets). Every passphrase/mnemonic combination is valid, but you can easily match the generated seed’s wallet addresses against the set of unspent outputs to see if it holds some coins.

Admittedly, the current proposal does make it harder to attack the passphrase (assuming the mnemonic is already known), as it requires EC operations and a UTXO set match to verify. My proposal can be modified to provide the same, by only incorporating the passphrase in an additional final HMAC-SHA512 step. That would remove the ‘salt’ functionality of the passphrase though, which I’m not sure I like less.

maaku commented at 4:04 am on February 9, 2014: contributor

Can someone give a quick summary of what plausible deniability means in this context, and why is it important?

gmaxwell commented at 4:18 am on February 9, 2014: contributor

Here is how I understand it: Writing down your key is the only sane thing to do— memory is far more fallible than we credit it for, and keys strong enough to even consider using are fairly long even with friendly encodings. This is all well and good…

But what happens when your evil maid finds the key? That would be bad. So you encrypt it or leave out a word from it or something along those lines. But what if your evil maid holds you at gunpoint? In this case it would be helpful if an incorrect decryption were still a valid wallet, you could have even sent some funds to it, allowing you to plausibly deny having even more funds elsewhere.

In reality this threat model is a really fringe case, more the sort of thing that’s fun to think about in mental games of spy vs spy than something of practical importance. Transaction graph analysis will almost certainly break it: Maid says “I know you bought this pack of gum using key XYZ. Key XYZ is not one of the keys generated by this code you gave me, talk or you get the feather duster again!” Basically your attacker has to be completely ignorant of all transactions made with the denied key, and that’s quite easy to break. Plus, if the scheme doesn’t have a slow KDF your attacker can brute force out a pretty reasonable space of probable transformations; this is the sort of thing that could just be a standard feature of a wallet recovery tool: let it run a few seconds or minutes to try out a million likely related keys… but if you can get the deniability property near costlessly, why not? Though I don’t think the proposal here is getting it costlessly at all right now.

petertodd commented at 8:14 am on February 9, 2014: contributor

There’s no reason to leave out a checksum for the sake of plausible deniability - in your evil maid example you can just as easily generate an n-word-long seed, then extend it by one more word, brute forcing both to be valid. Only write down the first n words and use the first as a wallet for some small amount of funds and hope the attacker accepts it as is.

A second case where plausible deniability is useful is if you get raided by the FBI who is looking for evidence that you’ve been hiding assets from the IRS. They find a scrap of paper with 10 random words on it, and it turns out those words can access a $100,000 Bitcoin wallet. At the trial, you claim the money isn’t yours, they’ve got the wrong man, and it’s just dumb luck that the words matched a wallet with funds in them. Of course, they’re not going to believe that once the prosecution brings in an expert witness who points out that the probability of 10 random looking words happening to match what someone else’s securely generated wallet used was less likely than the judge being simultaneously attacked by a gorilla and tiger escaped from the local zoo.

For plausible deniability to work, it needs to actually be plausible. You achieve that by making the starting space where you get your seeds from small and unmeasurable. Pass phrases from books or human-generated sentences meet this criterion precisely because the entropy is small, and you can make such poor entropy safe enough by using extremely costly joule-hard KDFs.

Following that approach our suspected man would have been smartest to pick a sentence from his diary, or maybe the awful poetry he wrote before he realized that a degree in post-modern literature wasn’t going to pay the bills. He then runs that sentence through a KDF with a 360kJ computation cost. (that’s kilo-joules, a 100W CPU running for 1 hour) The last part of the KDF process is to brute-force the checksum to produce a BIP39-compatible seed; a small ~256x increase in difficulty. Now when the FBI raids his home they take every single written thing they find and brute force it. Let’s say that leads to ten million different passphrases they have to try brute-forcing: 10*10^6 passphrases * 360kJ/passphrase = 10GWh or $1,000,000 worth of electricity at $0.1/kWh.
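The energy estimate can be checked with a few lines of arithmetic, using only the figures stated above (ten million passphrases, 360 kJ each, $0.10 per kWh). Worked through, 360 kJ per passphrase comes to roughly 1 GWh and $100,000 in total; the quoted 10 GWh and $1,000,000 would correspond to a per-passphrase cost of 3.6 MJ, i.e. a 100 W CPU running for ten hours.

```python
JOULES_PER_KWH = 3.6e6


def brute_force_cost(passphrases: int, joules_each: float, price_per_kwh: float):
    """Return (total energy in GWh, total electricity cost in dollars)."""
    total_kwh = passphrases * joules_each / JOULES_PER_KWH
    return total_kwh / 1e6, total_kwh * price_per_kwh


gwh, dollars = brute_force_cost(10 * 10**6, 360e3, 0.10)        # figures as stated
gwh_10h, dollars_10h = brute_force_cost(10 * 10**6, 3.6e6, 0.10)  # 100 W for 10 h
print(gwh, dollars, gwh_10h, dollars_10h)
```

Either way the argument is unchanged in kind: the cost scales linearly with both the number of candidate phrases and the per-phrase KDF cost.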

First of all, they might just give up, great! But let’s say they don’t give up: his lawyer can successfully argue that as everyone knows brainwallets are insecure. The search space of human generated phrases is still small with a high probability of collision, especially when you’re willing to spend a million bucks brute forcing every possible combination; it’s quite plausible the funds are not owned by his client. Equally, no-one knows how likely such collisions are as it’s rare for anyone to spend a million bucks to find out, so it is reasonable to doubt they are uncommon. Because of this, it could very well be theft from an unrelated third party if the FBI took the money, and thus they should leave it. The last argument fails - governments aren’t going to turn down free cash - but the jury is bewildered enough to acquit.

Now, back to BIP39… What the above shows is that plausible deniability in the real world has nothing to do with checksums, it is compatible with BIP39 w/ checksums using multiple different ways, and finally as argued before not including checksums in BIP39 is downright irresponsible and will get people hurt due to funds getting lost via typos. From a technical point of view, you can see an 8-bit grinded checksum as a lowest common denominator really; basically any KDF scheme - @sipa’s sophisticated grinding scheme included - can be extended to output a seed meeting that criterion using an algorithm where the final step is to simply grind the output further with a counter until the seed matches the checksum. The last decision is how many bits to use; the grinding only needs to happen on seed generation, so again, I ask the Trezor wallet people how long does grinding an 8-bit checksum take? I think for the one-time-use of seed generation, a few seconds is perfectly reasonable, so if we can do more than 8 bits, all the better.
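A minimal sketch of the "grind with a counter" final step. The validity rule used here (the top 8 bits of SHA-256 of the seed are zero) and the way the counter is mixed in (HMAC-SHA512 keyed with the KDF output) are both hypothetical; the comment above does not pin down either.

```python
import hashlib
import hmac
from typing import Tuple


def passes_checksum(seed: bytes, bits: int = 8) -> bool:
    """Hypothetical rule: the top `bits` bits of SHA-256(seed) are all zero."""
    digest = hashlib.sha256(seed).digest()
    return int.from_bytes(digest, "big") >> (256 - bits) == 0


def grind(kdf_output: bytes, bits: int = 8) -> Tuple[bytes, int]:
    """Mix in a counter until the candidate seed satisfies the checksum rule."""
    counter = 0
    while True:
        candidate = hmac.new(
            kdf_output, counter.to_bytes(4, "big"), hashlib.sha512
        ).digest()
        if passes_checksum(candidate, bits):
            return candidate, counter
        counter += 1


seed, tries = grind(b"output of any upstream KDF")
```

With 8 bits the loop needs about 256 attempts on average, which is the ~256x generation-time cost mentioned earlier; a mistyped seed then fails the check with probability 255/256.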

Anyway, if you guys aren’t willing to create a standard that’s safe to use we shouldn’t be willing to put our stamp of approval on it by accepting it. Like it or not, accepting a BIP denotes approval of its contents by the people who control acceptance of pull requests to this repo. That acceptance will be looked upon by users as evidence the standard works, represents best practices, and is safe to use. It is immoral for us to put our stamp of approval on something we know is dangerous; whether or not others were foolish enough to implement a bad standard doesn’t change that.

prusnak commented at 10:49 am on February 9, 2014: contributor

Then I guess we are OK with accepting the changes except switching the status from Draft to Accepted. We can live with that (similarly to BIP38 which is a Draft, but still widely used without any public discussion).

petertodd commented at 1:26 pm on February 9, 2014: contributor

BIP38 has a checksum; as far as I know it has no serious risks associated with it.

If BIP39 is not fixed it needs a highly visible warning at the top of the standard that the standard is dangerous, should not be used, is only included to document existing implementations of it, along with a description of why.

Given fixing the standard is trivial, an approach would be to create a followup BIP42 standard with the fix. Once some wallet software is upgraded, do a user-education campaign. Simultaneously releasing a tool that does brainwallets better would be a good harm reduction measure; PR-wise it’d be useful to get an official sign-off from a well known cryptographer like Adam Back on the design: “Adam Back agrees: LobotomyKey is the safest way to do something as stupid as using a brainwallet!” If you’re feeling evil, plant some fake reddit sob-stories about people losing their funds because of mis-typed BIP39 mnemonics, with or without associated brainwallet usage.

Or you could just fix things up front and save us all a lot of trouble.

petertodd commented at 1:32 pm on February 9, 2014: contributor

Incidentally, test vectors belong in the BIP document itself so that the git commit hash of the bip repo include them. Feel free to make a bip-0039 directory with them, as was done with Gavin’s BIP-16 test results, bip-0016/qa.mediawiki, and just stick the .json file in there.

prusnak commented at 4:41 pm on February 9, 2014: contributor

Dealing with private keys directly is a risk in itself, but that’s not the topic of this discussion. I mentioned BIP38 merely as an example of a BIP that is not accepted and is still widely used.

petertodd commented at 4:47 pm on February 9, 2014: contributor

@prusnak You misunderstand what the risk is; it’s not dealing with private keys, it’s dealing with them in a way that allows for human error like typos to go undetected.

christophebiocca commented at 5:03 pm on February 9, 2014: none

not including checksums in BIP39 is downright irresponsible and will get people hurt due to funds getting lost via typos

The (wordlist independent) checksums you mention only constitute an integrity check, they have no error correction in them (unless I completely misunderstand them). I’m comparing the various scenarios where people make typos. I have a hard time coming up with a situation where BIP39 as written actually leads to lost funds, while a checksum like you propose doesn’t.

If I make a typo re-entering my mnemonic back from a piece of paper:

BIP39 (no shared wordlist): The mistake is only detected once the wallet is opened and found to be empty. The user can retry, and eventually get it right.
Independent checksum: The mistake is detected as soon as the user finishes typing.
BIP39 (with shared wordlist): The mistake is detected as soon as the user finishes typing. A clever wallet can try to autocorrect common typos.

If I make a mistake writing my mnemonic to a piece of paper (and I don’t check it), then try to restore later:

BIP39 (no shared wordlist): The mistake is only detected once the wallet is opened and found to be empty. The user will now have to try and figure out what transcription error they originally made.
Independent checksum: The mistake is detected as soon as the user finishes typing. But there’s no indication of where the error is. The user will now have to try and figure out what transcription error they originally made.
BIP39 (with shared wordlist): The mistake is detected as soon as the user finishes typing. A clever wallet can try to autocorrect common typos. Some of the time that will fix it, other times that will not be sufficient and the user will have to figure things by themselves.

In both these scenarios, the checksum doesn’t do anything more than speed up error detection. The wordlist approach can correct most errors (as well as check the checksum).
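To make the "clever wallet can try to autocorrect common typos" idea concrete, here is a small sketch. It assumes the wallet has the shared wordlist available; `WORDLIST` is a hypothetical six-word excerpt, `suggest_corrections` is an illustrative name, and fuzzy matching via `difflib` is just one possible approach, not something the draft prescribes.

```python
import difflib

# Hypothetical excerpt; a real wallet would load the full shared wordlist.
WORDLIST = ["abandon", "ability", "able", "about", "above", "absent"]


def suggest_corrections(mnemonic, wordlist=WORDLIST):
    """Map the position of each unknown word to close matches from the wordlist."""
    suggestions = {}
    for position, word in enumerate(mnemonic.split()):
        if word not in wordlist:
            # Closest candidates first; an empty list means nothing similar enough.
            suggestions[position] = difflib.get_close_matches(word, wordlist, n=3, cutoff=0.6)
    return suggestions


result = suggest_corrections("abandan able abuot")
```

Note that this only locates words that are not in the list. A typo that turns one valid word into another valid word would still have to be caught by the checksum.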

The only scenario I can think of where your approach has a noticeable advantage is:

I generate a mnemonic using one tool
Write it down (correctly)
Put no funds in the associated wallet (in fact I delete all traces from my machine)
Import that mnemonic (incorrectly) into another tool (that doesn’t support the same wordlist, otherwise it’d catch the typos)
Move money into that wallet.

Now my paper backup is correct, but my funds are in a mnemonic that the original wallet wouldn’t recognize as valid. It will be extremely hard to find out what my error was.

But that’s an utterly contrived example, do you have a better one? Can anyone else come up with one?

slush0 commented at 5:08 pm on February 9, 2014: contributor

@petertodd This discussion is going in the wrong direction. However, @prusnak’s point is that although private keys are safe, and even have a checksum, there are still many horror stories of people losing hundreds or thousands of bitcoins because they didn’t understand the concept of change addresses.

So if you want to think for users, please first ban Bitcoin-qt because of its need for periodic backups, and ban all bitcoin clients that allow direct handling of private keys, because all of these expose common users to a significant and real risk of losing money.

BIP39 aims to actually reduce the risk of losing money, because it introduces all that fancy stuff like checksums, plus it completely removes the risk of direct handling of private keys. If you even try to be fair in judging the proposal, you cannot overlook this.

I’m going to change “Accepted” to “Draft” in the pull request. Please accept the pull request then.

BIP39: Accepted -> Draft 6da6c40218

petertodd commented at 7:51 pm on February 9, 2014: contributor

@prusnak If you’re going to have a checksum, enforce it reliably. Since you’re unwilling to agree on a wordlist and standardize it, that won’t happen. Thus I proposed a simple solution that lets you avoid wordlists, and is still quite safe under all circumstances.

@slush0 A private key is a private key. The bad guy getting their hands on a BIP39 mnemonic+password results in you losing your funds all the same, especially considering that the passwords either have to be short enough to be memorable, and thus trivially brute-forceable, or long enough that losing them is in itself a risk. However, by not including a reliably enforced checksum - a trivial technical measure - you are risking users’ funds and have created a dangerous system for no good reason.

slush0 commented at 9:45 pm on February 9, 2014: contributor

@petertodd ad “you are risking user funds”:

“software SHOULD compute checksum of the mnemonic sentence using wordlist and issue a warning if it is invalid.”

-> What exactly is unclear in this sentence? Where can a properly coded client with BIP39 support lose a user’s money?

@petertodd Nobody is talking about the security of private keys! Our point is that the “current standards” implemented in all bitcoin clients are much more dangerous for common users, because they let the user handle private keys, do the transaction and silently move the change output to another private key! That’s something absolutely unexpected and a total UX fail. Please focus on these real problems of the existing infrastructure and care less about theoretical issues where the real loss is negligible, because the potential use case which can lead to losing a user’s funds is detectable (as is recommended in the paper).

For me this discussion is nothing more than obstructing the reached consensus of other developers just because of non-realistic scenarios which won’t happen in real life.

slush0 commented at 9:50 pm on February 9, 2014: contributor

By the way, we changed many things in this pull request to reach some consensus with you. The current changes in this pull request are:

Slightly changed wording (proofread by a native speaker); that’s why the pull request is so big, but there are no real changes in the meaning of the paper.
Added UTF-8 NFKD, which is generally a step forward.
Added an English wordlist, as requested.
Removed the misleading “brainwallet” - this was just a misunderstanding of the word.

Please accept these changes, which actually improve the state of the draft, and let’s move the discussion from this pull request to the bitcoin-dev mailing list, which is the better place for it.

petertodd commented at 1:15 am on February 10, 2014: contributor

There’s a big difference between wallet software letting a user do something by manually getting a private key out via some expert-mode interface and designing a system that deliberately makes it easy, even necessary, to disable safeguards. You’re specifically handling UTF-8 so you can avoid having to specify a wordlist in stone, which inevitably means you will see wallet software disable the checking of checksums to allow seeds to be imported between wallets that do not support the same wordlists. The BIP39 standard as written has to have SHOULD instead of MUST because of that. Meanwhile the two actions you can take to fix this problem, specifying a wordlist in stone and removing the UTF8 handling, or using a grinded checksum, you refuse to do; the latter can even be done in a completely backwards compatible way.

So why is that? I note you still haven’t answered my question about whether or not a Trezor can grind a checksum.

Again, the problem is very simple: user generates mnemonic, user writes down mnemonic as a secure backup, user imports into same or different wallet software with typo. Since the checksum is optional and checksum checking must be able to either be “clicked through” or disabled for compatibility, we have an opportunity for the backup and wallet to diverge and funds to be lost. Keep in mind that there may be no transactions in the wallet to warn the user something is amiss, they may not realize the lack of transactions/funds represents a problem, or they may be importing into an offline system that can’t determine what transactions would be associated with the wallet anyway.

Also since I’m repeating myself, I’ll point out that equally the fact that mistyping a passphrase has the same effect - all passphrases are valid - will result in users losing their funds and in reality reflects a poorly done brainwallet-based design snuck in through the backdoor. (note how passphrases can’t be changed) Again, that is solid grounds for NACKing BIP39.

petertodd commented at 1:26 am on February 10, 2014: contributor

re: consensus, there seems to be consensus among developers I respect that it is rather worrisome that you’ve been able to come to consensus within your group of wallet developers that BIP39 is a good idea.

slush0 commented at 1:42 am on February 10, 2014: contributor

@petertodd Long story short, you think that using a standard application API for handling private keys, without ANY warning, is a completely different story from the user knowingly suppressing a red box telling him that he’s probably doing something wrong? OK, then we probably won’t find a consensus on this point.

“you refuse to do” - No, I’m not refusing to do it. Actually I would love to have a hardcoded wordlist and not allow any other method. Unfortunately other developers didn’t agree on this point, so we offered this to find an agreement with them.

Honestly, we need to go forward. We’ve spent almost a year (!!) on theoretical discussions and we’ve rewritten the proposal many times. Please keep in mind that finding a solution that everybody will be absolutely happy with is impossible. I’d like you to understand this.

Please accept the pull request, because your comments are off-topic to current pull request. Let’s then discuss deeper changes on mailing list. Again, current pull request simply fixes some minor things of the mechanism which is in the draft already, so there’s no reason to block changes in this way.

Please…

should -> must 13b7749520

slush0 commented at 1:47 am on February 10, 2014: contributor

… better?

sipa commented at 1:48 am on February 10, 2014: member

Let’s not make this into an us-vs-them discussion, and work towards a solution.

First, my own opinion:

I think the concern raised here is that you’re essentially defining two versions of the standard, and try to get the benefits of both.

A wordlistless version that just accepts any UTF-8 string as mnemonic, and produces a seed for each, but which cannot detect any typo.
A version with a specific well-defined wordlist, with a checksum. However, if you want to support mnemonics from software without the checksum, you cannot enforce it. This makes it directly usable as human-generated-brainwallet tool, which is something that many people have concerns about.

I think you have to choose one or the other. In practice, either every wallet will implement the dictionary (and we’d have been better off just enforcing it), or they will end up using different wordlists each and all software will just accept anything (and we just have human-generated brainwallets).

Regarding wallets’ (including Bitcoin-Qt’s) current key handling resulting in more problems: very likely. I’m completely with you that we should try to avoid handling private keys directly, avoid requiring constant backups, and avoid exposing users to individual addresses. That’s also why I’m working on BIP32 support in it. I’d also like to have some version of paper wallet / mnemonic in it, but only under the condition that it cannot (easily) support human-generated entropy. Ideally, it also has adaptive strengthening, so it doesn’t become weaker security-wise over time.

Is a checksum-grinding mechanism really unacceptable to you? It would lower many concerns.

As a meta-discussion: it is very unclear to me who should be in charge of allowing modifications to the bips repository. It’s certainly not the core developers (so please don’t see my opinion here as review, it’s just my opinion), I’d say the same person who assigns BIP ids + original author (or champion who took over)?

In any case, I don’t see why this change cannot be merged, it’s certainly an improvement and it represents the current state of the proposal. Whether it is a good proposal, with community consensus, is a different question.

luke-jr commented at 1:53 am on February 10, 2014: member

Frankly, the BIP docs are just a reflection of reality. If Trezor deployed BIP 39 as-is, and other software also adopted it, it would be Accepted whether this were merged or not, and to merge it would only poorly reflect on the repository.

So, while I think it’s good to discuss how to improve the BIP before people start using it, it is the very fact of adoption which has control over it being Accepted.

slush0 commented at 1:59 am on February 10, 2014: contributor

and work towards a solution.

@sipa That’s why I’m not going to re-write the proposal again and again. This is already the third or fourth version of the algorithm and we’re working toward simplifying the proposal and finding wider and wider agreement. Rewriting it from scratch every time somebody else tells us that “it’s wrong because I think so” won’t help us to move forward.

Is a checksum-grinding mechanism really unacceptable to you? It would lower many concerns.

Actually the main reason for refusing these proposals is that they’re much more complicated to implement correctly, for (in our opinion) not very good reasons, because I don’t think that a correctly implemented BIP39 would introduce any real risk.

Any non-standard or “mining” algorithm raises resistance to implementing it in other software. I was always trying to keep things simple and not over-engineer things. With all respect to other proposals, I think the KISS principle works very well in the real world.

maaku commented at 2:21 am on February 10, 2014: contributor

Checksum grinding would be relatively straightforward to implement. If the checksum is a single byte, then you only have to grind a hundred or two variations to generate a solution. This is a simple loop with no fancy algorithms.
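As a rough check on how much work a one-byte checksum takes, the sketch below counts grinding attempts over many starting seeds. It assumes the same rule as the snippet later in this thread (first byte of SHA-256 of the seed is zero) and that SHA-256 output behaves uniformly, in which case each attempt succeeds with probability 1/256. The function name `tries_until_checksum` and the way the test seeds are derived are illustrative choices only.

```python
import hashlib
import statistics


def tries_until_checksum(seed: bytes) -> int:
    # Treat the first 4 bytes as a big-endian counter and bump it until the
    # first byte of SHA-256(seed) is zero; return how many hashes were needed.
    buf = bytearray(seed)
    tries = 1
    while hashlib.sha256(buf).digest()[0] != 0:
        n = (int.from_bytes(buf[0:4], "big") + 1) % 2**32
        buf[0:4] = n.to_bytes(4, "big")
        tries += 1
    return tries


# 2000 reproducible starting seeds (derived by hashing an index, purely for the demo).
samples = [
    tries_until_checksum(hashlib.sha256(i.to_bytes(4, "big")).digest())
    for i in range(2000)
]
mean_tries = statistics.mean(samples)
median_tries = statistics.median(samples)
```

Under the uniformity assumption the attempt count is geometrically distributed, so the mean should land near 256 and the median somewhat lower, which is in line with the "a hundred or two" estimate above.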

If performance is an issue, drop or weaken the KDF. You gain far, far more from a checksum (protecting against everyday concerns) than the KDF (protecting against unlikely/exceptional concerns).

petertodd commented at 2:32 am on February 10, 2014: contributor

@luke-jr These are Bitcoin Improvement Proposals. Just because a bunch of people are doing something doesn’t mean we have to say it’s a good idea, let alone an improvement. Fundamentally the people with commit access to this repo are putting their stamp of approval in some fashion on the contents of the repo.

@slush0 This is what grinding looks like FWIW when generating a seed from random data:

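import hashlib, os, struct
seed = bytearray(os.urandom(32))  # 32 random bytes (length assumed); mutable so it can be ground in place
# Bump the first 4 bytes as a big-endian counter until the first byte of SHA256(seed) is zero.
while hashlib.sha256(seed).digest()[0] != 0:
    seed[0:4] = struct.pack('>L', (struct.unpack('>L', seed[0:4])[0] + 1) & 0xFFFFFFFF)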

The passphrase case isn’t much different.

slush0 commented at 2:38 am on February 10, 2014: contributor

This is offtopic.

Who can merge this pull request? Back in the days of bitcoin.it I was able to change such minor things in the wiki by myself. Now it takes 3+ days and 40+ messages.

sipa commented at 2:44 am on February 10, 2014: member

Agree, all this is offtopic.

The discussion here should be about this change; the rest of the discussion is indeed better done on the mailing list (I apologize for continuing it here).

That said, I think there is sufficient uncertainty about the process of making changes to BIPs. As long as it’s an unfinished draft, there should be no reason why more than the author’s approval is required. When it is finished (even if not accepted), there’s probably more necessary.

laanwj commented at 7:31 am on February 10, 2014: member

Agreed, changes to draft should simply be merged if the author agrees. Though of course everyone is free to discuss. Community concerns only become blocking when the BIP needs to be accepted, and that’s discussion for the mailing list.

laanwj referenced this in commit 8ac51158bd on Feb 10, 2014

laanwj merged this on Feb 10, 2014

laanwj closed this on Feb 10, 2014

luke-jr referenced this in commit a90bd90c3c on May 19, 2016

luke-jr referenced this in commit 0dd4583db1 on Jun 6, 2017

bitcartel referenced this in commit 36f3a6cb98 on Oct 31, 2019

guggero referenced this in commit a33ab4cce2 on Jun 23, 2022

real-or-random referenced this in commit daacb3ec0f on Aug 10, 2022

bitcoin deleted a comment on Nov 18, 2023

kingcathy23 approved

heksani approved

Jesjimenez approved

rkhadem4 approved

bitcoin locked this on Oct 16, 2024

BIP39 changes #17