Request for BIP number for "Address Format for Witness Program"
BIP 142: Address Formats for Witness Program #267
jl2012 wants to merge 3 commits into bitcoin:master from jl2012:segwit-address changing 1 file +157 −0
jl2012 commented at 1:45 PM on December 24, 2015: contributor
-
Create bip-segwitaddress.mediawiki 09192275ac
- jl2012 renamed this:
Request for BIP number
BIP draft: Address Format for Witness Program
on Dec 24, 2015 -
gubatron commented at 8:10 PM on December 24, 2015: none
"Padding with 41 - 26 = 15 0x00:"
Wasn't it 43 bytes?
-
dabura667 commented at 10:39 PM on December 24, 2015: none
What is the rationale behind 41 bytes?
Why fixed length? At this point, the blob of letters is so large, and the data it contains no longer needs to be a hash, so you could easily trick a user who only looks at the first few characters into thinking they're sending to the right segwit address.
Having the B or T at the beginning gives a false sense of security, and telling the user "compare the LAST few characters!" won't help in most situations, because the sheer length of the address means the end will be cut off in most input boxes where it might be shown.
I propose:
- Do away with fixed length
- If you must have a human readable address for this, use p2sh and change the version byte from 5 to 7 or whatever.
-
dabura667 commented at 10:41 PM on December 24, 2015: none
Unorthodox, but perhaps the checksum should come after the version byte so people can verify the address by eyeballing it.
-
sipa commented at 10:42 PM on December 24, 2015: member
There is a reason for having a new address format: p2sh only has 80-bit collision resistance.
-
dabura667 commented at 11:01 PM on December 24, 2015: none
There is a reason for having a new address format: p2sh only has 80-bit collision resistance.
@sipa How was 80-bit calculated?
@gubatron If this is true, the BIP should state 80-bit, and avoid "less" and other vague wordage.
@sipa What about the length attack problem I described? I would say a large amount of people just look at the first 5-6 characters and compare, but if the real address was: <PUBKEY> OP_CHECKSIG but the attacker shows them <PUBKEY> OP_DROP OP_TRUE they could cause a user to lose funds, and maybe steal them. To the user, the first 30 letters are the same, so 99% of users will be satisfied.
If collision is what you're concerned about, why not soft fork in SHA512 with the initial implementation, or use HASH256 instead?
-
sipa commented at 11:04 PM on December 24, 2015: member
Witness v1 uses 256-bit hashes for redeemscripts (which have 128-bit collision resistance).
I don't disagree with getting rid of addresses entirely, and encouraging payment protocol etc. instead, but if P2SH supports addresses and Witness scripts don't, we're potentially encouraging worse security for systems that are somehow stuck relying on addresses.
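The two figures quoted in this exchange follow from the birthday bound: for an ideal n-bit hash, finding any two inputs that collide takes on the order of 2^(n/2) work. A rough sketch, assuming the 160-bit HASH160 that P2SH commits to and the 256-bit hash mentioned above (the function name is illustrative, not from the BIP):

```python
def collision_bits(hash_bits: int) -> int:
    """Approximate collision resistance, in bits, of an ideal hash (birthday bound)."""
    return hash_bits // 2

# P2SH commits to a 160-bit hash (RIPEMD-160 of SHA-256) of the redeem script.
print(collision_bits(160))  # 80

# A 256-bit script hash, as described for witness programs in the comment above.
print(collision_bits(256))  # 128
```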
-
dabura667 commented at 11:13 PM on December 24, 2015: none
I don't disagree with getting rid of addresses entirely
I am not proposing such a thing. I merely think that having the hash (of the pubkey or script, traditionally) as close to the version byte as possible means that more brute forcing must be done to try and spoof (vanitygen style brute force the first x characters) the first few letters.
Which is why I would suggest maybe: (unorthodox, but it would mean the smallest change in the witness program would change the beginning wildly.)
Payload:
151A0076A914010966776006953D5567439E5E39F86A0D273BEE88AC000000000000000000000000000000
First 4 bytes of checksum:
1B4A8136
Add the 4 checksum bytes after version and before witness program. This is the 47-byte binary witness program address:
151B4A81361A0076A914010966776006953D5567439E5E39F86A0D273BEE88AC000000000000000000000000000000
We could even make the checksum larger to allow for greater collision resistance. 8 bytes etc.
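A minimal sketch of the "checksum right after the version byte" layout, using the 43-byte payload quoted above. It assumes a Base58Check-style checksum (first 4 bytes of double SHA-256 over the payload). The thread does not say how the quoted checksum was computed, so this does not try to reproduce that value. All function names are hypothetical.

```python
import hashlib

def checksum4(data: bytes) -> bytes:
    """First 4 bytes of double SHA-256 (ASSUMPTION: Base58Check-style checksum)."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]

def checksum_first(payload: bytes) -> bytes:
    """Lay out: version byte, 4-byte checksum, then the rest of the payload."""
    version, rest = payload[:1], payload[1:]
    return version + checksum4(payload) + rest

def verify_checksum_first(blob: bytes) -> bool:
    """Rebuild the payload from the blob and compare checksums."""
    version, check, rest = blob[:1], blob[1:5], blob[5:]
    return checksum4(version + rest) == check

# Payload from the comment above: header bytes, witness program, 15 bytes of padding.
payload = bytes.fromhex(
    "151A0076A914010966776006953D5567439E5E39F86A0D273BEE88AC" + "00" * 15
)
blob = checksum_first(payload)
print(len(payload), len(blob))      # 43 47
print(verify_checksum_first(blob))  # True
```

Because the checksum sits in the high-order bytes, any change to the witness program changes the bytes that dominate the leading characters of the encoded address.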
-
sipa commented at 11:22 PM on December 24, 2015: member
No problem with that :)
-
sipa commented at 11:22 PM on December 24, 2015: member
But it doesn't solve the problem really. An attacker can trivially grind to make the first bytes match up too.
-
dabura667 commented at 11:55 PM on December 24, 2015: none
32 byte checksum :-D
-
dabura667 commented at 11:57 PM on December 24, 2015: none
Also, still curious on the 41 byte max length reasoning.
Wouldn't it make sense to allow for some kind of variable length scheme?
-
jl2012 commented at 2:17 AM on December 25, 2015: contributor
@gubatron @dabura667 You have to read this with the main segwit BIP: #265
41-byte is the maximum length for a witness program allowed in BIP-SW. The reason for 41-byte should be discussed in that BIP.
Since 58 is not a power of 2, replacing <PUBKEY> OP_CHECKSIG with <PUBKEY> OP_DROP OP_TRUE will completely ruin the address, except the first 2-3 characters.
I have considered variable length, but the first character is fixed only if the length is fixed
-
jl2012 commented at 2:28 AM on December 25, 2015: contributor
@dabura667 moving the checksum to the beginning should be a good idea. It spreads its entropy to the whole address since 58 is not a power of 2.
Using variable length will make the address shorter in most cases, but the first character will also become variable
-
dabura667 commented at 4:12 AM on December 25, 2015: none
will completely ruin the address
Only after a certain point. Not the whole address. One thing I overlooked is the segwit length byte, but as long as that stays the same I can create a very similar address.
Try this:
0f240021026a85bb9fb33e8841c83b52f8d15438ccc65c646657c5992dddeffd0b21790c19ac0000000000
(no checksum) is bob's address
0f240021026a85bb9fb33e8841c83b52f8d15438ccc65c646657c5992dddeffd0b21790c19b90000000000
This is the malicious address, replaces checksig with nop so that anyone can spend.
The address is the same up until 2/3 of the way in, unless we insert the checksum after the version byte.
I can't get to a pc right now, but try it yourself
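For anyone who wants to try it, here is a small sketch that Base58-encodes the two payloads above (no checksum, as in the example) and reports how many leading characters they share. It assumes the standard Bitcoin Base58 alphabet; the helper names are illustrative.

```python
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58_encode(data: bytes) -> str:
    """Plain Base58 with the Bitcoin alphabet; no checksum is appended."""
    num = int.from_bytes(data, "big")
    digits = []
    while num > 0:
        num, rem = divmod(num, 58)
        digits.append(ALPHABET[rem])
    # Each leading zero byte is represented by a leading '1'.
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + "".join(reversed(digits))

def common_prefix_len(a: str, b: str) -> int:
    """Number of leading characters the two strings have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

# Shared bytes of both payloads, up to and including the public key.
shared = "0f240021026a85bb9fb33e8841c83b52f8d15438ccc65c646657c5992dddeffd0b21790c19"
bob = bytes.fromhex(shared + "ac" + "00" * 5)      # ends in OP_CHECKSIG (0xac)
mallory = bytes.fromhex(shared + "b9" + "00" * 5)  # 0xac replaced by 0xb9

a, b = base58_encode(bob), base58_encode(mallory)
print(a)
print(b)
print(common_prefix_len(a, b), "of", len(a), "characters match")
```

The changed byte sits near the low-order end of the big-endian integer being encoded, so only the trailing Base58 digits are affected.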
-
jl2012 commented at 4:13 AM on December 25, 2015: contributor
@dabura667 Yes you are right. Moving the checksum to the beginning should fix the problem
-
maaku commented at 6:30 AM on December 25, 2015: contributor
Moving checksum to the beginning doesn't do anything to fix the problem -- not with the checksum size being considered.
But there are two much more fundamental problems with this proposal:
- Segwit was specifically constructed so as to not require a new address format. P2PKH and P2SH addresses work just fine, and more complicated schemes can be done via the payment protocol. This has three primary benefits: (a) infrastructure does not need to be updated, (b) p2pkh and p2sh can be used as a fallback for non-upgraded nodes (the upgraded node listens for both segwit and non-segwit incoming coins), and (c) we can avoid placing this debate on the critical path for implementation.
- Base58 addresses are a terrible idea that would never have passed peer review if done today. They are variable-length, distinguish capital and lowercase, have no error correction, are not accessible, and are insecure for encoding anything other than a hash (which this proposal is not!).
I suggest putting this BIP on hold and instead working on a proper implementation of a next-generation address format, using e.g. Damm's algorithm or Reed-Solomon error correction codes, then revisiting this as a potential future address format if an actual need is demonstrated.
-
NicolasDorier commented at 7:21 AM on December 25, 2015: contributor
If you want to use the payment protocol, you need a server you trust to provide the payment endpoint for people who need to pay you; you also need a protobuf parser and an understanding of PKI. This simply cannot replace an address format for wallet providers.
An address format is needed. If Base58 is a terrible idea, proposing another scheme should be considered important; without an address format, wallet providers will simply not implement segwit (or will use a suboptimal address scheme).
-
maaku commented at 7:25 AM on December 25, 2015: contributor
@NicolasDorier the point is that you can make a segwit payment to a p2pkh or p2sh address, so existing addresses work just fine.
-
jl2012 commented at 8:16 AM on December 25, 2015: contributor
@maaku I think the BIP has explained why an address is needed for native segwit. Since this is a separate BIP, it shouldn't affect the implementation of segwit itself.
The motivation is simple: to make an address that every existing Bitcoin user understands, that is easy for wallet devs to implement, and that is at least as safe as existing addresses.
"They are variable-length, distinguish capital and lowercase, have no error correction, are not accessible, and are insecure for encoding anything other than a hash (which this proposal is not!)."
Out of these comments.....
- This is fixed length
- Capital and lowercase ---- same as current addresses
- No error correction ---- same as current addresses
- Not accessible ---- a bit worse than current addresses as it is longer
- Insecure for encoding anything other than a hash ---- true
So only point 5 is relevant. If this is addressed, the new address format would not be worse than an existing one (except longer)
An obvious solution is to increase the length of checksum, so it basically becomes encoding a hash (i.e. checksum) again.
-
maaku commented at 9:22 AM on December 25, 2015: contributor
- This is fixed length.
Interesting. This is not the case with P2PKH or P2SH addresses, which vary in size between 33 and 34 characters. But testing randomly generated scripts does seem to indicate a consistent size here.
same as current addresses
The point is that current addresses fail considerably on these metrics. If we go forward with generating a new address format, then we should aim to create a better standard than the one we are stuck with today. And this isn't a case of "more research is required" -- there are decades of research literature and practical experience in encoding short strings of information with error correction in human-friendly formats.
For example, most of the concerns I raised above could be alleviated by using a base32 encoding with a Damm error correction code. Any single digit error or swapping of consecutive digits would be corrected, case wouldn't matter, and code points would be selected for visual and auditory distinction. A payload of your format would look like the following:
qnouwchpsg14mj6eeyta5qi8ep7urd7kzgcw75cdrksearqxf64fdxkeomb6999ufak3stqa5
Code for manipulating these formats would actually be less than base58 complexity, for an encoding that is 73 characters instead of 64.
You might want to consider a 2nd encoding format for smaller, more commonly used script sizes.
An obvious solution is to increase the length of checksum, so it basically becomes encoding a hash (i.e. checksum) again.
Unfortunately you really need a full 32 bytes to close off all forms of attack with 128-bit security. With encoding sizes >100 characters as a result, one really wonders what the value of such a proposal would be.
A better solution is to permute/mix the bytes of the payload in a reversible way based on their base-8 checksum.
-
jl2012 commented at 9:42 AM on December 25, 2015: contributor
for an encoding that is 73 characters instead of 64.
Longer than 80 will break the line. I think 75 is the upper limit.
A better solution is to permute/mix the bytes of the payload in a reversible way based on their base-8 checksum.
How about using AES, with the checksum as the key?
-
maaku commented at 9:57 AM on December 25, 2015: contributor
How about using AES, with the checksum as the key?
Well it'd have to be something recoverable. The idea I had in mind was to use an order-independent checksum (e.g. XOR the bytes, but maybe there's something better in the literature) and a deterministic, data-independent shuffle like Fisher-Yates. Since such a shuffle only moves bytes around without modifying them, the 8-bit order-independent checksum would be preserved, allowing you to unshuffle during decode. AES wouldn't have that property though.
That would not provide cryptographic security though. I need to think on this some more.
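A minimal sketch of the idea described in this comment: XOR all bytes into an 8-bit checksum, use it to seed a Fisher-Yates shuffle of byte positions, and rely on XOR being order-independent so the decoder can recompute the same seed. Seeding Python's `random.Random` is an illustrative choice, not something proposed in the thread, and the function names are hypothetical. As noted above, this gives no cryptographic security: an 8-bit seed allows at most 256 permutations.

```python
import random
from functools import reduce

def xor_checksum(data: bytes) -> int:
    """8-bit checksum that is unchanged by any reordering of the bytes."""
    return reduce(lambda acc, b: acc ^ b, data, 0)

def permutation(n: int, seed: int) -> list:
    """Fisher-Yates shuffle of range(n), driven by a PRNG seeded with `seed`."""
    rng = random.Random(seed)
    order = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i + 1)
        order[i], order[j] = order[j], order[i]
    return order

def shuffle_bytes(data: bytes) -> bytes:
    """Reorder the bytes using a permutation keyed by their XOR checksum."""
    order = permutation(len(data), xor_checksum(data))
    return bytes(data[i] for i in order)

def unshuffle_bytes(mixed: bytes) -> bytes:
    """Invert shuffle_bytes; the XOR of `mixed` equals the XOR of the original."""
    order = permutation(len(mixed), xor_checksum(mixed))
    out = bytearray(len(mixed))
    for pos, i in enumerate(order):
        out[i] = mixed[pos]
    return bytes(out)

data = bytes(range(43))
mixed = shuffle_bytes(data)
print(unshuffle_bytes(mixed) == data)  # True
```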
-
jl2012 commented at 10:08 AM on December 25, 2015: contributor
Well it'd have to be something recoverable.
AES should be recoverable, if the key (i.e. checksum) is given in plaintext.
-
gubatron commented at 1:53 PM on December 25, 2015: none
I was just confused because the format spec at first said:
[2 to 41-byte witness program] [padding by 0x00 to 43 bytes]
and then the example said: "Padding with 41 - 26 = 15 0x00:"
but I thought after that you needed to add a couple more 0x00's to get to 43 bytes for the padding.
-
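One reading that reconciles the two numbers, sketched against the payload dabura667 quoted earlier in the thread. It assumes the layout is [version byte] [length byte] [witness program] [zero padding], which is inferred from the discussion rather than confirmed by the spec text. Under that assumption the program is padded to 41 bytes, and the two header bytes bring the total to 43.

```python
# Payload quoted earlier in the thread (15 bytes of 0x00 padding at the end).
payload = bytes.fromhex(
    "151A0076A914010966776006953D5567439E5E39F86A0D273BEE88AC" + "00" * 15
)

MAX_PROGRAM = 41                     # maximum witness program length discussed above

version = payload[0]                 # ASSUMPTION: first byte is the address version
length = payload[1]                  # ASSUMPTION: second byte is the program length
program = payload[2:2 + length]
padding = payload[2 + length:]

print(length)                        # 26
print(MAX_PROGRAM - length)          # 15
print(len(padding), len(payload))    # 15 43
```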
jl2012 commented at 8:52 AM on December 26, 2015: contributor
The PR is closed for now for rewrite
- jl2012 closed this on Dec 26, 2015
- jl2012 reopened this on Dec 27, 2015
-
New proposal with 2 address types 93cedfbf2d
-
jl2012 commented at 8:23 AM on December 27, 2015: contributor
The proposal has been updated with 2 address types defined
-
dabura667 commented at 11:25 AM on December 27, 2015: none
Tested a 71-character QR code for kicks. Here are my results.
tl;dr 71 characters is small enough to be made into a reasonably sized QR and be read at normal distances regardless of the strength of error correction the QR contains.
Site used in testing: http://www.hcidata.info/qr_code.php
Webcam used in testing: 1080p Logicool standard web camera
Content: bayyyhyyyyyyyyyyyyq4wteyyejc3w5sybwui8iksjqoh6mrah9o4hoprh7t67nfcsdmi1t
ECC: L - smallest error correction
Size: 6 (4.8 cm x 4.8 cm on my 1080p screen)
Farthest successful result: 53 cm away from screen
ECC: H - best error correction
Size: 4 (4.8 cm x 4.8 cm on my 1080p screen)
Farthest successful result: 38 cm away from screen (tried on my iPhone 6: successfully read it from 48 cm)
-
in bip-segwitaddress.mediawiki:None in 93cedfbf2d outdated
58 | +
59 | +It is case-insensitive and includes all alphanumeric characters excluding 0, 2, l, v. The order of alphabet is chosen so that less ambiguous alphabet characters will appear more frequently than others.
60 | +
61 | +An address starts with a version digit, which is b<sub>32</sub> for the main-network and t<sub>32</sub> for the testnet.
62 | +
63 | +The next digit is a length digit, which the value is length of the witness program in byte (L) minus 2.
dabura667 commented at 11:37 AM on December 27, 2015:
Why not make it length - 10? This way we could account for a larger space of scripts (supporting all the way up to 41)
What scripts of LEN < 10 can you think of that would be useful for the everyday user? (actually, < 9 if you don't count the segwit version byte)
jl2012 commented at 12:29 PM on December 27, 2015:
@sipa may want to reduce the max size from 41 to 33. Anything longer than 33 should use a version 1 witness program. Unless we need a strong hash function in the future, there is no reason to use more than 33 bytes. And it is very easy to extend it with a softfork.
I assume that segwit being limited to 41 bytes means multisig etc. are planned to be P2SH.
Yes, that's called "version 1 witness program"
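The numbers in this exchange can be checked with a little arithmetic. This is a rough sketch, assuming a single base-32 digit (values 0 to 31) stores the program length minus an offset; `length_range` is a hypothetical helper. Removing the four excluded characters from the 36 alphanumerics leaves exactly 32 symbols. An offset of 2 covers lengths 2 to 33, and an offset of 10 covers 10 to 41.

```python
import string

# Case-insensitive alphanumerics minus the four excluded characters in the quoted diff.
alphabet = sorted(set(string.ascii_lowercase + string.digits) - set("02lv"))
print(len(alphabet))  # 32

def length_range(offset: int, base: int = 32) -> tuple:
    """Witness program lengths L representable when one digit stores L - offset."""
    return offset, offset + base - 1

print(length_range(2))   # (2, 33)
print(length_range(10))  # (10, 41)
```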
in bip-segwitaddress.mediawiki:None in 93cedfbf2d outdated
14 | +==Motivation==
15 | +
16 | +To define standard payment addresses for native segwit transactions to promote early adoption of the more efficient transaction method.
17 | +
18 | +== Specification ==
19 | +=== P2PKH segwit address ===
dabura667 commented at 11:40 AM on December 27, 2015:
Why not also include a similar proposal for P2SH segwit while you're at it?
I assume that segwit being limited to 41 bytes means multisig etc. are planned to be P2SH.
-
Update for new witness prog design & formatting c2d3488c02
-
in bip-segwitaddress.mediawiki:None in 93cedfbf2d outdated
123 | + bayyyh-yyyyyy-yyyyyy-q4wtey-yejc3w-5sybwu-i8iksj-qoh6mr-ah9o4h-oprh7t-67nfcs-dmi1t
124 | +
125 | +==Reference Implementation==
126 | +From arbitrary witness program to general segwit address: https://gist.github.com/jl2012/760b0f952715b8b6c608
127 | +
128 | +==See Also==
btcdrak commented at 4:20 PM on December 27, 2015:
Use References instead.
- jl2012 force-pushed on Dec 29, 2015
- luke-jr added the label New BIP on Dec 30, 2015
- luke-jr added the label Needs number assignment on Dec 30, 2015
-
jl2012 commented at 10:40 AM on December 31, 2015: contributor
The python code now allows encode and decode of general segwit address
- rubensayshi cross-referenced this on Jan 7, 2016 from issue BIP 141: Segregated Witness (Consensus layer) by CodeShark
- luke-jr assigned luke-jr on Jan 8, 2016
- luke-jr renamed this:
BIP draft: Address Format for Witness Program
BIP 142: Address Formats for Witness Program
on Jan 8, 2016
- luke-jr removed the label Needs number assignment on Jan 8, 2016
- luke-jr merged this on Jan 8, 2016
- luke-jr closed this on Jan 8, 2016
- jl2012 cross-referenced this on Jan 18, 2016 from issue New witness program definition in BIP141, and related revision in 142 - 144 by jl2012
- luke-jr referenced this in commit fd560bb671 on Jan 20, 2018