sipa
commented at 11:16 pm on February 7, 2019:
member
This adds support for a descriptor-specific 8-character checksum.
Descriptors may optionally be suffixed with a # plus these 8 checksum characters. Any descriptor that contains a # at the end must be followed by a valid checksum. If the # is missing entirely, it is valid without checksum.
All RPCs are updated to report descriptors that include the checksum. On input, they are optional except in deriveaddress and importmulti, which require descriptors which include a checksum.
A new RPC is also added to analyse descriptors (getdescriptorinfo), which can be used to compute the checksum for a descriptor without.
sipa force-pushed
on Feb 7, 2019
instagibbs
commented at 2:21 am on February 8, 2019:
member
Can a motivation for the placement be given? Accidentally eliding it for whatever reason neuters the protection while still maintaining a valid descriptor, but clearly it seems simpler from an implementation and compatibility perspective.
sipa
commented at 2:30 am on February 8, 2019:
member
@gsanders Well for critical RPCs the plan is that the checksum won’t be optional. I just haven’t included that in this PR as it means adapting a bunch of tests, which I only want to do once the checksum algorithm is final.
meshcollider added the label
Wallet
on Feb 8, 2019
meshcollider
commented at 3:18 am on February 8, 2019:
contributor
Concept ACK
meshcollider added this to the milestone 0.18.0
on Feb 8, 2019
promag
commented at 2:44 pm on February 8, 2019:
member
Concept ACK.
It’d be nice to read a draft to update doc/descriptors.md.
sipa
commented at 8:04 pm on February 8, 2019:
member
Added a section to doc/descriptors.md.
in
doc/descriptors.md:182
in
975a4c41bdoutdated
177+
178+These checksums consist of 8 alphanumeric characters. As long as errors are
179+restricted to substituting characters in `0123456789()[],'/*abcdefgh@:$%{}`
180+for others in that set and changes in letter case, up to 4 errors will always
181+be detected in descriptors up to 501 characters, and up to 3 errors in longer
182+ones. For larger numbers of errors, or other types of errors, the is a
in
src/script/descriptor.cpp:27
in
141c965313outdated
18@@ -19,6 +19,97 @@
1920 namespace {
2122+////////////////////////////////////////////////////////////////////////////
23+// Checksum //
24+////////////////////////////////////////////////////////////////////////////
25+
26+// This section implements a checksum algorithm for descriptors with the following properties:
27+// * Every 1 character substitition error counts as 1 or 2 symbol errors, but:
practicalswift
commented at 1:22 pm on February 9, 2019:
Sjors
commented at 7:45 pm on February 10, 2019:
member
Concept ACK, will review shortly. Agree that deriveaddress and importmulti should require a checksum. The introduction of getdescriptorinfo means there’s no need to make that opt-out.
Do I understand correctly that the checksum is either based on the canonical form, i.e. based on public keys, or skips keys altogether (but that seems suboptimal when these keys are not in checksummed xpub form)? Otherwise the result of getdescriptorinfo could be confusing if you feed it an xpriv.
Maybe return a warning if a user does provide an xpriv that they should clear their shell command history (and generally recommend either not doing that, or providing a safer method like #15346).
meshcollider
commented at 8:09 pm on February 10, 2019:
contributor
The surrounding code looks good other than the comments above, haven’t reviewed the actual checksum code itself yet
in
src/rpc/misc.cpp:164
in
141c965313outdated
158+ " \"issolvable\" : true|false, (boolean) Whether the descriptor is solvable\n"
159+ " \"isprivate\" : true|false, (boolean) Whether the input descriptor contained at least one private key\n"
160+ "}\n"
161+ },
162+ RPCExamples{
163+ "Analyse a descriptor\n" +
laanwj
commented at 12:21 pm on February 12, 2019:
might want to mention that this example only works on mainnet
laanwj
commented at 12:22 pm on February 12, 2019:
member
lightly tested ACK, code changes look good to me, haven’t checked any of the magic numbers in PolyMod
sipa force-pushed
on Feb 13, 2019
sipa force-pushed
on Feb 13, 2019
sipa force-pushed
on Feb 13, 2019
sipa force-pushed
on Feb 13, 2019
sipa force-pushed
on Feb 13, 2019
sipa force-pushed
on Feb 13, 2019
sipa
commented at 3:17 am on February 13, 2019:
member
Several changes:
Addressed all comments
Finalized the checksum design (and switched to a slightly better generator)
Added explanation (incl. Sage code) of the checksum
Made checksums mandatory in deriveaddresses and importmulti
Expanded and updated tests, including a Python implementation of the checksum
in
src/script/descriptor.cpp:62
in
5a04dafbd2outdated
48+ * 3 errors in windows up to 19000 symbols.
49+ * - Taking all those generators, and for degree 7 ones, extend them to degree 8 by adding all degree-1 factors.
50+ * - Selecting just the set of generators that guarantee detecting 4 errors in a window of length 512.
51+ * - Selecting one of those with best worst-case behavior for 5 errors in windows of length up to 512.
52+ *
53+ * The generator and the constants to implement it can be verified using this Sage code:
in
src/script/descriptor.cpp:28
in
83949e8b17outdated
23+// Checksum //
24+////////////////////////////////////////////////////////////////////////////
25+
26+// This section implements a checksum algorithm for descriptors with the following properties:
27+// * Every 1 character substitution error counts as 1 or 2 symbol errors, but:
28+// * An error substituting a character from 0123456789()[],'/*abcdefgh@:$%{} for another in
Maybe flip this around, and explain a bit more how symbol error count works:
0// * Mistakes in a descriptor string are measured in symbol errors. A higher symbol
1// error count is more difficult to detect, because it becomes indistinguishable
2// from a different descriptor.
3// * An error substituting a character from 0123456789()[],'/*abcdefgh@:$%{} for
4// another in that set always counts as 1 symbol error.
5// * A case error always counts as 1 symbol error.
6// * Any other 1 character substitution error counts as 1 or 2 symbol errors.
7// * Note that hex encoded keys are covered by these special characters, whereas xprivs
8// and xpubs use different characters, but already have their own checksum mechanism.
9// Function names like "multi()" use different characters, but mistakes would generally
10// result in an unparseable descriptor.
Some of this is also explained in DescriptorChecksum in different wording.
Nit: putting ABCDEFGH and abcdefgh all the way to the right would make it a bit more clear that they intentionally have the same offset (for some reason Github doesn’t believe in fixed-width font for code, but even in editors it would be more clear).
Github renders in a fixed-width font here. I don’t think this concern weighs up against keeping the characters in alphabetical ordering (to the extent possible).
practicalswift
commented at 7:11 pm on February 13, 2019:
FWIW, abc... and ABC... are perfectly aligned here :-)
A non-fixed width GitHub code view is surely not intentional – I suggest reporting to GitHub! :-)
I think that’s fine because both descriptor enhanced importmulti and deriveaddress are new anyway. We can make them more user friendly later.
Sjors
commented at 10:56 am on February 13, 2019:
member
tACK4f95087 modulo RPC help syntax
I can’t vouch for the actual math, but the documentation, Sage code, tests and Python re-implementation are comforting. One way, perhaps overkill, to sanity check that the checksum works as intended is to generate a whole bunch of deterministic typos and see that they are indeed detected.
sipa force-pushed
on Feb 13, 2019
sipa force-pushed
on Feb 13, 2019
sipa
commented at 6:15 pm on February 13, 2019:
member
Rebased.
gmaxwell
commented at 7:08 pm on February 14, 2019:
contributor
@instagibbs it’s funny, I stumbled across this, too, and did a little search on analyse/analyze in the code base, it’s almost evenly split. So it would probably be better to do a separate PR that standardizes (standardises?) on either the British or American spelling
gmaxwell
commented at 9:04 pm on February 14, 2019:
I am not a fan of PRs to go around switching between different english styles, especially in comments. All of them are valid, all can be mutually read by english speakers. Trying to maintain consistency would just mean a never ending sequence of fixups PRs.
We have a finite amount of resources to handle changes, they should be conserved for efforts that improve the capability or reliability of the software.
I’m not saying it should necessarily be changed, but changing it in this PR only will not really add anything wrt consistency either (since analyze/analyse appear in more or less the same frequency in the code base). So… leave it as is?
instagibbs
commented at 9:47 pm on February 14, 2019:
I had no idea it was a valid word, haha. I retract the comment!
instagibbs
commented at 9:47 pm on February 14, 2019:
I thought it was a misspelling of “analysis” which made no grammatical sense.
instagibbs changes_requested
instagibbs
commented at 7:24 pm on February 14, 2019:
member
Please add a single functional test for deriveaddresses and importmulti lacking the checksum, I compiled out the mandatory flag for them and tests seem to pass.
sipa force-pushed
on Feb 14, 2019
sipa
commented at 11:33 pm on February 14, 2019:
member
Rebased and added a test to deriveaddresses and importmulti to test for missing checksum.
in
test/functional/rpc_deriveaddresses.py:22
in
34164546d8outdated
Instead of adding the new argument with a default value, why not create a new function ParseChecked? From the call site it is more clear, especially since it’s statically defined where checksum is needed.
promag
commented at 4:56 pm on February 15, 2019:
member
Needs release for the new RPC. deriveaddress release notes already points to descriptors.md.
sipa force-pushed
on Feb 15, 2019
sipa added the label
Needs release note
on Feb 15, 2019
sipa
commented at 10:19 pm on February 15, 2019:
member
@promag I’ve just added a “needs release notes” label for now, as it intersects with the notes added for deriveaddresses and importmulti.
@fanquake Done.
in
src/script/descriptor.h:65
in
a19b3f92e2outdated
62@@ -63,7 +63,7 @@ struct Descriptor {
63 };
6465 /** Parse a descriptor string. Included private keys are put in out. Returns nullptr if parsing fails. */
in
src/script/descriptor.cpp:135
in
a19b3f92e2outdated
130+ }
131+ if (clscount) c = PolyMod(c, cls);
132+ for (int j = 0; j < 8; ++j) c = PolyMod(c, 0); // Shift further to determine the checksum.
133+ c ^= 1; // Prevent appending zeroes from not affecting the checksum.
134+
135+ std::string ret;
promag
commented at 1:05 am on February 16, 2019:
member
Could have an “empty checksum” test. Feel free to ignore my nits. LGTM code wise.
sipa force-pushed
on Feb 16, 2019
Descriptor checksum3b40bff988
Add getdescriptorinfo to compute checksumb52cb63688
Make descriptor checksums mandatory in deriveaddresses and importmultibe62903c41
Add checksums to descriptors.mdfd637be8d2
sipa force-pushed
on Feb 16, 2019
DrahtBot
commented at 7:59 am on February 16, 2019:
member
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
Conflicts
Reviewers, this pull request conflicts with the following ones:
#15414 ([wallet] allow adding pubkeys from imported private keys to keypool by Sjors)
#14912 ([WIP] External signer support (e.g. hardware wallet) by Sjors)
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
in
src/script/descriptor.cpp:101
in
fd637be8d2
96+ * - The most common 'unprotected' descriptor characters (hex, keypaths) are in the first group of 32.
97+ * - Case errors cause an offset that's a multiple of 32.
98+ * - As many alphabetic characters are in the same group (while following the above restrictions).
99+ *
100+ * If p(x) gives the position of a character c in this character set, every group of 3 characters
101+ * (a,b,c) is encoded as the 4 symbols (p(a) & 31, p(b) & 31, p(c) & 31, (p(a) / 32) + 3 * (p(b) / 32) + 9 * (p(c) / 32).
meshcollider
commented at 11:12 am on February 16, 2019:
I’m not sure if I’m reading this wrong, but isn’t the fourth symbol in the wrong order, shouldn’t this be (p(c) / 32) + 3 * (p(b) / 32) + 9 * (p(a) / 32)?
meshcollider
commented at 11:16 am on February 16, 2019:
contributor
This is a metadata mirror of the GitHub repository
bitcoin/bitcoin.
This site is not affiliated with GitHub.
Content is generated from a GitHub metadata backup.
generated: 2025-06-12 06:13 UTC
This site is hosted by @0xB10C More mirrored repositories can be found on mirror.b10c.me