secp256k1_pubkey_sort
function the the public API which was originally part of the musig PR (#1479). However, I opened a separate PR because it adds internal functions that are also used by the WIP silent payments module.
Add secp256k1_pubkey_sort #1518
pull jonasnick wants to merge 1 commits into bitcoin-core:master from jonasnick:pubkey-sort changing 7 files +437 −0-
jonasnick commented at 7:29 pm on April 16, 2024: contributorThis PR adds a
-
real-or-random commented at 11:43 pm on April 16, 2024: contributor
Concept ACK. This is useful independently of MuSig and Silent Payments. Also for users with no standard library, e.g., on hardware wallets.
cc @roconnor-blockstream who wrote most of this code
Just out of curiosity, I checked what glibc actually does. They have Quicksort, Insertionsort, Mergesort, and Heapsort in their code base. If malloc fails, they’ll always fall back to Heapsort. I think they had been using Quicksort and Mergesort for decades (perhaps selected depending on the platform?), but they changed their mind quite often lately, and this was indeed related to degenerate and adversarially constructed inputs:
- Oct 2023: They added Heapsort and used it to implement Introsort. Introsort is basically Quicksort two fallbacks: 1) a fallback on some O(n log n) sorting algorithm in degenerate cases, and 2) a fallback on Insertion sort for small arrays (https://github.com/bminor/glibc/commit/274a46c9b25ab733a1fb9fb1497f1beecae30193). They had also removed Mergesort, making Introsort the algorithm used everywhere. (https://github.com/bminor/glibc/commit/03bf8357e8291857a435afcc3048e0b697b6cc04)
- Nov 2023: The logic for the fallback to Heapsort was tightened, and a test with a degerate array. (https://github.com/bminor/glibc/commit/64e4acf24da15c11cb83f933947df3b2e8a700cd: “To avoid a denial-of-service condition with adversarial input, it is necessary to fall over to heapsort if tail-recursing deeply, too […]”)
- Jan 2024: They reverted to Mergesort because it turns out that people rely on the sorting being stable, even though this is not guaranteed by the standard, and even though their Heapsort fallback won’t guarantee it anyway. (https://github.com/bminor/glibc/commit/709fbd3ec3595f2d1076b4fec09a739327459288)
This means that the current libc is probably fine. But old versions may be problematic, and other implementations of the standard library may also be problematic. If anything, all of this confirms that Heapsort is a good choice, and that we should not rely on the standard library here.
-
real-or-random added the label feature on Apr 16, 2024
-
in src/modules/extrakeys/Makefile.am.include:1 in cc7d18a8a8
real-or-random commented at 11:46 pm on April 16, 2024:nit: No need to touch this file. Is this a leftover diff from when the sorting function was still in extrakeys (see #1479 (review))? Also, the commit message still mentions “extrakeys”.
jonasnick commented at 5:44 pm on April 17, 2024:removed the file and the mention of extrakeysin src/hsort.h:29 in cc7d18a8a8 outdated
13+/* In-place, iterative heapsort with an interface matching glibc's qsort_r. This 14+ * is preferred over standard library implementations because they generally 15+ * make no guarantee about being fast for malicious inputs. 16+ * 17+ * See the qsort_r manpage for a description of the interface. 18+ */
josibake commented at 11:32 am on April 17, 2024:nit: Might be worth leaving a small reminder to the user thathsort
is unstable. I had forgotten this detail, which caused me quite a bit of confusion. A small comment would either a) remind the user or b) prompt them to look up what that means before using and hopefully save some headaches.
real-or-random commented at 1:52 pm on April 17, 2024:- See the qsort_r manpage for a description of the interface.
By the way, we could also just spell out the interface in the comment. But the current version is also fine, it’s an internal function in the end.
josibake commented at 11:34 am on April 17, 2024: memberConcept ACK. This is particularly useful for the silent payments module in that we are able to hide a lot of complexity from the caller. In addition, I agree with Tim’s assessment that this is a generally useful thing to have (not specific to Musig2/Silent payments).
Left a small nit regarding documentation but overall I found this pretty easy to work with as implemented when creating a custom
cmp
for silent payments.in src/hsort_impl.h:16 in cc7d18a8a8 outdated
11+ 12+/* An array is a heap when, for all non-zero indexes i, the element at index i 13+ * compares as less than or equal to the element at index parent(i) = (i-1)/2. 14+ */ 15+ 16+static SECP256K1_INLINE size_t child1(size_t i) {
roconnor-blockstream commented at 4:13 pm on April 17, 2024:Do we need to prefix all these functions withsecp256k1_
?
real-or-random commented at 4:45 pm on April 17, 2024:Yep, the child1/child2 functions should probably also get an additional_heap
prefix, e.g.,secp256k1_heap_child1
jonasnick commented at 5:44 pm on April 17, 2024:donejonasnick renamed this:
extrakeys: add secp256k1_pubkey_sort
Add secp256k1_pubkey_sort
on Apr 17, 2024jonasnick force-pushed on Apr 17, 2024in src/hsort_impl.h:42 in 137ed38a10 outdated
37+ stride -= 64; 38+ } 39+ secp256k1_heap_swap64(a, i, j, stride); 40+} 41+ 42+static SECP256K1_INLINE void heap_down(unsigned char *a, size_t i, size_t heap_size, size_t stride,
roconnor-blockstream commented at 6:24 pm on April 17, 2024:This needs asecp256k1_
prefix too.jonasnick force-pushed on Apr 17, 2024jonasnick force-pushed on Apr 17, 2024in include/secp256k1.h:477 in 4520d7527c outdated
473@@ -474,6 +474,20 @@ SECP256K1_API SECP256K1_WARN_UNUSED_RESULT int secp256k1_ec_pubkey_cmp( 474 const secp256k1_pubkey *pubkey2 475 ) SECP256K1_ARG_NONNULL(1) SECP256K1_ARG_NONNULL(2) SECP256K1_ARG_NONNULL(3); 476 477+/** Sort public keys keys using lexicographic (of compressed serialization) order
theStack commented at 10:45 pm on April 17, 2024:0/** Sort public keys using lexicographic (of compressed serialization) order
theStack commented at 10:45 pm on April 17, 2024: contributorConcept ACKin src/hsort_impl.h:43 in 4520d7527c outdated
35+ while (64 < stride) { 36+ secp256k1_heap_swap64(a + (stride - 64), i, j, 64); 37+ stride -= 64; 38+ } 39+ secp256k1_heap_swap64(a, i, j, stride); 40+}
real-or-random commented at 12:14 pm on April 18, 2024:secp256k1_heap_swap
is incorrect forstride > 64
. See https://github.com/real-or-random/secp256k1/commit/95138ca6c1336bcd20ef332dd742f8a5f0f7cf75 for a fix and a test. This was overlooked because we never use such a stride, not even in the tests.Alternatively, we could drop
secp256k1_heap_swap
entirely.secp256k1_heap_swap64
appears to be correct. But I’d say let’s just keep it if my fix looks good.
roconnor-blockstream commented at 4:51 pm on April 18, 2024:FWIW, below is more like what I intended to write:
0static SECP256K1_INLINE void swap64(unsigned char *a, unsigned char *b, size_t len) { 1 unsigned char tmp[64]; 2 VERIFY_CHECK(len <= 64); 3 memcpy(tmp, a, len); 4 memmove(a, b, len); 5 memcpy(b, tmp, len); 6} 7 8static SECP256K1_INLINE void swap(unsigned char *arr, unsigned char *a, size_t stride) { 9 unsigned char *a = arr + i*stride; 10 unsigned char *b = arr + j*stride; 11 size_t len = stride; 12 while (64 < len) { 13 swap64(a + (len - 64), b + (len - 64), 64); 14 len -= 64; 15 } 16 swap64(a, b, len); 17}
Obviously I’ve gotten this wrong before, so take it with a grain of salt.
real-or-random commented at 8:07 am on April 19, 2024:Looks correct (except thati
andj
should be in the parameter instead ofa
) to me, and is easier to read than my version, I think.
jonasnick commented at 2:23 pm on April 19, 2024:Oops. Good catch @real-or-random. I cherry picked your test, copied @roconnor-blockstream’s fix and improved the existing test.elichai commented at 3:55 pm on April 18, 2024: contributorA wild idea: What if we used something simple(like shellsort or even bubblesort) that is very easy to verify, and has good performance in the best case (when it’s already sorted) and then we recommend users that use huge list of pub keys to sort them before passing them in with the stdlib sort function of the platform their using.
Needing to reimplement and maintaining highly efficient in place sorting every time is very sad
About it being useful, I’ll say that embedded rust still gives you sorting via libcore, and I think every embedded C toolkit has some sort of sorting utility, but I might be wrong on this
real-or-random commented at 8:14 am on April 19, 2024: contributorA wild idea: What if we used something simple(like shellsort or even bubblesort) that is very easy to verify, and has good performance in the best case (when it’s already sorted) and then we recommend users that use huge list of pub keys to sort them before passing them in with the stdlib sort function of the platform their using.
Hm, I think this gives us the worst of both worlds in the end:
- We have to maintain the implementation of sorting algorithm.
- We still tell users to rely on the stdlib.
Also, I doubt that Shellsort or Bubblesort are much easier to implement than Heapsort. (And look at the nature of the bug I found: The swap was wrong, not the Heapsort itself. And you’ll need a swap for every algorithm.)
Needing to reimplement and maintaining highly efficient in place sorting every time is very sad
Yeah, but that’s because C is sad (sometimes).
josibake commented at 12:01 pm on April 19, 2024: memberI’ll echo @real-or-random ’s point and add:
and then we recommend users that use huge list of pub keys to sort them before passing them in with the stdlib sort function of the platform their using.
I think this is fine if the sorting step is a convenience, but in the case of the silent payments module sorting is a required step for generating the outputs correctly. For this, I think it’s far better to have sorting handled inside the module vs requiring the user to sort before passing the public keys. Doing the sorting for the users eliminates a difficult to catch error and we’d need to sort anyways to be able to detect and alert the user that they did not sort (or sorted incorrectly).
in src/tests.c:3696 in 2bb0baf63d outdated
3691+} 3692+ 3693+static void test_hsort_is_sorted(unsigned char *elements, size_t n, size_t len) { 3694+ size_t i; 3695+ for (i = 1; i < n; i++) { 3696+ CHECK(memcmp(&elements[(i-1) * len], &elements[i * len], len) <= 0);
real-or-random commented at 3:58 pm on April 19, 2024:nit:secp256k1_memcmp_var
according to CONTRIBUTING.md (or add a rule that says we can skip in tests, I’ll ACK)in src/tests.c:3711 in 2bb0baf63d outdated
3704+ 3705+static int test_hsort_cmp(const void *ele1, const void *ele2, void *data) { 3706+ struct test_hsort_cmp_data *d = (struct test_hsort_cmp_data *) data; 3707+ d->counter += 1; 3708+ return memcmp((unsigned char *)ele1, (unsigned char *)ele2, d->element_len); 3709+}
real-or-random commented at 3:58 pm on April 19, 2024:- indentation
secp256k1_memcmp_var
in src/hsort_impl.h:110 in 2bb0baf63d outdated
105+ size_t i; 106+ 107+ for(i = count/2; 0 < i; --i) { 108+ secp256k1_heap_down(ptr, i-1, count, size, cmp, cmp_data); 109+ } 110+ for(i = count; 1 < i; --i) {
real-or-random commented at 3:59 pm on April 19, 2024:spacing micronits
0 void *cmp_data) { 1 size_t i; 2 3 for (i = count/2; 0 < i; --i) { 4 secp256k1_heap_down(ptr, i-1, count, size, cmp, cmp_data); 5 } 6 for (i = count; 1 < i; --i) {
in src/tests.c:3738 in 2bb0baf63d outdated
3731+ for (i = 0; i < COUNT; i++) { 3732+ int n = secp256k1_testrand_int(NUM); 3733+ secp256k1_testrand_bytes_test(elements, n*element_len); 3734+ secp256k1_hsort(elements, n, element_len, test_hsort_cmp, &data); 3735+ test_hsort_is_sorted(elements, n, element_len); 3736+ }
real-or-random commented at 4:01 pm on April 19, 2024:suggestion: Also call the C libraryqsort_r
and compare the results. This is a very effective test at the cost of just a handful of code lines.
jonasnick commented at 6:10 pm on April 19, 2024:To test that giving exactly the same arguments toqsort_r
gives the same result? Unfortunately,qsort_r
is not part of the standard lib.
real-or-random commented at 6:51 pm on April 21, 2024:Unfortunately,
qsort_r
is not part of the standard lib.Yes, okay, I see. I think we could implement a test also with
qsort
if we add an adapter comparison function. But then it’s a few lines of code more. It certainly won’t hurt, but then it’s not entirely clear if it’s worth the hassle and gives us much more than the simple check that the output array is sorted.
jonasnick commented at 3:25 pm on April 22, 2024:If we can’t test against the API ofqsort_r
then I doubt that’s worth it. Note that we now spell out the interface of hsort explicitly instead of referring toqsort_r
.
real-or-random commented at 5:06 pm on April 22, 2024:sounds reasonablein src/tests.c:6759 in 2bb0baf63d outdated
6731+ int32_t ecount = 0; 6732+ secp256k1_context_set_illegal_callback(CTX, counting_callback_fn, &ecount); 6733+ CHECK(secp256k1_ec_pubkey_sort(CTX, pks_ptr, 2) == 1); 6734+ CHECK(ecount == 2); 6735+ secp256k1_context_set_illegal_callback(CTX, NULL, NULL); 6736+ }
real-or-random commented at 4:07 pm on April 19, 2024:Not exactly this PR, but do you think it’s worth noting in the API docs that callbacks may be triggered more than once per API call? This may be interesting in the case of the illegal callback because you wouldn’t need to crash necessarily for this type of callback (e.g., thinking of python bindings that could choose to raise). If yes, we could track this separately.
jonasnick commented at 6:46 pm on April 19, 2024:Hm good point. But if we think that calling the callback multiple times causes problems, it may makes more sense to fix/change that because it’s unlikely someone will read this and take it into account.
For the error callback the documentation says “After this callback returns, anything may happen, including crashing.” and for the illegal callback “When this callback is triggered, the API function called is guaranteed not to cause a crash, though its return value and output arguments are undefined.”
real-or-random commented at 6:52 pm on April 21, 2024:But if we think that calling the callback multiple times causes problems, it may makes more sense to fix/change that because it’s unlikely someone will read this and take it into account.
Let’s maybe track this in a separate issue. It’s certainly not great to call the callback twice, but last time when I tried to “fix” it, I wasn’t happy with the added code complexity.
real-or-random commented at 4:08 pm on April 19, 2024: contributorACK mod nits, I’ve inspected the code and the proofs again.
New improved tests are nice.
jonasnick commented at 6:38 pm on April 19, 2024: contributorI added a description of the interface to the hsort doc. Apparently mac OS uses a different order of arguments.in src/tests.c:3727 in 2bb0baf63d outdated
3722+ secp256k1_hsort(elements, 0, element_len, test_hsort_cmp, &data); 3723+ CHECK(data.counter == 0); 3724+ secp256k1_hsort(elements, 1, element_len, test_hsort_cmp, &data); 3725+ CHECK(data.counter == 0); 3726+ secp256k1_hsort(elements, NUM, element_len, test_hsort_cmp, &data); 3727+ CHECK(data.counter >= NUM);
roconnor-blockstream commented at 7:26 pm on April 19, 2024:Not entirely sure what we are trying to do with this check, but if we are thinking of hsort as a black box sorting function, the minimum number of comparisons is
NUM-1
(min 0).If we are thinking of hsort as a white box heap sort, then the minimum number of comparisons is larger, on the order of (3/2 n) or so. We can probably figure it out if that’s your intention.
jonasnick commented at 7:02 am on May 6, 2024:Changed toNUM-1
.roconnor-blockstream commented at 8:14 pm on April 19, 2024: contributorDoing the sorting for the users eliminates a difficult to catch error and we’d need to sort anyways to be able to detect and alert the user that they did not sort (or sorted incorrectly).
You can test that a list is sorted in O(n) time, but sorting (by comparisons) take O(n log n) time
That said, I do agree that we should just sort ourselves, in O(n log n) worst case time, with constant memory overhead.
jonasnick force-pushed on Apr 22, 2024jonasnick commented at 3:23 pm on April 22, 2024: contributorRebased on master and added change log entry.in src/hsort_impl.h:76 in b764ca84f9 outdated
71+ * Else if [child1(i)] > [i], swap [i] with [child1(i)]. 72+ * Else if [child2(i)] > [i], swap [i] with [child2(i)]. 73+ */ 74+ if (secp256k1_heap_child2(i) < heap_size 75+ && 0 <= cmp(arr + secp256k1_heap_child2(i)*stride, arr + secp256k1_heap_child1(i)*stride, cmp_data)) { 76+ if (0 < cmp(arr + secp256k1_heap_child2(i)*stride, arr + i*stride, cmp_data)) {
sipa commented at 3:44 pm on April 24, 2024:Strange indentation.
jonasnick commented at 8:24 pm on April 25, 2024:fixedin src/hsort_impl.h:51 in b764ca84f9 outdated
40+ len -= 64; 41+ } 42+ secp256k1_heap_swap64(a, b, len); 43+} 44+ 45+static SECP256K1_INLINE void secp256k1_heap_down(unsigned char *arr, size_t i, size_t heap_size, size_t stride,
sipa commented at 3:46 pm on April 24, 2024:Maybe add a one-line description, like “Given an array arr of length heap_size strides, of which all elements >i satisfy the max-heap property, update it so all elements >=i satisfy the max-heap property”?
jonasnick commented at 8:24 pm on April 25, 2024:donein src/secp256k1.c:329 in b764ca84f9 outdated
325@@ -325,6 +326,40 @@ int secp256k1_ec_pubkey_cmp(const secp256k1_context* ctx, const secp256k1_pubkey 326 return secp256k1_memcmp_var(out[0], out[1], sizeof(out[0])); 327 } 328 329+/* This struct wraps a const context pointer to satisfy the secp256k1_hsort api
sipa commented at 3:46 pm on April 24, 2024:That sounds like an unnecessary indirection to me, just to avoid casting away aconst
?
jonasnick commented at 8:25 pm on April 25, 2024:Changed it to just cast away const now.Add secp256k1_pubkey_sort
Co-authored-by: Tim Ruffing <crypto@timruffing.de> Co-authored-by: Russell O'Connor <roconnor@blockstream.io>
jonasnick force-pushed on Apr 25, 2024in src/secp256k1.c:332 in 7d2591ce12
325@@ -325,6 +326,34 @@ int secp256k1_ec_pubkey_cmp(const secp256k1_context* ctx, const secp256k1_pubkey 326 return secp256k1_memcmp_var(out[0], out[1], sizeof(out[0])); 327 } 328 329+static int secp256k1_ec_pubkey_sort_cmp(const void* pk1, const void* pk2, void *ctx) { 330+ return secp256k1_ec_pubkey_cmp((secp256k1_context *)ctx, 331+ *(secp256k1_pubkey **)pk1, 332+ *(secp256k1_pubkey **)pk2);
josibake commented at 11:37 am on May 3, 2024:In #1519, I initially created a
secp256k1_silentpayments_recipient_sort_cmp
function using(secp256k1_silentpayments_recipient *)r->scan_pubkey
instead of*(secp256k1_silentpayments_recipient **)r->scan_pubkey
. Everything compiled fine, but the end result was my array was not sorted correctly. Took me awhile to figure out that it was due to using(typedef *)
instead of*(typedef **)
.This could just be a me problem in not having a strong C background, but wanted to mention it in case its worth documenting.
jonasnick commented at 2:24 pm on May 6, 2024:Hm, this is documented. The hsort doc says “comparison function is called with two arguments that point to the objects being compared” (which was copied from the qsort_r manpage). If the array you give to hsort consists of elements of typesecp256k1_silentpayments_recipient *
, the comparison function is called with pointers to the elements (secp256k1_silentpayments_recipient **
).
real-or-random commented at 2:42 pm on May 6, 2024:I guess seeing stuff like*(secp256k1_pubkey **)pk1
can be confusing even to seasoned C programmers at first glance, due to the asterisk being used for two related but different things here (first as a dereference operator and then in the cast for indicating a pointer type). But yeah, I don’t think we can do anything to improve this.
roconnor-blockstream commented at 3:37 pm on May 6, 2024:should this be*(const secp256k1_pubkey * const *)pk1
or something?josibake approvedjosibake commented at 11:40 am on May 3, 2024: memberACK https://github.com/bitcoin-core/secp256k1/pull/1518/commits/7d2591ce12d8a9b85f210cf9d678e91cee125ee9
Don’t have much to offer on the
hsort
side, but did review thesecp256k1
specific parts and have been using this commit in #1519 without any issues.sipa commented at 3:35 pm on May 3, 2024: contributorACK 7d2591ce12d8a9b85f210cf9d678e91cee125ee9in src/secp256k1.c:352 in 7d2591ce12
347+ * argument. */ 348+ secp256k1_hsort(pubkeys, n_pubkeys, sizeof(*pubkeys), secp256k1_ec_pubkey_sort_cmp, (void *)ctx); 349+ 350+ #if defined(_MSC_VER) && (_MSC_VER < 1933) 351+ #pragma warning(pop) 352+ #endif
real-or-random commented at 3:00 pm on May 6, 2024:I verified that the pragmas are still necessary now that the call to secp256k1_hsort was changed. See https://godbolt.org/z/13MY3acP1 .real-or-random approvedreal-or-random commented at 3:00 pm on May 6, 2024: contributorACK 7d2591ce12d8a9b85f210cf9d678e91cee125ee9sipa merged this on May 6, 2024sipa closed this on May 6, 2024
datis7756 approved
github-metadata-mirror
This is a metadata mirror of the GitHub repository bitcoin-core/secp256k1. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-03 17:15 UTC
This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me