Enable context creation in preallocated memory

real-or-random commented at 3:21 pm on October 22, 2018: contributor

This builds on #557.

Manually managing memory is always a pain in the ass in some way. I tried to keep the pain manageable. I’m open to suggestions to make this less ugly or error-prone.

to do:

tests
export functions

real-or-random force-pushed on Oct 22, 2018

in src/util.h:101 in a22ff928d3 outdated

 96+#define ROUND_TO_ALIGN(size) (((size + ALIGNMENT - 1) / ALIGNMENT) * ALIGNMENT)
 97+
 98+static SECP256K1_INLINE void *manual_alloc(void** prealloc_ptr, size_t alloc_size, void* base, size_t max_size) {
 99+    size_t aligned_alloc_size = ROUND_TO_ALIGN(alloc_size);
100+    void* ret = *prealloc_ptr;
101+    CHECK((char*)*prealloc_ptr != NULL);

real-or-random commented at 4:14 pm on October 22, 2018:

Todo: This check is not enough. The user could give us a NULL pointer, and then we want to call the callback and not just abort.

apoelstra commented at 5:48 pm on October 22, 2018:

I think it’s fine. We call this only internally, so actually even a VERIFY_CHECK would be sufficient.

In the public-API-facing function where this pointer originates we should use an ARG_CHECK.

in src/secp256k1.c:68 in e7c5e2640c outdated

63@@ -64,6 +64,25 @@ static secp256k1_context secp256k1_context_no_precomp_ = {
64 };
65 secp256k1_context *secp256k1_context_no_precomp = &secp256k1_context_no_precomp_;
66 
67+size_t secp256k1_context_prealloc_size(unsigned int flags) {

apoelstra commented at 5:51 pm on October 22, 2018:

This needs a corresponding entry in include/secp256k1.h.

real-or-random commented at 9:49 am on October 23, 2018:

Yep, “exporting functions” is on the todo list. :)

in src/secp256k1.c:113 in f24aa9cb95 outdated

 95 
 96     if (EXPECT((flags & SECP256K1_FLAGS_TYPE_MASK) != SECP256K1_FLAGS_TYPE_CONTEXT, 0)) {
 97             secp256k1_callback_call(&ret->illegal_callback,
 98                                     "Invalid flags");
 99-            free(ret);
100+            /* Unreachable */

apoelstra commented at 5:52 pm on October 22, 2018:

This isn’t unreachable, is it? If the user provides bad flags it will trigger.

real-or-random commented at 9:54 am on October 23, 2018:

We call default_illegal_callback here, which aborts, so the return is unreachable.

(By the way I think the diff is somewhat misleading. I’m not adding /* Unreachable */ because I’m removing the free; the two changes are unrelated.)

apoelstra commented at 3:09 pm on October 23, 2018:

Oh, yeah, you’re right.

in src/secp256k1.c:99 in f24aa9cb95 outdated

82@@ -83,29 +83,43 @@ size_t secp256k1_context_prealloc_size(unsigned int flags) {
83     return ret;
84 }
85 
86-secp256k1_context* secp256k1_context_create(unsigned int flags) {
87-    secp256k1_context* ret = (secp256k1_context*)checked_malloc(&default_error_callback, sizeof(secp256k1_context));
88+secp256k1_context* secp256k1_context_prealloc_create(void* prealloc, unsigned int flags) {

apoelstra commented at 5:53 pm on October 22, 2018:

Needs a corresponding entry in include/secp256k1.h, and we should EXPECT that prealloc is non-NULL.

apoelstra commented at 6:22 pm on October 22, 2018: contributor

When I run valgrind ./tests 1 I see 2Mb leaked.

real-or-random commented at 1:07 pm on October 25, 2018: contributor

Addressed your comments including the memory leak (reason was that I didn’t really implement context cloning) and tidied up a little.

Open things:

This removes the ability to clone ecmult_context and ecmult_gen_context individually without cloning an entire secp256k1_context in order to keep the code simple. Cloning the full secp256k1_context is still supported. We don’t clone these “subcontexts” internally, and I don’t see a reason why we would need this. Thoughts?
Do we need a secp256k1_context_prealloc_clone() to clone into preallocated memory?
One disadvantage of the approach that I implemented is that the user may screw it up by calling secp256k1_context_prealloc_size(flags) with different flags than secp256k1_context_prealloc_create(..., flags). But I think this is acceptable.
todo: Exporting functions in public header (I will do this at the end once we’re clear about the other points)
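The flags mismatch described above can be illustrated with a small self-contained sketch. Everything in it is hypothetical: the `example_` names, the flag values and the byte counts are invented for illustration, and the extra `prealloc_size` parameter is a caller-side safeguard that the functions in this PR do not take.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flags and sizes, for illustration only. */
#define EXAMPLE_FLAG_VERIFY (1u << 0)
#define EXAMPLE_FLAG_SIGN   (1u << 1)

typedef struct {
    unsigned int flags;
} example_context;

/* Storage a caller might hand over; the union keeps it aligned for the struct. */
static union {
    example_context ctx;
    unsigned char bytes[4096];
} example_storage;

/* The required size depends on the flags, so size and create must agree. */
static size_t example_preallocated_size(unsigned int flags) {
    size_t ret = sizeof(example_context);
    if (flags & EXAMPLE_FLAG_VERIFY) ret += 1024;
    if (flags & EXAMPLE_FLAG_SIGN) ret += 512;
    return ret;
}

/* Hypothetical checked variant: refuses a buffer that is too small for the
 * given flags instead of writing past its end. */
static example_context *example_create_checked(void *prealloc, size_t prealloc_size, unsigned int flags) {
    example_context *ctx;
    if (prealloc == NULL || prealloc_size < example_preallocated_size(flags)) {
        return NULL;
    }
    ctx = (example_context *)prealloc;
    ctx->flags = flags;
    return ctx;
}
```

Without the size parameter, a buffer sized for one set of flags and used with a larger set would simply be overrun, which is the failure mode discussed in this thread.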

real-or-random commented at 1:26 pm on October 25, 2018: contributor

Haha, I switched to testing in clang because it emitted a warning that gcc didn’t emit and now the gcc build fails. Investigating…

apoelstra commented at 1:31 pm on October 25, 2018: contributor

No, it’s fine if we lose the ability to clone individual parts of a context object. We only had that separation for code cleanliness.
I’d like to have a secp256k1_context_prealloc_clone but I don’t know if it’d be useful. Just in case.
Yeah, if the user is inconsistent with flags they’ll get UB. Agreed that this is fine, that’s a pretty silly thing to do.

real-or-random force-pushed on Oct 25, 2018

in src/ecmult_gen_impl.h:111 in c43d6d247c outdated

115+static void secp256k1_ecmult_gen_context_finalize_memcpy(secp256k1_ecmult_gen_context *dst, const secp256k1_ecmult_gen_context *src) {
116 #ifndef USE_ECMULT_STATIC_PRECOMPUTATION
117-        dst->prec = (secp256k1_ge_storage (*)[64][16])checked_malloc(cb, sizeof(*dst->prec));
118-        memcpy(dst->prec, src->prec, sizeof(*dst->prec));
119+    if (src->prec != NULL) {
120+        /* We cast to void* first to suppress a -Wcast-align warning in clang. */

apoelstra commented at 2:06 pm on October 25, 2018:

Everything about this function is terrifying :) But I think it’s correct.

real-or-random commented at 2:33 pm on October 25, 2018:

I’m learning a lot about C…

real-or-random commented at 12:33 pm on October 26, 2018:

No but seriously, despite the terrifying casts, I believe this memcpy cloning style is nicer because it’s much less code and thus much less room for mistakes. Open to suggestions of course.
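For readers following along, the memcpy cloning style can be sketched roughly as follows: copy the whole blob in one go, then re-base any pointer that still refers to the source blob. The struct layout and all `example_` names are assumptions made for this sketch, not the actual code in ecmult_gen_impl.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A context whose pointer refers to storage inside the same blob. */
typedef struct {
    int *table;
} example_ctx;

typedef struct {
    example_ctx ctx;
    int storage[4];
} example_blob;

static void example_blob_init(example_blob *blob) {
    int i;
    blob->ctx.table = blob->storage;
    for (i = 0; i < 4; i++) {
        blob->storage[i] = i;
    }
}

/* Clone by copying all bytes, then fix up the internal pointer so that it
 * points into dst rather than into src. */
static void example_blob_clone(example_blob *dst, const example_blob *src) {
    size_t offset = (size_t)((const unsigned char *)src->ctx.table - (const unsigned char *)src);
    memcpy(dst, src, sizeof(*dst));
    /* Casting through void* first avoids a -Wcast-align warning in clang. */
    dst->ctx.table = (int *)(void *)((unsigned char *)dst + offset);
}

/* Returns 1 if the clone is independent of the original. */
static int example_clone_demo(void) {
    example_blob a, b;
    example_blob_init(&a);
    example_blob_clone(&b, &a);
    b.ctx.table[0] = 99;
    return b.ctx.table == b.storage && a.storage[0] == 0
        && b.storage[0] == 99 && b.ctx.table[3] == 3;
}
```

The fix-up step is the part that is easy to forget: without it, the clone would keep reading and writing the original's table.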

apoelstra commented at 2:06 pm on October 25, 2018: contributor

ACK. Can you squash?

real-or-random force-pushed on Oct 26, 2018

real-or-random commented at 12:30 pm on October 26, 2018: contributor

squashed
implemented cloning into prealloc memory
added tests
exported API functions

I can drop the last commit which creates a separate header file. I think it’s useful because the average user does not need the new function but the new header file somehow violates the idea that additional headers are modules.

We could make life easier for the user if we store in the context whether we or the user allocated the memory. (And then we need only one destroy function and the user cannot screw it up.)

real-or-random renamed this:
~~WIP: Enable context creation with preallocated memory blob~~
Enable context creation in preallocated memory
on Oct 26, 2018

real-or-random force-pushed on Oct 26, 2018

gmaxwell commented at 0:18 am on October 27, 2018: contributor

The interface probably needs to document the required alignment of caller memory (at least say that it must be aligned similarly to the result of malloc, up to the largest primitive type). Unless we want to handle aligning the struct ourselves.

It would be useful to test this stuff on something that traps on unaligned reads (not x86, which just runs slower but traps only in freaky cases like unaligned SIMD loads using aligned instructions but only when they cross a page boundary)…

I might also suggest calling the function something like prealloced not prealloc, functions are usually read as verbs that describe what they do. So I would assume that prealloc performs preallocation.

in include/secp256k1_prealloc.h:58 in 864da3a29e outdated

53+ *
54+ *  Returns: a newly created context object.
55+ *  Args:    ctx:      an existing context to copy (cannot be NULL)
56+ *  In:      prealloc: a pointer to a rewritable contiguous block of memory of
57+ *                     size at least secp256k1_context_prealloc_size(flags)
58+ *                     bytes, suitably aligned to hold an object of any type

real-or-random commented at 2:42 pm on October 28, 2018:

@gmaxwell I wrote this as documentation. Suggestions? “any primitive type” may be better than “any type”.
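To make the alignment requirement a bit more concrete, here is a minimal sketch of how a caller could sanity-check a buffer before passing it in. The helper name and the value 16 are assumptions for illustration; the library determines its own ALIGNMENT in util.h, and the check relies on the pointer-to-integer conversion behaving as it does on common flat-memory platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical alignment requirement, for illustration only. */
#define EXAMPLE_ALIGNMENT 16

/* A statically allocated buffer with an explicit alignment (C11). */
static _Alignas(EXAMPLE_ALIGNMENT) unsigned char example_buf[64];

/* Returns 1 if the address is a multiple of EXAMPLE_ALIGNMENT, 0 otherwise.
 * The conversion to uintptr_t is implementation-defined, but on common
 * platforms it yields the numeric address. */
static int example_is_suitably_aligned(const void *ptr) {
    return ((uintptr_t)ptr % EXAMPLE_ALIGNMENT) == 0;
}
```

A buffer obtained from malloc() satisfies the requirement as documented above; a pointer into the middle of a byte array, such as `example_buf + 1`, generally does not.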

real-or-random commented at 3:12 pm on October 28, 2018: contributor

It would be useful to test this stuff on something that traps on unaligned reads (not x86, which just runs slower but traps only in freaky cases like unaligned SIMD loads using aligned instructions but only when they cross a page boundary)…

Indeed. I haven’t tested it but this looks promising for x86 and might do the job: https://stackoverflow.com/a/17748435/2725281

I might also suggest calling the function something like prealloced not prealloc, functions are usually read as verbs that describe what they do. So I would assume that prealloc performs preallocation.

Good idea.

real-or-random commented at 7:01 pm on October 29, 2018: contributor

Indeed. I haven’t tested it but this looks promising for x86 and might do the job: https://stackoverflow.com/a/17748435/2725281

Well no. It does what it promises, i.e., throwing SIGBUS at you when you access unaligned memory, but there are too many false positives. If anyone is interested: You need to link statically (otherwise you get a SIGBUS in the dynamic linker), better disable openssl tests and don’t use gmp (otherwise you get a SIGBUS in gmp). But even then, gcc will leave you with too many unaligned accesses in the binary.

apoelstra commented at 9:53 pm on October 29, 2018: contributor

Can you try adding -mno-unaligned-access to the CFLAGS?

real-or-random commented at 11:13 am on October 30, 2018: contributor

That flag is not available for x86. I did some digging and I don’t think there’s a way to force gcc or clang to perform only aligned accesses. Unfortunately, qemu does not help either; it just won’t trap.

real-or-random force-pushed on Nov 1, 2018

real-or-random commented at 4:55 pm on November 1, 2018: contributor

Renamed the functions and rebased on the changes in #557

in src/secp256k1.c:86 in 9fce155e9c outdated

82@@ -83,6 +83,17 @@ size_t secp256k1_context_prealloc_size(unsigned int flags) {
83     return ret;
84 }
85 
86+size_t secp256k1_context_prealloc_size_for_clone(const secp256k1_context* ctx) {

apoelstra commented at 10:04 pm on November 4, 2018:

Should be marked static so it’s clear this isn’t exposed to the user.

real-or-random commented at 11:27 am on November 5, 2018:

Whoops, this is supposed to be exported, but with the name secp256k1_context_preallocated_clone_size … (if the user clones into preallocated memory, he needs to know how much memory is necessary). Fixing.

apoelstra commented at 10:08 pm on November 4, 2018: contributor

ACK except static nit.

I don’t feel very strongly about it, but I’d prefer we drop the commit that uses a separate header file.

real-or-random commented at 3:16 pm on November 5, 2018: contributor

fixed… (Haven’t touched the separate header so far, I can still do that.)

real-or-random force-pushed on Nov 27, 2018

real-or-random commented at 3:57 pm on November 27, 2018: contributor

rebased on master (no code changes besides merging)
squashed fixups (no code changes)
squashed the “rename from _prealloc to _preallocated” commit into the individual commits (no code changes)

in include/secp256k1_preallocated.h:9 in d63e7ab2d4 outdated

0@@ -0,0 +1,89 @@
1+#ifndef SECP256K1_preallocated_H
2+#define SECP256K1_preallocated_H
3+
4+#include "secp256k1.h"
5+
6+#ifdef __cplusplus
7+extern "C" {
8+#endif
9+

jonasnick commented at 9:39 pm on February 7, 2019:

Would be nice to have a 1 or 2 sentence explanation for what this module does.

in src/util.h:114 in 1a7b6d9b33 outdated

93+#define ALIGNMENT 16
94+#endif
95+
96+#define ROUND_TO_ALIGN(size) (((size + ALIGNMENT - 1) / ALIGNMENT) * ALIGNMENT)
97+
98+static SECP256K1_INLINE void *manual_alloc(void** prealloc_ptr, size_t alloc_size, void* base, size_t max_size) {

sipa commented at 10:26 pm on February 8, 2019:

Some comment about the function of this function would be useful.
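As a rough illustration of what a helper with this signature does, here is a sketch of a bump allocator over a caller-provided block: it returns the current cursor and advances it by the requested size rounded up to the alignment. Only the signature and the rounding macro are taken from the diff above; the body, the bounds checks and the `example_` names are assumptions, and `assert` stands in for the library's internal check macros.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_ALIGNMENT 16
#define EXAMPLE_ROUND_TO_ALIGN(size) ((((size) + EXAMPLE_ALIGNMENT - 1) / EXAMPLE_ALIGNMENT) * EXAMPLE_ALIGNMENT)

/* Hands out the next aligned chunk of a preallocated block.
 * prealloc_ptr: in/out cursor into the block
 * alloc_size:   number of bytes requested
 * base:         start of the block
 * max_size:     total size of the block in bytes */
static void *example_manual_alloc(void **prealloc_ptr, size_t alloc_size, void *base, size_t max_size) {
    size_t aligned_alloc_size = EXAMPLE_ROUND_TO_ALIGN(alloc_size);
    void *ret;
    assert(prealloc_ptr != NULL);
    assert(*prealloc_ptr != NULL);
    assert(base != NULL);
    /* The cursor must lie inside the block and the chunk must fit. */
    assert((unsigned char *)*prealloc_ptr >= (unsigned char *)base);
    assert((size_t)((unsigned char *)*prealloc_ptr - (unsigned char *)base) + aligned_alloc_size <= max_size);
    ret = *prealloc_ptr;
    *prealloc_ptr = (unsigned char *)*prealloc_ptr + aligned_alloc_size;
    return ret;
}

/* Two allocations of 5 and 20 bytes consume 16 + 32 bytes of the block. */
static size_t example_manual_alloc_demo(void) {
    static unsigned char block[64];
    void *cursor = block;
    void *first = example_manual_alloc(&cursor, 5, block, sizeof(block));
    void *second = example_manual_alloc(&cursor, 20, block, sizeof(block));
    assert(first == (void *)block);
    assert(second == (void *)(block + 16));
    return (size_t)((unsigned char *)cursor - block);
}
```

Because every chunk size is rounded up, each returned pointer stays aligned as long as the block itself starts at an aligned address.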

in src/ecmult_gen_impl.h:19 in a34fb1a2bc outdated

14 #include "hash_impl.h"
15 #ifdef USE_ECMULT_STATIC_PRECOMPUTATION
16 #include "ecmult_static_context.h"
17 #endif
18+
19+static size_t secp256k1_ecmult_gen_context_preallocated_size(void) {

sipa commented at 10:30 pm on February 8, 2019:

I think this can be a constant rather than a function (the sizeof expression is always known at compile time, even if the value of its argument isn’t).

https://en.cppreference.com/w/c/language/sizeof says “Except if the type of expression is a VLA, expression is not evaluated and the sizeof operator may be used in an integer constant expression.”.
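A small sketch of the point about sizeof: the operand is not evaluated, so the expression below never dereferences the null pointer and can be used to initialize a constant or to size a compile-time check. The struct definitions are simplified stand-ins with hypothetical names, not the library's real types.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the library's types. */
typedef struct {
    unsigned char data[64];
} example_ge_storage;

typedef struct {
    example_ge_storage (*prec)[64][16];
} example_gen_context;

/* sizeof only inspects the type of its operand, so no memory is accessed. */
static const size_t EXAMPLE_GEN_CONTEXT_PREALLOCATED_SIZE =
    sizeof(*((example_gen_context *)NULL)->prec);

/* The same expression is an integer constant expression, so it can be used
 * in a C89-style compile-time check (negative array size on failure). */
typedef char example_size_check[
    (sizeof(*((example_gen_context *)NULL)->prec) == 64 * 16 * sizeof(example_ge_storage)) ? 1 : -1];
```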

in src/ecmult_impl.h:297 in a34fb1a2bc outdated

293@@ -293,6 +294,14 @@ static void secp256k1_ecmult_odd_multiples_table_storage_var(const int n, secp25
294     } \
295 } while(0)
296 
297+static size_t secp256k1_ecmult_context_preallocated_size(void) {

sipa commented at 10:33 pm on February 8, 2019:

Same here.

sipa commented at 11:46 pm on February 8, 2019: contributor

Concept ACK

real-or-random commented at 3:31 pm on February 20, 2019: contributor

addressed @sipa’s and @jonasnick’s comments

real-or-random force-pushed on Mar 2, 2019

real-or-random commented at 2:15 pm on March 2, 2019: contributor

I force-pushed (only changed the fixups). Changes:

Replaced CHECKs by VERIFY_CHECKS in manual_alloc() because we use CHECKs only in tests AFAIU
Removed “Unreachable” comments… They’re kind of inaccurate. This really depends on the behavior of the callbacks.
Added a note to the docs that malloc() is at most called once in the normal create and clone functions (as suggested by @gmaxwell)

cc @apoelstra

in include/secp256k1_preallocated.h:10 in a890e862fe outdated

 7 
 8 #ifdef __cplusplus
 9 extern "C" {
10 #endif
11 
12+/* The module provided by this header file is intended for settings, in which it

apoelstra commented at 7:33 pm on March 3, 2019:

Should remove the ,

apoelstra commented at 7:37 pm on March 3, 2019: contributor

utACK nitfixes

gmaxwell commented at 4:09 am on March 4, 2019: contributor

because we use CHECKs only in tests AFAIU

This concerned me some but so far I don’t see a case where it’s obviously wrong yet (haven’t reviewed the whole thing yet).

But I think it might be useful to comment on how runtime failure handling works. I think our design is this: The program can find itself in an impossible state (IS) as a result of memory corruption/hardware fault, miscompilation, operating system failure, or a serious programming error in the library itself or the caller (e.g. malloc fails, a non-nullable argument is null, a loop is out of range etc.).

A few kinds of IS get detected at runtime, particularly ones which get detected as a product of tests at the API boundary which wouldn’t make any sense to make VERIFY_CHECKS (for multiple reasons including that those are only used in our own tests and couldn’t detect bad interactions with a caller).

It’s also generally considered impolite for libraries to be abort() prone: Many libraries haven’t given this stuff much thought and have sometimes called abort() for normal error handling rather than actual corruption. You could imagine a signature verifier that aborted if the DER input didn’t deserialize, instead of just returning false… and imagine how cross an application author would be to discover that behaviour in the field. Even when abort() usage is intended to be limited to impossible states, sometimes the library authors make a mistake and a state is both possible and otherwise harmless… this, and performance, is why we prefer to do most of our impossible state detection via VERIFY_CHECKs that run only in our tests where mistakes in their logic won’t introduce real bugs. So we generally only want to runtime detect IS in cases that have the best tradeoffs: Cases that are really unambiguously wrong e.g. a null pointer we’re about to dereference vs some complicated algebraic condition and cases that arise out of interactions with the caller since our tests can’t discover errors in the caller or in caller-library interactions are both candidates for runtime detection.

When an IS is detected we certainly don’t want to just continue on like nothing bad happened. But we can’t just return errors: (1) The IS may be detected in a location where it is not sensible to return an error (after all, the error is “impossible” so bending the design to accommodate returning might be a bad change), (2) the IS may happen inside an external interface which “cannot fail”. (3) Returning on an IS could be confused for a normal false condition like an invalid signature (so IS happens and a CHECKSIG NOT passes when it shouldn’t), (4) because IS is “impossible” applications would never be tested for handling them safely in any case. (5) If an IS happened it may be the case that the entire process is dangerously corrupted and couldn’t be trusted to do much error handling anyway.

So that’s all well and good and suggests that abort() is the only reasonable way to handle an IS when detected. However, even abort() may not be the safest thing for any particular application to do. For some device abort() might infinite loop and what we need to do is trigger a hardware reset. Aborting on a failure might, in some case, just end up bricking the device– perhaps the IS failure should cause the device to restart with a null config. It might amplify a benign bug in an optional component into a massive DOS, etc. So we offer a callback to allow custom handling of these states. The reason that we don’t use CHECK in the library code is because we instead use error callbacks to allow that customization.

But what happens if the IS prevents us from getting access to the callback? E.g. what if our context is uninitialized or the callback pointer is null? That is a case where directly calling a default error handler is justified: silently proceeding in a known corrupted state is still something we don’t want to do. But we’d want to minimize the number of locations that do that and document them, so that specialized applications that really want to be unconditionally abort() free, can go and make it happen and handle the cases that can’t be addressed via a callback in whatever way makes sense for them. (E.g. by replacing the default handler with one suitable for their device).

Right now we have some cases where we can’t use the callback and where we use the default handler, and a bunch of other places where we are essentially relying on a null pointer dereference to work well enough (e.g. a dereference of ctx) for the purpose of stopping execution. Probably more of these places should be turned into explicitly calling the default handler because on some systems (esp embedded ones) a null pointer dereference will not crash.

CHECK is used in the tests because we can do whatever we want in the tests… and aborting on test failure is fine and generally interacts well with dynamic instrumentation tools like valgrind/afl/etc.
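The callback design described above could be sketched like this. The struct is loosely modeled on a function pointer plus a data pointer; all `example_` names are hypothetical, and the fallback to a default handler when no callback is reachable is written out explicitly as an assumption about how such a case might be handled.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    void (*fn)(const char *text, void *data);
    void *data;
} example_callback;

/* Default handler: report and abort, since continuing in a corrupted state
 * is not acceptable. */
static void example_default_error_fn(const char *text, void *data) {
    (void)data;
    fprintf(stderr, "[example] internal consistency check failed: %s\n", text);
    abort();
}

/* If the callback itself is unusable, fall back to the default handler. */
static void example_callback_call(const example_callback *cb, const char *text) {
    if (cb == NULL || cb->fn == NULL) {
        example_default_error_fn(text, NULL);
        return;
    }
    cb->fn(text, cb->data);
}

/* Custom handler for an application that must not abort, e.g. one that
 * records the event and later triggers a device reset. */
static void example_counting_fn(const char *text, void *data) {
    (void)text;
    ++*(int *)data;
}

/* An API-boundary check: a null argument is reported through the callback. */
static int example_checked_deref(const example_callback *cb, const int *p) {
    if (p == NULL) {
        example_callback_call(cb, "p != NULL");
        return 0;
    }
    return *p;
}

/* One failed check (counted once) and one successful read of the value 7. */
static int example_callback_demo(void) {
    int hits = 0;
    int value = 7;
    example_callback cb;
    cb.fn = example_counting_fn;
    cb.data = &hits;
    (void)example_checked_deref(&cb, NULL);
    return hits * 10 + example_checked_deref(&cb, &value);
}
```

Note that the function still has to return something sensible after the callback returns, because a custom handler is not required to abort.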

real-or-random cross-referenced this on Mar 4, 2019 from issue Allow to use external default callbacks by real-or-random

real-or-random force-pushed on Mar 4, 2019

real-or-random commented at 4:00 pm on March 4, 2019: contributor

fixed (and improved the comment about a single malloc too).

I can squash if someone else wants to have a closer look.

real-or-random force-pushed on Mar 4, 2019

real-or-random commented at 5:20 pm on March 4, 2019: contributor

Okay, I changed my mind and squashed.

real-or-random cross-referenced this on Mar 4, 2019 from issue Enable context creation in preallocated memory by real-or-random

in src/ecmult_gen_impl.h:20 in 6cc6637185 outdated

15 #ifdef USE_ECMULT_STATIC_PRECOMPUTATION
16 #include "ecmult_static_context.h"
17 #endif
18+
19+#ifndef USE_ECMULT_STATIC_PRECOMPUTATION
20+    static const size_t secp256k1_ecmult_gen_context_preallocated_size = ROUND_TO_ALIGN(sizeof(*((secp256k1_ecmult_gen_context*) NULL)->prec));

sipa commented at 10:46 pm on March 4, 2019:

In “Add size functions for preallocated memory”:

Nit: use uppercase for constants.

sipa commented at 11:34 pm on March 4, 2019: contributor

utACK, I really like the approach used to represent context objects now as a single allocated blob.

real-or-random force-pushed on Mar 5, 2019

real-or-random commented at 11:43 am on March 5, 2019: contributor

Addressed @sipa’s comment

real-or-random closed this on Mar 5, 2019

real-or-random reopened this on Mar 5, 2019

real-or-random commented at 12:31 pm on March 5, 2019: contributor

(Closed and opened to trigger travis rebuild)

real-or-random closed this on Mar 5, 2019

real-or-random reopened this on Mar 5, 2019

real-or-random cross-referenced this on Mar 6, 2019 from issue Make WINDOW_G configurable by real-or-random

elichai commented at 9:09 pm on March 11, 2019: contributor

This enables using the library without it calling any mallocs, right? If so then this would be awesome for no-std environments :)

real-or-random commented at 9:21 pm on March 11, 2019: contributor

This enables using the library without it calling any mallocs, right? If so then this would be awesome for no-std environments :)

Correct.

elichai cross-referenced this on Mar 11, 2019 from issue Add no-std support by elichai

sipa commented at 11:02 pm on March 12, 2019: contributor

@gmaxwell Anything that’s left to address here?

gmaxwell commented at 2:50 am on March 15, 2019: contributor

ACK.

The documentation for this however needs some guidance on the lifetime management of the caller provided memory (it could be done in another PR but it must be done).

It should make clear that the caller is obligated to make sure the buffer it provides lives at least as long as the ctx object and while the context exists the buffer is the exclusive property of that context (e.g. you cannot move it, you cannot create two contexts using the same buffer)… and that after destroying the context it’s up to the caller to free the buffer (if required).

[Basically don’t assume that the user has a feel for how memory management works… everything that I just described is obvious if you have any idea of how it would be implemented, but we can’t assume that the user has given it that much thought… or they might be used to programming languages that manage lifetimes for them, or mistake the buffer for runtime scratch space that can be arbitrarily rewritten between calls]

We might also want to include a note that it’s easier and less failure prone to use the normal functions in environments where dynamic allocation isn’t a problem.
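The caller's obligations listed above can be summarized in a toy lifecycle. The `toy_` functions are hypothetical stand-ins and not the library's API; the sketch only shows the ordering: one buffer per context, the buffer outlives the context, destroy does not free, and the caller releases the buffer afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    int live;
} toy_context;

static size_t toy_preallocated_size(void) {
    return sizeof(toy_context);
}

/* Places the context in caller-owned memory; nothing is allocated here. */
static toy_context *toy_preallocated_create(void *prealloc) {
    toy_context *ctx = (toy_context *)prealloc;
    ctx->live = 1;
    return ctx;
}

/* Invalidates the context but does NOT free the underlying buffer. */
static void toy_preallocated_destroy(toy_context *ctx) {
    ctx->live = 0;
}

/* Returns 1 if two contexts in two separate buffers coexist correctly. */
static int toy_lifetime_demo(void) {
    void *buf_a = malloc(toy_preallocated_size());
    void *buf_b = malloc(toy_preallocated_size());
    toy_context *a, *b;
    int ok;
    if (buf_a == NULL || buf_b == NULL) {
        free(buf_a);
        free(buf_b);
        return 0;
    }
    /* One buffer per context: the buffers are never shared or moved. */
    a = toy_preallocated_create(buf_a);
    b = toy_preallocated_create(buf_b);
    ok = (void *)a == buf_a && (void *)b == buf_b && a != b && a->live && b->live;
    /* Destroy first, then release the memory, which the caller still owns. */
    toy_preallocated_destroy(a);
    toy_preallocated_destroy(b);
    free(buf_a);
    free(buf_b);
    return ok;
}
```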

real-or-random cross-referenced this on Mar 15, 2019 from issue scratch space: use single allocation by apoelstra

real-or-random commented at 9:29 pm on March 29, 2019: contributor

@gmaxwell I added a commit to explain all of this. Can you re-ACK if it’s okay?

real-or-random cross-referenced this on Apr 1, 2019 from issue Changes necessary for usage on Trezor by real-or-random

elichai commented at 8:20 am on May 13, 2019: contributor

Any updates here? both this and #595 are what’s missing to use this library in a no-std environment

gmaxwell cross-referenced this on May 13, 2019 from issue Use a static constant table for small ecmult WINDOW_G sizes. by gmaxwell

real-or-random commented at 7:22 pm on May 24, 2019: contributor

I pushed a fixup because more warnings appeared with gcc and ARM as target. I can squash once you have looked at the diff / whenever you want.

gmaxwell commented at 9:43 am on May 25, 2019: contributor

ACK the fixups ack the text.

Prepare for manual memory management in preallocated memory

 * Determine ALIGNMENT more cleverly and move it to util.h
 * Implement manual_malloc() helper function

1bf7c056ba

Add size constants for preallocated memory ef020de16f

Switch to a single malloc call c4fd5dab45

Support cloning a context into preallocated memory 5feadde462

Check arguments of _preallocated functions ba12dd08da

Add tests for contexts in preallocated memory 814cc78d71

Export _preallocated functions 695feb6fbd

Move _preallocated functions to separate header 238305fdbb

Explain caller's obligations for preallocated memory 0522caac8f

real-or-random force-pushed on May 25, 2019

real-or-random commented at 12:12 pm on May 25, 2019: contributor

squashed and rebased

gmaxwell merged this on May 25, 2019

gmaxwell closed this on May 25, 2019

gmaxwell referenced this in commit a484e0008b on May 25, 2019

gmaxwell referenced this in commit 143dc6e9ee on May 27, 2019

elichai cross-referenced this on May 28, 2019 from issue Updating secp256k1 and supporting full no-std features by elichai

elichai cross-referenced this on May 28, 2019 from issue Implementing pre allocation context creation by elichai

laanwj cross-referenced this on Jun 10, 2019 from issue missing symbols for `no_std` target by laanwj

real-or-random cross-referenced this on Jun 17, 2019 from issue Low-footprint mode by gmaxwell

real-or-random cross-referenced this on Jul 2, 2019 from issue Null checks before dereferencing a pointer by elichai

sipa cross-referenced this on Jun 9, 2020 from issue Update libsecp256k1 subtree by sipa

fanquake referenced this in commit 8c97780db8 on Jun 13, 2020

sidhujag referenced this in commit 8a3a072968 on Jun 13, 2020

ComputerCraftr referenced this in commit b98f1c6e6c on Jun 16, 2020

UdjinM6 referenced this in commit 9d36ba6570 on Aug 10, 2021

5tefan referenced this in commit 8ded2caa74 on Aug 12, 2021

gades referenced this in commit d855cc511d on May 8, 2022

apoelstra cross-referenced this on Dec 8, 2022 from issue Fully describe safety requirements by tcharding

Enable context creation in preallocated memory #566