This builds on #557.
Manually managing memory is always a pain in the ass in some way. I tried to keep the pain manageable. I’m open to suggestions to make this less ugly or error-prone.
to do:
- tests
- export functions
This builds on #557.
Manually managing memory is always a pain in the ass in some way. I tried to keep the pain manageable. I’m open to suggestions to make this less ugly or error-prone.
to do:
96+#define ROUND_TO_ALIGN(size) (((size + ALIGNMENT - 1) / ALIGNMENT) * ALIGNMENT)
97+
98+static SECP256K1_INLINE void *manual_alloc(void** prealloc_ptr, size_t alloc_size, void* base, size_t max_size) {
99+ size_t aligned_alloc_size = ROUND_TO_ALIGN(alloc_size);
100+ void* ret = *prealloc_ptr;
101+ CHECK((char*)*prealloc_ptr != NULL);
I think it’s fine. We call this only internally, so actually even a VERIFY_CHECK
would be sufficient.
In the public-API-facing function where this pointer originates we should use an ARG_CHECK
.
63@@ -64,6 +64,25 @@ static secp256k1_context secp256k1_context_no_precomp_ = {
64 };
65 secp256k1_context *secp256k1_context_no_precomp = &secp256k1_context_no_precomp_;
66
67+size_t secp256k1_context_prealloc_size(unsigned int flags) {
include/secp256k1.h
.
95
96 if (EXPECT((flags & SECP256K1_FLAGS_TYPE_MASK) != SECP256K1_FLAGS_TYPE_CONTEXT, 0)) {
97 secp256k1_callback_call(&ret->illegal_callback,
98 "Invalid flags");
99- free(ret);
100+ /* Unreachable */
We call default_illegal_callback
here, which aborts, so the return is unreachable.
(By the way I think the diff is somewhat misleading. I’m not adding /* Unreachable */
because I’m removing the free
; the two changes are unrelated.)
82@@ -83,29 +83,43 @@ size_t secp256k1_context_prealloc_size(unsigned int flags) {
83 return ret;
84 }
85
86-secp256k1_context* secp256k1_context_create(unsigned int flags) {
87- secp256k1_context* ret = (secp256k1_context*)checked_malloc(&default_error_callback, sizeof(secp256k1_context));
88+secp256k1_context* secp256k1_context_prealloc_create(void* prealloc, unsigned int flags) {
include/secp256k1.h
, and we should EXPECT
that prealloc
is non-NULL.
valgrind ./tests 1
I see 2Mb leaked.
Addressed your comments including the memory leak (reason was that I didn’t really implement context cloning) and tidied up a little.
Open things:
ecmult_context
and ecmult_gen_context
individually without cloning an entire secp256k1_context
in order to keep the code simple. Cloning the full secp256k1_context
is still supported. We don’t clone these “subcontexts” internally, and I don’t see a reason why we would need this. Thoughts?secp256k1_context_prealloc_clone()
to clone into preallocated memory?secp256k1_context_prealloc_size(flags)
with different flags than secp256k1_context_prealloc_create(..., flags)
. But I think this is acceptable.secp256k1_context_prealloc_clone
but I don’t know if it’d be useful. Just in case.115+static void secp256k1_ecmult_gen_context_finalize_memcpy(secp256k1_ecmult_gen_context *dst, const secp256k1_ecmult_gen_context *src) {
116 #ifndef USE_ECMULT_STATIC_PRECOMPUTATION
117- dst->prec = (secp256k1_ge_storage (*)[64][16])checked_malloc(cb, sizeof(*dst->prec));
118- memcpy(dst->prec, src->prec, sizeof(*dst->prec));
119+ if (src->prec != NULL) {
120+ /* We cast to void* first to suppress a -Wcast-align warning in clang. */
I can drop the last commit which creates a separate header file. I think it’s useful because the average user does not need the new function but the new header file somehow violates the idea that additional headers are modules.
We could make the life easier for the user if we store in the context whether we allocated the memory or the user. (And then we need only one destroy function and the user cannot screw it up.)
The interface probably needs to document the required alignment of caller memory (at least size that it must be aligned similarly to the result of malloc, up to the largest primitive type). Unless we want to handle aligning the struct ourselves.
It would be useful to test this stuff on something that traps on unaligned reads (not x86, which just runs slower but traps only in freaky cases like unaligned SIMD loads using aligned instructions but only when they cross a page boundary)…
I might also suggest calling the function something like prealloced not prealloc, functions are usually read as verbs that describe what they do. So I would assume that prealloc performs preallocation.
53+ *
54+ * Returns: a newly created context object.
55+ * Args: ctx: an existing context to copy (cannot be NULL)
56+ * In: prealloc: a pointer to a rewritable contiguous block of memory of
57+ * size at least secp256k1_context_prealloc_size(flags)
58+ * bytes, suitably aligned to hold an object of any type
It would be useful to test this stuff on something that traps on unaligned reads (not x86, which just runs slower but traps only in freaky cases like unaligned SIMD loads using aligned instructions but only when they cross a page boundary)…
Indeed. I haven’t tested it but this looks promising for x86 and might do to job: https://stackoverflow.com/a/17748435/2725281
I might also suggest calling the function something like prealloced not prealloc, functions are usually read as verbs that describe what they do. So I would assume that prealloc performs preallocation.
Good idea.
Indeed. I haven’t tested it but this looks promising for x86 and might do to job: https://stackoverflow.com/a/17748435/2725281
Well no. It does what it promises, i.e., throwing SIGBUS at you when access unaligned memory, but there are too many false positives. If anyone is interested: You need to link statically (otherwise you get a SIGBUS in the dynamic linker), better disable openssl tests and don’t use gmp (otherwise you get a SIGBUS in gmp). But even then, gcc will leave you with too many unaligned accesses in the binary.
-mno-unaligned-access
to the CFLAGS?
82@@ -83,6 +83,17 @@ size_t secp256k1_context_prealloc_size(unsigned int flags) {
83 return ret;
84 }
85
86+size_t secp256k1_context_prealloc_size_for_clone(const secp256k1_context* ctx) {
static
so it’s clear this isn’t exposed to the user.
ACK except static
nit.
I don’t feel very strongly about it, but I’d prefer we drop the commit that uses a separate header file.
0@@ -0,0 +1,89 @@
1+#ifndef SECP256K1_preallocated_H
2+#define SECP256K1_preallocated_H
3+
4+#include "secp256k1.h"
5+
6+#ifdef __cplusplus
7+extern "C" {
8+#endif
9+
93+#define ALIGNMENT 16
94+#endif
95+
96+#define ROUND_TO_ALIGN(size) (((size + ALIGNMENT - 1) / ALIGNMENT) * ALIGNMENT)
97+
98+static SECP256K1_INLINE void *manual_alloc(void** prealloc_ptr, size_t alloc_size, void* base, size_t max_size) {
14 #include "hash_impl.h"
15 #ifdef USE_ECMULT_STATIC_PRECOMPUTATION
16 #include "ecmult_static_context.h"
17 #endif
18+
19+static size_t secp256k1_ecmult_gen_context_preallocated_size(void) {
I think this can be a constant rather than a function (the sizeof expression is always known at compile time, even if the value of its argument isn’t).
https://en.cppreference.com/w/c/language/sizeof says “Except if the type of expression is a VLA, expression is not evaluated and the sizeof operator may be used in an integer constant expression.”.
293@@ -293,6 +294,14 @@ static void secp256k1_ecmult_odd_multiples_table_storage_var(const int n, secp25
294 } \
295 } while(0)
296
297+static size_t secp256k1_ecmult_context_preallocated_size(void) {
I forced-push (only changed the fixups). Changes:
cc @apoelstra
7
8 #ifdef __cplusplus
9 extern "C" {
10 #endif
11
12+/* The module provided by this header file is intended for settings, in which it
,
because we use CHECKs only in tests AFAIU
This concerned me some but so far I don’t see a case where it’s obviously wrong yet (haven’t reviewed the whole thing yet).
But I think it might be useful to comment on runtime failure handling works. I think our design is this: The program can find itself in an impossible state (IS) as a result of memory corruption/hardware fault, miscompilation, operating system failure, or a serious programming error in the library itself or the caller (e.g. malloc fails, a non-nullable argument is null, a loop is out of range etc.).
A few kinds of IS get detected at runtime, particularly ones which get detected as a product of tests at the API boundary which wouldn’t make any sense to make VERIFY_CHECKS (for multiple reasons including that those are only used in our own tests and couldn’t detect bad interactions with a caller).
It’s also generally considered impolite for libraries to be abort() prone: Many libraries haven’t given this stuff much thought and have sometimes called abort() for normal error handling rather than actual corruption. You could imagine a signature verifier that aborted if the DER input didn’t deserialize, instead of just returning false… and imagine how cross an application author would be to discover that behaviour in the field. Even when abort() usage is intended to be limited to impossible states, sometimes the library authors make a mistake and a state is both possible and otherwise harmless… this, and performance, is why we prefer to do most of our impossible state detection via VERIFY_CHECKs that run only in our tests where mistakes in their logic won’t introduce real bugs. So we generally only want to runtime detect IS in cases that have the best tradeoffs: Cases that are really unambiguously wrong e.g. a null pointer we’re about to dereference vs some complicated algebraic condition and cases that arise out of interactions with the caller since our tests can’t discover errors in the caller or in caller-library interactions are both candidates for runtime detection.
When an IS is detected we certainly don’t want to just continue on like nothing bad happened. But we can’t just return errors: (1) The IS may be detected in a location where it is not sensible to return an error (after all, the error is “impossible” so bending the design to accommodate returning might be a bad change), (2) the IS may happen inside an external interface which “cannot fail”. (3) Returning on an IS could be confused for a normal false condition like an invalid signature (so IS happens and a CHECKSIG NOT passes when it shouldn’t), (4) because IS is “impossible” applications would never be tested for handling them safely in any case. (5) If an IS happened it may be the case that the entire process is dangerously corrupted and couldn’t be trusted to do much error handling anyway.
So that’s all well and good and suggests for abort() being the only reasonable way to handle an IS when detected. However, even abort() may not be the safest thing for any particular application to do. For some device abort() might infinite loop and what we need to do is trigger a hardware reset. Aborting on a failure might, in some case, just end up bricking the device– perhaps the IS failure should cause the device to restart with a null config. It might amplify a benign bug in an optional component into a massive DOS, etc. So we offer a callback to allow custom handling of these states. The reason that we don’t use CHECK in the library code is because we instead use error callbacks to allow that customization.
But what happens if the IS prevents us from getting access to the callback? E.g. what if our context is uninitialized or the callback pointer is null? That is a case where using directly calling a default error handler is justified: silently proceeding in a known corrupted state is still something we don’t want to do. But we’d want to minimize the number of locations that do that and document, so that specialized applications that really want to be unconditionally abort() free, can go and make it happen and handle the cases that can’t be addressed via a callback in whatever way makes sense for them. (E.g. by replacing the default handler with one suitable for their device).
Right now we have some cases where we can’t use the callback where use the default handler, and a bunch of other places where we are essentially relying on a null pointer dereference to work well enough (e.g. a dereference of ctx) for the purpose of stopping execution. Probably more of these places should be turned into explicitly calling the default handler because on some systems (esp embedded ones) a null pointer dereference will not crash.
CHECK is used in the tests because we can do whatever we want in the tests… and aborting on test failure is fine and generally interacts well with dynamic instrumentation tools like valgrind/afl/etc.
fixed (and improved the comment about a single malloc too).
I can squash if someone else wants to have a closer look.
15 #ifdef USE_ECMULT_STATIC_PRECOMPUTATION
16 #include "ecmult_static_context.h"
17 #endif
18+
19+#ifndef USE_ECMULT_STATIC_PRECOMPUTATION
20+ static const size_t secp256k1_ecmult_gen_context_preallocated_size = ROUND_TO_ALIGN(sizeof(*((secp256k1_ecmult_gen_context*) NULL)->prec));
In “Add size functions for preallocated memory”:
Nit: use uppercase for constants.
no-std
environments :)
This enables using the library without it calling any mallocs, right? If so then this would be awesome for
no-std
environments :)
Correct.
ACK.
The documentation for this however needs some guidance on the lifetime management of the caller provided memory (it could be done in another PR but it must be done).
It should make clear that the caller is obligated to make sure the buffer it provides lives at least as long as the ctx object and while the context exists the buffer is the exclusive property of that context (e.g. you cannot move it, you cannot create two contexts using the same buffer)… and that after destroying the context the caller it’s up to the caller to free the buffer (if required).
[Basically don’t assume that the user has a feel for how memory management works… everything that I just described is obvious if you have any idea of how it would be implemented, but we can’t assume that the user has given it that much thought… or they might be use to programming languages that manage lifetimes for them, or mistake the buffer for runtime scratch space that can be arbitrarily rewriten between calls]
We might also want to include a note that it’s easier and less failure prone to use the normal functions in environments where dynamic allocation isn’t a problem.
* Determine ALIGNMENT more cleverly and move it to util.h
* Implement manual_malloc() helper function