mem
followed by a lowercase letter. We are using the reserved function name memczero
, which is undefined behaviour.
mem
followed by a lowercase letter. We are using the reserved function name memczero
, which is undefined behaviour.
ACK fixing this.
I would gauge that this is extremely unlikely to ever cause issues (esp, not silent issues); but the function name is a free choice so better to be pedantically correct.
Ideally the code would be checked by static analysis which is anal enough to catch things like this (some of which might be less benign than this one) and keeping this wrong would just cause the tool to produce uninteresting reports.
Reserved wildcard issues have been fixed before. E.g. #479
__
prefix is UB. sigh.
If I understand C89 correctly, this is not an issue because memczero
is static and as such has internal linkage and is not an “external name”.
But the “mem…” identifiers are reserved as external names: 4.13: “All external names described below are reserved no matter what headers are included by the program. (…)”
3.1.2 defines “external name” to be “an identifier that has external linkage” and 3.1.2.2: “If the declaration of an identifier for an object or a function has file scope and contains the storage-class specifier static, the identifier has internal linkage.”
4.1.2: “If the program defines an external identifier with the same name as a reserved external identifier, even in a semantically equivalent form, the behavior is undefined.”
The language is pretty imprecise though. C99 and C11 make this clearer by explicitly saying which identifiers are reserved in which contexts. The result is the same in our case.
C99, 7.3.1: “All identifiers with external linkage in any of the following subclauses (including the future library directions) are always reserved for use as identifiers with external linkage.”
That’s enough language lawyering for today, even for my standards. No pun intended.
C99, 7.3.1: “Each identifier with file scope listed in any of the following subclauses (including the future library directions) is reserved for use as a macro name and as an identifier with file scope in the same name space if any of its associated headers is included.” (emphasis mine)
My reading of this was that declaring the memczero
function in any file that includes the <string.h>
header is UB. But I find this wording hard to read, and I may be mistaken.
Oh yes, I think you’re right!
I guess let’s just fix it instead of debating the standard (even though the latter may be more entertaining). :D
Should we expand the scope of this issue?
https://github.com/bitcoin-core/secp256k1/blob/master/src/ecmult_impl.h#L71
Beginning a macro with E followed by a digit or an uppercase letter, … undefined behaviour?
To be fair, I don’t think we include <errno.h>
.
wtf, as much as UB in C is interesting this is crazy, I thought only __ prefix is UB. sigh.
All UB really means is that the standard doesn’t make any promises about what happens when your program does it.
mem* is reserved, so maybe a later standard/compiler makes a “memczero” a built-in function which happens to take the right number of arguments, and .. uh.. zeros all memory accessible to C. This would make you unhappy. But (1) they’re unlikely to use that specific identifier, (2) if they did it would probably trigger warnings or failures to compile.
The whole thing where other kinds of UB causes extremely weird program behaviour is a product of optimization passes. E.g. Some optimization to unroll a loop can’t work unless it knows that the loop counter can’t wrap, and if it can’t unroll it can’t vectorize. Fortunately the counter is signed and so it can’t wrap by definition. Compose a bunch of optimizations and you get stuff like the compiler automatically removing blocks of code because it “proved” that it could never get executed, much to the programmer’s surprise. Without the optimizations the compiler’s performance would be much poorer (and it is– go compile this library with gcc 2.95, it’s visibly slower).
Some identifier language lawyering is extremely unlikely to result in weirdo behaviour through any path like that, only through a genuine clash. But being pedantically wrong, even if is unlikely doesn’t matter, makes it harder to use pedantic static analysis tools– and also potentially sends an implicit signal to contributors that the authors here don’t care about being pedantically correct. I’d even go a step further and say that if multiple experienced contributors think something might be verboten it should be avoided at least if avoiding it is essentially free: easy to review, harmless to performance, harmless to api compatibility, almost certain to not introduce bugs, like “pick a different function name”. :) If we thought it was wrong, someone else will later.
E
?
Should I open a new issue for macros beginning with
E
?
I think a PR would also work. ;)