real-or-random
commented at 4:43 pm on October 15, 2020:
contributor
This add a simple static checker based on clang-query, which is a
tool that could be described as a “clever grep” for abstract syntax
trees (ASTs).
As an initial proof of usefulness, this commit adds these checks:
No uses of floating point types.
No use of certain reserved identifiers (e.g., “mem(…)”, #829).
No use of memcmp (#823).
The checks are easily extensible.
The main purpose is to run the checker on CI, and this commit adds
the checker to the Travis CI script.
This currently requires clang-query version at least 10. (However,
it’s not required not compile with clang version 10, or with clang
at all. Just the compiler flags must be compatible with clang.)
Clang-query simply uses the clang compiler as a backend for
generating the AST.
In order to determine the compile in which the code is supposed to be
compiled (e.g., compiler flags such as -D defines and include paths),
it reads a compilation database in JSON format. There are multiple ways
to generate this database. The easiest way to obtain such a database is
to use a tool that intercepts the make process and build the database.
On Travis CI, we currently use “bear” for this purpose. It’s a natural
choice because there is an Ubuntu package for it. If you want to run
this locally, bear is a good choice but other tools such as compiledb
(Python) are available.
real-or-random force-pushed
on Oct 15, 2020
real-or-random force-pushed
on Oct 15, 2020
real-or-random force-pushed
on Oct 15, 2020
real-or-random
commented at 7:38 pm on October 15, 2020:
contributor
Here’s another interesting one:
match binaryOperator(allOf(unless(isExpansionInSystemHeader()), hasOperatorName("/")))
Unfortunately we use quite a few divisions, so this does not lead anywhere.
We could also check easily for reserved identifiers.
real-or-random
commented at 9:15 pm on October 15, 2020:
contributor
grml, apparently clang-query uses a compile_commands.json file to infer compile options/flags, which I had lying around in my local tree from some old experiments in 2018… So I need to rework this.
gmaxwell
commented at 10:01 pm on October 15, 2020:
contributor
Unfortunately we use quite a few divisions, so this does not lead anywhere.
Is there any way to tell if the divisor is a compile time constant? We generally don’t want divisions ending up in the machine code (though if its in some setup routine it’s not that big a deal) as they’re non-constant time[1] and slow (e.g. 50-100 cycles on x86, worse on a lot of arm). … but divisions with constants get converted by the compiler to multiples via strength reduction. I believe the codebase is free or nearly free of divisions with non-constant divisors (maybe there is one in scratch space sizing or something like that).
I’ve failed to yet fully impress my practice of always using masking over division onto Pieter yet… but there are some cases where masking doesn’t easily work.
[1] also, stock valgrind ctime check will not catch divisions involving secret data. I have a pretty trivial patch to valgrind on my laptop so that it does… and it passed for me as of the last time I ran it. I’m not really eager to figure out how to make travis use a custom compiled copy of valgrind…
real-or-random force-pushed
on Oct 16, 2020
real-or-random
commented at 1:18 pm on October 16, 2020:
contributor
Unfortunately we use quite a few divisions, so this does not lead anywhere.
Is there any way to tell if the divisor is a compile time constant? We generally don’t want divisions ending up in the machine code (though if its in some setup routine it’s not that big a deal) as they’re non-constant time[1] and slow (e.g. 50-100 cycles on x86, worse on a lot of arm). … but divisions with constants get converted by the compiler to multiples via strength reduction. I believe the codebase is free or nearly free of divisions with non-constant divisors (maybe there is one in scratch space sizing or something like that).
You can match on “constantExpr” but this is not what we want. This matches constant expression as seen from the point of view of the AST, i.e., this matches places where only constant expressions are allowed (such as after case).
This technique in general does not enable us to get insights on what the compiler will do with the code. It’s really just pattern matching on the AST, so it’s merely a clever grep. (One can do more advanced things with clang tooling but this requires more effort than writing matching expressions.)
What we could do is to check if the divisor is an integer literal, but even this is not precise we have also a few (x-1) divisors for example.
real-or-random force-pushed
on Oct 16, 2020
real-or-random force-pushed
on Oct 16, 2020
real-or-random force-pushed
on Oct 16, 2020
real-or-random
commented at 2:40 pm on October 16, 2020:
contributor
Edit: argh, the links to the exact lines don’t work due to the output folding.
I might add a commit that restructures the Travis output in general by adding folds to the main commands:
https://github.com/travis-ci/docs-travis-ci-com/issues/949#issuecomment-276755003
Then all the main commands be folded unless they fail, so see only the errors. Moreover we don’t need the cats at the end (which have confused people in the past) and we don’t need to ugly hack that I introduced here to make the color output work with cat.)
real-or-random marked this as ready for review
on Oct 16, 2020
sipa
commented at 7:16 am on October 17, 2020:
contributor
This is pretty cool. I’ll have to play with it a bit.
real-or-random force-pushed
on Oct 19, 2020
real-or-random renamed this:
Add simple static checker using clang-query
Add simple static checker based on clang-query
on Oct 19, 2020
real-or-random
commented at 3:57 pm on October 19, 2020:
contributor
I added a long commit message and updated the PR description to this message.
real-or-random referenced this in commit
e89278f211
on Oct 20, 2020
This add a simple static checker based on clang-query, which is a
tool that could be described as a "clever grep" for abstract syntax
trees (ASTs).
As an initial proof of usefulness, this commit adds these checks:
- No uses of floating point types.
- No use of certain reserved identifiers (e.g., "mem(...)", #829).
- No use of memcmp (#823).
The checks are easily extensible.
The main purpose is to run the checker on CI, and this commit adds
the checker to the Travis CI script.
This currently requires clang-query version at least 10. (However,
it's not required not compile with clang version 10, or with clang
at all. Just the compiler flags must be compatible with clang.)
Clang-query simply uses the clang compiler as a backend for
generating the AST.
In order to determine the compile in which the code is supposed to be
compiled (e.g., compiler flags such as -D defines and include paths),
it reads a compilation database in JSON format. There are multiple ways
to generate this database. The easiest way to obtain such a database is
to use a tool that intercepts the make process and build the database.
On Travis CI, we currently use "bear" for this purpose. It's a natural
choice because there is an Ubuntu package for it. If you want to run
this locally, bear is a good choice but other tools such as compiledb
(Python) are available.
4831f2ed4b
real-or-random force-pushed
on Oct 22, 2020
elichai
commented at 10:46 am on October 22, 2020:
contributor
FWIW
Names beginning with a capital ‘E’ followed a digit or uppercase letter
Names that begin with either ‘is’ or ‘to’ followed by a lowercase letter
Names that begin with ‘LC_’ followed by an uppercase letter
Names of all existing mathematics functions suffixed with ‘f’ or ‘l’
Names that begin with ‘SIG’ followed by an uppercase letter
Names that begin with ‘SIG_’ followed by an uppercase letter
Names beginning with ‘str’, ‘mem’, or ‘wcs’ followed by a lowercase letter
Names that end with ‘_t’
real-or-random
commented at 10:53 am on October 22, 2020:
contributor
@elichai Yeah, where did you get this list from? Is this really all about all names? I thought E is only for macros. We can’t check for macros using clang-query but we could add a simple grep for #define[[:space:]+]E etc. I also have uncommitted changes to add https://clang.llvm.org/docs/DiagnosticsReference.html#wreserved-id-macro if the compiler supports it (this covers only _ and __ macros). Not sure if overkill but it’s not a lot of work.
elichai
commented at 11:15 am on October 22, 2020:
contributor
@elichai Yeah, where did you get this list from? Is this really all about all names? I thought E is only for macros.
About E it’s weird, in the gcc docs it says for additional error code names which are macros, and the exact wording of the standard is:(7.1.4)
Macros that begin with E and a digit or E and an uppercase letter (followed by any combination of digits, letters, and underscore) may be added to the declarations in the <ermo. h> header.
But on 7.1.3 “Reserved identifies” it says:
Each header declares or defines all identifiers listed in its associated subclause, and optionally declares or defines identifiers listed in its associated future library directions subclause and identifiers which are always reserved either for any use or for use as file scope identifiers.
However, p1 makes it clear that all identifiers reserved from this subclause are reserved identifiers regardless of what header files are included, meaning that these rules apply to all C code
In effect, these identifiers are reserved for all uses in C regardless of what header files (if any) are included
EDIT: I realize now that you asked about identifier vs macro, and not about if it’s related to errno.h or not. it sounds like it’s only for macros but the lawyering here is delicate
elichai
commented at 11:55 am on October 22, 2020:
contributor
Standard on 6.8.3 Semantics:
The identifier immediately following the define is called the macro name
and also Aaron Ballman(the author of that paper, and part of WG14 committee) told me:
macros are defined using an identifier, so “macro name” and “identifier” are interchangable
“reserved for use as a macro name” is just saying how it’s intended to be used, not that macro names are a special thing
So yeah :/
gmaxwell
commented at 1:52 pm on October 22, 2020:
contributor
It makes sense: if some macro is going to get called E1234 then you obviously can’t have a variable named E1234 without trouble.
sipa referenced this in commit
9e5939d284
on Nov 4, 2020
Add -Wreserved-id-macro to compile flags (only clang)
This warns when we define macros with some reserved identifiers, e.g.,
with underscores. It does not catch all reserved identifiers though.
bb8cf03a28
real-or-random cross-referenced this
on Mar 17, 2021
from issue
schnorrsig API overhaul
by jonasnick
real-or-random
commented at 5:36 pm on May 8, 2023:
contributor
Hm, closing for now… clang-query is a neat idea, but I’m not convinced that the checks introduced here (namely mostly “reserved identifier” stuff) are worth maintaining this, in particular because clang-query is a bit finicky.
This conclusion could certainly change if we want to use something like this here for tracking magnitude (https://github.com/bitcoin-core/secp256k1/issues/1001) or other advanced stuff. And then, we may also want to look into other similar tools such as CodeQL.
This is a metadata mirror of the GitHub repository
bitcoin-core/secp256k1.
This site is not affiliated with GitHub.
Content is generated from a GitHub metadata backup.
generated: 2025-01-08 09:15 UTC
This site is hosted by @0xB10C More mirrored repositories can be found on mirror.b10c.me