Closes #1181.
Catches the actual symbol visibility issue.
Replaces #1135.
Maybe --env-file drops ENV?
Rebased to resolve conflicts.
Rebased.
0 | @@ -0,0 +1,12 @@ 1 | +#ifndef SECP256K1_UTIL_DEFS_H 2 | +#define SECP256K1_UTIL_DEFS_H 3 | + 4 | +/* Global variable visibility */ 5 | +/* See: https://github.com/bitcoin-core/secp256k1/issues/1181 */ 6 | +#if !defined(_WIN32) && defined(__GNUC__) && (__GNUC__ >= 4) 7 | +# define SECP256K1_LOCAL_VAR extern __attribute__ ((visibility ("hidden")))
The combo of extern + hidden isn't completely intuitive, but the gcc visibility guide confirms that it makes sense.
0 | @@ -0,0 +1,12 @@ 1 | +#ifndef SECP256K1_UTIL_DEFS_H 2 | +#define SECP256K1_UTIL_DEFS_H
Nit: util_defs.h is pretty ambiguous. How about util_local_visibility.h or so?
Thanks! Reworked.
266 | @@ -267,8 +267,6 @@ else() 267 | try_append_c_flags(-Wundef) 268 | endif() 269 | 270 | -set(CMAKE_C_VISIBILITY_PRESET hidden)
In case it's not obvious to other reviewers, this is no longer needed because every non-static function should be externally visible.
If we ever need to share private symbols between compilation units this won't work. But I agree that it makes sense to take advantage of that if we can.
Cool that this caught a real (new) symbol leak.
To clarify, the fix is in the first commit. The excerpt from the CI log with an error message from the previous iteration of this branch looks as follows:
+ python3 ./tools/symbol-check.py .libs/libsecp256k1.so
.libs/libsecp256k1.so: export of symbol secp256k1_musig_nonce_gen_internal not expected
.libs/libsecp256k1.so: export of symbol secp256k1_musig_nonce_gen_internal not expected
.libs/libsecp256k1.so: failed EXPORTED_SYMBOLS
Error: Process completed with exit code 1.
@fanquake suggested that the test could potentially be simplified for c-i.
Thanks! The tools/symbol-check.py script has been simplified.
Code review ACK 44a9392f8b124aab44bd38c7aab6fbf8c4ac3c11. No opinion on the tests/ci.
781 | @@ -763,6 +782,12 @@ jobs: 782 | run: | 783 | cd build/bin/RelWithDebInfo && file *tests.exe bench*.exe libsecp256k1-*.dll || true 784 | 785 | + - name: Symbol check 786 | + if: ${{ matrix.configuration.cmake_options != '-A x64 -DBUILD_SHARED_LIBS=OFF' }}
nit: Checking for string inequality seems a bit fragile if we modify the builds in the future. Perhaps it's cleaner to introduce a new variable symbol_check in addition to cmake_options? Or alternatively, a variable shared that controls -DBUILD_SHARED_LIBS.
0 | @@ -0,0 +1,98 @@ 1 | +#!/usr/bin/env python3 2 | +''' 3 | +A script to check that a secp256k1 shared library
A script to check that a libsecp256k1 shared library
:P
106 | @@ -107,6 +107,19 @@ file *tests* || true 107 | file bench* || true 108 | file .libs/* || true 109 | 110 | +if [ "$SYMBOL_CHECK" = "yes" ] 111 | +then 112 | + case "$HOST" in 113 | + *mingw*) 114 | + ls -l .libs 115 | + python3 ./tools/symbol-check.py .libs/libsecp256k1-2.dll
Now that we invoke python3 here, I think it makes sense to add a python3 --version at the top of the file, as we done for the other tools.
22 | + ./tools/symbol-check.py build/lib/libsecp256k1.dylib 23 | +''' 24 | +import re 25 | +import sys 26 | +import subprocess 27 | +from typing import List
hint: List[T] has been deprecated since Python 3.9, which is the oldest supported Python version. So you can safely use list[T], and this import is not needed. See https://docs.python.org/3.9/library/typing.html?highlight=typing#typing.List (Kudos for adding types!)
30 | # define WINDOW_G ECMULT_WINDOW_SIZE 31 | -extern const secp256k1_ge_storage secp256k1_pre_g[ECMULT_TABLE_SIZE(WINDOW_G)]; 32 | -extern const secp256k1_ge_storage secp256k1_pre_g_128[ECMULT_TABLE_SIZE(WINDOW_G)]; 33 | +SECP256K1_LOCAL_VAR const secp256k1_ge_storage secp256k1_pre_g[ECMULT_TABLE_SIZE(WINDOW_G)]; 34 | +SECP256K1_LOCAL_VAR const secp256k1_ge_storage secp256k1_pre_g_128[ECMULT_TABLE_SIZE(WINDOW_G)]; 35 | #endif /* defined(EXHAUSTIVE_TEST_ORDER) */
Is there a specific reason to keep the #include in a specific branch (instead of putting it at the top of file)?
28 | + 29 | +import lief 30 | + 31 | + 32 | +def grep_exported_symbols() -> List[str]: 33 | + grep_output = subprocess.check_output(["git", "grep", "^SECP256K1_API", "--", "include"], universal_newlines=True, encoding="utf8")
grep_output = subprocess.check_output(["git", "grep", "^[\s*]SECP256K1_API", "--", "include"], universal_newlines=True, encoding="utf8")
That's perhaps a bit more robust
I think this is small enough to keep it in util.h. Or is there a reason not to keep it here?
The reason is to facilitate the further attempts to make all headers self-contained.
I guess I'm just missing it. Can you explain how this will help with #1039?
My intention was to avoid bringing the high-level ../include/secp256k1.h header into lower-level code via util.h. However, at this point, util.h is already being included indirectly through other headers.
No opinion here, I defer to @real-or-random for preference.
@real-or-random's comments have been addressed.
66 | + return ok 67 | + 68 | + 69 | +def check_MACHO_exported_functions(library, expected_exports) -> bool: 70 | + ok: bool = True 71 | + macho_lib: lief.MACHO.Binary = library.concrete
What are the calls to .concrete for?
88 | + elif exe_format == lief.Binary.FORMATS.PE: 89 | + success = check_PE_exported_functions(library, grep_exported_symbols()) 90 | + elif exe_format == lief.Binary.FORMATS.MACHO: 91 | + success = check_MACHO_exported_functions(library, grep_exported_symbols()) 92 | + else: 93 | + print(f'{filename}: unknown executable format')
I don't think this code can be reached, because if it is an unknown format, .parse() will have errored (printing Unknown format), and then lief.Binary.FORMATS = library.format will fail:
./tools/symbol-check.py not_a_file
Unknown format
Traceback (most recent call last):
File "./tools/symbol-check.py", line 81, in <module>
exe_format: lief.Binary.FORMATS = library.format
^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'format'
I don't think this code can be reached,
This will be reachable once LIEF adds support for a new binary format, so I think it was useful to have it.
51 | + return ok 52 | + 53 | + 54 | +def check_PE_exported_functions(library, expected_exports) -> bool: 55 | + ok: bool = True 56 | + for function in library.exported_functions:
After reading the docs, I believe that .exported_functions is from the abstract Binary class, and [entry.name for entry in library.get_export()] may give the better result for PE because it includes non-function exports? See https://lief.re/doc/latest/formats/pe/python.html#export-entry
The suggested approach returns the same list of symbols on the master branch. Incorporated as a potentially superior one.
62 | + return ok 63 | + 64 | + 65 | +def check_MACHO_exported_functions(library, expected_exports) -> bool: 66 | + ok: bool = True 67 | + for function in library.exported_functions:
After reading the docs, I believe that .exported_functions is from the abstract Binary class, and .exported_symbols may give a better result for Mach-O because it includes non-function exports?
The suggested approach returns the same list of symbols on the master branch. Incorporated as a potentially superior one.
Hm, just checking: You're saying that this change didn't make a difference, right? So it still finds only exported functions and not exported variables?
(And I have the same question for PE.)
Hm, just checking: You're saying that this change didn't make a difference, right?
Correct.
So it still finds only exported functions and not exported variables?
With the reverted second commit, both variants catch the same local variables:
./tools/symbol-check.py: In .libs/libsecp256k1.dylib: Unexpected exported symbols: {'secp256k1_pre_g', 'secp256k1_pre_g_128', 'secp256k1_ecmult_gen_prec_table'}
(And I have the same question for PE.)
With the reverted second commit, both variants do not catch local variables.
For more details, please refer to:
Ok, that means that the LIEF API is a bit unexpected. Or we don't understand it.
exported_functions is the same as exported_symbols. Both include variables, but I'd expect the former to include only functions.secp256k1_ellswift_xdh_hash_function_bip324, so at least we test this in CI. And thus we can be sure that the variables are actually exported in PE, and it's just LIEF that doesn't show them.Does this match your understanding? Do you think it's worth asking LIEF about the variables?
Relatedly, on platforms where LIEF gives us both variables and functions, would it make sense also check that all expected symbols are actually exported?
Nevermind, my analysis of variables is wrong. LIEF shows them. I'll revisit this later.
This script can still be simplified. Let me suggest a refactored version later.
Here's a refactored version, which is more readable IMHO (haven't tested on Mac or Windows):
#!/usr/bin/env python3
"""Check that a libsecp256k1 shared library exports only expected symbols.
Usage examples:
- When building with Autotools:
./tools/symbol-check.py .libs/libsecp256k1.so
./tools/symbol-check.py .libs/libsecp256k1-<V>.dll
./tools/symbol-check.py .libs/libsecp256k1.dylib
- When building with CMake:
./tools/symbol-check.py build/lib/libsecp256k1.so
./tools/symbol-check.py build/bin/libsecp256k1-<V>.dll
./tools/symbol-check.py build/lib/libsecp256k1.dylib"""
import re
import sys
import subprocess
import lief
class UnexpectedExport(RuntimeError):
pass
def get_exported_exports(library) -> list[str]:
"""Adapter function to get exported symbols based on the library format."""
if library.format == lief.Binary.FORMATS.ELF:
return [symbol.name for symbol in library.exported_symbols]
elif library.format == lief.Binary.FORMATS.PE:
return [function.name for function in library.exported_functions]
elif library.format == lief.Binary.FORMATS.MACHO:
return [function.name[1:] for function in library.exported_functions]
raise NotImplementedError(f"Unsupported format: {library.format}")
def grep_expected_symbols() -> list[str]:
"""Guess the list of expected exported symbols from the source code."""
grep_output = subprocess.check_output(
["git", "grep", "^SECP256K1_API", "--", "include"], # TODO WHITESPACE
universal_newlines=True,
encoding="utf-8"
)
lines = grep_output.split("\n")
pattern = re.compile(r'\bsecp256k1_\w+')
exported: list[str] = [pattern.findall(line)[-1] for line in lines if line.strip()]
return exported
def check_symbols(library, expected_exports) -> None:
"""Check that the library exports only the expected symbols."""
actual_exports = list(get_exported_exports(library))
unexpected_exports = set(actual_exports) - set(expected_exports)
if unexpected_exports != set():
raise UnexpectedExport(f"Unexpected exported symbols: {unexpected_exports}")
def main():
if len(sys.argv) != 2:
print(__doc__)
return 1
library = lief.parse(sys.argv[1])
expected_exports = grep_expected_symbols()
try:
check_symbols(library, expected_exports)
except UnexpectedExport as e:
print(f"{sys.argv[0]}: In {sys.argv[1]}: {e}")
return 1
return 0
if __name__ == "__main__":
sys.exit(main())
Here's a refactored version, which is more readable IMHO (haven't tested on Mac or Windows):
Tested on macOS and on Windows. Incorporated. Thank you!
35 | + 36 | + 37 | +def grep_expected_symbols() -> list[str]: 38 | + """Guess the list of expected exported symbols from the source code.""" 39 | + grep_output = subprocess.check_output( 40 | + ["git", "grep", "^SECP256K1_API", "--", "include"], # TODO WHITESPACE
["git", "grep", "^\s*SECP256K1_API", "--", "include"],
Oops, I think I had this in mind here, to make the matching a bit more flexible.
Thanks! Done.
35 | + 36 | + 37 | +def grep_expected_symbols() -> list[str]: 38 | + """Guess the list of expected exported symbols from the source code.""" 39 | + grep_output = subprocess.check_output( 40 | + ["git", "grep", "^\s*SECP256K1_API", "--", "include"],
This gives me error symbol-check.backup.py:40: SyntaxWarning: invalid escape sequence '\s'. I think it should be
r"^\s*SECP256K1_API"
or
"^\\s*SECP256K1_API"
Rebased and addressed @jonasnick's #1359 (review).
Code-review reACK 7ac46cf93c322193e428effeac3f024f4ab6a05b.
Still no opinion on the .py/ci.
Being able to drop -fvisibility=hidden is a nice improvement (and also a testament to the code structure!)
47 | + return exported 48 | + 49 | + 50 | +def check_symbols(library, expected_exports) -> None: 51 | + """Check that the library exports only the expected symbols.""" 52 | + actual_exports = list(get_exported_exports(library))
nit if you touch again:
actual_exports = get_exported_exports(library)
(I had this as a local change lying around in some git stash...)
Addressed the feedback from @real-or-random and rebased.
needs rebase, sorry :/
This change makes the `-fvisibility=hidden` compiler option unnecessary.
Co-authored-by: Tim Ruffing <crypto@timruffing.de>
needs rebase, sorry :/
Rebased :)
reACK d1478763a5f400fafb42af5911db4a9460dd4a5d
54 | + if unexpected_exports != set(): 55 | + raise UnexpectedExport(f"Unexpected exported symbols: {unexpected_exports}") 56 | + 57 | +def main(): 58 | + if len(sys.argv) != 2: 59 | + print(__doc__)
print(doc)
Don't think this does anything? The condition also doesn't catch a missing file argument:
./tools/symbol-check.py
Traceback (most recent call last):
File "/Users/michael/secp256k1/./tools/symbol-check.py", line 83, in <module>
filename: str = sys.argv[1]
~~~~~~~~^^^
IndexError: list index out of range
so would only be useful if someone has given multiple arguments.
What platform and Python version are you using?
On my Ubuntu 24.04.2 LTS with Python 3.12.3:
$ ./tools/symbol-check.py
Check that a libsecp256k1 shared library exports only expected symbols.
Usage examples:
- When building with Autotools:
./tools/symbol-check.py .libs/libsecp256k1.so
./tools/symbol-check.py .libs/libsecp256k1-<V>.dll
./tools/symbol-check.py .libs/libsecp256k1.dylib
- When building with CMake:
./tools/symbol-check.py build/lib/libsecp256k1.so
./tools/symbol-check.py build/bin/libsecp256k1-<V>.dll
./tools/symbol-check.py build/lib/libsecp256k1.dylib
macOS & Python 3.13.2
./tools/symbol-check.py Traceback (most recent call last): File "/Users/michael/secp256k1/./tools/symbol-check.py", line 83, in <module> filename: str = sys.argv[1] ~~~~~~~~^^^ IndexError: list index out of range
Are you on the most recent revision of this PR? The current ./tools/symbol-check.py has no line 83.
macOS & Python 3.13.2
No issues on my macOS with Python 3.13.2.
@real-or-random eh, I think an older version must have been the case.
Milestone
0.7.0