tests: refactor tagged hash verification #1725

josibake wants to merge 1 commit into bitcoin-core:master from josibake:tagged-hash-test-util, changing 4 files (+36 −30)
  1. josibake commented at 10:05 am on August 15, 2025: member

    Opened in response to #1698 (review)


    We use tagged hashes in modules/musig, modules/schnorrsig, modules/ellswift, and the proposed modules/silentpayments. In looking for inspiration on how to add tagged hash midstate verification for #1698, it seemed like a good opportunity to DRY up the code across all of the modules.

    I chose the convention used in the ellswift module as this seems the most idiomatic C. Since the tags are normally specified as strings in the BIPs, I also added a comment above each char array for convenience.

    If it’s deemed too invasive to refactor the existing modules in this PR, I’m happy to drop the refactor commits for the ellswift and schnorrsig modules. All I need for #1698 is the first commit, which moves the utility function out of the musig module to make it available in the silent payments module.

  2. in src/modules/ellswift/tests_impl.h:410 in 6424805039 outdated
    430-        secp256k1_sha256_initialize_tagged(&sha, bip324_tag, sizeof(bip324_tag));
    431-        secp256k1_ellswift_sha256_init_bip324(&sha_optimized);
    432-        test_sha256_eq(&sha, &sha_optimized);
    433+        secp256k1_sha256 sha_optimized;
    434+        {
    435+            unsigned char tag[] = "secp256k1_ellswift_encode";
    


    hebasto commented at 4:53 pm on August 15, 2025:

    8a17983bfbe8b62f3e6c8b081a54c419021bc4c7:

    fa67b6752d8ba3e4c41f6c36b1c6b94a21770419 can be relevant.


    josibake commented at 9:32 am on August 18, 2025:

    Nice find! The commit message states “However, it requires exactly specifying the array size, which can be cumbersome,” but I don’t think this is true.

    Using the test program:

    // repro.c
    #include <stdio.h>

    int main() {
        char str[] = "hello world";  // This should trigger the warning
        printf("%s\n", str);
        return 0;
    }
    

    I am able to compile with gcc14:

    nix-shell --expr 'with import <nixpkgs> {}; mkShell.override { stdenv = overrideCC stdenv gcc14; }'
    gcc -v
    gcc -Wall -Wextra -Wpedantic -Werror repro.c -o out
    

    and able to compile with gcc15:

    nix-shell --expr 'with import <nixpkgs> {}; mkShell.override { stdenv = overrideCC stdenv gcc15; }'
    gcc -v
    gcc -Wall -Wextra -Wpedantic -Werror repro.c -o out
    

    However, if I specify the array size, I can reproduce the error:

    // repro.c
    #include <stdio.h>

    int main() {
        char str[11] = "hello world";  // This should trigger the warning
        printf("%s\n", str);
        return 0;
    }
    

    No error with:

    nix-shell --expr 'with import <nixpkgs> {}; mkShell.override { stdenv = overrideCC stdenv gcc14; }'
    gcc -Wall -Wextra -Wpedantic -Werror repro.c -o out
    

    And an error with:

    nix-shell --expr 'with import <nixpkgs> {}; mkShell.override { stdenv = overrideCC stdenv gcc15; }'
    gcc -Wall -Wextra -Wpedantic -Werror repro.c -o out

    repro.c: In function ‘main’:
    repro.c:4:20: error: initializer-string for array of ‘char’ truncates NUL terminator but destination lacks ‘nonstring’ attribute (12 chars into 11 available) [-Werror=unterminated-string-initialization]
        4 |     char str[11] = "hello world";  // This should trigger the warning
          |                    ^~~~~~~~~~~~~
    cc1: all warnings being treated as errors
    

    Based on the above, I’d recommend we prefer the approach in this PR of not specifying the array size and perhaps document it as the preferred convention going forward? I find being able to specify the tag as a string to be much more reviewable than specifying the tag as an array of characters.

    That being said, I’m also happy to go the other way and update the musig tests to match the other modules if that’s the preferred convention, as I think the main benefit is to have all of the modules follow the same convention.


    josibake commented at 9:55 am on August 18, 2025:

    To convince myself, I also verified with a few versions of clang, e.g.:

    nix-shell --expr 'with import <nixpkgs> {}; mkShell.override { stdenv = llvmPackages_16.stdenv; }'
    clang -Wall -Wextra -Wpedantic -Werror -Wmost repro.c
    

    real-or-random commented at 10:38 am on August 18, 2025:

    @josibake The NUL byte resulting from char str[] = "hello world" does not hurt per se, but there are two minor issues with this: First, it’s conceptually the wrong thing: If we want a char array, the simplest thing to do is to define a char array instead of a NUL-terminated string. Second and probably more relevant, it changes sizeof(str) to be 12 instead of 11. (See https://godbolt.org/z/da6PExKTh for demonstration. godbolt.org is the easiest way to test toy examples on many compilers.) We could, of course, accept this and always use sizeof(str) - 1, but it’s easy to miss this.

    edit: Sorry, I now saw that you’re aware of the - 1 thing. And I agree, the ability to grep for the string is a good argument for the NUL-terminated string. If you ask me, I prefer to forego the grepability and define the right kind of object and have sizeof correct. But there’s no definitive answer in the end.
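The `sizeof` difference discussed above can be seen in a minimal sketch. The names `tag_str`, `tag_arr`, and the helper functions are hypothetical and chosen only for illustration; they are not taken from the library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* String-literal initializer: the array holds the 11 characters plus a trailing NUL. */
static const unsigned char tag_str[] = "hello world";

/* Character-list initializer: exactly the 11 bytes listed, no NUL. */
static const unsigned char tag_arr[] = {'h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd'};

/* Tag length for the string-initialized form: the NUL must be subtracted by hand. */
static size_t tag_str_len(void) { return sizeof(tag_str) - 1; }

/* Tag length for the character-list form: sizeof is already the tag length. */
static size_t tag_arr_len(void) { return sizeof(tag_arr); }

/* Returns 1 if both forms describe the same tag bytes. */
static int tags_match(void) {
    return tag_str_len() == tag_arr_len()
        && memcmp(tag_str, tag_arr, tag_arr_len()) == 0;
}

/* Documents the two sizes side by side. */
static void check_sizes(void) {
    assert(sizeof(tag_str) == 12);
    assert(sizeof(tag_arr) == 11);
}
```

Forgetting the `- 1` in the string-initialized form would silently hash one extra zero byte, which is the kind of mistake the character-list form rules out.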


    josibake commented at 10:59 am on August 18, 2025:

    @real-or-random thanks for the context! That explains the sizeof(str) - 1 for the musig examples. So it seems the choices are:

    1. Do something conceptually wrong for something that is slightly easier to review
    2. Do the conceptually correct thing for something that is slightly harder to review

    “Slightly harder/easier” is a bit hand-wavy, but the fact that we used to specify the tags as strings (and that the recently added musig module also adopted this convention rather than staying consistent with the existing modules) indicates option 1 is the more natural option. However, it likely needs an explainer, especially for why we are using sizeof(tag) - 1. On the flip side, I’m guessing option 2 feels more natural for reviewers who review/write C the majority of the time?

    Regardless of which convention is chosen, I do think it’s worth documenting in CONTRIBUTING.md. I’ll add a commit for that once reviewers have weighed in on which convention they prefer.


    hebasto commented at 12:38 pm on August 18, 2025:

    Maybe godbolt.org/z/eKbT6sha4?

    real-or-random commented at 12:49 pm on August 18, 2025:

    > Maybe godbolt.org/z/eKbT6sha4?

    That still generates a warning if I add -Wextra.


    hebasto commented at 1:10 pm on August 18, 2025:

    > > Maybe godbolt.org/z/eKbT6sha4?

    > That still generates a warning if I add -Wextra.

    Right.

    https://godbolt.org/z/n5rf5Y7cP


    real-or-random commented at 1:25 pm on August 18, 2025:

    Oh, interesting, I wasn’t aware of nonstring. That’s another neat way.

    Though when I think about it, I still prefer {'h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd'}. Code is read much more often than it’s written, so it makes sense to optimize reader (or reviewer) burden, and {'h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd'} is immediately clear to a reviewer familiar with C. It’s just a bit hard on the eyes, but there will be no need to look up macros or GNU extension attributes, etc.
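For comparison, the `nonstring` route mentioned above could look roughly like the following sketch. `TAG_NONSTRING` and `tag_exact` are hypothetical names, and the feature test assumes a compiler that provides `__has_attribute` (recent GCC and Clang do); on other compilers the macro expands to nothing.

```c
#include <assert.h>
#include <string.h>

/* Use the GNU nonstring attribute where available, otherwise expand to nothing. */
#if defined(__has_attribute)
#  if __has_attribute(nonstring)
#    define TAG_NONSTRING __attribute__((nonstring))
#  endif
#endif
#ifndef TAG_NONSTRING
#  define TAG_NONSTRING
#endif

/* Exactly 11 bytes: C drops the literal's NUL when the array has no room for it.
 * The attribute tells the compiler that the missing terminator is intentional. */
static const unsigned char tag_exact[11] TAG_NONSTRING = "hello world";

/* Returns 1 if tag_exact equals the plain character-list form. */
static int tag_exact_matches(void) {
    static const unsigned char expected[] = {'h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd'};
    return sizeof(tag_exact) == sizeof(expected)
        && memcmp(tag_exact, expected, sizeof(expected)) == 0;
}

/* sizeof is the tag length, with no "- 1" adjustment needed. */
static void check_tag_exact(void) {
    assert(sizeof(tag_exact) == 11);
    assert(tag_exact_matches());
}
```

This keeps the tag greppable and `sizeof` correct, but the explicit array size has to be counted by hand and the reader has to know the attribute, which is the review burden raised in the comment above.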


    hebasto commented at 3:07 pm on August 18, 2025:

    > Though when I think about it, I still prefer {'h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd'}. Code is read much more often than it’s written, so it makes sense to optimize reader (or reviewer) burden, and {'h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd'} is immediately clear to a reviewer familiar with C. It’s just a bit hard on the eyes, but there will be no need to look up macros or GNU extension attributes, etc.

    Agreed. That’s why I raised this point in the first place.


    josibake commented at 4:51 pm on August 18, 2025:
    Sounds like two votes for keeping it as is versus one vote to change it 😅 I’ll update this PR tomorrow to instead convert the musig module to the existing convention, and add a note documenting the convention.
  3. real-or-random commented at 10:44 am on August 18, 2025: contributor
    Concept ACK it’s a good idea to make this consistent
  4. real-or-random added the label assurance on Aug 18, 2025
  5. real-or-random added the label tweak/refactor on Aug 18, 2025
  6. theStack commented at 9:24 pm on August 18, 2025: contributor

    Concept ACK

    At the risk of sounding heretical, wouldn’t it also be an option to let sha256_tag_test_internal simply take a string and compute the tag length at run-time via strlen (it’s test-only code anyway…), in order to avoid having to declare char arrays and deal with specifying the correct lengths repeatedly in the first place? I’d be very surprised if future BIP authors broke with tradition and used tags that include NUL bytes. Happy to review either variant, of course (also, obviously feel free to just ignore, since there has been a good amount of discussion already).
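The `strlen` idea and the assumption it relies on can be sketched as follows. `tag_len_from_cstr` is a hypothetical stand-in for a helper that receives the tag as a C string; it is not a function from the library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: derive the tag length from a NUL-terminated string. */
static size_t tag_len_from_cstr(const char *tag) {
    return strlen(tag);
}

/* Shows where the approach works and where it would silently truncate. */
static void check_strlen_assumption(void) {
    /* Five tag bytes with an embedded NUL, plus the literal's terminating NUL. */
    static const char with_nul[] = "ab\0cd";

    /* Ordinary tags: strlen gives the full length. */
    assert(tag_len_from_cstr("hello world") == 11);

    /* A tag containing a NUL byte is cut off at that byte. */
    assert(sizeof(with_nul) - 1 == 5);
    assert(tag_len_from_cstr(with_nul) == 2);
}
```

As long as no tag contains a NUL byte, the computed length is the tag length and callers never have to spell out a size.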

  7. real-or-random commented at 6:42 am on August 19, 2025: contributor

    > At the risk of sounding heretical, wouldn’t it also be an option to let sha256_tag_test_internal simply take a string and compute the tag length at run-time via strlen (it’s test-only code anyway…),

    Hehe, I think that’s also a good approach. It increases legibility at the cost of introducing the assumption that there are no NUL bytes in the tag (which is most likely true even for future tags, yes). If I had to pick, I’d still pick the array initializer simply because the tag is conceptually an array.

    I think we have reached a point where @josibake should just pick one of the many good options, and we’ll move on with that one. :smile:

  8. josibake force-pushed on Aug 20, 2025
  9. josibake commented at 8:30 am on August 20, 2025: member

    Thanks everyone for chiming in! I reworked this to update the musig tests to use static const unsigned char arrays and refactored the existing tests to use the sha256_tag_test_internal function. I think @real-or-random made some compelling arguments for this approach, namely:

    > {'h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd'} is immediately clear to a reviewer familiar with C. It’s just a bit hard on the eyes, but there will be no need to look up macros or GNU extension attributes, etc.

    Given that this library is written in C, it seems best to write code that is familiar to reviewers and is idiomatic C.

    > If I had to pick, I’d still pick the array initializer simply because the tag is conceptually an array.

    Agree. Though we can represent tags as strings, ultimately they are character arrays. Creating them as char arrays seems to have the fewest surprises, e.g., sizeof works as expected. I still think it’s nice to have a string representation of the tag in the code, so I added a comment above each char array.

    Lastly, I decided against adding a blurb to CONTRIBUTING.md. I think “New code should adhere to the style of existing, in particular surrounding, code.” is sufficient, and I expect new tagged hashes to be infrequent. Happy to add a documentation commit, however, if others feel it warrants a blurb in CONTRIBUTING.md.
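The convention described above, a character array with the tag string in a comment, might look like this sketch. The tag is the BIP-340 challenge tag; the variable name `bip340_challenge_tag` and the helper `tag_matches_string` are hypothetical and exist only to show how a reviewer could cross-check the array against the commented string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* "BIP0340/challenge" */
static const unsigned char bip340_challenge_tag[] = {'B', 'I', 'P', '0', '3', '4', '0', '/', 'c', 'h', 'a', 'l', 'l', 'e', 'n', 'g', 'e'};

/* Hypothetical review aid: returns 1 if the tag bytes equal the given string. */
static int tag_matches_string(const unsigned char *tag, size_t taglen, const char *expected) {
    return taglen == strlen(expected) && memcmp(tag, expected, taglen) == 0;
}

/* The array is passed as (pointer, sizeof) with no length adjustment. */
static void check_bip340_challenge_tag(void) {
    assert(sizeof(bip340_challenge_tag) == 17);
    assert(tag_matches_string(bip340_challenge_tag, sizeof(bip340_challenge_tag), "BIP0340/challenge"));
}
```

The comment keeps the tag greppable and easy to compare against the BIP text, while the array itself carries exactly the bytes that get hashed.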

  10. real-or-random commented at 8:44 am on August 20, 2025: contributor

    > Lastly, I decided against adding a blurb to CONTRIBUTING.md. I think “New code should adhere to the style of existing, in particular surrounding, code.” is sufficient, and I expect new tagged hashes to be infrequent. Happy to add a documentation commit, however, if others feel it warrants a blurb in CONTRIBUTING.md.

    Agreed, this is too much of a niche thing to bother with in this file. Of course, it won’t hurt if it’s documented there, but then we could also document hundreds of other things in CONTRIBUTING.md.

  11. in src/tests.c:618 in 17af09dcd4 outdated
    613+ * tagged hash midstate. This function is used by some module tests. */
    614+static void sha256_tag_test_internal(secp256k1_sha256 *sha_tagged, const unsigned char *tag, size_t taglen) {
    615+    secp256k1_sha256 sha;
    616+    secp256k1_sha256_initialize_tagged(&sha, tag, taglen);
    617+    test_sha256_eq(&sha, sha_tagged);
    618+}
    


    real-or-random commented at 8:52 am on August 20, 2025:
    The only comment I have is that this could get a better name, e.g., test_sha256_tag_midstate, to make it more descriptive and consistent with the rest. Not sure what the point of “internal” was, but now the function is not internal to the module anymore.
  12. real-or-random commented at 8:52 am on August 20, 2025: contributor
    ACK mod nit, you could also squash these commits
  13. tests: refactor tagged hash tests
    Move the sha256_tag_test_internal function out of the musig module
    into tests.c. This makes it available to other modules wishing to verify tagged
    hashes without needing to duplicate the function.
    
    Change the function signature to expect a const unsigned char and update
    the tagged hash tests to use static const unsigned char character
    arrays (where necessary).
    
    Add a comment for each tag. This is done as a convenience for checking
    the strings against the protocol specifications, where the tags are
    normally specified as strings.
    
    Update tests in the ellswift and schnorrsig modules to use the
    sha256_tag_test_internal helper function.
    5153cf1c91
  14. josibake force-pushed on Aug 20, 2025
  15. josibake commented at 9:43 am on August 20, 2025: member
    Renamed helper function to test_sha256_tag_midstate and squashed the commits (h/t @real-or-random )
  16. real-or-random approved
  17. real-or-random commented at 9:44 am on August 20, 2025: contributor
    utACK 5153cf1c91b6967b3cb0adcd0b990a0cffb2a0b2 assuming CI passes
  18. theStack approved
  19. theStack commented at 7:49 pm on August 20, 2025: contributor
    Code-review ACK 5153cf1c91b6967b3cb0adcd0b990a0cffb2a0b2
  20. real-or-random merged this on Aug 21, 2025
  21. real-or-random closed this on Aug 21, 2025


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin-core/secp256k1. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-08-30 14:15 UTC
