kernel: assert invalid buffer preconditions in `btck_*_create` functions #35312

pull stringintech wants to merge 1 commits into bitcoin:master from stringintech:2026/05/kernel-assert-buffer-preconditions changing 3 files +13 −24
  1. stringintech commented at 6:38 AM on May 18, 2026: contributor

    The kernel API appears to use nullptr returns to report failures that callers may reasonably want to recover from: malformed serialized input, object construction failures, chainstate/load failures, and similar runtime conditions.

    The raw create-function buffer checks seem to be a different case. A failure of ptr == nullptr && len > 0 does not indicate malformed input data or a failure encountered while deserializing or constructing the requested object. Returning nullptr for these checks widens the recoverable error surface with cases that are better treated as programmer errors, similar to other asserted preconditions in this API such as invalid indices and impossible enum/flag states.

    This change switches those buffer argument checks from nullptr returns to assertions in btck_transaction_create, btck_script_pubkey_create, btck_block_create, btck_block_header_create, and btck_chainstate_manager_options_create. btck_block_header_create additionally asserts the pre-existing documented length contract (must be 80 bytes). These functions still return nullptr when the provided bytes cannot be parsed or when object creation fails during processing.

    I ended up looking at this while working on the kernel-bindings-tests spec/schema for btck_script_pubkey_create, where treating this path as a regular error did not seem like the right contract: https://github.com/stringintech/kernel-bindings-tests/pull/14#discussion_r3240859568.

  2. DrahtBot added the label Validation on May 18, 2026
  3. DrahtBot commented at 6:38 AM on May 18, 2026: contributor

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--006a51241073e994b41acfe9ec718e94-->

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/35312.

    <!--021abf342d371248e50ceaed478a90ca-->

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK stickies-v, janb84, w0xlt, sedited

    If your review is incorrectly listed, please copy-paste <code>&lt;!--meta-tag:bot-skip--&gt;</code> into the comment that the bot should ignore.

    <!--5faf32d7da4f0f540f40219e4f7537a3-->

  4. in src/kernel/bitcoinkernel.cpp:1378 in 4c6d65317e
    1374 | @@ -1381,9 +1375,7 @@ int btck_chain_contains(const btck_Chain* chain, const btck_BlockTreeEntry* entr
    1375 |  
    1376 |  btck_BlockHeader* btck_block_header_create(const void* raw_block_header, size_t raw_block_header_len)
    1377 |  {
    1378 | -    if (raw_block_header == nullptr && raw_block_header_len != 0) {
    1379 | -        return nullptr;
    1380 | -    }
    1381 | +    assert(raw_block_header != nullptr || raw_block_header_len == 0);
    


    stickies-v commented at 4:09 PM on May 18, 2026:

    As you highlighted elsewhere, since the interface states we require non-null and len 80, I think it makes sense to assert that here too (and with BITCOINKERNEL_ARG_NONNULL)? In my view this is orthogonal to the changes done in #33853, where empty spans are allowed.


    stringintech commented at 7:46 AM on May 19, 2026:

    I had noticed that submitheader, via DecodeHexBlockHeader(), is currently lenient and accepts full block hex too, with some functional tests relying on that behavior, so I wondered whether we should mirror that here by loosening the API contract instead. But on second thought, that leniency may not be especially necessary or justified here: if the goal is just to avoid copying when starting from existing block bytes, callers could still pass a subspan over the first 80 bytes. What do you think? (Also the length check was previously discussed here: #33822 (review))


    maflcko commented at 7:56 AM on May 19, 2026:

    (Note if the prior GitHub link doesn't work. (it does not for me): https://mirror.b10c.me/bitcoin-bitcoin/33822#discussion_r2508035777 does)


    stickies-v commented at 9:35 AM on May 27, 2026:

    if the goal is just to avoid copying when starting from existing block bytes, callers could still pass a subspan over the first 80 bytes. What do you think?

    I agree with that. I think it's safer to keep the contract strict, and I don't see the benefit of accepting > 80 bytes.


    stringintech commented at 8:00 AM on May 28, 2026:

    If raw_block_header_len is no longer useful, can't we just drop it and use const unsigned char raw_block_header[80] instead of asserting on it: https://github.com/stringintech/bitcoin/commit/7f8d5597060d9cbee633ede4b0c90002e15a30b5?

    An alternative would be to keep it but treat an invalid length as a recoverable error, as in https://github.com/stringintech/bitcoin/commit/eca1e26e15a7e37da43be448ddece135b121b513 (this approach also preserves the invalid-length tests in test_kernel.cpp).

    I prefer the first approach in favor of a simpler API and don't see real downsides - except for potential UB in the case of unusual usages.

    I'm also wondering whether either approach would take this change out of scope for the current PR.


    stickies-v commented at 11:39 AM on May 28, 2026:

    If raw_block_header_len is no longer useful, can't we just drop it and use const unsigned char raw_block_header[80] instead of asserting on it: stringintech@7f8d559?

    In C, an array function parameter decays to a pointer so its length information is lost. At that point, it's basically equivalent to documentation, but without the option to assert invalid lengths in the impl. For example, the below diff compiles without warnings for me:

    <details> <summary>git diff on 7f8d559706</summary>

    diff --git a/src/kernel/bitcoinkernel.cpp b/src/kernel/bitcoinkernel.cpp
    index 6d1bec001d..018a1088ed 100644
    --- a/src/kernel/bitcoinkernel.cpp
    +++ b/src/kernel/bitcoinkernel.cpp
    @@ -1375,7 +1375,7 @@ int btck_chain_contains(const btck_Chain* chain, const btck_BlockTreeEntry* entr
         return btck_Chain::get(chain).Contains(btck_BlockTreeEntry::get(entry)) ? 1 : 0;
     }
     
    -btck_BlockHeader* btck_block_header_create(const unsigned char raw_block_header[80])
    +btck_BlockHeader* btck_block_header_create(const unsigned char raw_block_header[5])
     {
         auto header{std::make_unique<CBlockHeader>()};
         SpanReader stream{std::span{raw_block_header, 80}};
    diff --git a/src/kernel/bitcoinkernel.h b/src/kernel/bitcoinkernel.h
    index 96cb5abcea..430f986f0f 100644
    --- a/src/kernel/bitcoinkernel.h
    +++ b/src/kernel/bitcoinkernel.h
    @@ -1868,7 +1868,7 @@ BITCOINKERNEL_API void btck_block_hash_destroy(btck_BlockHash* block_hash);
      * [@return](/bitcoin-bitcoin/contributor/return/)                      The allocated btck_BlockHeader, or null on error.
      */
     BITCOINKERNEL_API btck_BlockHeader* BITCOINKERNEL_WARN_UNUSED_RESULT btck_block_header_create(
    -    const unsigned char raw_block_header[80]) BITCOINKERNEL_ARG_NONNULL(1);
    +    const unsigned char raw_block_header[5]) BITCOINKERNEL_ARG_NONNULL(1);
     
     /**
      * [@brief](/bitcoin-bitcoin/contributor/brief/) Copy a btck_BlockHeader.
    
    

    </details>

    An alternative would be to keep it but treat an invalid length as a recoverable error, as in stringintech@eca1e26 (this approach also preserves the invalid-length tests in test_kernel.cpp).

    I think it best adheres to the current philosophy where we assert on preconditions that are documented in the interface, to minimize the API surface the developer needs to understand and implement correctly. Until we agree on changing that approach in general, I think sticking to it makes sense.


    stringintech commented at 12:01 PM on May 28, 2026:

    In C, an array function parameter decays to a pointer, so its length information is lost.

    Yes, but we used it for btck_block_hash_create, and the implicit 32-byte length contract seemed acceptable there.

    I think it best adheres to the current philosophy where we assert on preconditions ...

    Fair enough. I'll include the fixed length check in the assertion for now.


    janb84 commented at 12:04 PM on May 28, 2026:

    An alternative would be to keep it but treat an invalid length as a recoverable error, as in stringintech@eca1e26 (this approach also preserves the invalid-length tests in test_kernel.cpp).

    I think it best adheres to the current philosophy where we assert on preconditions that are documented in the interface, to minimize the API surface the developer needs to understand and implement correctly. Until we agree on changing that approach in general, I think sticking to it makes sense.

    I agree with this. There was also a relevant HN post about the "weirdness" of C arrays

  5. stickies-v commented at 4:10 PM on May 18, 2026: contributor

    Approach ACK

  6. carloantinarella commented at 7:30 PM on May 19, 2026: contributor

    I see there is also the function btck_chainstate_manager_options_create, which handles argument validation slightly differently and returns nullptr when, for example, data_dir == nullptr && data_dir_len > 0 (i.e., a programmer error). A suggestion could be:

    - if (data_dir == nullptr || data_dir_len == 0 || blocks_dir == nullptr || blocks_dir_len == 0) {
    + assert(data_dir != nullptr || data_dir_len == 0);
    + assert(blocks_dir != nullptr || blocks_dir_len == 0);
    + if (data_dir_len == 0 || blocks_dir_len == 0) {
    

    I don't know if it is considered out of scope for this PR.

  7. sedited added this to a project on May 22, 2026
  8. github-project-automation[bot] changed the project status on May 22, 2026
  9. janb84 commented at 7:18 PM on May 26, 2026: contributor

    Concept ACK 4c6d65317ee5a430e81406ef297b503696542093

    The changes in this PR bring the _create functions more inline with the rest of the API.

    BUT ! i'm unsure that this is the correct way to proceed. The current assert() way is common for C api's but mostly for internal use (Linux Kernel) This is NDEBUG-sensitive. We do not control the build environment down stream so we do not control if NDEBUG is set or not.

    So ACK for consistency. ~0 for direction.

    Small NIT, change the documentation in the headerfile to match the changes, make it explicit what is not allowed / breaks the lib.

                                    when raw_transaction_len > 0;
    
  10. stringintech commented at 8:08 AM on May 27, 2026: contributor

    Yes, the main goal here is consistency with the rest of the API. I'm not sure if asserts are the best long-term solution, but even when building outside of Bitcoin Core's build system, setting NDEBUG results in a compile error: "Cannot compile without assertions!" enforced by util/check.h. So there's no scenario where the library builds but asserts are silently disabled? And even if that weren't the case, asserts are used throughout the internal code and I think removing them from the C API alone wouldn't help.

    On the doc nit: this feels like the same class as other fundamental invariants we don't document, like index out of bounds. A non-zero-length span with a null pointer is a basic contract violation and it seems adding it to every _create function's docs could add more noise than clarity. That said, happy to reconsider if others feel differently too or if I'm missing something!

  11. stickies-v commented at 9:58 AM on May 27, 2026: contributor

    BUT ! i'm unsure that this is the correct way to proceed.

    What is the better alternative here? The obvious one is returning error codes everywhere, but while that prevents users from running into UB without NDEBUG, it significantly increases API surface and complexity and the associated likelihood of (subtle) implementation errors. I don't think that's better (and it is largely why we have opted for the current API with asserts, iirc). The other of course is to switch to a canonical C++ interface.

  12. stringintech marked this as a draft on May 28, 2026
  13. stringintech commented at 8:04 AM on May 28, 2026: contributor

    Drafted while deciding on how/where to enforce buffer length in btck_block_header_create - #35312 (review).

  14. janb84 commented at 11:40 AM on May 28, 2026: contributor

    BUT ! i'm unsure that this is the correct way to proceed.

    What is the better alternative here? The obvious one is returning error codes everywhere, but while that prevents users from running into UB without NDEBUG, it significantly increases API surface and complexity and the associated likelihood of (subtle) implementation errors. I don't think that's better (and it is largely why we have opted for the current API with asserts, iirc). The other of course is to switch to a canonical C++ interface.

    Well IMHO, there's a middle path between error codes and asserts()

    Looking at proven, well-known C libraries; secp256k1 uses illegal_callback + ARG_CHECK to fire a diagnostic and return an error without aborting; SQLite's SQLITE_ENABLE_API_ARMOR returns a recoverable error code instead of crashing; OpenSSL pushes to a thread-local error queue so callers get diagnostics without needing a context parameter. The common ground here is (to me); precondition violations should be loud (errors/diagnostics) but not fatal.

    My (rough) example of this for this codebase is in this gist: https://gist.github.com/janb84/43cd1d026c88d2166598fb388387a2c2

  15. stickies-v commented at 1:03 PM on May 28, 2026: contributor

    My (rough) example of this for this codebase is in this gist:

    Yeah, I agree that using error callbacks could be a useful improvement, but it's not really part of the interface. In secp256k1, even with the callbacks, the program is still expected to crash, otherwise it's UB.

    And also, as per your example, these error callbacks would ideally not be part of the function signature.

    The direction of this PR is to keep contract violations out of the API. From what it seems, I think you agree with that, despite your earlier " ~0 for direction."?

  16. stringintech force-pushed on May 28, 2026
  17. stringintech commented at 1:25 PM on May 28, 2026: contributor

    Added fixed length buffer check assertion for btck_block_header_create per #35312 (review) by @stickies-v. Also, as pointed out by @carloantinarella in #35312 (comment), btck_chainstate_manager_options_create also returned nullptr for a violated documented precondition; an assertion was added for this one too to fail for both null/empty dirs.

  18. stringintech marked this as ready for review on May 28, 2026
  19. janb84 commented at 1:35 PM on May 28, 2026: contributor

    My (rough) example of this for this codebase is in this gist:

    Yeah, I agree that using error callbacks could be a useful improvement, but it's not really part of the interface. In secp256k1, even with the callbacks, the program is still expected to crash, otherwise it's UB.

    And also, as per your example, these error callbacks would ideally not be part of the function signature.

    The direction of this PR is to keep contract violations out of the API. From what it seems, I think you agree with that, despite your earlier " ~0 for direction."?

    secp256K1 crashes yes, SQLite does not, OpenSSL also does not crash. Whether the default should be abort or return-null is a design choice. It's about how friendly is our API going to be?

    My ~0 was more targeted on the usage of the Assert() (going into that direction) not on the return-null (contract violations out of the API).

    Resume; ACK for Keeping contract violations out of the API. ACK for constancy ~0 for assert() usage.

    Hope this clears things up.

  20. in src/kernel/bitcoinkernel.cpp:964 in 4314787461
     960 | @@ -965,10 +961,7 @@ btck_BlockValidationResult btck_block_validation_state_get_block_validation_resu
     961 |  
     962 |  btck_ChainstateManagerOptions* btck_chainstate_manager_options_create(const btck_Context* context, const char* data_dir, size_t data_dir_len, const char* blocks_dir, size_t blocks_dir_len)
     963 |  {
     964 | -    if (data_dir == nullptr || data_dir_len == 0 || blocks_dir == nullptr || blocks_dir_len == 0) {
     965 | -        LogError("Failed to create chainstate manager options: dir must be non-null and non-empty");
     966 | -        return nullptr;
     967 | -    }
     968 | +    assert(data_dir != nullptr && data_dir_len > 0 && blocks_dir != nullptr && blocks_dir_len > 0);
    


    sedited commented at 8:10 PM on May 28, 2026:

    I think I agree with tightening the other cases, but I'm not sure about this one. The null check seems fine, but I'm not sure about enforcing the length check. The documentation does say that the string must not be empty, but that does seem a bit brittle to me. A scenario where the string is directly piped through from user input sounds plausible to me.


    stickies-v commented at 8:35 PM on May 28, 2026:

    Are you suggesting we allow empty strings (which we shouldn't), or that we remove the non-empty from the documentation so it's no longer a contract violation but a runtime error?


    sedited commented at 8:39 PM on May 28, 2026:

    I have a slight preference for making it a runtime error, so I'm suggesting the latter, yes.


    stickies-v commented at 10:22 AM on May 29, 2026:

    Works for me. That also best aligns with the current docstring for BITCOINKERNEL_ARG_NONNULL, specifically:

    This attribute should only be used for arguments where a null pointer is unambiguously a programmer error, such as for opaque handles, and not for pointers to raw input data that might validly be null (e.g., from an empty std::span or std::string).

    Then I think the full header should be:

    /**
     * [@brief](/bitcoin-bitcoin/contributor/brief/) Create options for the chainstate manager.
     *
     * [@param](/bitcoin-bitcoin/contributor/param/)[in] context          Non-null, the created options and through it the chainstate manager will
     *                             associate with this kernel context for the duration of their lifetimes.
     * [@param](/bitcoin-bitcoin/contributor/param/)[in] data_directory   Path string of the directory containing the chainstate data. If the directory does not
     *                             exist yet, it will be created.
     * [@param](/bitcoin-bitcoin/contributor/param/)[in] blocks_directory Path string of the directory containing the block data. If the directory does not exist
     *                             yet, it will be created.
     * [@return](/bitcoin-bitcoin/contributor/return/)                     The allocated chainstate manager options, or null on error (e.g. if a path is invalid).
     */
    BITCOINKERNEL_API btck_ChainstateManagerOptions* BITCOINKERNEL_WARN_UNUSED_RESULT btck_chainstate_manager_options_create(
        const btck_Context* context,
        const char* data_directory,
        size_t data_directory_len,
        const char* blocks_directory,
        size_t blocks_directory_len) BITCOINKERNEL_ARG_NONNULL(1);
    
  21. sedited commented at 8:10 PM on May 28, 2026: contributor

    Approach ACK

  22. stringintech force-pushed on May 29, 2026
  23. stringintech commented at 11:47 AM on May 29, 2026: contributor

    Force pushed to address #35312 (review) by @sedited to allow recoverable error for empty directories passed to btck_chainstate_manager_options_create (I kept the null check assertions to only allow null when len is zero) and took @stickies-v's #35312 (review) for a more accurate documentation of the same function. Thanks!

  24. kernel: assert invalid buffer preconditions in `btck_*_create` functions
    Switch buffer `ptr == nullptr && len > 0` checks from `nullptr` returns to assertions. These checks represent invalid caller preconditions, not failures encountered while deserializing or constructing the requested object. `btck_block_header_create` additionally asserts the pre-existing documented length contract (must be 80 bytes).
    570a627640
  25. stringintech force-pushed on May 29, 2026
  26. DrahtBot added the label CI failed on May 29, 2026
  27. stringintech commented at 12:17 PM on May 29, 2026: contributor

    Updated commit body.

  28. DrahtBot removed the label CI failed on May 29, 2026
  29. stickies-v approved
  30. stickies-v commented at 1:59 PM on May 29, 2026: contributor

    ACK 570a6276400401f56ac5bb3a81c43e9553a3ee8a

  31. DrahtBot requested review from janb84 on May 29, 2026
  32. DrahtBot requested review from sedited on May 29, 2026
  33. janb84 commented at 3:24 PM on May 29, 2026: contributor

    ACK 570a6276400401f56ac5bb3a81c43e9553a3ee8a

    LGTM, Bikesheddng about changing the asserts() in something else we can do in a follow up (if needed)

  34. w0xlt commented at 10:08 PM on May 29, 2026: contributor

    ACK 570a6276400401f56ac5bb3a81c43e9553a3ee8a

  35. sedited approved
  36. sedited commented at 7:39 PM on June 1, 2026: contributor

    ACK 570a6276400401f56ac5bb3a81c43e9553a3ee8a

  37. sedited merged this on Jun 1, 2026
  38. sedited closed this on Jun 1, 2026

  39. github-project-automation[bot] changed the project status on Jun 1, 2026

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-06-10 04:51 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me