net: Disallow invalid HeadersSyncState due to lagging clock #35351

pull hodlinator wants to merge 3 commits into bitcoin:master from hodlinator:pr/35208_alt changing 5 files +50 −14
  1. hodlinator commented at 7:56 PM on May 21, 2026: contributor

    Problem

    Headers presync computes m_max_commitments from the elapsed time since the chain-start MTP plus MAX_FUTURE_BLOCK_TIME. When the local system clock is more than MAX_FUTURE_BLOCK_TIME behind the chain-start MTP, that elapsed value is negative, but it is used in arithmetic assigned to the unsigned commitment cap. This can turn the intended zero bound into a large cap, letting low-work headers presync continue instead of aborting when a reasonable commitment cap would have been exceeded.

    Fix

    Instead of allowing an invalid HeadersSyncState object to be created, throw an exception from the constructor and shut down the node.

    Typically, the node will detect that the system clock is set too far in the past when comparing it to the chain tip during chain state loading and shut down before we start syncing headers. So in practice this is very unlikely to make a difference (might be possible if the system clock jumps backwards after we loaded the chain state).

    Commits

    • A regression test pinning the current behavior.
    • The fix, along with the corresponding test change.
    • The headers_sync_state fuzz target also now allows the same clock-skew case instead of always choosing a time at or after the chain-start MTP.

    Replaces #35208 which was clamping m_max_commitments to zero and then letting the HeadersSyncState consume headers until the block height either reached the the next commitment_period point and aborted, or reached the minimum work threshold and succeeded (possible when having been offline for >144 blocks).

  2. test: Cover future chain-start MTP boundary in headers presync
    When the local clock sits more than `MAX_FUTURE_BLOCK_TIME` behind the chain-start MTP, the elapsed value used to derive `m_max_commitments` is negative.
    The later division by the unsigned commitment period converts that negative value to a large unsigned bound.
    Presync ends up with A LOT of commitment slots, whereas it should abort.
    
    Pin that pre-fix behavior in a regression test that the subsequent fix will alter.
    
    Co-authored-by: Lőrinc <pap.lorinc@gmail.com>
    b0a2c0daf5
  3. net: Throw exception from HeadersSyncState when system clock is behind start block MTP
    We should not proceed syncing headers from peers when the local system clock is incorrectly set.
    
    A node with a system clock set too far back will typically fail early during startup when the chainstate detects the tip to be too far in the future. This means that in practice we don't expect the failure to ever happen in net_processing.cpp.
    One can provoke the std::abort() in net_processing.cpp to happen through adding a call to SetMockTime(1) right before creating the HeadersSyncState.
    
    Issue discovered by: Lőrinc <pap.lorinc@gmail.com>
    db9c704117
  4. fuzz: Make headerssync target also cover HeadersSyncState::SystemClockError
    Co-authored-by: Lőrinc <pap.lorinc@gmail.com>
    2cb8aa05be
  5. DrahtBot added the label P2P on May 21, 2026
  6. DrahtBot commented at 7:56 PM on May 21, 2026: contributor

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--006a51241073e994b41acfe9ec718e94-->

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/35351.

    <!--021abf342d371248e50ceaed478a90ca-->

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK w0xlt

    If your review is incorrectly listed, please copy-paste <code>&lt;!--meta-tag:bot-skip--&gt;</code> into the comment that the bot should ignore.

    <!--174a7506f384e20aa4161008e828411d-->

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #35114 (test: NodeClockContext follow-ups by seduless)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

    <!--5faf32d7da4f0f540f40219e4f7537a3-->

    LLM Linter (✨ experimental)

    Possible places where comparison-specific test macros should replace generic comparisons:

    • [src/test/headers_sync_chainwork_tests.cpp] BOOST_CHECK_THROW(CreateState(), HeadersSyncState::SystemClockError); -> Consider BOOST_CHECK_EXCEPTION with a predicate that verifies the exception message, so the test checks the specific failure reason instead of only the exception type.

    <sup>2026-05-21 19:57:01</sup>

  7. DrahtBot added the label CI failed on May 21, 2026
  8. DrahtBot removed the label CI failed on May 22, 2026
  9. w0xlt commented at 2:01 PM on May 22, 2026: contributor

    Concept ACK

  10. hodlinator commented at 1:37 PM on May 27, 2026: contributor

    Been thinking more about this during the weekend.

    One could posit that Bitcoin nodes should be more shelf stable in a post-apocalyptic future where NTP servers are unreachable. Re-establishing a shared accurate definition of current UTC time just from observing the sun's position could be challenging.

    In that kind of scenario, preventing the node from continuing, as this PR does, could be problematic.

    So should we do like #35208 and just set m_max_commitments to 0 instead? It would still be pretty unforgiving if start-height is very close to a multiple of the commitment_period - maybe allowing at least 1 commitment could be argued? (commitment_period is currently 641, giving an average of 961 blocks before aborting if we start at a random height).

    Although, if clocks in the network disagree too much about UTC time, block propagation would also be suffering due to the rule to not accept blocks from too far into the future (MAX_FUTURE_BLOCK_TIME/2h). The current check failing would also mean that the we somehow ended up with a starting block read from disk that is dated more than 2h in the future, meaning our clock is not just out of sync with UTC but has also jumped backwards since we accepted the starting block. So I think the current approach in the PR is consistent with the rest of the node.

  11. in src/headerssync.cpp:45 in db9c704117
      42 | -                                       + MAX_FUTURE_BLOCK_TIME};
      43 | +    const NodeClock::time_point now{NodeClock::now()};
      44 | +    const int64_t max_seconds_since_start{Ticks<std::chrono::seconds>(now - NodeSeconds{std::chrono::seconds{chain_start.GetMedianTimePast()}})
      45 | +                                          + MAX_FUTURE_BLOCK_TIME};
      46 | +    if (max_seconds_since_start < 0) {
      47 | +        throw SystemClockError{strprintf(
    


    l0rinc commented at 12:31 PM on June 6, 2026:
  12. in src/test/fuzz/headerssync.cpp:1 in 2cb8aa05be


    l0rinc commented at 12:46 PM on June 6, 2026:

    The PR description should not imply that the time has to be off compared to genesis, the actual condition is relative to the known block, so it can be a recent local tip/header, not genesis. Timezone being off likely does not change local UTC, but if the actual system clock jumps backward, the condition can happen if the headers connect to a known block whose MTP is more than 2 hours ahead of the wrong local time.

  13. in src/headerssync.h:106 in db9c704117
      98 | @@ -99,8 +99,13 @@ struct CompressedHeader {
      99 |   * sync (temporary, per-peer storage).
     100 |   */
     101 |  
     102 | -class HeadersSyncState {
     103 | +class HeadersSyncState
     104 | +{
     105 |  public:
     106 | +    struct SystemClockError : std::runtime_error {
     107 | +        using std::runtime_error::runtime_error;
    


    l0rinc commented at 12:52 PM on June 6, 2026:

    db9c704 net: Throw exception from HeadersSyncState when system clock is behind start block MTP:

    diff --git a/src/headerssync.h b/src/headerssync.h
    --- a/src/headerssync.h	(revision 39798b46d08ff642ed59b84a6eb2ee911c070c94)
    +++ b/src/headerssync.h	(revision 1675b824feaf899e16c66ccff74c8d65bc3bf182)
    @@ -15,6 +15,7 @@
     #include <util/hasher.h>
     
     #include <deque>
    +#include <stdexcept>
     #include <vector>
     
     // A compressed CBlockHeader, which leaves out the prevhash
    

    Not important if my other suggestions are taken.

  14. in src/test/fuzz/headerssync.cpp:73 in 2cb8aa05be
      69 | @@ -70,11 +70,18 @@ FUZZ_TARGET(headers_sync_state, .init = initialize_headers_sync_state_fuzz)
      70 |          .redownload_buffer_size = fuzzed_data_provider.ConsumeIntegralInRange<size_t>(0, Params().HeadersSync().redownload_buffer_size * 2),
      71 |      };
      72 |      arith_uint256 min_work{UintToArith256(ConsumeUInt256(fuzzed_data_provider))};
      73 | -    FuzzedHeadersSyncState headers_sync(
      74 | -        params,
      75 | -        /*commit_offset=*/fuzzed_data_provider.ConsumeIntegralInRange<size_t>(0, params.commitment_period - 1),
      76 | -        /*chain_start=*/start_index,
      77 | -        /*minimum_required_work=*/min_work);
      78 | +    std::unique_ptr<FuzzedHeadersSyncState> headers_sync;
    


    l0rinc commented at 12:59 PM on June 6, 2026:

    The target only needs to handle a constructor path that may throw (not sure why we'd want to do an extra heap allocation here). An std::optional seems like a better fit for signalling that:

    diff --git a/src/test/fuzz/headerssync.cpp b/src/test/fuzz/headerssync.cpp
    --- a/src/test/fuzz/headerssync.cpp	(revision 09a5ecc777aa6490e094a6283a4f8d3ae021989d)
    +++ b/src/test/fuzz/headerssync.cpp	(revision 1d58a9da6ac133f4f18fd8b11814d73bbb019215)
    @@ -72,9 +72,9 @@
             .redownload_buffer_size = fuzzed_data_provider.ConsumeIntegralInRange<size_t>(0, Params().HeadersSync().redownload_buffer_size * 2),
         };
         arith_uint256 min_work{UintToArith256(ConsumeUInt256(fuzzed_data_provider))};
    -    std::unique_ptr<FuzzedHeadersSyncState> headers_sync;
    +    std::optional<FuzzedHeadersSyncState> headers_sync;
         try {
    -        headers_sync = std::make_unique<FuzzedHeadersSyncState>(
    +        headers_sync.emplace(
                 params,
                 /*commit_offset=*/fuzzed_data_provider.ConsumeIntegralInRange<size_t>(0, params.commitment_period - 1),
                 /*chain_start=*/start_index,
    

    But we likely shouldn't need this new try/catch in the first place, we should rather do the validation outside and pass in a valid time instead.

  15. in src/test/headers_sync_chainwork_tests.cpp:260 in 2cb8aa05be
     252 | @@ -252,4 +253,10 @@ BOOST_AUTO_TEST_CASE(too_little_work)
     253 |          /*exp_locator_hash=*/std::nullopt);
     254 |  }
     255 |  
     256 | +BOOST_AUTO_TEST_CASE(system_clock_lagging_behind_chain_start)
     257 | +{
     258 | +    const NodeClockContext clock_ctx{std::chrono::seconds{genesis.GetBlockTime() - MAX_FUTURE_BLOCK_TIME - 1}};
     259 | +    BOOST_CHECK_THROW(CreateState(), HeadersSyncState::SystemClockError);
     260 | +}
    


    l0rinc commented at 1:58 PM on June 6, 2026:

    Could the regression coverage pin the exact MAX_FUTURE_BLOCK_TIME boundary?

    BOOST_AUTO_TEST_CASE(system_clock_lagging_behind_chain_start)
    {
        const test_only_CheckFailuresAreExceptionsNotAborts mock_checks{};
        NodeClockContext clock_ctx{(genesis.GetBlockTime() - MAX_FUTURE_BLOCK_TIME) * 1s};
        BOOST_CHECK_NO_THROW(CreateState());
    
        clock_ctx -= 1s;
        BOOST_CHECK_EXCEPTION(CreateState(), NonFatalCheckError, HasReason{"Internal bug detected: max_seconds_since_start >= 0"});
    }
    
  16. in src/net_processing.cpp:2795 in 2cb8aa05be
    2788 | @@ -2789,8 +2789,17 @@ bool PeerManagerImpl::TryLowWorkHeadersSync(Peer& peer, CNode& pfrom, const CBlo
    2789 |              // of headers is known, some header in this set must be new, so
    2790 |              // advancing to the first unknown header would be a small effect.
    2791 |              LOCK(peer.m_headers_sync_mutex);
    2792 | -            peer.m_headers_sync.reset(new HeadersSyncState(peer.m_id, m_chainparams.GetConsensus(),
    2793 | -                m_chainparams.HeadersSync(), chain_start_header, minimum_chain_work));
    2794 | +            try {
    2795 | +                peer.m_headers_sync.reset(new HeadersSyncState(peer.m_id, m_chainparams.GetConsensus(),
    2796 | +                    m_chainparams.HeadersSync(), chain_start_header, minimum_chain_work));
    2797 | +            } catch (const HeadersSyncState::SystemClockError& e) {
    


    l0rinc commented at 2:04 PM on June 6, 2026:

    We're using the exception here for control flow.

    Could the local clock value be captured and validated before HeadersSyncState so it no longer needs a recoverable exception type? We could also notify through fatalError() when the local clock is too far behind (passing the same now value into HeadersSyncState).

    This way we could do the validation before the call in PeerManagerImpl::TryLowWorkHeadersSync:

    const auto now{Now<NodeSeconds>()};
    if (now < NodeSeconds{(chain_start_header.GetMedianTimePast() - MAX_FUTURE_BLOCK_TIME) * 1s}) {
        m_chainman.GetNotifications().fatalError(Untranslated("System clock too far behind chain start MTP."));
        headers = {};
        return true;
    }
    LOCK(peer.m_headers_sync_mutex);
    peer.m_headers_sync.reset(new HeadersSyncState{peer.m_id, m_chainparams.GetConsensus(), m_chainparams.HeadersSync(), chain_start_header, minimum_chain_work, now});
    
    

    and HeadersSyncState::HeadersSyncState could remain just:

    const int64_t max_seconds_since_start{Ticks<std::chrono::seconds>(now - NodeSeconds{chain_start.GetMedianTimePast() * 1s}) + MAX_FUTURE_BLOCK_TIME};
    Assert(max_seconds_since_start >= 0);
    m_max_commitments = 6 * max_seconds_since_start / m_params.commitment_period;
    

    See https://github.com/l0rinc/bitcoin/pull/186/changes#diff-6875de769e90cec84d2e8a9c1b962cdbcda44d870d42e4215827e599e11e90e3R2792-R2797 for the whole implementation

  17. l0rinc changes_requested
  18. l0rinc commented at 6:31 PM on June 6, 2026: contributor

    I'm not sure this is going in the right direction, seems weird to discover time drift in the constructor and catch on use site - instead of sanitizing the value before call. Please see https://github.com/l0rinc/bitcoin/pull/186/commits for how I imagined simplifying this a lot more.

  19. DrahtBot added the label Needs rebase on Jun 9, 2026
  20. DrahtBot commented at 1:35 PM on June 9, 2026: contributor

    <!--cf906140f33d8803c4a75a2196329ecb-->

    🐙 This pull request conflicts with the target branch and needs rebase.


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-06-11 10:51 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me