Bitcoin Core Lacks Block Processing Observability #34901

morozow opened this issue on March 23, 2026
  1. morozow commented at 11:31 am on March 23, 2026: none

    Problem

    Bitcoin Core’s block processing pipeline is a black box. When performance issues occur, there’s no way to answer basic questions:

    • Which peer announced the block first?
    • Why did we request from peer X instead of peer Y?
    • Which peer is stalling our download?
    • How long did each stage take?

    Impact

    For operators:

    • Can’t diagnose slow IBD
    • Can’t identify problematic peers
    • No visibility into tip-following performance

    For researchers:

    • Can’t study block propagation without custom patches
    • No data for peer selection optimization
    • Can’t measure BIP152 compact block efficiency

    For developers:

    • Can’t measure impact of P2P changes
    • No baseline for optimization work
    • Blind to real-world performance patterns

    Proposed Solution: Zero-Cost Observability Layer

    Add structured event emission at key points in block processing:

    sequenceDiagram
        participant P as Peer
        participant N as Node
        participant O as Observability

        P->>N: Block announced
        N->>O: BlockAnnounce
        N->>O: RequestDecision
        N->>O: BlockInFlight

        alt Timeout
            N->>O: StallerDetected
        end

        P->>N: Block received
        N->>O: BlockValidated
    

    Key Requirement: Zero Performance Cost

    Observability must not degrade node performance. Achieved through:

    • Async event processing (background thread)
    • Lock-free bounded queue
    • Fail-open design (drop events if queue full)
    • <100μs callback budget

    Measured result: no measurable overhead (benchmark shows -5.95%, which is within variance)
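The fail-open policy in the list above can be illustrated with a minimal sketch. This is not the proposed implementation: the class name `FailOpenEventQueue`, its `sink` parameter, and the default queue size are assumptions made for illustration, and Python's `queue.Queue` uses locks internally, so only the bounded, drop-on-full, background-thread behaviour is shown, not lock-freedom.

```python
import queue
import threading


class FailOpenEventQueue:
    """Hypothetical bounded queue that drops events instead of blocking the producer."""

    def __init__(self, sink, maxsize=1024):
        self._queue = queue.Queue(maxsize=maxsize)
        self._sink = sink      # callable run on the background thread
        self.dropped = 0       # events discarded because the queue was full
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def emit(self, event):
        """Called from the hot path: never blocks and never raises on a full queue."""
        try:
            self._queue.put_nowait(event)
            return True
        except queue.Full:
            self.dropped += 1  # fail open: lose the event, keep the caller running
            return False

    def _drain(self):
        while True:
            event = self._queue.get()
            if event is None:  # sentinel posted by close()
                break
            try:
                self._sink(event)
            except Exception:
                pass           # sink errors must not reach the producer

    def close(self):
        """Let the worker finish the queued events, then stop it."""
        self._queue.put(None)
        self._worker.join()
```

The `dropped` counter is only exact with a single producer thread; a real implementation would need an atomic counter.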

    Event Schema

    BlockAnnounceEvent

    {
      "ts_us": 1711180800000000,
      "event": "block_announce",
      "hash": "000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f",
      "peer_id": 42,
      "via": "headers",
      "height": 800000
    }
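As a rough illustration of how such events could answer "which peer announced the block first?", the sketch below scans an NDJSON stream and keeps the earliest `block_announce` per block hash. The function name `first_announcers` is hypothetical; only the fields shown in the sample event are assumed to exist.

```python
import json


def first_announcers(ndjson_lines):
    """Map each block hash to the (ts_us, peer_id) of its earliest announcement."""
    first = {}
    for line in ndjson_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the stream
        event = json.loads(line)
        if event.get("event") != "block_announce":
            continue  # ignore other event types
        candidate = (event["ts_us"], event["peer_id"])
        block_hash = event["hash"]
        if block_hash not in first or candidate < first[block_hash]:
            first[block_hash] = candidate
    return first
```

Events are compared by timestamp rather than by line order, since an asynchronous writer may not preserve strict ordering.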
    

    StallerDetectedEvent

    {
      "ts_us": 1711180802500000,
      "event": "staller_detected",
      "hash": "000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f",
      "staller_peer_id": 42,
      "waiting_peer_id": 15,
      "stall_duration_us": 2400000
    }
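A consumer might decode this event into a typed record before analysis. The sketch below assumes only the fields shown in the sample; the class name `StallerDetected` and its `from_json` helper are hypothetical and not part of the proposal.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class StallerDetected:
    """Hypothetical typed view of one staller_detected NDJSON line."""
    ts_us: int
    block_hash: str
    staller_peer_id: int
    waiting_peer_id: int
    stall_duration_us: int

    @classmethod
    def from_json(cls, line):
        raw = json.loads(line)
        if raw.get("event") != "staller_detected":
            raise ValueError(f"unexpected event type: {raw.get('event')!r}")
        return cls(
            ts_us=int(raw["ts_us"]),
            block_hash=raw["hash"],
            staller_peer_id=int(raw["staller_peer_id"]),
            waiting_peer_id=int(raw["waiting_peer_id"]),
            stall_duration_us=int(raw["stall_duration_us"]),
        )

    @property
    def stall_seconds(self):
        # Durations in the schema are microseconds.
        return self.stall_duration_us / 1_000_000
```

For the sample event above, `stall_seconds` is 2.4.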
    

    Use Cases

    1. IBD Performance Analysis

    Problem: IBD taking 12 hours instead of 6
    Analysis: Event stream shows peer 42 stalling 30% of blocks
    Action: Investigate peer 42's connection, consider deprioritizing
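The analysis step in this use case could be scripted along the following lines. This is a sketch under assumptions: it uses only the two event types whose schemas appear above, treats every block hash seen in the stream as one block, and the function name `stall_share_by_peer` is hypothetical.

```python
import json
from collections import defaultdict


def stall_share_by_peer(ndjson_lines):
    """Return {peer_id: fraction of observed blocks on which that peer stalled}."""
    blocks = set()              # every block hash seen in the stream
    stalled = defaultdict(set)  # staller peer id -> block hashes it stalled
    for line in ndjson_lines:
        if not line.strip():
            continue
        event = json.loads(line)
        kind = event.get("event")
        if kind == "block_announce":
            blocks.add(event["hash"])
        elif kind == "staller_detected":
            blocks.add(event["hash"])
            stalled[event["staller_peer_id"]].add(event["hash"])
    if not blocks:
        return {}
    return {peer: len(hashes) / len(blocks) for peer, hashes in stalled.items()}
```

Counting distinct block hashes rather than raw events keeps a peer that is flagged repeatedly for the same block from being over-counted.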
    

    2. Peer Selection Research

    Question: Is current peer selection optimal?
    Method: Correlate RequestDecision events with actual delivery times
    Finding: Peers with faster announcements deliver 40% faster
    

    3. Compact Block Efficiency

    Question: How effective is BIP152?
    Method: Analyze CompactBlockDecision events
    Result: 85% reconstructed locally, 15% fallback to full block
    

    Requirements

    1. Zero overhead - Must not degrade performance
    2. Fail-open - Errors in observability don’t affect node
    3. Structured output - Machine-parseable NDJSON
    4. Optional - Disabled by default (-stdiobus=off)
    5. Extensible - Easy to add new event types

    Extensibility Path

    This observability layer is a foundation:

    Phase  Extension      Value
    3      RPC hooks      Measure RPC latency under P2P load
    4      Mempool hooks  TX admission analysis
    5      Active mode    Data-driven optimizations
    6      Security       Controlled fault injection

    Each phase uses the same infrastructure: add events, no new overhead.

    Implementation

    • StdioBusHooks interface with event callbacks
    • NoOpStdioBusHooks default (zero overhead when disabled)
    • StdioBusSdkHooks implementation with async queue
    • Injection via PeerManager::Options
    • CLI: -stdiobus=off|shadow
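The hook-injection pattern listed above can be sketched in a language-neutral way. The issue does not show the C++ signatures of `StdioBusHooks` or `PeerManager::Options`, so every name below (`BlockEventHooks`, `NoOpHooks`, `RecordingHooks`, `PeerOptions`, `announce_block`) is a hypothetical stand-in that only illustrates the shape: an interface, a no-op default, and an implementation supplied through an options object.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


class BlockEventHooks(ABC):
    """Hypothetical callback interface for block-processing events."""

    @abstractmethod
    def on_block_announce(self, block_hash, peer_id, via):
        ...


class NoOpHooks(BlockEventHooks):
    """Default used when observability is off: every callback does nothing."""

    def on_block_announce(self, block_hash, peer_id, via):
        pass


class RecordingHooks(BlockEventHooks):
    """Stand-in for an SDK-backed implementation; here it only collects events."""

    def __init__(self):
        self.events = []

    def on_block_announce(self, block_hash, peer_id, via):
        self.events.append(
            {"event": "block_announce", "hash": block_hash, "peer_id": peer_id, "via": via}
        )


@dataclass
class PeerOptions:
    """Hypothetical options object; hooks default to the no-op implementation."""
    hooks: BlockEventHooks = field(default_factory=NoOpHooks)


def announce_block(options, block_hash, peer_id, via):
    """Call site sketch: processing code only ever talks to the interface."""
    options.hooks.on_block_announce(block_hash, peer_id, via)
```

In this sketch a disabled hook still costs one method call; whether the real no-op path compiles down to nothing is a claim of the proposal, not something the sketch demonstrates.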

    Backward Compatibility

    • Default: disabled (-stdiobus=off)
    • No new dependencies for default build
    • Optional static library for SDK

    Describe any alternatives you’ve considered

    No response

    Please leave any additional context

    No response

  2. morozow added the label Feature on Mar 23, 2026
  3. sedited commented at 12:40 pm on March 23, 2026: contributor
    Bitcoin Core already supports tracing through USDT. This (and your pull request) reads as heavily LLM-assisted. Can you describe in your own words what novel approach this brings? See the existing tracing docs here: https://github.com/bitcoin/bitcoin/blob/master/doc/tracing.md
  4. maflcko commented at 2:35 pm on March 23, 2026: member
    Yes, either USDT tracepoints, or debug logging can be used here. Closing for now, but a new issue can be created, if there is still need. Please make sure to not use an LLM.
  5. maflcko closed this on Mar 23, 2026

  6. morozow commented at 4:21 pm on March 23, 2026: none
    This introduces an observability layer with zero performance cost and no parallelised core execution overhead. It enables monitoring, transaction tracing, and telemetry in a lightweight way. USDT requires Linux + eBPF + root. This proposal targets cross-platform JSON output for simpler tooling integration. Different use case.
  7. maflcko commented at 4:43 pm on March 23, 2026: member

    The debug log is already cross platform and easy to parse. Also, it is trivial to run a Linux VM on any other OS.

    Such a massive change that you are proposing needs a good motivation. Though, you are asking for a use-case yourself in #34898 (comment), which makes the motivation even weaker.

  8. morozow commented at 4:58 pm on March 23, 2026: none
    The PR comment you mention depends on another P2P/RPC issue. The approach in this issue provides a low-level observability protocol aligned with zero-cost performance, independent of any monitoring platform's technology and maintenance, and it does not break any workflow because event processing is independent.

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-03-31 12:13 UTC
