test, tracing: don’t use problematic bpf_usdt_readarg_p() #31848

pull 0xB10C wants to merge 3 commits into bitcoin:master from 0xB10C:2025-01-dont-use-bpf_usdt_readarg_p changing 8 files +115 −59
  1. 0xB10C commented at 4:07 pm on February 12, 2025: contributor

    Instead of using the undocumented bcc helper bpf_usdt_readarg_p(), use bpf_usdt_readarg() and bpf_probe_read_user()/bpf_probe_read_user_str() as documented in the bcc USDT reference guide.

    Note that the bpf_probe_read_user() documentation says the following:

    For safety, all user address space memory reads must pass through bpf_probe_read_user().

    It’s assumed that using bpf_usdt_readarg_p() caused a lifetime issue. With bpf_usdt_readarg() and bpf_probe_read_user(), this doesn’t seem to be a problem anymore.

    This allows to revert https://github.com/bitcoin/bitcoin/commit/faed5337435f025811caeb5f782ecbf9683a24b3 and closes #27380.

  2. test: don't use bpf_usdt_readarg_p
    Instead of using the undocumented bcc helper bpf_usdt_readarg_p(),
    use bpf_usdt_readarg() [1] and bpf_probe_read_user{_str}() [2, 3] as
    documented in the bcc USDT reference guide [1].
    
    Note that the bpf_probe_read_user() documentation says the following:
    
    > For safety, all user address space memory reads must pass through bpf_probe_read_user().
    
    It's assumed that using bpf_usdt_readarg_p() caused a lifetime issue.
    See https://github.com/bitcoin/bitcoin/issues/27380#issuecomment-2286505348
    With bpf_usdt_readarg() and bpf_probe_read_user(), this doesn't seem
    to be a problem anymore. See https://github.com/bitcoin/bitcoin/issues/27380#issuecomment-2528671656
    
    [1]: https://github.com/iovisor/bcc/blob/master/docs/reference_guide.md#6-usdt-probes
    [2]: https://github.com/iovisor/bcc/blob/master/docs/reference_guide.md#10-bpf_probe_read_user
    [3]: https://github.com/iovisor/bcc/blob/master/docs/reference_guide.md#11-bpf_probe_read_user_str
    35ae6ff60f
  3. contrib: don't use bpf_usdt_readarg_p ec47ba349d
  4. Revert "test: Disable known broken USDT test for now"
    This reverts commit faed5337435f025811caeb5f782ecbf9683a24b3.
    
    This commit worked around a lifetime issue likely caused by using
    bpf_usdt_readarg_p(). Since we don't use bpf_usdt_readarg_p() anymore
    this commit can be reverted.
    
    See the discussion in https://github.com/bitcoin/bitcoin/issues/27380#issuecomment-2500962838
    a0b66b4bff
  5. DrahtBot commented at 4:07 pm on February 12, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31848.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Approach ACK willcl-ark

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

  6. 0xB10C commented at 4:14 pm on February 12, 2025: contributor
    I did >15 re-runs of https://github.com/bitcoin/bitcoin/compare/master...0xB10C:bitcoin:2024-12-dont-use-bpf_usdt_readarg_p-testing similar to the 20 runs in #27380 (comment) to make sure this won’t intermittently fail again.
  7. glozow added the label Tests on Feb 12, 2025
  8. laanwj requested review from laanwj on Feb 12, 2025
  9. willcl-ark commented at 11:29 am on February 13, 2025: member

    Approach ACK

    So if I understand the change correctly, the new (documented/recommended) behaviour is to always first read_arg() which gets the memory location of the data, then read_user[_str]() which then actually performs the copy from user memory to the BPF memory space.

    If I have this correct, then using the documented, seemingly more-reliable, and more explicit approach seems fine to me.

    Not sure how I could test the reliability of this myself, but #31848 (comment) looks promising.

  10. 0xB10C commented at 1:14 pm on February 13, 2025: contributor

    So if I understand the change correctly, the new (documented/recommended) behaviour is to always first read_arg() which gets the memory location of the data, then read_user{_str}() which then actually performs the copy from user memory to the BPF memory space.

    Correct. Here’s an example from the mempool:added tracepoint test.

    • (1) initialize a null-pointer phash
    • (2) read the first tracepoint argument (a user-space address) into the phash pointer
    • (3) copy the value behind the phash pointer into added.hash
    0  void *phash = NULL;                                              // (1)
    1  bpf_usdt_readarg(1, ctx, &phash);                                // (2)
    2  bpf_probe_read_user(&added.hash, sizeof(added.hash), phash);     // (3)
    

    bpf_usdt_readarg_p() is a helper that does this, but apparently not safely enough

  11. laanwj commented at 2:52 pm on February 13, 2025: member

    bpf_usdt_readarg_p() is a helper that does this, but apparently not safely enough

    i looked into it a bit; the implementation of bpf_usdt_readarg_p is here: https://github.com/iovisor/bcc/blob/master/src/cc/frontends/clang/b_frontend_action.cc#L1270

    It’s a macro implemented in the clang frontend:

     0        } else if (Decl->getName() == "bpf_usdt_readarg_p") {
     1          text = "({ u64 __addr = 0x0; ";
     2          text += "_bpf_readarg_" + current_fn_ + "_" + args[0] + "(" +
     3                  args[1] + ", &__addr, sizeof(__addr));";
     4
     5          bool overlap_addr = false;
     6          text += check_bpf_probe_read_user(StringRef("bpf_probe_read_user"),
     7                  overlap_addr);
     8          if (overlap_addr) {
     9            error(GET_BEGINLOC(Call), "bpf_probe_read_user not found. Use latest kernel");
    10            return false;
    11          }
    12
    13          text += "(" + args[2] + ", " + args[3] + ", (void *)__addr);";
    14          text += "})";
    15          rewriter_.ReplaceText(expansionRange(Call->getSourceRange()), text);
    16        }
    

    At first glance it generates the exact same code: a bpf_readarg folllowed by bpf_probe_read_user of that address. So it’s unclear why the explicit formulation would change anything.

    However, check_bpf_probe_read_user does not always return bpf_probe_read_user! If the symbol cannot be resolved for the specific kernel it returns bpf_probe_read instead. Maybe that’s it, and calling check_bpf_probe_read_user works around it? Not sure.

    On the other hand there’s also some #defining going on which does the same (but not sure it’s under the same conditions).


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-02-22 06:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me