USDTs are tracepoints that behave similarly to trace logging (no performance impact when unused), but they can also pass data into scripts running in the Linux kernel VM (eBPF) for collection and further evaluation. The required build system changes and a tracing framework for User Statically-Defined Tracing (USDT) were merged in #19866, and some additional discussion took place in #20960. Simpler scripts, such as collecting metrics and printing them as a histogram, can be written with the bpftrace tool. More complex bcc scripts are written in C and loaded into the kernel with, for example, the bcc Python module, where the passed data can then be processed.
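To make the bcc workflow concrete, here is a minimal sketch of what such a script could look like. The probe name `inbound_message`, its two arguments (`peer_id`, `msg_size`), and the helper `run` are hypothetical and only illustrate the shape; no such tracepoint is claimed to exist yet. Running it would need root and the bcc Python module.

```python
# Sketch only: the probe name and its argument layout are hypothetical.
BPF_PROGRAM = r"""
#include <uapi/linux/ptrace.h>

// Fixed-size event handed to user space (16 bytes).
struct event_t {
    u64 peer_id;
    u64 msg_size;
};

BPF_PERF_OUTPUT(events);

// Runs in the kernel VM each time the (hypothetical) USDT fires.
int trace_inbound_message(struct pt_regs *ctx) {
    struct event_t ev = {};
    bpf_usdt_readarg(1, ctx, &ev.peer_id);   // first USDT argument
    bpf_usdt_readarg(2, ctx, &ev.msg_size);  // second USDT argument
    events.perf_submit(ctx, &ev, sizeof(ev));
    return 0;
}
"""


def run(pid: int) -> None:
    """Attach to the process `pid` and print events until interrupted."""
    # Imported lazily so this module can be loaded without bcc installed.
    from bcc import BPF, USDT

    usdt = USDT(pid=pid)
    usdt.enable_probe(probe="inbound_message", fn_name="trace_inbound_message")
    bpf = BPF(text=BPF_PROGRAM, usdt_contexts=[usdt])

    def handle_event(cpu, data, size):
        # Decode the raw bytes into the event_t layout declared above.
        ev = bpf["events"].event(data)
        print(f"peer={ev.peer_id} size={ev.msg_size}")

    bpf["events"].open_perf_buffer(handle_event)
    while True:
        try:
            bpf.perf_buffer_poll()
        except KeyboardInterrupt:
            break
```

Calling `run(pid)` as root with the PID of a process built with USDT support would print one line per event. The C part only copies two integers, so all further processing happens in Python.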
Following the core-dev-meeting discussion on 21st Jan 2021, the goal here is to list potentially interesting USDTs and what data they should pass. Each USDT should have:
- a clear use-case and motivation
- a semi-stable API (the order of arguments doesn’t change, so people can rely on it for scripting)
- no expensive operations added specifically for this USDT (no parsing, serialization, … - just passing of already available data)
- documentation in docs/tracing.md
- (ideally) a usage example as a bpftrace or bcc script
- (need to double check this) a maximum of 512 bytes passed (max stack size in bpftrace)
While some kind of testing of the USDT examples would be good, root privileges are required to load programs into the kernel VM. It might still be possible to run them in CI, though?
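One way this could work in CI is a test that skips itself when the environment cannot load BPF programs. The sketch below is an assumption about how such a guard might look, not an existing test; `can_load_bpf` and `UsdtExampleTest` are hypothetical names, and checking for UID 0 is a simplification (the relevant Linux capabilities are not inspected).

```python
import os
import unittest


def can_load_bpf() -> bool:
    """Return True if we run as root and the bcc module is importable."""
    # Simplification: treat "root" as "allowed to load BPF programs".
    if not hasattr(os, "geteuid") or os.geteuid() != 0:
        return False
    try:
        import bcc  # noqa: F401
    except ImportError:
        return False
    return True


class UsdtExampleTest(unittest.TestCase):
    @unittest.skipUnless(can_load_bpf(), "requires root and bcc")
    def test_kernel_accepts_bpf_program(self):
        from bcc import BPF

        # Compile a trivial program and load it into the kernel VM.
        # A real test would instead attach an example script to a USDT.
        bpf = BPF(text="int noop(void *ctx) { return 0; }")
        bpf.load_func("noop", BPF.KPROBE)
```

Run with `python -m unittest`; on an unprivileged CI runner the test is reported as skipped rather than failed.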
What could be useful USDTs? What data should they pass? Where should they be placed in the code?