doc: Amend notes on benchmarking

maflcko commented at 11:46 am on January 20, 2025: member

This gives some more context on the motivation and larger picture of benchmarks.

DrahtBot commented at 11:46 am on January 20, 2025: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31690.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	darosior, l0rinc, brunoerg, instagibbs
Concept ACK	tdb3
Stale ACK	furszy, kevkevinpal

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

maflcko renamed this:
~~[doc] Amend notes on benchmarking~~
doc: Amend notes on benchmarking
on Jan 20, 2025

DrahtBot added the label Docs on Jan 20, 2025

in doc/benchmarking.md:72 in 61281c72cf outdated

73+too much to justify the improvement.
74+
75+Benchmarks are ill-suited for testing of denial of service issues as they are
76+restricted to the same input set (introducing bias). [Fuzz
77+tests](/doc/fuzzing.md) are better suited for this purpose, as they are much
78+better (and aimed) at exploring the possible input space.

l0rinc commented at 12:01 pm on January 20, 2025:

0tests](/doc/fuzzing.md) are better suited for this purpose, as they are specifically
1aimed at exploring the possible input space.

in doc/benchmarking.md:69 in 61281c72cf outdated

70+the user facing functions mentioned above. If a clear end-to-end performance
71+improvement cannot be demonstrated, the pull request is likely to be rejected.
72+The change might also be rejected if the code bloat or review/maintenance is
73+too much to justify the improvement.
74+
75+Benchmarks are ill-suited for testing of denial of service issues as they are

l0rinc commented at 12:02 pm on January 20, 2025:

Seems simpler to remove the leading “of”:

0Benchmarks are ill-suited for testing denial-of-service issues, as they are

in doc/benchmarking.md:67 in 61281c72cf outdated

68+
69+Benchmark additions and performance improvements are valuable if they target
70+the user facing functions mentioned above. If a clear end-to-end performance
71+improvement cannot be demonstrated, the pull request is likely to be rejected.
72+The change might also be rejected if the code bloat or review/maintenance is
73+too much to justify the improvement.

l0rinc commented at 12:03 pm on January 20, 2025:

may not be obvious what “review is too much” means:

0The change might also be rejected if the code bloat or review/maintenance burden is
1too high to justify the improvement.

in doc/benchmarking.md:62 in 61281c72cf outdated

62+- Initial block download (Cost: slow IBD results in full node operation being
63+  less accessible)
64+- Block template creation (Cost: slow block template creation may result in
65+  lower fee revenue for miners)
66+- Block propagation (Cost: slow block propagation may increase the rate of
67+  orphaned blocks)

l0rinc commented at 12:06 pm on January 20, 2025:

very useful list - could you add explicit benchmark examples for further understanding?

maflcko commented at 12:38 pm on January 20, 2025:

The list is non-exhaustive and exists for illustration, so I’ll leave as-is for now.

in doc/benchmarking.md:57 in 61281c72cf outdated

57+for future performance improvements. They should cover components that impact
58+performance critical functions of the system. Functions are performance
59+critical, if their performance impacts users and the cost associated with a
60+degradation in performance is high. A non-exhaustive list:
61+
62+- Initial block download (Cost: slow IBD results in full node operation being

l0rinc commented at 12:07 pm on January 20, 2025:

Users often complain about slow RPC calls - could we add those to this list?

l0rinc commented at 12:19 pm on January 20, 2025:

Could we clarify when we should use benchmark::PriorityLevel::HIGH and when the priority should be lower?

maflcko commented at 12:38 pm on January 20, 2025:

The list is non-exhaustive and exists for illustration, so I’ll leave as-is for now.

l0rinc commented at 12:53 pm on January 20, 2025:

I’d just like to avoid reviers claiming that they reject a PR because it’s not in this list - but you’re right, the “non-exhaustive list” should make that clear - you can resolve these

in doc/benchmarking.md:23 in 61281c72cf outdated

21@@ -22,18 +22,17 @@ After compiling bitcoin-core, the benchmarks can be run with:
22 
23 The output will look similar to:

l0rinc commented at 12:08 pm on January 20, 2025:

might want to clarify that CLang produces a very different output

maflcko commented at 12:36 pm on January 20, 2025:

Why should the output be different?

l0rinc commented at 12:51 pm on January 20, 2025:

See the examples in e.g. #31682:

C++ compiler …………………….. AppleClang 16.0.0.16000026

ns/block	block/s	err%	total	benchmark
361,535.93	2,765.98	0.5%	11.02	`CheckBlockBench`

C++ compiler …………………….. GNU 13.3.0

ns/block	block/s	err%	ins/block	cyc/block	IPC	bra/block	miss%	total	benchmark
1,096,261.84	912.19	0.1%	7,963,390.88	3,487,375.26	2.283	1,266,941.00	1.8%	11.03	`CheckBlockBench`

maflcko commented at 1:04 pm on January 20, 2025:

That is due to __linux__ not being set on macOS, so I’ll just drop this. It can be left as-is.

l0rinc commented at 1:07 pm on January 20, 2025:

That is due to linux not being set on macOS

Thanks for clarifying

so I’ll just drop this

I think it did make sense to replace the base58 examples with more representative ones

maflcko commented at 1:14 pm on January 20, 2025:

This is just for illustration of the output format, so anything should be fine.

(edit: resolving for now)

l0rinc commented at 1:31 pm on January 20, 2025:

I’ve mentioned this since it confused me at the beginning, seeing base58 benchmarks as examples - which is why I worked on optimizing it at first.

l0rinc commented at 12:11 pm on January 20, 2025: contributor

Concept ACK - thanks for clarifying the priorities

Could also be useful to clarify that the data used in the benchmark has to be representative, so we should aim to cover real values (such as block413567.raw).

maflcko force-pushed on Jan 20, 2025

brunoerg approved

brunoerg commented at 1:27 pm on January 20, 2025: contributor

code review ACK 2db6923332c7daa20d250cc5aa6bde080a7d0caf

DrahtBot requested review from l0rinc on Jan 20, 2025

l0rinc commented at 1:32 pm on January 20, 2025: contributor

ACK 2db6923332c7daa20d250cc5aa6bde080a7d0caf

furszy commented at 2:32 pm on January 20, 2025: member

ACK 2db6923332c7daa20d250cc5aa6bde080a7d0caf

kevkevinpal commented at 2:49 pm on January 20, 2025: contributor

ACK 2db6923

If it is wanted I can cherry pick the commits from #31153 and open up a PR to remove the various extraneous benchmarks

in doc/benchmarking.md:66 in 2db6923332 outdated

67+- Block propagation (Cost: slow block propagation may increase the rate of
68+  orphaned blocks)
69+
70+Benchmark additions and performance improvements are valuable if they target
71+the user facing functions mentioned above. If a clear end-to-end performance
72+improvement cannot be demonstrated, the pull request is likely to be rejected.

jonatack commented at 2:59 pm on January 20, 2025:

Would prefer to drop this in favor of per-PR review feedback.

maflcko commented at 3:17 pm on January 20, 2025:

The goal here is to provide context that a change may be rejected if a performance improvement is the only goal and it can not be observed end-to-end. This has been the case in the past and the goal here is not to change anything about that.

Happy to close this pull, if this is obvious to everyone and the note isn’t needed.

tdb3 commented at 10:41 pm on January 20, 2025:

I also tend to lean toward a per-PR approach, but do see the value in providing insight so new contributors know to provide a more complete case for a PR. Maybe something more encouraging could accomplish a similar goal, e.g. Supporting the case for a performance improvement with benchmark results is encouraged.

l0rinc commented at 6:50 am on January 21, 2025:

It’s not obvious to everyone. This is a useful change. Some of the overly harsh/opinionated parts could indeed be dropped/reformulated, but it’s important that newcomers know the basic directions.

maflcko commented at 7:34 am on January 21, 2025:

Supporting the case for a performance improvement with benchmark results is encouraged.

The benchmark also needs to be put into context in what real end-to-end scenario it would result in a measurable improvement. Just because a benchmark exists or can be written, doesn’t mean that it is a goal to improve its performance. I think noting that the considerations are not one-dimensional is useful.

maflcko commented at 7:35 am on January 21, 2025:

I am happy to push any change (ideally a diff) on the wording, which conveys the essence of the note better.

maflcko commented at 3:53 pm on January 22, 2025:

Just pushed a change, but I am happy to push any other change (ideally a diff) on the wording, which conveys the essence of the note better.

maflcko commented at 3:54 pm on January 22, 2025:

Pushed a small change

jonatack commented at 3:04 pm on January 20, 2025: member

-0 on this, conceptually prefer ad hoc, individual case-by-case PR review over setting top-down collectivist norms. The former will occur anyway, and ponderous notes like this will be used selectively and not evenhandedly.

maflcko commented at 3:04 pm on January 20, 2025: member

If it is wanted I can cherry pick the commits from #31153 and open up a PR to remove the various extraneous benchmarks

The goal of this pull request is to add useful context. Removing benchmarks can be done on a case-by-case basis in a separate pull request, if properly motivated. However, it is not the goal of this pull request to serve as the sole reason to remove (or refuse to add) benchmarks.

tdb3 commented at 10:48 pm on January 20, 2025: contributor

Concept ACK

Adding more context for newcomers adds value and can reduce churn.

darosior approved

darosior commented at 7:21 pm on January 21, 2025: member

ACK 2db6923332c7daa20d250cc5aa6bde080a7d0caf

I hear others’ criticisms but i think it’s an improvement over the current doc. We can iterate in follow-ups on how best to phrase these general guidelines.

DrahtBot requested review from tdb3 on Jan 21, 2025

in doc/benchmarking.md:54 in 2db6923332 outdated

55-- P2P throughput
56+
57+Benchmarks help with monitoring for performance regressions and act as a scope
58+for future performance improvements. They should cover components that impact
59+performance critical functions of the system. Functions are performance
60+critical, if their performance impacts users and the cost associated with a

instagibbs commented at 3:17 pm on January 22, 2025:

0critical if their performance impacts users and the cost associated with a

maflcko commented at 3:54 pm on January 22, 2025:

thx, removed, comma

in doc/benchmarking.md:60 in 2db6923332 outdated

61+degradation in performance is high. A non-exhaustive list:
62+
63+- Initial block download (Cost: slow IBD results in full node operation being
64+  less accessible)
65+- Block template creation (Cost: slow block template creation may result in
66+  lower fee revenue for miners)

instagibbs commented at 3:17 pm on January 22, 2025:

0  lower fee revenue for miners and mining centralization)

instagibbs approved

instagibbs commented at 3:21 pm on January 22, 2025: member

ACK 2db6923332c7daa20d250cc5aa6bde080a7d0caf

better than status quo

[doc] Amend notes on benchmarking e94c9d1712

maflcko force-pushed on Jan 22, 2025

darosior approved

darosior commented at 4:02 pm on January 22, 2025: member

reACK e94c9d171239c1fc44fa9c77a4595ecd626b767f

DrahtBot requested review from instagibbs on Jan 22, 2025

DrahtBot requested review from furszy on Jan 22, 2025

DrahtBot requested review from brunoerg on Jan 22, 2025

l0rinc commented at 4:30 pm on January 22, 2025: contributor

ACK e94c9d171239c1fc44fa9c77a4595ecd626b767f

https://github.com/bitcoin/bitcoin/compare/2db6923..e94c9d171239c1fc44fa9c77a4595ecd626b767f?w=1

brunoerg approved

brunoerg commented at 4:32 pm on January 22, 2025: contributor

reACK e94c9d171239c1fc44fa9c77a4595ecd626b767f

instagibbs commented at 4:34 pm on January 22, 2025: member

reACK e94c9d171239c1fc44fa9c77a4595ecd626b767f

fanquake merged this on Jan 22, 2025

fanquake closed this on Jan 22, 2025

maflcko deleted the branch on Jan 22, 2025

doc: Amend notes on benchmarking #31690

Code Coverage & Benchmarks

Reviews