We can cherry-pick one commit from upstream leveldb, make the same change in crc32c, and then ultimately drop our build infra for testing endianness.
Not for merging until subtrees are updated:
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/29852.
No conflicts as of last run.
Concept ACK. Looking at the leveldb godbolt link, this is nicely optimized everywhere except MSVC.
However, the changes in MSVC generated assembly code look quite significant.
I’m ok with a possible regression there for the sake of the cleanup.
I disagree. Before stacking another performance deterioration change on top of the pile of the currently unresolved performance issues in the MSVC builds, it would be nice to compare benchmarks in the first place.
However, the changes in MSVC generated assembly code look quite significant. Before stacking another performance deterioration change on top of the pile
Isn’t that because optimisations haven’t been turned on? Otherwise, can you provide a concrete example of what you’re talking about.
Want to upstream the crc32 patch to match the others we have sitting there?
Sure. Opened a PR in our crc32c subtree fork: https://github.com/bitcoin-core/crc32c-subtree/pull/7, and one in Google upstream: https://github.com/google/crc32c/pull/64.
However, the changes in MSVC generated assembly code look quite significant. Before stacking another performance deterioration change on top of the pile
Isn’t that because optimisations haven’t been turned on? Otherwise, can you provide a concrete example of what you’re talking about.
https://godbolt.org/z/of4T8hM8j provides examples with the /O2 optimization flag.
What benchmarks might be appropriate for testing changes like these?
Microbenchmarks + IBD?
Is there a venue for reporting this to MSVC? They recently patted themselves on the back for detecting similar patterns. It’s a shame MSVC can’t detect something that (in 2024) seems so obvious.
cc @sipsorcery
Most likely fruitless but can’t hurt to ask.
We can cherry-pick one commit from upstream leveldb, make the same change in crc32c, and then ultimately drop our build infra for testing endianness.
The same goal, which is dropping “build infra for testing endianness”, might be achieved with an alternative approach, which essentially boils down to:
--- a/src/leveldb/util/coding.h
+++ b/src/leveldb/util/coding.h
@@ -62,7 +62,7 @@ char* EncodeVarint64(char* dst, uint64_t value);
 inline void EncodeFixed32(char* dst, uint32_t value) {
   uint8_t* const buffer = reinterpret_cast<uint8_t*>(dst);
 
-  if (port::kLittleEndian) {
+  if constexpr (std::endian::native == std::endian::little) {
     // Fast path for little-endian CPUs. All major compilers optimize this to a
     // single mov (x86_64) / str (ARM) instruction.
     std::memcpy(buffer, &value, sizeof(uint32_t));
And no MSVC code degradation :)
if constexpr (std::endian::native == std::endian::little) {
Unfortunately, this is a C++20 feature, so I don’t imagine either upstream accepting it any time soon.
I agree with @fanquake that we shouldn’t let MSVC (an unsupported and closed-source compiler) stand in the way of our progress. And this is a real barrier to us staying in sync with upstream. If we shipped msvc-built binaries that’d be one thing, but I don’t see that ever happening.
Clang 10 includes the optimizations described in
https://bugs.llvm.org/show_bug.cgi?id=41761. This means that the
platform-independent implementations of {Decode,Encode}Fixed{32,64}()
compile to one instruction on the most recent Clang and GCC.
PiperOrigin-RevId: 306330166
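For readers who have not looked at the upstream commit, here is a minimal sketch of what a platform-independent fixed-width encode/decode pair looks like. It is not copied from leveldb: the function names `EncodeFixed32Portable` and `DecodeFixed32Portable` are made up for this illustration, and the bodies only assume the usual shift-and-mask approach with a little-endian byte order in the buffer. The point is that there is no endianness check at all; the compiler is expected to recognise the pattern and emit a single store or load on little-endian targets.

```cpp
#include <cstdint>

// Hypothetical name for illustration; writes `value` into dst[0..3] in
// little-endian byte order regardless of the host CPU's endianness.
inline void EncodeFixed32Portable(char* dst, std::uint32_t value) {
  std::uint8_t* const buffer = reinterpret_cast<std::uint8_t*>(dst);
  buffer[0] = static_cast<std::uint8_t>(value);
  buffer[1] = static_cast<std::uint8_t>(value >> 8);
  buffer[2] = static_cast<std::uint8_t>(value >> 16);
  buffer[3] = static_cast<std::uint8_t>(value >> 24);
}

// Hypothetical name for illustration; reads four little-endian bytes from
// ptr[0..3] and reassembles them into a 32-bit value.
inline std::uint32_t DecodeFixed32Portable(const char* ptr) {
  const std::uint8_t* const buffer =
      reinterpret_cast<const std::uint8_t*>(ptr);
  return static_cast<std::uint32_t>(buffer[0]) |
         (static_cast<std::uint32_t>(buffer[1]) << 8) |
         (static_cast<std::uint32_t>(buffer[2]) << 16) |
         (static_cast<std::uint32_t>(buffer[3]) << 24);
}
```

Since the byte order is fixed by the shifts rather than by the host, the same source gives the same on-disk layout everywhere, which is what makes the build-time endianness test unnecessary. Whether a given compiler collapses this to one instruction is a codegen question, as the MSVC discussion above shows.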
Similar to 038755784d88ce7522ac2f98e8ba138010a64f82 from leveldb.