logging: use bitset for categories #26697

pull LarryRuane wants to merge 1 commits into bitcoin:master from LarryRuane:2022-12-log-categories-stdbitset changing 6 files +137 −47
  1. LarryRuane commented at 5:40 am on December 14, 2022: contributor

    … rather than integer bitmasks such as (1 « 28). This removes the limit of 32 logging categories – we’re up to 28 currently.

    This is an alternative to #26619 that was suggested by luke-jr.

    tag jonatack klementtan

  2. DrahtBot commented at 5:40 am on December 14, 2022: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK ryanofsky
    Concept ACK kevkevinpal
    Approach ACK stickies-v
    Stale ACK codo1

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #30595 (kernel: Introduce initial C header API by TheCharlatan)
    • #30315 (Stratum v2 Transport by Sjors)
    • #29432 (Stratum v2 Template Provider (take 3) by Sjors)
    • #29415 (Broadcast own transactions only via short-lived Tor or I2P connections by vasild)
    • #29346 (Stratum v2 Noise Protocol by Sjors)
    • #26619 (log: expand BCLog::LogFlags (categories) to 64 bits by LarryRuane)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. fanquake commented at 9:30 am on December 14, 2022: member

    https://github.com/bitcoin/bitcoin/pull/26697/checks?check_run_id=10081903933

    0/usr/bin/ccache g++ -m32 -std=c++17 -DHAVE_CONFIG_H -I. -I../src/config  -fmacro-prefix-map=/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-i686-pc-linux-gnu=. -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -DHAVE_BUILD_INFO -D_FILE_OFFSET_BITS=64 -DPROVIDE_FUZZ_MAIN_FUNCTION -I. -I./minisketch/include -I./secp256k1/include -I./univalue/include -I./leveldb/include -I/tmp/cirrus-ci-build/depends/i686-pc-linux-gnu/include/  -fdebug-prefix-map=/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-i686-pc-linux-gnu=. -fstack-reuse=none -Wstack-protector -fstack-protector-all -fcf-protection=full -fstack-clash-protection   -Werror    -fno-extended-identifiers -fvisibility=hidden -fPIE -pipe -std=c++17 -O2  -c -o bitcoind-bitcoind.o `test -f 'bitcoind.cpp' || echo './'`bitcoind.cpp
    1In file included from ./interfaces/node.h:9,
    2                 from ./interfaces/init.h:10,
    3                 from bitcoind.cpp:16:
    4./logging.h: In function ‘constexpr std::bitset<28> BCLog::all()’:
    5./logging.h:91:64: error: call to non-‘constexpr’ function ‘std::bitset<_Nb>& std::bitset<_Nb>::set(std::size_t, bool) [with unsigned int _Nb = 28; std::size_t = unsigned int]’
    6         for (size_t i{0}; i < LOGFLAGSMAX; ++i) all_flags.set(i);
    7                                                                ^
    8make[2]: *** [Makefile:15498: bitcoind-bitcoind.o] Error 1
    
  4. LarryRuane force-pushed on Dec 14, 2022
  5. LarryRuane commented at 4:07 pm on December 14, 2022: contributor
    Force pushed to fix CI failure, thanks @fanquake
  6. maflcko commented at 7:37 am on December 15, 2022: member
    CI: logging.cpp:105 EnableCategory: Assertion flag < LOGFLAGSMAX’ failed.`
  7. LarryRuane force-pushed on Dec 16, 2022
  8. LarryRuane force-pushed on Dec 16, 2022
  9. LarryRuane commented at 8:52 pm on December 16, 2022: contributor
    Force pushed (twice) to fix CI, thanks @MarcoFalke. One other small change I made in this force push is that if the category value is beyond the size (range) of the std::bitset, then ignore the request rather than throwing a std::out_of_range exception. It should no longer be possible for this to happen, but if it does, it’s not that big of a problem, it seems too drastic to throw an exception for this reason in a production build. The Assume() remains, and will catch the bug during testing.
  10. in src/logging.cpp:148 in 14e30cf238 outdated
    143@@ -121,7 +144,12 @@ bool BCLog::Logger::DisableCategory(const std::string& str)
    144 
    145 bool BCLog::Logger::WillLogCategory(BCLog::LogFlags category) const
    146 {
    147-    return (m_categories.load(std::memory_order_relaxed) & category) != 0;
    148+    Assume(category < NONE);
    149+    if (category < NONE) {
    


    maflcko commented at 10:39 am on December 17, 2022:
    0   
    1    if ( Assume(category < NONE)) {
    

    nit: Can be combined, IIRC

  11. LarryRuane force-pushed on Dec 17, 2022
  12. LarryRuane commented at 3:59 pm on December 17, 2022: contributor
    Force pushed for review comment, thanks @MarcoFalke!
  13. in src/logging.h:39 in 0dbe9b487f outdated
    60-        QT          = (1 << 19),
    61-        LEVELDB     = (1 << 20),
    62-        VALIDATION  = (1 << 21),
    63-        I2P         = (1 << 22),
    64-        IPC         = (1 << 23),
    65+    enum LogFlags : size_t {
    


    stickies-v commented at 10:49 am on December 21, 2022:
    What’s the rationale behind changing this into a size_t? I’d think with this change we could turn it into a smaller type, if anything?
  14. in src/logging.h:77 in 0dbe9b487f outdated
    104+
    105+        // The following have no representation in m_categories:
    106+        NONE, // this is also the size of the bitset
    107+        ALL,
    108     };
    109+    using LogFlagsBitset = std::bitset<NONE>;
    


    stickies-v commented at 11:02 am on December 21, 2022:
    nit: I think it’d be more intuitive to switch NONE and ALL so we have LogFlagsBitset defined as std::bitset<ALL>
  15. in src/logging.cpp:128 in 0dbe9b487f outdated
    124+    switch (flag) {
    125+        case BCLog::NONE:
    126+            return;
    127+        case BCLog::ALL:
    128+            // set all the category flags to false
    129+            m_categories = 0;
    


    stickies-v commented at 11:15 am on December 21, 2022:
    nit: for consistency with EnableCategory, could use .reset()?
  16. in src/logging.cpp:109 in 0dbe9b487f outdated
    105+            // with no argument, set() sets all the bits to true
    106+            m_categories.set();
    107+            return;
    108+        default: break;
    109+    }
    110+    if (Assume(flag < NONE)) m_categories.set(flag);
    


    stickies-v commented at 11:50 am on December 21, 2022:

    Potential more concise alternative (works for DisableCategory too):

    0    if (flag == BCLog::NONE) return;
    1    StdLockGuard scoped_lock(m_cs);
    2    flag == BCLog::ALL ? m_categories.set() : m_categories.set(flag);
    
  17. in src/logging.cpp:313 in 0dbe9b487f outdated
    311@@ -286,6 +312,8 @@ std::string LogCategoryToStr(BCLog::LogFlags category)
    312         return "txreconciliation";
    313     case BCLog::LogFlags::SCAN:
    314         return "scan";
    315+    // Add new entries before this line.
    316+
    


    stickies-v commented at 11:56 am on December 21, 2022:
    nit: reorder BCLog::LogFlags::NONE here in line with the definition in logging.h?
  18. in src/interfaces/node.h:10 in 0dbe9b487f outdated
    5@@ -6,6 +6,7 @@
    6 #define BITCOIN_INTERFACES_NODE_H
    7 
    8 #include <consensus/amount.h>          // For CAmount
    9+#include <logging.h>                   // For BCLog
    


    stickies-v commented at 12:03 pm on December 21, 2022:
    I think “For …” comments are discouraged from use in new code because of maintainability
  19. in src/logging.cpp:170 in 0dbe9b487f outdated
    141@@ -121,7 +142,11 @@ bool BCLog::Logger::DisableCategory(const std::string& str)
    142 
    143 bool BCLog::Logger::WillLogCategory(BCLog::LogFlags category) const
    144 {
    145-    return (m_categories.load(std::memory_order_relaxed) & category) != 0;
    146+    if (Assume(category < NONE)) {
    147+        StdLockGuard scoped_lock(m_cs);
    148+        return m_categories.test(category);
    149+    }
    150+    return false;
    


    stickies-v commented at 12:04 pm on December 21, 2022:
    Not 100% sure about this. Feels a bit brittle to use Assume here. Previously it was fine to pass ALL or NONE here - we could keep that behaviour?

    LarryRuane commented at 9:52 pm on December 22, 2022:

    Currently, neither NONE nor ALL is passed to this function. The only caller to this function is LogAcceptCategory(), which is called by LogPrint() and similar, and only actual logging categories (for example, BCLog::NET) are passed to those, never NONE or ALL. I don’t think passing those makes sense. NONE would mean don’t log at all (so why is the LogPrint statement there to begin with?), and ALL would be just the same as LogPrintf(). We could make those work, I guess. But, as I said, they’re not used today.

    But here I would like to not use NONE; better to use m_categories.size().

    I think the Assume() is nice to keep, because we would like to know if someone calls LogPrint(BCLog::NONE, ...) (or similar).


    stickies-v commented at 11:39 pm on December 22, 2022:

    Yeah they’re not currently being used, but I think (unnecessarily) bounding which elements of an enum we can accept is confusing at best since that exception is not immediately obvious from the function signature. It can also make maintenance more burdensome. And even if we don’t see any use cases (and I think, for example, it’s not unreasonable for a future callsite to check if any logging is happening at the moment), I still think we shouldn’t necessarily prevent ALL or NONE from being used.

    As an alternative solution, I think having a single function to convert LogFlags into LogFlagsBitSet simplifies the code and allows callsites to accept any kind of LogFlags:

     0diff --git a/src/logging.cpp b/src/logging.cpp
     1index 970f99c12..10b0110c6 100644
     2--- a/src/logging.cpp
     3+++ b/src/logging.cpp
     4@@ -94,19 +94,18 @@ void BCLog::Logger::DisconnectTestLogger()
     5     m_print_callbacks.clear();
     6 }
     7 
     8+/** Convert LogFlags into their equivalent bitset for generalized bitwise comparison */
     9+BCLog::LogFlagsBitset FlagToBitset(BCLog::LogFlags flag){
    10+    BCLog::LogFlagsBitset flag_bitset;
    11+    if (flag == BCLog::LogFlags::NONE) return flag_bitset;
    12+    flag == BCLog::LogFlags::ALL ? flag_bitset.set() : flag_bitset.set(flag);
    13+    return flag_bitset;
    14+}
    15+
    16 void BCLog::Logger::EnableCategory(BCLog::LogFlags flag)
    17 {
    18     StdLockGuard scoped_lock(m_cs);
    19-    switch (flag) {
    20-        case BCLog::NONE:
    21-            return;
    22-        case BCLog::ALL:
    23-            // with no argument, set() sets all the bits to true
    24-            m_categories.set();
    25-            return;
    26-        default: break;
    27-    }
    28-    if (Assume(flag < NONE)) m_categories.set(flag);
    29+    m_categories |= FlagToBitset(flag);
    30 }
    31 
    32 bool BCLog::Logger::EnableCategory(const std::string& str)
    33@@ -120,16 +119,7 @@ bool BCLog::Logger::EnableCategory(const std::string& str)
    34 void BCLog::Logger::DisableCategory(BCLog::LogFlags flag)
    35 {
    36     StdLockGuard scoped_lock(m_cs);
    37-    switch (flag) {
    38-        case BCLog::NONE:
    39-            return;
    40-        case BCLog::ALL:
    41-            // set all the category flags to false
    42-            m_categories = 0;
    43-            return;
    44-        default: break;
    45-    }
    46-    if (Assume(flag < NONE)) m_categories.reset(flag);
    47+    m_categories &= ~FlagToBitset(flag);
    48 }
    49 
    50 bool BCLog::Logger::DisableCategory(const std::string& str)
    51@@ -139,14 +129,10 @@ bool BCLog::Logger::DisableCategory(const std::string& str)
    52     DisableCategory(flag);
    53     return true;
    54 }
    55-
    56 bool BCLog::Logger::WillLogCategory(BCLog::LogFlags category) const
    57 {
    58-    if (Assume(category < NONE)) {
    59-        StdLockGuard scoped_lock(m_cs);
    60-        return m_categories.test(category);
    61-    }
    62-    return false;
    63+    StdLockGuard scoped_lock(m_cs);
    64+    return (m_categories & FlagToBitset(category)) != 0;
    65 }
    66 
    67 bool BCLog::Logger::WillLogCategoryLevel(BCLog::LogFlags category, BCLog::Level level) const
    

    (Note: at this point, we also almost just as well might have m_categories be a uint64_t instead of std::bitset, but I’m not very familar with the trade-offs between std::bitset and uint64_t - it seems bitsets can be a bit safer to use?)


    stickies-v commented at 11:46 pm on December 22, 2022:
    I guess in more generalized terms: I think in this approach it’s nice that we have a single place (FlagToBitset()) where we define what each LogFlag means, as opposed to having to redefine that in every function.

    LarryRuane commented at 7:51 pm on December 23, 2022:

    I still think we shouldn’t necessarily prevent ALL or NONE from being used.

    I like this idea, I just force-pushed an update (40e1923e856661fdf68cf783ed9c6d1931dfbdf3) to allow those to have the obvious meanings when specified by LogPrint() (i.e. WillLogCategory()).

    As for FlagToBitset(), that’s very interesting, but it has the downside that evaluating m_categories & FlagToBitset(category) (in WillLogCategory()) requires (in effect) checking all of the categories (bits), whereas m_categories.test(category) checks only one and so feels more efficient (although it may not really be, depending on the implementation). Also, FlagToBitset() makes an actual function call to the constructor (I verified this using https://godbolt.org/), and there may also be a call from WillLogCategory() to FlagToBitset() (if the compiler doesn’t inline it). Since these happen for testing a category, which is performance-critical, perhaps they should be avoided.

    we also almost just as well might have m_categories be a uint64_t …

    One nice advantage of using std::bitset is that the BCLog::LogFlags categories are a simple enum, rather than being defined with a manually-selected shift constant, like 1 << 28. Also, this seems like a perfect use case for std::bitset and is just more modern than low-level bit manipulation. So I think I still like std::bitset here.

  20. stickies-v approved
  21. stickies-v commented at 12:06 pm on December 21, 2022: contributor

    Approach ACK 0dbe9b487

    This was a great suggestion by @luke-jr, thanks for implementing.

  22. in src/logging.h:112 in 0dbe9b487f outdated
    108@@ -104,7 +109,7 @@ namespace BCLog {
    109         std::atomic<Level> m_log_level{DEFAULT_LOG_LEVEL};
    110 
    111         /** Log categories bitfield. */
    112-        std::atomic<uint32_t> m_categories{0};
    113+        LogFlagsBitset m_categories GUARDED_BY(m_cs);
    


    jonatack commented at 9:41 pm on December 21, 2022:
    Tested approach ACK. I haven’t yet checked impact on performance with the benchmarks.

    ajtowns commented at 1:58 am on January 22, 2023:
    Switching this from an atomic to be blocking on the m_cs mutex doesn’t seem great? It’s checked for every LogPrintf where the log category is disabled, etc. Should at least use a reader/writer lock?
  23. in src/logging.cpp:168 in 0dbe9b487f outdated
    141@@ -121,7 +142,11 @@ bool BCLog::Logger::DisableCategory(const std::string& str)
    142 
    143 bool BCLog::Logger::WillLogCategory(BCLog::LogFlags category) const
    144 {
    145-    return (m_categories.load(std::memory_order_relaxed) & category) != 0;
    146+    if (Assume(category < NONE)) {
    147+        StdLockGuard scoped_lock(m_cs);
    148+        return m_categories.test(category);
    


    jonatack commented at 9:48 pm on December 21, 2022:
    For instance, this code can be called frequently with some logging configs IIRC.

    LarryRuane commented at 3:56 pm on December 22, 2022:

    Thanks, good observation, I tried to continue using atomic, but wasn’t able to get it to work.

    But also, I think that with current master, it may be possible for two threads to try to enable two different logging categories at the same time, and one silently fails (both threads read, thread 1 writes, thread 2 writes, undoing thread 1’s write). Using a mutex as this PR does solves this problem. What do you think? Maybe still better to use std::atomic?


    LarryRuane commented at 3:59 pm on December 22, 2022:
    Maybe this should be a rw lock? You’re correct that testing performance is more important, and using rw lock would help with that.

    stickies-v commented at 11:41 pm on December 22, 2022:
    As per my footnote on #26697 (review) - if using std::atomic<uint64_t> would improve performance, that could be an argument to stop using the bitset altogether?
  24. LarryRuane force-pushed on Dec 23, 2022
  25. LarryRuane commented at 7:58 pm on December 23, 2022: contributor

    Force pushed 40e1923e856661fdf68cf783ed9c6d1931dfbdf3 to address review comments (thanks, @stickies-v!):

    • remove the unnecessary comment on #include <logging.h> line
    • simplify EnableCategory() and DisableCategory(), and remove the calls to Assume()
    • allow WillLogCategory() to work with special logging categories NONE and ALL
    • remove unnecessary size_t from the definition of enum LogFlags
    • reverse the order of NONE and ALL so that LogFlagsBitset can be std::bitset<ALL>
  26. kevkevinpal commented at 2:34 am on January 18, 2023: contributor
    Concept ACK
  27. in src/logging.h:40 in 40e1923e85 outdated
    60-        QT          = (1 << 19),
    61-        LEVELDB     = (1 << 20),
    62-        VALIDATION  = (1 << 21),
    63-        I2P         = (1 << 22),
    64-        IPC         = (1 << 23),
    65+    enum LogFlags {
    


    kouloumos commented at 3:30 pm on January 18, 2023:

    We refer to those as “categories” in multiple occasions, which I found a bit confusing while reading the code. Is touching this an opportunity to replace “flags” with “categories”? (This has also been suggested in #25614#pullrequestreview-1078726454)

    Also, reading a bit on enum, my understanding is that the underlying type is int unless a different type is specified (strongly typed enum).


    ajtowns commented at 1:26 am on January 22, 2023:
    I think this should be an enum class to make it an error to do bitwise operations?
  28. codo1 commented at 3:46 pm on January 18, 2023: none

    Tested ACK 40e1923e

    With two disclaimers: a. I should not be considered a C++ programmer b. this is my first review for Bitcoin Core

    I tested as follows:

    1. I ran all unittests in random order: pass

    2. I ran all functional tests except wallet tests (as I didn’t compile the wallet in): pass (two rpc tests timed out, which was unrelated to this PR)

    3. I added a new logging category “fakecategory” to the logging code plus a LogPrint call using it (git diff). I turned the category on via bitcoin.conf (tried both explicitly by its name, and by using “debug=1”). I turned off and on via rpc (again both explicitly by its name, and by using “all”) Result: the LogPrint message showed up in debug.log where and when I expected it.

    4. I tried to break things: I moved the new logging category to a wrong place within the BCLog::LogFlags enum. I tried two spots: between ALL and NONE, and after NONE. Each spot made logging_tests.cpp fail (example output).

  29. kouloumos commented at 4:50 pm on January 18, 2023: contributor

    Below are benchmarks for single LogPrintf() and LogPrint() invocations run with src/bench/bench_bitcoin -filter=Logging.* -min-time=5000

    PR branch (40e1923e856661fdf68cf783ed9c6d1931dfbdf3)

    ns/op op/s err% ins/op cyc/op IPC bra/op miss% total benchmark
    48.81 20,486,853.23 0.1% 183.00 96.84 1.890 43.00 0.0% 5.50 LoggingNoCategory
    226.15 4,421,939.99 0.1% 913.00 448.60 2.035 218.00 0.0% 5.50 LoggingNoFile
    6,151.01 162,574.91 0.2% 20,563.01 8,362.38 2.459 4,076.01 0.1% 5.48 LoggingNoThreadNames
    6,318.64 158,261.86 0.2% 21,129.07 8,671.84 2.437 4,222.93 0.1% 5.50 LoggingYoCategory
    6,316.44 158,317.09 0.0% 21,152.01 8,654.85 2.444 4,225.01 0.1% 5.40 LoggingYoThreadNames

    master branch (52d6509497a717cb01154aa05aa2b69f99ed5391)

    ns/op op/s err% ins/op cyc/op IPC bra/op miss% total benchmark
    14.50 68,988,872.00 0.2% 78.00 28.75 2.713 14.00 0.0% 5.50 LoggingNoCategory
    188.08 5,316,965.41 0.2% 808.00 373.10 2.166 189.00 0.0% 5.50 LoggingNoFile
    6,245.34 160,119.36 0.2% 20,594.65 8,514.24 2.419 4,083.06 0.1% 5.49 LoggingNoThreadNames
    6,373.67 156,895.41 0.3% 21,027.66 8,775.28 2.396 4,197.40 0.2% 5.57 LoggingYoCategory
    6,473.95 154,465.32 0.7% 21,464.64 8,983.89 2.389 4,296.28 0.2% 5.15 LoggingYoThreadNames

    This suggests:

    • No performance regression when benchmarks are enabled (Logging{Yo,No}ThreadNames and LoggingYoCategory).
    • Slowdown when benchmarks are disabled (LoggingNoCategory), which I think is insignificant due to how small the initial value is.
  30. in src/logging.h:77 in 40e1923e85 outdated
    104+
    105+        // The following have no representation in m_categories:
    106+        ALL, // this is also the size of the bitset
    107+        NONE,
    108     };
    109+    using LogFlagsBitset = std::bitset<ALL>;
    


    ajtowns commented at 1:26 am on January 22, 2023:

    Currently BCLog::LogFlags can represent multiple categories, whereas after this patch it can only represent one at a time, with LogFlagsBitset taking over representing multiple categories. Also, you can no longer represent multiple categories by writing BCLog::NET | BCLog::MEMPOOL. The ALL and NONE values in particular would seem much more sensible if they had the bitset type.

    Might be worth making class LogFlagsSet { private: std::bitset<Flags::COUNT> m_flags; public: ... } and being more explicit about what operations are supported (and more strict about needing to provide a BCLog::LogFlags rather than an arbitrary int for various operations).

  31. in src/logging.cpp:101 in 40e1923e85 outdated
     95@@ -95,7 +96,9 @@ void BCLog::Logger::DisconnectTestLogger()
     96 
     97 void BCLog::Logger::EnableCategory(BCLog::LogFlags flag)
     98 {
     99-    m_categories |= flag;
    100+    if (flag == BCLog::NONE) return;
    101+    StdLockGuard scoped_lock(m_cs);
    102+    (flag == BCLog::ALL) ? m_categories.set() : m_categories.set(flag);
    


    ajtowns commented at 1:28 am on January 22, 2023:
    Seems strange to do this as a ? : operation rather than if .. else ..
  32. LarryRuane commented at 0:18 am on January 23, 2023: contributor

    Thanks for the reviews, @ajtowns and @codobtc, and for the benchmarking @kouloumos.

    I think now that #26619 (closed in favor of the current PR) may actually be preferable because the performance degradation could be considered a show-stopper. It is very important that non-enabled log category testing be super fast.

    What do all of you think?

    This PR is also getting complicated. If it added a significant feature or functionality, that would be one thing, but the only thing along those lines is to allow for an unlimited number of logging categories. But having 64 logging categories will carry us for a long time. It’s taken 14 years to reach 28 categories, and the rate of new category additions will likely decrease as the project continues to mature.

    So I’ve reopened #26619 so that reviews can comment on that approach. (It’s much easier to review than the current PR.)

  33. ajtowns commented at 1:47 am on January 23, 2023: contributor

    How about a slightly different approach? Change atomic<uint32_t> m_categories to FlagCategories m_categories to a custom thread safe class that offers just the operations we need:

    • bool any() const
    • bool none() const
    • void set() (for ALL) and void set(flag) (for a particular flag)
    • void reset() and void reset(flag)
    • bool test(flag) const

    A thread safe class probably shouldn’t support copy/move constructors, but the only time we need to copy m_categories seems to be in rpc/node.cpp where we’re checking of the LIBEVENT category changes; but the copy there can be replaced by .test(LIBEVENT) before and after and seeing if they’re !=.

    I think a patch to change LogFlags to a regular enum like you’ve done here, and encapsulate m_categories as a class containing an atomic<uint32_t> should be reasonably straightforward, then updating that to a uint64_t when we go over 32 flags, or an array of atomics once we go over 64 flags would be simple? I had a go at messing with the latter, and came up with https://github.com/ajtowns/bitcoin/commits/202301-log-atomic-bitset

  34. jonatack commented at 5:21 pm on January 23, 2023: member

    Given that performance and benchmarks have been a key part of the discussion, #26957 has updated benchmarks that reviewers may use for evaluating this pull and #26619.

    I’m currently rebasing and updating #25203; hopefully any approach adopted in these refactorings won’t require a lot of rewriting for that large ongoing migration. Would be helpful to keep that in mind.

  35. LarryRuane force-pushed on Jan 27, 2023
  36. LarryRuane renamed this:
    logging: use std::bitset for categories
    logging: use bitset for categories
    on Jan 27, 2023
  37. LarryRuane commented at 8:14 pm on January 27, 2023: contributor

    Thanks, @ajtowns and @jonatack, I force pushed ac3c16c6978370c9c5ea4c591acbbbcf626804a6 to include Anthony’s suggested atomic bitset, which is really nice. I only changed a few of the variable names to be a little clearer to me.

    Also, I verified using Jon’s updated benchmarks (and the existing benchmarks) that the performance of this branch is the same as master, although someone should confirm this.

  38. kouloumos commented at 8:39 am on January 28, 2023: contributor

    although someone should confirm this.

    PR branch (ac3c16c6978370c9c5ea4c591acbbbcf626804a6)

    ns/op op/s err% ins/op cyc/op IPC bra/op miss% total benchmark
    13.09 76,415,390.27 0.1% 76.00 25.97 2.927 15.00 0.0% 5.50 LoggingNoCategory
    191.93 5,210,284.18 0.2% 806.00 380.68 2.117 190.00 0.0% 5.46 LoggingNoFile
    6,316.45 158,316.81 0.1% 21,105.01 8,681.62 2.431 4,200.01 0.1% 5.47 LoggingNoThreadNames
    6,444.81 155,163.59 0.1% 21,556.93 8,931.83 2.413 4,317.32 0.1% 5.40 LoggingYoCategory
    6,525.51 153,244.80 0.2% 21,993.15 9,092.68 2.419 4,417.28 0.2% 5.50 LoggingYoThreadNames

    Results can be compared with #26697#pullrequestreview-1253717862 to confirm that the performance degradation that was previously observed is no longer an issue.

  39. achow101 referenced this in commit 630756cac0 on Mar 23, 2023
  40. sidhujag referenced this in commit 433494e1d3 on Mar 24, 2023
  41. DrahtBot added the label Needs rebase on Apr 3, 2023
  42. LarryRuane force-pushed on Apr 7, 2023
  43. DrahtBot removed the label Needs rebase on Apr 7, 2023
  44. DrahtBot added the label Needs rebase on Jun 12, 2023
  45. LarryRuane force-pushed on Jun 12, 2023
  46. DrahtBot removed the label Needs rebase on Jun 12, 2023
  47. LarryRuane commented at 9:47 pm on June 12, 2023: contributor
    Force-pushed rebase for merge conflict
  48. DrahtBot added the label Needs rebase on Jun 20, 2023
  49. LarryRuane force-pushed on Jun 28, 2023
  50. LarryRuane commented at 11:13 pm on June 28, 2023: contributor
    Rebased to fix merge conflict
  51. DrahtBot removed the label Needs rebase on Jun 29, 2023
  52. DrahtBot added the label CI failed on Aug 30, 2023
  53. DrahtBot added the label Needs rebase on Aug 31, 2023
  54. LarryRuane force-pushed on Aug 31, 2023
  55. DrahtBot removed the label Needs rebase on Aug 31, 2023
  56. DrahtBot removed the label CI failed on Aug 31, 2023
  57. DrahtBot added the label CI failed on Jan 13, 2024
  58. LarryRuane force-pushed on Mar 26, 2024
  59. DrahtBot removed the label CI failed on Mar 26, 2024
  60. LarryRuane commented at 5:03 am on March 26, 2024: contributor
    Rebased to the latest master (since there are no commit-specific acks). This is ready to review, thanks.
  61. DrahtBot added the label Needs rebase on Apr 2, 2024
  62. LarryRuane force-pushed on Apr 8, 2024
  63. DrahtBot removed the label Needs rebase on Apr 8, 2024
  64. DrahtBot added the label Needs rebase on Jul 26, 2024
  65. LarryRuane force-pushed on Jul 26, 2024
  66. DrahtBot removed the label Needs rebase on Jul 26, 2024
  67. DrahtBot added the label Needs rebase on Aug 5, 2024
  68. LarryRuane force-pushed on Aug 6, 2024
  69. LarryRuane commented at 8:08 pm on August 6, 2024: contributor
    Force pushed rebase to fix conflict and also did some additional testing, I added several fake logging categories to ensure there can be more than 32 categories (the current limit), which is the main benefit of this pull. Also, I re-ran the benchmark tests (including with more than 32 categories), and this PR has no measurable performance impact. @vasild and @ryanofsky, maybe you can take a look since you know this area, thanks.
  70. DrahtBot removed the label Needs rebase on Aug 6, 2024
  71. in src/logging.h:103 in dadce0ba94 outdated
    130+        AtomicBitSet& operator=(AtomicBitSet&&) = delete;
    131+        ~AtomicBitSet() = default;
    132+
    133+        constexpr size_t size() const { return BITS; }
    134+
    135+        bool is_any() const
    


    ryanofsky commented at 7:03 pm on August 7, 2024:

    In commit “logging: use bitset class for categories” (dadce0ba946aa9c209fc5c15d9efe627f7a486a2)

    In std::bitset, the “is_any” and “is_none” methods seem to be called just any and none. IMO, the standard names make more sense than these names because I don’t think it makes sense to say a set “is any” or a set “is none” but I do think it makes sense for it to have “any” and “none” properties.

  72. ryanofsky approved
  73. ryanofsky commented at 7:25 pm on August 7, 2024: contributor

    Code review ACK dadce0ba946aa9c209fc5c15d9efe627f7a486a2

    Not important, but I think you could consider renaming BCLog::LogFlagsBitset to BCLog::CategoryMask to provide a less clunky type name and be consistent with existing BCLog::::Logger::GetCategoryMask() method.

  74. DrahtBot requested review from stickies-v on Aug 7, 2024
  75. LarryRuane force-pushed on Aug 12, 2024
  76. LarryRuane commented at 1:50 am on August 12, 2024: contributor
    Thanks, Russ, I made both suggested changes (force pushed).
  77. in src/logging.h:111 in d8c637a0ad outdated
    136+        {
    137+            for (const auto& i : bits) {
    138+                if (i > 0) return true;
    139+            }
    140+            return false;
    141+        }
    


    vasild commented at 5:44 am on August 12, 2024:

    This can observe a state that never was. For example, lets say bits[] has 2 elements, initially [0000, 0010]:

    1. [0000, 0010], any() == true
    2. any() is called and it reads the first element 0000
    3. another thread sets the 1st bit, new state: [1000, 0010], any() == true
    4. another thread resets the 7th bit, new state: [1000, 0000], any() == true
    5. the execution of any() from 1. continues and reads the second element: 0000 and that any() call returns false, even though the bitset was never empty.

    I would prefer the dump solution of changing to 64 bits from #26619. ACK 90c3d904bcbf294a6fcef8bc9ecb9445fd41c4bd from that PR (I can’t post it there because it is locked). 64 would be sufficient for a loooong time and personally, I don’t see “I think there’s no urgent need for >32 bits, so might as well go straight to bitset.”* as enough justification to jump into over-engineering this.


    l0rinc commented at 7:57 am on August 12, 2024:
    +1 for uint64_t instead of AtomicBitSet

    LarryRuane commented at 7:21 pm on August 12, 2024:

    As I look at this again, I agree this is over-engineered (what do you think, @ryanofsky?). #26619 was my first attempt but someone suggested using std::bitset and I took that suggestion. But that approach (this PR) turned out to be more complicated than expected.

    The one thing I like about the current approach is not having to specify all the bit shift values explicitly, such as NET = (1 << 0), instead it’s just a clean list. But that isn’t much of an advantage.

    Why don’t I rebase #26619 and get it re-opened (I think a maintainer has to do that, @ryanofsky?), then both will be open and up to date, and reviewers can decide which to ACK (or maybe ACK both and a maintainer decides). I don’t mind either one, but slightly prefer #26619 because it is simpler, and 64 categories should last a very long time. In the worst case, if we need even more, this PR can be resurrected.


    maflcko commented at 7:40 pm on August 12, 2024:

    both will be open and up to date, and reviewers can decide which to ACK (or maybe ACK both and a maintainer decides).

    I haven’t looked at the code or the bug here, but if this is a real bug that could be hit in production, acking and merging should not be done.


    LarryRuane commented at 8:02 pm on August 12, 2024:

    Hard to say if it’s a bug, it’s definitely not undefined behavior. This any() function is currently used only by qt where it says, if there are any debug logs enabled, then display a bunch of extra debug stuff (which is a bit questionable, because the logging categories that are enabled may have nothing whatsoever to do with qt). So if the sequence vasild outlined above happens, and the logging is being changed when qt is deciding whether to show this debug stuff, then it could make the wrong decision. But it’s just debug stuff.

    Of course, there could be other calls to any() in the future.

    The nature of a function like any() is that as soon as you call it, it could change anyway. But what vasild found is slightly worse, because there was never a moment where no logging was enabled, yet any() returns false.

    I tried to fix the problem but it turns out to be quite messy. I suppose this is another reason to go with the other approach (uint64_t), although not a strong reason, IMO.

    Thanks for reopening #26619, Marco. I’ll rebase it now.


    ryanofsky commented at 11:34 pm on August 12, 2024:
    This is a good catch and I think it’s worth commenting on, but I don’t think it is a bug. Would suggest adding a comment like: “Checks if any bits are set. Note: Bits are checked independently, so result reflects the state of all bits, but not at a single instant in time.”
  78. in src/logging.h:89 in d8c637a0ad outdated
    116+        static constexpr size_t BITS_PER_T = 8 * sizeof(T);
    117+
    118+        // number of Ts needed for BITS bits (round up)
    119+        static constexpr size_t NT = (BITS + BITS_PER_T - 1) / BITS_PER_T;
    120+
    121+        std::atomic<T> bits[NT]{0};
    


    ryanofsky commented at 11:37 pm on August 12, 2024:

    In commit “logging: use bitset class for categories” (d8c637a0ad704c30dc128d0c66215d5ac220f85d)

    Would be nice to rename bits member to m_bits for clarity I think

  79. ryanofsky approved
  80. ryanofsky commented at 11:46 pm on August 12, 2024: contributor

    Code review ACK d8c637a0ad704c30dc128d0c66215d5ac220f85d

    I wouldn’t have a problem just switching the code to use uint64_t but I think this approach is nicer and more type safe. I like that this gets rid of all the bitwise operations and (1 « 28) cruft and stops hardcoding uint32_t various places. The atomic bitset class seems pretty simple to me and has a nice interface.

  81. LarryRuane force-pushed on Aug 13, 2024
  82. LarryRuane commented at 0:13 am on August 13, 2024: contributor

    @ryanofsky - I took your suggestions (bits to m_bits and added the comment), and I noticed something else: By defining LogCategoryMask as comprising uint64_t instead of uint32_t base elements, the non-atomic timing window problem won’t happen until we exceed 64 logging categories, because the loop in any() will iterate only once. That loop doesn’t look at individual bits, it looks at one entire base element (now uint64_t) at a time.

    Yes, the AtomicBitSet class is pretty cool, we could consider moving it out to its own file for anyone to use, but I guess it’s better to leave in in logging.h until it’s needed elsewhere. The only possible problem is a lack of awareness of it.

  83. logging: use bitset class for categories
    ... rather than integer bitmasks such as (1 << 28). This removes the
    limit of 32 logging categories. Adds `AtomicBitSet` class, which is
    similar to `std::bitset` but supports atomic (lock-free) operation,
    which is important for performance.
    
    Co-authored-by: Anthony Towns <aj@erisian.com.au>
    d2067124aa
  84. LarryRuane force-pushed on Aug 13, 2024
  85. ryanofsky approved
  86. ryanofsky commented at 2:23 pm on August 13, 2024: contributor
    Code review ACK d2067124aae098980de5c4a810753486491faea7. Since last review, just renamed array member to m_bits, switched from 32 to 64 bit ints and added comment about any and none methods being not atomic.
  87. LarryRuane commented at 8:39 pm on September 3, 2024: contributor
    Closing because #26619 (an alternative to this PR) has merged. We could make this active again because it is arguably cleaner and nicer, but the immediate need to increase the number of logging categories no longer exists.
  88. LarryRuane closed this on Sep 3, 2024


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-10-04 19:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me