util: avoid using thread_local variable that has a destructor #30095

pull vasild wants to merge 1 commits into bitcoin:master from vasild:thead_local_without_destructor changing 4 files +27 −21
  1. vasild commented at 3:07 pm on May 13, 2024: contributor

    Store the thread name in a thread_local variable of type char[] instead of std::string. This avoids calling the destructor when the thread exits. This is a workaround for https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=278701

    For type-safety, return std::string from util::ThreadGetInternalName() instead of char[].

    As a side effect of this change, we no longer store a reference to a thread_local variable in CLockLocation. This was dangerous because if the thread quits while the reference still exists (in the global variable lock_data, see inside GetLockData()) then the reference will become dangling.

  2. DrahtBot commented at 3:07 pm on May 13, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK hebasto, theuni, laanwj

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

  3. DrahtBot added the label Utils/log/libs on May 13, 2024
  4. vasild commented at 3:09 pm on May 13, 2024: contributor
    This came from the discussion in https://github.com/bitcoin/bitcoin/issues/29952
  5. vasild force-pushed on May 13, 2024
  6. DrahtBot added the label CI failed on May 13, 2024
  7. DrahtBot commented at 4:35 pm on May 13, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/24905574812

  8. fanquake commented at 2:53 am on May 14, 2024: member

    https://github.com/bitcoin/bitcoin/pull/30095/checks?check_run_id=24909710597

     0'./'util/threadnames.cpp
     1In file included from /usr/include/string.h:535,
     2                 from /usr/include/c++/11/cstring:42,
     3                 from util/threadnames.cpp:7:
     4In function ‘void* memcpy(void*, const void*, size_t)’,
     5    inlined from ‘void SetInternalName(std::string)’ at util/threadnames.cpp:50:16,
     6    inlined from ‘void util::ThreadSetInternalName(std::string&&)’ at util/threadnames.cpp:69:20:
     7/usr/include/x86_64-linux-gnu/bits/string_fortified.h:29:33: error: ‘void* __builtin_memcpy(void*, const void*, long unsigned int)’ forming offset [32, 143] is out of the bounds [0, 32] of object ‘<anonymous>’ with type ‘std::string’ {aka ‘std::__cxx11::basic_string<char>’} [-Werror=array-bounds]
     8   29 |   return __builtin___memcpy_chk (__dest, __src, __len,
     9      |          ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
    10   30 |                                  __glibc_objsize0 (__dest));
    11      |                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~
    12util/threadnames.cpp: In function ‘void util::ThreadSetInternalName(std::string&&)’:
    13util/threadnames.cpp:69:20: note: ‘<anonymous>’ declared here
    14   69 |     SetInternalName(std::move(name));
    15      |     ~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
    16In file included from /usr/include/string.h:535,
    17                 from /usr/include/c++/11/cstring:42,
    18                 from util/threadnames.cpp:7:
    19In function ‘void* memcpy(void*, const void*, size_t)’,
    20    inlined from ‘void SetInternalName(std::string)’ at util/threadnames.cpp:50:16,
    21    inlined from ‘void util::ThreadRename(std::string&&)’ at util/threadnames.cpp:64:20:
    22/usr/include/x86_64-linux-gnu/bits/string_fortified.h:29:33: error: ‘void* __builtin_memcpy(void*, const void*, long unsigned int)’ forming offset [32, 143] is out of the bounds [0, 32] of object ‘<anonymous>’ with type ‘std::string’ {aka ‘std::__cxx11::basic_string<char>’} [-Werror=array-bounds]
    23   29 |   return __builtin___memcpy_chk (__dest, __src, __len,
    24      |          ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
    25   30 |                                  __glibc_objsize0 (__dest));
    26      |                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~
    27util/threadnames.cpp: In function ‘void util::ThreadRename(std::string&&)’:
    28util/threadnames.cpp:64:20: note: ‘<anonymous>’ declared here
    29   64 |     SetInternalName(std::move(name));
    30      |     ~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
    
  9. vasild force-pushed on May 14, 2024
  10. vasild commented at 9:48 am on May 14, 2024: contributor
    13f438a667...c5f9afd946: fix the above and return std::string instead of std::string_view which internally has a pointer to the thread_local variable. This way the API can’t the misused to store the reference to the thread_local longer than the thread’s lifespan.
  11. in configure.ac:1054 in c5f9afd946 outdated
    1048@@ -1049,11 +1049,6 @@ if test "$use_thread_local" = "yes" || test "$use_thread_local" = "auto"; then
    1049           dnl https://gist.github.com/jamesob/fe9a872051a88b2025b1aa37bfa98605
    1050           AC_MSG_RESULT([no])
    1051           ;;
    1052-        *freebsd*)
    1053-          dnl FreeBSD's implementation of thread_local is also buggy (per
    1054-          dnl https://groups.google.com/d/msg/bsdmailinglist/22ncTZAbDp4/Dii_pII5AwAJ)
    


    fanquake commented at 11:10 am on May 14, 2024:

    This is being removed, but from the discussion in https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=278701, I see

    I started investigating this from here: https://groups.google.com/g/bsdmailinglist/c/22ncTZAbDp4/m/Dii_pII5AwAJ I am not sure if that comment has merit. If that is an issue, it is different than the one from this bug report.

    So if it’s a different issue, unrelated to the one just fixed, or it’s still unclear, why is the assumption this should be removed? At least, this change seems separate from what is being done here.


    vasild commented at 12:55 pm on May 14, 2024:
    My understanding is that it is safe to use thread_local on FreeBSD for variables that do not have a destructor even in the presence of the above bug (or bugs if they are two separate bugs).
  12. DrahtBot removed the label CI failed on May 14, 2024
  13. in src/util/threadnames.cpp:50 in c5f9afd946 outdated
    40@@ -40,27 +41,31 @@ static void SetThreadName(const char* name)
    41 // global.
    42 #if defined(HAVE_THREAD_LOCAL)
    43 
    44-static thread_local std::string g_thread_name;
    45-const std::string& util::ThreadGetInternalName() { return g_thread_name; }
    46+static thread_local char g_thread_name[128]{'\0'};
    


    hebasto commented at 3:10 pm on May 14, 2024:
    Maybe add a comment to prevent future attempts to reintroduce std::string?

    fanquake commented at 4:23 am on May 15, 2024:
    Seems like a weird comment to add without specfic/obvious motivation, and doesn’t prevent the same thing from happening elsewhere in the codebase?

    laanwj commented at 9:59 am on May 16, 2024:
    Adding a comment would make sense. Storing a POD data structure in thread-local data instead of a C++ object that allocates on the heap is simpler and more widely supported. But this may not be obvious to someone coming across this code for the first time (and not aware of the long history here). Please also mention that this should be the only use of thread-local data in the program.

    vasild commented at 4:17 pm on May 16, 2024:
    Added a comment.
  14. hebasto commented at 3:11 pm on May 14, 2024: member
    Approach ACK c5f9afd946efb4eb6b27b2d0316f2f787a9608c7.
  15. hebasto added the label Needs CMake port on May 14, 2024
  16. hebasto approved
  17. hebasto commented at 7:38 pm on May 14, 2024: member

    ACK c5f9afd946efb4eb6b27b2d0316f2f787a9608c7, tested on FreeBSD 14.0.

    ./src/bitcoind -logthreadnames works as expected.

  18. fanquake commented at 5:23 am on May 15, 2024: member

    My understanding is that it is safe to use thread_local on FreeBSD for variables that do not have a destructor

    So if we are moving forward with this assumption, what is preventing these kinds of variables being reintroduced (elsewhere) into the codebase? I’d rather thread_local be safe to use (in all circumstance) on a platform, or we just not use it.

    This change feels a bit odd/forced because it’s basically opting back into thread_local, but only certain usage/uncertain assumptions, and there’s nothing to prevent regressions. Putting us into this middle-ground doesn’t seem so productive, especially given there’s currently no real need.

  19. laanwj commented at 9:51 am on May 15, 2024: member

    Concept ACK as i said here: #29952 (comment)

    I’d rather thread_local be safe to use (in all circumstance) on a platform, or we just not use it.

    Sure, that’d be ideal, but handling destructors with TLS is very hard and easy to get wrong. It’s not something we want to rely on.

    Ideally we’d get rid of thread-local usage completely. But we’ve considered alternative solutions, and tracking some data until a thread disappears is hard to do also manually. i have a PR somewhere that keeps track of the thread names in a mutex-protected map, but it leaks (as well as is less efficient).

    This should be the only place we really need is. So i think doing something special is fine here.

    We should avoid any further TLS usage being introduced. But here it’s warranted, imo.

  20. util: avoid using thread_local variable that has a destructor
    Store the thread name in a `thread_local` variable of type `char[]`
    instead of `std::string`. This avoids calling the destructor when
    the thread exits. This is a workaround for
    https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=278701
    
    For type-safety, return `std::string` from
    `util::ThreadGetInternalName()` instead of `char[]`.
    
    As a side effect of this change, we no longer store a reference
    to a `thread_local` variable in `CLockLocation`. This was
    dangerous because if the thread quits while the reference still
    exists (in the global variable `lock_data`, see inside `GetLockData()`)
    then the reference will become dangling.
    d35ba1b3f1
  21. vasild force-pushed on May 16, 2024
  22. vasild commented at 4:17 pm on May 16, 2024: contributor
    c5f9afd946...d35ba1b3f1: add a comment as suggested above
  23. hebasto approved
  24. hebasto commented at 7:17 pm on May 16, 2024: member
    re-ACK d35ba1b3f16071b8fe9b36398ba15352dbf2a54d.
  25. DrahtBot requested review from laanwj on May 16, 2024
  26. theuni approved
  27. theuni commented at 4:56 pm on May 17, 2024: member

    Agree with @laanwj’s take.

    utACK d35ba1b3f16071b8fe9b36398ba15352dbf2a54d

  28. in src/util/threadnames.cpp:47 in d35ba1b3f1
    40@@ -40,27 +41,37 @@ static void SetThreadName(const char* name)
    41 // global.
    42 #if defined(HAVE_THREAD_LOCAL)
    43 
    44-static thread_local std::string g_thread_name;
    45-const std::string& util::ThreadGetInternalName() { return g_thread_name; }
    46+/**
    47+ * The name of the thread. We use char array instead of std::string to avoid
    48+ * complications with running a destructor when the thread exits. Avoid adding
    49+ * other thread_local variables.
    


    fanquake commented at 2:17 am on May 18, 2024:
    If we are outlawing thread_local usage, then at the same time, this commit should remove other comments from our codebase suggesting further thread_local usage. i.e * If thread-safety is needed, the global could be made thread_local (given

    laanwj commented at 7:40 am on May 18, 2024:

    Agree! i mean in a hypothetical future (say, C++30) when every compiler and OS finally implements thread-local storage correctly including destructors, it could be reconsidered, but i think a “no-thread-local” policy makes sense until then.

    TBH in general i don’t really like comments that suggest TODOs. It’s better to keep track of those things in issues where discussion and context is visible.


    fanquake commented at 9:19 am on May 21, 2024:
    I’m going to do this in #30137.
  29. laanwj commented at 7:43 am on May 18, 2024: member
    Code review ACK d35ba1b3f16071b8fe9b36398ba15352dbf2a54d
  30. fanquake merged this on May 21, 2024
  31. fanquake closed this on May 21, 2024

  32. fanquake referenced this in commit 1e7c20bc19 on May 21, 2024
  33. in src/util/threadnames.cpp:48 in d35ba1b3f1
    45-const std::string& util::ThreadGetInternalName() { return g_thread_name; }
    46+/**
    47+ * The name of the thread. We use char array instead of std::string to avoid
    48+ * complications with running a destructor when the thread exits. Avoid adding
    49+ * other thread_local variables.
    50+ * @see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=278701
    


    maflcko commented at 11:10 am on May 21, 2024:
    So when the minimum freeBSD is 15.0, this workaround can be reverted? It would be good to clarify this in the source code.

    fanquake commented at 12:35 pm on May 21, 2024:
    This was already backported to the 13 and 14 branches, but we could add another comment here.

    maflcko commented at 12:46 pm on May 21, 2024:
    Ok, so assuming this was fixed, it can probably be reverted some time next year, given the EOL policy https://www.freebsd.org/security/#sup

    theuni commented at 1:01 pm on May 21, 2024:
    I think not using thread_local vars with destructors is a reasonable policy anyway, regardless of where it’s supported.

    maflcko commented at 1:11 pm on May 21, 2024:
    I see, so the comment should say: “While this particular bug has been fixed, thread_local and especially thread_local with non-POD objects should be avoided.”?

    theuni commented at 2:20 pm on May 21, 2024:
    That’d be my preference, yes. I’ll see about adding a clang-tidy check for that combo.

    maflcko commented at 2:35 pm on May 21, 2024:

    clang-tidy check

    A faster and easier approximation would be to use git grep -l thread_local, excluding this file (threadnames.cpp), and the normal lint-exclude list.

  34. vasild deleted the branch on May 21, 2024
  35. hebasto commented at 2:23 pm on June 5, 2024: member
    Ported to the CMake-based build system in https://github.com/hebasto/bitcoin/pull/214.
  36. hebasto removed the label Needs CMake port on Jun 5, 2024
  37. theuni commented at 2:35 pm on June 5, 2024: member
    @hebasto I think this is only ~half of what needs to be done here. Need to port #30137 as well, which should make this even simpler for CMake.
  38. theuni commented at 2:36 pm on June 5, 2024: member
    Ah nm, I see, that’s already done :)

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-01 13:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me