util: generalize accounting of system-allocated memory in pool resource #27748

pull LarryRuane wants to merge 1 commits into bitcoin:master from LarryRuane:2023-05-resource-pool-system-alloc changing 4 files +52 −16
  1. LarryRuane commented at 3:32 am on May 25, 2023: contributor

    Follow-on to PR #25325, “Add pool based memory resource”

    The DynamicUsage() function for the version of unordered_map that uses the PoolAllocator returns a close approximation of the amount of physical memory used by the map. (This is the map used for the dbcache.) It accounts for several of the allocator’s internal data structures, such as the memory chunks and the freelist. It also includes the unordered_map’s bucket array (vector), which is a bit out of place because the rest of the pool allocator doesn’t know or make any assumption about the type of container that the pool allocator is being used for, namely, currently only std::unordered_map.

    This change could prevent a possible future pool memory usage calculation error, although it would likely be a small error: If the pool is configured with a large MAX_BLOCK_SIZE_BYTES, it could turn out that the size of the bucket array is small enough to be allocated by the resource allocator; it would then be double-counted.

    Another aspect of the pool allocator that this PR improves is a refactor of the pool allocator’s DynamicUsage() implementation, to move details about the pool allocator’s internal data structures out of memusage.h, which is a general source file, to the allocator’s pool.h.

  2. in src/support/allocators/pool.h:190 in 40a4a3adf9 outdated
    180+        } else if (sizeof(void*) == 4) {
    181+            return ((alloc + 15) >> 3) << 3;
    182+        } else {
    183+            assert(0);
    184+        }
    185+    }
    


    LarryRuane commented at 3:34 am on May 25, 2023:
    Copied this from https://github.com/bitcoin/bitcoin/blob/a13f3746dccd9c4ec16d6bfe9b33ebd26e3238e1/src/memusage.h#L50 because including that file would create a circular dependency. Is there a better way to do this?
  3. in src/test/pool_tests.cpp:58 in 40a4a3adf9 outdated
    53@@ -54,33 +54,40 @@ BOOST_AUTO_TEST_CASE(basic_allocating)
    54     PoolResourceTester::CheckAllDataAccountedFor(resource);
    55     BOOST_TEST(1 == PoolResourceTester::FreeListSizes(resource)[1]);
    56     BOOST_TEST(expected_bytes_available == PoolResourceTester::AvailableMemoryFromChunk(resource));
    57+    // the system allocates 32 bytes for an 8-byte allocation
    58+    BOOST_TEST(32 == resource.SystemBytes());
    


    LarryRuane commented at 3:37 am on May 25, 2023:
    This test may be somewhat fragile; it depends on MallocUsage(8) == 32. Maybe the test condition here should be 8 <= resource.SystemBytes(), but I think that doesn’t test enough.

    LarryRuane commented at 4:19 am on May 25, 2023:
    CI failed here (on 32-bit), so I force-pushed to make the test more precise.
  4. LarryRuane force-pushed on May 25, 2023
  5. LarryRuane commented at 4:17 am on May 25, 2023: contributor

    How much of a difference does this make? I ran IBD on Ubuntu using the default dbcache (450) and maxmempool (300), which gives a total cache size (during IBD) of about 750 MB, and this (missing) system allocation amount was 97 MB (96941536 bytes).

    With this amount of allocation omitted from the memory usage estimate (that is, current master), nodes will use 97 MB more memory than expected, or about 13% more than configured, a pretty significant error. This regression could cause surprises on low-memory systems.

    I also verified that the operating system (Ubuntu) reports the process using more memory on master than with this PR.

    0$ grep VmData /proc/$(pgrep bitcoind)/status
    

    That reports 1281764 kB on master and 1203800 on this PR branch. (This is after IBD had been running long enough to do at least one cache flush.) I’m not sure how reliable that is, or if that’s the correct value to look at (proc/status show many memory-related values).

    The top command reported the RES (resident) memory size as “1.7g” on master and “1.3g” on the PR branch, but that seems to be too much of a difference (about 400 MB), so take that with a grain of salt.

    It would be helpful if others could try to reproduce these results.

    cc @martinus @theStack @sipa @jonatack

  6. in src/memusage.h:186 in 576a270bbc outdated
    182@@ -183,7 +183,9 @@ static inline size_t DynamicUsage(const std::unordered_map<Key,
    183     size_t estimated_list_node_size = MallocUsage(sizeof(void*) * 3);
    184     size_t usage_resource = estimated_list_node_size * pool_resource->NumAllocatedChunks();
    185     size_t usage_chunks = MallocUsage(pool_resource->ChunkSizeBytes()) * pool_resource->NumAllocatedChunks();
    186-    return usage_resource + usage_chunks + MallocUsage(sizeof(void*) * m.bucket_count());
    187+    size_t usage_system = pool_resource->SystemBytes();
    


    martinus commented at 4:50 am on May 25, 2023:

    I don’t think that calculation is now more correct, in fact this now counts the index array twice; once in usage_system and once in the estimated usage_buckets.

    The most correct way is probably to just let the pool do all the malloc & free accounting, and don’t do any other estimation. Then the DynamicUsage method boils down to something like return m.get_allocator().resource().memory_usage();. That would give more theoretically correct numbers, but in practice I doubt that this is much different from what we already have.

    I think the current inaccuray comes from either the mystery behind glibc & how the OS kernel does its thing, or some other data structures are not correctly accounted for.


    LarryRuane commented at 5:24 am on May 25, 2023:

    I see, yes, it’s counting the buckets twice, thanks. I agree that either way gives the same answer; the way I’m suggesting here would isolate this pool code a little more from the details of std::unordered_map (i.e., the existence of the m.bucket_count() method), which would be a little more general, but I think the way it is now is fine. I guess if we use this pool resource for a different kind of container (for example, an ordered map), then this PR’s approach would be better since we wouldn’t have to understand the internals of, say, std::map.

    I’ll go ahead and close this now, we can keep it in mind for possible future use.


    martinus commented at 6:07 am on May 25, 2023:
    I agree, also with current memory accounting code its not possible to use the same memory resource for multiple containers. So it would be nice to add full accounting to the pool, and then remove the special handling of std::unordered_map with pool and just count the ressource’s total usage.
  7. LarryRuane commented at 5:25 am on May 25, 2023: contributor
    Closing at least for now, see the comment above.
  8. LarryRuane closed this on May 25, 2023

  9. martinus commented at 5:38 am on May 25, 2023: contributor
    Also related: #26614
  10. DrahtBot commented at 6:06 am on May 25, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process. A summary of reviews will appear here.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #28939 (memusage: let PoolResource keep track of all allocated/deallocated memory by martinus)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  11. LarryRuane renamed this:
    bugfix: account for system-allocated memory in pool resource
    refactor: generalize accounting of system-allocated memory in pool resource
    on May 26, 2023
  12. DrahtBot added the label Refactoring on May 26, 2023
  13. LarryRuane renamed this:
    refactor: generalize accounting of system-allocated memory in pool resource
    util: generalize accounting of system-allocated memory in pool resource
    on May 26, 2023
  14. LarryRuane commented at 10:43 pm on May 26, 2023: contributor
    I decided to reopen this; I think it’s a good generalization improvement to the pool allocator. See the description for details. Thanks, @martinus, for the suggested improvement!
  15. LarryRuane commented at 10:49 pm on May 26, 2023: contributor

    GitHub isn’t allowing me to reopen this; @achow101 could you re-open this for me, please? Or, if you’d prefer, I could make a new PR. Thanks!

    TL;DR on this PR’s history: When I created this PR, I thought I was fixing a bug. But @martinus pointed out that there is no bug, so I closed this PR. I then decided this would be a worthwhile improvement after all (although of course not trying to fix a bug), so I’d like it to be reopened.

  16. ryanofsky commented at 11:20 pm on May 28, 2023: contributor

    GitHub isn’t allowing me to reopen this

    In order to reopen the PR, the branch needs to point at the same commit it was pointing to at the time the PR was closed.

    Right now the branch https://github.com/LarryRuane/bitcoin/commits/2023-05-resource-pool-system-alloc points to d25b54346fed931830cf3f538b96c5c346165487, but the PR points to 576a270bbcbb21725770b07ba32af8e0c8965f44. If you temporarily force push 576a270bbcbb21725770b07ba32af8e0c8965f44 to the branch, I think you should be able to reopen the PR.

  17. LarryRuane commented at 3:23 am on May 29, 2023: contributor
    Thanks, Ryan!
  18. LarryRuane reopened this on May 29, 2023

  19. LarryRuane force-pushed on May 29, 2023
  20. DrahtBot added the label CI failed on May 29, 2023
  21. in src/support/allocators/pool.h:180 in d25b54346f outdated
    173@@ -164,6 +174,21 @@ class PoolResource final
    174         m_allocated_chunks.emplace_back(m_available_memory_it);
    175     }
    176 
    177+    // copied from src/memusage.h
    178+    size_t MallocUsage(size_t alloc) const
    179+    {
    180+        // Measured on libc6 2.19 on Linux.
    


    fanquake commented at 9:54 am on May 29, 2023:
    Not sure what value this comment adds. Also don’t see a reason to mention a version of glibc that we don’t support?

    LarryRuane commented at 4:41 pm on May 29, 2023:

    I copied this function from memusage.h (here). I’m hoping not to duplicate this function, even though it’s pretty small. pool.h can’t include memusage.h because that creates a circular dependency. I’ll try to figure out a way to improve this and also of course I welcome suggestions.

    Anyway, if we need to duplicate this function, I’ll remove this comment, I agree it’s not useful.


    martinus commented at 7:18 am on May 30, 2023:
    About that function, I do not think that it is accurate any more. I wrote a different version of that estimation that makes use of malloc_usable_size, and that way I get different values, see here: https://godbolt.org/z/fde6T7srb It might be worth having a look if using that estimation is more accurate

    LarryRuane commented at 9:51 pm on May 31, 2023:

    Thanks, I got a slightly different result – malloc_usable_size() + sizeof(void*) is perfect (note adding only sizeof(void*) instead of 2 times that). I’m not set up for 32-bit, can you try running this test program on 32-bit if you are able to easily?

    Whichever is the most accurate, would a PR to improve memusage.h: MallocUsage() be worth doing? (I would say a separate PR would be better than doing that here; then this PR could be based on that one.)

     0// malloc space usage estimation
     1//
     2// g++ -std=c++17 -m64 t.cpp
     3//
     4// See:
     5// [#27748 (review)](/bitcoin-bitcoin/27748/#discussion_r1209810958)
     6//
     7// This program does a bunch of memory allocations of various
     8// sizes, and calculates the memory usage increase using /proc.
     9//
    10// For each allocation size, it prints:
    11//   allocation size
    12//   malloc_usable_size() + sizeof(void*)
    13//   empirical estimate
    14//   empirical actual
    15//
    16// This seems to show that, at least for x86_64, an accurate estimate is:
    17//
    18//   malloc_usable_size() + sizeof(void*)
    19//
    20// I don't know why we need to add sizeof one void* instead of two.
    21// I didn't try this with -m32 (32-bit model).
    22// (LarryRuane@gmail.com)
    23//
    24#include <malloc.h>
    25#include <algorithm>
    26#include <iostream>
    27#include <cstdio>
    28#include <cstdlib>
    29#include <fstream>
    30#include <unistd.h>
    31
    32size_t heapsize()
    33{
    34    std::ifstream maps("/proc/" + std::to_string(getpid()) + "/maps");
    35    std::string line;
    36    while (std::getline(maps, line)) {
    37        if (line.find("[heap]") == std::string::npos) continue;
    38        // example:
    39        // 563579244000-56358c384000 rw-p 00000000 00:00 0 [heap]
    40        std::string::size_type dash_pos{line.find('-')};
    41        std::string::size_type space_pos{line.find(' ')};
    42        std::string start_str{line.substr(0, dash_pos)};
    43        std::string end_str{line.substr(dash_pos+1, space_pos-dash_pos-1)};
    44
    45        size_t start{std::stoul(start_str, nullptr, 16)};
    46        size_t end{std::stoul(end_str, nullptr, 16)};
    47        return end - start;
    48    }
    49    return 0;
    50}
    51
    52constexpr size_t n_allocations{100'000};
    53int main(int argc, char** argv) {
    54    for (size_t s{0}; s < 120; ++s) {
    55        size_t start{heapsize()};
    56        void* p;
    57        for (int i = 0; i < n_allocations; ++i)
    58            p = malloc(s);
    59        size_t end{heapsize()};
    60        double actual{double(end-start) / n_allocations};
    61
    62        // empirically-derived, 64-bit linux x86_64
    63        size_t estimate{std::max<size_t>(32, ((s+7)/16+1)*16)};
    64        std::cout << s << ' '
    65            << malloc_usable_size(p) + sizeof(void*) << ' '
    66            << estimate << ' '
    67            << actual << '\n';
    68    }
    69    return 0;
    70}
    

    sipa commented at 1:51 pm on June 3, 2023:

    This is what I measure (GCC 12.2.0, glibc 2.37, x86_64 and i686):

    • 64-bit:

      • 1-24: 32
      • 25-40: 48
      • 41-56: 64
      • 57-72: 80
      • 73-88: 96
      • Rule: max(32, (X+23) & ~15)
    • 32-bit:

      • 1-12: 16
      • 13-28: 32
      • 29-44: 48
      • 45-60: 64
      • 61-76: 80
      • Rule: (X+19) & ~15
  22. DrahtBot commented at 9:50 am on August 8, 2023: contributor

    There hasn’t been much activity lately. What is the status here?

    Finding reviewers may take time. However, if the patch is no longer relevant, please close this pull request. If the author lost interest or time to work on this, please close it and mark it ‘Up for grabs’ with the label, so that it can be picked up in the future.

  23. achow101 commented at 4:45 pm on September 20, 2023: member
    Needs rebase if still relevant.
  24. util: generalize accounting of system-allocated memory in pool resource
    Follow-on to PR 25325, "Add pool based memory resource". No functional
    change.
    
    - Track non-internal system memory allocation (new and free) in m_system_alloc_bytes
    - Move DynamicUsage() implementation details from memusage.h to pool.h
    5d079aebc4
  25. LarryRuane force-pushed on Sep 21, 2023
  26. LarryRuane commented at 6:19 pm on September 21, 2023: contributor
    This is still relevant, rebased, but changing to Draft because it should build on another PR I’m working on to implement #27748 (review).
  27. LarryRuane marked this as a draft on Sep 21, 2023
  28. martinus commented at 7:04 am on November 20, 2023: contributor
    @LarryRuane are you still working on this? I think this shouldn’t be blocked by #28531.
  29. jonatack commented at 4:31 pm on November 28, 2023: member
    Question @martinus @LarryRuane, is #28939 an alternative to this PR?
  30. martinus commented at 5:37 pm on November 28, 2023: contributor

    The PRs are very similar and achieve the same thing, with a bit different styles. Some differences:

    • Here only system memory is continuously tracked and the rest is calculated on demand. In #28939 I track everything continuously. The result should be the same
    • In #28939 I moved MallocUsage into a separate function to avoid code duplication
    • I added another unit test in #28939 and fixed failing tests due to the new calculation (sometimes the old calculation understimated things)
  31. LarryRuane commented at 6:28 pm on November 28, 2023: contributor
    Your PR #28939 is definitely better, I especially like tracking total allocation continuously. I’ll go ahead and close this.
  32. LarryRuane closed this on Nov 28, 2023

  33. bitcoin locked this on Nov 27, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-04 06:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me