coins: make sure PoolAllocator uses the correct alignment #28913

pull martinus wants to merge 2 commits into bitcoin:master from martinus:2023-11-fix-pool-allocation-alignment-on-ARM changing 4 files +17 −15
  1. martinus commented at 4:25 pm on November 19, 2023: contributor

    The class CTxOut has a member CAmount which is an int64_t, and on ARM 32bit int64_t are 8 byte aligned, which is larger than the pointer alignment of 4 bytes.

    So for CCoinsMap to be able to use the pool, we need to use the alignment of the member instead of just alignof(void*).

    This fixes #28906 (first noted in #28718 (comment)) and #28440.

  2. DrahtBot commented at 4:25 pm on November 19, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK pinheadmz, hebasto, theStack
    Concept ACK pablomartin4btc
    Stale ACK sipa

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

  3. DrahtBot added the label UTXO Db and Indexes on Nov 19, 2023
  4. mzumsande commented at 5:14 pm on November 19, 2023: contributor
    The pattern of alignof(void*) was used in 3 places in #25325 (also bench and tests) - looks like those other spots could be changed as well? (just a superficial observation, I didn’t review that PR)
  5. fanquake added this to the milestone 26.0 on Nov 19, 2023
  6. hebasto commented at 5:30 pm on November 19, 2023: member

    The pattern of alignof(void*) was used in 3 places in #25325 (also bench and tests) - looks like those other spots could be changed as well? (just a superficial observation, I didn’t review that PR)

    Test doesn’t fail now. Maybe replace int with int64_t?

  7. martinus commented at 5:40 pm on November 19, 2023: contributor

    The pattern of alignof(void*) was used in 3 places

    Thanks, I forgot about these… I’ll change the PoolAllocator to default to the alignment of the given type, that makes much more sense and reduces the code.

    Test doesn’t fail now. Maybe replace int with int64_t?

    Good idea, I’ll change that too

  8. pool: make sure PoolAllocator uses the correct alignment
    This changes the PoolAllocator to default the alignment to the given type. This makes the code simpler, and most importantly
    fixes a bug on ARM 32bit that caused OOM: The class CTxOut has a member CAmount which is an int64_t and on ARM 32bit int64_t
    are 8 byte aligned which is larger than the pointer alignment of 4 bytes. So for CCoinsMap to be able to use the pool, we
    need to use the alignment of the member instead of just alignof(void*).
    ce881bf9fc
  9. martinus force-pushed on Nov 19, 2023
  10. hebasto commented at 6:08 pm on November 19, 2023: member
    Concept ACK.
  11. pinheadmz commented at 7:19 pm on November 19, 2023: member
    concept ACK running this branch on my RPi same as #28718 (comment) will update
  12. sipa commented at 7:28 pm on November 19, 2023: member
    utACK 486c2916a18b56d6011117077d906c1f8a81623f
  13. DrahtBot requested review from pinheadmz on Nov 19, 2023
  14. DrahtBot requested review from hebasto on Nov 19, 2023
  15. pinheadmz commented at 0:44 am on November 20, 2023: member
    Up to height 376554 now which is way further than I crashed at on 26rc2. htop says RAM is only about 890MB used. Are there any RPCs I should check?
  16. maflcko added the label Needs backport (26.x) on Nov 20, 2023
  17. hebasto approved
  18. hebasto commented at 9:39 am on November 20, 2023: member

    ACK 486c2916a18b56d6011117077d906c1f8a81623f, tested on ARM 32-bit (ODROID-HC1 + Armbian 23.8.3 jammy) both native and Guix binaries.

    The first commit should not introduce any behaviour change in any other release platforms as they are 64-bit and it holds that alignof(T) == alignof(void*) == 8 for all of them.

    I agree with the second commit diff. It indeed handles the case where “int64_t is aligned to 8 bytes but void* is aligned to 4 bytes”. But, due do its design flaw, the test still not failing in such a case, i.e., on ARM 32-bit, if the first commit were reverted. I believe that the reason of that is the test does not cover cases where IsFreeListUsable returns false only due to the alignment <= ELEM_ALIGN_BYTES condition fails while the bytes <= MAX_BLOCK_SIZE_BYTES condition still holds. It can be improved in a follow up.

    nit: The 486c2916a18b56d6011117077d906c1f8a81623f commit message has a stray substring in the end.

    Suggesting for backporting to the 26.x branch. EDIT: nm, already tagged :)

  19. martinus commented at 10:24 am on November 20, 2023: contributor
    @pinheadmz I don’t think there’s a need to test any any RPC calls, if sync doesn’t go OOM it means it is now working :)
  20. pinheadmz commented at 3:16 pm on November 20, 2023: member

    confirmed unit passes on arm32 but also passes without the actual bugfix commit. so effectively this on top of d752349029:

     0commit 2c7f49e83a5d05a5ba8b29d52108c14bf4d65af7
     1Author: Martin Leitner-Ankerl <martin.ankerl@gmail.com>
     2Date:   Sun Nov 19 18:43:15 2023 +0100
     3
     4    pool: change memusage_test to use int64_t
     5    
     6    If alignment of the PoolAllocator would be insufficient, then the test would fail. This also catches the issue with ARM 32bit, where int64_t is aligned to 8 bytes but void* is aligned to 4 bytes.catches cause the ARM 32bit tests to fail when a
     7
     8diff --git a/src/test/pool_tests.cpp b/src/test/pool_tests.cpp
     9index 8a07e09a44..9590f80a33 100644
    10--- a/src/test/pool_tests.cpp
    11+++ b/src/test/pool_tests.cpp
    12@@ -156,21 +156,21 @@ BOOST_AUTO_TEST_CASE(random_allocations)
    13 
    14 BOOST_AUTO_TEST_CASE(memusage_test)
    15 {
    16-    auto std_map = std::unordered_map<int, int>{};
    17-
    18-    using Map = std::unordered_map<int,
    19-                                   int,
    20-                                   std::hash<int>,
    21-                                   std::equal_to<int>,
    22-                                   PoolAllocator<std::pair<const int, int>,
    23-                                                 sizeof(std::pair<const int, int>) + sizeof(void*) * 4,
    24+    auto std_map = std::unordered_map<int64_t, int64_t>{};
    25+
    26+    using Map = std::unordered_map<int64_t,
    27+                                   int64_t,
    28+                                   std::hash<int64_t>,
    29+                                   std::equal_to<int64_t>,
    30+                                   PoolAllocator<std::pair<const int64_t, int64_t>,
    31+                                                 sizeof(std::pair<const int64_t, int64_t>) + sizeof(void*) * 4,
    32                                                  alignof(void*)>>;
    33     auto resource = Map::allocator_type::ResourceType(1024);
    34 
    35     PoolResourceTester::CheckAllDataAccountedFor(resource);
    36 
    37     {
    38-        auto resource_map = Map{0, std::hash<int>{}, std::equal_to<int>{}, &resource};
    39+        auto resource_map = Map{0, std::hash<int64_t>{}, std::equal_to<int64_t>{}, &resource};
    40 
    41         // can't have the same resource usage
    42         BOOST_TEST(memusage::DynamicUsage(std_map) != memusage::DynamicUsage(resource_map));
    
  21. hebasto commented at 3:53 pm on November 20, 2023: member
    If anyone consider the second commit as a blocker, it might be dropped from this PR, no?
  22. pool: change memusage_test to use int64_t, add allocation check
    If alignment of the PoolAllocator would be insufficient, then the test would fail. This also catches the issue with ARM 32bit,
    where int64_t is aligned to 8 bytes but void* is aligned to 4 bytes. The test adds a check to ensure the pool has allocated
    a minimum number of chunks
    d5b4c0b69e
  23. martinus force-pushed on Nov 20, 2023
  24. martinus commented at 4:14 pm on November 20, 2023: contributor
    I’ve updated the test in the PR, this time I think it should have caught the issue. It should be reproducible by adding an alignof(void*) back as the PoolAllocator’s last template argument
  25. hebasto commented at 4:36 pm on November 20, 2023: member
    A failure in the “Win64 native, VS 2022” CI job is unrelated. See: #28905.
  26. theStack commented at 5:35 pm on November 20, 2023: contributor

    Concept ACK

    Due to a lack of real hardware I’ve set up an emulated 32-bit ARM system using qemu, maybe someone finds this useful: https://gist.github.com/theStack/6eaeadd3fa1e521ed038b1ed93c101d6

    (Of course, the obvious drawback is that it’s very sloooow.)

  27. pinheadmz approved
  28. pinheadmz commented at 6:29 pm on November 20, 2023: member

    ACK d5b4c0b69e543de51bb37d602d488ee0949ba185

    Synced up to height 472600 on ARM 32 no issues, < 1 GB RAM used. Also confirmed the test fails without the patch:

    test/pool_tests.cpp(189): error: in "pool_tests/memusage_test": check resource.NumAllocatedChunks() >= min_num_allocated_chunks has failed [1 < 157]

    Great catch, great work!

     0-----BEGIN PGP SIGNED MESSAGE-----
     1Hash: SHA256
     2
     3ACK d5b4c0b69e543de51bb37d602d488ee0949ba185
     4-----BEGIN PGP SIGNATURE-----
     5
     6iQIzBAEBCAAdFiEE5hdzzW4BBA4vG9eM5+KYS2KJyToFAmVbpSoACgkQ5+KYS2KJ
     7yTrHlw/9ERNwDT6jLISX7vGjXTY2/wxzItPlJbUBk6ieYHzAny0jvXm/NdxNMLpi
     8j5HUWX/gPAHOPz5csX7lKA0VAgIEfeBd5llMSI1H8nvQOwiyKFKHE8ijhyJh0hoS
     9oBW+lLbZwxp3tBAl6QHeiVYD4NG0j2BX66qvmfmAzPPYIEvqS/4x1JzUpLztRRgv
    10nukx4XjPIAecQO1bK1aQZoQFBgHziQYeUNEwsvdJ6eiWHlJx2iUgZPNikGP43pGr
    11cEPra1rQEz9MaRePX0FQvgky66ldn+MdoAL1xxnKhuINpX1xR0xI2M2FLYfPDxDG
    12IYz8gxmFAgWksc0gQt7OD0fIXrLVADWwc4QwBB+pCEereZC5by456/V6VCe8NXeD
    134DeEyvFygaMVVrv1CeliJ08Tu2tFbf9uDOpkhWliEpEyf62cCJaOK1hT1iVDEOZC
    14ZadYowUeTSObRv+brITAqOBaVIWhufGU4PfKgifj6ndB/Q8UD+p8qMKN6x+fcVvs
    15UxWctlnNsgMcXX20cL54Kbspd44RiKvpA5i8JZXWh8QCsO5+k9L/IQV3vf27duMs
    16jH8gOQN8tK9+W89e9M1nTtIWf/M/RPQdgDZfQjkJwBLg9x2daPv6RcSkUyS8VPz1
    17po4PF2mVKvzq9xa07Q7PnblISl7Eb8MDgH/TJ/g+cSAw62/h0gM=
    18=OBRK
    19-----END PGP SIGNATURE-----
    

    pinheadmz’s public key is on keybase

  29. DrahtBot requested review from theStack on Nov 20, 2023
  30. DrahtBot requested review from sipa on Nov 20, 2023
  31. DrahtBot requested review from hebasto on Nov 20, 2023
  32. hebasto approved
  33. hebasto commented at 8:15 pm on November 20, 2023: member

    re-ACK d5b4c0b69e543de51bb37d602d488ee0949ba185, the only change since my recent review is an updated test.

    I’ve tried to break the updated test with the diff as follows:

     0--- a/src/test/pool_tests.cpp
     1+++ b/src/test/pool_tests.cpp
     2@@ -163,7 +163,8 @@ BOOST_AUTO_TEST_CASE(memusage_test)
     3                                    std::hash<int64_t>,
     4                                    std::equal_to<int64_t>,
     5                                    PoolAllocator<std::pair<const int64_t, int64_t>,
     6-                                                 sizeof(std::pair<const int64_t, int64_t>) + sizeof(void*) * 4>>;
     7+                                                 sizeof(std::pair<const int64_t, int64_t>) + sizeof(void*) * 4,
     8+                                                 alignof(void*)>>;
     9     auto resource = Map::allocator_type::ResourceType(1024);
    10 
    11     PoolResourceTester::CheckAllDataAccountedFor(resource);
    

    It passes on x86_64 and fails on ARM 32-bit as expected.

  34. pablomartin4btc commented at 9:53 pm on November 20, 2023: member
    Concept ACK
  35. theStack approved
  36. theStack commented at 2:36 pm on November 21, 2023: contributor

    Tested ACK d5b4c0b69e543de51bb37d602d488ee0949ba185

    Using the emulated 32-bit ARM machine using qemu and Debian bookworm for armhf, I could both reproduce issue #28906 on master and verify the fix in this PR. Having 2GB of RAM available without swap memory, bitcoind failed due to OOM on block height 322923 on the master branch, as the coins flush never happened up to that height due to an underestimation of the cache size. On the PR branch, the flushes happen regularly before that height as expected, and the RAM usage never exceeds 0.9 GB.

    The code looks correct and as mentioned by hebasto above, shouldn’t have any effect on the other architectures (all of them being 64-bit) that we support.

    I also verified that the modified memusage_test succeeds on the PR branch and fails if cherry-picked on master.

    Thanks for the fix!

  37. DrahtBot requested review from pablomartin4btc on Nov 21, 2023
  38. fanquake merged this on Nov 22, 2023
  39. fanquake closed this on Nov 22, 2023

  40. fanquake removed the label Needs backport (26.x) on Nov 22, 2023
  41. fanquake commented at 11:33 am on November 22, 2023: member
    Will be backported in #28872.
  42. fanquake referenced this in commit bcc183ccce on Nov 22, 2023
  43. fanquake referenced this in commit 1488648104 on Nov 22, 2023
  44. pablomartin4btc approved
  45. pablomartin4btc commented at 3:17 pm on November 22, 2023: member

    Tested ACK d5b4c0b69e543de51bb37d602d488ee0949ba185

     02023-11-22T08:24:03Z UpdateTip: new best=00000000000000001771de0c2dc0ff95d619943d4160308c93f9184fcee8d5f8 height=318617 version=0x00000002 log2_work=80.467750 tx=45766401 date='2014-09-01T14:44:48Z' progress=0.049383 cache=107.7MiB(14296037txo)
     1[44951.075682] b-addcon invoked oom-killer: gfp_mask=0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), order=0, oom_score_adj=0
     2[44951.078750] CPU: 0 PID: 15497 Comm: b-addcon Not tainted 6.1.0-13-armmp-lpae [#1](/bitcoin-bitcoin/1/)  Debian 6.1.55-1
     3[44951.078960] Hardware name: Generic DT based system
     4[44951.079486]  unwind_backtrace from show_stack+0x18/0x1c
     5[44951.080028]  show_stack from dump_stack_lvl+0x40/0x4c
     6[44951.080112]  dump_stack_lvl from dump_header+0x50/0x1f4
     7[44951.080193]  dump_header from oom_kill_process+0x22c/0x238
     8[44951.080260]  oom_kill_process from out_of_memory+0x218/0x34c
     9[44951.080327]  out_of_memory from __alloc_pages+0xe64/0x1088
    10[44951.080394]  __alloc_pages from __folio_alloc+0x18/0x3c
    11[44951.080576]  __folio_alloc from __filemap_get_folio+0x160/0x40c
    12[44951.080654]  __filemap_get_folio from filemap_fault+0x25c/0x9c4
    13[44951.080723]  filemap_fault from __do_fault+0x44/0x158
    14[44951.080814]  __do_fault from handle_mm_fault+0xd38/0x12f4
    15[44951.080906]  handle_mm_fault from do_page_fault+0x17c/0x324
    16[44951.080998]  do_page_fault from do_PrefetchAbort+0x38/0x88
    17[44951.081087]  do_PrefetchAbort from ret_from_exception+0x0/0x28
    18[44951.081213] Exception stack(0xf0a89fb0 to 0xf0a89ff8)
    19[44951.081415] 9fa0:                                     01c70ed8 00000081 00000001 00000000
    20[44951.081536] 9fc0: 00000000 00000001 af3bdb88 00000013 6dc0dff0 00000013 6dc0e008 af3bdc6c
    21[44951.081656] 9fe0: 000000f0 af3bda9c b6bf378f b6bb7610 60030030 ffffffff
    22[44951.081995] Mem-Info:
    23[44951.082176] active_anon:57 inactive_anon:497856 isolated_anon:0
    24[44951.082176]  active_file:14 inactive_file:33 isolated_file:0
    25[44951.082176]  unevictable:64 dirty:0 writeback:0
    26[44951.082176]  slab_reclaimable:2360 slab_unreclaimable:2143
    27[44951.082176]  mapped:3 shmem:94 pagetables:1168
    28[44951.082176]  sec_pagetables:0 bounce:0
    29[44951.082176]  kernel_misc_reclaimable:0
    30[44951.082176]  free:7475 free_pcp:372 free_cma:0
    31[44951.082844] Node 0 active_anon:228kB inactive_anon:1991424kB active_file:56kB inactive_file:132kB unevictable:256kB isolated(anon):0kB isolated(file):0kB mapped:12kB dirty:0kB writeback:0kB shmem:376kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 1402880kB writeback_tmp:0kB kernel_stack:584kB pagetables:4672kB sec_pagetables:0kB all_unreclaimable? no
    32[44951.083418] DMA free:29468kB boost:0kB min:22528kB low:28160kB high:33792kB reserved_highatomic:2048KB active_anon:0kB inactive_anon:689528kB active_file:80kB inactive_file:132kB unevictable:0kB writepending:0kB present:786432kB managed:747580kB mlocked:0kB bounce:0kB free_pcp:724kB local_pcp:724kB free_cma:0kB
    33[44951.083926] lowmem_reserve[]: 0 0 1280 1280
    34[44951.084301] HighMem free:432kB boost:0kB min:512kB low:10724kB high:20936kB reserved_highatomic:0KB active_anon:228kB inactive_anon:1301896kB active_file:20kB inactive_file:40kB unevictable:256kB writepending:0kB present:1310720kB managed:1310720kB mlocked:256kB bounce:0kB free_pcp:764kB local_pcp:764kB free_cma:0kB
    35[44951.084665] lowmem_reserve[]: 0 0 0 0
    36[44951.084788] DMA: 139*4kB (UMEH) 226*8kB (UME) 174*16kB (UME) 100*32kB (ME) 98*64kB (UME) 62*128kB (ME) 27*256kB (M) 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB = 29468kB
    37[44951.085242] HighMem: 0*4kB 0*8kB 11*16kB (U) 8*32kB (U) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB = 432kB
    38[44951.085518] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
    39[44951.085660] 150 total pagecache pages
    40[44951.085742] 0 pages in swap cache
    41[44951.085802] Free swap  = 0kB
    42[44951.085858] Total swap = 0kB
    43[44951.085946] 524288 pages RAM
    44[44951.085997] 327680 pages HighMem/MovableOnly
    45[44951.086059] 9713 pages reserved
    46[44951.086117] 4096 pages cma reserved
    47[44951.086203] Tasks state (memory values in pages):
    48[44951.086277] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
    49[44951.086484] [    212]     0   212     5251      182    53248        0         -1000 systemd-udevd
    50[44951.086657] [    267]   997   267     4934      150    49152        0             0 systemd-timesyn
    51[44951.086788] [    290]     0   290      946      146    32768        0             0 dhclient
    52[44951.086912] [    292]     0   292     1766       46    45056        0             0 cron
    53[44951.087031] [    293]   100   293     1551      113    36864        0          -900 dbus-daemon
    54[44951.087160] [    298]     0   298     2752      177    45056        0             0 systemd-logind
    55[44951.087286] [    319]     0   319     1001       14    32768        0             0 agetty
    56[44951.087415] [    327]     0   327     2624      177    49152        0         -1000 sshd
    57[44951.087540] [   1790]     0  1790     2152      107    40960        0             0 login
    58[44951.087661] [   1797]  1000  1797     3053      242    49152        0           100 systemd
    59[44951.087839] [   1799]  1000  1799     7818      490    57344        0           100 (sd-pam)
    60[44951.087982] [   1814]  1000  1814     1970      242    40960        0             0 bash
    61[44951.088089] [  15486]  1000 15486   530244   495348  4116480        0             0 bitcoind-master
    62[44951.088208] [  15657]     0 15657     4699      157    57344        0          -250 systemd-journal
    63[44951.088338] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-1000.slice/session-4.scope,task=bitcoind-master,pid=15486,uid=1000
    64[44951.089813] Out of memory: Killed process 15486 (bitcoind-master) total-vm:2120976kB, anon-rss:1981392kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:4020kB oom_score_adj:0
    65Killed
    

    I coudn’t reproduce the issue using this PR branch.

    Verifying the issue is not happening in x86 64bits even forcing the modified test as mentioned above by other reviewers.


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-05 19:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me