validation: fix broken maxtipage comparison #26477

pull jamesob wants to merge 2 commits into bitcoin:master from jamesob:2022-11-fix-maxtipage changing 2 files +15 −10
  1. jamesob commented at 5:10 pm on November 9, 2022: member

    Since https://github.com/bitcoin/bitcoin/commit/faf44876db555f7488c8df96db9fa88b793f897c, the maxtipage comparison in IsInitialBlockDownload() has been broken, since the NodeClock::now() time_point is in the system’s native denomination (nanoseconds).

    Without this patch, specifying the maximum allowable -maxtipage (9223372036854775807) results in a SIGABRT crash:

     0% gdb --args ./src/bitcoind -maxtipage=9223372036854775207 -minimumchainwork=0x00 -stopatheight=30000
     1...
     22022-11-09T15:55:17Z [dnsseed] dnsseed thread exit
     3[Thread 0x7fff937fe640 (LWP 69883) exited]
     4
     5Thread 29 "b-msghand" received signal SIGABRT, Aborted.
     6[Switching to Thread 0x7fff91ffb640 (LWP 69886)]
     7__pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
     844      ./nptl/pthread_kill.c: No such file or directory.
     9(gdb) bt
    10[#0](/bitcoin-bitcoin/0/)  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
    11[#1](/bitcoin-bitcoin/1/)  0x00007ffff768989f in __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
    12[#2](/bitcoin-bitcoin/2/)  0x00007ffff763da52 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
    13[#3](/bitcoin-bitcoin/3/)  0x00007ffff7628469 in __GI_abort () at ./stdlib/abort.c:79
    14[#4](/bitcoin-bitcoin/4/)  0x00007ffff7cf79a4 in __mulvdi3 () from /lib/x86_64-linux-gnu/libgcc_s.so.1
    15[#5](/bitcoin-bitcoin/5/)  0x00005555558d13ab in std::chrono::__duration_cast_impl<std::chrono::duration<long, std::ratio<1l, 1000000000l> >, std::ratio<1000000000l, 1l>, long, false, true>::__cast<long, std::ratio<1l, 1l> > (__d=...) at /usr/include/c++/12/bits/chrono.h:521
    16[#6](/bitcoin-bitcoin/6/)  std::chrono::duration_cast<std::chrono::duration<long, std::ratio<1l, 1000000000l> >, long, std::ratio<1l, 1l> > (__d=...)
    17    at /usr/include/c++/12/bits/chrono.h:260
    18[#7](/bitcoin-bitcoin/7/)  std::chrono::duration<long, std::ratio<1l, 1000000000l> >::duration<long, std::ratio<1l, 1l>, void> (__d=..., this=<optimized out>)
    19    at /usr/include/c++/12/bits/chrono.h:514
    20[#8](/bitcoin-bitcoin/8/)  std::chrono::operator-<long, std::ratio<1l, 1000000000l>, long, std::ratio<1l, 1l> > (__rhs=..., __lhs=...)
    21    at /usr/include/c++/12/bits/chrono.h:650
    22[#9](/bitcoin-bitcoin/9/)  std::chrono::operator-<NodeClock, std::chrono::duration<long, std::ratio<1l, 1000000000l> >, long, std::ratio<1l, 1l> > (__rhs=...,
    23    __lhs=...) at /usr/include/c++/12/bits/chrono.h:1020
    24[#10](/bitcoin-bitcoin/10/) Chainstate::IsInitialBlockDownload (this=0x555556071940) at ./src/validation.cpp:1545
    25[#11](/bitcoin-bitcoin/11/) 0x00005555556efd1e in operator() (__closure=<optimized out>) at ./src/net_processing.cpp:3369
    26[#12](/bitcoin-bitcoin/12/) (anonymous namespace)::PeerManagerImpl::ProcessMessage (this=0x555556219be0, pfrom=..., msg_type=..., vRecv=..., time_received=...,
    27    interruptMsgProc=...) at ./src/net_processing.cpp:3369
    28[#13](/bitcoin-bitcoin/13/) 0x00005555556f75cc in (anonymous namespace)::PeerManagerImpl::ProcessMessages (this=0x555556219be0, pfrom=<optimized out>,
    29    interruptMsgProc=std::atomic<bool> = { false }) at ./src/net_processing.cpp:4985
    30[#14](/bitcoin-bitcoin/14/) 0x00005555556a83c9 in CConnman::ThreadMessageHandler (this=0x5555560ebc70) at ./src/net.cpp:2014
    31[#15](/bitcoin-bitcoin/15/) 0x0000555555c4d5d6 in std::function<void ()>::operator()() const (this=0x7fff91ffadb0) at /usr/include/c++/12/bits/std_function.h:591
    32[#16](/bitcoin-bitcoin/16/) util::TraceThread(std::basic_string_view<char, std::char_traits<char> >, std::function<void ()>) (
    33    thread_name="0\255\377\221\377\177\000\000\v\000\000\000\000\000\000\000TraceThread\000\000\000\000\000P\255\377\221\377\177\000\000\017\000\000\000\000\000\000\000util/thread.cpp\000\000\000\000\000\000\000\000\000\000ihB鵿6\000\000\000\000\000\000\000\000\260\255\377\221\377\177\000\000\277\211\321UUU\000\000p\324\304UUU\000\000\002\000\000\000\000\000\000\000\240xh\367\377\177\000\000\000\000\000\000\000\000\000\000]\340iUUU\000\000p\274\016VUU\000\000\000\000\000\000\000\000\000\000\300\303iUUU\000\000p\206jUUU", '\000' <repeats 11 times>, "ihB鵿6\200\251!VUU\000\000"..., thread_func=...) at util/thread.cpp:21
    34[#17](/bitcoin-bitcoin/17/) 0x000055555569e05d in std::__invoke_impl<void, void (*)(std::basic_string_view<char>, std::function<void()>), char const*, CConnman::Start(CScheduler&, const Options&)::<lambda()> > (__f=<optimized out>) at /usr/include/c++/12/bits/invoke.h:61
    35[#18](/bitcoin-bitcoin/18/) std::__invoke<void (*)(std::basic_string_view<char>, std::function<void()>), char const*, CConnman::Start(CScheduler&, const Options&)::<lambda()> > (__fn=<optimized out>) at /usr/include/c++/12/bits/invoke.h:96
    36[#19](/bitcoin-bitcoin/19/) std::thread::_Invoker<std::tuple<void (*)(std::basic_string_view<char, std::char_traits<char> >, std::function<void()>), char const*, CConnman::Start(CScheduler&, const Options&)::<lambda()> > >::_M_invoke<0, 1, 2> (this=<optimized out>) at /usr/include/c++/12/bits/std_thread.h:252
    37[#20](/bitcoin-bitcoin/20/) std::thread::_Invoker<std::tuple<void (*)(std::basic_string_view<char, std::char_traits<char> >, std::function<void()>), char const*, CConnman::Start(CScheduler&, const Options&)::<lambda()> > >::operator() (this=<optimized out>) at /usr/include/c++/12/bits/std_thread.h:259
    38[#21](/bitcoin-bitcoin/21/) std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)(std::basic_string_view<char, std::char_traits<char> >, std::function<void()>), char const*, CConnman::Start(CScheduler&, const Options&)::<lambda()> > > >::_M_run(void) (this=<optimized out>)
    39    at /usr/include/c++/12/bits/std_thread.h:210
    40[#22](/bitcoin-bitcoin/22/) 0x00007ffff7ad43d3 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
    41[#23](/bitcoin-bitcoin/23/) 0x00007ffff7687b27 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:435
    42[#24](/bitcoin-bitcoin/24/) 0x00007ffff770a78c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
    43(gdb)
    
  2. jamesob force-pushed on Nov 9, 2022
  3. DrahtBot added the label Validation on Nov 9, 2022
  4. jamesob force-pushed on Nov 9, 2022
  5. in test/functional/test_framework/util.py:261 in b3a7afcd28 outdated
    259-        timeout = 60
    260-    timeout = timeout * timeout_factor
    261+    timeout_float = float(timeout if timeout is not None else 'inf')
    262+    if attempts == float('inf') and timeout_float == float('inf'):
    263+        timeout_float = 60
    264+    timeout_float = timeout_float * timeout_factor
    


    jamesob commented at 7:01 pm on November 9, 2022:
    All this churn here is because of the mypy linter. I tried assert isinstance(timeout, float) and assert timeout is not None to no avail.

    maflcko commented at 4:01 pm on November 11, 2022:
    if mypy is broken, wouldn’t it be better to not add the annotation, as opposed to changing the code?

    jamesob commented at 2:47 pm on November 14, 2022:
    It’s not the annotation that’s broken; it’s the fact that in order to specify a timeout to connect_nodes, we have to change the default parameter here from a float to None, which then makes mypy confused.
  6. maflcko changes_requested
  7. maflcko commented at 8:25 am on November 10, 2022: member

    Not sure about this fix. The whole point of std::chrono is that the programmer does not have to do verbose manual conversions. Instead the compiler injects the multiplication only where needed.

    Generally, the underlying issue is that int settings are not sanitized. This affects almost all int settings. See for example #21893 (comment) or #25561 (comment)

    As for the fix, this reminds me a bit of 290ff5ef6d3886409e5e8688f7d28a4c8246c617, so I think what should really be done is some kind of saturating parser.

    Another reason why this should be done in the parser is that it doesn’t really make sense to accept durations that are longer than 50 times the average lifespan of a star.

  8. maflcko added this to the milestone 25.0 on Nov 10, 2022
  9. jamesob commented at 1:38 pm on November 10, 2022: member

    As for the fix, this reminds me a bit of https://github.com/bitcoin/bitcoin/commit/290ff5ef6d3886409e5e8688f7d28a4c8246c617, so I think what should really be done is some kind of saturating parser.

    This sounds less simple and immediately accessible than the one-line fix here.

    Right now master is broken; for usages that worked prior to https://github.com/bitcoin/bitcoin/commit/faf44876db555f7488c8df96db9fa88b793f897c, bitcoind now crashes. IMO the priority should be addressing that and then moving on to less immediate issues, like what the aesthetically preferable fix should be.

    I agree with you that probably some more substantial change is needed to avoid similar issues in the future, but this change is probably the simplest thing that could be done to unbreak master, and it introduces test coverage.

    In the spirit of forward progress (and not having a broken master!), I would recommend that either we merge this as-is or you propose an alternative in short order, because I have other stuff to do in the immediate term besides rethink integer parsing across the codebase, and I’d like to e.g. be able to rebase assumeutxo on master and test it without crashing.

  10. maflcko approved
  11. maflcko commented at 1:47 pm on November 10, 2022: member

    lgtm, the validation.cpp changes are fine, haven’t looked at the tests.

    Though, if you need an immediate fix, I’d recommend to temporarily replace the value in your config with a smaller, but large enough value.

    Bitcoin currently does not support timestamps after year 2106, so using a duration that puts you there should be fine for your needs.

    0>>> 2**32 - 1
    14294967295
    

    I am happy to merge this instead, if reviewers prefer.

  12. in src/validation.cpp:1545 in b3a7afcd28 outdated
    1540@@ -1541,7 +1541,8 @@ bool Chainstate::IsInitialBlockDownload() const
    1541     if (m_chain.Tip()->nChainWork < m_chainman.MinimumChainWork()) {
    1542         return true;
    1543     }
    1544-    if (m_chain.Tip()->Time() < NodeClock::now() - m_chainman.m_options.max_tip_age) {
    1545+    if (m_chain.Tip()->Time() <
    1546+            std::chrono::time_point_cast<std::chrono::seconds>(NodeClock::now()) - m_chainman.m_options.max_tip_age) {
    


    maflcko commented at 9:34 pm on November 10, 2022:
    Why not simply use Now<NodeSeconds>(), which does the same in less code?

    jamesob commented at 3:33 pm on November 11, 2022:
    Good call, fixed.
  13. in test/functional/feature_maxtipage.py:30 in b3a7afcd28 outdated
    29 
    30         self.restart_node(1, [f'-maxtipage={maxtipage}'] if set_parameter else None)
    31-        self.connect_nodes(0, 1)
    32-
    33-        # tips older than maximum age -> stay in IBD
    34+        self.connect_nodes(0, 1, timeout=5)
    


    maflcko commented at 9:35 pm on November 10, 2022:
    Can you explain why this is changed? If you locally want to change the timeout, you can use --timeout-factor. Changing it in the code may result in CI sanitizer timeouts.

    jamesob commented at 3:29 pm on November 11, 2022:
    If the test fails, it’ll hang for a while and lock up CI. But if the convention is to rely on --timeout-factor and kind of disregard CI/local default experience, I can revert.

    jamesob commented at 3:35 pm on November 11, 2022:
    Bumped this to 10, which I would think would be long enough to not cause spurious failures given such a simple test?

    maflcko commented at 4:02 pm on November 11, 2022:
    Yeah, it is probably fine. It seems to be pulling in a bunch of mypy refactorings, which I don’t like too much.
  14. in test/functional/feature_maxtipage.py:55 in b3a7afcd28 outdated
    51@@ -51,6 +52,9 @@ def run_test(self):
    52             self.log.info(f"Test IBD with maximum tip age of {hours} hours (-maxtipage={maxtipage}).")
    53             self.test_maxtipage(maxtipage)
    54 
    55+        self.log.info(f"Test IBD with max. allowable maximum tip age ({maxtipage}).")
    


    maflcko commented at 9:36 pm on November 10, 2022:

    This is printing a value that went out of scope (maybe we should use rust to write tests?)

    Also, the value is not the max, as claimed here.

    You can take commit 0d3b7631af03ecc4d6b9e56a74ac2abfe8977291, where I fixed this.


    jamesob commented at 3:33 pm on November 11, 2022:
    Fixed, thanks.
  15. maflcko commented at 9:37 pm on November 10, 2022: member
    Left some nits, feel free to ignore
  16. maflcko commented at 9:07 am on November 11, 2022: member
    For reference, I tried to fix the overflow via ccccee014f8dae70f760c9ce3ab8a5db73aa446a, but this only fixes it for large values, not small ones. The pull here also fixes it only for large ones, which is fine, because underflow was the behaviour previously as well.
  17. jamesob force-pushed on Nov 11, 2022
  18. jamesob force-pushed on Nov 11, 2022
  19. in test/functional/test_framework/test_framework.py:718 in 35ef2bf574 outdated
    713@@ -714,7 +714,8 @@ def sync_all(self, nodes=None):
    714         self.sync_blocks(nodes)
    715         self.sync_mempools(nodes)
    716 
    717-    def wait_until(self, test_function, timeout=60):
    718+    def wait_until(self, test_function, timeout: Optional[int] = None):
    719+        timeout = timeout if timeout is not None else 60
    


    maflcko commented at 4:01 pm on November 11, 2022:
    nit: I don’t think those are improvements
  20. DrahtBot commented at 0:51 am on November 12, 2022: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK MarcoFalke

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    No conflicts as of last run.

  21. fix: validation: cast now() to seconds for maxtipage comparison
    Since faf44876db555f7488c8df96db9fa88b793f897c, the maxtipage comparison
    in IsInitialBlockDownload() has been broken, since the NodeClock::now()
    time_point is in the system's native denomination (micrcoseconds).
    
    Without this patch, specifying the maximum allowable -maxtipage
    (9223372036854775807) results in a SIGABRT crash.
    
    Co-authored-by: MacroFake <falke.marco@gmail.com>
    a451e832b4
  22. jamesob force-pushed on Nov 14, 2022
  23. jamesob commented at 2:54 pm on November 14, 2022: member
    Even though I think it’s probably nice to be able to specify a timeout for connect_nodes, I’ve removed that commit from this PR to avoid more bikeshedding.
  24. in test/functional/feature_maxtipage.py:55 in 8eafa99db0 outdated
    51@@ -51,6 +52,10 @@ def run_test(self):
    52             self.log.info(f"Test IBD with maximum tip age of {hours} hours (-maxtipage={maxtipage}).")
    53             self.test_maxtipage(maxtipage)
    54 
    55+        max_long_val = 9223372036854775207
    


    maflcko commented at 3:03 pm on November 14, 2022:

    nit: this is still wrong

    0        max_long_val = 9223372036854775807
    

    maflcko commented at 3:05 pm on November 14, 2022:
    not that it matters for the test or otherwise, as the value can be anything that overflows, but it isn’t the max long value

    jamesob commented at 3:33 pm on November 14, 2022:
    Heh fixed, thanks.
  25. maflcko approved
  26. maflcko commented at 3:04 pm on November 14, 2022: member
    lgtm ACK 8eafa99db0fffbf6e15bc7cb626284932a56edf7
  27. jamesob force-pushed on Nov 14, 2022
  28. test: add -maxtipage test for the maximum allowable value e4be0e9b06
  29. jamesob commented at 8:37 pm on November 30, 2022: member
    More reviewers needed or RFM?
  30. maflcko commented at 9:02 am on December 12, 2022: member

    review ACK e4be0e9b0661a8af49c4e6d5472804913f04b8fc 🏽

    Signature:

     0-----BEGIN PGP SIGNED MESSAGE-----
     1Hash: SHA512
     2
     3review ACK e4be0e9b0661a8af49c4e6d5472804913f04b8fc 🏽
     4-----BEGIN PGP SIGNATURE-----
     5
     6iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
     7pUgfbAv/Q618Oe/8sU98Uq73gtq+71Cw15oamDfQ6UJfPVeb3LBwoCXv2tayfN33
     8Y8rzG/gGNDuZRnSIpkrMjUFlDJq6TIlw8T5kXe/rA/Ar1q3IRVNAHFr3d3b0H/kT
     9P9Q87OAt+V8u4muT4sy6OOJ3CvEW3D9wvlwfmZSiTzmQ092wV2YtOTpZKRWqX/97
    103Fg84JpIwmaDFhHdaK6y7hr6TndsUM/OpJSIqEqq6kHdw7/fcXrnwg/sW41gZ7qu
    11yQEadD00eHvSONA60TzqONF1G95XUZDzgRof954Npbd1H2ScWbo3TjbDwreUQ7tB
    12w/8fp+daTOZuy08wPKr3h96mUJNQSbByF3uEsMHLwHMkYXCG2tGyne6vxCH9bzh5
    13/67oEkwLn0UbSFKDbS3xXuVUJ26/+bZdnJ3PNOwee3VOdF4WBCR8U+4Mti2AYsGW
    14PxB+n2tv/f5GogAErSbtsjS9EDDY+b8oaGDMTe42hYznSZ8RaGobxZKvgmvJjlBV
    150XUhRpUw
    16=Lp4Q
    17-----END PGP SIGNATURE-----
    
  31. fanquake merged this on Dec 13, 2022
  32. fanquake closed this on Dec 13, 2022

  33. sidhujag referenced this in commit e4f896d701 on Dec 13, 2022
  34. bitcoin locked this on Dec 13, 2023


jamesob maflcko DrahtBot

Labels
Validation

Milestone
25.0


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-01 10:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me