util: Restore GetIntArg saturating behavior #24041

pull ryanofsky wants to merge 1 commits into bitcoin:master from ryanofsky:pr/green changing 5 files +66 −25
  1. ryanofsky commented at 5:05 pm on January 11, 2022: member

    The new locale-independent atoi64 method introduced in #20452 parses large integer values higher than maximum representable value as 0 instead of the maximum value, which breaks backwards compatibility. This commit restores compatibility and adds test coverage for this case in terms of the related GetIntArg and strtoll functions.

    Specifically, command line or bitcoin.conf integer values greater than 9223372036854775807 (2**63-1) used to be parsed as 9223372036854775807 before #20452. Then #20452 caused them to be parsed as 0. And after this PR they will be parsed as 9223372036854775807 again.

    This change is a stripped-down alternative version of #23841 by jamesob

  2. ryanofsky force-pushed on Jan 11, 2022
  3. ryanofsky commented at 5:18 pm on January 11, 2022: member
    Updated 4a15ddc19ccfc4e8e7537aefd758083465d4d819 -> 61847648999de3c845f6580a69b58db688ab9b59 (pr/green.1 -> pr/green.2, compare) fixing a comment typo.
  4. MarcoFalke approved
  5. MarcoFalke commented at 5:20 pm on January 11, 2022: member

    Concept ACK, if jamesob is ok that you are taking this over.

    Nit: Could make sense to adjust the title to clarify:

    • This only affects GetIntArg, as the other integers don’t overflow during normal use
    • The new (and old) behavior is saturation

    Suggested title: “Restore GetIntArg saturating behavior”

  6. DrahtBot commented at 5:25 pm on January 11, 2022: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    No conflicts as of last run.

  7. jb55 commented at 5:27 pm on January 11, 2022: member
    Concept ACK
  8. ryanofsky renamed this:
    Restore atoi compatibility with old versions of Bitcoin Core
    Restore GetIntArg saturating behavior
    on Jan 11, 2022
  9. jamesob commented at 5:47 pm on January 11, 2022: member

    Concept ACK, if jamesob is ok that you are taking this over.

    I will never not be honored by Ryan Ofsky taking over one of my changes.

    But here’s the thing - the reason I didn’t go this route in the original PR is because it’s doing what I was trying to avoid in the first place, which is changing behavior. This is evidenced by the changes to existing tests that accompany this PR.

    IMO we shouldn’t change number parsing without some kind of deprecation warning. But I guess it’s okay if we can make some good arguments that we’re reasonably sure there are no existing uses of non-int64 atoi parsing that will fail if saturation behavior changes?

  10. ryanofsky force-pushed on Jan 11, 2022
  11. ryanofsky commented at 5:48 pm on January 11, 2022: member
    Updated 61847648999de3c845f6580a69b58db688ab9b59 -> d5c0e2ddd8954f740715978d14f6529fab966cdd (pr/green.2 -> pr/green.3, compare) updating fuzz test, and fixing include error https://cirrus-ci.com/task/4726914576285696 and MSVC error https://cirrus-ci.com/task/5712076994772992
  12. jamesob commented at 5:53 pm on January 11, 2022: member

    But here’s the thing - the reason I didn’t go this route in the original PR is because it’s doing what I was trying to avoid in the first place, which is changing behavior.

    Oh, I guess atoi out of range behavior is UB (unlike stoll) so maybe this isn’t actually changing any defined behavior, in which case I’m ACK.

  13. ryanofsky commented at 5:54 pm on January 11, 2022: member

    But here’s the thing - the reason I didn’t go this route in the original PR is because it’s doing what I was trying to avoid in the first place, which is changing behavior. This is evidenced by the changes to existing tests that accompany this PR.

    It’s definitely changing behavior relative to #20452 and the new tests added in #20452. I’m actually not sure if this is changing behavior from before #20452. If it is changing behavior from before #20452, I think it’s just platform specific behavior.

  14. in src/test/fuzz/string.cpp:279 in d5c0e2ddd8 outdated
    278@@ -279,17 +279,12 @@ FUZZ_TARGET(string)
    279         const int atoi_result = atoi(random_string_1.c_str());
    


    vincenzopalazzo commented at 6:31 pm on January 11, 2022:
    In accordance with the CI, maybe this can be removed?

    ryanofsky commented at 8:10 pm on January 11, 2022:

    In accordance with the CI, maybe this can be removed?

    Could do. Keeping for now, just because I’m not sure it doesn’t have value.

  15. DrahtBot added the label Utils/log/libs on Jan 11, 2022
  16. ryanofsky force-pushed on Jan 11, 2022
  17. ryanofsky commented at 8:15 pm on January 11, 2022: member
    Updated d5c0e2ddd8954f740715978d14f6529fab966cdd -> 389c47a78e98ae12cba4697fcbe1e857d8e25e35 (pr/green.3 -> pr/green.4, compare) to fix CI errors about unused variable in fuzz test https://cirrus-ci.com/task/6214569847685120, https://cirrus-ci.com/task/6724743242973184, https://cirrus-ci.com/task/5598843336130560. This is an expedient fix since I’m not totally sure what kind of fuzz coverage is useful. Maybe it is good to keep compare three integer parsing functions against each other instead of only comparing two.
  18. ryanofsky renamed this:
    Restore GetIntArg saturating behavior
    util: Restore GetIntArg saturating behavior
    on Jan 11, 2022
  19. jamesob commented at 10:13 pm on January 11, 2022: member
    ACK pending CI pass
  20. ryanofsky force-pushed on Jan 11, 2022
  21. ryanofsky commented at 10:21 pm on January 11, 2022: member
    Updated 389c47a78e98ae12cba4697fcbe1e857d8e25e35 -> 1b0b1ddc42fa7a88969ba60aa5a28c14a7036feb (pr/green.4 -> pr/green.5, compare) fixing typo in fuzz test that led to unfixed CI errors https://cirrus-ci.com/task/6099332351918080 https://cirrus-ci.com/task/5817857375207424 https://cirrus-ci.com/task/4691957468364800
  22. JeremyRubin commented at 10:24 pm on January 11, 2022: contributor
    i dont know enough about string parsing to be super useful, but based on the breaking changes analysis & trusting the implementations are OK this approach has my concept ACK.
  23. util: Restore GetIntArg saturating behavior
    The new locale-independent atoi64 method introduced in #20452 parses
    large integer values higher than maximum representable value as 0
    instead of the maximum value, which breaks backwards compatibility.
    This commit restores compatibility and adds test coverage for this case
    in terms of the related GetIntArg and strtoll functions.
    
    Co-authored-by: Ryan Ofsky <ryan@ofsky.org>
    b5c9bb5cb9
  24. ryanofsky force-pushed on Jan 12, 2022
  25. ryanofsky commented at 0:57 am on January 12, 2022: member
    Updated 1b0b1ddc42fa7a88969ba60aa5a28c14a7036feb -> b5c9bb5cb9f4a8db57b33ef7399310c7d6de5822 (pr/green.5 -> pr/green.6, compare) dropping fuzz check to avoid depending on undefined atoi behavior https://cirrus-ci.com/task/5686396143796224
  26. MarcoFalke commented at 8:03 am on January 12, 2022: member

    Could update OP to clarify how the intermediate behavior changes cancel each other out?

    0(initial state) ----------------> (midstate) --------------------> (end result)
    1atoi UB         ----- 20452 ----> atoi 0     ----- this pull ----> atoi saturate
    2atoi64 saturate ----- 20452 ----> atoi64 0   ----- this pull ----> atoi64 saturate
    
  27. vincenzopalazzo approved
  28. ryanofsky commented at 3:59 pm on January 12, 2022: member

    Could update OP to clarify how the intermediate behavior changes cancel each other out?

    I’m not sure I followed the diagram, but I added a sentence to the OP about what previous behavior was and how this restores it. Feel free to edit OP directly if you were asking for something different or want to change anything.

  29. MarcoFalke commented at 5:27 pm on January 12, 2022: member

    review ACK b5c9bb5cb9f4a8db57b33ef7399310c7d6de5822 🌘

    Signature:

     0-----BEGIN PGP SIGNED MESSAGE-----
     1Hash: SHA512
     2
     3review ACK b5c9bb5cb9f4a8db57b33ef7399310c7d6de5822 🌘
     4-----BEGIN PGP SIGNATURE-----
     5
     6iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
     7pUiLpQv8CcSnEdE4GY3Ka1zv9fy6FGmfW/RdO8CCT2Kazcf3FeGQ35f5YObu2wQF
     8dTgojk3/rzZwvwXEmpLx1zxKbG19wKDXz7J+TJGMM9QOu+enxHADKLGJt3ejGUQ8
     9IDco8ien8jsK0/qBDyVReR2mRuC1eSXaxTjiNhJAJ5j47WzTjmhUJZHeNYznxtIT
    10ATYNOU4jOZLpTdVLm/WS/Vg543PMU+tzFpElbSORgbO3M4h7stpopMt5liYwSsj+
    11YoEAcWQ7qqVFFwmtymR0AdzmwxvfQwMVtocCEZNt4/tyzGSYk0gZrx7yqm3hK9li
    12YuWQ29EY64S3OdthWRE/Y1WY8b5fkU9oAJ5SNXGwGuD08TFiYrOWKtPKfneW5sAE
    133pE4zrBb/pmt+chsw9bBqBLs8NhPv0Z8yh+6ykChns7wR2QPis409mtaJD5tfZB/
    14TP26N4JawuHUTWdtaG3f4yY0SsdkgWQGmj3fAKd8d9AkMxEbFSKhRM9JByYIePbn
    15mJ9EiloO
    16=hA0/
    17-----END PGP SIGNATURE-----
    
  30. MarcoFalke added this to the milestone 23.0 on Jan 12, 2022
  31. MarcoFalke merged this on Jan 12, 2022
  32. MarcoFalke closed this on Jan 12, 2022

  33. MarcoFalke commented at 5:46 pm on January 12, 2022: member

    There is still another behaviour change:

    123a would be parsed as 123.

    Now it is mapped to 0.

    Edit: Wrong. See discussion below.

  34. jamesob commented at 5:49 pm on January 12, 2022: member
    @MarcoFalke that is due to #20452, correct?
  35. MarcoFalke commented at 5:53 pm on January 12, 2022: member
    Yes, everything is due to 20452
  36. ryanofsky commented at 6:22 pm on January 12, 2022: member

    There is still another behaviour change:

    123a would be parsed as 123.

    Now it is mapped to 0.

    This is surprising. std::from_chars “expects the pattern identical to the one used by std::strtol” and returns the “first character not matching the pattern.” And std::strtol accepts a pattern that “takes as many characters as possible to form a valid base-n (where n=base) integer number representation”. So I’d expect 123a to be parsed as 123. But we should have test coverage for this, whether or not there is a bug.

  37. MarcoFalke commented at 6:40 pm on January 12, 2022: member
    strtol has error reporting for this case, but atoi64 disables that by setting nullptr. I expect you’ll have to call from_chars twice on the string (the second time passing in the returned string) to emulate that.
  38. ryanofsky commented at 7:05 pm on January 12, 2022: member

    strtol has error reporting for this case

    Where do you see that strtol would return an error for this case? That seems counter to the documentation https://en.cppreference.com/w/cpp/string/byte/strtol#Return_value

    I’m also not able to reproduce a bug. For example the following test passes on master (290ff5ef6d3886409e5e8688f7d28a4c8246c617)

     0diff --git a/src/test/getarg_tests.cpp b/src/test/getarg_tests.cpp
     1index d5142c8d748..5286aa6d812 100644
     2--- a/src/test/getarg_tests.cpp
     3+++ b/src/test/getarg_tests.cpp
     4@@ -154,6 +154,9 @@ BOOST_AUTO_TEST_CASE(intarg)
     5     BOOST_CHECK_EQUAL(m_local_args.GetIntArg("-foo", 0), 11);
     6     BOOST_CHECK_EQUAL(m_local_args.GetIntArg("-bar", 11), 12);
     7 
     8+    ResetArgs("-foo=123a");
     9+    BOOST_CHECK_EQUAL(m_local_args.GetIntArg("-foo", 456), 123);
    10+
    11     ResetArgs("-foo=NaN -bar=NotANumber");
    12     BOOST_CHECK_EQUAL(m_local_args.GetIntArg("-foo", 1), 0);
    13     BOOST_CHECK_EQUAL(m_local_args.GetIntArg("-bar", 11), 0);
    
    0make -C src test/test_bitcoin && src/test/test_bitcoin -l test_suite -t getarg_tests
    
  39. MarcoFalke commented at 7:12 pm on January 12, 2022: member

    Oh, right. I had a typo in my own test and it didn’t show the wrong line due to the std::map indirection, but now it passes:

     0diff --git a/src/test/util_tests.cpp b/src/test/util_tests.cpp
     1index 20d27a237d..280ea7cbd3 100644
     2--- a/src/test/util_tests.cpp
     3+++ b/src/test/util_tests.cpp
     4@@ -1668,6 +1668,9 @@ BOOST_AUTO_TEST_CASE(test_LocaleIndependentAtoi)
     5         {"-9223372036854775808", -9'223'372'036'854'775'807LL - 1LL},
     6         {"9223372036854775807", 9'223'372'036'854'775'807},
     7         {"9223372036854775808", std::numeric_limits<int64_t>::max()},
     8+        {"+123a", 123},
     9+        {"123a", 123},
    10+        {"-123a", -123},
    11         {"+-", 0},
    12         {"0x1", 0},
    13         {"ox1", 0},
    
  40. sidhujag referenced this in commit 640843fbd4 on Jan 12, 2022
  41. ryanofsky commented at 12:48 pm on January 20, 2022: member

    <luke-jr> ryanofsky: what is the meaning of “green” in #24041 ? @luke-jr, it’s just a nonsense branch name from frogs are green tangent in previous PR

  42. DrahtBot locked this on Jan 20, 2023

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-10-30 03:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me