script: prevent UB when computing abs value for num opcode serialize #18413

pull pierreN wants to merge 1 commits into bitcoin:master from pierreN:fix-script-absolute changing 3 files +3 โˆ’11
  1. pierreN commented at 8:32 pm on March 23, 2020: contributor

    This was reported by practicalswift here #18046

    It seems that the original author of the line used a reference to glibc abs: https://github.com/lattera/glibc/blob/master/stdlib/abs.c

    However depending on some implementation details this can be undefined behavior for unusual values.

    A detailed explanation of the UB is provided here : https://stackoverflow.com/questions/17313579/is-there-a-safe-way-to-get-the-unsigned-absolute-value-of-a-signed-integer-with (by Billy O’Neal)

    Simple relevant godbolt example : https://godbolt.org/z/yRwtCG

    Thanks!

  2. sipa commented at 8:35 pm on March 23, 2020: member
    This shouldn’t be needed, I think, because CScriptNum values are restricted to 32-bit.
  3. pierreN commented at 9:00 pm on March 23, 2020: contributor

    OK, thanks.

    Should I close this PR even if it’s a trivial fix to #18046 ? (cf discussion in #18383 )

  4. DrahtBot added the label Consensus on Mar 23, 2020
  5. sipa commented at 10:41 pm on March 23, 2020: member
    I must be wrong, because I don’t really understand why it’s possible that that fuzzer is able to trigger a negation of -2^63.
  6. practicalswift commented at 11:07 pm on March 23, 2020: contributor

    @sipa FWIW:

    0$ xxd -p -r <<< "2d360932445550092d36093609092d393939393939393939393939393939393939360955" > crash-parse_script
    1$ src/test/fuzz/parse_script crash-parse_script
    2script/script.h:332:35: runtime error: negation of -9223372036854775808 cannot be represented in type 'int64_t' (aka 'long'); cast to an unsigned type to negate this value to itself
    3    [#0](/bitcoin-bitcoin/0/) 0x55e134173738 in CScriptNum::serialize(long const&) src/./script/script.h:332:35
    4    [#1](/bitcoin-bitcoin/1/) 0x55e134172f40 in CScript::push_int64(long) src/./script/script.h:405:22
    5    [#2](/bitcoin-bitcoin/2/) 0x55e13416984f in CScript::operator<<(long) src/./script/script.h:445:45
    6    [#3](/bitcoin-bitcoin/3/) 0x55e13416984f in ParseScript(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) src/core_read.cpp:62:20
    7    [#4](/bitcoin-bitcoin/4/) 0x55e134167b0b in test_one_input(std::vector<unsigned char, std::allocator<unsigned char> > const&) src/test/fuzz/parse_script.cpp:13:15
    
  7. pierreN force-pushed on Mar 24, 2020
  8. in src/script/script.h:332 in bab50c56e1 outdated
    328@@ -328,8 +329,7 @@ class CScriptNum
    329             return std::vector<unsigned char>();
    330 
    331         std::vector<unsigned char> result;
    332-        const bool neg = value < 0;
    333-        uint64_t absvalue = neg ? -value : value;
    334+        uint64_t absvalue = static_cast<uint64_t>(std::llabs(value));
    


    sipa commented at 1:26 am on March 24, 2020:

    I think you want to use std::abs, which since C++11 is overloaded to work for all signed integer types as input (avoiding the assumption that long long is enough for 64 bits, and also avoiding promoting values to higher than 64 bits in case long long is even bigger).

    See https://en.cppreference.com/w/cpp/numeric/math/abs


    pierreN commented at 1:32 am on March 24, 2020:
    OK thanks I’m testing this. I thought that llabs showed more clearly the intend but using abs might be better if you want to switch to a smaller type later indeed.

    sipa commented at 1:34 am on March 24, 2020:

    Oh no, this won’t ever switch to a smaller type.

    I’m just commenting this because hardcoding the assumption that long long = 64 bits seems bad (if there were an abs64bit or so that would seem ideal, but there isn’t). Thankfully, thanks to std::abs being overloaded for all integer types, there is no need for such an assumption.


    pierreN commented at 1:42 am on March 24, 2020:
    Hah indeed yes, this is better that way then.
  9. pierreN commented at 1:32 am on March 24, 2020: contributor

    I don’t really understand why it’s possible that that fuzzer is able to trigger a negation of -2^63.

    In the input provided by @practicalswift above, 2d39393939393939393939393939393939393936 is the ASCII string -9999999999999999996 which is translated by the atoi64 in core_read.cpp:ParseScript to -2^63 while looping over the words. This is then sent to CScriptNum::serialize via CScript::push_int64.

    I could not find the proposed input in https://github.com/bitcoin-core/qa-assets/ nor https://github.com/bitcoin-core/qa-assets/pull/7 so I guess the end goal here is to increase code coverage by adding more data to qa-assets in a future PR ?

    TL;DR I reckon there is 3 options:

    • not considereing the provided ASCII text input
    • since the atoi64 already changes values beyond 2^63 when doing the ASCII->int64_t conversion, limit the atoi64 output to (-)2^32 as @sipa thought (by modifying it manually)
    • the current CI build fails seems to be due to some compiler versions not respecting the standard (namely MSVC, quite ironic), I updated my branch to use abs instead (which hopefully all compilers have fully tested..)

    From what I gather, ParseScript is only used in bitcoin-tx.cpp once (except for tests), which takes input directly from the CLI.

    So if you find this PR too intrusive, limiting the output of atoi64 in ParseScript might be a better option ? (I might be wrong though)

  10. pierreN force-pushed on Mar 24, 2020
  11. sipa commented at 1:43 am on March 24, 2020: member

    Code review ACK bab50c56e1fa2477f7d625855485e8527c78747c, with either std::abs or std::llabs.

    It’s more obviously correct and should avoid the fuzzer issues we’re seeing. I have also verified that there are no consensus-critical code paths that can trigger this (they’re all restricted to much smaller numbers).

    I think the use of atoi64 and its clamping behavior in core_read.cpp is independently problematic (it should just fail when the number is too big), but I don’t think that’s on the critical path for fixing this issue. I’ll open a PR to restrict it to -0xFFFFFFFF..0xFFFFFFFF.

  12. pierreN commented at 1:50 am on March 24, 2020: contributor

    OK, thanks. (also note that atoi64 seems platform dependant (which might explain why other people in the above issue didn’t see the fuzzer error)).

    I would’nt mind trying to do the PR for the atoi64 too, but I guess it will be faster/less work for other people if @sipa do it :)

  13. sipa commented at 2:12 am on March 24, 2020: member
    @pierreN Feel free to restrict the number conversion in ParseScript yourself if you like.
  14. theStack commented at 10:11 am on March 24, 2020: member

    I’m afraid that using std::abs instead of the manual calculation doesn’t get rid of the UB. According to https://en.cppreference.com/w/cpp/numeric/math/abs: “Computes the absolute value of an integer number. The behavior is undefined if the result cannot be represented by the return type.

    So, for the corner case input -2^63 (=numeric_limits<int64_t>::min()), both variants are UB (probably internally, std::abs does exactly work the same than our manual way – one example for libstdc++: https://github.com/openbsd/src/blob/master/gnu/lib/libstdc%2B%2B/libstdc%2B%2B/include/c_std/std_cstdlib.h#L149).

    See also https://embeddedgurus.com/stack-overflow/2012/02/the-absolute-truth-about-abs/ (talks about C, but that applies to C++ as well I’d guess).

  15. pierreN commented at 3:55 pm on March 24, 2020: contributor

    Thanks, it seems you are right. I guess the fuzzer is not triggered here since we use abs but the UB is still there.

    I’m not sure that adding a special check for -2^63 inside CScript::serialize (or inside abs) is a good thing. The only decent option available seems to be the stackoverflow post above, but it is inapplicable here.

    Hence, I reckon there is 2 possible options for this PR :

    • just close it since #18416 prevents the UB from happening anyway
    • rename the commit into something like “script: rely on std implementation to get abs value for num opcode serialize” - since relying on the standard seems cleaner and show more obviously the intent ?

    What do you guys think ?

  16. sipa commented at 4:10 pm on March 24, 2020: member

    @theStack Looking at the standard library implementation isn’t relevant, as it is allowed to rely on platform-specific behavior. However, it looks like std::abs is indeed specified to be undefined if the result cannot be represented in the input type.

    What is wrong with uint64_t absvalue = (value < 0) ? -uint64_t(value) : uint64_t(value). I remember you tried this, but don’t remember what error came out.

    An alternative can be uint64_t absvalue = (value < 0) ? ~uint64_t(value) + 1 : uint64_t(value) which is semantically exactly the same but avoids a negation of a unsigned type.

  17. pierreN force-pushed on Mar 25, 2020
  18. pierreN commented at 4:18 am on March 25, 2020: contributor

    The issue with -uint64_t(value) was that MSVC seemed not to implement the 5.3.1 ยง8 of the standard about the - unary operator properly…

    It seems however that 2s complement by hand works : https://godbolt.org/z/cr-phu so I updated the branch accordingly. Thanks.

  19. theStack commented at 11:01 am on March 25, 2020: member
    FWIW, I just checked where else the manual abs approach of “determining-sign-and-when indicating-negating-the-number”, is used in the codebase, and found this function: https://github.com/bitcoin/bitcoin/blob/5b4a9f4bdf9d90f29921425e13e50a0076edffab/src/core_write.cpp#L18-L26 I didnt’ check though in which range the amount values are passed from the call-sites. Maybe you want to look into it if UB is also a problem here in the course of this (or a possible future) PR.
  20. pierreN commented at 11:35 am on March 25, 2020: contributor

    I didnt’ check though in which range the amount values are passed from the call-sites. Maybe you want to look into it if UB is also a problem here in the course of this (or a possible future) PR.

    Thanks. From a quick check and believing that the variable name COIN is accurate, you have 2^62/21 000 000/COINS > 1000 so this shoould not be able to trigger the UB ?

    It seems it costs nothing to prevent it though, and I’d be happy to do a PR for that if others agree.

  21. practicalswift commented at 11:46 am on March 25, 2020: contributor

    @theStack Good catch regarding ValueFromAmount(const CAmount& amount). I noticed it too when fuzzing ValueFromAmount:

    https://github.com/bitcoin/bitcoin/blob/3f5107d008e15efa364c53bd3ed4e819cd2c6712/src/test/fuzz/integer.cpp#L122-L128

    I think ideally all our functions should i.) state pre-conditions clearly (using documentation and/or assertions), and ii.) robustly handle all inputs that are not explicitly disallowed by the clearly stated pre-conditions :)

  22. pierreN commented at 12:24 pm on March 25, 2020: contributor

    I noticed it too when fuzzing ValueFromAmount:

    Looking at the interesting integer.cpp, shouldn’t the same be applied to FormatMoney if we follow that logic ?

    With a quick regexp search I also found occurences in :

    • qt/bitcoinunits.cpp : qint64 n_abs = (n > 0 ? n : -n);
    • timedata.cpp : in the abs64 function (edit: this can’t happen since at worst we abs the average between 0 and the nOffsetSample argument)
  23. laanwj referenced this in commit b53af72b82 on Mar 27, 2020
  24. practicalswift commented at 7:44 pm on March 31, 2020: contributor

    Looking at the interesting integer.cpp, shouldn’t the same be applied to FormatMoney if we follow that logic ?

    I suggest doing that in a follow-up PR to keep this PR focused. Would be really nice to get this one merged to get rid of the last known fuzzing roadblock :)

  25. MarcoFalke commented at 9:09 pm on March 31, 2020: member
    @practicalswift What is the fuzzer that hits this in current master? And what is the backtrace?
  26. practicalswift commented at 5:09 am on April 1, 2020: contributor

    @MarcoFalke Oh, it seems I got the crashes mixed up.

    This is the remaining case (it is unrelated to this PR):

    0$ UBSAN_OPTIONS="print_stacktrace=1:halt_on_error=1:report_error_type=1" src/test/fuzz/transaction crash-transaction
    1INFO: Seed: 2635477314
    2INFO: Loaded 1 modules   (415121 inline 8-bit counters): 415121 [0x5575e380b6d8, 0x5575e3870c69), 
    3INFO: Loaded 1 PC tables (415121 PCs): 415121 [0x5575e3870c70,0x5575e3ec6580), 
    4src/test/fuzz/transaction: Running 1 inputs 1 time(s) each.
    5Running: crash-transaction
    6primitives/transaction.cpp:87:19: runtime error: signed integer overflow: 1095216725760 + 9223372032559808512 cannot be represented in type 'long'
    7    [#0](/bitcoin-bitcoin/0/) 0x5575e2041ec1 in CTransaction::GetValueOut() const src/primitives/transaction.cpp:87:19
    8    [#1](/bitcoin-bitcoin/1/) 0x5575e0eb4a97 in test_one_input(std::vector<unsigned char, std::allocator<unsigned char> > const&) src/test/fuzz/transaction.cpp:78:18
    

    This remaining case is fixed in #18383.

  27. pierreN commented at 9:03 am on April 1, 2020: contributor

    OK, thanks.

    What about recycling this PR to fix those custom absolute value functions ?

    For this purpose I added a small fuzzing harness in the PR #18491

    Or we can just close this PR (and hence also close the fuzzing harness in #18491)

  28. pierreN force-pushed on Apr 2, 2020
  29. pierreN force-pushed on Apr 2, 2020
  30. pierreN commented at 1:22 pm on April 2, 2020: contributor
    So following this comment : #18491 (comment) just added CScriptNum::serialize to the integer fuzzing harness to this PR instead of opening another one.
  31. MarcoFalke commented at 0:42 am on April 3, 2020: member

    Could also add the following diff:

     0diff --git a/src/test/fuzz/scriptnum_ops.cpp b/src/test/fuzz/scriptnum_ops.cpp
     1index db44bb9e19..d45c85bc51 100644
     2--- a/src/test/fuzz/scriptnum_ops.cpp
     3+++ b/src/test/fuzz/scriptnum_ops.cpp
     4@@ -128,10 +128,6 @@ void test_one_input(const std::vector<uint8_t>& buffer)
     5             script_num &= fuzzed_data_provider.ConsumeIntegral<int64_t>();
     6             break;
     7         }
     8-        // Avoid negation failure:
     9-        // script/script.h:332:35: runtime error: negation of -9223372036854775808 cannot be represented in type 'int64_t' (aka 'long'); cast to an unsigned type to negate this value to itself
    10-        if (script_num != CScriptNum{std::numeric_limits<int64_t>::min()}) {
    11             (void)script_num.getvch();
    12-        }
    13     }
    14 }
    
  32. pierreN force-pushed on Apr 3, 2020
  33. pierreN commented at 2:55 pm on April 3, 2020: contributor
    Ow, right, thanks. Branch updated.
  34. DrahtBot commented at 8:24 pm on April 3, 2020: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    No conflicts as of last run.

  35. DrahtBot added the label Needs rebase on Apr 4, 2020
  36. pierreN force-pushed on Apr 8, 2020
  37. DrahtBot removed the label Needs rebase on Apr 8, 2020
  38. script: prevent UB when computing abs value for num opcode serialize 2748e87932
  39. in src/test/fuzz/integer.cpp:285 in 7b47f780c2 outdated
    278@@ -279,4 +279,8 @@ void test_one_input(const std::vector<uint8_t>& buffer)
    279         catch (const std::ios_base::failure&) {
    280         }
    281     }
    282+
    283+    {
    284+        CScriptNum::serialize(i64);
    285+    }
    


    MarcoFalke commented at 1:12 pm on April 8, 2020:
    Also adjust line 139 of this file?

    pierreN commented at 11:34 pm on April 8, 2020:

    Ow, right, thanks, I missed this one when I grepped for the runtime error.

    Hence removed the CScriptNum::serialize(i64); since line 139 already tests it.

  40. pierreN force-pushed on Apr 8, 2020
  41. pierreN commented at 7:51 am on April 28, 2020: contributor

    It’s been around a month this PR has been opened - do you guys think I should close it?

    Thanks.

  42. sipa commented at 8:30 am on April 28, 2020: member
    ACK 2748e8793267126c5b40621d75d1930e358f057e
  43. MarcoFalke commented at 7:52 pm on April 29, 2020: member

    ACK 2748e8793267126c5b40621d75d1930e358f057e, only checked that the bitcoind binary does not change with clang -O2 ๐ŸŽ“

    Signature:

     0-----BEGIN PGP SIGNED MESSAGE-----
     1Hash: SHA512
     2
     3ACK 2748e8793267126c5b40621d75d1930e358f057e, only checked that the bitcoind binary does not change with clang -O2 ๐ŸŽ“
     4-----BEGIN PGP SIGNATURE-----
     5
     6iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
     7pUi98wwAwknOFbsqPTE+alfdjNCoL0MsBT7VrsX7Io4GqSF0gopF0hSAvuDiiR+9
     8I0+Hl10iCynEOzQgDPRLQGpuQOp1uAxqbKc6QCOF1vdwwu2WdRoCTGCtVr/jxCeg
     9MYg0S0IWlyoaM8v5rP7tci4HXsI7TDkrx13izCKfTuB7HXcUjA3vja6K1Crb4vIZ
    102qpI1yCy2unnAwkoQO8qj752rFIefw2b0hDly2IkWWxfE5CCwkqBYlft9kfR03zA
    11L81M/Eo2xniXosHiVjSRr5q01LZBWykD8EexYQPOMqP8eYXumPgLob1WD1h/ryVp
    12k9qmDsCoV32tfsUiOv53I5CRLG69jJaLqvNAkdTMfrlb1VZlvQwmtdVNx+W1UPNC
    13fknQj1whu40fPMNYX+Qk8F9HKHXV3zCZu27sFad0nnGtg5Xio29ayck/OTb5uLcU
    14ACSj6sFSf9rta3WvFGU9HwoicnkArFhb/81itfvn4K+M9YHDO0Fafokv2xm4grjx
    15CZwsrSnk
    16=ovF6
    17-----END PGP SIGNATURE-----
    

    Timestamp of file with hash c058e638449b2ccb23d207a2e64064c908b5cebd9528068119d048e87b501d5a -

  44. practicalswift commented at 12:45 pm on April 30, 2020: contributor
    ACK 2748e8793267126c5b40621d75d1930e358f057e
  45. jonatack commented at 4:17 pm on April 30, 2020: member
    @pierreN no worries, a month (or much, much longer) isn’t unusual due to the number of PRs opened relative to the number of people who review.
  46. pierreN commented at 4:44 am on May 1, 2020: contributor
    Ow my bad - OK, thanks for the ACKs, I’ll leave the PR open.
  47. fanquake merged this on May 2, 2020
  48. fanquake closed this on May 2, 2020

  49. sidhujag referenced this in commit 3fdaa759ad on May 2, 2020
  50. fanquake referenced this in commit df7045d5a1 on May 15, 2020
  51. fanquake referenced this in commit 245c862cfd on May 15, 2020
  52. deadalnix referenced this in commit c05850dc07 on Jan 22, 2021
  53. furszy referenced this in commit 99f478be04 on Mar 1, 2021
  54. backpacker69 referenced this in commit a48fabdfb7 on Mar 28, 2021
  55. DrahtBot locked this on Feb 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-21 15:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me