Instances of operations on uninitialized values #753

issue elichai openend this issue on May 19, 2020
  1. elichai commented at 8:26 pm on May 19, 2020: contributor

    Hi, Currently we have 2 instances of operations on uninitialized values, in the cmov implementation. all the _cmov implementations read from the resulting variable(ie r->n[0] & mask0) so if r points to uninitialized values this is undefined behavior [1] [2] Unlike the memcpy debate, actual user operations and conditions on uninit values are assumed to not happen in the compiler and are used for optimizations (especially in clang)

    Instance 1: In ecdsa_sign, if noncefp returns 0 then we get to: https://github.com/bitcoin-core/secp256k1/blob/41fc78560223aa035d9fe2cbeedb3ee632c740e2/src/secp256k1.c#L517-L518 with both r and s being unitialized. Instance 2: In ecdh, we pass &tmpa here: https://github.com/bitcoin-core/secp256k1/blob/41fc78560223aa035d9fe2cbeedb3ee632c740e2/src/ecmult_const_impl.h#L186 to the ECMULT_CONST_TABLE_GET_GE in an uninit state, which become the r value in the macro. Then in: https://github.com/bitcoin-core/secp256k1/blob/41fc78560223aa035d9fe2cbeedb3ee632c740e2/src/ecmult_const_impl.h#L31-L32 we cmov into &(r)->x which will perform operations on uninit values. notice that under VERIFY we actually do clear those before: https://github.com/bitcoin-core/secp256k1/blob/41fc78560223aa035d9fe2cbeedb3ee632c740e2/src/ecmult_const_impl.h#L26-L27

    Possible solutions: The only solution I can currently think of is default initializing those arguments.

    Mitigations: We could add VALGRIND_CHECK_MEM_IS_DEFINED(r, sizeof(*r)); to all the cmovs but we only enable valgrind(-DVALGRIND) under tests, which use the VERIFY macro too, so that will not catch the second case.

    [1] http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_451.htm [2] https://markshroyer.com/2012/06/c-both-true-and-false/

  2. real-or-random commented at 7:43 am on May 20, 2020: contributor

    That’s an interesting find. I’m curious how you found this.

    Currently we have 2 instances of operations on uninitialized values, in the cmov implementation. all the _cmov implementations read from the resulting variable(ie r->n[0] & mask0) so if r points to uninitialized values this is undefined behavior [1] [2] Unlike the memcpy debate, actual user operations and conditions on uninit values are assumed to not happen in the compiler and are used for optimizations (especially in clang)

    To be fair, all these values have its address taken and from types guaranteed no to have trap representations, so (ignoring differences between C versions) these are unspecified values, and not immediate UB. But yes, if the result of the cmov is unspecified, that’s not much better. This means that the result of the sign function is unspecified.

    Currently compilers don’t exploit this, so this is not a vulnerability, but it should be fixed.

    Possible solutions: The only solution I can currently think of is default initializing those arguments.

    This seems sensible to me. It’s certainly better than exposing unspecified values to the caller or foregoing constant-time code.

    If we want to make sure that valgrind will complain about uninitialized values in 1), we could alternatively initialize to zero but then call VALGRIND_MAKE_MEM_UNDEFINED, and keep the cmov. I think the same applies to 2.

    Mitigations: We could add VALGRIND_CHECK_MEM_IS_DEFINED(r, sizeof(*r)); to all the cmovs but we only enable valgrind(-DVALGRIND) under tests, which use the VERIFY macro too, so that will not catch the second case.

    Well, why not. Depending on your abstraction, it’s unexpected that cmov functions read the target. I guess this was the reason why this was not caught by code review. We should also add a note to the docs of the cmov functions.

  3. real-or-random added this to the milestone initial release (1.0.0-rc.1) on May 20, 2020
  4. practicalswift commented at 8:14 am on May 20, 2020: contributor

    Wow, great find @elichai!

    Did you find these by fuzzing under MSan?

    Very exciting to follow your great work!

  5. peterdettman commented at 8:15 am on May 20, 2020: contributor

    https://github.com/bitcoin-core/secp256k1/blob/41fc78560223aa035d9fe2cbeedb3ee632c740e2/src/ecmult_const_impl.h#L31-L32

    Note that the logic for ECMULT_CONST_TABLE_GET_GE would work fine still if the first loop iteration was an unconditional copy. The m == idx_n can be ignored because either the first value is the correct one, or else it will be replaced by the correct one in a later iteration.

    I suppose it should be verified that the table size is > 0, and that the index is definitely in bounds.

  6. elichai commented at 8:23 am on May 20, 2020: contributor

    That’s an interesting find. I’m curious how you found this.

    Wow, great find @elichai!

    Did you find these by fuzzing under MSan?

    Very exciting to follow your great work!

    Thanks. Actually no, I couldn’t get MSan or valgrind to trip here (without adding an explicit call to VALGRIND_CHECK_MEM_IS_DEFINED) I think I saw some weird gcc-10 warning on the scalar_cmov impl in scalar_low that made me investigate these functions, but for some reason I can’t recreate that, so I’m honestly not sure I got there

  7. elichai commented at 8:59 am on May 20, 2020: contributor

    Well, why not. Depending on your abstraction, it’s unexpected that cmov functions read the target. I guess this was the reason why this was not caught by code review. We should also add a note to the docs of the cmov functions.

    Mostly because it has overhead https://godbolt.org/z/MSTRKM. (scroll down to main) at least this overhead doesn’t add any conditions that allow sidechannel attacks :)

    But also this interacts badly with valgrind_ctime_test, because both the input and output are definitely tainted, so this makes valgrind trip :/

  8. real-or-random commented at 12:00 pm on May 20, 2020: contributor

    I think I saw some weird gcc-10 warning on the scalar_cmov impl in scalar_low that made me investigate these functions, but for some reason I can’t recreate that, so I’m honestly not sure I got there

    Ah I see. It’s on my list to try to new sanitizer in gcc 10. It has landed in Arch Linux now, so I could give it a try.

    Adding MEM_DEFINED to the cmov: We can leave this aside then. (I’m not sure how arbitrary this is, we do this nowhere else.) But I don’t really understand the issues. Efficiency should not be a concern if do this in VERIFY only. And how exactly does it affect the ct time test?

  9. elichai commented at 12:07 pm on May 20, 2020: contributor

    Efficiency should not be a concern if do this in VERIFY only.

    Oh, I thought about adding it under VALGRIND, but yeah under VERIFY probably makes more sense.

    And how exactly does it affect the ct time test?

    It trips valgrind in the cmov, although putting it under VERIFY solves this.

    I’ll open a PR with those in separate commits so I can drop that if there’s any resistance

  10. elichai cross-referenced this on May 20, 2020 from issue Fix uninit values passed into cmov by elichai
  11. jonasnick commented at 2:32 pm on May 22, 2020: contributor
    Any ideas why this wasn’t caught by the existing tests run under valgrind, such as https://github.com/bitcoin-core/secp256k1/blob/05d315affe0acd02591c8db783ce1badb0c37a31/src/tests.c#L5141 ?
  12. sipa commented at 4:35 pm on May 22, 2020: contributor

    @jonasnick I think the answer is just that Valgrind can only detect bugs that are present in the compiled code. x86 machine code does not have a requirement of initializing a register before you can use it, which a compiler can exploit.

    Specifically, I think Valgrind keeps track of definedness per bit of every register and memory byte, but will treat e.g. (x & 0) as 0, even when (bits of) x are undefined. That makes sense because at the machine code level, whatever value x has, the output will always be a well-defined 0.

    In other words: the fact that the Valgrind test does not detect this is evidence that (on the platform the Valgrind test is compiled for), this code does not result in non-constant time effects (but doesn’t guarantee it’s correct, of course).

  13. real-or-random commented at 12:42 pm on May 23, 2020: contributor

    See https://valgrind.org/docs/manual/mc-manual.html#mc-manual.uninitvals

    “A complaint is issued only when your program attempts to make use of uninitialised data in a way that might affect your program’s externally-visible behaviour.” And as @sipa points out correctly, “uninit & 0” is always “0” for the compiled code that runs your real machine, but not the abstract C “machine” that we have in mind when talking about the semantics of C.

  14. real-or-random closed this on Jun 2, 2020


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin-core/secp256k1. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-22 10:15 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me