GCC 9.2.1 on POWER9 emits a lot of branches for carries in the 32-bit scalar code. :( The issue seems to be similar to the one in the ECDH code: the comparisons used to compute carries aren't reliably turned into constant-time assembly.
Originally posted by @gmaxwell in #708 (comment)
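
For illustration, here is a minimal sketch of the kind of pattern in question. It is not the library's actual code, and all three function names are hypothetical. The first helper derives the carry from a comparison, which a compiler is free to lower to a compare-and-branch. The other two compute the same carry without a comparison in the source, either by widening to 64 bits or with bitwise operations. Neither form is guaranteed to compile to branch-free code, so the generated assembly would still need to be checked on each target.

```c
#include <stdint.h>

/* Hypothetical helper: carry taken from a comparison.
 * The result is correct, but the compiler may turn `sum < a` into a branch. */
static uint32_t add_carry_cmp(uint32_t a, uint32_t b, uint32_t *sum) {
    *sum = a + b;          /* wraps modulo 2^32 */
    return *sum < a;       /* 1 if the addition wrapped, else 0 */
}

/* Hypothetical helper: carry taken from the high half of a 64-bit sum.
 * There is no comparison in the source for the compiler to branch on. */
static uint32_t add_carry_wide(uint32_t a, uint32_t b, uint32_t *sum) {
    uint64_t t = (uint64_t)a + b;
    *sum = (uint32_t)t;
    return (uint32_t)(t >> 32);
}

/* Hypothetical helper: carry out of bit 31 from bitwise operations only.
 * A carry occurs if both top bits are set, or if either top bit is set
 * and the top bit of the sum is clear. */
static uint32_t add_carry_bits(uint32_t a, uint32_t b, uint32_t *sum) {
    uint32_t s = a + b;
    *sum = s;
    return ((a & b) | ((a | b) & ~s)) >> 31;
}
```

Comparing the output of `gcc -O2 -S` for these three variants on the affected target is one way to see which of them the compiler turns into branches.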