Unify and simplify CScript push methods #5091

pull sipa wants to merge 2 commits into bitcoin:master from sipa:unipush changing 12 files +201 −228
  1. sipa commented at 5:35 am on October 16, 2014: member
    • Make CScript::operator«(const std::vector&) cover all push operations (including OP_0, OP_1..OP_16 and OP_1NEGATE), and make the push for numbers use that.
    • Convert the uint256, uint160 and CPubKey into a single templated one using the vector push code.
    • Get rid of the int64_t operator«; better be explicit about what needs pushing, so change all number pushes to use CScriptNum (which, due to the generic vector push now also correctly uses OP_n).
    • Get rid of some unused methods and constructors as a result.
  2. sipa force-pushed on Oct 16, 2014
  3. in src/chainparams.cpp: in c4156c6930 outdated
    130+        const unsigned char pszCoinbase[] = "\x04\xff\xff\x00\x1d\x01\x04\x45The Times 03/Jan/2009 Chancellor on brink of second bailout for banks";
    131         CMutableTransaction txNew;
    132         txNew.vin.resize(1);
    133         txNew.vout.resize(1);
    134-        txNew.vin[0].scriptSig = CScript() << 486604799 << CScriptNum(4) << vector<unsigned char>((const unsigned char*)pszTimestamp, (const unsigned char*)pszTimestamp + strlen(pszTimestamp));
    135+        txNew.vin[0].scriptSig = CScript(pszCoinbase, pszCoinbase + 77);
    


    TheBlueMatt commented at 5:55 am on October 16, 2014:
    Why?

    sipa commented at 5:58 am on October 16, 2014:
    Because the number push operator doesn’t work anymore - it uses OP_4 intead of 1-byte push of 0x04. As this is one of the very few cases where you need the byte-level ability, I figured it’s just shorter to turn it into a single byte sequence here.

    TheBlueMatt commented at 6:01 am on October 16, 2014:
    Ugh, ok then…maybe put in a comment of the old code?

    sipa commented at 6:03 am on October 16, 2014:
    Why does the old code matter? The coinbase input is just a byte array.

    TheBlueMatt commented at 6:05 am on October 16, 2014:
    I suppose for historical nostalgia? In any case, it matters not.

    sipa commented at 6:10 am on October 16, 2014:
    Done.
  4. in src/script/script.h: in c4156c6930 outdated
    395+        else if (b.size() == 1 && b[0] >= 1 && b[0] <= 16)
    396+        {
    397+            insert(end(), b[0] + (OP_1 - 1));
    398+            return *this;
    399+        }
    400+        else if (b.size() < OP_PUSHDATA1)
    


    TheBlueMatt commented at 5:57 am on October 16, 2014:
    Tiny nit: Can you drop this else and put in a \n to indicate the end of the standalone ifs? (and maybe make the above three ifs without else, but that is more personal style).

    sipa commented at 6:04 am on October 16, 2014:
    I’ve changed it a bit, have a look.
  5. in src/script/script.h: in c4156c6930 outdated
    403@@ -436,6 +404,13 @@ class CScript : public std::vector<unsigned char>
    404         return *this;
    405     }
    406 
    407+    // Append a push operation for any data type convertible to std::vector<unsigned char>.
    408+    template<typename T>
    409+    CScript& operator<<(const T& b)
    


    TheBlueMatt commented at 5:59 am on October 16, 2014:
    As much as begin() and end() are standard, I kinda prefer making this more explicit, if possible?

    sipa commented at 6:04 am on October 16, 2014:
    Any suggestion?

    TheBlueMatt commented at 6:07 am on October 16, 2014:
    well not using the « syntax, for one. But maybe something even more explicit like ConvertToVectorAndPush?

    sipa commented at 6:09 am on October 16, 2014:
    If we want that, I think we should do it for all of these operator«’s.

    TheBlueMatt commented at 6:11 am on October 16, 2014:
    My concern is only of blind type conversion…it has certainly bitten us in the past.

    sipa commented at 6:15 am on October 16, 2014:
    Yeah, I understand. But I also prefer not to touch hundreds of lines of code. Given that the type passed in must be convertible to a std::vector, this should be pretty safe (and CScript itself is already excluded specifically).

    laanwj commented at 6:52 am on October 16, 2014:

    As said in #4981, I’m not incredibly happy about this. It seems the kind of magic use of C++ that results in obscure hard-to-debug problems.

    I’d like the serialization code (and consensus code in general) to move in the direction of being as human-readable and explicit as possible.

    So in this case I’d prefer having an explicit function that does this conversion to a serializable type.


    jgarzik commented at 6:53 am on October 16, 2014:
    +1 @laanwj

    sipa commented at 6:57 am on October 16, 2014:
    Sure, no objection against a ToByteVector function. It’s probably easier to do that in #4981 though.
  6. sipa force-pushed on Oct 16, 2014
  7. sipa force-pushed on Oct 16, 2014
  8. TheBlueMatt commented at 6:31 am on October 16, 2014: member
    utACK…hope no one ever writes a class with unsigned char* begin() and unsigned char* end()
  9. laanwj commented at 6:46 am on October 16, 2014: member

    hope no one ever writes a class with unsigned char* begin() and unsigned char* end()

    Why not? That would still be transparently converted to a std::vector, right?

  10. TheBlueMatt commented at 7:00 am on October 16, 2014: member

    OK, hope no one ever writes a class with unsigned char* begin() and unsigned char* end() and doesnt make those map to a consecutive memory region they want turned into a vector :)

    On 10/16/14 06:46, Wladimir J. van der Laan wrote:

    0hope no one ever writes a class with unsigned char* begin() and
    1unsigned char* end()
    

    Why not? That would still be transparently converted to a std::vector, right?

    — Reply to this email directly or view it on GitHub #5091 (comment).

  11. sipa force-pushed on Oct 17, 2014
  12. sipa commented at 0:11 am on October 17, 2014: member
    Rebased on top of #4981.
  13. sipa force-pushed on Oct 17, 2014
  14. sipa commented at 1:18 am on October 17, 2014: member
    Also added a move-only commit to move a lot of the larger script methods to the .cpp file.
  15. TheBlueMatt commented at 0:12 am on October 18, 2014: member
     0-----BEGIN PGP SIGNED MESSAGE-----
     1Hash: SHA1
     2
     3utACK commithash c5a758d7cd63c2b3bbfd8daef5d0ebe5d786a705
     4-----BEGIN PGP SIGNATURE-----
     5Version: GnuPG v2
     6
     7iQIcBAEBAgAGBQJUQbA0AAoJEIm7uGY+LmXOcfIP/2KISAfdH5eHoMbR+YEZSPC7
     8YmhfuitFgf2Q94itF5K33j1QDtzAIFyNQSrAPRZJrf3aKuoaVGMjB0yOJ5pLqagL
     9kfuHlLVXXCZeRxkGn1dp0KSJCzkpMYWgmCh1sFRkoB/9SJYEAF0pW0GNUGlCWoGR
    10uqAcwOv+KGv3K2DHDG0vaSa4DFpGPMqAGIF0ZhPXUwrrYw5P1iPBp/WTjhyRVEfd
    11lEALQ2byIbUK4JC5TSNFpj7hkPHbxMNWlFC6uX+4Y1ZSa+ZJFe3/1UuCiJPVvw9C
    127JQRZhWBH2XK3fq8s71ovSLDrkyyEAGkby6pKN0HguJfqAEpWNFi0RQ73rxG/U1+
    13bPZ4OcXSHyFn9NWYeX60WGTCL6kHus10Il01PAcS1k1j+O4wGajuCXeAiPQtfCU6
    14D1BjBmI3SNBktkSuEZecOjSy0msbqRRN3B+bMdmQO9oCf0T9YpDQ1JmaRoBrJkUC
    15f3tSjP/2HhJgWsi+NHgH885G+2N5nXl8J5EZ7OCfX1xi9oMGOYNbGCs6Ovki7UuY
    160g4RqTWd24sr+Qor5k3FbaCjjCV98KVTGalyUGcH9wB5D6CzaCmRIXHZB7q7Bq5P
    177Mp/6y9ZO2J+iaj3lKboKSgJTrZdhIAyzyfkWqGgbu8HYh3Gbf9pffLeL/8Xyguj
    18C36EXr+hz3Ecz2Dm1kEo
    19=pkgg
    20-----END PGP SIGNATURE-----
    
  16. theuni commented at 0:34 am on October 18, 2014: member
    I’m not sure this makes things any clearer. I’ll delay judgement until giving it a re-read this weekend.
  17. TheBlueMatt commented at 0:35 am on October 18, 2014: member
    It makes it marginally clearer, and the move to the .cpp is nice.
  18. sipa commented at 2:21 am on October 18, 2014: member

    I should probably clarify a bit why I think this is nicer. Previously, there were some different ways of pushing vectors/numbers:

    • operator«(vector): pushes the vector using OP_0, direct push, OP_PUSHDATA1 or OP_PUSHDATA2, but not using OP_1..OP_16 or OP_1NEGATE.
    • opterator«(int64_t) pushes the vector representing the number using OP_1NEGATE, OP_0..OP_16 or a direct push.
    • operator«(CSCriptNum) pushes the vector representing the number using OP_0 or direct push.

    So the logic for number conversion and vector-to-opcode conversion was being done in several places, and pushing a vector or scriptnum would never result in using OP_1..OP_16 or OP_1NEGATE, even if that was the shortest way of doing it (which would violate BIP62).

    This pull request moves all number-to-vector logic to CScriptNum, and all vector-to-opcode logic to operator«(vector), and makes the rest wrappers around it. Also, it removes a bunch of unnecessary methods.

  19. laanwj added the label Improvement on Oct 22, 2014
  20. sipa force-pushed on Oct 25, 2014
  21. sipa commented at 9:46 am on October 25, 2014: member
    Rebased. @theuni Do you understand my motivation now for this?
  22. theuni commented at 7:01 pm on October 26, 2014: member

    @sipa Yes, It’s fine by me. My iffyness stemmed from the fact that this changes behavior somewhat from “I’ll blindly add whatever you tell me to” to “just tell me what you want and I’ll clean it up”. I’d usually prefer to allow people to shoot themselves in the foot if they’re really trying, but this really isn’t the place for that.

    Fwiw, I really meant for the ScriptBinaryData change (what eventually turned into the ToByteVector() function) to be used for this. I had hoped to leave a method around for adding exact data to a script. That would’ve made the chainparams change a good bit cleaner. But I suppose it’s possible there aren’t enough of those cases to justify easy the foot-shooting.

  23. sipa commented at 11:33 am on October 27, 2014: member
    @laanwj comments?
  24. gavinandresen commented at 5:50 pm on October 27, 2014: contributor
    utACK
  25. sipa force-pushed on Oct 29, 2014
  26. sipa commented at 3:22 pm on October 29, 2014: member
    Rebased.
  27. Unify and simplify CScript push methods
    * Make CScript::operator<<(const std::vector<unsigned char>&) cover all push
      operations (including OP_0, OP_1..OP_16 and OP_1NEGATE), and make the push
      for numbers use that.
    * Convert the uint256, uint160 and CPubKey into a single templated one using
      the vector push code.
    * Get rid of the int64_t operator<<; better be explicit about what needs pushing,
      so change all number pushes to use CScriptNum (which, due to the generic
      vector push now also correctly uses OP_n).
    * Get rid of some unused methods and constructors as a result.
    6681fdab33
  28. Move some CScript methods to script/script.cpp 6f95254462
  29. sipa force-pushed on Nov 4, 2014
  30. sipa commented at 1:56 pm on November 4, 2014: member
    Rebased.
  31. jtimon commented at 10:54 am on November 5, 2014: contributor
    Untested ACK
  32. in src/script/interpreter.cpp: in 6f95254462
    782@@ -783,7 +783,7 @@ bool EvalScript(vector<vector<unsigned char> >& stack, const CScript& script, un
    783                     for (int k = 0; k < nSigsCount; k++)
    784                     {
    785                         valtype& vchSig = stacktop(-isig-k);
    786-                        scriptCode.FindAndDelete(CScript(vchSig));
    787+                        scriptCode.FindAndDelete(CScript() << vchSig);
    


    sipa commented at 5:40 pm on November 6, 2014:
    @petertodd I think this would actually be a consensus change…

    TheBlueMatt commented at 9:28 pm on November 6, 2014:
    Yep! Well, maybe, probably not. I think this may only matter in the case of a valid signature, so you’d need a valid signature that is representable by a non-push opcode.

    petertodd commented at 9:37 pm on November 6, 2014:
    @TheBlueMatt If STRICTENC was written such that it caused the script to immediately fail, this could matter as CHECKMULTISIG evaluation exits early; the fork would be the difference between a passing CHECKMULTISIG NOT and a failed script.

    TheBlueMatt commented at 9:57 pm on November 6, 2014:
    True, though I think its rather orthogonal to this issue…and we knew (its in the src!) that STRICTENC was not a softfork.

    petertodd commented at 10:34 pm on November 6, 2014:
    That’s not the point. It’s an example of how a very subtle change could have lead to a fork because of a known consensus-critical dependency. Just because you don’t know how to exploit it right now doesn’t mean you can’t.

    TheBlueMatt commented at 10:36 pm on November 6, 2014:
    Huh? My point was that STRICTENC is already a hardfork, and while I agree 100% with everything you said in your last statement, its again not connected to the previous discussion.

    petertodd commented at 10:45 pm on November 6, 2014:

    Again, you’re comment gives the impression that you’re arguing this change is ok, on the basis that it can’t be exploited right now. I’m arguing this change isn’t ok, because we have solid evidence that it changes behavior that could be used for an exploit. I’m not sure how, and maybe it isn’t possible, but it’s obviously a risky change.

    Adding a separate method to serialize a unsigned char vector to a PUSHDATA and nothing else, and changing FindAndDelete() calls to use that method, may be one way to mitigate the risk.

    Note how all of the above also suggests that conversions of data to CScripts() is something that should ideally be kept separate of the consensus-critical code.


    petertodd commented at 10:46 pm on November 6, 2014:
    This is also a good argument for incorporating hashed instruction-by-instruction execution traces into any future OP_EVAL proposal, where failure to match the trace hash exactly fails immediately. Merkelized abstract syntax tree ideas lend themselves naturally to this.

    TheBlueMatt commented at 10:49 pm on November 6, 2014:
    My point was not that this is OK, but that arguing based on STRICTENC is a completely invalid argument. In any case, we clearly agree, so…whatever.

    petertodd commented at 11:32 pm on November 6, 2014:
    There’s nothing invalid about arguing that we had a close call. We should follow development practices that prevent close calls, given it’s only a matter of time before one of those close calls turns into an actual exploit.
  33. sipa commented at 11:43 pm on November 6, 2014: member
    Too risky. Closing.
  34. sipa closed this on Nov 6, 2014

  35. jtimon commented at 3:05 pm on November 7, 2014: contributor
    Maybe we can push a subset of the changes that is not risky?
  36. MarcoFalke locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-19 15:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me