Optimize CScript.FindAndDelete #7884

pull gavinandresen wants to merge 2 commits into bitcoin:master from gavinandresen:optimize_FindAndDelete changing 2 files +133 −1
  1. gavinandresen commented at 3:50 PM on April 15, 2016: contributor

    The FindAndDelete function could move the same bytes multiple times if the signature being deleted appeared in the script multiple times.

    These two commits add a unit test for FindAndDelete and optimize it so that bytes are moved at most once.

  2. laanwj added the label Validation on Apr 15, 2016
  3. laanwj added the label Resource usage on Apr 15, 2016
  4. in src/script/script.h:None in ad1fa51e8d outdated
     578 |          {
     579 |              while (end() - pc >= (long)b.size() && memcmp(&pc[0], &b[0], b.size()) == 0)
     580 |              {
     581 | -                pc = erase(pc, pc + b.size());
     582 | +                // Lazy copy-in-place if there is a match:
     583 | +                while (copied != pc-nShift) {
    


    kazcw commented at 10:42 PM on April 15, 2016:

    A test for !nShift here would prevent copying everything upto the first match to itself

  5. in src/script/script.h:None in ad1fa51e8d outdated
     591 |          }
     592 |          while (GetOp(pc, opcode));
     593 | +
     594 | +        if (nShift > 0)
     595 | +        {
     596 | +            while (copied != pc-nShift) {
    


    kazcw commented at 10:44 PM on April 15, 2016:

    I think this changes the result of FindAndDelete for invalid CScripts that end in a pushdata without enough data. Replacing "pc" with "end()" would keep the original behavior in this case.

  6. sipa commented at 9:30 AM on April 16, 2016: member

    Concept ACK on the improvement, but there are maybe a few more edge cases to test:

    • Removing a byte sequence that spans multiple opcodes
    • Trying to remove a sequence that only occurs in the middle of opcodes (which doesn't work)
    • Trying to remove "OP_0 OP_1" from "OP_0 OP_0 OP_1 OP_1" leaves "OP_0 OP_1" (because we don't go backward after a replacement)
  7. gavinandresen force-pushed on Apr 18, 2016
  8. gavinandresen force-pushed on Apr 18, 2016
  9. Unit test for CScript::FindAndDelete 9d213d2b39
  10. Optimize CScript::FindAndDelete
    CScript::FindAndDelete, called by the CHECKSIG Script opcodes, could
    end up moving the bytes of the Script multiple times if there were
    multiple matches for the Find.
    
    This commit tweaks the algorithm so it never moves bytes in the Script
    more than once per call to FindAndDelete.
    aae53f20dc
  11. gavinandresen force-pushed on Apr 18, 2016
  12. gavinandresen commented at 7:42 PM on April 18, 2016: contributor

    Added unit tests as suggested by @sipa. Fixed to be compatible with invalid CScripts that end with partial PUSHDATA's (and added unit tests) as suggested by @kazcw

    And saved a couple of lines of code (and probably made faster, but copying is never done in normal operation with reasonable transactions) by switching to using std::copy instead of an explicit loop.

    Thanks for the code reviews!

  13. in src/test/script_tests.cpp:None in aae53f20dc
    1109 | +    BOOST_CHECK(s == expect);
    1110 | +
    1111 | +    // This is an odd edge case: strip of the push-three-bytes
    1112 | +    // prefix, leaving 02ff03 which is push-two-bytes:
    1113 | +    s = ScriptFromHex("0302ff030302ff03");
    1114 | +    d = ScriptFromHex("03");
    


    TheBlueMatt commented at 12:58 AM on April 19, 2016:

    I'd prefer to not have tests that check that FindAndDelete do exactly what we currently do when called against invalid scripts when we dont require that in any callers.


    gavinandresen commented at 3:03 PM on April 25, 2016:

    RE: behavior with invalid Scripts: I don't care one way or the other; seems safer to behave exactly the same, since this is consensus-critical code.

  14. in src/script/script.h:None in aae53f20dc
     579 |          {
     580 |              while (end() - pc >= (long)b.size() && memcmp(&pc[0], &b[0], b.size()) == 0)
     581 |              {
     582 | -                pc = erase(pc, pc + b.size());
     583 | +                // Lazy copy-in-place if there is a match:
     584 | +                if (nShift == 0) copied = pc;
    


    dcousens commented at 2:27 AM on April 21, 2016:

    Isn't nShift always == 0 at the beginning? Why bother initializing copied above?


    gavinandresen commented at 11:14 AM on April 21, 2016:

    Why initialize copied instead of just eliminating the if/else and always copying? Because std::copy behavior is undefined if the first and last arguments point to the same element.


    kazcw commented at 4:03 PM on April 21, 2016:

    Why would that be? The Standard's definition of std::copy doesn't seem to say anything that would prohibit an empty range.


    gavinandresen commented at 4:30 PM on April 21, 2016:

    The standard says:

    "result shall not be in the range [first,last)."


    kazcw commented at 5:21 PM on April 21, 2016:

    last is not in the range [first,last)


    gavinandresen commented at 3:14 PM on April 25, 2016:

    @kazcw : The first time a match is found, copied will be begin(). nShift will be zero, and pc will be pointing into the array somewhere.

    If we removed copied = pc and just did: copied = std::copy(copied+nShift, pc, copied);

    ... then first is copied, last is pc, and result is copied. copied IS in the range [copied, pc)

  15. sipa commented at 8:01 AM on May 17, 2016: member

    Superseded by #7907.

  16. sipa closed this on May 17, 2016

  17. MarcoFalke locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-19 00:15 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me