The FindAndDelete function could move the same bytes multiple times if the signature being deleted appeared in the script multiple times.
These two commits add a unit test for FindAndDelete and optimize it so that bytes are moved at most once.
The FindAndDelete function could move the same bytes multiple times if the signature being deleted appeared in the script multiple times.
These two commits add a unit test for FindAndDelete and optimize it so that bytes are moved at most once.
578 | {
579 | while (end() - pc >= (long)b.size() && memcmp(&pc[0], &b[0], b.size()) == 0)
580 | {
581 | - pc = erase(pc, pc + b.size());
582 | + // Lazy copy-in-place if there is a match:
583 | + while (copied != pc-nShift) {
A test for !nShift here would prevent copying everything upto the first match to itself
591 | } 592 | while (GetOp(pc, opcode)); 593 | + 594 | + if (nShift > 0) 595 | + { 596 | + while (copied != pc-nShift) {
I think this changes the result of FindAndDelete for invalid CScripts that end in a pushdata without enough data. Replacing "pc" with "end()" would keep the original behavior in this case.
Concept ACK on the improvement, but there are maybe a few more edge cases to test:
CScript::FindAndDelete, called by the CHECKSIG Script opcodes, could
end up moving the bytes of the Script multiple times if there were
multiple matches for the Find.
This commit tweaks the algorithm so it never moves bytes in the Script
more than once per call to FindAndDelete.
Added unit tests as suggested by @sipa. Fixed to be compatible with invalid CScripts that end with partial PUSHDATA's (and added unit tests) as suggested by @kazcw
And saved a couple of lines of code (and probably made faster, but copying is never done in normal operation with reasonable transactions) by switching to using std::copy instead of an explicit loop.
Thanks for the code reviews!
1109 | + BOOST_CHECK(s == expect); 1110 | + 1111 | + // This is an odd edge case: strip of the push-three-bytes 1112 | + // prefix, leaving 02ff03 which is push-two-bytes: 1113 | + s = ScriptFromHex("0302ff030302ff03"); 1114 | + d = ScriptFromHex("03");
I'd prefer to not have tests that check that FindAndDelete do exactly what we currently do when called against invalid scripts when we dont require that in any callers.
RE: behavior with invalid Scripts: I don't care one way or the other; seems safer to behave exactly the same, since this is consensus-critical code.
579 | {
580 | while (end() - pc >= (long)b.size() && memcmp(&pc[0], &b[0], b.size()) == 0)
581 | {
582 | - pc = erase(pc, pc + b.size());
583 | + // Lazy copy-in-place if there is a match:
584 | + if (nShift == 0) copied = pc;
Isn't nShift always == 0 at the beginning? Why bother initializing copied above?
Why initialize copied instead of just eliminating the if/else and always copying? Because std::copy behavior is undefined if the first and last arguments point to the same element.
Why would that be? The Standard's definition of std::copy doesn't seem to say anything that would prohibit an empty range.
The standard says:
"result shall not be in the range [first,last)."
last is not in the range [first,last)
@kazcw : The first time a match is found, copied will be begin(). nShift will be zero, and pc will be pointing into the array somewhere.
If we removed copied = pc and just did: copied = std::copy(copied+nShift, pc, copied);
... then first is copied, last is pc, and result is copied. copied IS in the range [copied, pc)