Make script interpreter independent from storage type CScript

sipa commented at 11:11 pm on April 23, 2018: member

This introduces versions of GetScriptOp, EvalScript, and VerifyScript that operate on scripts represented by Span<const unsigned char>. This makes it possible to use different representation types for scripts, as Spans can be used to refer to scripts stored in any continuous fashion in memory, regardless of the container.

This is also a step towards reducing the consensus-criticalness of CScript, but not entirely. The interpreter code still uses CScript internally for a few purposes (notably DecodeOP_N, and operator<<).

Longer term, the goal is removing the need for all scripts to share the same representation. Currently CScript is a prevector with 28 preallocated bytes - a choice we need because it’s favorable for scriptPubKeys in the UTXO cache, but it’s pretty terrible for storing scriptSigs. With this change, separate (or even custom) data structures can be used for UTXO scriptPubKeys, and scriptPubKeys/scriptSigs in transactions. One possibility for the latter is storing all scripts in a transaction concatenated in a single allocated area, rather than using separate allocations for each.

in src/span.h:36 in 3fd0d0bf52 outdated

22@@ -23,7 +23,12 @@ class Span
23     constexpr Span(C* data, std::ptrdiff_t size) noexcept : m_data(data), m_size(size) {}
24 
25     constexpr C* data() const noexcept { return m_data; }
26+    constexpr C* begin() const noexcept { return m_data; }
27+    constexpr C* end() const noexcept { return m_data + m_size; }
28     constexpr std::ptrdiff_t size() const noexcept { return m_size; }
29+    constexpr C& operator[](std::ptrdiff_t pos) const noexcept { return m_data[pos]; }
30+
31+    constexpr Span<C> subspan(std::ptrdiff_t offset) const noexcept { return Span<C>(m_data + offset, m_size - offset); }

promag commented at 11:54 pm on April 23, 2018:

Note, equivalent to count = std::dynamic_extent in std::span.

sipa commented at 0:34 am on May 22, 2018:

Which is the default, so I think it’s fine. Generally all methods here mimic a subset of the behaviour of std::span, but sometimes with optional arguments/types omitted.

promag commented at 0:00 am on April 24, 2018: member

Looks good, will look more closely later.

laanwj added the label Refactoring on Apr 24, 2018

laanwj added the label Consensus on Apr 24, 2018

laanwj commented at 1:51 pm on April 25, 2018: member

Makes interpreter.cpp more self-contained, that’s good. Verified that:

Consensus IsPayToScriptHash matches CScript::IsPayToScriptHash: copy-only apart from variable names
Consensus IsWitnessProgram matches CScript::IsWitnessProgram:
- a size_t becomes ptr_diff_t: not an issue as only positive values are possible (unsigned char + 2)
- std::vector<unsigned char>(this->begin() + 2, this->end()) becomes script.subspan(2): should be ok, it’s defined as Span(m_data + offset, m_size - offset)
Consensus IsPushOnly matches CScript::IsPushOnly:
- pc becomes Span instead of a const_iterator: effective behavior is the same
- GetOp becomes GetScriptOp: GetOp is already defined in that way, so should be fine

Other changes look straightforward.

utACK 91c12000dd97d7b9b9c82d1bbc92c7bbadd3d6ed

TheBlueMatt commented at 4:40 pm on April 27, 2018: contributor

This seems to be lacking a bunch of motivation. Can you clarify why you want to run a script stored in a span?

sipa commented at 4:51 pm on April 27, 2018: member

@TheBlueMatt They’re not stored in a Span; a Span is just a way to refer to a script stored anywhere (vector, array, prevector, whatever custom structure that can hold a continuous sequence of bytes).

I’ll add some more motivation.

TheBlueMatt commented at 4:55 pm on April 27, 2018: contributor

Suresure, I was asking for motivation of where we want to do this in the immediate future (do you have any branches that run scripts in non-prevector form?). The only place I can think of it being useful is in libbitcoinconsensus.

sipa commented at 5:02 pm on April 27, 2018: member

@TheBlueMatt No code to show, but I added some thoughts to the PR description.

TheBlueMatt commented at 5:12 pm on April 27, 2018: contributor

Ah, the scriptPubKey/scriptSig distinction seems like a reasonable idea. Would love to see a branch with that implemented to see the feasibility of using this in the short-term after a merge.

in src/script/script.cpp:365 in f31d811228 outdated

344     return true;
345 }
346+
347+bool GetScriptOp(CScriptBase::const_iterator& pc, CScriptBase::const_iterator end, opcodetype& opcodeRet, std::vector<unsigned char>* pvchRet)
348+{
349+    Span<const unsigned char> script(&*pc, end - pc);

theuni commented at 10:29 pm on April 27, 2018:

I believe this needs a check before dereference. Before, it would’ve been caught in the real GetScriptOp() by the newly removed

0if (pc >= end)
1    return false;

sipa commented at 0:04 am on April 28, 2018:

Good catch, fixed.

in src/script/script.cpp:336 in f31d811228 outdated

347+bool GetScriptOp(CScriptBase::const_iterator& pc, CScriptBase::const_iterator end, opcodetype& opcodeRet, std::vector<unsigned char>* pvchRet)
348+{
349+    Span<const unsigned char> script(&*pc, end - pc);
350+    bool ret = GetScriptOp(script, opcodeRet, pvchRet);
351+    assert(&*script.end() == &*end);
352+    pc += (&*script.begin() - &*pc);

theuni commented at 10:45 pm on April 27, 2018:

This seems unnecessarily brittle. How about caching the before-size and adding back the difference from the final script.size() ?

sipa commented at 0:04 am on April 28, 2018:

That’s indeed cleaner; done.

in src/script/interpreter.cpp:1399 in 5aa2610dd4 outdated

1394+
1395+/** Consensus-critical analogue of CScript::IsWitnessProgram. */
1396+static bool IsWitnessProgram(const Span<const unsigned char>& script, int& version, Span<const unsigned char>& program)
1397+{
1398+    if (script.size() < 4 || script.size() > 42) return false;
1399+    if (script[0] != OP_0 && (script[0] < OP_1 || script[1] > OP_16)) return false;

theuni commented at 11:16 pm on April 27, 2018:

how did script[1] sneak in here?

sipa commented at 0:04 am on April 28, 2018:

Fixed, going to add a test.

sipa commented at 10:06 pm on April 30, 2018:

Done, added a test in a follow-up commit.

sipa force-pushed on Apr 28, 2018

ryanofsky commented at 2:08 pm on May 11, 2018: contributor

Related discussion in IRC: https://botbot.me/freenode/bitcoin-core-dev/msg/99929791/

in src/span.h:34 in 3fd0d0bf52 outdated

22@@ -23,7 +23,12 @@ class Span
23     constexpr Span(C* data, std::ptrdiff_t size) noexcept : m_data(data), m_size(size) {}
24 
25     constexpr C* data() const noexcept { return m_data; }
26+    constexpr C* begin() const noexcept { return m_data; }
27+    constexpr C* end() const noexcept { return m_data + m_size; }
28     constexpr std::ptrdiff_t size() const noexcept { return m_size; }
29+    constexpr C& operator[](std::ptrdiff_t pos) const noexcept { return m_data[pos]; }

martinus commented at 3:24 pm on May 31, 2018:

I think it’s a bit dangerous to use noexcept for operator[]. If Span is used for a custom collection that might throw an exception here, using noexcept here will cause the app to terminate. std::span’s operator[] also does not use noexcept here either: http://en.cppreference.com/w/cpp/container/span/operator_at

Same goes for subspan, it might be better to throw an exception for an invalid offset than to terminate.

sipa commented at 3:26 pm on May 31, 2018:

m_data is just a pointer; operator[] is pointer arithmetic + dereferencing of a pointer. There is no chance for anything custom.

martinus commented at 3:35 pm on May 31, 2018:

ah you are of course right

jb55 commented at 0:52 am on June 21, 2018:

Just a thought: wouldn’t asserting pos < m_size be a quick win here to catch out of bounds bugs during development?

sipa commented at 2:20 am on June 22, 2018:

Sounds like a good idea to add in a form of debug mode (like DEBUG_ADDRMAN etc), but perhaps independently of this PR?

jb55 commented at 2:29 am on June 22, 2018:

👍. usually asserts are disabled when NDEBUG is defined but it looks like asserts can’t be disabled in bitcoin for some reason? so would need to do something custom… ¯_(ツ)_/¯

sipa commented at 2:35 am on June 22, 2018:

Yes, we already have that; see DEBUG_ADDRMAN, DEBUG_LOCKORDER, DEBUG_LOCKCONTENTION. More could be added.

sipa force-pushed on May 31, 2018

sipa commented at 6:53 pm on May 31, 2018: member

Rebased.

in src/script/script.cpp:279 in c88b9efbdc outdated

276@@ -276,18 +277,16 @@ bool CScript::HasValidOps() const
277     return true;
278 }
279 
280-bool GetScriptOp(CScriptBase::const_iterator& pc, CScriptBase::const_iterator end, opcodetype& opcodeRet, std::vector<unsigned char>* pvchRet)
281+bool GetScriptOp(Span<const unsigned char>& script, opcodetype& opcodeRet, std::vector<unsigned char>* pvchRet)

promag commented at 1:03 pm on June 11, 2018:

AFAICT span end can change whereas before the end iterator couldn’t change. Is this worth noting?

sipa commented at 10:46 pm on June 18, 2018:

Noted.

promag commented at 11:05 pm on June 18, 2018:

👀

sipa commented at 11:09 pm on June 18, 2018:

I’m not sure what you’re asking here. Yes, it’s a change, but I don’t think it significant enough to point out in the commit, and certainly not in the code itself.

If you mean that it should be pointed out to other reviewers, I agree - and you just did :)

promag commented at 9:34 am on June 19, 2018:

Ok then.

in src/script/script.cpp:332 in c88b9efbdc outdated

343+{
344+    if (pc >= end) return false;
345+    ptrdiff_t old_size = end - pc;
346+    Span<const unsigned char> script(&*pc, old_size);
347+    bool ret = GetScriptOp(script, opcodeRet, pvchRet);
348+    pc += (old_size - script.size());

promag commented at 1:05 pm on June 11, 2018:

Could be pc += script.begin() - &*pc?

If so old_size above is not necessary if you add Span(C* begin, C* end) constructor.

promag commented at 1:07 pm on June 11, 2018:

Should not touch pc if !ret?

sipa commented at 10:47 pm on June 18, 2018:

Could be pc += script.begin() - &*pc?

Nice, done.

Should not touch pc if !ret?

Yes it should. The old GetScriptOp implementation operating on iterators did too.

promag commented at 10:58 pm on June 18, 2018:

Okay.

achow101 commented at 10:58 pm on June 11, 2018: member

utACK 6ae8a5d681b181c7cb772778e18285139df74ece

sipa force-pushed on Jun 18, 2018

jb55 commented at 1:04 am on June 21, 2018: contributor

utACK ae8df82ad34a6baaf883411308255e6c6d093c53

promag commented at 7:01 pm on June 21, 2018: member

utACK ae8df82.

achow101 commented at 10:05 pm on June 29, 2018: member

re-utACK ae8df82ad34a6baaf883411308255e6c6d093c53

DrahtBot added the label Needs rebase on Jul 30, 2018

sipa force-pushed on Aug 1, 2018

sipa commented at 6:42 pm on August 1, 2018: member

Rebased.

DrahtBot removed the label Needs rebase on Aug 1, 2018

sipa force-pushed on Aug 1, 2018

DrahtBot commented at 3:18 pm on September 21, 2018: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#25331 (Add HashWriter without ser-type and ser-version and use it where possible by MarcoFalke)
#20100 (Script: split policy/error consensus codes for CLEANSTACK, MINIMALIF by sanket1729)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

DrahtBot closed this on Apr 28, 2019

DrahtBot commented at 7:11 pm on April 28, 2019: contributor

DrahtBot reopened this on Apr 28, 2019

DrahtBot added the label Needs rebase on Jun 6, 2019

sipa commented at 3:35 pm on October 3, 2019: member

Will rebase next week.

sipa force-pushed on Oct 10, 2019

sipa commented at 7:45 pm on October 10, 2019: member

Rebased.

DrahtBot removed the label Needs rebase on Oct 10, 2019

sipa force-pushed on Nov 9, 2019

sipa commented at 7:33 pm on November 9, 2019: member

Rebased, and also made GetScriptOp return a span for the data pushed.

sipa force-pushed on Nov 11, 2019

theStack commented at 4:24 pm on February 2, 2020: contributor

Related discussion in IRC: https://botbot.me/freenode/bitcoin-core-dev/msg/99929791/

The link is dead unfortunately, is the IRC log still available at another place?

in src/script/script.cpp:283 in c3ab02b8d0 outdated

283-    if (pvchRet)
284-        pvchRet->clear();
285-    if (pc >= end)
286-        return false;
287+    if (data_out)
288+        *data_out = script.subspan(0, 0);

theStack commented at 4:30 pm on February 2, 2020:

nit: Is there any reason why an empty subspan on script ist preferred over just the default constructor (i.e. *data_out = Span<const unsigned char>();? For a zero-size span the address doesn’t seem to matter.

sipa commented at 4:29 am on February 3, 2020:

No reason, changed it.

in src/script/script.cpp:331 in c3ab02b8d0 outdated

342+{
343+    if (pc >= end) return false;
344+    Span<const unsigned char> script(&*pc, end - pc);
345+    Span<const unsigned char> data;
346+    bool ret = GetScriptOp(script, opcodeRet, &data);
347+    if (pvchRet) *pvchRet = std::vector<unsigned char>(data.begin(), data.end());

theStack commented at 4:35 pm on February 2, 2020:

This could also be done without constructing a new vector instance through pvchRet->assign(...,...);?

sipa commented at 4:30 am on February 3, 2020:

Good idea, done.

in src/script/interpreter.cpp:295 in caff5a9629 outdated

293-    CScript::const_iterator pend = script.end();
294-    CScript::const_iterator pbegincodehash = script.begin();
295+    Span<const unsigned char> pc = script;
296+    Span<const unsigned char> codehash = script;
297     opcodetype opcode;
298     valtype vchPushValue;

theStack commented at 4:37 pm on February 2, 2020:

With the changes, vchPushValue is now unused and can be deleted.

sipa commented at 4:30 am on February 3, 2020:

Nice find. Gone.

in src/script/interpreter.cpp:444 in caff5a9629 outdated

311             //
312             // Read instruction
313             //
314-            if (!script.GetOp(pc, opcode, vchPushValue))
315+            Span<const unsigned char> push_value;
316+            if (!GetScriptOp(pc, opcode, &push_value)) {

theStack commented at 4:43 pm on February 2, 2020:

nit: Braces can be removed here as the if-body only consists of one single statement (to be consistent with the remaining ifs in the same function).

sipa commented at 4:31 am on February 3, 2020:

I’d rather not. The style guide only allows brace-less ifs if the then clause is on the same line as the if condition. Doing so would make the line a bit long in my opinion.

The fact that it’s inconsistent with the code around it doesn’t matter.

theStack commented at 1:10 pm on February 4, 2020:

Oh, my assumption was that mimicing the surrounding style was favored over new style rather than the other way around, mea culpa.

in src/script/interpreter.cpp:1467 in 5b97a31622 outdated

1460@@ -1461,21 +1461,22 @@ static bool IsPushOnly(const Span<const unsigned char>& script)
1461     return true;
1462 }
1463 
1464-static bool VerifyWitnessProgram(const CScriptWitness& witness, int witversion, const std::vector<unsigned char>& program, unsigned int flags, const BaseSignatureChecker& checker, ScriptError* serror)
1465+static bool VerifyWitnessProgram(const CScriptWitness& witness, int witversion, const Span<const unsigned char>& program, unsigned int flags, const BaseSignatureChecker& checker, ScriptError* serror)
1466 {
1467     std::vector<std::vector<unsigned char> > stack;
1468-    CScript scriptPubKey;
1469+    CScript temp_script;

theStack commented at 4:49 pm on February 2, 2020:

As temp_script is only used once, it could be declared and initialized immediately (as const) below at the place where it is needed.

sipa commented at 4:32 am on February 3, 2020:

That does not work (see other comment).

in src/script/interpreter.cpp:1487 in 5b97a31622 outdated

1484@@ -1484,7 +1485,8 @@ static bool VerifyWitnessProgram(const CScriptWitness& witness, int witversion,
1485             if (witness.stack.size() != 2) {
1486                 return set_error(serror, SCRIPT_ERR_WITNESS_PROGRAM_MISMATCH); // 2 items in witness
1487             }
1488-            scriptPubKey << OP_DUP << OP_HASH160 << program << OP_EQUALVERIFY << OP_CHECKSIG;
1489+            temp_script << OP_DUP << OP_HASH160 << program << OP_EQUALVERIFY << OP_CHECKSIG;
1490+            script = Span<const unsigned char>(temp_script.data(), temp_script.size());
1491             stack = witness.stack;

theStack commented at 4:51 pm on February 2, 2020:

Given that temp_script is declared as const (see previous reviews comment above), one can simply use MakeSpan(temp_script) here.

sipa commented at 4:32 am on February 3, 2020:

That’s unfortunately incorrect. The script variable is a Span which doesn’t have its own storage, so it needs the temp_script to remain intact over its lifetime. I’ve added a comment to clarify.

theStack commented at 12:58 pm on February 4, 2020:

Whoops, you are of course right, temp_script wouldn’t be alive anymore when script is used below. The evil thing with errors like those is that they are hard to get catch by tests: when I tried out my suggestion all of the unit tests still passed successfully. Obviously even though temp_script wasn’t alive anymore the data at the address was still there unmodified. Only creating a destructor ~CScript() { assign(size(), 0}; } that nullifies the data at the end helped to trigger the failures.

theStack commented at 4:55 pm on February 2, 2020: contributor

Concept ACK – I left some code review comments, mostly nits and minor readability issues though.

sipa force-pushed on Feb 3, 2020

theStack approved

theStack commented at 1:12 pm on February 4, 2020: contributor

Code-Review ACK https://github.com/bitcoin/bitcoin/pull/13062/commits/82a4c2cbaf9bded80f2bc4f19735e0c801126c53

DrahtBot added the label Needs rebase on Mar 13, 2020

sipa force-pushed on Mar 14, 2020

sipa commented at 2:07 am on March 14, 2020: member

Rebased (nontrivially) on the now-merged #18002.

DrahtBot removed the label Needs rebase on Mar 14, 2020

DrahtBot added the label Needs rebase on Mar 14, 2020

sipa force-pushed on Mar 14, 2020

sipa commented at 10:05 pm on March 14, 2020: member

Rebased on top of the now-merged #16902.

DrahtBot removed the label Needs rebase on Mar 14, 2020

DrahtBot added the label Needs rebase on Mar 27, 2020

sipa force-pushed on Mar 27, 2020

sipa commented at 11:18 pm on March 27, 2020: member

Rebased now that #18388 is merged. Also added a commit to get rid of two more unnecessary CScript instances in the verification code.

DrahtBot removed the label Needs rebase on Mar 28, 2020

sipa force-pushed on Mar 28, 2020

sipa force-pushed on Mar 30, 2020

sipa force-pushed on Mar 31, 2020

DrahtBot added the label Needs rebase on Apr 10, 2020

sipa force-pushed on Apr 12, 2020

sipa commented at 8:38 pm on April 12, 2020: member

Rebased now that #18422 is merged.

DrahtBot removed the label Needs rebase on Apr 12, 2020

sipa force-pushed on May 1, 2020

sipa force-pushed on May 2, 2020

sipa force-pushed on May 5, 2020

DrahtBot added the label Needs rebase on May 11, 2020

sipa force-pushed on May 12, 2020

DrahtBot removed the label Needs rebase on May 12, 2020

adamjonas commented at 4:57 pm on June 18, 2020: member

For those looking to review - this had 4 pre-rebase utACKs (laanwj, jb55, achow101, promag) and a code review ACK (theStack) with no NACKs or any show-stopping concerns raised.

jb55 commented at 6:04 pm on June 18, 2020: contributor

post-rebase Concept ACK

sipa force-pushed on Jun 19, 2020

sipa commented at 6:33 pm on June 19, 2020: member

(Trivially) rebased now that #18468 is merged.

theuni commented at 7:30 pm on June 19, 2020: member

Just a general question/concern as I re-review: I don’t see in the std::span docs what the guarantees are if the underlying structure is extended after the span is created. If it’s supposed to remain generally sane (which I would vaguely expect because it’s required to be contiguous), I’m nervous about prevector:

 0int main()
 1{
 2    std::vector<int> foo;
 3    foo.push_back(1);
 4    foo.push_back(2);
 5    foo.push_back(3);
 6    Span<int> foospan(foo);
 7    foo.push_back(4);
 8    printf("foospan last element: %i\n", foospan.back());
 9
10    prevector<3, int> bar;
11    bar.push_back(1);
12    bar.push_back(2);
13    bar.push_back(3);
14    Span<int> barspan(bar);
15    bar.push_back(4);
16    printf("barspan last element: %i\n", barspan.back());
17}

result:

0$ ./spantest
1foospan last element: 3
2barspan last element: 6

sipa commented at 7:38 pm on June 19, 2020: member

@theuni I imagine that the guarantees for any container are that anything that may invalidate references, invalidates a span over those elements.

It’s a good point that we should document reference/iterator invalidation for our custom data structures like prevector (which could be something like “extending a prevector to any size beyond its preallocated limit invalidates all references”).

Both of your examples are UB (it’s the same as taking a reference to container.back(), and then extending the container).

in src/script/interpreter.cpp:1571 in a6b0522c44 outdated

1567@@ -1569,21 +1568,20 @@ static bool VerifyWitnessProgram(const CScriptWitness& witness, int witversion,
1568             if (stack.size() == 0) {
1569                 return set_error(serror, SCRIPT_ERR_WITNESS_PROGRAM_WITNESS_EMPTY);
1570             }
1571-            const valtype& script_bytes = SpanPopBack(stack);
1572-            scriptPubKey = CScript(script_bytes.begin(), script_bytes.end());
1573+            const valtype& script = SpanPopBack(stack);

theuni commented at 8:17 pm on June 19, 2020:

This would be much less confusing to read if reordered:

0uint256 hashScriptPubKey;
1CSHA256().Write(stack.back().data(), stack.back().size()).Finalize(hashScriptPubKey.begin());
2if (memcmp(hashScriptPubKey.begin(), program.data(), 32)) {
3    return set_error(serror, SCRIPT_ERR_WITNESS_PROGRAM_MISMATCH);
4}
5return ExecuteWitnessScript(stack, SpanPopBack(stack), flags, SigVersion::WITNESS_V0, checker, serror);

Instead of as-is, popping off the stack just before it would’ve been useful.

sipa commented at 9:21 pm on June 26, 2020:

Huh, this code just didn’t make sense - it had script already, why was it accessing it through witness.stack.back()?

I’ve changed it to just use Write(script.data(), script.size()).

theuni approved

theuni commented at 8:40 pm on June 19, 2020: member

Code-review ACK 3629303374fe94b25db41a6be325e891b695720f. This is a very nice improvement. Single nit about code reordering, but it’s only a preference.

As mentioned on IRC, I am a little nervous about the greedy Span constructor as I prefer the explicitness of MakeSpan, but as @sipa pointed out, it matches std::span’s behavior.

sipa force-pushed on Jun 26, 2020

pinheadmz approved

pinheadmz commented at 10:28 pm on June 29, 2020: member

Compiled and ran all tests locally. Reviewed all code changes. Confirmed 20 0’s in {OP_DUP, OP_HASH160, 20, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, OP_EQUALVERIFY, OP_CHECKSIG}; ;-)

ACK 591f1037e474ffb1c9f84926bb8f846a515096e9

 0-----BEGIN PGP SIGNED MESSAGE-----
 1Hash: SHA256
 2
 3ACK 591f1037e474ffb1c9f84926bb8f846a515096e9
 4-----BEGIN PGP SIGNATURE-----
 5
 6iQIzBAEBCAAdFiEE5hdzzW4BBA4vG9eM5+KYS2KJyToFAl76aqcACgkQ5+KYS2KJ
 7yTrQ5g/9ETmk6FkFdcM6+fOdA3yd9CUcYDmTxSKYSwvBZAPyB/lAvsyfITl/2MX4
 8tDhJiRKlmKjLWzNxb3ajophjrm9umulylKri7aNtu6MLCCOFnHoxu3fUpsIN0GXj
 99LeisVi3W9tpq1xWNpDLf+m3ZFuYEEk4dd5ZLUo9/wbDAQ5CSjCE8YpMxbAcJwn0
10gG+OY1+o7w/9cgRyF8xVVUp5Jvew6PtrmE6Y/kGzwofCgkvg834q92cPYa2rRG7i
117LLbeKirPguqzoacP4KmHsJeJDiQkZlFQJOJDgoFJDfAPxLzNLx7u/6+Uo0twbsr
128eE19xI9EyfV7qmQAlfYC8FQhlVMXFplGCTWrWj5xehqoLlestQUROufIq465fwW
13XWBKYQbt9TkVzrc9EYxWGxGbo4/YiqmtdytHixWT3FpqfkV2ey51l/X6H22tgCHb
14GG+7QfsacchlD1TQ4OwF6zSCjmiyloSZwtyQ0M2QLn/XSSSxaVAJg2jks2r2ILjw
15oFBGiMNYg6qMaYhALJMjWjFyiNyPvg3zCiPSWUmFpLdViCrHJUoqQcCVZ99wNv8K
16G46Z1VbxX47UkjqBLkwZ7Wffc9EuwzQnxvvirlOrdwSW/2cIxP1kZJ3IDDdv1uL0
17iAXePC9pcOSBRrOd1Bdmc79DNTN+ikwL2TL/MgfFcWXaI5rdE4c=
18=mLy5
19-----END PGP SIGNATURE-----

pinheadmz’s public key is on keybase

in src/script/interpreter.cpp:443 in 591f1037e4 outdated

402 
403             //
404             // Read instruction
405             //
406-            if (!script.GetOp(pc, opcode, vchPushValue))
407+            Span<const unsigned char> push_value;

pinheadmz commented at 10:29 pm on June 29, 2020:

Do we get the benefit of the Span here even though it’s only used in a short context?

sipa commented at 4:11 am on July 1, 2020:

Not sure what you mean by “the benefit of the Span”. Spans are a more readable way of dealing with span-like data, but it depends what you care about, and what you compare with.

Compared to the current code, it’s more efficient: it’s avoiding an allocation and a copy (vchPushValue was storing a copy of the pushed data). A copy still happens when emplacing on the stack, but only in executed branches.

Compared to manually maintaining a pointer & length, which would be equally efficient… it’s certainly a lot more readable.

pinheadmz commented at 11:16 am on July 1, 2020:

I see, thanks

in src/script/interpreter.cpp:2076 in 591f1037e4 outdated

1675@@ -1674,8 +1676,7 @@ bool VerifyScript(const Span<const unsigned char>& scriptSig, const Span<const u
1676         if (flags & SCRIPT_VERIFY_WITNESS) {
1677             if (IsWitnessProgram(redeem_script, witnessversion, witnessprogram)) {
1678                 hadWitness = true;
1679-                const CScript expected_scriptsig = CScript() << redeem_script;
1680-                if (scriptSig != expected_scriptsig) {
1681+                if (scriptSig.size() != redeem_script.size() + 1 || scriptSig[0] != redeem_script.size()) {

theStack commented at 12:26 pm on July 23, 2020:

Strictly speaking that’s not semantically equivalent to the code replaced. Before it would compare scriptSig and the pushed redeem_script byte-by-byte, now with the change it only compares that the size matches and the first byte (the push). I guess that is indended though, as redeem_script is derived from scriptSig above anyways, and hence the content comparison is not needed?

hebasto commented at 11:34 am on August 6, 2020: member

Concept ACK.

hebasto commented at 12:03 pm on August 6, 2020: member

Related discussion in IRC: https://botbot.me/freenode/bitcoin-core-dev/msg/99929791/

The link is dead unfortunately, is the IRC log still available at another place?

http://www.erisian.com.au/bitcoin-core-dev/log-2018-05-10.html#l-524

hebasto commented at 12:14 pm on August 6, 2020: member

As mentioned on IRC, I am a little nervous about the greedy Span constructor as I prefer the explicitness of MakeSpan, but as @sipa pointed out, it matches std::span’s behavior.

http://www.erisian.com.au/bitcoin-core-dev/log-2020-06-19.html#l-488

in src/script/interpreter.cpp:1941 in 591f1037e4 outdated

1579@@ -1580,7 +1580,9 @@ static bool VerifyWitnessProgram(const CScriptWitness& witness, int witversion,
1580             if (stack.size() != 2) {
1581                 return set_error(serror, SCRIPT_ERR_WITNESS_PROGRAM_MISMATCH); // 2 items in witness
1582             }
1583-            CScript script = CScript() << OP_DUP << OP_HASH160 << program << OP_EQUALVERIFY << OP_CHECKSIG;
1584+            //! The script "OP_DUP OP_HASH160 <program> OP_EQUALVERIFY OP_CHECKSIG".
1585+            unsigned char script[25] = {OP_DUP, OP_HASH160, 20, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, OP_EQUALVERIFY, OP_CHECKSIG};
1586+            std::copy(program.begin(), program.end(), script + 3);

hebasto commented at 6:26 pm on August 6, 2020:

591f1037e474ffb1c9f84926bb8f846a515096e9

nit: Mind adding #include <algorithm> ?

ajtowns commented at 12:57 pm on September 22, 2020:

That change makes me twitchy; maybe static_assert(WITNESS_V0_KEYHASH_SIZE==20, "consensus critical parameter will result in buffer overrun if changed"); ?

hebasto approved

hebasto commented at 6:31 pm on August 6, 2020: member

ACK 591f1037e474ffb1c9f84926bb8f846a515096e9, tested on Linux Mint 20 (x86_64).

Just a note: functions introduced in d711c1d79b2de6cf1ba8397e0e1a6f7f7550f53e “Introduce consensus versions of script type checks on Span” still unused till commit 24739c5118a30bf324e1c9f932168f82ef118534 “gMake VerifyScript operate on Span”.

In commit 24739c5118a30bf324e1c9f932168f82ef118534 message why “gMake VerifyScript operate on Span”?

theStack approved

theStack commented at 3:16 pm on August 7, 2020: contributor

ACK 591f1037e474ffb1c9f84926bb8f846a515096e9 :package:

in src/script/interpreter.cpp:1546 in 591f1037e4 outdated

1543+    }
1544+    return false;
1545+}
1546+
1547+/** Consensus-critical analogue of CScript::IsPushOnly. */
1548+static bool IsPushOnly(const Span<const unsigned char>& script)

maflcko commented at 6:55 pm on August 27, 2020:

Any reason to make this private? static makes it harder to test and also the “other” version of IsPushOnly can’t call this to avoid code duplication.

in src/script/interpreter.cpp:1828 in d711c1d79b outdated

1518@@ -1519,6 +1519,44 @@ static bool ExecuteWitnessScript(const Span<const valtype>& stack_span, const CS
1519     return true;
1520 }
1521 
1522+/** Consensus-critical analogue of CScript::IsPayToScriptHash. */

ajtowns commented at 12:46 pm on September 22, 2020:

Was there a bad rebase here somewhere? The “Introduce consensus versions of script type checks on Span” doesn’t actually use the new code and there’s not a separate commit for VerifyWitnessProgram accepting spans – both those are instead part of “gMake VerifyScript operate on Span” which has a weird typo in the commit title. (That commit also has a typo in the description: “storare”)

sipa commented at 3:28 am on May 25, 2021:

This commit just introduces these functions; they’re used later on.

in src/test/data/script_tests.json:1272 in e1b6a6120a outdated

1268@@ -1269,6 +1269,10 @@
1269 [["51", 0.00000000 ], "", "0 0x206e340b9cffb37a989ca544e6bb780a2c78901d3fb33738768511a30617afa01d", "P2SH,WITNESS", "WITNESS_PROGRAM_MISMATCH", "Witness script hash mismatch"],
1270 [["00", 0.00000000 ], "", "0 0x206e340b9cffb37a989ca544e6bb780a2c78901d3fb33738768511a30617afa01d", "", "OK", "Invalid witness script without WITNESS"],
1271 [["51", 0.00000000 ], "", "0 0x206e340b9cffb37a989ca544e6bb780a2c78901d3fb33738768511a30617afa01d", "", "OK", "Witness script hash mismatch without WITNESS"],
1272+[["51", 0.00000000 ], "", "-1 0x021234", "P2SH,WITNESS", "WITNESS_UNEXPECTED", "OP_1NEGATE does not introduce a witness program"],

ajtowns commented at 1:14 pm on September 22, 2020:

If you rebase, could move the test additions to the first commit, since they’re testing things that are already true.

ajtowns commented at 2:35 pm on September 22, 2020: contributor

Prefer to see the commits cleaned up a little and re-review then, but I don’t see any problems.

DrahtBot added the label Needs rebase on Oct 15, 2020

sipa force-pushed on May 25, 2021

sipa commented at 3:25 am on May 25, 2021: member

Rebased on top of, and integrated with, the Taproot validation logic.

DrahtBot removed the label Needs rebase on May 25, 2021

sipa force-pushed on May 25, 2021

sipa force-pushed on May 26, 2021

sipa force-pushed on May 27, 2021

DrahtBot added the label Needs rebase on Jun 23, 2021

maflcko commented at 1:21 pm on June 23, 2021: member

Still wondering about #13062 (review) btw

sipa force-pushed on Dec 24, 2021

sipa commented at 9:34 pm on December 24, 2021: member

Rebased. Will address outstanding comments soon.

DrahtBot removed the label Needs rebase on Dec 24, 2021

fanquake referenced this in commit 3ce40e64d4 on Feb 14, 2022

maflcko added the label Needs rebase on Feb 14, 2022

maflcko added the label Waiting for author on Feb 14, 2022

DrahtBot removed the label Needs rebase on Feb 14, 2022

laanwj referenced this in commit abfcecc97b on Feb 14, 2022

sidhujag referenced this in commit 7dc1bc0be9 on Feb 14, 2022

DrahtBot added the label Needs rebase on Apr 5, 2022

Make uint{160,256} support construction from Span 4d905a7c4f

Support pushing Spans onto CScript 1bf6f3a99f

Make GetScriptOp operate on Span

This introduces a version of GetScriptOp that is independent from the script
storage type.

2da0be9ca0

Introduce consensus versions of CScript checks and use them

The CScript methods IsPayToScriptHash, IsWitnessProgram, and IsPushOnly are currently
consensus critical, and dependent on the storage type (CScript).

This creates new consensus-critical versions inside interpreter that operate on the
more efficient Span<const unsigned char> type, independent from the script storage.

The CScript methods remain, but are no longer consensus-critical.

7c4ff3cb95

Make EvalScript operate on Span

This introduces a version of EvalScript that is independent from the storage type.

bef9b402f4

Make VerifyScript operate on Span 9732fdd8a7

Avoid a few instances of CScript in interpreter 356791de84

Make CountWitnessSigOps operate on Span f5ad705b52

sipa force-pushed on Apr 5, 2022

sipa commented at 7:02 pm on April 5, 2022: member

Rebased.

DrahtBot removed the label Needs rebase on Apr 5, 2022

w0xlt approved

w0xlt commented at 7:20 pm on April 5, 2022: contributor

Code Review ACK f5ad705

DrahtBot added the label Needs rebase on Jul 22, 2022

DrahtBot commented at 7:57 am on July 22, 2022: contributor

🐙 This pull request conflicts with the target branch and needs rebase.

Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a “draft”.

maflcko removed the label Waiting for author on Aug 1, 2022

ajtowns commented at 2:31 am on October 5, 2022: contributor

I think rebasing this to current master only requires:

 0+++ src/script/interpreter.cpp
 1-+    return (CHashWriter(HASHER_TAPLEAF) << leaf_version << COMPACTSIZE(script.size()) << script).GetSHA256();
 2++    return (HashWriter(HASHER_TAPLEAF) << leaf_version << COMPACTSIZE(script.size()) << script).GetSHA256();
 3
 4+++ src/psbt.h
 5@@ -889,7 +889,7 @@ struct PSBTOutput
 6                     } else if (key.size() != 33) {
 7                         throw std::ios_base::failure("Output Taproot BIP32 keypath key is not at 33 bytes");
 8                     }
 9-                    XOnlyPubKey xonly(uint256({key.begin() + 1, key.begin() + 33}));
10+                    XOnlyPubKey xonly(Span<const unsigned char>(key).subspan(1, 32));

achow101 commented at 5:54 pm on October 12, 2022: member

Closing this as it has not had any activity in a while. If you are interested in continuing work on this, please leave a comment so that it can be reopened.

achow101 closed this on Oct 12, 2022

fanquake added the label Up for grabs on Oct 13, 2022

bitcoin locked this on Oct 13, 2023

Make script interpreter independent from storage type CScript #13062

Conflicts