(As suggested by @jtimon)
We currently recompute the signature hash multiple times during an OP_CHECKMULTISIG validation, usually with the same data and nHashType each time. Implement a small per-evaluation cache to reuse those values.
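The caching idea can be sketched as follows. This is an illustrative model, not the PR's actual code: `ExpensiveSighash` is a hypothetical stand-in for Core's `SignatureHash()` (whose real inputs are the scriptCode, transaction, and input index), and only the nHashType dependency is modeled so the sketch stays self-contained.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for Core's SignatureHash(): the real function
// serializes the scriptCode, transaction and input index and double-SHA256s
// them; here only nHashType is modeled so the sketch is self-contained.
static int g_computations = 0;
std::string ExpensiveSighash(uint8_t nHashType) {
    ++g_computations;
    return "sighash-" + std::to_string(nHashType);
}

// Per-evaluation cache: within one EvalScript run the signed data is fixed,
// so the result only varies with nHashType and can be memoized.
class SighashCache {
    std::map<uint8_t, std::string> cache;
public:
    const std::string& Get(uint8_t nHashType) {
        auto it = cache.find(nHashType);
        if (it == cache.end())
            it = cache.emplace(nHashType, ExpensiveSighash(nHashType)).first;
        return it->second;
    }
};
```

With this shape, a 2-of-3 CHECKMULTISIG whose signatures all use the same nHashType computes the hash once instead of once per signature check.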
No, we didn't, and the gains probably won't be very big. This came up in a conversation where I pointed out that if we knew the sighash before calling EvalScript, the serialization could be done beforehand and only once per script evaluation (instead of once per CheckSig). EvalScript receives an nHashType parameter, but it is only used to check that it matches the type contained at the end of each signature. As shown in #4555, it is almost always called with 0, which makes CheckSig skip that check and just use what's contained at the end of the signature. I'm still struggling to understand why there can be different sighashes for the same script/output, and why they are placed after the signatures instead of somewhere else. Sipa's "ask Satoshi" is not a satisfactory answer to me…
I'm also still thinking about what would be a good way to benchmark this…
#4555 removes a feature of EvalScript() that lets the caller determine what type of sighashes a scriptSig contains. For instance, I might want to verify that a scriptSig contains only SIGHASH_ANYONECANPAY sigs to see whether I can add/remove inputs.
As for why each CHECKSIG can use a different nHashType and thus a different sighash, that's just flexibility. For instance, GreenAddress.it's nLockTime scheme uses that feature: the refund scripts they sign with nLockTime are signed with ANYONECANPAY|NONE, letting the other key do whatever it wants too.
What do you mean by "why they are placed after the signatures instead of somewhere else"?
I don't understand that use case. Is the refund script signed with ANYONECANPAY and something else at the same time? Is it multisig? I don't get it. By "placed after the signatures" I mean: why is the sighash in vchSig.back() of each CheckSig call? As for the feature that #4555 removes, it is not currently used by the core or the wallet; in any case, if there were only one sighash per EvalScript/input instead of one per signature appearing in the scriptSig, that additional check would be trivial to implement at a higher level. It could be a field on each input, for example:
if (input.nSighash == SIGHASH_ANYONECANPAY)
It's probably too late to change that, but I would like to understand whether there is a motivation I'm missing or it was just a design mistake.
@jtimon Yeah, it’s multisig.
The vchSig.back() bit is just standard conservatism to make sure a signature can't be reused for something unintended. I remember finding a case where that prevented a security issue a while back, although for the life of me I don't remember the details exactly.
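To make the vchSig.back() point concrete, here is a sketch of how the sighash type rides along at the end of each pushed signature. The SIGHASH flag values below match the Bitcoin protocol, but `ExtractHashType` is an illustrative helper, not Core's actual code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Protocol-defined sighash flag values.
const uint8_t SIGHASH_ALL = 0x01;
const uint8_t SIGHASH_NONE = 0x02;
const uint8_t SIGHASH_ANYONECANPAY = 0x80;

// The final byte of each pushed signature is its nHashType; CheckSig reads
// it, then strips it before verifying the DER-encoded signature proper.
// (Illustrative helper: real code must first reject empty signatures.)
uint8_t ExtractHashType(const std::vector<uint8_t>& vchSig) {
    return vchSig.back();
}
```

Because the type byte is covered by the signature itself, a signer commits to how its signature may be used, which is the reuse protection described above.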
Making the sighash part of the CTxIn structure doesn't really make sense in a system with scripts that's meant to be extensible. Anyway, CPU usage isn't the bottleneck for any of this stuff.
@jtimon Reread my answer above pointing out how GreenAddress already makes use of the fact that there can be more than one SIGHASH flag in a deployed, production application.
Premature optimisation is the root of all evil; the scalability problems Bitcoin has aren’t going to be solved with little single digit improvements everywhere at the cost of making the protocol less generic.
I’ve rewritten it to use statically allocated hashes rather than an std::map, so it doesn’t even have dynamic memory overhead.
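A statically allocated variant along the lines described could look like the sketch below. This is an assumption about the shape of the rewrite, not the PR's actual code: the 32-byte `Hash256` type and `ComputeSighash()` are stand-ins, and the cache is a fixed array of one slot per possible nHashType byte plus a validity flag, so lookups never touch the heap.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Stand-in for a 32-byte double-SHA256 result (not Core's uint256).
using Hash256 = std::array<uint8_t, 32>;

// Fake "hash" computation so the sketch is checkable; the real function
// would serialize and double-SHA256 the transaction data.
Hash256 ComputeSighash(uint8_t nHashType) {
    Hash256 h{};
    h[0] = nHashType;
    return h;
}

// Static-allocation cache: fixed slots indexed directly by the nHashType
// byte, with a bool per slot marking whether it has been filled.
struct StaticSighashCache {
    std::array<Hash256, 256> hashes{};
    std::array<bool, 256> valid{};  // value-initialized: all slots empty
    const Hash256& Get(uint8_t nHashType) {
        if (!valid[nHashType]) {
            hashes[nHashType] = ComputeSighash(nHashType);
            valid[nHashType] = true;
        }
        return hashes[nHashType];
    }
};
```

Compared with a std::map, this trades a fixed 8 KB-ish footprint for constant-time lookups and no allocation on the validation hot path.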
I think the improvement is small, but maybe worthwhile. I’m not going to spend more time on it, however.
Initially I thought the optimization was going to fall out as a side effect of some refactorings I'm working on. Thank you @sipa for showing me the best I could do optimization-wise and making me realize that wasn't going to be the case. I was also interested in understanding why different sighashes are allowed within the same script evaluation; thank you @petertodd for pointing out the GreenAddress use case. Once there's one, there are probably more, but I really needed a first one.
This PR has definitely been useful for me, but since we know the optimization would be small and benefit only multisig, I'm not convinced the additional software complexity is worth it.