Split operators in EvalScript's switch

jtimon commented at 3:02 am on October 28, 2014: contributor

After doing this, a clang pass is required. If by chance we have time to get this in before the global clang before 0.10, then we save a clang-only commit on script/interpreter.

theuni commented at 3:30 am on October 28, 2014: member

Yikes. Could you elaborate on the reasoning? In particular, I could understand moving the logic for each opcode to a function so that they could be independently verified or changed, but the groupings like OpNumUnary make this seem like a high-risk lateral move.

jtimon force-pushed on Oct 28, 2014

jtimon commented at 11:31 am on October 28, 2014: contributor

Yes, this change is about readability and maintainability (which includes being able to change a small part without affecting other parts). I used to use automatic style checking tools in java. Both Functions/methods longer than X lines and Y nested levels of flow control raised warnings saying that the function couldn’t possibly be readable and asking you to refactor it. After the change EvalScript() is 274 lines and has 4 nested levels of flow control at most, which I think it’s complex enough.

The groupings were already there, this is just the simplest (more reviewable) way to split the switch maintaining most of the code and just passsing opcode to those cases that were already grouped. In some cases like OpEqual(), I’ve replaced the opcode parameter for a simpler boolean param (fVerify to do the additional verification when the opcode is OP_EQUALVERIFY).

Maybe later OpNumUnary(), OpNumBinary(), OpCrypto(), etc can be split further without replicating code (I doubt it), but in fact splitting them in the same commit actually seems more risky and less reviewable to me.

EDIT: Anyway, the tests are failing in some platforms so I’ve must have done something wrong.

laanwj commented at 12:01 pm on October 28, 2014: member

Personally I’d rather not touch this code if not necessary.

jtimon force-pushed on Oct 28, 2014

jtimon commented at 1:38 pm on October 28, 2014: contributor

I missed a break after OP_EQUAL, fixed but now it needs rebase…

gavinandresen commented at 1:43 pm on October 28, 2014: contributor

NACK from me, too dangerous to change, benefits do not outweigh the risks.

jtimon force-pushed on Oct 28, 2014

jtimon commented at 5:17 pm on October 28, 2014: contributor

In my opinion the benefits in readability in a critical part like this are underestimated. Leaving the current code structure not only makes later changes harder to review, it also reduces the number of potential reviewers we could have for those changes and the performance of those who are already reviewing these changes.Given that those resources are currently very limited, I completely understand the bias towards minimizing review needs in critical parts of the code, this conservative attitude makes the most sense. I consider this apparently meaningless change a great investment in precisely the resources that are being economized though.

At the same time, I also believe the risks are overestimated. Through the use of version control tools like Git, I’ve discovered (mainly by contributing to complex projects including this one) that the amount of logical changes applied to a given codebase can be greatly minimized with some effort to simplify the review process. There are multiple possibilities only in this example: the operators could be tunred into functions first and change their interface (ie: replace an opcode param with a bool) later, operators could be turned into functions one by one, by groups, ordered by complexity in both directions… Many roads lead to Rome (not all of them if you’re trying to avoid crossing an seas and oceans). I’m very open to hear more ideas about the best way to reduce risks while also reducing our costs in terms of collective time. Rebased only splitting OpChecksig() and OpCheckMultisig() [the ones at the end to minimize the diff but, coincidentally, also the ones with higher levels of flow control nesting].

In any case, I also understand that pre0.10 is probably not the best time to have this discussion so I’m happy waiting for a better time and rebasing as many times as necessary. Obviously functional changes and releases should never be delayed to make refactorings easier.

Nonetheless, early feedback is always greatly appreciated, thank you @theuni @laanwj @gavinandresen

If keeping it open for later is a distraction, I can close it now and reopen it after0.10, no problem.

maaku commented at 5:41 pm on October 28, 2014: contributor

Minor style nit : should it be OpCheckSig and OpCheckMultiSig?

Otherwise, need a language lawyer to make sure that nothing changed with the introduction of a function call. Looks okay to me otherwise.

jgarzik commented at 7:15 pm on October 28, 2014: contributor

-1

Risky, and slower too.

jtimon commented at 10:40 pm on October 28, 2014: contributor

How slower? The functions can be inlined (I assumed the compiler would figure it out by itself having only one call per function though).

Split OpChecksig() and OpCheckMultisig() from EvalScript() 74ed668a5c

jtimon force-pushed on Oct 28, 2014

gmaxwell commented at 10:59 pm on October 28, 2014: contributor

I don’t like breaking the script execution engine into functions very much. The big switch statement is readable and generally easy to reason about. The rather long function prototypes are perhaps an indication that this isn’t a good abstraction boundary.

jtimon commented at 2:55 am on October 29, 2014: contributor

The big switch statement is readable and generally easy to reason about.

I obviously disagree with this.

The rather long function prototypes are perhaps an indication that this isn’t a good abstraction boundary.

The longer ones are precisely these two. Though, yes, the fact that these have such long list of attributes while most of the rest practically just use the stack may indicate something, but don’t necessarily talk bad about this division. Maybe in favor given that it makes this “parameters asymmetry” more visible.

maaku commented at 5:04 am on October 29, 2014: contributor

To be clear I saw this after it was reduced to only OpCheckSig and OpCheckMultiSig. I favor this approach for gigantic, monolithic operators like these two. Imagine how much worse it would be if/when we soft-fork functionality into CHECKSIG. I would not support splitting off execution for things like DUP et al.

laanwj commented at 7:52 am on October 29, 2014: member

@jtimon I generally agree with you that structuring code in a way to make reviewing easier is good. However this is code that is very unlikely to change in the first place. I’d only feel confident about changes like this if they don’t change the resulting executable at all, or there would be some other rigorous proof system.

laanwj added the label Improvement on Oct 29, 2014

jtimon commented at 9:28 pm on October 29, 2014: contributor

@laanwj Not changing the resulting at all should clear all risk and performance concerns, thank you for suggesting that. Is there a simple way to do that? I don’t know many details about deterministic building processes, but maybe there’s a one-line thing to compare the hashes of the resulting builds from 2 different git commits, that everybody but myself is aware of or something. I would greatly appreciate more guidance in this direction. That would be very useful for me in general, not just for this PR.

laanwj commented at 5:41 pm on November 12, 2014: member

@jtimon There’s no one-ling thing, but you could follow my process here #4180 (comment)

jtimon commented at 2:07 pm on November 15, 2014: contributor

Thank you @laanwj that seems very useful. I feel tempted to use that to write a little python script that checks that the resulting builds from 2 different commits are identical. And then use that script to check this PR. But that’s definitely after-0.10 (and with that tool we don’t care much about an additional clang since I could just indent properly from the beginning without worrying about making the diff harder to read) so I’m closing this for now.

jtimon closed this on Nov 15, 2014

MarcoFalke locked this on Sep 8, 2021

Split operators in EvalScript’s switch #5153