libbitcoinscript #4692

pull TheBlueMatt wants to merge 4 commits into bitcoin:master from TheBlueMatt:libscript changing 32 files +1402 −1224
  1. TheBlueMatt commented at 11:51 am on August 13, 2014: member

    This does the minimal effort to generate a libbitcoinscript.so which can VerifyScript, though nothing else is exposed.

    Build with ./configure –enable-shared=yes and you’ll find a libbitcoinscript.so installed (or in src/.libs).

    It weighs quite a bit, but it works, so I think its worth merging and then trimming down (and adding features) as we go (Ive already got one or two others ready, but first things first).

    Please do not ACK this unless you’ve verified there are no changes before/after the copy/paste.

  2. in src/script/standalone.cpp: in 92bacee490 outdated
    45+    return time(NULL);
    46+}
    47+
    48+
    49+
    50+extern "C" bool VerifyScript(const unsigned char *scriptSig,    const unsigned int scriptSigLen,
    


    sipa commented at 12:08 pm on August 13, 2014:
    It is necessary to pass scriptSig? It could be taken from txTo.vin[nIn].scriptSig?

    luke-jr commented at 12:18 pm on August 13, 2014:
    This is more flexible; maybe the scriptSig being passed is some new form of signed message, for example. But it would make sense to accept NULL to do the extraction from txTo…

    TheBlueMatt commented at 4:34 am on August 14, 2014:
    ACK, fixed.
  3. laanwj added the label TX scripts on Aug 13, 2014
  4. jgarzik commented at 12:49 pm on August 13, 2014: contributor
    It does not make much sense to put CScript class to scriptutils.h. CScript is not a utility.
  5. sipa commented at 12:52 pm on August 13, 2014: member
    @jgarzik CScript is defined in script/script.h, scriptutils.h only contains whatever isn’t necessary for validation.
  6. jgarzik commented at 12:59 pm on August 13, 2014: contributor
    @sipa Actually it was just the first commit. CScript is in scriptutils.h… until it is moved again in a further commit.
  7. theuni commented at 3:02 pm on August 13, 2014: member

    Great work! I really like this idea, it’s something I’ve been wanting to do for a long time.

    However, I would really prefer that this be broken up into 2 steps: splitting out libbitcoinscript as a convenience library, THEN exposing it as a shared lib once that’s stabilized. There are just too many issues to work wrt packaging/distribution/api/abi/etc to introduce a shared lib haphazardly. Once that’s exposed, it can’t be taken back.

  8. jgarzik commented at 3:22 pm on August 13, 2014: contributor

    Generally agreed. I’m curious about the C API, versus C++. Any comment on that design decision? Do we want to start providing a C API in additional to C++? If yes, to what extent? All core datastructures?

    Data point, may or may not be relevant: The few C++-native projects I know of that export a C API typically have a One Big C Wrapper approach, that sits on top of a first class C++ API. ie. you #include “bitcoin/c.h”

  9. theuni commented at 3:29 pm on August 13, 2014: member

    @jgarzik I think the c api approach is basically the only reasonable one until we have a set of cleaned up c++ structures that can be exported. We don’t have any kind of c++ api, only internal data structures.

    I think this is a great first step towards that though, and I would be very willing to work on it if others agree it’s the way forward.

  10. sipa commented at 3:32 pm on August 13, 2014: member
    I like this approach in general. I don’t think we’re ready to start exposing a shared library with a stable API for core data structures, but the sufficiently encapsulated approach of “Take a bunch of serialized structures, and compute something with it” should be very safe. Libsecp256k1 uses the same approach by the way - none of the internal data structures are accessible through the public API. As the exposed functions are so simple, there’s little reason to need a C++ api there.
  11. jtimon commented at 7:08 pm on August 13, 2014: contributor

    Ack on the general intent. I’m precisely working on separating things on script and I think I’m close to find something that’s ready for a PR, but so far I’ve been doing incremental little improvements first like #4617 #4680 #4681 #4555 #4694 and #4695 Right now it’s pretty dirty with ugly commits and half-backed ideas, but you can get a general sense of where I’m headed here: https://github.com/jtimon/bitcoin/tree/script2 I’ll clean it up soon, I just wasn’t ready for a PR yet.

    I’m making some things templated so that:

    1. The interpreter file is completely independent from core.h
    2. Maybe some people want to use the scripting language to sign other things beside bitcoin transactions? In particular I would like very few things in the same file as the interpreter (in my case checksig the sig cache, and verifyscript) to make it easier to read.

    So I’m not sure what I should do now…we should definitely coordinate. My approach was improving script first, separating it later, then move to its own module (this PR). Maybe its own module first then imrpoving things like you’re doing it’s better.

  12. TheBlueMatt commented at 2:52 am on August 14, 2014: member
    @jtimon Well it looks like you’re trying to make this library expose more and weight less, which is of course the goal, but I’m just trying to push this step-by-step, so I think they’re very much compatible.
  13. TheBlueMatt commented at 2:53 am on August 14, 2014: member
    @jgarzik re: C-vs-C++: It was originally C++, but since its using literally nothing C++ in the API, I figured its easier to expose it as a C API and that way its easy to include from C++ or C code, whereas using vectors would make it hard from C…
  14. laanwj commented at 2:56 am on August 14, 2014: member
    +1 for this C API that doesn’t expose any of the internal data structures and is purely data in/out. That’s much easier to wrap in Python (ctypes) and a bunch and other languages. Dumb interoperability is what we need here, not a fancy interface. (though that can, of course, be done later! But my point is that it shouldn’t hold up this very important work)
  15. TheBlueMatt commented at 2:57 am on August 14, 2014: member
    The goal is to continue to use this to expose only the minimum required APIs for someone to use much of Bitcoin Core’s validation engine to do the consensus-critical stuff and can still feel special for having re-implemented things.
  16. TheBlueMatt commented at 2:59 am on August 14, 2014: member
    Note that the library is very deliberately not built by default (yet), also script/libbitcoinscript.h should move to an include/ directory, I think, but, again, I’ll leave that for @theuni after this gets merged.
  17. theuni commented at 4:53 am on August 14, 2014: member

    @TheBlueMatt for this PR, would you mind removing the libtool lib so that the code changes can be reviewed independently of the build discussion? There are several build-side things that I’d like to address before actually introducing the lib, but that shouldn’t slow this down.

    I’ll verify the code movement is 1:1 with the originals tomorrow if no one beats me to it.

  18. TheBlueMatt commented at 4:56 am on August 14, 2014: member
    @theuni hmm? its all disabled-by-default, whats the harm? Also, having more people verify 1:1 is always welcome :).
  19. TheBlueMatt commented at 4:56 am on August 14, 2014: member
    (and not in configure –help yet, either, as its not supported)
  20. jtimon commented at 7:47 pm on August 14, 2014: contributor

    So I have manually reproduced all the code movements until I’ve got to a 0-diff branch here: https://github.com/jtimon/bitcoin/tree/libscript2

    In the last commit you can see only pure changes instead of code movements (plus some include fixes). Nothing bad detected in the process, but yeah, the more people that independently verify is 1:1 the better, I’ll leave that branch there in case is useful for somebody else verifying 1:1.

    ACK

  21. theuni commented at 10:06 pm on August 14, 2014: member

    I went through and manually c/p from the old files to the new. The diffs lined up perfectly other than a few expected formatting changes.

    I really don’t like the fact that CheckSig is defined differently based on which object you’re building (standalone vs unified). Among other things, it’s confusing to read and means that they can’t both be used at the same time (for ex, a test to compare their results).

    It also means that the convenience lib can’t be used by the share libs, so several objects have to be built twice to cope.

    Can this not be handled with a delegate function or extra param (cache flag or function pointer) to VerifyScript?

  22. jtimon commented at 10:33 am on August 15, 2014: contributor
    I would prefer to merge this as is and then change the header to add a cache bool param and remove the redundancy. I’m biased since this blocks further refactor work in script and I just prefer to have it merged soon than it being perfect. It’s good enough for first step I wouldn’t like to delay.
  23. sipa commented at 11:18 am on August 15, 2014: member

    I would prefer CheckSig to remain where it was, but take the CSignatureCache object as an optional parameter, which is queried and updated when provided. The usual calls inside script could use the default cache, the call in the C wrapper interface could pass NULL.

    This would also be compatible with potentially later extending the interface to have context objects, which could hold such a cache.

    Making things leaner, like by moving the cache functionality out can be done later.

    Pieter

    On Aug 15, 2014 12:33 PM, “Jorge Timón” notifications@github.com wrote:

    I would prefer to merge this as is and then change the header to add a cache bool param and remove the redundancy. I’m biased since this blocks further refactor work in script and I just prefer to have it merged soon than it being perfect. It’s good enough for first step I wouldn’t like to delay. On Aug 15, 2014 12:07 AM, “Cory Fields” notifications@github.com wrote:

    I went through and manually c/p from the old files to the new. The diffs lined up perfectly other than a few expected formatting changes.

    I really don’t like the fact that CheckSig is defined differently based on which object you’re building (standalone vs unified). Among other things, it’s confusing to read and means that they can’t both be used at the same time (for ex, a test to compare their results).

    It also means that the convenience lib can’t be used by the share libs, so several objects have to be built twice to cope.

    Can this not be handled with a delegate function or extra param (cache flag or function pointer) to VerifyScript?

    Reply to this email directly or view it on GitHub #4692 (comment).

    Reply to this email directly or view it on GitHub #4692 (comment).

  24. theuni commented at 4:09 pm on August 15, 2014: member

    I still contend that this should be split up so that the code movement and lib issues can be addressed separately. I really don’t see any disadvantage in that.

    That said, I’d go along with a merge if the issues above will be worked out afterwards.

  25. jtimon commented at 8:21 am on August 16, 2014: contributor
    yeah I would do the code movement from script to scriptutils and script/script first, that would be very simple to merge and the following standalone PR would also be more readable later, but please, let’s merge something soon.
  26. petertodd commented at 10:19 pm on August 16, 2014: contributor

    We should put the word “consensus” or at least “validation” into the name of this library somewhere. I’d like us to send a clear message that you can rely on it to accurately validate scripts, and that this library does so significantly more reliably than other alternatives.

    Also, ACK on concept of a C-level interface “black box”. I’d rather keep what it does and how it exposes functionality as stripped down as possible. For instance I’d add this to python-bitcoinlib by making VerifyScript call this library behind the scenes, retaining the Python code for non-consensus-critical work.

  27. sipa commented at 11:50 am on August 17, 2014: member

    We perhaps want to keep the externally accessible flags different from the ones used internally.

    Some reasons:

    • Caching currently doesn’t make sense as an exposed interface flag.
    • Some future flags (like BIP62 ones) may be too low level to expose.
    • I’d like to keep bitcoinscript.{h,cpp} purely a wrapper around internal stuff; the fact that script.h includes bitcoinscript.h is a bit ugly imho.
  28. sipa commented at 11:55 am on August 17, 2014: member
    @petertodd I like the fact that this library would be pretty much purely consensus-critical stuff, but it is not all consensus critical code - a fair deal of how UTXO handling/block connection/reorgs work belongs there too…
  29. TheBlueMatt commented at 11:29 pm on August 18, 2014: member

    Sorry about the delay here (my drive crashed and had to rebuild from backups).

    The reason for having a separate CheckSig instead of building the same for both was because I wanted to avoid building the CSignatureCache object in the text at all. I could #ifdef that part out, if y’all prefer, but I want to atleast get script things properly separated and then separate out the other crap later as possible. @sipa: In the future, I’d hope that bitcoind just uses libbitcoinscript as a static link, so it makes sense to include the exposed file. I agree that the cache flag (and others in the future) dont make any sense to expose (hell, cache isnt even built in…), so I moved that one to script.h. @petertodd I agree, do you have any ideas for names that arent ridiculously long?

  30. TheBlueMatt commented at 0:01 am on August 19, 2014: member

    Builds for me….

    On 08/18/14 23:32, BitcoinPullTester wrote:

    Automatic sanity-testing: FAILED BUILD/TEST, see http://jenkins.bluematt.me/pull-tester/p4692_875bdaab3e2f670ac44f587d496b637fb7df091b/ for binaries and test log.

    This could happen for one of several reasons:

    1. It chanages paths in makefile.linux-mingw or otherwise changes build scripts in a way that made them incompatible with the automated testing scripts (please tweak those patches in qa/pull-tester)
    2. It adds/modifies tests which test network rules (thanks for doing that), which conflicts with a patch applied at test time
    3. It does not build on either Linux i386 or Win32 (via MinGW cross compile)
    4. The test suite fails on either Linux i386 or Win32
    5. The block test-cases failed (lookup the first bNN identifier which failed in https://github.com/TheBlueMatt/test-scripts/blob/master/FullBlockTestGenerator.java)

    If you believe this to be in error, please ping BlueMatt on freenode or TheBlueMatt here.

    This test script verifies pulls every time they are updated. It, however, dies sometimes and fails to test properly. If you are waiting on a test, please check timestamps to verify that the test.log is moving at http://jenkins.bluematt.me/pull-tester/current/ Contact BlueMatt on freenode if something looks broken.

    — Reply to this email directly or view it on GitHub #4692 (comment).

  31. sipa commented at 2:32 am on August 19, 2014: member

    For now, the layering is “there is a bunch of internal object code - including all of bitcoin core - with a tiny wrapper to expose a subset of the functionality”. It feels wrong the have that internal code depend on the wrapper.

    I agree it would be nice to have bitcoin core one day not include script verification code at all, and delegate that to a separate library only through such a simple interface.

    However, the way to do that would be by replacing all references in higher level code (main, in particular) to the lowlevel script with references to the wrapper api. Not by making the lowlevel script implementation depend on its own api…

  32. TheBlueMatt commented at 3:11 am on August 19, 2014: member
    sipa: Hmm, what about three files (ewww)? bitcoinscript.h which has just the publicly exposed parts but not internal headers, script.h which has just internal stuff, and some third file that has internal parts that are exposed to the public (with necessary includes). Specifically, I dont want to have things like the flags for verification duplicated.
  33. theuni commented at 3:15 am on August 19, 2014: member

    @TheBlueMatt pull-tester is failing because the external headers aren’t installed. We’ve never installed headers, so that’s not a one-line fix.

    I spent a good chunk of the day working on getting exports working properly. It was simple for Linux, but it’s going to be quite complicated for windows/osx. We have no infrastructure for building shared libs there, so it’s going to require a re-working of dependencies and build-scripts to get non-Linux platforms in place.

    OSX dylib builds, but it needed -lcrypto to link. key.cpp and script.h pull in some symbols. I’m not sure why Linux works, presumably the linker is smart enough to drop the unused paths. Windows needs a bunch of work.

    So, NACK on anything but code movement until the aspects of building/distributing shared libs have been properly evaluated. I’m certainly volunteering to help get those things in order, but I don’t like the idea of treating master as a feature branch.

  34. sipa commented at 3:16 am on August 19, 2014: member
    It’s two different APIs, with different goals and (partially) different features. I don’t care about duplicating a few flags.
  35. sipa commented at 4:14 am on August 19, 2014: member
    The public API flags could even be a simple enum (not bit flags), with values like verify_prebip16, verify_bip16, verify_strict, …
  36. jgarzik commented at 4:17 am on August 19, 2014: contributor

    +1 for keeping the bit flags, if you can

    Though in general agree w/ @sipa When defining APIs, sometimes you wind up with a bit of duplication. Not a big issue in the bigger picture.

  37. in src/core.h: in 875bdaab3e outdated
    213@@ -214,7 +214,9 @@ class CTransaction
    214 private:
    215     /** Memory only. */
    216     const uint256 hash;
    217-    void UpdateHash() const;
    218+    void UpdateHash() const {
    


    laanwj commented at 10:30 am on August 19, 2014:
    What is the rationale for moving this implementation to the header?

    TheBlueMatt commented at 11:09 pm on August 19, 2014:
    Its needed to not have to build all of core.cpp, which pulls in quite a bit more weight.

    sipa commented at 11:12 pm on August 19, 2014:
    That’s broken. If there is logic in core (whether it’s in h or cpp) that you don’t want, it should be split up as a whole.

    TheBlueMatt commented at 11:13 pm on August 19, 2014:
    Huh? it needs two functions, moving them to a separate file makes no sense, its two lines of code.

    sipa commented at 11:21 pm on August 19, 2014:

    I don’t mind moving these functions. But not including the .cpp source in the library while you are relying on its header file is fragile - just a refactor may break the library. IMHO, the .h and .cpp should be considered as a unit, and if it it turns out that we want the library to be leaner, then that means that the source code needs to be changed to support that.

    I’d prefer to not do any such things, and just get the library in a functioning state with minimal changes. Making it leaner can always be done later.


    TheBlueMatt commented at 11:26 pm on August 19, 2014:
    We already have several functions implemented in core.h, including core.cpp doesnt just pull in core.cpp, but quite a few others quite quickly. The library weighs a lot now, but including core.cpp would end up with half of bitcoind… re: modifying the header file and breaking build? That’s what we have pull-tester (or Travis) for, they tell you if you break something dumb.

    TheBlueMatt commented at 11:29 pm on August 19, 2014:
    The use of core.h without core.cpp simply signifies “I’m using the definitions and maybe a few of the serialization/deserialization routines, but I dont care about most of the object”

    TheBlueMatt commented at 11:30 pm on August 19, 2014:
    And, the first step in making the library lighter later, would be exactly this.

    laanwj commented at 7:12 am on August 20, 2014:

    I agree with @sipa here. Include files are the interface for the implementation file. Including one but not linking the other is confusing. Please avoid that. Moving functions from the .cpp to the .h should not result in build errors.

    I’m sure what you want can be solved some other way. I wouldn’t worry about size of the resulting library at all in this initial pull.


    jtimon commented at 10:33 am on August 20, 2014:
    If VerifyScript becomes a templated function as shown in https://github.com/jtimon/bitcoin/commit/886260173ee9367b5c00d445fa8f7b678921da4e (well, something simpler and less dirty, that’s just a thinking branch for now), the standalone can instantiate the functions with its own minimal CTransaction type without any need to link core.o or include core.h Maybe other C++ implementations like libcoin or libbitcoin decide to consume this templated version directly instead of the C API, also with their own Transaction classes.
  38. TheBlueMatt force-pushed on Aug 20, 2014
  39. TheBlueMatt force-pushed on Aug 20, 2014
  40. Build libbitcoinscript.a, start refactoring script files.
    The goal here is to move towards a C-callable libbitcoinscript.so
    that any reimplementation can use for its script exectuion instead
    of reimplementing it from scratch, which has been shown to be
    incredibly error-prone.
    
    This library will implement basic script parsing and execution, but
    it is not the goal to expose things like transaction signing or
    script analysis.
    
    The end-goal is to move the things which will be built as a part of
    this library to the script/ subfolder, with the exposed API in
    script/bitcoinscript.h. Things which are used only by Bitcoin Core
    will be in scriptutils.{h,cpp}.
    c1c67acc13
  41. Get a working libbitcoinscript.so build.
    This moves a bunch of functions around and, with minimal changes
    generates a linkable (though not useful) libbitcoinscript.so.
    1dc9666c39
  42. Make bitcoinscript.h expose useful things d5b91eb84e
  43. Make bitcoinscript.h C, not C++ 176ec02a6c
  44. TheBlueMatt force-pushed on Aug 20, 2014
  45. BitcoinPullTester commented at 3:07 am on August 20, 2014: none

    Automatic sanity-testing: FAILED BUILD/TEST, see http://jenkins.bluematt.me/pull-tester/p4692_176ec02a6c567d84363ea2394f0508a4f67a82fc/ for binaries and test log.

    This could happen for one of several reasons:

    1. It chanages paths in makefile.linux-mingw or otherwise changes build scripts in a way that made them incompatible with the automated testing scripts (please tweak those patches in qa/pull-tester)
    2. It adds/modifies tests which test network rules (thanks for doing that), which conflicts with a patch applied at test time
    3. It does not build on either Linux i386 or Win32 (via MinGW cross compile)
    4. The test suite fails on either Linux i386 or Win32
    5. The block test-cases failed (lookup the first bNN identifier which failed in https://github.com/TheBlueMatt/test-scripts/blob/master/FullBlockTestGenerator.java)

    If you believe this to be in error, please ping BlueMatt on freenode or TheBlueMatt here.

    This test script verifies pulls every time they are updated. It, however, dies sometimes and fails to test properly. If you are waiting on a test, please check timestamps to verify that the test.log is moving at http://jenkins.bluematt.me/pull-tester/current/ Contact BlueMatt on freenode if something looks broken.

  46. TheBlueMatt commented at 3:07 am on August 20, 2014: member
    @theuni OK, I added the link to fix OSX. Right now this pull works on linux/osx (albeit with too many symbols exposed), and builds a static library on windows (no dll yet, needs the symbol exposure stuff worked out). So, either this pull needs to wait on the symbol exposure stuff and then just add the macros, or the symbol exposure pull needs to throw in the one macro, I dont see much of a difference there.
  47. theuni commented at 3:48 am on August 20, 2014: member

    There’s much more to it for building dll’s. For starters, we’ll need 2 sets of dependencies and a way to cope with them (dllimport vs dllexport). We’ll have to rewrite a good chunk of configure to quit assuming static linkage for mingw. After those are done, we can teach our headers how to cope with the 2 types of builds. Then we’ll need some tests to verify that the dll actually does something useful.

    I have all of the above hacked together enough to test. They’re non-trivial changes. In reality, I think time would be better spent trying to drop those dependencies for lib usage, since they’re quite minimal already.

  48. TheBlueMatt commented at 7:00 am on August 20, 2014: member
    Yea, you need to have a define for building (dllexport) vs installing (dllimport), but really these aren’t that hard. Anyway, its up to you, either put the the ability to build a dll on windows in a separate pull or this one can wait until the infrastructure for dllimport/dllexport/gcc symbol export is in place and I’ll get it in here.
  49. TheBlueMatt closed this on Aug 20, 2014

  50. petertodd commented at 0:34 am on August 23, 2014: contributor
    @TheBlueMatt re: naming, long term I think we need a “libbtcconsensus” that holds all the consensus-critical code, so why not just name the library that? Obviously it’ll be incomplete, but we can move functionality over in pieces.
  51. TheBlueMatt commented at 2:12 am on August 23, 2014: member
    @petertodd yea, but I think we’re so far away from that…I dont know if in the end we’ll get a libbtcconsensus which uses a libbtcscript or just force everyone onto a libbtcconsensus (I might be in favor of the second, but I dont know what people will want/demand).
  52. TheBlueMatt commented at 2:13 am on August 23, 2014: member
    In any case, if we do get a single library, it would be to force people to let libbtc* handle all block thinking, and then just deal with its results, and would probably want to avoid exposing script stuff directly.
  53. petertodd commented at 8:13 am on August 23, 2014: contributor

    @TheBlueMatt re: exposing script stuff directly, I’m not opposed to that even with a libbtcconsensus library, because often it can still be useful to be able to evaluate script execution independent of chain validity, for instance to determine if a transaction could be mined in the future.

    Also a libbtcconsensus library will likely have more than just the consensus code strictly needed by a full node; we probably want a reference SPV implementation in there as well.

  54. laanwj commented at 8:48 am on August 23, 2014: member
    @petertodd Disagree about including a SPV implementation - a consensus library should be just the consensus, and the absolute minimum amount of code to implement that, nothing more. It’s not meant as a reference anything, that’s what Bitcoin Core is already. Well it could be included in the repository if it is clearly marked as an example of using the library.
  55. TheBlueMatt commented at 9:03 am on August 23, 2014: member
    I agree with @laanwj, my goal here and in the future is to provide Bitcoin Core’s consensus in an easily-readable form to any clients who want it in a shared library. SPV…meh, its easy enough to implement SPV, the bugs creep in in the wallet, and I’m not sure we want to do that too.
  56. petertodd commented at 4:00 am on August 24, 2014: contributor
    @laanwj With headers-first my understanding is we will have essentially a SPV-consensus layer, which develops into full consensus as blocks are downloaded. Since a consensus library won’t include networking code in it, I see no harm in ensuring that the SPV-consensus part of that library can be used as-is; most likely the natural path of development will make it so anyway.
  57. jtimon commented at 9:20 am on August 28, 2014: contributor
    Can someone help verify that the changes in #4754 are moving code only? It will make this PR easier to read when reopened and the sooner it is merged, the sooner other changes in script will stop conflicting with these simple code movements.
  58. MarcoFalke locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-09-29 13:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me