A plan for abstracting out libconsensus #6714

issue jtimon openend this issue on September 23, 2015
  1. jtimon commented at 5:23 pm on September 23, 2015: contributor

    There have been multiple discussions about completing libconsensus in the mailing list but they are hard to find. Here’s a detailed plan

    Goals:

    1. Complete libconsensus’s C API with the following functions: VerifyScript, VerifyHeader, VerifyTx and VerifyBlock. Possibly also: CheckTransaction/Consensus::CheckTx, Consensus::CheckTxInputs, Consensus::CheckTxInputsScripts (doesn’t exist yet in master), CheckBlockHeader, ContextualCheckBlockHeader, CheckBlock and ContextualCheckBlock.

    2. Maintain libconsensus as a stateless library that doesn’t have its own storage.

    Regardless of the concrete final C API, here’s some things we do know now and if done in advance, would simplify the task of reviewing concrete C API’s proposals:

    • Separate consensus critical code into different files. Right now, if one wants to propose a concrete C API without moving any code, he needs to include main.cpp (and its abundant dependencies) in the libconsensus build. This goes against goal 2.
    • Remove depencies that use globals (like Params()/pCurrentParams, util::mapArgs or util::error/util::LogPrintStr/fPrintToConsole_and_friends).

    The most complicated part of proposing a concrete C API is to somehow interface with the equivalent of CBlockIndex and CCoinsViewCache (which depend on storage) in Bitcoin Core:

    • C-compatible API for CBlockIndex
    • C-compatible API for CCoinsViewCache
    • Discuss and accept a complete C API for libconsensus.

    If you think this plan is not detailed enough, please make concrete suggestions or questions: “this is not well documented enough” is not very useful as feedback.

    Proposed API (C++, not the final C API):

     0bool VerifyScript(const CScript& scriptSig, const CScript& scriptPubKey, unsigned int flags, const BaseSignatureChecker& checker, ScriptError* error = NULL);
     1
     2bool VerifyTx(const CTransaction& tx, CValidationState &state, const Consensus::Params&, int nBlockHeight, int64_t nBlockTime, const CCoinsViewCache& inputs, int nSpendHeight, bool cacheStore, unsigned int flags);
     3bool CheckTx(const CTransaction&, CValidationState&, const Consensus::Params&);
     4/**
     5 * Check whether all inputs of this transaction are valid (no double spends and amounts)
     6 * This does not modify the UTXO set. This does not check scripts and sigs.
     7 * Preconditions: tx.IsCoinBase() is false.
     8 */
     9bool CheckTxInputs(const CTransaction& tx, CValidationState& state, const CCoinsViewCache& inputs, int nSpendHeight);
    10/**
    11 * Preconditions: tx.IsCoinBase() is false.
    12 * Check whether all inputs of this transaction are valid (scripts and sigs)
    13 * This does not modify the UTXO set. This does not check double spends and amounts.
    14 * This is the more expensive consensus check for a transaction, do it last.
    15 */
    16bool CheckTxInputsScripts(const CTransaction& tx, CValidationState& state, const CCoinsViewCache& inputs, bool cacheStore, unsigned int flags);
    17
    18/** Block header validation functions */
    19
    20bool VerifyBlockHeader(const CBlockHeader&, CValidationState&, const Consensus::Params&, int64_t nTime, CBlockIndexBase*, PrevIndexGetter);
    21bool CheckBlockHeader(const CBlockHeader&, CValidationState&, const Consensus::Params&, int64_t nTime, bool fCheckPOW = true);
    22bool ContextualCheckBlockHeader(const CBlockHeader&, CValidationState&, const Consensus::Params&, const CBlockIndexBase*, PrevIndexGetter);
    23
    24/** Block validation functions */
    25
    26bool VerifyBlock(const CBlock&, CValidationState&, const Consensus::Params&, int64_t nTime, const CBlockIndexBase*, PrevIndexGetter);
    27bool CheckBlock(const CBlock& block, CValidationState& state, const Consensus::Params& params, int64_t nTime, bool fCheckPOW = true, bool fCheckMerkleRoot = true);
    28bool ContextualCheckBlock(const CBlock&, CValidationState&, const Consensus::Params&, const CBlockIndexBase*, PrevIndexGetter);
    
  2. jtimon commented at 8:46 pm on September 24, 2015: contributor

    An alternative plan that may be more interesting for users in the short term is to slowly complete libconsensus’ API. For example, the next steps could be:

    • Separate header consensus validation critical code
    • Remove depencies that use globals
    • C-compatible API for CBlockIndex
    • Expose CheckBlockHeader, ContextualCheckBlockHeader and VerifyHeader

    That would require less work to see some results for users rather than merely better software quality from having a properly encapsulated code base, and an increase in the number of potential contributors due to the decreased probability of any given change being considered risky or conflicting with other changes that are more of a more priority.

  3. CodeShark commented at 8:59 pm on September 24, 2015: contributor

    If we’re to maintain libconsensus as stateless, we’ll presumably have to pass it the UTXO somehow. Eventually, I was imagining this being exposed as an abstract DB model so wrappers can be written to support arbitrary storage engines.

    Also, it would be useful to provide a subscription mechanism for signals pertaining to consensus state changes (i.e. NEW_TIP, REORG).

  4. CodeShark commented at 9:01 pm on September 24, 2015: contributor
    I’d also like to add that I think rather than focusing too much on quick implementation we should really think more about high level design. Architecturally, having a solid, reusable libconsensus that is easy to review is priority #1, methinks.
  5. CodeShark commented at 10:11 pm on September 24, 2015: contributor
    I would like to correct myself - a libconsensus is not what’s priority #1. What’s priority #1 is removing dependencies on anything but highly stable code, cleaning up the directory structure, and having a good interface.
  6. jtimon commented at 10:29 pm on September 24, 2015: contributor

    As discussed recently, the value of the code encapsulation and the completion of libconsensus are independent, even if the former is required to implement the latter. For that reason and following @sipa ’s and your advice I will prepare an independent document plan for the encapsulation part “with words and pictures”.

    But the API is also important and even if not so doable in the short term, it’s never too late to start discussing it. So I will update the initial post with the “current proposed API”, even if it’s a C++ instead of a C API (functions will be the analogous of VerifyScript not of bitcoinconsensus_verify_script ) and there’s open questions like how to “break” CBlockIndex and CCoinsViewCache.

    Regarding your comment about libconsensus being stateless and the UTXO being passed somehow, that’s what I mean by “breaking CCoinsViewCache” which is I believe the hardest part of designing this API and will probably generate more controversy (since I’m sure there will be alternative proposals). My initial design draft is “one or more function pointers to retrieve data, basic C types or structs at most”, but there’s still several ways to do that. I believe the CBlockIndex case is much simpler and we should deal with that first (see #5995). VerifyTx requires breaking CCoinsViewCache, VerifyHeader requires breaking CBlockIndex and VerifyBlock requires both.

  7. dexX7 commented at 10:33 am on September 29, 2015: contributor

    Possibly also: CheckTransaction/Consensus::CheckTx, Consensus::CheckTxInputs, Consensus::CheckTxInputsScripts …

    I’m sure these are by-products, and may easily be exposed at some point, but I’d focus on the higher level functions, such as VerifyHeader, VerifyTx, … and expose the low level stuff only if demanded.

    I can see the issue regarding stateless and state-dependent checks, and I tend to believe it should be tackled seperately: ideally there were functions for stateless checks, and functions for stateful checks (or both), given that state probably adds a lot of complexity.

    I noted a few examples of use-cases that I can think of, which are loosely related to verification in general, see #4162 (comment).

    From a practical perspective, currently on my “wishlist” would be to have a better insight about (script) verification failures: exposing ScriptError in one way or another seems useful, for example as optional parameter, or return code. I’d differentiate between errors regarding the use of the verification function (such as ERR_TX_SIZE_MISMATCH, …) and actual verification errors (i.e. ScriptError).

  8. gavinandresen commented at 3:46 pm on September 30, 2015: contributor

    Here’s a use case I’m looking at right now, which I think might be useful for thinking about libconsensus:

    I want to be able to validate blocks in parallel, without holding the cs_main lock. Basically, I want multiple threads to be able to call TestBlockValidity() with different blocks (why? because I intend to implement ‘weak block’ announcements, where miners tell their peers about blocks they’re working on and the peers pre-validate them). (this could also speed up RPC getblocktemplate ‘proposal’, and might even be a big win for normal operation, if block validation doesn’t have to hold the cs_main lock for long periods of time)

    I THINK that means a design where the context from previous blocks in the chain is encapsulated in some kind of data structure and passed into TestBlockValidity. I’d like to write code that does something like:

    0ReceiveWeakBlock(CBlock& block)
    1{
    2    LOCK(cs_main);
    3    if block->prev isn't current tip: ignore
    4    BlockValidationContext = snapshot of information from best chain and snapshot of the UTXO state
    5        (snapshot of UTXO state probably implemented as an actual leveldb read-only snapshot).
    6    schedule a task to validate the block with the BlockValidationContext, but skipping the POW check...
    7    ... I'm imagining the task calls something like:
    8         ValidateBlock(const CBlock& block, const BlockValidationContext& context, VALIDATE_BLOCK_ALL & (~VALIDATE_BLOCK_POW))
    9}
    
  9. jtimon commented at 4:55 pm on October 1, 2015: contributor
    Updated OP with a draft of the currently proposed C++ API (not the final C API). I still owe the document with pictures that focuses more in the code movements/architecture.
  10. CodeShark commented at 1:14 am on October 4, 2015: contributor

    After thinking about these issues a bit more, it seems we’re conflating several goals in this, and while these goals are complementary in the long run, in the short term this kind of work requires an even greater level of architectural sophistication that strongly takes dependencies and all the development dynamics into account. We really need to take a step back from the code and think high level design.

    Goal 1: Splitting consensus code from nonconsensus code

    This is the literal stated goal of this thread. It is indeed a crucial long-term goal, I hardly think a justification is needed at this point. However, after a bit more discussion and thought it seems that perhaps creating a libconsensus is not the best place to start. I think we agree that most urgent is the encapsulation, not so much the packaging. We should probably forget about a C interface (or any other language bindings) for now.

    Goal 2: Better code organization

    I don’t think I need to convince anyone here that main.cpp is a horrendous mess. The fact that much of this code is consensus-critical makes things a heck of a lot worse…but even if this were a completely different type of app without such stringent security requirements, this would still be a serious problem in itself. It is seriously hurting recruitment into the project, it is creating enormous bottlenecks in testing and code review…and it is a vicious cycle. Even though most of the core developers are aware of these issues, we’ve been inadvertently promoting a culture where people try to change as few lines of code as possible and greatly limit their ambitions in the hopes of getting their stuff reviewed and merged quickly lest they suffer months of continuous rebasing…and this tends to lead to an even greater scattering of logic across the codebase in the long run, making refactors even harder…and perhaps even more importantly, making it harder to bring new people up to speed. We need to change this pattern.

    Goal 3: Layered architecture with less stable layers built atop more stable ones

    We’ve come a long ways since the very early days of this codebase in terms of reducing our complex web of dependencies - but we still have a good ways to go. It is self-evident: the less we need to be tinkering with lower layer stuff in ways that are negatively felt by those atop it and the more that people working atop it can tinker without the risk of breaking something below, the freer we all are to tinker generally.

    These three stated goals are inextricably intertwined. Perhaps we should seriously take a step back and contemplate everything…and come up with a plan that addresses all these concerns that can be broken down into small palatable steps before just jumping into doing the steps.

  11. laanwj added the label Refactoring on Oct 7, 2015
  12. laanwj added the label Brainstorming on Apr 28, 2016
  13. MarcoFalke commented at 0:12 am on May 7, 2020: member
    Many changes happened since this issue was opened. E.g. main.cpp is no longer and we have indexes that feed off of the validation interface instead of leeching directly from validation. There is still a lot more to do and we should use this issue as reference. Though, closing this for now. When someone has a plan to restructure consensus and wants to work on it, we can reopen this issue or open a new one, but I think no need to keep this sitting in the open issues until then.
  14. MarcoFalke closed this on May 7, 2020

  15. DrahtBot locked this on Feb 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-09-29 01:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me