De-globalizing ChainstateManager

dongcarl commented at 7:37 pm on September 30, 2020: member

Update

Draft overall changeset published in #20158, closing this in favor of discussing in #20158.

Prelude

From my reading of past conversations and from a few offline chats, it seems that modularizing our consensus engine is a worthwhile first step towards a more complete isolation of said engine from non-consensus code.

Modularizing our consensus engine means that:

We get clearer visibility into what currently lies in consensus codepaths and what depends on our consensus engine
We can coalesce duplicate consensus initialization codepaths, mitigating against bugs that arise out of test/non-test initialization inconsistencies

De-globalizing `g_chainman`

In order to modularize our consensus engine, we need to first de-globalize the global ChainstateManager – namely g_chainman – as it and its dependencies are what makes up the bulk of our consensus engine. A few direct references to g_chainman have already been removed in #19927, however, its indirect uses (mainly via ::Chain(state|)Active()) are numerous in our codebase and often used to ~~cheat~~ avoid obtaining a ChainstateManager reference.

In a series of PRs, I plan to remove the g_chainman global and ensure that codepaths which depend on our consensus engine have a non-global reference to the object they require (be it ChainstateManger, CChainState, BlockManager, or something more specific).

Order of commits

After having reorganized my commits a few times, I’ve realized that the ordering of commits can drastically influence both the review complexity of each diff and the number of commits needed in total to achieve the same thing. My current approach is basically to “continuously trim the leaves of the g_chainman call graph”.

Take, as an example, the following call graph where all of these functions either directly or indirectly reference g_chainman/::Chain(state|)Active():

ChainmanCall

I can start with Qux, and do the following:

Does it need all of ChainstateManger? Or does it just need a CChainState? Or just a BlockManager? Or just the ChainstateManager::ActiveTip()?
If it already has the reference that it needs (as a member of the class it belongs to, as a function parameter, etc.) -> Use it
If we’re dealing with a function which looks like it should be a member of a class that would have that reference -> Move the function to that class
Otherwise, we need to pass the reference to Qux.

When we have to invoke the last two options, we propagate the dependency on g_chainman/::Chain(state|)Active() up to Wibble and Wobble. In the case of (3), it’ll look something like ::ChainstateActive().Qux(); whereas in the case of (4), it’ll look something like Qux(::ChainstateActive()).

This mean that, for some functions (LookupBlockIndex being the main culprit), a resolution of their dependency on the g_chainman/::Chain(state|)Active() globals results in an increase in apparent references to g_chainman/::Chain(state|)Active(). However, it is important to note that these are not new dependencies by any means, those dependency were always there – they was just hidden by Qux’s use of the globals. Furthermore, all of these “new” apparent references to g_chainman/::Chain(state|)Active() will be dealt with as we prune the call graph upward.

Current status

Right now, I have a working branch https://github.com/bitcoin/bitcoin/compare/master...dongcarl:2020-09-libbitcoinruntime-v5, which resolves of all the g_chainman/::Chain(state|)Active() global calls in validation.{cpp,h}. I noticed that for resolving these calls in validation, there are 3 somewhat distinct bundles that have a natural ordering:

A bundle of functions that are related to ::LookupBlockIndex in the call graph (see: #20050)
A bundle of functions that are mempool-related
- This has a dependency on bundle (1) as MemPoolAccept::PreChecks calls ::GetSpendHeight, which is in bundle (1) as it calls LookupBlockIndex
A bundle of functions which are not related to either
- This has a dependency on bundle (2) as CChainState::{ActivateBestChainStep,InvalidateBlock} both call ::UpdateMempoolForReorg which is in bundle (2)

I believe this bundling naturally splits up the validation cleanup into 3 pull requests, each touching around 12 functions.

I’m still organizing the rest the non-validation commits and will be pushing that up when it’s ready.

Note to reviewers: At this stage I would very much like to get a sanity check on my approach to make sure that there isn’t a better way to organize my commits that I’ve missed.

dongcarl added the label Refactoring on Sep 30, 2020

fanquake added the label Brainstorming on Oct 1, 2020

fanquake added the label Needs Conceptual Review on Oct 1, 2020

MarcoFalke commented at 7:12 am on October 1, 2020: member

Furthermore, all of these “new” apparent references to g_chainman/::Chain(state|)Active() will be dealt with as we prune the call graph upward

I can understand that mechanically using the global might reduce review burden and is “trivially correct”, but if there is a class that simply depends on the chainstate or block tree (like BaseIndex or PeerManager or …) it might be best to pass it a reference to the chainstatemanager as early as possible. Otherwise your diffs will look like implicit global -> g_chainman -> m_chainman instead of just a direct replacement implicit global -> m_chainman.

dongcarl commented at 6:37 pm on October 1, 2020: member

@MarcoFalke I found it much easier for reasoning and for rebases to have each commit focus on one function and do mechanical changes for its callers because:

Say that the function in focus references BlockManager, we may think that the function’s caller only needs BlockManager right now, but it may turn out that the caller calls another function which references ::ChainstateActive() that we haven’t pruned yet and we won’t know what the caller actually needs until we’ve pruned everything under the caller.
Doing mechanical changes allows us to express more of the changeset as scripted-diffs. Which, for a large, mostly move-only project like this one, will make review and rebases a lot easier.
- A scripted-diff example of implicit global -> g_chainman: 0e101a0 (#20050)
- A scripted-diff example of g_chainman -> m_chainman: This is unpublished but I do something similar to the below for rest.cpp, rpc/, tests/, interfaces/node.cpp, interfaces/chain.cpp, etc.
```
0-BEGIN VERIFY SCRIPT-
1sed -i -e 's/g_chainman/EnsureChainman(context)/g' -- src/rest.cpp
2sed -i -e 's/::Chain\(state\|\)Active()/EnsureChainman(context).ActiveChain\1()/g' -- src/rest.cpp
3-END VERIFY SCRIPT-
```
Note that each “module” (rest/rpc/tests/etc) in our codebase may reference ChainstateManager/CChainState/etc slightly differently. So even though doing the caller changes mechanically may look ugly in intermediary commits, they:
1. Are easier to reason about w/re correctness
2. Delay having to think about the “module-specific” way to reference objects until we’ve pruned up to said module, whereby they will 100% be cleaned up and in a lot of cases we can apply a “module-specific” scripted-diff like above.

Having said all that, the above is just my personal take on how to bring some sanity to this unruly call graph (~40 commits just to clean up validation). I do understand that at the end of the day, what matters is what is clearer for reviewers, so if the benefits I described above do not sound worthwhile, I’d be happy to go with whatever reviewers think is best! :relaxed:

MarcoFalke commented at 7:47 am on October 2, 2020: member

we may think that the function’s caller only needs BlockManager

If this is a concern, the functions that implicitly rely on chainman (not just blockman) could be de-globalized first, and the blockmanager ones later. Though, if there already is a chainman reference in scope (like #20050 (review)) it should be safe to assume that the chainman is actually required, no?

allows us to express more of the changeset as scripted-diffs

Replacing (an implicit) g_chainman in rpc code with EnsureChainman(context) is correct, but seems overly verbose code and maybe also confusing. I’d find an RPC easier to read if the requirements were listed in the first lines of the method. Similar how we get the wallet reference early on in wallet rpc, we should get a chainman reference early in chain rpcs. The remainder of the changes can still be done with a scripted diff.

dongcarl commented at 10:13 pm on October 2, 2020: member

@MarcoFalke

if there already is a chainman reference in scope (like #20050 (review)) it should be safe to assume that the chainman is actually required, no?

Agreed!

Replacing (an implicit) g_chainman in rpc code with EnsureChainman(context) is correct, but seems overly verbose code and maybe also confusing. I’d find an RPC easier to read if the requirements were listed in the first lines of the method. Similar how we get the wallet reference early on in wallet rpc, we should get a chainman reference early in chain rpcs. The remainder of the changes can still be done with a scripted diff.

Agreed w/re the RPCs, will fix that up on my local branch!

I apologize in advance as I feel like I’ve not yet fully understood why you want the diffs to be organized the way you described. To me, ease-of-review and correctness is the most important thing, even if intermediary commits look a bit ugly, and especially when that ugliness will go away in followups. At the same time, it is super important to me that I’m not doing anything that makes review harder, so I very much value your honest feedback!

In order for me to better understand your thinking, perhaps we can go through a concrete example: the commits where I “prune a leaf” by moving LookupBlockIndex to BlockManager in #20050. Namely: ace764fdb0e749d47d8e6bb500f3bb96f45a9d28 and 0e101a01832f8f799326faba27ffc97d73f8b434. What I think you’re proposing is that instead of leaving the “use the right local reference” work until we prune up to the caller, we should either add on to the first commit ace764fdb0e749d47d8e6bb500f3bb96f45a9d28 or have an additional commit which uses local references to ChainstateManager as soon as possible like so:

For src/rpc calls to LookupBlockIndex
1. Insert a ChainstateManager chainman = EnsureChainman(request.context) at the beginning of each lambda I touch if not already there
2. Use the inserted chainman for LookupBlockIndex like: chainman.m_blockman.LookupBlockIndex(...)
For src/interfaces calls to LookupBlockIndex
1. Use m_node.chainman->m_blockman for LookupBlockIndex like: m_node.chainman->m_blockman.LookupBlockIndex(...)
For src/net_processing.cpp PeerManager methods that call LookupBlockIndex
1. Use m_chainman for for LookupBlockIndex like: m_chainman.m_blockman.LookupBlockIndex(...)
For src/test calls to LookupBlockIndex
1. Use m_node.chainman->m_blockman for LookupBlockIndex like: m_node.chainman->m_blockman.LookupBlockIndex(...)

I also want to know what you mean by:

if there is a class that simply depends on the chainstate or block tree (like BaseIndex or PeerManager or …) it might be best to pass it a reference to the chainstatemanager as early as possible

Right now, I add an m_chainstate member to BaseIndex once I’ve pruned everything underneath it. It seems that you would like for that member to be added as early as possible, and I’m guessing it’s so we can use it while pruning LookupBlockIndex like so:

For src/index calls to LookupBlockIndex
1. Use m_chainstate.m_blockman for LookupBlockIndex like: m_chainstate.m_blockman.LookupBlockIndex(...)

Please let me know if I’m understanding you correctly, and thank you for your patience with me :sweat_smile:

MarcoFalke commented at 6:44 am on October 3, 2020: member

Yes, conceptually that is what I wanted to say. Though, as the end result is the same, I am happy to review either path that goes there.

dongcarl commented at 6:29 pm on October 15, 2020: member

Draft overall changeset published in #20158, closing this in favor of discussing in #20158.

dongcarl closed this on Oct 15, 2020

DrahtBot locked this on Feb 15, 2022

De-globalizing ChainstateManager #20049

Update

Prelude

De-globalizing g_chainman

Order of commits

Current status

De-globalizing `g_chainman`