refactor, kernel: Decouple ArgsManager from blockstorage

TheCharlatan commented at 11:10 am on February 19, 2023: contributor

The libbitcoin_kernel library should not rely on the ArgsManager, but rather use option structs that can be passed to the various classes it uses. This PR removes reliance on the ArgsManager from the blockstorage.* files. Like similar prior work, it uses the options struct in the BlockManager that can be populated with ArgsManager values.

Related PR removing blockstorage globals: https://github.com/bitcoin/bitcoin/pull/25781

DrahtBot commented at 11:10 am on February 19, 2023: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	ryanofsky, mzumsande
Concept ACK	brunoerg
Approach ACK	hebasto
Stale ACK	dergoegge, MarcoFalke

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#27596 (assumeutxo (2) by jamesob)
#27571 (ci: Run iwyu on all src files by MarcoFalke)
#26966 (index: blockfilter initial sync speedup, parallelize process by furszy)
#26326 (net: don’t lock cs_main while reading blocks in net processing by andrewtoth)
#26288 (Enable -Wstring-concatenation and -Wstring-conversion on clang builds by Empact)
#25193 (indexes: Read the locator’s top block during init, allow interaction with reindex-chainstate by mzumsande)
#24230 (indexes: Stop using node internal types and locking cs_main, improve sync logic by ryanofsky)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

TheCharlatan force-pushed on Feb 19, 2023

fanquake commented at 2:10 pm on February 19, 2023: member

0validation.cpp:80:13: error: using decl 'ReadBlockFromDisk' is unused [misc-unused-using-decls,-warnings-as-errors]
1using node::ReadBlockFromDisk;
2            ^
3/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/validation.cpp:80:13: note: remove the using
4using node::ReadBlockFromDisk;

TheCharlatan force-pushed on Feb 19, 2023

TheCharlatan commented at 7:56 pm on February 19, 2023: contributor

0validation.cpp:80:13: error: using decl 'ReadBlockFromDisk' is unused [misc-unused-using-decls,-warnings-as-errors]
1using node::ReadBlockFromDisk;
2            ^
3/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/validation.cpp:80:13: note: remove the using
4using node::ReadBlockFromDisk;

Thank you, I’m improving my clang tidy workflow.

DrahtBot added the label Needs rebase on Feb 22, 2023

TheCharlatan force-pushed on Feb 22, 2023

DrahtBot removed the label Needs rebase on Feb 22, 2023

brunoerg commented at 1:11 pm on February 25, 2023: contributor

Concept ACK

in src/net_processing.cpp:3869 in 810f815cb6 outdated

3865@@ -3869,7 +3866,7 @@ void PeerManagerImpl::ProcessMessage(CNode& pfrom, const std::string& msg_type,
3866 
3867             if (pindex->nHeight >= m_chainman.ActiveChain().Height() - MAX_BLOCKTXN_DEPTH) {
3868                 CBlock block;
3869-                bool ret = ReadBlockFromDisk(block, pindex, m_chainparams.GetConsensus());
3870+                bool ret = m_chainman.m_blockman.ReadBlockFromDisk(block, pindex, m_chainparams.GetConsensus());

brunoerg commented at 2:31 pm on February 27, 2023:

nit:

0const bool ret{m_chainman.m_blockman.ReadBlockFromDisk(block, pindex, m_chainparams.GetConsensus())};

TheCharlatan force-pushed on Feb 27, 2023

TheCharlatan commented at 8:50 pm on February 27, 2023: contributor

Updated e8b095f8ad65544f18ad69201d95e11ddf689f1f -> 899ba71e9d7caa370d8a0c4bed2e514f51024dc6 (removeBlockstorageArgs_0 -> removeBlockstorageArgs_1, compare) with suggested changes. Rebased 899ba71e9d7caa370d8a0c4bed2e514f51024dc6 -> b640db3c5fbe49af58dbc9087a96cbb493fa9574 to get recent PR’s with kernel build improvements.

TheCharlatan force-pushed on Feb 28, 2023

TheCharlatan commented at 6:22 am on February 28, 2023: contributor

Updated b640db3c5fbe49af58dbc9087a96cbb493fa9574 -> 3ded321b878e7ad2bdbd9d542ffb2e36428396ec (removeBlockstorageArgs_2 -> removeBlockstorageArgs_3, compare) with suggested typo fix.

DrahtBot added the label Needs rebase on Feb 28, 2023

in src/node/blockmanager_args.h:14 in dce7d5480d outdated

 9+#include <node/blockstorage.h>
10+
11+#include <optional>
12+
13+class ArgsManager;
14+struct bilingual_str;

mzumsande commented at 3:58 pm on February 28, 2023:

bilingual_str seems unused

mzumsande commented at 5:05 pm on February 28, 2023: contributor

Concept ACK

Seeing all that effort to pass around the hacky fastprune arg (which doesn’t make any sense outside of functional tests) seems a bit sad, so I wonder if it’s really inevitable.

Did you consider whether it’s possible to also move BlockFileSeq into blockmanager? I guess it doesn’t work just because of one remaining call site for node::ReadBlockFromDisk from zmqpublishnotifier?

dergoegge commented at 5:13 pm on February 28, 2023: member

Concept ACK

TheCharlatan force-pushed on Feb 28, 2023

TheCharlatan commented at 9:50 pm on February 28, 2023: contributor

Thanks for the reviews!

Seeing all that effort to pass around the hacky fastprune arg (which doesn’t make any sense outside of functional tests) seems a bit sad, so I wonder if it’s really inevitable. Did you consider whether it’s possible to also move BlockFileSeq into blockmanager? I guess it doesn’t work just because of one remaining call site for node::ReadBlockFromDisk from zmqpublishnotifier?

Yes, totally echo this sentiment. Indeed BlockFileSeq and its dependants are not class members because of the call site in zmqpublishnotifier. I refrained from further refactoring because of this. If somebody has an idea on how to handle the zmq case I’ll gladly rework this pull request.

Updated 3ded321b878e7ad2bdbd9d542ffb2e36428396ec -> 596404317bef425179d68009a76ac185db4da9a4 (removeBlockstorageArgs_3 -> removeBlockstorageArgs_4, compare) removing unused declaration. Rebased 596404317bef425179d68009a76ac185db4da9a4 -> 88e88a5 (removeBlockstorageArgs_4 -> removeBlockstorageArgs_5, compare) to resolve conflicts.

DrahtBot removed the label Needs rebase on Feb 28, 2023

TheCharlatan force-pushed on Mar 1, 2023

TheCharlatan commented at 6:51 am on March 1, 2023: contributor

Updated 88e88a5 -> f87c39399823e22c553b797cc66fa4063462a32b (removeBlockstorageArgs_5 -> removeBlockstorageArgs_6, compare) with linter fix.

TheCharlatan force-pushed on Mar 9, 2023

TheCharlatan commented at 5:15 pm on March 9, 2023: contributor

Updated f87c39399823e22c553b797cc66fa4063462a32b -> 6d9826f182d9122d4464d35d6682dc6fb4b1116e (removeBlockstorageArgs_6 -> removeBlockstorageArgs_7, compare) addressing #27125#pullrequestreview-1317942691 by instantiating a BlockManager in the zmqpublishnotifier. As commented previously #27125#pullrequestreview-1317942691 by @mzumsande other functions like BlockFileSeqare now moved into the BlockManager. This resolves the double declaration of functions that was previously required in the previous patchset.

fanquake referenced this in commit b175bdb9b2 on Mar 14, 2023

DrahtBot added the label Needs rebase on Mar 15, 2023

TheCharlatan force-pushed on Mar 16, 2023

TheCharlatan commented at 9:28 am on March 16, 2023: contributor

Rebased 6d9826f182d9122d4464d35d6682dc6fb4b1116e -> 593a9f36609a74f47175e681a3921f3975272766 (removeBlockstorageArgs_7 -> removeBlockstorageArgs_8, compare) on newest master to resolve conflicts.

in src/kernel/blockmanager_opts.h:1 in 39fa0de08a outdated

maflcko commented at 9:37 am on March 16, 2023:

39fa0de08adc2cd19081bb5679d26a6b8b6cb331: Commit message is wrong?

maflcko commented at 9:40 am on March 16, 2023:

Commit message is wrong too in e78003b42d6bbd166fd87aea357788c712506a6f

in src/node/blockmanager_args.cpp:14 in 39fa0de08a outdated

 6@@ -7,6 +7,8 @@
 7 #include <util/system.h>
 8 #include <validation.h>
 9 
10+#include <optional>

maflcko commented at 9:38 am on March 16, 2023:

unrelated: If you want, you can add this file to ci iwyu in ./ci/test/06_…

DrahtBot removed the label Needs rebase on Mar 16, 2023

TheCharlatan force-pushed on Mar 17, 2023

TheCharlatan commented at 3:53 pm on March 17, 2023: contributor

Updated 593a9f36609a74f47175e681a3921f3975272766 -> 54d8644dce180197fa657e9b68d6de63cf4c8072 (removeBlockstorageArgs_8 -> removeBlockstorageArgs_9, compare) to improve commit messages and add a commit adding the blockmanager_args.cpp file to iwyu as suggested by @MarcoFalke .

in src/zmq/zmqpublishnotifier.cpp:258 in 9ba7ab344f outdated

252@@ -248,9 +253,12 @@ bool CZMQPublishRawBlockNotifier::NotifyBlock(const CBlockIndex *pindex)
253     LogPrint(BCLog::ZMQ, "Publish rawblock %s to %s\n", pindex->GetBlockHash().GetHex(), this->address);
254 
255     const Consensus::Params& consensusParams = Params().GetConsensus();
256+    BlockManagerOpts blockman_opts{};
257+    Assert(!ApplyArgsManOptions(gArgs, blockman_opts)); // already checked on init
258+    BlockManager blockman{blockman_opts};

maflcko commented at 4:01 pm on March 17, 2023:

in 9ba7ab344fb39c1145283b02e6b4ff4753c6511b: Can you explain why it is safe to construct two blockman in the same process? This seems fragile, considering that the objects may hold state?

TheCharlatan commented at 5:03 pm on March 17, 2023:

I may have missed some nuance here. To me it seemed acceptable, because functions are protected by cs_main. For const functions likeReadBlockFromDisk this seems safe to me (though I’m still new to the locking model and might be missing something here). However doing block write and pruning operations with two blockmanagers is fragile, since their internal state might no longer reflect the state on disk.

I’m not sure how to proceed here. I could mark the BlockManager here as const, that would at least ensure that it’s state cannot change.

maflcko commented at 10:13 am on March 20, 2023:

cs_main

Hmm, that sounds like you are making it harder to remove cs_main.

I am not sure it the alternative are better, but my brain enjoys the thought of having only to think about a single blockman object at a time. Ideas:

Use findBlock from the node interface. The overhead should only be another map lookup by the block hash? You can ask Ryan to confirm.
Pass down a blockman ref into the zmq stuff (massive change)
Add a new global alias for zmq to point to the single existing blockman instance, ugly but less code and less ugly than constructing a new object?
Something else?

fanquake commented at 10:17 am on March 20, 2023:

I may have missed some nuance here. To me it seemed acceptable, because functions are protected by cs_main. For const functions likeReadBlockFromDisk this seems safe to me

Note that according to the PR description in #27006 there is currently at least one cs_main related bug in ReadBlockFromDisk.

TheCharlatan commented at 11:16 am on March 20, 2023:

Hmm, that sounds like you are making it harder to remove cs_main.

I don’t think this change complicates a refactor of the global locks. zmqpublishnotifier relies on cs_main being in global scope before and after this change. If the lock is moved into some narrower scope, this would have to be handled for the zmqpublishnotifier too - with our without this patch. If I read #27006 correctly, no change is required to make it work with the patch here.

my brain enjoys the thought of having only to think about a single blockman object at a time.

I feel this though, I’ll try out the first and second option you laid out.

TheCharlatan commented at 2:45 pm on March 20, 2023:

I feel this though, I’ll try out the first and second option you laid out.

Both are branching out a bit too much in my taste for the refactor we seek to achieve here. If the const BlockManager is a non-starter, I can always revert back to https://github.com/bitcoin/bitcoin/commit/f87c39399823e22c553b797cc66fa4063462a32b

maflcko commented at 3:52 pm on March 20, 2023:

I can always revert back to https://github.com/bitcoin/bitcoin/commit/f87c39399823e22c553b797cc66fa4063462a32b

Not sure. I will need to move those functions to blockman anyway for some other work, so dropping the changes will only kick the can down the road.

Both are branching out a bit too much in my taste for the refactor we seek to achieve here.

I can take a look when time allows. In the meantime other reviewers can chime in.

dergoegge commented at 1:43 pm on March 23, 2023:

fwiw, I would prefer either one of these two options over what is currently in this PR:

Use findBlock from the node interface. The overhead should only be another map lookup by the block hash? You can ask Ryan to confirm.

Pass down a blockman ref into the zmq stuff (massive change)

I implemented the second option here: https://github.com/dergoegge/bitcoin/commit/a17844fe3e3c97c0eb5906536b96d1ce634b790c feel free to pick that, it’s not to crazy of a change imo (haven’t tested that but it does compile).

TheCharlatan commented at 4:39 pm on March 23, 2023:

This patch is much simpler than what I came up with, thanks a lot!

in src/node/blockstorage.h:241 in fd015c73b0 outdated

239@@ -236,11 +240,11 @@ class BlockManager
240     bool ReadBlockFromDisk(CBlock& block, const FlatFilePos& pos, const Consensus::Params& consensusParams) const;
241     bool ReadBlockFromDisk(CBlock& block, const CBlockIndex* pindex, const Consensus::Params& consensusParams) const;

maflcko commented at 4:07 pm on March 17, 2023:

When touching this, it might be better to not pass the params, but access them from a member?

Also, I don’t like that several unrelated “refactoring” is bundled with potentially behavior changing stuff. For example, it would be good to do the function moving in a separate commit from the change that constructs two blockman in the same process.

TheCharlatan commented at 5:03 pm on March 17, 2023:

Thank you for the suggestion, will split this up.

ryanofsky commented at 1:38 pm on May 8, 2023:

In commit “refactor: Move functions to BlockManager methods” (1def580e80a1892e473324016db9a0f2ffec0c27)

re: #27125 (review)

Thank you for the suggestion, will split this up.

Seems like this could be marked resolved

TheCharlatan force-pushed on Mar 19, 2023

TheCharlatan commented at 1:03 pm on March 19, 2023: contributor

Updated 54d8644dce180197fa657e9b68d6de63cf4c8072 -> 9f5600742c65a2421c9fe5a2a2670e86a25ef696 (removeBlockstorageArgs_9 -> removeBlockstorageArgs_10, compare) to split a commit moving methods and instantiating a block manager in zmq, and replacing the fastprune arg. I also added a short inline comment explaining the zmq’s blockmanager const declaration and threading requirements.

TheCharlatan force-pushed on Mar 23, 2023

TheCharlatan commented at 4:38 pm on March 23, 2023: contributor

Thank you for the discussion and suggestions.

Updated 9f5600742c65a2421c9fe5a2a2670e86a25ef696 -> 97bf70119e5b8567bcdc553d59b18a09023fd05a (removeBlockstorageArgs_10 -> removeBlockstorageArgs_11, compare) to add @dergoegge’s patch as suggested in his comment. With this latest push, there is no additional BlockManager being instantiated anymore.

TheCharlatan commented at 8:51 pm on March 23, 2023: contributor

Not sure why the functional tests are failing. The test failing on the Win64 job is passing on the 32-bit job and vice-versa. Both are passing locally.

ryanofsky commented at 9:08 pm on March 23, 2023: contributor

Win64 wallet_create_tx.py failure is probably fixed by #27318

TheCharlatan force-pushed on Mar 23, 2023

fanquake referenced this in commit 369d4c03b7 on Apr 3, 2023

in src/init.cpp:1474 in 9a5c111d8c outdated

1469@@ -1478,6 +1470,14 @@ bool AppInitMain(NodeContext& node, interfaces::BlockAndHeaderTipInfo* tip_info)
1470         node.chainman = std::make_unique<ChainstateManager>(chainman_opts, blockman_opts);
1471         ChainstateManager& chainman = *node.chainman;
1472 
1473+#if ENABLE_ZMQ
1474+        g_zmq_notification_interface = CZMQNotificationInterface::Create(chainman.m_blockman);

mzumsande commented at 7:36 pm on April 6, 2023:

This moves the creation of the ZMQ interface into a for loop, which I think can be executed twice under some special conditions (-reindex is activated by a GUI user). However, g_zmq_notification_interface is just a simple pointer, not a std::unique_ptr, so I think that there could be a memory leak with the first object never being deleted if we call CZMQNotificationInterface::Create a second time.

in src/zmq/zmqpublishnotifier.cpp:20 in 97bf70119e outdated

19@@ -19,6 +20,8 @@
20 #include <streams.h>
21 #include <sync.h>

maflcko commented at 3:58 pm on April 13, 2023:

unrelated: Could make sense to remove unused includes, according to iwyu

TheCharlatan force-pushed on Apr 18, 2023

TheCharlatan commented at 2:24 pm on April 18, 2023: contributor

Thank you for the review @mzumsande

Updated 97bf70119e5b8567bcdc553d59b18a09023fd05a -> 09f950e1f3608e3019e97236969b152e9252a220 (removeBlockstorageArgs_11 -> removeBlockstorageArgs_12, compare)

Addressed @mzumsande’s comment by making the g_zmq_notification_interface a std::unique_ptr and checking that it is not already the owner of a CZMQNotificationInterface object before assigning to it.
Pass a reference of the node context to the CZMQNotificationInterface to ensure that we access the currently initialized BlockManager within the zmq code. In the previous revision, the reference to the BlockManager might have been pointing to a destructed object.

TheCharlatan force-pushed on Apr 18, 2023

TheCharlatan commented at 3:12 pm on April 18, 2023: contributor

Thank you for the notice,

Rebased 09f950e1f3608e3019e97236969b152e9252a220 -> 908d95e46f15f6b4def9534bc09fcdb3c0a5ef54 (removeBlockstorageArgs_12 -> removeBlockstorageArgs_13, compare)

DrahtBot added the label Needs rebase on Apr 21, 2023

TheCharlatan force-pushed on Apr 21, 2023

TheCharlatan commented at 1:51 pm on April 21, 2023: contributor

Rebased 908d95e46f15f6b4def9534bc09fcdb3c0a5ef54 -> 1a7d73d30e312e862be4560601302e79725b1bc7 (removeBlockstorageArgs_13 -> removeBlockstorageArgs_14, compare)

DrahtBot removed the label Needs rebase on Apr 21, 2023

in src/node/blockstorage.cpp:583 in 1a7d73d30e outdated

577@@ -583,28 +578,28 @@ void UnlinkPrunedFiles(const std::set<int>& setFilesToPrune)
578     }
579 }
580 
581-static FlatFileSeq BlockFileSeq()
582+FlatFileSeq BlockManager::BlockFileSeq() const
583 {
584-    return FlatFileSeq(gArgs.GetBlocksDirPath(), "blk", gArgs.GetBoolArg("-fastprune", false) ? 0x4000 /* 16kb */ : BLOCKFILE_CHUNK_SIZE);
585+    return FlatFileSeq(gArgs.GetBlocksDirPath(), "blk", m_opts.fast_prune ? 0x4000 /* 16kb */ : BLOCKFILE_CHUNK_SIZE);

dergoegge commented at 2:59 pm on April 21, 2023:

0    return FlatFileSeq(m_opts.blocks_dir, "blk", m_opts.fast_prune ? 0x4000 /* 16kb */ : BLOCKFILE_CHUNK_SIZE);

The “util/system.h” include can almost be removed after this. ScheduleBatchPriority() is the only function that is still used.

TheCharlatan commented at 3:41 pm on April 21, 2023:

Good catch, thank you! Will also remove the <args.h> include.

in src/zmq/zmqpublishnotifier.cpp:252 in 1a7d73d30e outdated

248@@ -250,7 +249,7 @@ bool CZMQPublishRawBlockNotifier::NotifyBlock(const CBlockIndex *pindex)
249     const Consensus::Params& consensusParams = Params().GetConsensus();
250     CDataStream ss(SER_NETWORK, PROTOCOL_VERSION | RPCSerializationFlags());
251     CBlock block;
252-    if (!ReadBlockFromDisk(block, pindex, consensusParams)) {
253+    if (!m_node.GetBlockManager().ReadBlockFromDisk(block, pindex, consensusParams)) {

dergoegge commented at 3:05 pm on April 21, 2023:

nit: Seems unnecessary to pass the entire NodeContext when you only need the blockman.

Feel free to ignore, but would be interested in why you chose to do it this way.

TheCharlatan commented at 3:40 pm on April 21, 2023:

The blockman and chainstate instances may be deleted during the lifetime of the zmq instance similarly as described in mzsumsande’s comment when iterating the for loop. If we pass a reference to the NodeContext instead, we have better guarantees that its blockman reference points to an actual object.

dergoegge commented at 3:58 pm on April 21, 2023:

Is that actually possible? Seems like all the validation interfaces including the zmq one are unregistered before chainman is destroyed?

So it can’t be holding a reference to blockman that is actually destroyed, see:

https://github.com/bitcoin/bitcoin/blob/f3f5c9712670fc2c9fb3b57580146aa248a5d36f/src/init.cpp#L327-L343

Or are you just saying that having a reference to the node context has better lifetime guarantees because basically nothing outlives the node context?

TheCharlatan commented at 4:16 pm on April 21, 2023:

Is that actually possible?

Likely that I am missing something in the logic. The way I see it each iteration of the for loop creates and destroys the chainman by calling std::make_unique again, while the Shutdown code is not called during each iteration.

dergoegge commented at 3:09 pm on April 21, 2023: member

lgtm once the last gArgs access is removed.

TheCharlatan force-pushed on Apr 21, 2023

TheCharlatan commented at 3:49 pm on April 21, 2023: contributor

Thank you for the review @dergoegge,

Updated 1a7d73d30e312e862be4560601302e79725b1bc7 -> ed3159e0eb105d9f46659bd8f8b2db27d94841de (removeBlockstorageArgs_14 -> removeBlockstorageArgs_15, compare)

Addressed @dergoegge’s comment, removing a left over gArgs reference in blockstorage.cpp.

dergoegge commented at 4:07 pm on April 21, 2023: member

Code review ACK ed3159e0eb105d9f46659bd8f8b2db27d94841de

in src/zmq/zmqnotificationinterface.cpp:46 in ed3159e0eb outdated

42@@ -39,12 +43,14 @@ std::list<const CZMQAbstractNotifier*> CZMQNotificationInterface::GetActiveNotif
43     return result;
44 }
45 
46-CZMQNotificationInterface* CZMQNotificationInterface::Create()
47+std::unique_ptr<CZMQNotificationInterface> CZMQNotificationInterface::Create(const node::NodeContext& node)

ryanofsky commented at 3:03 pm on April 27, 2023:

Just talked about this offline, but if possible to make ZMQ just take a pointer to BlockManager instead of NodeContext, that would be good. This way ZMQ would only have access to the state it actually needs

DrahtBot added the label Needs rebase on May 2, 2023

TheCharlatan force-pushed on May 2, 2023

TheCharlatan commented at 1:02 pm on May 2, 2023: contributor

Thank you for the discussions,

Updated ed3159e0eb105d9f46659bd8f8b2db27d94841de -> 9c197181803e08e1cce4460190cc06fc2e093a80 (removeBlockstorageArgs_15 -> removeBlockstorageArgs_16, compare)

Reverted change made for passing a NodeContext to ZmqNotificationInterface back to passing a reference of a BlockManager
Introduced a change declaring the BlockManager outside of the ChainstateManager and making the NodeContext its new owner. References to this BlockManager are now passed to declare ChainstateManager and ZmqNotificationInterface. This could also allow the kernel more control over how blocks are stored.

Rebased 9c197181803e08e1cce4460190cc06fc2e093a80 -> f59f0c91acb7a35b767bb0dc75ed8b10add81d9f (removeBlockstorageArgs_16 -> removeBlockstorageArgs_17, compare)

DrahtBot removed the label Needs rebase on May 2, 2023

dergoegge commented at 3:52 pm on May 2, 2023: member

re-ACK f59f0c91acb7a35b767bb0dc75ed8b10add81d9f

fanquake requested review from ryanofsky on May 2, 2023

in src/node/blockmanager_args.cpp:14 in a48fd5f3c1 outdated

 9+#include <tinyformat.h>
10+#include <util/translation.h>
11 #include <validation.h>
12 
13+#include <cstdint>
14+#include <memory>

maflcko commented at 10:38 am on May 3, 2023:

unrelated nit in a48fd5f3c17a0a6030d0ba47fa42be41f7e4aad9: I wonder what memory is being used for, but iwyu seems happy, so this is fine.

TheCharlatan commented at 8:13 am on May 4, 2023:

Seems like it’s a bug: https://github.com/include-what-you-use/include-what-you-use/issues/648

Since it sounds like it’s going to be fixed in an upcoming version, I’ll remove the memory include.

maflcko commented at 9:38 am on May 4, 2023:

Looks like this fix is part of the branch we are using, so this seems a separate bug?

TheCharlatan commented at 10:15 am on May 4, 2023:

Huh, yeah, seems like it. But I guess the issue is similar to what is happening here?

maflcko commented at 10:26 am on May 4, 2023:

I guess it would help upstream if someone created an issue with a minimal reproducer and minimal steps to reproduce, starting from scratch.

in src/test/blockmanager_tests.cpp:27 in 7e0e84c1a3 outdated

22@@ -21,7 +23,9 @@ BOOST_FIXTURE_TEST_SUITE(blockmanager_tests, BasicTestingSetup)
23 BOOST_AUTO_TEST_CASE(blockmanager_find_block_pos)
24 {
25     const auto params {CreateChainParams(ArgsManager{}, CBaseChainParams::MAIN)};
26-    BlockManager blockman{{}};
27+    BlockManager::Options blockman_opts{};
28+    ApplyArgsManOptions(gArgs, blockman_opts);

maflcko commented at 10:48 am on May 3, 2023:

Unrelated in 7e0e84c1a3587f8b2ffac00f647d1e2d17febb91: Is there a reason you are sometimes using gArgs and then m_args? It looks like m_args ignores extra_args and gArgs/m_node.args doesn’t?

So my preference would be to figure out whether to use m_node.args or m_args, and then use it consistently?

maflcko commented at 12:42 pm on May 3, 2023:

See also https://github.com/bitcoin/bitcoin/issues/25055

in src/init.cpp:1447 in 87999089e1 outdated

1439@@ -1439,9 +1440,12 @@ bool AppInitMain(NodeContext& node, interfaces::BlockAndHeaderTipInfo* tip_info)
1440     };
1441     Assert(!ApplyArgsManOptions(args, chainman_opts)); // no error can happen, already checked in AppInitParameterInteraction
1442 
1443-    node::BlockManager::Options blockman_opts{};
1444+    BlockManager::Options blockman_opts{};
1445     Assert(!ApplyArgsManOptions(args, blockman_opts)); // no error can happen, already checked in AppInitParameterInteraction
1446 
1447+    assert(!node.blockman);
1448+    node.blockman = std::make_unique<BlockManager>(blockman_opts);

maflcko commented at 10:58 am on May 3, 2023:

87999089e1b7794cd595ca19893a383149546087: Can you explain why it is necessary and safe to move this out of the for loop?

TheCharlatan commented at 12:37 pm on May 3, 2023:

In the context of this commit it is necessary to move the assignment out of the for loop, so we don’t destruct the node.blockman on each iteration and end up with a dangling reference to the original object in zmq. What I did not think about though is that we are also no longer resetting the internal state of the BlockManagerwhen the ChainstateManager is destructed. So no, I’m no longer sure that this is safe.

maflcko commented at 12:41 pm on May 3, 2023:

I wonder if the for loop can be removed, and replaced by the shutdown workflow from #26674 (comment)? See also 5921b863e39e5c3997895ffee1c87159e37a5d6f

TheCharlatan commented at 12:58 pm on May 3, 2023:

Thank you for the context, I see that I am conceptually going backwards here.

TheCharlatan commented at 2:06 pm on May 3, 2023:

I wonder if the for loop can be removed, and replaced by the shutdown workflow from #26674 (comment)

That does sound more sane than continuously re-indexing forever, or waiting for the user to interrupt.

The change I pushed will be reverted. How about instead of moving the previously static functions of blockstorage.cpp into its BlockManager class, moving them to a stateless sub-class of the BlockManager? Then we can have the node context be the owner of this class, and pass it through as a const reference to the ChainstateManager and its BlockManager, as well as the zmq code, which can now use this class without taking ownership of the entire BlockManager?

maflcko commented at 2:23 pm on May 3, 2023:

I very much like them to be stateful, see also #27125 (review) . This is also needed for some other, unrelated, stuff that someone should be working on (cc @josibake )

TheCharlatan commented at 2:49 pm on May 3, 2023:

I was unclear with the word “stateless”. What I meant was, “does not retain memory of previous interactions”, e.g. we set all the options (including the consensus params) on initialization.

TheCharlatan commented at 4:04 pm on May 3, 2023:

For now, I will go back to this #27125 (review) , where you also suggested to use findBlock.

maflcko commented at 10:01 am on May 8, 2023:

Thanks, thread can be resolved

maflcko commented at 10:02 am on May 8, 2023:

or maybe the unrelated comment #27125 (review) can be transformed into an issue?

in src/test/validation_chainstatemanager_tests.cpp:388 in 7e0e84c1a3 outdated

383@@ -381,10 +384,12 @@ struct SnapshotTestSetup : TestChain100Setup {
384                 .datadir = m_args.GetDataDirNet(),
385                 .adjusted_time_callback = GetAdjustedTime,
386             };
387+            BlockManager::Options blockman_opts{};
388+            ApplyArgsManOptions(gArgs, blockman_opts);

maflcko commented at 11:03 am on May 3, 2023:

nit in 7e0e84c1a3587f8b2ffac00f647d1e2d17febb91: What is the point of adding 5 LOC that will be removed in the next commit anyway? If you want to keep that, it would be good to also remove the now unused include as well, which you forgot?

in src/zmq/zmqnotificationinterface.cpp:26 in 3b9d6b0640 outdated

20@@ -21,6 +21,10 @@
21 #include <utility>
22 #include <vector>
23 
24+namespace node {
25+class NodeContext;
26+}

maflcko commented at 11:13 am on May 3, 2023:

3b9d6b06407f79d87b3318ecaaaca696b0dd1e68: Why?

in src/index/base.cpp:208 in 512592cf41 outdated

204@@ -207,7 +205,7 @@ void BaseIndex::ThreadSync()
205 
206             CBlock block;
207             interfaces::BlockInfo block_info = kernel::MakeBlockInfo(pindex);
208-            if (!ReadBlockFromDisk(block, pindex, consensus_params)) {
209+            if (!m_chainstate->m_blockman.ReadBlockFromDisk(block, pindex, consensus_params)) {

maflcko commented at 11:19 am on May 3, 2023:

unrelated in 512592cf41ae8fb307bd6eb651e3319e5053851e: Might be good to store a reference to the chain params in m_opts in a previous commit and use it inside the function. Then, while touching all lines here, the now unused params argument could be dropped.

maflcko commented at 2:24 pm on May 3, 2023:

This cleanup applies to other functions as well, so I am happy to have this split up (for those other functions), to avoid this pull from growing too large?

maflcko commented at 10:41 am on May 4, 2023:

Apologies for sniping this in #27570. Feel free to NACK

maflcko commented at 8:44 am on May 8, 2023:

nit in 1def580e80a1892e473324016db9a0f2ffec0c27: (For this method and others like UndoReadFromDisk …) It is UB to pass a nullptr pindex, so it would be good, while touching the function signature in this pull request to switch the pointer to a reference.

maflcko commented at 10:01 am on May 8, 2023:

Thanks, thread can be resolved

in src/test/blockfilter_index_tests.cpp:159 in 512592cf41 outdated

155@@ -155,7 +156,7 @@ BOOST_FIXTURE_TEST_CASE(blockfilter_index_initial_sync, BuildChainTestingSetup)
156         for (block_index = m_node.chainman->ActiveChain().Genesis();
157              block_index != nullptr;
158              block_index = m_node.chainman->ActiveChain().Next(block_index)) {
159-            CheckFilterLookups(filter_index, block_index, last_header);
160+            CheckFilterLookups(filter_index, block_index, last_header, m_node.chainman->m_blockman);

maflcko commented at 11:26 am on May 3, 2023:

nit (same commit): You can bind blockman to a lambda named CheckFilterLookups that binds the arg to avoid touching those lines

TheCharlatan commented at 10:25 am on May 4, 2023:

I’m not sure if it is possible for lambda’s to shadow a function that they call. Can you show me what you mean?

maflcko commented at 10:01 am on May 8, 2023:

Not worth it on a second thought, thread can be resolved

in src/node/blockstorage.cpp:587 in f665024be7 outdated

583@@ -584,7 +584,7 @@ void BlockManager::UnlinkPrunedFiles(const std::set<int>& setFilesToPrune) const
584 
585 FlatFileSeq BlockManager::BlockFileSeq() const
586 {
587-    return FlatFileSeq(gArgs.GetBlocksDirPath(), "blk", gArgs.GetBoolArg("-fastprune", false) ? 0x4000 /* 16kb */ : BLOCKFILE_CHUNK_SIZE);
588+    return FlatFileSeq(gArgs.GetBlocksDirPath(), "blk", m_opts.fast_prune ? 0x4000 /* 16kb */ : BLOCKFILE_CHUNK_SIZE);

maflcko commented at 11:30 am on May 3, 2023:

f665024be7e5cc0a91ac7ff5779209313453a043: Wrong commit? How is this not missing from 7e0e84c1a3587f8b2ffac00f647d1e2d17febb91?

mzumsande commented at 11:51 am on May 3, 2023:

It wasn’t possible to do that in https://github.com/bitcoin/bitcoin/commit/7e0e84c1a3587f8b2ffac00f647d1e2d17febb91 because BlockFileSeq wasn’t part of BlockManager then. One could rearrange the commits though and have 7e0e84c1a3587f8b2ffac00f647d1e2d17febb91 after the commit that moves stuff into BlockManager.

maflcko commented at 10:01 am on May 8, 2023:

Thanks, thread can be resolved

in src/init.cpp:1454 in 6b5eb9376c outdated

1430@@ -1431,7 +1431,9 @@ bool AppInitMain(NodeContext& node, interfaces::BlockAndHeaderTipInfo* tip_info)
1431     };
1432     Assert(!ApplyArgsManOptions(args, chainman_opts)); // no error can happen, already checked in AppInitParameterInteraction
1433 
1434-    BlockManager::Options blockman_opts{};
1435+    BlockManager::Options blockman_opts{
1436+        .blocks_dir = args.GetBlocksDirPath(),
1437+    };

maflcko commented at 11:35 am on May 3, 2023:

6b5eb9376c97ac7edd096754708aba6d454cef76:Shouldn’t this happen inside ApplyArgsManOptions? If not, shouldn’t blocks_dir be const to avoid an empty path?

TheCharlatan commented at 3:02 pm on May 3, 2023:

I copied this behavior from ChainstateManagerOpts. We expect the default constructed Options to contain defaults. If we don’t have defaults, we assign on object construction. So, yes, in both instances a const qualifier (and maybe in others using the same pattern) should be added.

maflcko commented at 2:59 pm on May 5, 2023:

unrelated: The following diff compiles for me:

 0diff --git a/src/kernel/chainstatemanager_opts.h b/src/kernel/chainstatemanager_opts.h
 1index 2395f60164..dd75985332 100644
 2--- a/src/kernel/chainstatemanager_opts.h
 3+++ b/src/kernel/chainstatemanager_opts.h
 4@@ -29,7 +29,7 @@ namespace kernel {
 5  */
 6 struct ChainstateManagerOpts {
 7     const CChainParams& chainparams;
 8-    fs::path datadir;
 9+    const fs::path datadir;
10     const std::function<NodeClock::time_point()> adjusted_time_callback{nullptr};
11     std::optional<bool> check_block_index{};
12     bool checkpoints_enabled{DEFAULT_CHECKPOINTS_ENABLED};

in src/node/blockstorage.h:208 in f59f0c91ac outdated

204@@ -207,6 +205,8 @@ class BlockManager
205 
206     [[nodiscard]] bool LoadingBlocks() const { return m_importing || fReindex; }
207 
208+    bool StopAfterBlockImport() const { return m_opts.stop_after_block_import; }

maflcko commented at 11:37 am on May 3, 2023:

f59f0c91acb7a35b767bb0dc75ed8b10add81d9f: nodiscard?

maflcko commented at 10:00 am on May 8, 2023:

Thanks, thread can be resolved

maflcko commented at 11:38 am on May 3, 2023: member

f59f0c91acb7a35b767bb0dc75ed8b10add81d9f 🙍

Signature:

0untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
1RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
2trusted comment: f59f0c91acb7a35b767bb0dc75ed8b10add81d9f 🙍
3QCpVzQy7eIdu/y9fKVeQ1TjvngAc8zmwi0x9FvX6ykCtM3q58Vmz49PAujSNjQz5zj6bjCMXWAXoHeOH/w6VAQ==

hebasto commented at 10:45 am on May 4, 2023: member

Concept ACK.

TheCharlatan force-pushed on May 4, 2023

TheCharlatan commented at 11:15 am on May 4, 2023: contributor

Thank you for the review and discussion @MarcoFalke!

Updated f59f0c91acb7a35b767bb0dc75ed8b10add81d9f -> db668b3644883c07064b46b4e2cfd269ac9dffbd (removeBlockstorageArgs_17 -> removeBlockstorageArgs_18, compare)

Reverted a previous change moving BlockManager initialization out of the ChainstateManager as discssed in #27125 (review)
Added a commit a614ac59cd753c7ef83899602e0176654c5fff87 introducing an interface function for accessing the BlockManager::ReadBlockFromDisk. This was introduced since existing functions were more complex and less efficient than just a direct call. If this interface should not grow, I can drop this commit again and use a different method, like the previously discussed FindBlock in the following commit.
Added a commit 6b9ba7b54097bb32d3c1596b7fdcec4428c27a3a capturing this introduced interface function in a lambda and using it in the zmq code. Capturing the node.chain should be safe for the purpose of zmq notifying about block tip changes, even with the for loop re-declaring the ChainstateManager.
Addressed @MarcoFalke’s comment, reordering the commits. Functions are now moved to BlockManager methods before removing ArgsManager usage.
Addressed @MarcoFalke’s comment, adding nodiscard qualifier.
Addressed @MarcoFalke comment, adding const qualifier to Options.
Addressed @MarcoFalke comment, using m_args for the test arguments.
Addressed @MarcoFalke comment, removing memory include.
No longer use nullptrfor the g_zmq_notification_interface in 8bcd56f2aee46ccc28354882009fc38925e629df

in src/interfaces/chain.h:176 in db668b3644 outdated

171@@ -171,6 +172,9 @@ class Chain
172         const uint256& ancestor_hash,
173         const FoundBlock& ancestor_out={}) = 0;
174 
175+    //! Get a CBlock by its index and return whether successful.
176+    virtual bool getBlockByIndex(CBlock& block, const CBlockIndex* pindex) = 0;

hebasto commented at 12:43 pm on May 4, 2023:

a614ac59cd753c7ef83899602e0176654c5fff87, nit, naming:

0    virtual bool getBlockByIndex(CBlock& block, const CBlockIndex* index) = 0;

TheCharlatan commented at 1:06 pm on May 4, 2023:

This is called pindex in the original: https://github.com/bitcoin/bitcoin/blob/master/src/node/blockstorage.h#L235 . Is that wrong?

hebasto commented at 1:17 pm on May 4, 2023:

No, it is not wrong. Just mentioned it as in new code we do not make a reference to a type of the variable in its name:

0git grep -e "CBlockIndex\* index"

Feel free to ignore :)

in src/zmq/zmqnotificationinterface.cpp:47 in db668b3644 outdated

44 {
45     std::map<std::string, CZMQNotifierFactory> factories;
46     factories["pubhashblock"] = CZMQAbstractNotifier::Create<CZMQPublishHashBlockNotifier>;
47     factories["pubhashtx"] = CZMQAbstractNotifier::Create<CZMQPublishHashTransactionNotifier>;
48-    factories["pubrawblock"] = CZMQAbstractNotifier::Create<CZMQPublishRawBlockNotifier>;
49+    factories["pubrawblock"] = [&get_block_by_index = get_block_by_index]() -> std::unique_ptr<CZMQAbstractNotifier> {

hebasto commented at 12:45 pm on May 4, 2023:

6b9ba7b54097bb32d3c1596b7fdcec4428c27a3a

Isn’t capturing [&get_block_by_index] enough?

hebasto commented at 12:45 pm on May 4, 2023: member

Approach ACK db668b3644883c07064b46b4e2cfd269ac9dffbd.

Changes in test/util/common.cpp do not seem required in the dbd739d54386dcd5dd1d945423bc82362263d4e8 “refactor, BlockManager: Replace fastprune from arg with options” commit. Maybe combine them into the following commit?

TheCharlatan force-pushed on May 4, 2023

TheCharlatan commented at 1:43 pm on May 4, 2023: contributor

Thank you for the review @hebasto

Updated db668b3644883c07064b46b4e2cfd269ac9dffbd -> 9033dc6827cc623f1f7176fde120229f36ff5e74 (removeBlockstorageArgs_18 -> removeBlockstorageArgs_19, compare)

Addressed @hebasto’s comment, removing unneeded change in a commit, and implementing in the commit it was actually required.
Addressed @hebasto’s comment, dropping the p naming prefix.
Addressed @hebasto’s comment, removing capture assignment that was accidentally carried over from a previous code iteration.

in src/interfaces/chain.h:176 in 4ee6cb2d3f outdated

171@@ -171,6 +172,9 @@ class Chain
172         const uint256& ancestor_hash,
173         const FoundBlock& ancestor_out={}) = 0;
174 
175+    //! Get a CBlock by its index and return whether successful.
176+    virtual bool getBlockByIndex(CBlock& block, const CBlockIndex* index) = 0;

maflcko commented at 2:25 pm on May 4, 2023:

Passing a pointer from one process to another one is UB. I wonder if there should be a compile failure if someone adds a pointer type to the multiprocess interface? cc ryan

ryanofsky commented at 1:19 pm on May 5, 2023:

re: #27125 (review)

In general it’s ok to pass pointers and references. These are both really useful for output parameters, so the IPC framework does support them. But the type needs to be serializable, and it isn’t really possible to write serialization code for the CBlockIndex type because of the pprev pointers it contains. It compile error in #10102 to try to pass types that aren’t serializable.

For this PR would suggest not adding a new chain method, and instead doing:

0@@ -1425,7 +1425,7 @@ bool AppInitMain(NodeContext& node, interfaces::BlockAndHeaderTipInfo* tip_info)
1 #if ENABLE_ZMQ
2     g_zmq_notification_interface = CZMQNotificationInterface::Create(
3         [&chain = node.chain](CBlock& block, const CBlockIndex* index) {
4-            return chain->getBlockByIndex(block, index);
5+            return chain->findBlock(index->GetBlockHash(), interfaces::FoundBlock().data(block));
6         });
7 
8     if (g_zmq_notification_interface) {

It’d also be fine not to use the chain interface and just call ReadBlockFromDisk directly there

TheCharlatan commented at 1:25 pm on May 5, 2023:

Thanks for taking a look. It’s also not used by an actual client of the interface, so I think I’ll just drop the commit then. No need to needlessly pollute the interface.

maflcko commented at 10:00 am on May 8, 2023:

Thanks, thread can be resolved

TheCharlatan force-pushed on May 5, 2023

TheCharlatan commented at 2:08 pm on May 5, 2023: contributor

Thank you for the review @MarcoFalke and @ryanofsky

Updated 9033dc6827cc623f1f7176fde120229f36ff5e74 -> e056d6f758382d3418682095ab02f8d487aa641f (removeBlockstorageArgs_19 -> removeBlockstorageArgs_20, compare)

Dropped the commit changing the interface
Changed the lambda to capture a reference to the node’s chainman and call ReadBlockFromDisk from its m_blockman. I think this is equivalent to going through the chain interface’s findBlock, which also accesses m_node.chainman.

DrahtBot added the label CI failed on May 5, 2023

DrahtBot added the label Needs rebase on May 5, 2023

TheCharlatan force-pushed on May 5, 2023

TheCharlatan commented at 8:23 pm on May 5, 2023: contributor

Rebased e056d6f758382d3418682095ab02f8d487aa641f -> 3b34ac7465919b968795063995f6610a73aa2d29 (removeBlockstorageArgs_20 -> removeBlockstorageArgs_21, compare)

Fixed conflicts with #27570
Added a commit to remove the params argument from the functions moved into the BlockManager

DrahtBot removed the label Needs rebase on May 5, 2023

DrahtBot removed the label CI failed on May 5, 2023

in src/init.cpp:1426 in ea2a049b42 outdated

1421@@ -1422,7 +1422,11 @@ bool AppInitMain(NodeContext& node, interfaces::BlockAndHeaderTipInfo* tip_info)
1422     }
1423 
1424 #if ENABLE_ZMQ
1425-    g_zmq_notification_interface = CZMQNotificationInterface::Create();
1426+    g_zmq_notification_interface = CZMQNotificationInterface::Create(
1427+        [&chainman = node.chainman](CBlock& block, const CBlockIndex* index) {

maflcko commented at 8:28 am on May 8, 2023:

nit in ea2a049b42f73499eaa1b69212af3514026440fd: Any reason to accept a nullptr here, which would be UB?

I think it would be better to accept a const CBlockIndex& block.

maflcko commented at 8:33 am on May 8, 2023:

Also, it would be good to include the diff in src/zmq/zmqpublishnotifier.cpp from the next commit. This makes review easier, because reviewers can see that the function-to-lambda is a simple 1-1 replacement that compiles and passes all tests on its own.

in src/node/transaction.cpp:143 in 0400fb7b55 outdated

142@@ -143,7 +143,7 @@ CTransactionRef GetTransaction(const CBlockIndex* const block_index, const CTxMe
143     }

maflcko commented at 9:05 am on May 8, 2023:

in 0400fb7b55415867d9ff0aa88898920c60b4b2df: The function parameter consensusParams is now unused and should be removed?

maflcko approved

maflcko commented at 9:59 am on May 8, 2023: member

nice ACK 3b34ac7465919b968795063995f6610a73aa2d29 🗣

Signature:

0untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
1RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
2trusted comment: nice ACK 3b34ac7465919b968795063995f6610a73aa2d29 🗣
3qW6Kq7M3ZBVaQG1TZ3KYvAfs/o/BxahlxqncOR+LkhBejB6WLxUPgOSJbRR/cxW8UxRJBMDqkIVde9krgNPqCw==

DrahtBot requested review from dergoegge on May 8, 2023

in src/index/base.cpp:208 in 0400fb7b55 outdated

202@@ -205,7 +203,7 @@ void BaseIndex::ThreadSync()
203 
204             CBlock block;
205             interfaces::BlockInfo block_info = kernel::MakeBlockInfo(pindex);
206-            if (!m_chainstate->m_blockman.ReadBlockFromDisk(block, pindex, consensus_params)) {

ryanofsky commented at 1:50 pm on May 8, 2023:

In commit “refactor: Use BlockManager options for params” (0400fb7b55415867d9ff0aa88898920c60b4b2df)

Seem like you could squash this commit into the last commit and decrease review burden because the changes are mechanical and basically every line that changed here changed last commit. Not important though.

ryanofsky approved

ryanofsky commented at 1:58 pm on May 8, 2023: contributor

Code review ACK 3b34ac7465919b968795063995f6610a73aa2d29. Left some comments but this looks good as is and would not change unless necessary

in src/init.cpp:1043 in cb27dfb5af outdated

1035@@ -1035,8 +1036,9 @@ bool AppInitParameterInteraction(const ArgsManager& args, bool use_syscall_sandb
1036         if (const auto error{ApplyArgsManOptions(args, chainman_opts_dummy)}) {
1037             return InitError(*error);
1038         }
1039-        node::BlockManager::Options blockman_opts_dummy{
1040+        BlockManager::Options blockman_opts_dummy{
1041             .chainparams = chainman_opts_dummy.chainparams,
1042+            .blocks_dir = args.GetBlocksDirPath(),

mzumsande commented at 9:24 pm on May 8, 2023:

This is a slightly different approach to -fastprune and others, which are set in ApplyArgsManOptions(). So I wonder what is the criterion for an arg to be set in ApplyArgsManOptions(), as opposed to being initialized directly? And, closely connected, isn’t the current approach of not calling ApplyArgsManOptions() in some spots (setup_common, some unit tests) a bit brittle? (considering that users could specify extra bitcoind args while running unit tests, which then would be ignored)

TheCharlatan commented at 7:46 pm on May 9, 2023:

This is a slightly different approach to -fastprune and others, which are set in ApplyArgsManOptions(). So I wonder what is the criterion for an arg to be set in ApplyArgsManOptions(), as opposed to being initialized directly?

Initializing in the list is done for option members that don’t have sane default or empty values. This also has precedent in the ChainstateManager::Options.

And, closely connected, isn’t the current approach of not calling ApplyArgsManOptions() in some spots (setup_common, some unit tests) a bit brittle?

Yes, the current situation does not seem to be ideal. Sometimes these extra_args that the user can supply are currently parsed into the Options in the particular tests, and sometimes, like for the ChainstateManager, they are ignored. This is down to the mixed usage of both the m_args, which creates a fresh, default ArgsManager for each test, and the m_node.args, which parses all the passed in arguments.

This was brought up previously in this PR in #27125 (review), where based on the related discussion in #25055 and #21244 (review) I decided to keep the current behavior. If I understand the current problem correctly, it is not ideal that the m_node.args points to the global mutable gArgs instead of re-creating the arguments cleanly on each test run, while m_args is recreated on each run, but does not read the extra arguments.

Based on this, I am not sure how to proceed with this on this PR, since I do think that the current approach is broken. I could add a commit making the argument handling consistent for all the Options, at least that would make it probably less broken than it is now. What do you think?

maflcko commented at 11:03 am on May 10, 2023:

Seems unrelated, so maybe do in a follow-up or not at all?

mzumsande commented at 3:54 pm on May 10, 2023:

Yes, no need to do change it in this PR.

DrahtBot added the label Needs rebase on May 9, 2023

TheCharlatan force-pushed on May 10, 2023

TheCharlatan commented at 7:00 am on May 10, 2023: contributor

Updated 3b34ac7465919b968795063995f6610a73aa2d29 -> 24f1ace25081040af80ba4cf1636592266d8dbb5 (removeBlockstorageArgs_21 -> removeBlockstorageArgs_22, compare)

Addressed @MarcoFalke’s comment and comment making CBlockIndex a reference in ReadBlockFromDisk and UndoReadFromDisk as well as in the new zmq lambda function.
Addressed @ryanofsky’s comment by squashing commits changing the function signature of the moved functions. Also did this for the CBlockIndex argument change.
Addressed @MarcoFalke’s comment removing unused consensus params in src/node/transaction.cpp
Addressed @MarcoFalke’s comment making sure the lambda is defined in one, atomic commit.

Rebased 24f1ace25081040af80ba4cf1636592266d8dbb5 -> 8f94f059b3af5ecaf175a95389ba5e73b724203b (removeBlockstorageArgs_22 -> removeBlockstorageArgs_23, compare)

Fixed conflicts with https://github.com/bitcoin/bitcoin/pull/27491

DrahtBot removed the label Needs rebase on May 10, 2023

maflcko commented at 7:30 am on May 10, 2023: member

Nice.

re-ACK 24f1ace25081040af80ba4cf1636592266d8dbb5 🌦

Signature:

0untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
1RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
2trusted comment: re-ACK 24f1ace25081040af80ba4cf1636592266d8dbb5   🌦
3IX4DDtG96O/vTN0xyOyRw6n6dFGy59g9TyHzovXrie+lvGr2wMJeeOvSbQPLCSl08HXWYCbLSC/LP3sqJzD+AA==

maflcko commented at 7:31 am on May 10, 2023: member

trivial rebase re-ACK 8f94f059b3af5ecaf175a95389ba5e73b724203b 🗝

Signature:

0untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
1RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
2trusted comment: trivial rebase re-ACK 8f94f059b3af5ecaf175a95389ba5e73b724203b 🗝
3IJvbWV8BH4qE1BxY+hGbsdP4EvakcWtP3Ryi2ezkt+UfLS/RJhQc4QJ6u9zxvmShvLdpsFomlooRZAMl7D3aCQ==

DrahtBot requested review from ryanofsky on May 10, 2023

fanquake referenced this in commit 883766fa45 on May 10, 2023

refactor: Declare g_zmq_notification_interface as unique_ptr

Ensures better memory safety for this global. This came up during
discussion of the following commit, but is not strictly required for its
implementation.

8ed4ff8e05

TheCharlatan force-pushed on May 10, 2023

TheCharlatan commented at 11:16 am on May 10, 2023: contributor

Rebased 8f94f059b3af5ecaf175a95389ba5e73b724203b -> 886a473fc48f2c7d67436b5d9ac5643cd007e27f (removeBlockstorageArgs_23 -> removeBlockstorageArgs_24, compare)

Fixed conflicts with https://github.com/bitcoin/bitcoin/pull/27573

maflcko commented at 1:32 pm on May 10, 2023: member

trivial re-ACK 886a473fc48f2c7d67436b5d9ac5643cd007e27f 🐗

Signature:

0untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
1RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
2trusted comment: trivial re-ACK 886a473fc48f2c7d67436b5d9ac5643cd007e27f 🐗
3MiFgn+VZb98m014LmOnzmOv20QUq1tQSKEYOvhN6Tv3gta/Ii6BQq864nyHczaEFMZIfPXqKHHnPaRQ3SyVrAg==

in src/zmq/zmqnotificationinterface.cpp:47 in 6247b3c5e3 outdated

44 {
45     std::map<std::string, CZMQNotifierFactory> factories;
46     factories["pubhashblock"] = CZMQAbstractNotifier::Create<CZMQPublishHashBlockNotifier>;
47     factories["pubhashtx"] = CZMQAbstractNotifier::Create<CZMQPublishHashTransactionNotifier>;
48-    factories["pubrawblock"] = CZMQAbstractNotifier::Create<CZMQPublishRawBlockNotifier>;
49+    factories["pubrawblock"] = [&get_block_by_index]() -> std::unique_ptr<CZMQAbstractNotifier> {

ryanofsky commented at 3:41 pm on May 10, 2023:

In commit “zmq: Pass lambda to zmq’s ZMQPublishRawBlockNotifier” (6247b3c5e3c9e70097243a85b3d883b279a473cc)

It looks like the lambda is taking a reference to the local get_block_by_index variable here, which doesn’t seem right, since the variable will go out of scope before the lambda is called. So I think & should be dropped here.

Not totally sure though, since I would expect there to be problems in CI if there was a bug.

EDIT: I think moving instead of copying would also work:

0    factories["pubrawblock"] = [gb = std::move(get_block_by_index)]() -> std::unique_ptr<CZMQAbstractNotifier> {
1       return std::make_unique<CZMQPublishRawBlockNotifier>(std::move(gb));
2    };

TheCharlatan commented at 4:15 pm on May 10, 2023:

CZMQPublishRawBlockNotifier takes get_block_by_index by value (meaning instead of borrowing, copies its value), so I think that is why this works out.

ryanofsky commented at 4:51 pm on May 10, 2023:

re: #27125 (review)

Yeah sorry, I saw the “factories” map and assumed the factories would be long-lived, not destroyed before the function returns. This code is pretty strange, because it creates a map and then never looks anything up in the map. Could be cleaned up later, though. Probably a more sane way to write it would be to drop the map, change the outer for loop into an add_notifiers lambda, and do

0add_notifiers("pubhashblock", CZMQAbstractNotifier::Create<CZMQPublishHashBlockNotifier>);
1add_notifiers("pubhashtx", CZMQAbstractNotifier::Create<CZMQPublishHashTransactionNotifier>);
2...

In any case, agree the current code is safe

DrahtBot requested review from ryanofsky on May 10, 2023

in src/zmq/zmqpublishnotifier.h:56 in 6247b3c5e3 outdated

47@@ -46,7 +48,12 @@ class CZMQPublishHashTransactionNotifier : public CZMQAbstractPublishNotifier
48 
49 class CZMQPublishRawBlockNotifier : public CZMQAbstractPublishNotifier
50 {
51+private:
52+    const std::function<bool(CBlock&, const CBlockIndex&)> m_get_block_by_index;
53+
54 public:
55+    CZMQPublishRawBlockNotifier(const std::function<bool(CBlock&, const CBlockIndex&)> get_block_by_index)
56+        : m_get_block_by_index{get_block_by_index} {}

ryanofsky commented at 3:45 pm on May 10, 2023:

In commit “zmq: Pass lambda to zmq’s ZMQPublishRawBlockNotifier” (6247b3c5e3c9e70097243a85b3d883b279a473cc)

Would probably be a little more efficient to make the get_block_by_index non-const and use m_get_block_by_index{std::move(get_block_by_index)} to move the function instead of copying it.

in ci/test/06_script_b.sh:55 in c065e093b1 outdated

51@@ -52,6 +52,7 @@ if [ "${RUN_TIDY}" = "true" ]; then
52            src/dbwrapper.cpp \
53            src/init \
54            src/kernel \
55+           src/node/blockmanager_args.cpp \

ryanofsky commented at 3:54 pm on May 10, 2023:

In commit “refactor/iwyu: Complete includes for blockmanger_args” (c065e093b1c786476d9bc9769e42c2a54ff68816)

“blockmanger_args” spelling in commit title

TheCharlatan force-pushed on May 10, 2023

TheCharlatan commented at 4:51 pm on May 10, 2023: contributor

Updated 886a473fc48f2c7d67436b5d9ac5643cd007e27f -> e3311d649d9785ca7bc3a29ad7470c7cd69951bb (removeBlockstorageArgs_24 -> removeBlockstorageArgs_25, compare)

Addressed @ryanofsky’s comment, moving the lambda instead of borrowing.

EDIT: Sorry, forgot the second part :P

Updated e3311d649d9785ca7bc3a29ad7470c7cd69951bb -> 87479b2c3a6e5044baf41343e5287ff4bef0547a (removeBlockstorageArgs_25 -> removeBlockstorageArgs_26, compare)

TheCharlatan force-pushed on May 10, 2023

ryanofsky approved

ryanofsky commented at 4:59 pm on May 10, 2023: contributor

Code review ACK 886a473fc48f2c7d67436b5d9ac5643cd007e27f

re: #27125 (comment)

* Addressed [@ryanofsky](/bitcoin-bitcoin/contributor/ryanofsky/)'s [comment](https://github.com/bitcoin/bitcoin/pull/27125#discussion_r1190097934), moving the lambda instead of borrowing.

Actually I think my suggestion won’t work. There can be multiple notifiers, so the function needs to be copied into all of them not moved into the first one. ACKed previous version of the PR 886a473fc48f2c7d67436b5d9ac5643cd007e27f though, so would suggest reverting.

zmq: Pass lambda to zmq's ZMQPublishRawBlockNotifier

The lambda captures a reference to the chainman unique_ptr to retrieve
block data. An assert is added on the chainman to ensure that the lambda
is not used while the chainman is uninitialized.

This is done in preparation for the following commits where blockstorage
functions are made BlockManager methods.

cfbb212493

refactor: Move functions to BlockManager methods

This is a commit in preparation for the next few commits. The functions
are moved to methods to avoid their re-declaration for the purpose of
passing in BlockManager options.

The functions that were now moved into the BlockManager should no longer
use the params as an argument, but instead use the member variable.

In the moved ReadBlockFromDisk and UndoReadFromDisk, change
the function signature to accept a reference to a CBlockIndex instead of
a raw pointer. The pointer is expected to be non-null, so reflect that
in the type.

To allow for the move of functions to BlockManager methods all call
sites require an instantiated BlockManager, or a callback to one.

f0bb1021f0

refactor/iwyu: Complete includes for blockmanager_args a498d699e3

refactor, BlockManager: Replace fastprune from arg with options

Remove access to the global gArgs for the fastprune argument and
replace it by adding a field to the existing BlockManager Options
struct.

When running `clang-tidy-diff` on this commit, there is a diagnostic
error: `unknown type name 'uint64_t' [clang-diagnostic-error] uint64_t
prune_target{0};`, which is fixed by including cstdint.

This should eventually allow users of the BlockManager to not rely on
the global gArgs and instead pass in their own options.

02a0899527

refactor, blockstorage: Replace blocksdir arg

Add a blocks_dir field to the BlockManager options. Move functions
relying on the global gArgs to get the blocks_dir into the BlockManager
class.

This should eventually allow users of the BlockManager to not rely on
the global Args and instead pass in their own options.

18e5ba7c80

refactor, blockstorage: Replace stopafterblockimport arg

Add a stop_after_block_import field to the BlockManager options. Use
this field instead of the global gArgs.

This should allow users of the BlockManager to not rely on the global
Args.

5ff63a09a9

fanquake assigned fanquake on May 10, 2023

fanquake commented at 5:16 pm on May 10, 2023: member

https://github.com/bitcoin/bitcoin/pull/27125/checks?check_run_id=13380672723

0clang-tidy-16 -p=/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu -quiet /tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/zmq/zmqnotificationinterface.cpp
1zmq/zmqnotificationinterface.cpp:48:62: error: std::move of the const variable 'gb' has no effect; remove std::move() or make the variable non-const [performance-move-const-arg,-warnings-as-errors]
2        return std::make_unique<CZMQPublishRawBlockNotifier>(std::move(gb));
3                                                             ^~~~~~~~~~  ~

fanquake unassigned fanquake on May 10, 2023

TheCharlatan force-pushed on May 10, 2023

DrahtBot added the label CI failed on May 10, 2023

TheCharlatan commented at 5:24 pm on May 10, 2023: contributor

Updated 87479b2c3a6e5044baf41343e5287ff4bef0547a -> 5ff63a09a9edd1204b2cc56cf6f48a44adab7bb3 (removeBlockstorageArgs_26 -> removeBlockstorageArgs_27, compare)

Reverted previous change as suggested by @ryanofsky in comment.
Addressed @ryanofsky’s comment, correcting the commit message.
Addressed @ryanofsky’s comment, moving the lambda in the constructor.

ryanofsky approved

ryanofsky commented at 5:35 pm on May 10, 2023: contributor

Code review ACK 5ff63a09a9edd1204b2cc56cf6f48a44adab7bb3. Since last ACK just added std::move and fixed commit title. Sorry for the noise!

DrahtBot requested review from maflcko on May 10, 2023

in src/node/transaction.h:57 in f0bb1021f0 outdated

53@@ -53,11 +54,10 @@ static const CFeeRate DEFAULT_MAX_RAW_TX_FEE_RATE{COIN / 10};
54  * @param[in]  block_index     The block to read from disk, or nullptr
55  * @param[in]  mempool         If provided, check mempool for tx
56  * @param[in]  hash            The txid
57- * @param[in]  consensusParams The params
58  * @param[out] hashBlock       The block hash, if the tx was found via -txindex or block_index

mzumsande commented at 8:12 pm on May 10, 2023:

nit (if you repush): blockman should be added to the doc, also it would be nice to keep the out param at the end according to dev notes.

mzumsande commented at 8:17 pm on May 10, 2023: contributor

Code Review ACK 5ff63a09a9edd1204b2cc56cf6f48a44adab7bb3

DrahtBot removed the label CI failed on May 10, 2023

fanquake merged this on May 11, 2023

fanquake closed this on May 11, 2023

sidhujag referenced this in commit 43ccd0919e on May 11, 2023

ryanofsky referenced this in commit 153a6882f4 on Jun 9, 2023

bitcoin locked this on May 10, 2024

refactor, kernel: Decouple ArgsManager from blockstorage #27125

Reviews

Conflicts