Clean up code dependencies

CodeShark commented at 7:18 am on January 7, 2013: contributor

I realize this introduction is a bit long, so if you don’t feel like reading skip to the bullet points below under “strategy”.

The codebase as it exists right now has a number of unnecessary dependencies which makes code modularization much more difficult. In particular, the satoshi client was built to handle all bitcoin-related tasks, but its value as a “reference implementation” lies primarily in doing verification and acting as a relay agent as these are the most essential tasks that participating nodes must perform to keeping the network in operation. Things like wallets, historical databases, mining, and notification agents could be written as entirely separate third-party applications without risk to the network’s fundamental integrity.

The basic architecture of a bitcoin node is as follows:

At the core there exist fundamental bitcoin message structures, along with the code necessary for serialization/deserialization. These structures belong in their own source files with minimal dependencies so they can be reused for applications that needn’t perform verification and relay - for instance, filtering and notification agents. Unfortunately, these core structures currently reside for the most part in main.h/main.cpp, one of the central problems this pull request attempts to fix.

On top of these core structures sits a network component that manages sockets, does peer discovery, and handles queueing and dispatching of messages. This component is clearly dependent on the core message structures but does not depend on the specific logic used to verify blocks and transactions nor to identify misbehaving peers nor sign transactions nor maintain a block chain database.

Then we have a scripting engine, signature verification component, and a signing component. Historical database applications do not need signature verification/signing functionality at all. Filtering messages and sending alerts generally does not even require a scripting engine and does fine with basic pattern matching.

The most critical high-level operations needed by a verification/relay node such as the satoshi client are transaction verification; block chain and memory pool management; and detection/management of misbehaving peers. These things are currently primarily implemented in main.h/main.cpp. These are indeed the main operations of the satoshi client - but the core low-level structures should not depend at all on this logic.

Then there’s the UI, but let’s leave that aside for a moment.

Finally there’s init.h/init.cpp, which sets up the particular environment in which the satoshi client runs.

This branch takes the following strategy:

Remove source file dependencies on main.h and init.h by only including necessary headers wherever possible.
If source files depend on definitions in main or init, either move the dependent portions into main/init or move the depended-upon portions into separate files.
If the dependent source files use global variables or functions that clearly belong in either main or init, copy the value over to a class member or a variable with file scope in the dependent source or expose a registration function to set a callback.
If moving a core class out of main is impossible because its methods depend on variables or functions defined in main, isolate the methods that depend on main and either move them to another class that does belong in main or convert them into regular functions in main.

It is important that all modifications made in this branch are easy to review and to test. This branch does not encourage rewriting things from scratch - only moving them and rearranging them in easily identifiable chunks. Furthermore, the focus of the branch is not so much on coding style and style consistency - but on isolation of functional units and elimination of unnecessary dependencies.

gavinandresen commented at 2:53 pm on January 7, 2013: contributor

Why namespace db_cpp ? Seems to me the database copy of pchMessageStart could be a static member of the DbEnv.

CodeShark commented at 3:19 pm on January 7, 2013: contributor

I’d rather avoid adding more dependencies in CAddrDB for something that isn’t really used elsewhere.

sipa commented at 3:34 pm on January 7, 2013: member

@gavinandresen That doesn’t make sense. This is for peers.dat, which doesn’t use BDB at all, and I suppose CDbEnv will be gone as soon as we kick out BDB-based wallets.

CodeShark commented at 4:38 pm on January 7, 2013: contributor

I’m still not clear on exactly where in the code it’s best to:

set the magic bytes for CAddrDB
Set the call handlers for net.cpp

I tried running and got this error: http://pastebin.com/tJU9Gsi2 But tried restarting and it initialized correctly, and now it seems to be running fine.

BitcoinPullTester commented at 5:23 pm on January 7, 2013: none

Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/03014e2ff26d4ab6195f8b0bfc29c712dec509de for binaries and test log.

gavinandresen commented at 10:07 pm on January 7, 2013: contributor

RE: namespace db_cpp:

Fine, make it static in CAddrDB, I forgot that peers.dat is not a BDB file (and pass it into CAddrDB constructors? or is CAddrDB a singleton class? )

CodeShark commented at 4:05 am on January 8, 2013: contributor

@gavinandresen Noted and done.

CodeShark commented at 5:21 am on January 8, 2013: contributor

cp: writing out/bitcoind.exe': No space left on device cp: writingout/test_bitcoin.exe’: No space left on device

WTF?!?!?!

Shall we all chip in and get BlueMatt some more hard disk space?

CodeShark commented at 12:42 pm on January 8, 2013: contributor

I am going to keep pushing commits as an extra backup despite BitcoinPullTester being out of disk space.

Diapolo commented at 1:49 pm on January 8, 2013: none

I think you are doing good work, but I’m sure you will get faster ACKs or merges, if you try to keep pulls smaller. Perhaps @sipa or other core devs can comment.

gavinandresen commented at 2:02 pm on January 8, 2013: contributor

Our bottleneck is code review and testing.

If you want your pulls to get merged, then you need to help fix that bottleneck– either recruit reviewers/testers to review/test your own code, or volunteer to help review/test other people’s pulls.

Or only refactor code that has good unit tests, so we can be more confident that nothing breaks on any of the three platforms we support.

I am generally against refactoring just to improve the code, unless the refactoring comes with some significant benefit or unit tests.

gavinandresen commented at 2:06 pm on January 8, 2013: contributor

Oh, also: you might want to help test or update https://github.com/libcoin/libcoin which is a fully refactored version of the core code, that, last I checked, nobody used because nobody trusts it (because so many changes were made without thorough testing).

CodeShark commented at 4:30 pm on January 8, 2013: contributor

The changes I’m making will be well-documented and possible to rigorously review. And if more unit tests are needed, I’d be glad to help write some up.

I’m making incremental changes, hopefully each change easy to understand and track. I haven’t changed any actual logic in the code nor implementation details - just moved code around - and in large chunks, not small pieces.

As most of the significant changes thus far are things like moving methods from one class to another or turning them into regular functions, it should be possible to do a fairly comprehensive assessment by doing pattern matches on the codebase and making sure each usage was appropriately modified. i.e. tx.CheckTransaction() becomes CheckTransaction(tx); or tx.GetOutputFor(txin, inputs) becomes inputs.GetOutputFor(txin); etc…

CodeShark commented at 4:46 pm on January 8, 2013: contributor

Also, I’m willing to do comprehensive testing on earlier commits with the hope of getting them merged without having to merge everything at once.

in src/net.cpp: in 265b174829 outdated

72@@ -74,6 +73,28 @@ struct LocalServiceInfo {
73 
74 static CSemaphore *semOutbound = NULL;
75 
76+//
77+// Handlers that need to be registered
78+//
79+ProcessMessagesHandler fnProcessMessages = NULL;
80+SendMessagesHandler fnSendMessages = NULL;
81+StartShutdownHandler fnStartShutdown = NULL;

sipa commented at 7:27 pm on January 8, 2013:

Can these be static?

BitcoinPullTester commented at 8:48 pm on January 8, 2013: none

Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/4af9a6c7e0d1246852195b47c200a974dff47ab1 for binaries and test log.

sipa commented at 9:05 pm on January 8, 2013: member

Just for the record: I’ve been discussing these changes extensively with @CodeShark the past few days, and I think they are very valuable. They should make the code easier to understand and reuse.

Getting 0.8 out now certainly has priority over refactorings, but as these are almost entirely just code-movement changes, merging them shouldn’t be too hard.

in src/main.cpp: in 4af9a6c7e0 outdated

1326 
1327     int64 nResult = 0;
1328-    for (unsigned int i = 0; i < vin.size(); i++)
1329-        nResult += GetOutputFor(vin[i], inputs).nValue;
1330+    for (unsigned int i = 0; i < tx.vin.size(); i++)
1331+        nResult += this->GetOutputFor(tx.vin[i]).nValue;

sipa commented at 9:15 pm on January 8, 2013:

Is this “this->” necessary?

CodeShark commented at 9:27 pm on January 8, 2013:

nah, it’s mostly a stylistic thing, I guess. I can get rid of it if you don’t like it.

BitcoinPullTester commented at 0:03 am on January 9, 2013: none

Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/aea31289a59166195d2d7270eacb0905b14d476a for binaries and test log.

BitcoinPullTester commented at 6:33 am on January 9, 2013: none

Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/366fb8aa64a581c22064dae33a3d08769375b9f3 for binaries and test log.

CodeShark commented at 5:11 pm on January 23, 2013: contributor

The core commits will be reorganized into four stages:

1. move core class methods that should remain class methods implemented in main.cpp out of main.cpp and into core.cpp. (move)
1. move core class method implementations in main.h that should not be part of core classes out of the class declarations in main.h. (move)
1. turn methods that should not be part of the core classes into regular functions in main.h and main.cpp, get rid of the method prototypes in the classes, and add function declarations in main.h where necessary. (pattern replacement, add function declarations to main.h)
1. move the core class declarations to core.h (move)

BitcoinPullTester commented at 4:24 am on January 24, 2013: none

Automatic sanity-testing: FAILED MERGE, see http://jenkins.bluematt.me/pull-tester/366fb8aa64a581c22064dae33a3d08769375b9f3 for test log.

This pull does not merge cleanly onto current master

luke-jr commented at 6:37 pm on January 30, 2013: member

Rebase needed

Diapolo commented at 6:39 pm on January 30, 2013: none

I’m sure @CodeShark intents to rework this pull into smaller more logical pieces after 0.8 is final.

in src/wallet.h: in 32126ade9e outdated

22 
23 class CAccountingEntry;
24 class CWalletTx;
25 class CReserveKey;
26 class COutput;
27+class CWalletDB;

sipa commented at 5:31 pm on June 4, 2013:

Is this needed?

in src/main.h: in 32126ade9e outdated

634-
635-    void UpdateTime(const CBlockIndex* pindexPrev);
636-};
637+void UpdateTime(CBlockHeader& block, const CBlockIndex* pindexPrev);
638 
639 class CBlock : public CBlockHeader

sipa commented at 5:33 pm on June 4, 2013:

Any reason why CBlock doesn’t move to core.h? Or just a “first the easy parts”?

CodeShark commented at 3:23 am on June 5, 2013:

I want to get what I had already done rebased correctly before moving CBlock - but yes, definitely that comes immediately after.

jgarzik commented at 5:40 pm on June 4, 2013: contributor

OK, very much like where this code is going. ACK 90% of it, modulo inline comments made earlier.

One specific criticism with this rebase: the rebase problems were all fixed in a final, appended commit. That does not work, because it breaks bisection properties. Each commit needs to produce a tree that is buildable and passes tests. Otherwise, “git bisect” will not work, and attempting to find a problematic commit in past history becomes more difficult.

I know that makes rebasing more difficult for changes like this, sorry :(

We cannot have: broken commit, broken commit, broken commit, commit that fixes everything.

sipa commented at 5:44 pm on June 4, 2013: member

I’m very much in favor of the code changes here (and they look move-only to, apart from the registration functions). It’s only a first step, but it’s a very needed one IMHO.

I agree with the comment about the last commit fixing everything, but apart from that, I’d like to see this merged soon. Since now seems the ideal time for this, I don’t mind holding up other pullreqs for a short time, so this can get reviewed and finalized.

jgarzik commented at 5:47 pm on June 4, 2013: contributor

Agreed, RE holding other pullreqs, to avoid the endless rebase pain on @CodeShark ’s part.

CodeShark commented at 3:24 am on June 5, 2013: contributor

I’ll try to get this done today.

CodeShark commented at 10:43 am on June 5, 2013: contributor

All the commits build with make -f makefile.unix now.

Still left to do:

Move CBlock to core
Remove main.h include in net.cpp.
Unit tests

CodeShark commented at 1:22 pm on June 5, 2013: contributor

The unit tests have run successfully on all the commits in linux. Need a few more eyes to review for correctness and some help with a few more tests on other platforms. As for the other two things I had listed as TODOs - moving CBlock to core and removing the main.h include from net - I’d rather merge what we have now and do these things later.

jgarzik commented at 1:33 pm on June 5, 2013: contributor

OK, looks pretty good.

ACK everything except the indirect function pointer stuff. Definite NAK on the function pointers. Let’s fix that up, and we can get this merged.

CodeShark commented at 1:53 pm on June 5, 2013: contributor

What do you suggest in place of the function pointers?

sipa commented at 1:58 pm on June 5, 2013: member

@jgarzik Not sure you see the reason for those indirect pointers. They are there to break the dependency of net on main, and seen as such it seems perhaps weird, if the only user of net is main for now.

However, over time, net should turn into a class “CNetworkNode” or something, which exposes a way to listen for events. I suppose a boost::signal could be used right now instead, which more clearly shows its intention.

And please, the overhead of a pointer indirection is few orders of magnitude lower than even just allocating the buffer in which a message being processed is read…

CodeShark commented at 2:00 pm on June 5, 2013: contributor

If there’s any reasonable critique of the function pointers it’s about style, not performance. A more general-purpose messaging system would be very nice - but in absence of a clear design, at least the function pointers avoid any dependencies on outside libraries.

gavinandresen commented at 2:09 pm on June 5, 2013: contributor

I’d also prefer boost::signals2 over registering function pointers. We’re already using them elsewhere in the code, and it is exactly the type of decoupling they are designed for.

laanwj commented at 2:19 pm on June 5, 2013: member

Agree with Gavin on boost::signal2, let’s avoid raw function pointers in c++

CodeShark commented at 2:21 pm on June 5, 2013: contributor

Very well, I agree it’s a cleaner solution.

CodeShark commented at 3:20 pm on June 5, 2013: contributor

I’ll add signals later today…

sipa commented at 6:13 pm on June 5, 2013: member

Just so this isn’t forgotten: @TheUni just noticed this doesn’t update makefiles (yet)

robbak commented at 1:00 am on June 6, 2013: contributor

ACK for FreeBSD - this branch builds and runs clean with the standard FreeBSD adjustments.

On 6 June 2013 04:13, Pieter Wuille notifications@github.com wrote:

Just so this isn’t forgotten: @TheUni https://github.com/TheUni just noticed this doesn’t update makefiles (yet)

— Reply to this email directly or view it on GitHubhttps://github.com/bitcoin/bitcoin/pull/2154#issuecomment-18996278 .

CodeShark commented at 2:26 am on June 6, 2013: contributor

@sipa I have deliberately avoided putting anything in core.cpp to avoid makefile issues for this merge. Eventually, it will probably make sense to move some of the code in core.h into core.cpp.

Get rid of db dependencies on main 336fe971e6

Moved PushGetBlocks to main.cpp to eliminate dependence of net.cpp on CBlockLocator. 8926263dde

Moved unrelated-to-network calls in StartNode and StopNode into init.cpp 4751d07efd

Removed net.cpp's dependency on init.h.

Added explicit include of main.h in init.cpp, changed include of init.h to include of main.h in net.cpp.

Added function registration for net.cpp in init.cpp's network initialization.

Removed protocol.cpp's dependency on main.h.

TODO: Remove main.h include in net.cpp.

663224c232

CodeShark commented at 4:36 am on June 6, 2013: contributor

When I had worked on this originally, net.cpp was calling StartShutdown directly. Happily, this has since been removed.

I had overlooked the fact that ProcessMessages was still being called directly from net.cpp. I’ll fix the appropriate commits once everyone agrees with the messaging approach.

sipa commented at 4:49 am on June 6, 2013: member

@CodeShark You’ll still at least need to add core.h to bitcoin-qt.pro, and while you’re on it, I don’t see any harm in adding core.cpp to the other makefiles too - that’ll make it easier to move stuff there in the future.

Created core.h/core.cpp, added to makefiles. Started moving core structures from main to core beginning with COutPoint. effc2770f5

Moved CInPoint to core. Removed GetMinFee from CTransaction and made it a regular function in main. 788536f175

Removed AcceptToMemoryPool method from CTransaction. This method belongs to the mempool instance.

Removed AreInputsStandard from CTransaction, made it a regular function in main.
Moved CTransaction::GetOutputFor to CCoinsViewCache.

Moved GetLegacySigOpCount and GetP2SHSigOpCount out of CTransaction into regular functions in main.

Moved GetValueIn and HaveInputs from CTransaction into CCoinsViewCache.

Moved AllowFree, ClientCheckInputs, CheckInputs, UpdateCoins, and CheckTransaction out of CTransaction and into main.

Moved IsStandard and IsFinal out of CTransaction and put them in main as IsStandardTx and IsFinalTx. Moved GetValueOut out of CTransaction into main. Moved CTxIn, CTxOut, and CTransaction into core.

Added minimum fee parameter to CTxOut::IsDust() temporarily until CTransaction is moved to core.h so that CTxOut needn't know about CTransaction.

05df3fc68d

Moved CCoins, CTxOutCompressor, CTxInUndo, and CTxUndo to core. 65e7bbef74

Removed script.cpp's dependence on main.h 48343a0a50

Moved UpdateTime out of CBlockHeader and moved CBlockHeader into core. aabdf9e899

Using boost::signals2 to message main from net.cpp. 501da2503a

CodeShark commented at 6:17 am on June 6, 2013: contributor

@sipa done @TheUni core.h/core.cpp will have to be considered in what you’re doing

Removed the main.h include from net.cpp. 6e68524e95

CodeShark closed this on Jun 6, 2013

CodeShark reopened this on Jun 6, 2013

CodeShark commented at 7:25 am on June 6, 2013: contributor

Sorry, hit the close button by accident.

CodeShark commented at 7:42 am on June 6, 2013: contributor

Alright - code freeze until merge.

in src/db.cpp: in 6e68524e95

487@@ -486,6 +488,7 @@ void CDBEnv::Flush(bool fShutdown)
488 // CAddrDB
489 //
490 
491+unsigned char CAddrDB::pchMessageStart[4] = { 0x00, 0x00, 0x00, 0x00 };

sipa commented at 9:11 am on June 7, 2013:

This duplication is quite ugly. Mike’s refactor that moves these chain-dependent constants to a separate class is nicer, but you may want to check that it doesn’t introduce new dependencies.

sipa commented at 9:38 am on June 7, 2013: member

ACK. Code changes look good to me, and I’ve tested syncing/mining/receiving/sending on testnet. I have a few inline comments left, but those can be dealt with in follow-up commits, so we don’t need to stall the merging process too long.

in src/core.cpp: in 6e68524e95

0@@ -0,0 +1,7 @@
1+// Copyright (c) 2009-2010 Satoshi Nakamoto
2+// Copyright (c) 2009-2013 The Bitcoin developers
3+// Distributed under the MIT/X11 software license, see the accompanying
4+// file COPYING or http://www.opensource.org/licenses/mit-license.php.
5+
6+#include "core.h"

laanwj commented at 7:53 am on June 9, 2013:

Why have this implementation file that only includes a header?

sipa commented at 8:20 am on June 9, 2013:

More things are intended to be moved to core (in particular, CBlock), in a second step. As those will include implementation code too, already create a file for them, so e.g. #2748 picks it up already.

laanwj commented at 7:09 pm on June 9, 2013: member

I’ve tested this commit on testnet, no problems found.

jgarzik referenced this in commit f59530ce6e on Jun 10, 2013

jgarzik merged this on Jun 10, 2013

jgarzik closed this on Jun 10, 2013

sidhujag referenced this in commit 46114a8f67 on Jul 13, 2018

HashUnlimited referenced this in commit 15db1ed139 on Jul 27, 2018

owlhooter referenced this in commit c4698d5f3d on Oct 11, 2018

guruvan referenced this in commit 2ccfffc87d on Nov 8, 2018

DrahtBot locked this on Sep 8, 2021

Clean up code dependencies #2154