Separate Contextual checks and handling & switch on enum in net

JeremyRubin commented at 10:13 pm on April 3, 2017: contributor

edit 0: Updated to reflect updates mentioned in #10145 (comment)

This PR separates ProcessMessage into two functions, ContextualProcessMessage and ~~ProcessMessage~~ Process*Message. This helps with readability, verifiability, and maintainability of the code.

ContextualProcessMessage generates a list of context dependent “whitelists”, all of which must pass for the incoming message before a call to ProcessMessage may be made. If the whitelists fail, the code that follows should be identical to the previous behavior. The choice of whitelists over blacklists is because it is better to explicitly enable the behaviors desired, rather than to try to block the potential bad features (e.g., adding something unsafe and new won’t be permitted in unstudied contexts). This design should be extensible for adding new features (~~up to 64 netmsgs total~~ unlimited network messages) as well as new contexts (easy to add new whitelists). There should be very little overhead to check these whitelists as it is all ~~bitwise~~ bool array lookups.

~~ProcessMessage now uses an enum to switch to~~ Dispatch is now done using a std::map lookup to get the appropriate handler, and is semi “stateless” (the map is const). This makes it easier to verify the code and make dispatch more modular.

I haven’t benchmarked that the conversion from string->~~enum~~std::pair<handler_t, whitelist_index> has any performance implication, negative or positive. In theory this code could be faster given fewer branch mispredictions due to the ~~switch~~function pointer call. Another PR could improve the lookup algorithm ~~(trivially, inlining getAllNetMessageTypes might help the compiler a lot)~~, but unless it is exotic it should be compatible with this design by replacing the map with the desired scheme. ~~I didn’t think there was something obviously faster than the linear lookup, because n is small.~~ A std::map lookup should be fairly fast, but perhaps a custom map could be faster.

The correctness of this code is dependent on NetMsgTypeEnum::tag and allNetMessageTypes having the same index order. It would be nice to verify this property at compile time, which should be possible with some recursive constexpr static_assert magic. The default return of ProcessNewMessage is now also false, because the last return is unreachable.

See #9608 and https://github.com/theuni/bitcoin/commit/f1e4e281e3f1eb884f8010ac941c82752174bdbe for related work/alternatives.

gmaxwell commented at 10:23 pm on April 3, 2017: contributor

I don’t get the bitfield stuff here. Why? It adds a lot of code with indirect effects, and means that we cannot use a perfect hash to set the enum value (e.g. stuck with a map of strings at best, though this code doesn’t do that).

theuni commented at 10:39 pm on April 3, 2017: member

@gmaxwell I agree. I asked @JeremyRubin to check my logic in https://github.com/theuni/bitcoin/commit/f1e4e281e3f1eb884f8010ac941c82752174bdbe, and after reviewing, he wanted to take a stab at a more direct mapping for the initial filter.

This is an interesting approach, but I think this is much less clear than f1e4e281e3f1eb884f8010ac941c82752174bdbe, and it tangles the rules up with the enum values.

JeremyRubin commented at 10:41 pm on April 3, 2017: contributor

It’s actually trivial to make this work very well with a perfect hash and this implementation doesn’t add any network dependency on the order. If you have a perfect hash available I could demonstrate :)

The bitwise stuff is the set of current policies. It’s much more explicit about which operations are allowed in which contexts compared to a list of conditionals. There might be a better way to aggregate those rules together, but those are the rules. Blacklists would be shorter (I used one for the bloom stuff, now that I think of it), but it’s generally easier to audit what is permitted rather than what is not.

gmaxwell commented at 11:16 pm on April 3, 2017: contributor

If you have a perfect hash available

gperf works in a pinch

cat protocol.cpp | grep ‘^const char *’ | cut -d’"’ -f2 | sort | gperf -lCcE

. It’s much more explicit about which operations are allowed in which contexts compared to a list of conditionals.

/Generally/ moving function preconditions far away from their code results in defects. When its something that applies basically universally (like the check for the version handshake finishing), then it can make sense… but I think having if (importing) return; at the top of a message handling function is a lot more maintainable than what is effectively an additional state machine.

fanquake added the label P2P on Apr 3, 2017

JeremyRubin commented at 2:12 am on April 4, 2017: contributor

cat protocol.cpp | grep ‘^const char *’ | cut -d’"’ -f2 | sort | gperf -lCcE

As promised, perfect hashing patch in https://github.com/JeremyRubin/bitcoin/tree/perfect_hashing (a little bit less clean than it could be, because I didn’t want to modify protocol.h, but you get the idea).

It’s like 8 lines of code. At initialize you just fill up a translation table to map the perfect hashes. Note that I slightly tweaked the API of the gperf hash to return the hash key and max+1 on fail to make this design easier.

What’s nice about the translation table is it could be modified to store the function pointers to the handlers directly as well, skipping the jump table/switch.

/Generally/ moving function preconditions far away from their code results in defects. When its something that applies basically universally (like the check for the version handshake finishing), then it can make sense… but I think having if (importing) return; at the top of a message handling function is a lot more maintainable than what is effectively an additional state machine.

I’m mixed on this one. I agree that having preconditions closer to the code can be good; but it’s also good to have a non-exposed function which only takes sanitized inputs, and do the sanitizing elsewhere. I agree with the state machine comment, but that was the existing state of the code: there is currently an implicit state machine on what order messages were allowed to come in. This PR makes it more explicit; if you want to make it even more explicit (e.g., ProtocolStateMachine class I think that would be great :)). This at least makes it really easy to do whatever you want with the actual dispatch as it is state independent.

I don’t think it’s more maintainable to have repeated preconditions throughout the code, because then it is easier to forget to check a precondition in a handler and you repeat precondition checking code, leading to more opportunity for error. It’s a trade off.

It is possible to re-check these preconditions if critical, which would maybe be “the best/worst” of both worlds.

laanwj commented at 6:27 am on April 4, 2017: member

A serious drawback of perfect hashing is that it makes it hard to add message types, especially in PRs or third-party patches. It reduces flexibility.

Unless it can be clearly shown from performance results that matching a small string in e.g. a sorted table or run-time constructed hash is really a performance sink, and given the small number of small messages I would be really surprised (12 bytes isn’t even two 64-bit words!), I’d prefer if adding a message type was just adding a line.

ProtocolStateMachine

If you go this way, I don’t think there should be one protocol state machine. Different concerns (e.g. initial negotiation, handling pings, handling transactions, handling blocks, handling filters) could be separate state machines that handle groups of messages (this was @sipa’s idea). Creating a separate handler class for every message type would be typical OOP overkill, but for separate concerns I think it’d make sense and would help untangle the current labyrinth.

sipa commented at 7:07 am on April 4, 2017: member

I think using a perfect hash is unnecessary overkill here.

JeremyRubin force-pushed on Apr 4, 2017

JeremyRubin commented at 9:09 pm on April 4, 2017: contributor

Ultimately if we want to check preconditions, it can’t hurt to check some general preconditions in ContextualProcessMessage which apply to either the ProcessMessage state machine (e.g., that only one version may be received) or affect more than one handler. Individual handlers that rely on global preconditions should also re-check these global preconditions to harden them against a failure in ContextualProcessMessage.

I pushed up a better version.

More Modular:

there is no external dependency on the value of message types in the enum
Each handler is a separate function (this duplicates/replaces some of @jtimon’s work)

Cleaner:

The whitelists are now a std::array<bool, …> rather than (previously) a uint64_t, so no masking just indexing
Identifiers are renamed to be a bit more clear
whitelists are now separated into global and connection specific
diff is easier to read

More Extensible:

The whitelists are now a std::array<bool, …> rather than (previously) a uint64_t, so no refactoring needed to have more than 64 messages. Adding new whitelists is also easier/cleaner.
The use of the function pointers/whitelist arrays makes future work easier, including having per-node handler tables/whitelists (e.g., if a client says it is SPV disable certain operations).

old version exists at https://github.com/jeremyrubin/bitcoin/tree/netprocessing_enum_backup for reference.

jtimon commented at 4:44 pm on April 5, 2017: contributor

I only looked fast at the changes and read the rest of the comments, but this does way more than #9608 , even if #9608 would rebase on top of https://github.com/theuni/bitcoin/commit/f1e4e281e3f1eb884f8010ac941c82752174bdbe (which at a glance look like something good to do before #9608, but I haven’t reviewed it deeply because I’m not that familiar with the network code).

#9608 is very easy to review and trivial to rebase (or re-write) and personally I think it makes the network code easier to read for people like me (perhaps combined with https://github.com/theuni/bitcoin/commit/f1e4e281e3f1eb884f8010ac941c82752174bdbe even more, I haven’t tried combining them). In contrast, it seems like this PR would make harder for me to read at a glance (or maybe not, or maybe it is still worth it, I honestly don’t know). Although I’ve written #9608 (which as said is trivial to write IMO, just a little bit painful [could have been lesss painful if I still used eclipse http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2FgettingStarted%2Fqs-ExtractMethod.htm ]), I don’t feel very capacitated to review this PR or even https://github.com/theuni/bitcoin/commit/f1e4e281e3f1eb884f8010ac941c82752174bdbe which looks much simpler.

Separate Contextual checks and handling in net_processing.cpp. 1b389b0e7e

JeremyRubin force-pushed on Sep 5, 2017

JeremyRubin commented at 11:34 pm on September 5, 2017: contributor

rebased to 1b389b0

in src/net_processing.cpp:1175 in 1b389b0e7e

1193-        } else {
1194-            pfrom->fDisconnect = true;
1195-            return false;
1196-        }
1197-    }
1198+static bool ContextualProcessMessage(CNode* pfrom, const std::string&

ryanofsky commented at 5:24 pm on September 6, 2017:

Maybe leave this as ProcessMessage instead of renaming to ContextualProcessMessage. “Contextual” doesn’t seem that elucidating, and the PR could be a little smaller without this change.

in src/net_processing.cpp:1179 in 1b389b0e7e

1197-    }
1198+static bool ContextualProcessMessage(CNode* pfrom, const std::string&
1199+        strCommand, CDataStream& vRecv, int64_t nTimeReceived, const
1200+        CChainParams& chainparams, CConnman& connman, const std::atomic<bool>&
1201+        interruptMsgProc);
1202+namespace ProcessMessageHandler {

ryanofsky commented at 5:28 pm on September 6, 2017:

Style guide probably changed since this PR was written, but currently says to use a snake_case name for a namespace, put the brace on the next line, and not indent the contents. Could maybe name this “handlers” instead of “handler” too since it is a collection of handlers.

in src/net_processing.cpp:1206 in 1b389b0e7e

1199@@ -1218,19 +1200,12 @@ bool static ProcessMessage(CNode* pfrom, const std::string& strCommand, CDataStr
1200                 LogPrint(BCLog::NET, "Unparseable reject message received\n");
1201             }
1202         }
1203+        return true;
1204     }
1205 
1206-    else if (strCommand == NetMsgType::VERSION)
1207+static bool ProcessVersionMessage(CNode* pfrom, CDataStream& vRecv, int64_t nTimeReceived, const CChainParams& chainparams, CConnman& connman, const std::atomic<bool>& interruptMsgProc)

ryanofsky commented at 5:47 pm on September 6, 2017:

You might be able to cut down on the size of the PR by making these handlers classes instead of functions. E.g. you wouldn’t need to repeat these pfrom, vRecv, nTimeReceived declarations because they could be members initialized by an inherited constructor. strCommand variables could be static members, so PushMessage calls wouldn’t need to change. And you could replace handler_registry table with initialization from a flat list of handler classes.

I could imagine other reviewers liking functions more than classes though, and the differences are pretty superficial, so should just consider this a tentative suggestion.

in src/net_processing.cpp:2627 in 1b389b0e7e

2626+        GETBLOCKTXN,
2627+        BLOCKTXN,
2628+        // Unknown Index
2629+        MSG_TYPE_UNKNOWN,
2630+        // Size
2631+        INDEX_COUNT,

ryanofsky commented at 5:57 pm on September 6, 2017:

Maybe add a comment here that this must be last.

in src/net_processing.cpp:2673 in 1b389b0e7e

2672 
2673 
2674 
2675+    //! Message Processing Whitelists (default init 0)
2676+    //! Not const for initialization
2677+    static std::array<bool, INDEX_COUNT> KNOWN_MESSAGES {};

ryanofsky commented at 6:05 pm on September 6, 2017:

No good reason not to make these constant, I think. You can assign them values at initialization by returning from a function or lambda.

ryanofsky commented at 6:39 pm on September 6, 2017:

Would be good to have comments describing what each whitelist does, and maybe the types of messages make sense to have in the whitelist. For example, it’s not immediately obvious which messages should go in KNOWN_MESSAGES (all the message types, or only the messages types with handlers).

Alternately could add an overall comment here pointing to the ContextualProcessMessage function for specifics on what the whitelists do.

in src/net_processing.cpp:2681 in 1b389b0e7e

2680+    static std::array<bool, INDEX_COUNT> BEFORE_VERSION {};
2681+    static std::array<bool, INDEX_COUNT> AFTER_VERACK {};
2682+    static std::array<bool, INDEX_COUNT> NO_REQUIRE_BLOOM {};
2683+
2684+    bool init_whitelists() {
2685+        for (auto x : { VERSION, VERACK, ADDR, INV, GETDATA, MERKLEBLOCK,

ryanofsky commented at 6:36 pm on September 6, 2017:

Could you initialize this from the registry table instead of listing all the messages again?

ryanofsky commented at 7:17 pm on September 6, 2017: member

Light utACK 1b389b0e7e57534a2a4b18a2fead70c24dd69ced. Light because I didn’t spend as much time as I would like verifying changes in this PR don’t change current behavior. Will spend more time when there are concept acks for the overall approach.

I think this is a nice improvement. The logic and explanations in ContextProcessMessage make it easier to understand why various checks are being done. The centralized whitelists should make it easier to add new message types with correct checking in the future because you can see at a glance how all the existing message types behave. The separate message handling functions make code more navigable.

theuni commented at 10:25 pm on September 22, 2017: member

I’ve thought about this a good bit lately, sorry for the delayed response.

After rewriting this about a dozen different ways now, I’m now convinced that the approach in #9608 is the way to go.

A dispatcher with registered functions and a state checker (like you’ve done here, and like I attempted as well) seems like the obviously correct approach, but once implemented, it’s far less straightforward than the simplistic #9608.

We can avoid some duplication and keep some of the whitelist functionality here with a few helper functions like CanProcessWhileImporting(command).

A dumb if/then/else parser seems icky, but I think it’s the least likely to cause us issues in the future :(

JeremyRubin commented at 0:16 am on September 23, 2017: contributor

A dumb if/then/else parser seems icky, but I think it’s the least likely to cause us issues in the future :(

Can you clarify the kinds of future issues you’re worried about? I think something like #9608 is more likely to cause issues. I think verifying correctness of changes to the processing logic is more complicated under #9608. I’d love to understand what other future considerations you’re making though.

A dispatcher with registered functions and a state checker (like you’ve done here, and like I attempted as well) seems like the obviously correct approach, but once implemented, it’s far less straightforward than the simplistic #9608.

I think that there may be a similarity bias. #9608 is more similar to the existing code, which makes it more straightforward. But my guess would be that #10145 is much easier to understand for anyone new to the project/seeing the code for the first time.

Personally, I think this design has the following benefits over the ‘simpler’ design:

Easier to ‘prove correct’ design/write exhaustive tests for
Extensible to per connection handler tables
Extensible to middlewares (e.g., a statistics middleware or maybe an encryption one)
Easier to add new message types & less chance of hitting an unforseen edge case
Lower latency dispatch

theuni commented at 8:26 pm on September 25, 2017: member

Can you clarify the kinds of future issues you’re worried about? I think something like #9608 is more likely to cause issues. I think verifying correctness of changes to the processing logic is more complicated under #9608. I’d love to understand what other future considerations you’re making though.

Because of all of the pesky interactions with our own state, we can only go so far in laying out the rules up-front.

For example, someone could be forgiven for assuming (based on the whitelist here) that INVs are processed while importing/reindexing, though in reality, there’s only a tiny wallet interaction.

Similarly, IsInitialBlockDownload() isn’t handled by the dispatcher, leaving it up to individual messages to decide. And that’s as it should be, because each message will have their own criteria for that.

So if the rules can’t be completely enumerated, we’re only spreading them out and adding more places to check (or miss). The current behavior (and that of #9608) is to only check for proper handshake behavior outside of individual message parsing. That basically boils down to:

0if (!node->nVersion && strCommand != NetMsgType::VERSION) return;
1else if (!node->fSuccessfullyConnected && strCommand != NetMsgType::VERACK) return;
2Dispatch(strCommand);

I might be convinced that a slimmed-down version of this that only performed ^^ checks before dispatching would be reasonable.

MarcoFalke added the label Needs rebase on Jun 6, 2018

laanwj referenced this in commit 021dce935a on Aug 25, 2018

MarcoFalke commented at 4:31 pm on August 25, 2018: member

Closing for now. Let me know when you want to continue working on this

MarcoFalke closed this on Aug 25, 2018

JeremyRubin commented at 6:43 pm on September 16, 2018: contributor

@MarcoFalke my read was that there was not consensus that this approach was welcome.

If it’s worth refreshing, I’d be happy to.

laanwj removed the label Needs rebase on Oct 24, 2019

PastaPastaPasta referenced this in commit c261173669 on Jan 25, 2020

UdjinM6 referenced this in commit 31066138c0 on Jan 31, 2020

UdjinM6 referenced this in commit e5e3572e9d on Feb 3, 2020

FornaxA referenced this in commit fa74145cee on Jul 6, 2020

ckti referenced this in commit 77002d0909 on Mar 28, 2021

MarcoFalke locked this on Dec 16, 2021

Separate Contextual checks and handling & switch on enum in net_processing.cpp #10145