This refactors the code how incoming messages gets handled. It adds flexibility for adding new transport types (encryption handshake, new message format).
This is a subset of #14032 and a pre-requirement for BIP151.
This refactors the code how incoming messages gets handled. It adds flexibility for adding new transport types (encryption handshake, new message format).
This is a subset of #14032 and a pre-requirement for BIP151.
16 | +// base class for format agnostic network messages 17 | +class NetMessageBase 18 | +{ 19 | +public: 20 | + CDataStream vRecv; // received message data 21 | + int64_t nTime; // time (in microseconds) of message receipt.
According to https://github.com/bitcoin/bitcoin/blob/master/doc/developer-notes.md these should be prefixed m_.
I kept it because it is moved code and to simplify the review.
This is moved code right?
Any reason to not use unique_ptr instead? I don't see sharing of message ownership.
I think shared is fine here and does allow more flexible handling in future with little cost (ref counting)
<!--e57a25ab6845829454e8d69fc972939a-->
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
<!--174a7506f384e20aa4161008e828411d-->
Reviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
utACK 59a7ce0f9de950b64c5e89dd29ebcd0b607b982e
Adding an extra memory allocation for every network message isn't free. I'm concerned that we may be falling into overusing sharedptr "just in case", which is bad because it leaks performance in a diffuse way that doesn't show up well in profiles. Please don't use sharedptr where unique will work. If something new comes in in the future that needs a sharedptr then we can switch then.
@gmaxwell unique_ptr and shared_ptr have the same number of allocations (if make_shared is used, at least), though shared_ptr uses more memory (both for the pointer itself and for the allocated memory), and has atomics to update the refcounts (which may reduce performance).
To be clear, I agree we should avoid shared_ptr unless there is a good reason.
754 | + if (vRecvMsg.empty() || vRecvMsg.back()->complete()) { 755 | + vRecvMsg.emplace_back(std::make_shared<CNetMessage>(Params().MessageStart(), SER_NETWORK, INIT_PROTO_VERSION)); 756 | + } 757 | 758 | - CNetMessage& msg = vRecvMsg.back(); 759 | + CNetMessageRef msg = vRecvMsg.back();
nit: In commit " Use shared pointers for CNetMessage handling " you could keep this a plain reference to keep the diff smaller, I think.
3014 | + pfrom->nProcessQueueSize -= msgs.front()->vRecv.size() + CMessageHeader::HEADER_SIZE; 3015 | pfrom->fPauseRecv = pfrom->nProcessQueueSize > connman->GetReceiveFloodSize(); 3016 | fMoreWork = !pfrom->vProcessMsg.empty(); 3017 | } 3018 | - CNetMessage& msg(msgs.front()); 3019 | + CNetMessageRef msg(msgs.front());
Same here
13 | +/** Maximum length of incoming protocol messages (no message over 4 MB is currently acceptable). */ 14 | +static const unsigned int MAX_PROTOCOL_MESSAGE_LENGTH = 4 * 1000 * 1000; 15 | + 16 | +// base class for format agnostic network messages 17 | +class NetMessageBase 18 | +{
Implement a virtual destructor to protect against undefined behavior in the event that an object of a derived NetMessageBase class with a destructor is destroyed through a pointer to this base class.
virtual ~NetMessageBase() {}
Done
29 | + virtual uint32_t getMessageSize() const = 0; //returns 0 when message has not yet been parsed 30 | + virtual uint32_t getMessageSizeWithHeader() const = 0; //return complete message size including header 31 | + 32 | + virtual std::string getCommandName() const = 0; //returns the command name. Returns an empty string when no command name is supported 33 | + 34 | + void SetVersion(int nVersionIn)
Make SetVersion a virtual member function to be sure that derived implementations are called when called via a pointer to this base class.
virtual void SetVersion(int nVersionIn)
38 | + 39 | + virtual int read(const char* pch, unsigned int nBytes) = 0; //parse bytes 40 | + 41 | + virtual bool verifyMessageStart() const = 0; 42 | + virtual bool verifyHeader() const = 0; 43 | + virtual bool verifyChecksum(std::string& error) const = 0;
The developer notes state that for new symbols it is preferred that: "Class names, function names and method names are UpperCamelCase (PascalCase)"
virtual bool VerifyMessageStart() const = 0;
virtual bool VerifyHeader() const = 0;
virtual bool VerifyChecksum(std::string& error) const = 0;
Can you check if all the new method names are UpperCamelCase?
18 | +{ 19 | +public: 20 | + CDataStream vRecv; // received message data 21 | + int64_t nTime; // time (in microseconds) of message receipt. 22 | + 23 | + NetMessageBase(const CMessageHeader::MessageStartChars& pchMessageStartIn, int nTypeIn, int nVersionIn) : vRecv(nTypeIn, nVersionIn)
Remove the const CMessageHeader::MessageStartChars& pchMessageStartIn argument from the constructor's argument list?
NetMessageBase(int nTypeIn, int nVersionIn) : vRecv(nTypeIn, nVersionIn)
104 | + int readHeader(const char* pch, unsigned int nBytes); 105 | + int readData(const char* pch, unsigned int nBytes); 106 | + 107 | + bool verifyMessageStart() const; 108 | + bool verifyHeader() const; 109 | + bool verifyChecksum(std::string& error) const;
Use override to ensure that these methods are overriding virtual methods from the base class.
bool VerifyMessageStart() const override;
bool VerifyHeader() const override;
bool VerifyChecksum(std::string& error) const override;
65 | + in_data = false; 66 | + nHdrPos = 0; 67 | + nDataPos = 0; 68 | + } 69 | + 70 | + bool complete() const
Specify this method as override
bool complete() const override
72 | + if (!in_data) 73 | + return false; 74 | + return (hdr.nMessageSize == nDataPos); 75 | + } 76 | + 77 | + uint32_t getMessageSize() const
Specify this method as override
uint32_t getMessageSize() const override
80 | + return 0; 81 | + } 82 | + return hdr.nMessageSize; 83 | + } 84 | + 85 | + uint32_t getMessageSizeWithHeader() const
Specify this method as override
uint32_t getMessageSizeWithHeader() const override
85 | + uint32_t getMessageSizeWithHeader() const 86 | + { 87 | + return hdr.nMessageSize + CMessageHeader::HEADER_SIZE; 88 | + } 89 | + 90 | + std::string getCommandName() const
Specify this method as override
std::string getCommandName() const override
92 | + return hdr.pchCommand; 93 | + } 94 | + 95 | + const uint256& GetMessageHash() const; 96 | + 97 | + void SetVersion(int nVersionIn)
Specify as override
void SetVersion(int nVersionIn) override
98 | + { 99 | + hdrbuf.SetVersion(nVersionIn); 100 | + NetMessageBase::SetVersion(nVersionIn); 101 | + } 102 | + 103 | + int read(const char* pch, unsigned int nBytes);
Specify as override
int read(const char* pch, unsigned int nBytes) override;
107 | + bool verifyMessageStart() const; 108 | + bool verifyHeader() const; 109 | + bool verifyChecksum(std::string& error) const; 110 | +}; 111 | + 112 | +typedef std::shared_ptr<NetMessageBase> NetMessageBaseRef;
Prefer using over typedef?
using NetMessageBaseRef = std::shared_ptr<NetMessageBase>;
108 | + bool verifyHeader() const; 109 | + bool verifyChecksum(std::string& error) const; 110 | +}; 111 | + 112 | +typedef std::shared_ptr<NetMessageBase> NetMessageBaseRef; 113 | +typedef std::shared_ptr<NetMessage> NetMessageRef;
NetMessageRef doesn't seem to be used. Can NetMessageRef be deleted?
@sipa, @gmaxwell:
The reason why I used shared_ptr over unique_ptr is the polymorphism I'd like to use for the net messages and I once read that shared_ptr are recommended for polymorphic inheritance.
However, I changed this PR now to use unique_ptr.
Thanks @251Labs for the review, implemented all your suggestions.
574 | + CDataStream vRecv; // received message data 575 | + int64_t nTime; // time (in microseconds) of message receipt. 576 | + 577 | + NetMessageBase(int nTypeIn, int nVersionIn) : vRecv(nTypeIn, nVersionIn) 578 | + { 579 | + nTime = 0;
could move to init list , nTime(0).
Fix commit message "Use shared pointers for CNetMessage handling".
Fixed commit message issue reported by @promag
632 | @@ -681,6 +633,7 @@ class CNode 633 | 634 | mapMsgCmdSize mapSendBytesPerMsgCmd; 635 | mapMsgCmdSize mapRecvBytesPerMsgCmd; 636 | + void RecordRecvBytesPerMsgCmd(const std::string& cmd, uint32_t bytes);
nit: could be private
Fixed.
82 | + return hdr.nMessageSize; 83 | + } 84 | + 85 | + uint32_t GetMessageSizeWithHeader() const override 86 | + { 87 | + return hdr.nMessageSize + CMessageHeader::HEADER_SIZE;
Seems these other hdr accesses be guarded against !in_data, as they're relatively uninitialized data in that case, e.g.nMessageSize could be -1.
I basically moved this code part... I think its fine.
73 | + return data_hash; 74 | +} 75 | + 76 | +bool NetMessage::VerifyMessageStart() const 77 | +{ 78 | + return memcmp(hdr.pchMessageStart, Params().MessageStart(), CMessageHeader::MESSAGE_START_SIZE) == 0 ? true : false;
src/net_message.cpp:78:108: warning: redundant boolean literal in ternary expression result [readability-simplify-boolean-expr]
Fixed.
56 | + if (vRecv.size() < nDataPos + nCopy) { 57 | + // Allocate up to 256 KiB ahead, but never more than the total message size. 58 | + vRecv.resize(std::min(hdr.nMessageSize, nDataPos + nCopy + 256 * 1024)); 59 | + } 60 | + 61 | + hasher.Write((const unsigned char*)pch, nCopy);
src/net_message.cpp:61:18: warning: C-style casts are discouraged; use reinterpret_cast [google-readability-casting]
src/net_message.cpp:61:18: warning: do not use C-style cast to convert between unrelated types [cppcoreguidelines-pro-type-cstyle-cast]
It's moved code and the scope of this PR is not refactor.
579 | + nTime = 0; 580 | + } 581 | + virtual ~NetMessageBase() {} 582 | 583 | + virtual bool Complete() const = 0; 584 | + virtual uint32_t GetMessageSize() const = 0; //returns 0 when message has not yet been parsed
nit: missing space after // ?
1305 | @@ -1372,9 +1306,9 @@ void CConnman::ThreadSocketHandler() 1306 | size_t nSizeAdded = 0; 1307 | auto it(pnode->vRecvMsg.begin()); 1308 | for (; it != pnode->vRecvMsg.end(); ++it) { 1309 | - if (!it->complete()) 1310 | + if (!it->get()->Complete())
Could be (*it)->Complete() instead, but maybe what you've got is already clearer?
c2f9c04583a3ffcae4a54d85bb49def58e79cef2 looks reasonable to me
16 | +// base class for format agnostic network messages 17 | +class NetMessageBase 18 | +{ 19 | +public: 20 | + CDataStream vRecv; // received message data 21 | + int64_t nTime; // time (in microseconds) of message receipt.
nit, int64_t nTime{0}; and remove from the constructor.
Agree. Though I'd say scope of the PR is not to refactor code parts that where moved 1:1. Lets do further refactors in another PR.
SGTM.
utACK c2f9c04.
18 | +} 19 | + 20 | +int NetMessage::ReadHeader(const char* pch, unsigned int nBytes) 21 | +{ 22 | + // copy data to temporary parsing buffer 23 | + unsigned int nRemaining = 24 - nHdrPos;
I see the magic constant 24 here quite a bit. Would it be useful to extract this into an appropriately named constant? Perhaps HEADER_NUM_BYTES or equivalent?
Agree. Though I'd say scope of the PR is not to refactor code parts that where moved 1:1. Lets do further refactors in another PR.
utACK c2f9c04583a3ffcae4a54d85bb49def58e79cef2
ACK c2f9c04583a3ffcae4a54d85bb49def58e79cef2
42 | + virtual bool VerifyMessageStart() const = 0; 43 | + virtual bool VerifyHeader() const = 0; 44 | + virtual bool VerifyChecksum(std::string& error) const = 0; 45 | +}; 46 | + 47 | +//basic network message for the currently used unencryted p2p communication
Should be "unencrypted" :-)
Fixed.
Rebased.
utACK c2f9c04.
Is it really necessary to make CNetMessage a virtual class? This seems very much overkill. Only the network deserializers should change based on what protocol is used; we don't need to have that variability put a burden on every separate message.
Seems like you could just create a CNetMessageSerializer class (with different instances), to which some of the current CNetMessage logic is moved. This class is then responsible for turning the network byte stream into packets, and the other way around, but the CNetMessage objects remain independent of the protocol used.
@jonasschnelli Mind to follow up on @sipa's question? I have removed this from high priority for now, but happy to add back when you addressed the question.
Rebased
Approach NACK. This is overkill; the message parsing/serializing logic can just move out of CNetMessage to avoid turning it into a virtual class.
@sipa: Thanks for the review. Currently turning this into the approach you mentioned. I think it's doable though not super convinced right now due to the state necessary in parsing/serialization (hash, MAC). Until I'm finished I keep this rebased to not lose testability of other elements in the pipeline.
<!--cf906140f33d8803c4a75a2196329ecb-->Needs rebase
Superseded. Closing.