troygiorshev
commented at 2:10 am on July 14, 2020:
contributor
This PR introduces per-peer message capture into Bitcoin Core. 📓
Purpose
The purpose and scope of this feature is intentionally limited. It answers a question anyone new to Bitcoin’s P2P protocol has had: “Can I see what messages my node is sending and receiving?”.
Functionality
When the new debug-only command line argument -capturemessages is set, any message that the node receives or sends is captured. The capture occurs in the MessageHandler thread. When receiving a message, it is captured as soon as the MessageHandler thread takes it off the vProcessMsg queue. When sending, the message is captured just before it is pushed onto the vSendMsg queue.
Message capture is kept as minimal as possible to reduce the performance impact on the node. Messages are captured to a new message_capture folder in the datadir. Each peer gets its own subfolder, named with the peer's IP address and port. Inside, received and sent messages are captured into two binary files, msgs_recv.dat and msgs_sent.dat.
Because the messages are raw binary dumps, this PR includes a Python parsing tool to convert the binary files into human-readable JSON. This script has been placed on its own, out of the way, in the new contrib/message-capture folder. Its usage is simple and easily discovered via the autogenerated -h option.
Future Maintenance
I sympathize greatly with anyone who says “the best code is no code”.
The future maintenance of this feature will be minimal. The logic to deserialize the payload of the p2p messages exists in our testing framework. As long as our testing framework works, so will this tool.
Additionally, I hope that the simplicity of this tool will mean that it gets used frequently, so that problems will be discovered and solved when they are small.
FAQ
“Why not just use Wireshark”
Yes, Wireshark has the ability to filter and decode Bitcoin messages. However, the purpose of the message capture added in this PR is to assist with debugging, primarily for new developers looking to improve their knowledge of the Bitcoin protocol. This drives the design in a different direction than Wireshark, in two ways.

First, this tool must be convenient and simple to use. Using an external tool like Wireshark requires setup and interpretation of the results. To a new user who doesn't necessarily know what to expect, this is unnecessary difficulty. This tool, on the other hand, "just works": turn on the command line flag, run your node, run the script, read the JSON.

Second, because this tool is being used for debugging, we want it to be as close to the true behavior of the node as possible. A lot can happen in the SocketHandler thread that would be missed by Wireshark.
Additionally, if we use Wireshark, we are at the mercy of whoever is maintaining the Bitcoin protocol dissector in Wireshark, in terms of both accuracy and currency. As the many previous attempts to include Bitcoin in Wireshark show (google "bitcoin dissector"), this is easier said than done.
Lastly, I truly believe that this tool will be used significantly more by being included in the codebase. It’s just that much more discoverable.
fanquake added the label
P2P
on Jul 14, 2020
fanquake added the label
Utils/log/libs
on Jul 14, 2020
In case you wanted to do this because the import (sys.path.append('test/functional')) is relative to the pwd: you could make it relative to this file, which would be more stable.
Also, a test wouldn’t hurt.
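The suggested fix might look something like the following sketch (the relative hops to test/functional are an assumption about where the script lives, not taken from the PR):

```python
import os
import sys

# Hypothetical sketch: append the test framework directory relative to
# this script's own location, rather than the current working directory.
SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
FRAMEWORK_DIR = os.path.normpath(
    os.path.join(SCRIPT_DIR, "..", "..", "test", "functional"))
sys.path.append(FRAMEWORK_DIR)
```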
troygiorshev
commented at 6:33 pm on July 22, 2020:
done
(That sort of format string is something I’m looking forward to in 3.6 😄)
troygiorshev
commented at 6:33 pm on July 22, 2020:
A test is in the works
in src/net_processing.cpp:3640 in 38c6f10d5c (outdated)
troygiorshev
commented at 11:28 am on July 14, 2020:
ah thanks fixed!
naumenkogs
commented at 7:48 am on July 14, 2020:
member
Concept ACK the high-level idea. I think this feature may be quite useful while debugging, and it requires very few lines of code.
I have two questions though:
How to better integrate it with the existing codebase? Should the format be similar to what we have now:
received: addr (30003 bytes) peer=0? Or maybe it doesn’t matter.
Is there any threat of this feature being used for DoS? Perhaps on a system where file create is expensive, an attacker could disconnect/reconnect (from a new net address?) to make us create a new file? Maybe the file should be created after 1 minute of connection lifetime? (Although the messages from the very beginning should not be lost).
DrahtBot
commented at 7:58 am on July 14, 2020:
contributor
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
Conflicts
Reviewers, this pull request conflicts with the following ones:
#20404 (Remove names from translatable strings by hebasto)
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
use fs::create_directories (which is currently just an alias for boost::filesystem, but means that std::filesystem can be substituted in when we upgrade our C++ version)
Nit: if message logging were enabled by a non-empty -logmessages=<LogMessageDir>, we would have a separate LogMessagesDir instead of logging into DataDir by default. And DataDir should always be distinct from LogDir, imho. See #19419.
troygiorshev
commented at 1:11 pm on July 17, 2020:
This is ok. As per the boost docs: “Creation failure because [base_path] resolves to an existing directory shall not be treated as an error.”
troygiorshev
commented at 2:06 pm on July 17, 2020:
I can’t imagine this is a performance hit, but if anyone is worried I guess this could be moved to the CNode’s constructor. I would want to first be really sure that a node’s address won’t ever change.
troygiorshev
commented at 0:51 am on July 20, 2020:
Nit: if message logging were enabled by a non-empty -logmessages=<LogMessageDir>, we would have a separate LogMessagesDir instead of logging into DataDir by default. And DataDir should always be distinct from LogDir, imho. See #19419.
I’ll consider this, having the log directory configurable would be nice.
in src/net_processing.cpp:3641 in dc393b30e9 (outdated)
Can this logic be moved to ReceiveMsgBytes so that all the message dumping is contained within the net layer?
troygiorshev
commented at 4:01 pm on July 14, 2020:
Yes! It’s unfortunate that we have some MessageHandler code in net.cpp. Right now all of the logging is done in MessageHandler, and I think we should keep it that way. It would be nice to move PushMessage to net_processing some day.
Yes, you’re right. Even though PushMessage is in net.cpp, it’s always executed on the MessageHandler thread.
jnewbery
commented at 8:28 am on July 14, 2020:
member
Concept ACK. I’m looking forward to playing around with this.
My first thought is that this would be much more useful if the user was able to specify an individual peer to log messages for, either through a command line argument -logmessages=<ip_addr> or through an RPC to enable and disable logging. Perhaps that can be done in a follow-up PR?
troygiorshev
commented at 12:44 pm on July 22, 2020:
I’d prefer to keep this as it is. I understand (and agree with) ES.21 completely, but I’m trying to treat time as an argument, not a local, which is why it’s right at the top. I would love to pass it in instead but practically can’t (see here).
If anyone is strongly against this I’ll move it.
practicalswift
commented at 5:54 pm on July 14, 2020:
contributor
Concept ACK: Very nice feature! Looking forward to using it. Thanks for adding it.
Feature request: Would it be possible to add a flag to the Python script that writes the recorded messages to one file each in the binary format the ProcessMessage(…) fuzzers expect (src/test/fuzz/process_message.cpp and src/test/fuzz/process_messages.cpp)?
Ideally with a file naming convention matching that of what libFuzzer and many other fuzzers generate: the file name being the SHA-1 of the file content.
That way this feature could be used to record live network messages that could be used directly as fuzzing inputs. A great way to generate a fuzzing seed corpus. That would be super neat! :)
troygiorshev force-pushed
on Jul 20, 2020
troygiorshev
commented at 2:39 am on July 20, 2020:
contributor
git range-diff master cf5966a 9898fa1
Made suggested changes and fixed the Travis problem.
Q: Is it important that this time is equal to the receive time of the message?
If yes, you could pass in the time from the caller. (msg.m_time)
troygiorshev
commented at 12:39 pm on July 22, 2020:
The exact time doesn’t matter; this is mainly used for sorting the messages later. All that’s important is that messages sent later have a later time. To avoid races I only call LogMessage from the MessageHandler thread. When sending a message in PushMessage, we don’t have anything like msg.m_time until the message is in the SocketHandler thread.
MarcoFalke
commented at 5:55 am on July 20, 2020:
member
Looks surprisingly simple, so Concept ACK
troygiorshev
commented at 12:48 pm on July 21, 2020:
contributor
How to better integrate it with the existing codebase? Should the format be similar to what we have now:
received: addr (30003 bytes) peer=0? Or maybe it doesn’t matter.
IMO that format is good for a debug log, where things need to be compact on a single line. For this I much prefer JSON.
Is there any threat of this feature being used for DoS? …
At the moment, with this locked behind a debug flag, I’m not worried about DoS vectors. For both debugging and learning someone should be at the keyboard and the node shouldn’t be working with anything real. Additionally, I imagine that if writing a very small amount to disk in this way (every time a message is sent or received) is a DoS vector then there are many more serious ones available.
That said, this is worth consideration, especially if we think this will one day be used for wider purposes.
troygiorshev force-pushed
on Jul 22, 2020
troygiorshev force-pushed
on Jul 22, 2020
troygiorshev
commented at 7:03 pm on July 22, 2020:
contributor
git range-diff master 9898fa1 cbb154e
Trivial Rebase
Added a commit at the beginning cleaning up PushMessage’s and ProcessMessages’s if statements
Please avoid changing unrelated lines (unless there are good reasons for doing so). It makes it harder to reason about the actual changes if one includes unrelated whitespace changes in the same commit :)
Applies also below and throughout this PR.
troygiorshev
commented at 2:05 am on July 23, 2020:
I made these changes in response to this comment. As far as I can tell they’re all in the first commit 84a79d61cef5c9386597869adccda4aa5992cc96 (which only contains the unrelated changes), and they don’t show up at all in git range-diff master 9898fa1 cbb154e. i.e. on my machine I haven’t mixed any real changes with whitespace changes.
Are you seeing something different? Is there a better way I could have done this?
This seems reasonable. Although the general advice is to not fix style on lines that you’re not otherwise touching, if it’s all contained within a single cleanup commit, then I think that’s also fine.
Might you add something to this effect to the commit title so people know that? I didn’t get that from reading “Clean PushMessage and ProcessMessages”. Maybe “Whitespace only changes in PushMessage and ProcessMessages”
in src/net_processing.cpp:4031 in cbb154e5d5 (outdated)
@@ -3648,14 +3648,12 @@ bool PeerLogicValidation::ProcessMessages(CNode* pfrom, std::atomic<bool>& interruptMsgProc)
     if (!pfrom->orphan_work_set.empty()) return true;

     // Don't bother if send buffer is too full to respond anyway
-    if (pfrom->fPauseSend)
-        return false;
+    if (pfrom->fPauseSend) return false;
troygiorshev
commented at 6:47 pm on July 23, 2020:
contributor
git range-diff master cbb154e 863c6b5
Added Test
Fixed Travis warning
Made Time Mockable
troygiorshev marked this as ready for review
on Jul 23, 2020
theStack
commented at 7:45 pm on July 26, 2020:
contributor
Huge Concept ACK! Very nice idea with simple implementation, I’m also pretty sure this will be very helpful for debugging purposes. Will definitely code-review that within the next days.
Just one detail I find quite odd: why create two files instead of just one and not rather include the direction as part of the serialization for each message? Is it only about saving disk space (1 byte per message) or is there another reason I’m not aware of? I’ve never seen a logging/sniffing tool that divides up into multiple files before and tend to think it would only complicate re-assembling the whole send/receive order later.
troygiorshev
commented at 12:46 pm on July 27, 2020:
contributor
Good question; I considered a few ways of doing this. Just to be clear, I create two files per peer. Partially, yes, this saves 1 byte per message. I could have also just logged everything into one file, which meant I would have had to save the peer address alongside each message as well. Ultimately, I chose the current way with the intention of simplifying the usage of the parser. As it stands, the parser is a simple “just give me what you want me to parse” sort of program. This allows the user to take advantage of globbing when selecting which files to parse (e.g. **/*.dat or **/msgs_recv.dat). Additionally the parser, being written in python, is actually noticeably slow. Forcing it to process all of the messages and then filter would be unnecessarily inconvenient.
theStack
commented at 12:53 pm on July 28, 2020:
contributor
why create two files instead of just one
Generally I’d say it makes sense to divide up into files/folders by criteria that I would likely use as a frequent search key for accessing the logging data. Would I often want to access logs for a certain peer? Absolutely. Would I want to see only incoming or outgoing messages for a certain peer? Very unlikely, as the resulting data is probably not very valuable without the whole request/response stream in both directions. On the other hand, your arguments for simplification, efficiency (if looking only at incoming or outgoing messages really is such a frequent operation) and flexibility are also very strong, hence this solution is also okay with me.
I reviewed the code in more detail and noticed that there may be a serialization endianness problem, see the comments soon following below.
This way of serialization doesn’t seem to be endian-agnostic, i.e. it depends on the architecture whether this is written out in big or little endian format. In the parsing script you assume both time and data size are stored in little endian, hence this needs to be enforced here. I think ser_writedata32 (see serialization.h) can be used for this.
It’s a pity though that the CI tests don’t fail, I think this is because we don’t have any testing instance where the architecture uses big endian byte order :/
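On the parsing side, the fix amounts to decoding the header with an explicitly little-endian layout so the result is identical on any host architecture. A sketch (the field sizes — 8-byte time, 12-byte message type, 4-byte length — are assumptions matching the parser constants discussed in this thread):

```python
import struct

# Parse the capture header with an explicit little-endian struct format,
# independent of the host byte order.
HEADER_FMT = "<q12sI"                      # int64 time, 12 raw bytes, uint32 length
HEADER_SIZE = struct.calcsize(HEADER_FMT)  # 24 bytes

def parse_header(raw: bytes):
    time_us, msgtype_raw, length = struct.unpack(HEADER_FMT, raw[:HEADER_SIZE])
    msgtype = msgtype_raw.split(b"\x00", 1)[0].decode("ascii")
    return time_us, msgtype, length
```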
troygiorshev
commented at 1:17 pm on July 30, 2020:
Thanks for catching this! When fixing it I considered if it was better to instead change the test to follow the system endianness (with python’s sys.byteorder). I think fixing a particular endianness for the .dat files, as temporary as they are, is better. Someone might generate them on a BE system and then parse them on a LE system one day…
in src/net.cpp:2881 in 0c31d58f7d (outdated)
+ f.write(msg_type.data(), msg_type.length());
+ for (auto i = msg_type.length(); i < CMessageHeader::COMMAND_SIZE; ++i) {
+     f << '\0';
+ }
+ uint32_t size = data.size();
+ f.write((const char*)&size, sizeof(size));
nit: pep8 says at least 2 spaces before an inline comment :grimacing:
troygiorshev
commented at 8:11 pm on August 4, 2020:
oof thanks, TIL
(I’m surprised nothing caught this)
jonatack
commented at 10:22 am on August 19, 2020:
at HEAD 4fc2e3906, there are still a number of misformatted inline comments and a few other nits that you would see by running the new python files through pycodestyle and black (with black, I tend to ignore the single-to-double quote changes and some of the newlines for brackets, but otherwise it’s useful)
in contrib/peer-logging/peer-logging-parser.py:60 in 6095431033 (outdated)
"recv": true and "recv": false don’t seem very friendly here. Perhaps "direction": "received" and "direction": "sent" is better?
troygiorshev
commented at 8:21 pm on August 4, 2020:
I’ll meet you in the middle and do "direction": "recv" and "direction": "sent" :)
in contrib/peer-logging/peer-logging-parser.py:57 in 6095431033 (outdated)
+ length = int.from_bytes(tmp_header.read(LENGTH_SIZE), "little")  # type: int
+ if command not in MESSAGEMAP:
+     continue  # For now just skip unrecognized messages
+ msg = MESSAGEMAP[command]()
+ msg.deserialize(f_in)
+ msg_dict = to_jsonable(msg)
You should deserialize the body of the message to a sub-object. That means if any message types contain fields called ‘msgtype’, ’time’, ’length’ or ‘recv’, they won’t conflict with the metadata.
troygiorshev
commented at 8:26 pm on August 4, 2020:
done, and improved the names and order
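The resulting record shape might look like this sketch (hypothetical helper; the key names follow the ones discussed in this thread): metadata keys at the top level, deserialized message fields in a sub-object, so a message field named "time" can no longer collide with the metadata.

```python
# Sketch of the nested record layout that avoids key collisions between
# capture metadata and deserialized message fields.
def make_record(msgtype: str, direction: str, time_us: int,
                length: int, body: dict) -> dict:
    return {
        "msgtype": msgtype,
        "direction": direction,
        "time": time_us,
        "length": length,
        "body": body,  # deserialized message fields live here
    }
```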
jnewbery
commented at 5:21 pm on July 31, 2020:
member
I’ve been playing around with this and it’s great. Just a few small comments so far.
The parser is printing hashes as large floats. Is it possible to print the hex representation?
troygiorshev
commented at 2:21 pm on August 5, 2020:
contributor
git range-diff master 6095431 0547b23
Values coming from uint256 variables print as hex correctly
Restructured the output, renamed some fields
Some whitespace fixes
Small change to msg_generic (unused) in messages.py
in test/functional/p2p_peer_logging.py:37 in 0547b23754 (outdated)
+ - Message Type
+ - We can add new message types
+
+ We're ignoring these because they're simply too brittle to test here.
+ """
+ with open(dat_file, 'rb') as f_in:
MarcoFalke
commented at 2:31 pm on August 5, 2020:
Instead of reimplementing the parser here, couldn’t this just call contrib/peer-logging/peer-logging-parser.py {dat_file}?
+1 on code deduplication. The included timestamps check (all timestamps are past genesis block) can be done on the result rather than directly in the parser.
I think it’s much better not to assume that the contrib directory is available. Functional tests can be run when the build directory is out of tree and the source directory is not available. It makes sense to me to have a limited parser in the test case rather than importing from somewhere else.
I agree that the dependency on the contrib directory is a drawback, but then on the other hand we already have a test importing from contrib: feature_loadblock.py (uses contrib/linearize scripts). Is there really a need to run functional tests without the other sources available? They are all part of the same repository.
Hm, so the usage pattern in this case is “copy the test/functional folder somewhere” and executing from there, decoupled from the repository, should still work? Still not convinced why anyone should ever want/do that (maybe I’m missing something there).
Search for “out of tree builds” in this repo. You’ll see lots of issues/PRs for running tests out of tree.
I agree that the dependency on the contrib directory is a drawback, but then on the other hand we already have a test importing from contrib: feature_loadblock.py (uses contrib/linearize scripts).
Yes, and I think that’s a terrible pattern that we shouldn’t repeat. Importing code from contrib essentially makes those files part of the test framework.
To my understanding the primary point of out of tree builds is separation between the build artifacts (i.e. everything spit out by make, like binaries, manpages etc.) and the source folder. I guess our disagreement stems from the fact that in your view, contrib is part of the source folder, while in my view it isn’t – it only contains scripts and no compilation units, hence nothing from it ends up in an (out of tree) build folder. That could change in the future though, so I can see your point.
MarcoFalke
commented at 11:47 am on August 7, 2020:
The functional test scripts themselves live in the source directory. The build directory has a symlink to it. As long as the script in contrib is also symlinked to the build dir, this should not affect out-of-tree builds at all. And all CI scripts use out-of-tree builds, so if this led to issues, it would be noticed immediately.
troygiorshev
commented at 8:41 pm on August 7, 2020:
I’ve elaborated on why I did this below. If anyone is still against this choice, please let me know!
in src/net_processing.cpp:3808 in fac22eae7c (outdated)
I think the cast to size_t is not needed here? msg.m_recv is an instance of CDataStream, its size() method returns CDataStream::size_type which is a typedef to a std::vector::size_type, which again is size_t.
The “command” terminology should not be used anymore for network messages; we just call it “message type” now (see e.g. PRs #18533, #18610, #18937). Replacement suggestions: s/COMMAND_SIZE/MSGTYPE_SIZE/ and s/command/msg_type/ (in the parsing code below).
troygiorshev
commented at 1:02 pm on August 6, 2020:
Thanks for catching this! This “command -> message type” switch is a monumental effort that I’m happy to be a part of.
I’m going to choose msgtype not msg_type for the same reasons you did in #18610 :)
in contrib/peer-logging/peer-logging-parser.py:80 in 78959de4b4 (outdated)
+ tmp_header = BytesIO(tmp_header_raw)
+ time = int.from_bytes(tmp_header.read(TIME_SIZE), "little")  # type: int
+ command = tmp_header.read(COMMAND_SIZE).split(b'\x00', 1)[0]  # type: bytes
+ length = int.from_bytes(tmp_header.read(LENGTH_SIZE), "little")  # type: int
+ if command not in MESSAGEMAP:
+     continue  # For now just skip unrecognized messages
Could check here after the deserialization if the f_in was advanced by the expected length (what is currently done in the functional test), e.g. something like:
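A hedged sketch of such a check (hypothetical helper name): remember the stream position, deserialize, then verify exactly the header-recorded number of payload bytes was consumed.

```python
from io import BytesIO

# Hypothetical helper: deserialize a message and verify the stream
# advanced by exactly the length recorded in the capture header.
def deserialize_checked(msg, f_in, length: int):
    start = f_in.tell()
    msg.deserialize(f_in)
    consumed = f_in.tell() - start
    if consumed != length:
        raise ValueError(
            "expected {} payload bytes, consumed {}".format(length, consumed))
    return msg
```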
troygiorshev
commented at 1:23 pm on August 6, 2020:
I like this a lot, thanks
troygiorshev force-pushed
on Aug 6, 2020
troygiorshev
commented at 1:58 pm on August 6, 2020:
contributor
git range-diff master 0547b23 4205c56
peer logging -> message logging
command -> msgtype
Removed unneeded size_t cast
Parser now properly handles unrecognized messages
Parser now verified payload length
troygiorshev
commented at 2:00 pm on August 6, 2020:
contributor
Regarding mini_parser in the functional test: discussion here. The popular question is “Why reimplement the parser here, as opposed to just calling contrib/message-logging/message-logging-parser.py?”
The main reason, which I agree with, is brought up by @jnewbery.
I think it’s much better not to assume that the contrib directory is available. Functional tests can be run when the build directory is out of tree and the source directory is not available. It makes sense to me to have a limited parser in the test case rather than importing from somewhere else.
Additionally, I think that this separate mini_parser makes it clear, at a glance, what exactly is being tested. As noted in the module docstring of p2p_message_logging.py, the message logging parser isn’t tested in this test. This test is for LogMessage only. I would hate to run into a Hubble Space Telescope-like problem where LogMessage is wrong but the message logging parser is wrong in the same way, so the errors go unnoticed.
in src/net_processing.cpp:3808 in 6f0e88752d (outdated)
Sorry that I didn’t notice this in the earlier review today, but: if you rebase on master, this will unlock the nice MakeUCharSpan (see commit e63dcc3a6752e7d406e7a650c2d6c2e95cd39aab, introduced with PR #19326 that was merged 3 days ago), so you won’t need to cast anything anymore. :tada:
troygiorshev
commented at 8:40 pm on August 7, 2020:
Thanks, I’ll do this! Another upcoming improvement is #19660, but I’ll try and leave that for a follow-up as not to increase the scope too far here.
troygiorshev force-pushed
on Aug 7, 2020
troygiorshev
commented at 9:38 pm on August 7, 2020:
contributor
theStack
commented at 9:00 pm on August 11, 2020:
contributor
I tested the logging parser on my system (running Python 3.5.2) and got the following error:
$ contrib/message-logging/message-logging-parser.py /tmp/bitcoin_func_test_9sw28m5f/node0/regtest/message_logging/127.0.0.1\:33370/*.dat
/tmp/bitcoin_func_test_9sw28m5f/node0/regtest/message_logging/127.0.0.1:33370/msgs_recv.dat
<class 'pathlib.PosixPath'>
Traceback (most recent call last):
  File "contrib/message-logging/message-logging-parser.py", line 129, in <module>
    main()
  File "contrib/message-logging/message-logging-parser.py", line 117, in main
    process_file(log, messages, "recv" in log.stem)
  File "contrib/message-logging/message-logging-parser.py", line 73, in process_file
    with open(path, 'rb') as f_in:
TypeError: invalid file: PosixPath('/tmp/bitcoin_func_test_9sw28m5f/node0/regtest/message_logging/127.0.0.1:33370/msgs_recv.dat')
Turns out that open() can take os.PathLike objects only from Python 3.6 onwards, see https://stackoverflow.com/a/42694113
Since we are still on Python 3.5 (though there is already a PR for changing to 3.6, see #19504), this should be fixed by passing str(path) to open() instead.
troygiorshev force-pushed
on Aug 13, 2020
troygiorshev
commented at 11:48 pm on August 13, 2020:
contributor
git range-diff master 555e48a 4fc2e39
@theStack thanks for catching that! I had been running this all under the wrong environment the whole time. (I’ll plug conda here: never worry about your python environment again, just make sure to pick the right one.)
TL;DR: if you want an absolute path to a file that may not exist, in Python 3.5, always use Path.cwd() / Path(arg). This is due to the “strictness” of resolve() discussed here.
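That trick in code (hypothetical helper name), for the case where Path.resolve() would raise on a nonexistent file under Python 3.5:

```python
from pathlib import Path

# Build an absolute path to a file that may not exist, without calling
# Path.resolve() (which is "strict" on Python 3.5 and raises for
# missing files).
def absolute_path(arg: str) -> Path:
    p = Path(arg)
    return p if p.is_absolute() else Path.cwd() / p
```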
jb55
commented at 8:10 pm on August 17, 2020:
contributor
I don’t find your “why not Wireshark?” very convincing… If anything, it made me less sure this should get merged. Wireshark is a fairly simple and standard tool. Glancing at the code, this PR is basically just a less-user-friendly reinvention thereof. If there are concerns with maintenance of the Wireshark dissector, as I understand it, we could fork it and maintain it out of Wireshark’s codebase.
I’m inclined to agree. Also, with a more general system like tracepoints we could do low overhead p2p (and other subsystem) tracing/filtering/scripting without any custom code.
in test/functional/p2p_message_logging.py:60 in 4fc2e39063 (outdated)
+ self.extra_args = [["-logmessages"]]
+ self.setup_clean_chain = True
+
+ def run_test(self):
+     logdir = os.path.join(self.nodes[0].datadir, "regtest/message_logging")
+     # Connect an disconnect a node so that the handshake occurs
adamjonas
commented at 7:41 pm on August 18, 2020:
# Connect and disconnect a node so that the handshake occurs
troygiorshev
commented at 2:33 pm on August 20, 2020:
Thanks. Fixed
in contrib/message-logging/message-logging-parser.py:32 in 0d10af73df (outdated)
+ # As such, they are itemized here
+ # (These can be easily found by looking for calls to deser_uint256, deser_uint256_vector, and uint256_from_str in messages.py)
+ HASH_INTS = [
+     "blockhash",
+     "block_hash",
+     "hash",  # A few conflicts here
amitiuttarwar
commented at 11:18 pm on August 18, 2020:
conflicts with what?
troygiorshev
commented at 12:59 pm on August 20, 2020:
This part of the parser recognizes a uint256 by the name of the member. It doesn’t take into account the name of the class. In some classes (in messages.py), self.hash refers to one of these “int-encoded” uint256s, and in others it does not.
e.g.
COutPoint, self.hash is an “int-encoded” uint256
CTransaction, self.hash is a str
This is here to justify the isinstance(val, int) checks in to_jsonable but I think it’s causing more confusion than it’s helping. I’ll comment this somewhere else.
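For illustration, rendering such an "int-encoded" uint256 as the familiar 64-character hex string is straightforward (hypothetical helper, a sketch of the kind of fix discussed above):

```python
# Render an int-encoded uint256 as a zero-padded 64-character hex string
# instead of printing it as a huge number.
def uint256_to_hex(value: int) -> str:
    return format(value, "064x")
```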
amitiuttarwar
commented at 7:11 pm on August 20, 2020:
gotcha, sounds good
in test/functional/p2p_message_logging.py:62 in 4fc2e39063 (outdated)
+
+ def run_test(self):
+     logdir = os.path.join(self.nodes[0].datadir, "regtest/message_logging")
+     # Connect an disconnect a node so that the handshake occurs
+     self.nodes[0].add_p2p_connection(P2PDataStore())
+     self.nodes[0].disconnect_p2ps()
amitiuttarwar
commented at 11:31 pm on August 18, 2020:
I’m not following why this disconnect_p2ps() is needed here. The comment above says “so that the handshake occurs”, but this doesn’t clarify anything for me. do you mean the version handshake? why do we need to disconnect for the version handshake to occur?
my understanding of this test is that it parses through the sent & received data files to check the structure of each recorded message is valid. I added some logging to see what message types are recorded & from what I can tell its the same whether or not we disconnect. am I missing something?
I have the same question - by default, add_p2p_connection waits for verack.
My initial guess was that disconnecting the peer closed the files or unlocked a mutex, but I don’t think this is the case 🤔
troygiorshev
commented at 1:11 pm on August 20, 2020:
You’re both right, it isn’t needed.
I thought it was just good form to disconnect the peers at the end of the test. Makes it easier for someone to expand the test later.
I’ll fix the comment
amitiuttarwar
commented at 11:40 pm on August 18, 2020:
contributor
approach ACK. this is super cool! thank you for building it!
I’ve read over the commits, but don’t feel familiar enough with many of the components & relevant concerns (yet) to leave a proper review ACK. left a couple questions though.
I tried this out locally & it works great. I think it would be helpful to add some light documentation within the repo with the basics of how-to-use that you’ve mentioned on this PR. By basics I mean things like run node with -logmessages, the file structure of the message_logging dir & passing through contrib/message-logging/message-logging-parser.py to interpret.
The one trick I found very useful is using message_logging/[ip address]/*.dat as the arg for the message parser, since it interleaves the sent & received messages from the peer. I gleaned this from your comment earlier because I agree with @theStack about a common desired use case being access to all logs for a certain peer. Your reasoning behind having sent vs received as separate files makes perfect sense, and I’m happy that the message parser supports the use case nicely. My request here is something simple to make this interface more apparent.
thanks again for this PR! I’m excited to query my node more and level up my jq skills.
Yes, please disregard my comment, I misunderstood!
narula
commented at 0:00 am on August 19, 2020:
contributor
It pains me a bit that this circumvents the existing logging infrastructure. Why is it not the case this is all just logged to debug.log (like everything else) and the parser works on that, filtering peer messages?
As @jnewbery points out below, I totally missed that this was binary logging! I should have looked more carefully.
jnewbery
commented at 8:20 am on August 19, 2020:
member
It pains me a bit that this circumvents the existing logging infrastructure. Why is it not the case this is all just logged to debug.log (like everything else) and the parser works on that, filtering peer messages?
This PR dumps binary serialized messages (not text) to file, which is then post-processed by other software. The msgs_*.dat files can be enormous - hundreds of megabytes if blocks are being sent over that connection. Converting those messages to printable text and dropping them into the debug.log file would make them larger still, overwhelm the standard debug logging (eg a block message converted to hex would be millions of characters), and probably be a performance bottleneck (since all messages logged to debug.log would need to be deserialized and passed through tinyformat.h).
I think this also answers your other questions:
I don’t think “.dat” is the right extension. These are logs, not data, and should end in “.log”.
.log to me suggests printable text log files. These are binary files that need post-processing before they’re readable.
practicalswift
commented at 10:11 am on August 19, 2020:
contributor
Regarding the “why not use Wireshark?” discussion:
I don’t know about which use cases others are thinking about, but having this in master would help me a lot at least:
I will use this feature in ways not suitable for Wireshark: both for long-term permanent capture of messages for after-the-fact trouble-shooting/debugging, and also for capturing messages that can be fed as seeds into the net message fuzzers.
I’ve played with this PR a bit and it covers my needs: I certainly hope it will land in master :)
practicalswift
commented at 10:37 am on August 19, 2020:
Could this be moved to the start of the function right after the LogPrint(BCLog::NET, …) call to a.) avoid being run while we’re holding pnode->cs_vSend (which we don’t need for logging), and b.) make sure all `sending %s (%d bytes) peer=%d` `BCLog::NET` debug log entries are guaranteed to have a corresponding entry in the message capture file (in the absence of fatal logging errors)?
troygiorshev
commented at 1:41 pm on August 20, 2020:
Yup, it can, and I agree with you that it should be. The locking in PushMessage is a bit of a mess, which I fix up a little in #19673.
in
src/net_processing.cpp:3842
in
4fc2e39063 (outdated)
@@ -3841,6 +3839,10 @@ bool PeerLogicValidation::ProcessMessages(CNode* pfrom, std::atomic<bool>& inter
     }
     CNetMessage& msg(msgs.front());
+    if (gArgs.IsArgSet("-logmessages")) {
jonatack
commented at 10:38 am on August 19, 2020:
here and in net.cpp::2834, this may be somewhat more robust:
- if (gArgs.IsArgSet("-logmessages")) {
+ if (gArgs.GetBoolArg("-logmessages", false)) {
practicalswift
commented at 10:40 am on August 19, 2020:
To make it easier to reason about this function (and its locking requirements) from looking at the function signature, what about replacing const CNode& node with const CAddress& address since that is what is actually used (node.addr)?
troygiorshev
commented at 1:46 pm on August 20, 2020:
I like this, thanks. One day LogMessage may use other attributes of a CNode, but it can be easily changed back when the time comes. Done.
in
src/init.cpp:535
in
4fc2e39063outdated
@@ -520,6 +520,7 @@ void SetupServerArgs(NodeContext& node)
     ArgsManager::ALLOW_ANY, OptionsCategory::DEBUG_TEST);
     argsman.AddArg("-debugexclude=<category>", strprintf("Exclude debugging information for a category. Can be used in conjunction with -debug=1 to output debug logs for all categories except one or more specified categories."), ArgsManager::ALLOW_ANY, OptionsCategory::DEBUG_TEST);
     argsman.AddArg("-logips", strprintf("Include IP addresses in debug output (default: %u)", DEFAULT_LOGIPS), ArgsManager::ALLOW_ANY, OptionsCategory::DEBUG_TEST);
+    argsman.AddArg("-logmessages", "Log all p2p messages to file", ArgsManager::ALLOW_BOOL | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
jonatack
commented at 11:42 am on August 19, 2020:
I think the help should specify the message_logging directory location and also mention the location of and need to use the contrib/message-logging/message-logging-parser.py script.
Perhaps also add a markdown documentation file somewhere relevant that makes sense.
All of this, along with usage info, might also usefully be in the header docstring of the parser script and make sure it shows up nicely when git grepping.
There seems to be a formatting issue due to unescaped double quotes, however.
FelixWeis
commented at 5:52 pm on August 19, 2020:
or just piping through jq gives you colors too
troygiorshev
commented at 3:16 pm on October 7, 2020:
I think I’ll leave formatting to the user! I’ll take the jq suggestion and add that as a note in the module docstring.
michaelfolkson
commented at 2:41 pm on August 19, 2020:
contributor
I think this is a clear Concept ACK for new contributors.
I was unsure on the Concept ACK for longer term contributors. The fact that this is useful for some longer term contributors (e.g. @theStack, @practicalswift) convinces me it is just about a Concept ACK given the low cost (future maintenance, additional complexity).
If there are concerns with maintenance of the Wireshark dissector, as I understand it, we could fork it and maintain it out of Wireshark’s codebase.
This sounds like it would be more work and in the absence of someone saying they are keen to do this I think it is an overall Concept ACK for this PR from me.
(Excellent initial motivation section from @troygiorshev and fair challenge from @luke-jr)
sipa
commented at 5:18 pm on August 19, 2020:
member
I wonder why the name “message dumping” was changed to “message logging”. I find dumping a much more accurate term, as this isn’t just producing a log (which sounds more like metadata recording), but an exact dump of all messages.
Have you tried to fit this call in CConnman::SocketHandler, like where we iterate to increment nSizeAdded? I think it would be cleaner to avoid leaking more network-level processing into a p2p protocol processing function.
troygiorshev
commented at 4:16 am on October 7, 2020:
For simplicity, and to avoid any race conditions, I’ve made sure to keep all calls to LogMessage in MessageHandler.
ariard
commented at 0:05 am on August 20, 2020:
member
Approach ACK.
Given the minimal new code, it sounds like a win to relieve developers from installing yet another debugging tool, one which may not be accurate given the steady rate of new p2p messages (block filters, erlay, wtxid, …) or may not run cross-platform. If no one used it, it would be easy to remove.
in
contrib/message-logging/message-logging-parser.py:110
in
0d10af73df (outdated)
77+ time = int.from_bytes(tmp_header.read(TIME_SIZE), "little") # type: int
78+ msgtype = tmp_header.read(MSGTYPE_SIZE).split(b'\x00', 1)[0] # type: bytes
79+ length = int.from_bytes(tmp_header.read(LENGTH_SIZE), "little") # type: int
80+ if msgtype not in MESSAGEMAP:
81+ # For now just skip unrecognized messages
82+ f_in.read(length)
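For readers following along, the record layout exercised by the snippet above can be sketched end to end in a few lines. This is a self-contained illustration, not the parser itself; the 8/12/4-byte sizes mirror the TIME_SIZE/MSGTYPE_SIZE/LENGTH_SIZE constants above, and the sample record is fabricated:

```python
# A self-contained sketch of reading one record from a msgs_*.dat
# capture file, following the header layout used in the snippet above:
# an 8-byte little-endian timestamp, a 12-byte null-padded message
# type, and a 4-byte little-endian payload length.
from io import BytesIO

TIME_SIZE = 8
MSGTYPE_SIZE = 12
LENGTH_SIZE = 4
HEADER_SIZE = TIME_SIZE + MSGTYPE_SIZE + LENGTH_SIZE

def read_record(f_in):
    header = f_in.read(HEADER_SIZE)
    if len(header) < HEADER_SIZE:
        return None  # end of file
    tmp = BytesIO(header)
    time_us = int.from_bytes(tmp.read(TIME_SIZE), "little")
    msgtype = tmp.read(MSGTYPE_SIZE).split(b"\x00", 1)[0]
    length = int.from_bytes(tmp.read(LENGTH_SIZE), "little")
    payload = f_in.read(length)
    return time_us, msgtype, payload

# Fabricated "ping" record with an 8-byte payload, for illustration only.
record = (
    (1231006505000000).to_bytes(TIME_SIZE, "little")
    + b"ping".ljust(MSGTYPE_SIZE, b"\x00")
    + (8).to_bytes(LENGTH_SIZE, "little")
    + bytes(range(8))
)
print(read_record(BytesIO(record)))
```

The real parser additionally dispatches msgtype through the test framework’s MESSAGEMAP to deserialize the payload.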
theStack
commented at 11:19 am on August 20, 2020:
if (optimisticSend) nBytesSent = SocketSendData(pnode);
or:
if (node->vSendMsg.empty()) nBytesSent = SocketSendData(pnode);
and remove the bool completely. The comment already clarifies what is happening.
troygiorshev
commented at 4:59 am on October 7, 2020:
I’m more than happy to do this!
fjahr
commented at 12:44 pm on September 1, 2020:
contributor
Concept ACK
Will do testing and deeper review when CI is green. Mostly ignored the python code for now. Travis failed with running out of disc space but it would surprise me if that was related, so I restarted it. I don’t think I can restart Cirrus CI.
laanwj
commented at 12:36 pm on September 6, 2020:
member
Concept ACK, I think the dumping changes to the C++ here could entirely be replaced with a use of #19866, no strong opinion on whether we should do both or not, but I think the python analysis tooling here is useful in any case.
DrahtBot added the label
Needs rebase
on Sep 21, 2020
troygiorshev force-pushed
on Sep 23, 2020
troygiorshev
commented at 2:11 am on September 23, 2020:
contributor
git range-diff master ff3a39c ba68f02
trivial rebase
DrahtBot removed the label
Needs rebase
on Sep 23, 2020
jnewbery
commented at 9:51 am on September 23, 2020:
member
@troygiorshev what’s the status here? You’ve rebased, but there are unaddressed review comments from a few contributors. CI is also failing.
in
contrib/message-logging/message-logging-parser.py:19
in
ba68f02310 (outdated)
Looking forward to having this in master. This functionality is super useful for 1.) learning purposes, 2.) troubleshooting purposes, and 3.) fuzzing purposes (easy generation of seed corpora from actual message data).
I encourage others to review. To try this out:
$ gh checkout pr 19509 && make distclean && ./autogen.sh && ./configure --with-incompatible-bdb && make
$ src/bitcoind -capturemessages
# now look in ~/.bitcoin/message_capture/
The availability of easy-to-use built-in tools such as this and bitcoin-cli -netinfo which allow for easy debugging/inspection of the P2P layer of Bitcoin Core will make the road to P2P understanding so much easier/quicker for future generations of Bitcoin Core contributors. Neat feature with great long-term benefits @troygiorshev! Thanks! :)
jnewbery
commented at 9:57 am on October 9, 2020:
member
I’m planning to rereview this once the CI failures are fixed and the documentation is done.
troygiorshev force-pushed
on Oct 14, 2020
troygiorshev force-pushed
on Oct 14, 2020
troygiorshev
commented at 6:17 am on October 14, 2020:
contributor
git range-diff master d2d3d12 d0ffb34
Fixed remaining mininode to p2p issue
Added documentation!
troygiorshev
commented at 6:17 am on October 14, 2020:
contributor
git range-diff master d0ffb34 9a9314a
Rebased to master to fix Travis
All pending TODOs for this PR are complete. Thanks to everyone who has reviewed so far, should be an easy reACK from many of you!
promag
commented at 7:07 am on October 14, 2020:
member
Concept ACK.
in
src/net.cpp:2943
in
c2f62aefab (outdated)
+    if (nMessageSize) pnode->vSendMsg.push_back(std::move(msg.data));

     // If write queue empty, attempt "optimistic write"
-    if (optimisticSend == true)
-        nBytesSent = SocketSendData(pnode);
+    if (pnode->vSendMsg.empty()) nBytesSent = SocketSendData(pnode);
MarcoFalke
commented at 7:51 am on October 14, 2020:
Any reason for this behavior change?
MarcoFalke
commented at 7:52 am on October 14, 2020:
I know this is a debugging feature, but file open/close for each message is kind of bad, right? Also because close flushes.
troygiorshev
commented at 2:44 am on October 16, 2020:
I’m not sure that I see the disadvantage. The alternative required using a greater number of file descriptors, which I thought was worse.
practicalswift
commented at 1:14 pm on October 16, 2020:
If measurements show that this ever becomes a problem in practice it can be tackled in a follow-up.
TBH I’m much more worried about the possible file descriptor DoS vector that we would risk opening up if the file descriptor were kept open.
Security first! :)
promag
commented at 1:44 pm on October 14, 2020:
member
We could (also) expose these via ZMQ, just saying.
troygiorshev force-pushed
on Oct 16, 2020
troygiorshev
commented at 2:41 am on October 16, 2020:
contributor
git range-diff master 9a9314a 2418ec6
Reverted the change to optimisticSend
Travis should™ pass now. I can’t get the fuzz tests to build, but I get the same linker error on master as I do here, so I assume it’s unrelated.
troygiorshev
commented at 9:46 am on October 16, 2020:
contributor
Cirrus failing to fetch sqlite-autoconf-3320100.tar.gz, 404. Unrelated.
practicalswift
commented at 1:48 pm on October 16, 2020:
contributor
ACK 2418ec658ccd2e8e033bced0f5b7c183946940ac
theStack
commented at 7:59 pm on November 15, 2020:
contributor
The changes themselves LGTM. However, while testing the feature on mainnet, the parsing tool contrib/message-capture/message-capture-parser.py encountered a problem decoding the following version message:
This output can be converted back to the binary file via xxd -r hexdump.txt. If anyone has a better idea how to share binary data for reviewing here (probably base64 or alike?), I’d be glad to hear it.
$ contrib/message-capture/message-capture-parser.py ~/.bitcoin/message_capture/78.244.132.5_52410/*
Processing /home/honeybadger/.bitcoin/message_capture/78.244.132.5_52410/msgs_recv.dat
Traceback (most recent call last):
  File "contrib/message-capture/message-capture-parser.py", line 178, in <module>
    main()
  File "contrib/message-capture/message-capture-parser.py", line 160, in main
    process_file(str(capture), messages, "recv" in capture.stem, progress_bar)
  File "contrib/message-capture/message-capture-parser.py", line 117, in process_file
    assert_equal(length, payload_length)
  File "contrib/message-capture/../../test/functional/test_framework/util.py", line 50, in assert_equal
    raise AssertionError("not(%s)" % " == ".join(str(arg) for arg in (thing1, thing2) + args))
AssertionError: not(107 == 106)
It seems like the deserialization routine for msg_version is consuming one byte less than expected. Guess that has to do with the optional fRelay field at the end. See also BIP37 and BIP60 – the latter is a draft, but it seems to summarize perfectly the problem we have here.
in
contrib/message-capture/message-capture-docs.md:18
in
2418ec658c (outdated)
13+ * Inside each peer's folder there are two `.dat` files: one is for received messages (`msgs_recv.dat`) and the other is for sent messages (`msgs_sent.dat`).
14+* Run `contrib/message-capture/message-capture-parser.py` with the proper arguments.
15+ * See the `-h` option for help.
16+ * To see all messages, both sent and received, for all peers use:
17+ ```
18+ python contrib/message-capture/message-capture-parser.py -o out.json \
theStack
commented at 8:10 pm on November 15, 2020:
nit: I’d just drop the python command part and call the script directly (also because on some systems, python symlinks to python2, not python3! See also https://www.python.org/dev/peps/pep-0394/); thanks to the shebang at the top of the file, the right python interpreter will be called.
troygiorshev
commented at 10:12 pm on November 17, 2020:
Thank you again for the link. I’m so used to having python environment problems that I’ve always tried to be as explicit as possible when running python scripts. Since switching to conda I haven’t had any problems, and I’m sure other people have things figured out as well (Either they only have one python install or they use an environment manager too). I’ll make this change.
in
contrib/message-capture/message-capture-docs.md:25
in
2418ec658c (outdated)
20+ ```
21+ * Note: The messages in the given `.dat` files will be interleaved in chronological order. So, giving both received and sent `.dat` files (as above with `*.dat`) will result in all messages being interleaved in chronological order.
22+ * If an output file is not provided (i.e. the `-o` option is not used), then the output prints to `stdout`.
23+* View the resulting output.
24+ * The output file is `JSON` formatted.
25+ * Suggestion: use `jq` to view the output, with `cat out.json | jq`
theStack
commented at 8:13 pm on November 15, 2020:
troygiorshev
commented at 9:58 pm on November 17, 2020:
Thanks for the link :)
cat foo.txt | less is so common I once saw the suggestion to set a shell alias:
alias L='| less'
so that you can run cat foo.txt L!
troygiorshev force-pushed
on Nov 18, 2020
troygiorshev
commented at 0:42 am on November 18, 2020:
contributor
git range-diff master 2418ec6 4690881
Fixed version fRelay bug
The parser now more closely matches how parsing works in the test framework and in bitcoind
Documentation now follows PEP 0394
Removed cat overuse in documentation
First of all, many thanks @theStack for both finding the bug and identifying exactly what causes it. Your guess was spot on and the links to the BIPs were very useful! It ended up identifying a small bug with how I implemented my message parser.
However, I’m not sure what to think about this message. Looking at it closely, it has the following:
Version = 60000
fRelay = 0 (1 byte)
Declared payload size = 107
Actual payload size = 107
The fRelay field is only supposed to exist as of version 70001. So this message is malformed. But, in the protocol’s implementation in bitcoind, we don’t check this.
This is a discrepancy between our testing framework and bitcoind. If fRelay was 1 for a message with version <70001, bitcoind would take it as a 1, whereas our testing framework would take it as a 0. I don’t think this will ever happen in the real world, but it’s worth noting. I’ve opened a PR to discuss this as #20411.
For now I take this as fixed.
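The tolerant handling of the trailing byte can be sketched as follows. This is a simplified stand-in for the test framework’s version deserialization, not the real msg_version class; the fixed fields before fRelay are omitted:

```python
# Simplified sketch of tolerant fRelay handling, per the BIP37/BIP60
# discussion above: fRelay is an optional trailing byte, so a tolerant
# parser reads it whenever a byte remains rather than keying only off
# the declared protocol version.
from io import BytesIO

def parse_frelay(stream):
    # ... the fixed version-message fields would be parsed first ...
    tail = stream.read(1)
    if tail:
        return tail[0]  # explicit fRelay byte, even if version < 70001
    return 1  # field absent: per BIP37, relaying defaults to on

print(parse_frelay(BytesIO(b"")))      # no trailing byte
print(parse_frelay(BytesIO(b"\x00")))  # explicit fRelay = 0
```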
troygiorshev
commented at 1:03 am on November 18, 2020:
contributor
Also, I thought sharing the binary file as a hexdump worked great! Thanks for adding the tip to use xxd’s -r flag. I personally use HxD in Windows to view and edit binary files. It’s a GUI so everything’s messy - someone who uses xxd with vim might have a good reason to have binary files shared one way or another.
in
contrib/message-capture/message-capture-parser.py:20
in
4690881fc5 (outdated)
MarcoFalke
commented at 6:24 am on November 18, 2020:
contrib/message-capture/message-capture-parser.py:20:1: F401 ’test_framework.util.assert_equal’ imported but unused
troygiorshev
commented at 3:36 pm on January 20, 2021:
Fixed
MarcoFalke
commented at 6:29 am on November 18, 2020:
member
Another example: in commit fa74d3d720f I removed unused deserialize code, because supported versions of Bitcoin Core would never serialize such a version message. (We only use the test framework to test supported versions of Core.) So I am wondering if the message capture may ever write a message that can’t be parsed by our test code, and if there is a way to protect against this.
practicalswift
commented at 10:02 am on November 18, 2020:
contributor
Is there any chance this PR could get a release milestone?
FWIW I love this feature and would love to see it in master soon :) Thanks @troygiorshev!
DrahtBot added the label
Needs rebase
on Nov 19, 2020
troygiorshev
commented at 7:01 am on November 25, 2020:
contributor
So I am wondering if the message capture may ever write a message that can’t be parsed by our test code. And if there is a way to protect against this.
Ideally, in my mind, our test code should function as closely as possible to Core. So, if a message can be parsed by Core it should be able to be parsed by our test code. You’re right that this will get tricky for undefined behavior or broken messages like this.
For now I’ll add a MIN_PEER_PROTO_VERSION check as suggested in 14025.
Edit: We still have to have a way to deal with messages that fail to be deserialized by the test code. Doesn’t appear to be a problem, see below.
MarcoFalke
commented at 7:06 am on November 25, 2020:
member
The message capture code will write the message before parsing. So except for the message header (message type) nothing is parsed; it could be any raw byte string. There is no way the test framework could make sense of that except by accident.
troygiorshev
commented at 7:20 am on November 25, 2020:
contributor
Oh, yeah you’re completely right, thanks.
Luckily (due to #19107) only messages with a valid header are pushed onto the vRecv queue, given to ProcessMessages, and parsed by the message capture parser. So we can trust that the header is workable. This will allow us to simply skip message files that contain messages that fail deserialization, and we should be able to continue parsing all of the other ones.
I’ll think of a nice way to signal this to the user.
Edit: I think we’ll have to skip the entire file that contains a bad message, as a bad stated size will result in the next message being deserialized badly almost all of the time. I’ll experiment and see how best to fix this.
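The skip-the-rest-of-the-file behavior described above can be sketched like this; process_file and deserialize are illustrative stand-ins, not the real parser’s functions:

```python
# Sketch of abandoning a capture file after the first bad message: a
# wrong declared length desynchronizes every subsequent read, so the
# rest of that file is skipped while other files keep being parsed.
def process_file(raw_messages, out):
    for raw in raw_messages:
        try:
            out.append(deserialize(raw))
        except Exception:
            break  # bad message: skip the remainder of this file

def deserialize(raw):
    # Illustrative stand-in for real message deserialization.
    if raw == b"bad":
        raise ValueError("broken message")
    return raw.decode()

out = []
process_file([b"version", b"verack", b"bad", b"ping"], out)
print(out)  # messages after the bad one are dropped
```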
Clean PushMessage and ProcessMessages
This brings PushMessage and ProcessMessages further in line with the
style guide by fixing their if statements.
LogMessage is later called, inside an if statement, inside both of these
methods.
dbf779d5de
laanwj referenced this in commit
43f3ada27b
on Jan 18, 2021
troygiorshev force-pushed
on Jan 19, 2021
troygiorshev
commented at 8:49 pm on January 19, 2021:
contributor
git range-diff master 4690881 abbb8ac
Parser now no longer crashes when it reaches an unrecognized or broken message. Instead it simply skips that message and the rest of the file the message came from.
Removed an unused import from the previous force push
troygiorshev force-pushed
on Jan 19, 2021
troygiorshev
commented at 9:03 pm on January 19, 2021:
contributor
git range-diff master abbb8ac d2e0f73
Rebased
in
src/init.cpp:524
in
d2e0f731c5 (outdated)
@@ -521,6 +521,7 @@ void SetupServerArgs(NodeContext& node)
     argsman.AddArg("-limitdescendantcount=<n>", strprintf("Do not accept transactions if any ancestor would have <n> or more in-mempool descendants (default: %u)", DEFAULT_DESCENDANT_LIMIT), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
     argsman.AddArg("-limitdescendantsize=<n>", strprintf("Do not accept transactions if any ancestor would have more than <n> kilobytes of in-mempool descendants (default: %u).", DEFAULT_DESCENDANT_SIZE_LIMIT), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
     argsman.AddArg("-addrmantest", "Allows to test address relay on localhost", ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
+    argsman.AddArg("-capturemessages", "Capture all p2p messages to file", ArgsManager::ALLOW_BOOL | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
practicalswift
commented at 9:26 pm on January 19, 2021:
troygiorshev
commented at 9:44 pm on January 19, 2021:
Sounds good, and Wikipedia agrees. Fixed
in
contrib/message-capture/message-capture-docs.md:9
in
d2e0f731c5 (outdated)
@@ -0,0 +1,25 @@
+# Per-Peer Message Capture
+
+## Purpose
+
+This feature allows for message capture on a per-peer basis. It answers the simple question: "Can I see what messages my node is sending and receiving?"
+
+## Usage and Functionality
+
+* Run `bitcoind` with the `-capturemessage` option.
practicalswift
commented at 9:34 pm on January 19, 2021:
This should be -capturemessages (with ending “s”) :)
troygiorshev
commented at 9:44 pm on January 19, 2021:
Thank you!
practicalswift
commented at 9:41 pm on January 19, 2021:
contributor
No need to change this unless you touch the branch for another reason.
in
contrib/message-capture/message-capture-parser.py:173
in
af8bcd1a6e (outdated)
168+ progress_bar = ProgressBar(total_size)
169+ for capture in capturepaths:
170+ process_file(str(capture), messages, "recv" in capture.stem, progress_bar)
171+ else:
172+ for capture in capturepaths:
173+ process_file(str(capture), messages, "recv" in capture.stem, None)
jnewbery
commented at 11:19 am on January 20, 2021:
No need to duplicate the process_file call:
else:
    progress_bar = None

for capture in capturepaths:
    process_file(str(capture), messages, "recv" in capture.stem, progress_bar)
in
contrib/message-capture/message-capture-parser.py:157
in
af8bcd1a6e (outdated)
152+ help="binary message capture files to parse.")
153+ parser.add_argument(
154+ "-o", "--output",
155+ help="output file. If unset print to stdout")
156+ parser.add_argument(
157+ "--no-progress-bar",
jnewbery
commented at 11:20 am on January 20, 2021:
Can you add -n as a short argument here?
in
contrib/message-capture/message-capture-parser.py:95
in
af8bcd1a6e (outdated)
practicalswift
commented at 3:56 pm on January 20, 2021:
Since this is new code: consider running black on all new .py files added in this PR :)
troygiorshev
commented at 10:16 pm on January 20, 2021:
Thanks, this is much more readable now. I’ve broken it up slightly differently than you.
in
test/functional/p2p_message_capture.py:44
in
af8bcd1a6e (outdated)
39+ tmp_header_raw = f_in.read(TIME_SIZE + LENGTH_SIZE + MSGTYPE_SIZE)
40+ if not tmp_header_raw:
41+ break
42+ tmp_header = BytesIO(tmp_header_raw)
43+ time = int.from_bytes(tmp_header.read(TIME_SIZE), "little") # type: int
44+ assert(time >= 1231006505000000) # genesis block timestamp
jnewbery
commented at 11:40 am on January 20, 2021:
I’m not sure if we want this. You could imagine running this test inside some environment where the system time is not set correctly. That’d potentially cause this test to fail even though there’s no problem with message capture.
troygiorshev
commented at 10:19 pm on January 20, 2021:
I’m fine to remove this, especially given your other suggestions that add other pieces to this test.
in
test/functional/p2p_message_capture.py:49
in
af8bcd1a6e (outdated)
jnewbery
commented at 11:41 am on January 20, 2021:
Perhaps also assert that there are no non-null characters after the first non-null character? i.e. the message type is a string of printable characters followed by just null characters.
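One way the suggested check might look; the helper name and the 12-byte width are illustrative (matching the null-padded message-type field discussed in this thread), not test code from the PR:

```python
# Sketch of validating a captured message-type field: printable ASCII
# characters followed only by null padding, with no non-null bytes
# after the first null. valid_msgtype is a hypothetical helper.
def valid_msgtype(raw: bytes) -> bool:
    msgtype = raw.split(b"\x00", 1)[0]
    padding = raw[len(msgtype):]
    printable = all(0x20 < b < 0x7F for b in msgtype)
    return len(msgtype) > 0 and printable and padding == b"\x00" * len(padding)

print(valid_msgtype(b"ping".ljust(12, b"\x00")))       # well-formed
print(valid_msgtype(b"ping\x00x".ljust(12, b"\x00")))  # non-null byte inside padding
```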
in
test/functional/p2p_message_capture.py:37
in
af8bcd1a6e (outdated)
32+ - Message Type
33+ - We can add new message types
34+
35+ We're ignoring these because they're simply too brittle to test here.
36+ """
37+ with open(dat_file, 'rb') as f_in:
jnewbery
commented at 11:49 am on January 20, 2021:
We should also test that these files are non-empty. We may not know the exact sequence of messages, but there should be some messages during initial handshake, and we should capture them.
jnewbery
commented at 12:03 pm on January 20, 2021:
We don’t have guidance on line length in the style guide, but anything over 100 columns is a bit long for my taste. Also consider adding a doxygen comment:
troygiorshev
commented at 7:05 pm on January 21, 2021:
Added the doxygen comment, but I think I’ll leave the long line as it is. Looking through the rest of net.h, I see many declarations longer than 100 characters, and I don’t see any that have been broken up the way you’re suggesting.
I agree with you that it’s not great, but I’m going to appeal to consistency in this case.
jnewbery
commented at 10:06 am on January 22, 2021:
jnewbery
commented at 12:04 pm on January 20, 2021:
maybe s/time/now/
jnewbery
commented at 12:07 pm on January 20, 2021:
member
utACK af8bcd1a6ecdfa15d7fb0f98256d47dec9b40f61
This looks great, Troy. I think it can be merged, but I’ve given a few style suggestions inline. Feel free to take or leave.
sidhujag referenced this in commit
d35db2baaa
on Jan 20, 2021
troygiorshev force-pushed
on Jan 21, 2021
troygiorshev force-pushed
on Jan 21, 2021
troygiorshev
commented at 7:39 pm on January 21, 2021:
contributor
git range-diff master af8bcd1 a17b250
Implemented various style improvement suggested by jnewbery above
in
contrib/message-capture/message-capture-docs.md:25
in
a17b2509e0 (outdated)
20+ ```
21+ * Note: The messages in the given `.dat` files will be interleaved in chronological order. So, giving both received and sent `.dat` files (as above with `*.dat`) will result in all messages being interleaved in chronological order.
22+ * If an output file is not provided (i.e. the `-o` option is not used), then the output prints to `stdout`.
23+* View the resulting output.
24+ * The output file is `JSON` formatted.
25+ * Suggestion: use `jq` to view the output, with `jq out.json`
theStack
commented at 9:27 pm on January 21, 2021:
nit: sorry to be annoying with this command again, but I just found out that it’s mandatory to add a filter as a second parameter (e.g. the dot . for all), otherwise the command fails.
$ jq out.json
jq: error: out/0 is not defined at <top-level>, line 1:
out.json
jq: 1 compile error

$ jq . out.json
[
  {
    "msgtype": "version",
    "direction": "sent",
..........
troygiorshev
commented at 8:45 pm on January 22, 2021:
Ah you caught me, I still do cat out.json | jq :)
Thanks for checking this, I’ll fix it in the docs.
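As an aside, for anyone without jq handy, the Python standard library can do the same pretty-printing; the sample records below just mirror the output shape shown in this thread:

```python
# jq-free pretty-printing of the parser's JSON output using only the
# standard library. The records here are illustrative samples.
import json

data = [
    {"msgtype": "version", "direction": "sent"},
    {"msgtype": "verack", "direction": "recv"},
]
print(json.dumps(data, indent=4))
```

From the shell, `python3 -m json.tool out.json` does the same for a file.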
in
contrib/message-capture/message-capture-parser.py:106
in
d3353b507f (outdated)
theStack
commented at 9:29 pm on January 21, 2021:
yocto-nit: git shows some trailing whitespace in this line
theStack approved
theStack
commented at 9:51 pm on January 21, 2021:
contributor
Tested ACK a17b2509e0d7e7e05fd5600d5387608bb706ac0e
Reviewed the code, captured messages on mainnet for a few minutes and parsed the output with the included parser, everything LGTM, except minor things (see below). Looking forward to see this in master (hopefully) soon! :rocket:
A follow-up that takes care of some PEP8 code style issues (running pylint or black helps here) may make sense though :) I didn’t check Python type annotations; maybe another reviewer experienced with that could run them through mypy. (Potential issues could also be solved in a follow-up though.)
in
src/init.cpp:524
in
e2b6430dcb (outdated)
@@ -521,6 +521,7 @@ void SetupServerArgs(NodeContext& node)
     argsman.AddArg("-limitdescendantcount=<n>", strprintf("Do not accept transactions if any ancestor would have <n> or more in-mempool descendants (default: %u)", DEFAULT_DESCENDANT_LIMIT), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
     argsman.AddArg("-limitdescendantsize=<n>", strprintf("Do not accept transactions if any ancestor would have more than <n> kilobytes of in-mempool descendants (default: %u).", DEFAULT_DESCENDANT_SIZE_LIMIT), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
     argsman.AddArg("-addrmantest", "Allows to test address relay on localhost", ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
+    argsman.AddArg("-capturemessages", "Capture all P2P messages to file", ArgsManager::ALLOW_BOOL | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
Why make this DEBUG_ONLY? It’s certainly not a feature for all users, but it’s also not just a development debugging tool.
jnewbery
commented at 10:24 am on January 22, 2021:
Probably my suggestion as well. Making it DEBUG_ONLY allows this to be an experimental feature that we can alter/remove in future releases without disrupting users’ workflow. We can always remove DEBUG_ONLY later if we’re happy to make it more visible.
troygiorshev
commented at 9:15 pm on January 22, 2021:
s/file/disk/
troygiorshev
commented at 9:20 pm on January 22, 2021:
I think I’ll keep DEBUG_ONLY unless having it is harmful for some reason. I agree that it’s very easy to remove later, and very difficult to add back.
in
src/net_processing.cpp:4045
in
e2b6430dcboutdated
It’s a bit strange that this happens in net_processing while the capture of outgoing messages is in net.
Any reason why this can’t be in net as well? I’d expect CNode::ReceiveMsgBytes to work well.
MarcoFalke
commented at 10:06 am on January 22, 2021:
The difference would be that disconnected nodes may still have their messages captured and that the capture file will record a time closer to the receive time, as opposed to the process time of the message? No opinion on what to do, just asking about the difference.
jnewbery
commented at 10:22 am on January 22, 2021:
Hmm, I don’t think this functionality really belongs in the net_processing layer/thread in the first place. It’s net that handles the connection and parsing/serializing of messages from/to a stream of bytes. So if that’s the reason for having this weird split across modules, I’m not sure it’s desirable.
Anyway, no big deal - it just strikes me as ugly, but if it works, it works.
jnewbery
commented at 9:50 pm on January 22, 2021:
I see this message dumping as a net_processing layer thing. It should be agnostic to the connection layer (eg if we eventually use BIP 151 or similar scheme, then the messages delivered to and sent from the net_processing layer shouldn’t change). If you need the actual bytes sent on the wire, then tcpdump or similar might be more appropriate.
troygiorshev
commented at 9:52 pm on January 22, 2021:
As it stands, message capture always happens in the MessageHandler thread. Unfortunately, functions in this thread span both net and net_processing.
It’s unnecessary to have this script load the entire dump of all files into memory and then sort them, as we know that every individual file is internally sorted.
So an approach is possible where you read one message from each file, then loop: pick the one with the smallest timestamp, print it, read the next message from that file, and repeat. This would be faster, lower memory (especially when analyzing lots of dumps), and support streaming.
jnewbery
commented at 10:26 am on January 22, 2021:
Yes, definitely a good suggestion. I agree this can wait for a future PR.
See the use of heapq.merge() in test/functional/combine_logs.py for one way to do this (although I never enhanced that to support streaming since there wasn’t an immediate need).
troygiorshev
commented at 10:19 pm on January 22, 2021:
I agree on all accounts: this is a naive solution but I’ll leave an improvement for a future PR.
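For reference, the streaming merge suggested above could look something like this. This is a minimal sketch, not the parser's actual API: the record layout is assumed from the capture format described in this PR (8-byte little-endian microsecond timestamp, 12-byte message type, 4-byte little-endian payload length, then the payload), and read_records/merged are illustrative names.

```python
import heapq
import struct
from typing import Iterable, Iterator, Tuple

def read_records(path: str) -> Iterator[Tuple[int, bytes, bytes]]:
    # Assumed record layout, per the capture format in this PR: an 8-byte
    # little-endian microsecond timestamp, the 12-byte message type, a
    # 4-byte little-endian payload length, then the payload itself.
    with open(path, "rb") as f:
        while len(header := f.read(24)) == 24:
            time_us, msgtype, length = struct.unpack("<Q12sI", header)
            yield time_us, msgtype.rstrip(b"\x00"), f.read(length)

def merged(paths: Iterable[str]) -> Iterator[Tuple[int, bytes, bytes]]:
    # heapq.merge lazily interleaves the already-sorted per-file streams,
    # holding only one pending record per file in memory at a time.
    return heapq.merge(*(read_records(p) for p in paths), key=lambda rec: rec[0])
```

Because each per-file generator is consumed lazily, this supports streaming output and keeps memory usage constant in the number of messages.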
sipa
commented at 0:18 am on January 22, 2021:
member
Concept ACK. I really like the approach of having separate files per peer/connection. Slightly less sure about separating send/receive, but seems ok.
I wonder if we shouldn’t go further actually. Having one file (or two) = one session would be pretty nice for certain analyses. That could be accomplished by adding e.g. (our own node startup time, peer id) or (connection start time) in the filenames (or directory names).
I’m slightly concerned about the overhead of opening/closing files for every message sent/received. Has anyone benchmarked this (or have ideas for a useful benchmark, even)? I understand the rationale of not occupying a large part of the file descriptor space for this, but it’d be good to know it doesn’t really hurt.
Apologies if these questions have been answered already.
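One way to get a rough number for the open/close overhead asked about above is a micro-benchmark comparing reopen-per-message against a persistently open file. This is a sketch only: bench_append is a hypothetical helper, and absolute numbers will vary by OS, filesystem, and page cache state.

```python
import os
import tempfile
import time

def bench_append(n_msgs: int, reopen_each_time: bool) -> float:
    # Append n_msgs small records, either reopening the file per message
    # (the low-file-descriptor approach used by this PR) or keeping a
    # single handle open for the whole run.
    payload = b"\x00" * 64
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "msgs_sent.dat")
        start = time.perf_counter()
        if reopen_each_time:
            for _ in range(n_msgs):
                with open(path, "ab") as f:
                    f.write(payload)
        else:
            with open(path, "ab") as f:
                for _ in range(n_msgs):
                    f.write(payload)
        elapsed = time.perf_counter() - start
        assert os.path.getsize(path) == n_msgs * len(payload)
    return elapsed
```

Comparing bench_append(n, True) against bench_append(n, False) gives a ballpark for the per-message open/close cost on a given machine.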
in
contrib/message-capture/message-capture-parser.py:132
in
a17b2509e0 (outdated)
127+ msg.deserialize(msg_ser)
128+ except KeyboardInterrupt:
129+ raise
130+ except Exception:
131+ # Message was bad, skip the rest of the file
132+ print(f"Found broken message, skipping remainder of file: {path}", file=sys.stderr)
Not sure this is what you want to do when an unparsable message is encountered. As each record has an overall length field, there should be no problem with skipping it.
MarcoFalke
commented at 10:21 am on January 22, 2021:
Tend to agree. The test framework can only deserialize a subset of the messages that Bitcoin Core can. This is expected to happen in practice, so it should fail gracefully in some way, probably like an unknown message type.
troygiorshev
commented at 10:38 pm on January 22, 2021:
You’re right. Because of how the capture is done in net.cpp, the length field is guaranteed to be correct. (We’ll never have a misalignment) I’ll change this to just skip the single message, and I’ll insert an informative placeholder into the JSON.
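The graceful-skip behavior described above could be sketched like this. deserialize_payload is a stand-in for the test framework's per-type deserializers, and the placeholder fields are illustrative, not the parser's actual output format.

```python
def deserialize_payload(msgtype: bytes, payload: bytes):
    # Stand-in for the test framework's per-type deserializers (hypothetical).
    if msgtype == b"verack":
        return {}  # verack carries no payload
    raise ValueError(f"no deserializer for {msgtype!r}")

def parse_message(msgtype: bytes, payload: bytes) -> dict:
    # Because every capture record carries its own length field, a failed
    # deserialization never misaligns the stream: emit a placeholder for
    # this one message and let the caller read the next record normally.
    try:
        return {"msgtype": msgtype.decode(), "body": deserialize_payload(msgtype, payload)}
    except Exception:
        return {
            "msgtype": msgtype.decode(errors="replace"),
            "error": "Unable to deserialize message",
            "raw": payload.hex(),
        }
```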
MarcoFalke
commented at 10:09 am on January 22, 2021:
Completely unrelated, but this seems fragile. Would be good to remove those ints and only use uint256s
troygiorshev
commented at 7:27 pm on January 23, 2021:
I agree that this is fragile, but I’m not sure exactly what you mean about removing them. We don’t really have a uint256 object in Python. Adding one would unfortunately mean rewriting a LOT of our testing framework…
jnewbery
commented at 10:22 am on January 22, 2021:
member
Slightly less sure about separating send/receive, but seems ok.
This was my suggestion. Maybe a hangover from my telecoms days when it was common to dump different streams for send/recv. The benefit here is that we save one bit (really one byte since there’s nowhere to stuff it in the other metadata fields) per message because we don’t need a flag to indicate send/recv. As well as that, there are probably at least some tools/analyses that are only interested in traffic in one direction. Splitting the files means that those tools would not have to read/filter/seek over a bunch of messages that they’re not interested in.
I wonder if we shouldn’t go further actually. Having one file (or two) = one session would be pretty nice for certain analyses. That could be accomplished by adding e.g. (our own node startup time, peer id) or (connection start time) in the filenames (or directory names).
ACK! I think a directory structure of <IP_port>/<peerid>/[send|recv].dat would work very nicely.
I’m slightly concerned about the overhead of opening/closing files for every message sent/received. Has anyone benchmarked this (or have ideas for a useful benchmark, even)? I understand the rationale of not occupying a large part of the file descriptor space for this, but it’d be good to know it doesn’t really hurt.
This is something that @troygiorshev and I discussed at length, but I can’t find that discussion in this PR, so it was probably before it was opened. Troy, do you have any notes from then? I know you looked into both alternatives.
in
contrib/message-capture/message-capture-parser.py:137
in
a17b2509e0 (outdated)
132+ print(f"Found broken message, skipping remainder of file: {path}", file=sys.stderr)
133+ break
134+
135+ # Convert message to dictionary and add it to the messages dict
136+ msg_dict = {}
137+ msg_dict["msgtype"] = getattr(msg, "msgtype", None).decode()
MarcoFalke
commented at 10:30 am on January 22, 2021:
nit
msg_dict["msgtype"] = msg.msgtype.decode()
or
msg_dict["msgtype"] = msg_type.decode()
in
contrib/message-capture/message-capture-parser.py:154
in
a17b2509e0 (outdated)
MarcoFalke
commented at 10:43 am on January 22, 2021:
That seems fragile. What if the payload happens to be printable?
troygiorshev
commented at 7:32 pm on January 23, 2021:
Ah, remainder is actually just the remainder of the 12 msgtype bytes after the first null. In most cases it’s just a few null bytes. I added this in response to a suggestion to check that after the first null byte in the msgtype, all further bytes are null.
Effectively these few lines should do the same job as CMessageHeader::IsCommandValid.
MarcoFalke
commented at 12:12 pm on February 2, 2021:
should probably just check that all are zero
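A sketch of that check, loosely mirroring CMessageHeader::IsCommandValid: the message-type field must be printable ASCII followed only by NUL padding. msgtype_is_valid is a hypothetical name and the exact printable-character range is an assumption.

```python
COMMAND_SIZE = 12  # width of the message-type field in the capture record

def msgtype_is_valid(raw: bytes) -> bool:
    # A valid field is printable ASCII followed only by NUL padding; any
    # non-NUL byte after the first NUL marks the record as corrupt.
    if len(raw) != COMMAND_SIZE:
        return False
    first_nul = raw.find(b"\x00")
    if first_nul == -1:
        first_nul = len(raw)
    name, padding = raw[:first_nul], raw[first_nul:]
    return all(0x20 < b < 0x7F for b in name) and not any(padding)
```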
MarcoFalke approved
MarcoFalke
commented at 10:46 am on January 22, 2021:
member
MarcoFalke
commented at 10:53 am on January 22, 2021:
member
ACK! I think a directory structure of <IP_port>/<peerid>/[send|recv].dat would work very nicely.
I think that could be confusing, because the same IP_port can have a different peerid after a reconnect, but can also have the same peerid across restarts. So this makes it harder to get the msgs of just one session.
MarcoFalke
commented at 10:58 am on January 22, 2021:
member
There has been some feedback, but at the very least you need to fix the linter for this to be merged.
jnewbery
commented at 11:02 am on January 22, 2021:
member
I think that could be confusing, because the same IP_port can have different peerid during the same connection (reconnect), but also can have the same peerid across restarts. So this make it harder to get the msgs of just one session.
Ah, good point. I’d missed sipa’s point about using a (our own node startup time, peer id) pair to handle restarts.
practicalswift
commented at 4:00 pm on January 22, 2021:
contributor
Concept ACK. I really like the approach of having separate files per peer/connection. Slightly less sure about separating send/receive, but seems ok.
Agreed: I also have a slight preference towards combining send and recv in the same file for these types of captures.
For “request-response” analysis it is really nice to be able to have the messages in correct order without having to correlate timestamps from two different files to produce an “ordered” file covering the entire flow.
Personally I think this ease-of-use benefit exceeds the cost (one extra byte to indicate direction, and having to skip one direction in the case of single-direction analysis).
I wonder if we shouldn’t go further actually. Having one file (or two) = one session would be pretty nice for certain analyses. That could be accomplished by adding e.g. (our own node startup time, peer id) or (connection start time) in the filenames (or directory names).
Strong Concept ACK on this idea as well.
In the use cases I can think of, the relevant unit of analysis is more likely to be “one specific session (with one specific IP)” than “all sessions with one specific IP”.
And the latter use cases can easily be covered by the user by simply concatenating multiple files of the “one specific session” type :)
sipa
commented at 7:09 pm on January 22, 2021:
member
The benefit here is that we save one bit (really one byte since there’s nowhere to stuff it in the other metadata fields) per message because we don’t need a flag to indicate send/recv.
The direction could be stored in the top bit of the size field. The max protocol message size is 4000000 bytes, so even 3 bytes would be enough if we really wanted to. A variable-length encoding of command name (1 length byte + that many bytes for the command name itself e.g.) would also save several bytes on average per message.
I don’t think dump size is that much of a priority that it warrants lots of extra effort, but I also don’t think the size argument on itself is a good reason for splitting the files based on direction.
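The top-bit idea above could be sketched as follows. Illustrative only; this PR does not implement it. The 4,000,000-byte maximum protocol message size comes from the comment above; the helper names are hypothetical.

```python
# The max protocol message size is 4,000,000 bytes, so the top bit of a
# 32-bit length field is always free to carry the send/recv direction.
DIRECTION_BIT = 1 << 31
MAX_MSG_LENGTH = 4_000_000

def pack_length(length: int, is_recv: bool) -> int:
    assert 0 <= length <= MAX_MSG_LENGTH
    return length | (DIRECTION_BIT if is_recv else 0)

def unpack_length(field: int):
    # Returns (payload_length, is_recv).
    return field & (DIRECTION_BIT - 1), bool(field & DIRECTION_BIT)
```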
As well as that, there are probably at least some tools/analyses that are only interested in traffic in one direction. Splitting the files means that those tools would not have to read/filter/seek over a bunch of messages that they’re not interested in.
Maybe, but I expect those to also be interested in just one or a few commands, and they’ll still need to read/filter/seek over the ones they don’t care about. On the other hand, anything analyzing query/response like things will need to look in both files too.
Especially if we’d go in the direction of separating out sessions to the same ip/port, I think there is an elegance to having exactly 1 file = 1 session.
Another idea: would it make sense to store the version information of the dumping node (CLIENT_NAME, CLIENT_VERSION, perhaps also connection type)? Either in a header at the beginning of the file, or in the filename. That would make it easy to do analyses to compare behavior of different client versions.
What about a directory structure of <CLIENT_NAME>_<CLIENT_VERSION>/<ip>_<port>/<timestamp>_<peerid>.dat?
jnewbery
commented at 7:28 pm on January 22, 2021:
member
Especially if we’d go in the direction of separating out sessions to the same ip/port, I think there is an elegance to having exactly 1 file = 1 session.
I agree that this seems very nice. I don’t have any objection to changing the output to be this way.
What about a directory structure of <CLIENT_NAME>_<CLIENT_VERSION>/<ip>_<port>/<timestamp>_<peerid>.dat?
I’m less convinced of this. For one, those details can easily be fished out of the received version message in the file. I also like the directory structure of one directory per peer in .bitcoin/message_capture (and one file per session in each of those directories). But again, if other people prefer that I won’t strongly oppose it.
sipa
commented at 7:33 pm on January 22, 2021:
member
I’m less convinced of this.
Yes, just a suggestion to hear what people think.
For one, those details can easily be fished out of the received version message in the file.
Note that I’m talking about the version of the dumping node, not of the peer we’re connecting to (that may also be interesting of course, but is a bit harder as it means you only know the filename after receiving VERSION).
I also like the directory structure of one directory per peer in .bitcoin/message_capture
If the intent is really “one directory per peer”, the port perhaps shouldn’t be included (or at least not for incoming connections), as the outgoing TCP port is newly chosen for every connection.
troygiorshev
commented at 9:50 pm on January 22, 2021:
contributor
why not capture messages after CNode::ReceiveMsgBytes?
This has been brought up a couple times now, and it’s a great question, so it’s worth a full response (that won’t be buried in resolved conversations).
My priority in this PR is that the capture of messages always occurs in order from an application layer perspective. Because of this, I have to capture messages in the MessageHandler thread.
If I were to instead capture incoming messages in the SocketHandler thread, say, right after ReceiveMsgBytes, the following could possibly occur:
receive and capture message X
receive and capture message Y
send and capture message A
whereas on the processing side (since X and Y are in the vRecvMsg queue before being processed), the node actually did the following:
process message X
respond with message A
process message Y
So the order would be broken. Remember, this PR isn’t just a fun replacement for Wireshark. The point is that someone improving/debugging/learning bitcoin’s p2p protocol can see what’s actually going on from the node’s perspective, without having to worry about how the timing of the queues is complicating things.
sipa
commented at 10:16 pm on January 22, 2021:
member
@troygiorshev That’s a great point. I was thinking that the difference between net-level and net_processing-level ordering would only be different interleaving across peers, but you’re right to point out it even obscures the processing within one peer. I agree it’s valuable to dump in processing order.
MarcoFalke
commented at 6:17 am on January 23, 2021:
nit: Maybe add a small comment here that this function captures the message at process time, not socket receive/send time. Nothing fancy, just a one-line comment to sum up #19509 (comment)
Not only to document the expected behavior, but also to keep people from “optimizing” away this call to GetTime with the rationale that it doesn’t matter and the message receive time can be used. See also #19509 (review)
troygiorshev
commented at 7:50 pm on January 23, 2021:
contributor
is really nice to be able to have the messages in correct order without having to correlate timestamps from two different files to produce an “ordered” file covering the entire flow.
This is already done for you by the parser, just give it both files and the messages will be interleaved automatically in the output file. If I’m understanding you correctly, the following should do what you’re looking for:
I think there is an elegance to having exactly 1 file = 1 session.
I agree completely, I’ll figure this out!
Add CaptureMessage
This commit adds the CaptureMessage function. This will later be called
when any message is sent or received. The capture directory is fixed,
in a new folder "message_capture" in the datadir. Peers will then have
their own subfolders, named with their IP address and port, replacing
colons with underscores to keep compatibility with Windows. Inside,
received and sent messages will be captured into two binary files,
msgs_recv.dat and msgs_sent.dat.
e.g.
message_capture/203.0.113.7_56072/msgs_recv.dat
message_capture/203.0.113.7_56072/msgs_sent.dat
The format has been designed so as to result in a minimal performance
impact. A parsing script is added in a later commit.
f2a77ff97b
Call CaptureMessage at appropriate locations
These calls are toggled by a debug-only "capturemessages" flag. Default
disabled.
4d1a582549
Add capture parser
This commit adds contrib/message-capture/message-capture-parser.py, a python
script to be used alongside -capturemessages to parse the captured
messages.
It is complete with arguments and will parse any file given, sorting the
messages in the files when creating the output. If an output file is
specified with -o or --output, it will dump the messages in json format
to that file, otherwise it will print to stdout.
The small change to the unused msg_generic is to bring it in line with
the other message classes, purely to avoid a bug in the future.
e4f378a505
Add Message Capture Test
Add a functional test for CaptureMessage. This connects and then
disconnects a peer so that the handshake can be used to check if capture
is being done correctly.
Included in a docstring in the test is the following:
From the data file we'll only check the structure.
We won't care about things like:
- Deserializing the payload of the message
- This is managed by the deserialize methods in
test_framework.messages
- The order of the messages
- There's no reason why we can't, say, change the order of the
messages in the handshake
- Message Type
- We can add new message types
We're ignoring these because they're simply too brittle to test here.
381f77be85
Add documentation to contrib folder
This commit adds brief documentation for this feature. Included in the
justification is the purpose of this feature as well as usage and
functionality tips.
bff7c66e67
troygiorshev force-pushed
on Jan 23, 2021
troygiorshev
commented at 9:17 pm on January 23, 2021:
contributor
git range-diff master a17b250 bff7c66
Message capture now more gracefully handles unrecognized and corrupted messages
Corrected typo in docs
Added comment to clarify how timestamps are being used here
A couple small improvements to the test
Thanks so much for the review everyone!
Up soon: 1 file = 1 session
MarcoFalke
commented at 11:14 am on January 24, 2021:
member
re-ACK bff7c66e67aa2f18ef70139338643656a54444fe only some minor changes: 👚
doc fixups
append messages in the python parser even when the msg type is unknown or something can not be decoded
michaelfolkson
commented at 2:01 pm on January 24, 2021:
contributor
Still need to do a lot of playing around with this. Interested if you have given any thought @troygiorshev on what questions this tool can easily answer (without additional digging) and what is possible for future improvements/future PRs. Don’t want to derail review of this PR though, so I set up a StackExchange question.
theStack approved
theStack
commented at 6:08 pm on January 27, 2021:
contributor
re-ACK bff7c66e67aa2f18ef70139338643656a54444fe
Verified that the error handling code that was introduced since my last ACK works, by corrupting a message-type in the middle of a random capture file (with a hex editor) and diffing the json outputs of the original and corrupted captures:
leonardojobim
commented at 7:21 am on February 10, 2021:
none
I ran the message-capture-parser.py and received the following error:
Error:
Traceback (most recent call last):
  File "./message-capture-parser.py", line 214, in <module>
    main()
  File "./message-capture-parser.py", line 199, in main
    process_file(str(capture), messages, "recv" in capture.stem, progress_bar)
  File "./message-capture-parser.py", line 159, in process_file
    msg_dict["body"] = to_jsonable(msg)
  File "./message-capture-parser.py", line 82, in to_jsonable
    elif slot in HASH_INT_VECTORS and isinstance(val[0], int):
IndexError: list index out of range
So I changed the line 82
elif slot in HASH_INT_VECTORS and isinstance(val[0], int):
to
elif isinstance(val, list) and len(val) > 0 and slot in HASH_INT_VECTORS and isinstance(val[0], int):
The script worked after this change.
The reason for the error is that the val variable had the value [], so accessing val[0] crashed the script.
The error happened with the following values for the msgtype and for the type of the obj variable (in the to_jsonable() function):