This attempts to fix the problem with revealing too much information about IP addresses, but still provides a way to track the progress of different nodes (via the NodeId). Enabling IP addresses is now optional via the -logips command line option.
Show nodeid instead of addresses (for anonymity) unless otherwise requested #3764
pull rebroad wants to merge 1 commits into bitcoin:master from rebroad:LogIPoptional changing 6 files +35 −25-
rebroad commented at 9:58 PM on February 27, 2014: contributor
-
laanwj commented at 8:48 AM on February 28, 2014: member
I like the idea of logging node IDs instead of logging IP addresses.
In principle, all
-logipsthen has to do is log a node ID to IP:port mapping once on connection and it's possible to figure out the IPs if necessary (Though it may be more useful in practice to actually log IPs in every message anyway). -
sipa commented at 1:04 PM on February 28, 2014: member
Concept ACK; I haven't reviewed the code
-
rebroad commented at 6:29 PM on February 28, 2014: contributor
@laanwj yes, could make -logips display IPs instead of NodeIDs, if this is what most people want. I went with just correlating them once (upon first connection). so that it keeps the debug.log more readable and smaller in size, but the info is still there if it needs to be referenced.
-
luke-jr commented at 6:30 PM on February 28, 2014: member
node=(IP-or-id) seems ideal to me
-
rebroad commented at 11:10 PM on February 28, 2014: contributor
@luke-jr has suggested that the IP address could be added to each and every line of debug.log instead of the node number, whereas I prefer that debug.log correlates the node number with the IP address only once, so reduce debug.log output and to make it more readable.
Can I seek consensus on this please? I'm happy to adapt this pull to the majority preference.
(Luke's suggestion might possibly involve creating a new variable within the CNode struct to contain the string to display so that there didn't need to be two LogPrintfs for every time one or the other is displayed).
-
laanwj commented at 7:36 AM on March 1, 2014: member
I like the approach of logging the IP:port info for a node only once, on connection. After that, log only the Node ID for messages concerning that node.
This minimizes the code difference between the
-logipsand-nologipscase (ie: one message) and avoids ugly if() statements duplicating the debug message apart from the IP information... -
in src/net.cpp:None in 25917368e7 outdated
965 | pnode->AddRef(); 966 | { 967 | LOCK(cs_vNodes); 968 | vNodes.push_back(pnode); 969 | } 970 | + if (fLogIPs)
laanwj commented at 7:38 AM on March 1, 2014:If you want -logips to affect every one of these messages I'd suggest another implementation:
- Define a function IdenfityNode or such that returns a string with either the node ID or IP, based on fLogIPs
- Use this function in all the concerned debug messages
This avoids these kinds of duplicated messages everywhere.
laanwj commented at 9:57 AM on March 2, 2014:You will have to implement it yourself. It doesn't exist yet.
sipa commented at 5:26 PM on May 18, 2014:std::string IdentityNode(const CNode *pnode) { if (fLogIPs) return pnode->addr.ToString(); else return strprintf("%d", pnode->GetId()); }Use a function like that, please?
gmaxwell commented at 8:40 PM on March 1, 2014: contributorI like the general ID of always logging a node ID with every message we emit which can be resolved against getpeerinfo— enabling after the fact diagnosis of a source while the client is still up. It needs to be a nonce though, not a sequential ID or we can correlate them based on connect time. The the logips logging can just make the connect message emit the nodeids.
There is some privacy leak, that you can see that the same node triggered two messages, but I don't think that one is concerning enough to overcome the loss of useful debugging messages.
gmaxwell commented at 1:05 AM on March 2, 2014: contributorI'm not sure what you don't understand. Right now the connect messages show IPs and so you get some good idea of what order nodes are, so the IDs would be linking. Thats bad. Additionally, esp if we later add crypto, I worry about metadata only attackers who can observe what order people connected in. The linkage between IDs and IPs should not be externally observable.
gmaxwell commented at 10:17 AM on March 2, 2014: contributorCan you please reread my answer, including the second half of it?
ghost commented at 6:47 PM on March 2, 2014: noneWouldn't these changes like this make the current ongoing mtgox investigations harder why is this even being prioritized. On Mar 2, 2014 1:44 PM, "R E Broadley" notifications@github.com wrote:
@gmaxwell https://github.com/gmaxwell I already read your answer 3 times, and on the 4th reading my interpretation of it is no different.
Reply to this email directly or view it on GitHubhttps://github.com/bitcoin/bitcoin/pull/3764#issuecomment-36462239 .
gmaxwell commented at 6:25 AM on March 3, 2014: contributor@rebroad "Additionally, esp if we later add crypto, I worry about metadata only attackers who can observe what order people connected in. The linkage between IDs and IPs should not be externally observable." someone watching who connects in what order would know the IDs even if they didn't know the content of the messages. getting the logs would tell them the content.
luke-jr commented at 6:37 AM on March 3, 2014: memberLogging IPs is a security regression IMO.
laanwj commented at 6:43 AM on March 3, 2014: memberThinking about it, we already concluded in #3741:
- The point was to not log IPs (esp. those associated with transactions) by default
- We are logging almost no IPs by default already: only on connect and some kinds of misbehavior:
- Nearly all of the messages affected in this pull are only logged when specific debug classes (mempool, net) are active
So I'm not entirely sure what is the point. As for the code this introduces a lot of clutter (duplicated debug messages) making it harder to maintain. It also adds yet another option to the long list.
rebroad commented at 6:25 PM on March 3, 2014: contributor@gmaxwell I'm not understanding what you're not understanding. There would be no linkage as you mention unless this optional command line argument is given. Logging which node is doing what is a useful debugging too, especially when working on code such as parallel block download or anything which needs to correlate what was sent to nodes and what was received from nodes. It needs to be a debug option therefore. The alternative I can suggest is that instead of the "logips" option as I've done here, I change it to "lognodeids" option, and we leave the display of IP addresses in as it currently is, but make logging node ids dependant on this command line option given. Either way, the "problem" as you put it would still exist, which is the correlation between node ids and IP addresses, but since these options would only be enabled by developers, and are necessary for certain development work, I don't really see there being an issue. @laanwj Hopefully my explanation above also answers your query.
sipa commented at 7:08 PM on March 3, 2014: member@rebroad: @gmaxwell is saying that if someone is able to monitor your network activity (just metadata, nor the actual traffic) + has access to these logs (even without
-logips) this would allow reconstructing which nodes were talking to eachother.I personally think this is a very limited problem, and the solution suggested (using random instead of sequentially assigned identifiers) doesn't help much against it (
-logtimestampsis default now, so you would still be able to correlate with the log).I think this can even improve usefulness of logs (both with and without -logips, though I'd prefer to only log IPs even with
-logipsat initial connection), as sequentially assigned ids are actually more readable than IPs when looking through logs to diagnose sync issues.in src/main.cpp:None in 268625ae21 outdated
2384 | @@ -2385,6 +2385,7 @@ void PushGetBlocks(CNode* pnode, CBlockIndex* pindexBegin, uint256 hashEnd) 2385 | pnode->pindexLastGetBlocksBegin = pindexBegin; 2386 | pnode->hashLastGetBlocksEnd = hashEnd; 2387 | 2388 | + LogPrint("net", "sending getblocks. node=%d\n", pnode->id);
sipa commented at 12:14 AM on March 4, 2014:Nit: I'd call it "peer" rather than "node". I know the source code is already confused in that regard, but it's not really an identifier of a particular node we're showing - just which # peer it is for us.
in src/net.cpp:None in 268625ae21 outdated
549 | @@ -550,7 +550,10 @@ void CNode::PushVersion() 550 | CAddress addrYou = (addr.IsRoutable() && !IsProxy(addr) ? addr : CAddress(CService("0.0.0.0",0))); 551 | CAddress addrMe = GetLocalAddress(&addr); 552 | RAND_bytes((unsigned char*)&nLocalHostNonce, sizeof(nLocalHostNonce)); 553 | - LogPrint("net", "send version message: version %d, blocks=%d, us=%s, them=%s, peer=%s\n", PROTOCOL_VERSION, nBestHeight, addrMe.ToString(), addrYou.ToString(), addr.ToString()); 554 | + if (fLogIPs)
sipa commented at 12:21 AM on March 4, 2014:Alternative: only have one place where the actual IP is logged.
My suggestion: in net.h, at the end of the CNode::CNode() constructor:
if (hSocket != INVALID_SOCKET) LogPrint("net", "Added connection to %s peer=%d", addrNameIn, id);
in src/net.cpp:None in af1f442793 outdated
551 | @@ -552,7 +552,10 @@ void CNode::PushVersion() 552 | CAddress addrYou = (addr.IsRoutable() && !IsProxy(addr) ? addr : CAddress(CService("0.0.0.0",0))); 553 | CAddress addrMe = GetLocalAddress(&addr); 554 | RAND_bytes((unsigned char*)&nLocalHostNonce, sizeof(nLocalHostNonce)); 555 | - LogPrint("net", "send version message: version %d, blocks=%d, us=%s, them=%s, peer=%s\n", PROTOCOL_VERSION, nBestHeight, addrMe.ToString(), addrYou.ToString(), addr.ToString()); 556 | + if (fLogIPs) 557 | + LogPrint("net", "send version message: version %d, blocks=%d, us=%s, them=%s, peer=%s\n", PROTOCOL_VERSION, nBestHeight, addrMe.ToString(), addrYou.ToString(), addr.ToString()); 558 | + else 559 | + LogPrint("net", "send version message: version %d, blocks=%d, us=%s\n", PROTOCOL_VERSION, nBestHeight, addrMe.ToString());
sipa commented at 2:19 PM on April 6, 2014:Can you print the peer id instead of the ip here, in this case?
sipa commented at 2:19 PM on April 6, 2014: memberACK, apart from minor nit above.
rebroad commented at 1:52 AM on May 9, 2014: contributorWhat is holding this up please?
luke-jr commented at 2:43 AM on May 16, 2014: memberI know.
in src/net.cpp:None in fd546b01fe outdated
482 | @@ -483,7 +483,7 @@ CNode* ConnectNode(CAddress addrConnect, const char *pszDest) 483 | { 484 | addrman.Attempt(addrConnect); 485 | 486 | - LogPrint("net", "connected %s\n", pszDest ? pszDest : addrConnect.ToString()); 487 | + if (fLogIPs) LogPrint("net", "connected %s\n", pszDest ? pszDest : addrConnect.ToString());
sipa commented at 5:22 PM on May 18, 2014:List the nodeid here as well? So you can correlate a messages back to what IP sent them if necessary (only with -logips) anyway.
rebroad commented at 11:34 AM on May 21, 2014:done
in src/net.h:None in fd546b01fe outdated
314 | @@ -315,7 +315,7 @@ class CNode 315 | } 316 | 317 | // Be shy and don't send version until we hear 318 | - if (hSocket != INVALID_SOCKET && !fInbound) 319 | + if ((hSocket != INVALID_SOCKET) && (!fInbound))
sipa commented at 5:28 PM on May 18, 2014:Why?
rebroad commented at 11:34 AM on May 21, 2014:mistake. undone.
in src/net.h:None in fd546b01fe outdated
429 | @@ -430,7 +430,7 @@ class CNode 430 | nRequestTime = it->second; 431 | else 432 | nRequestTime = 0; 433 | - LogPrint("net", "askfor %s %d (%s)\n", inv.ToString().c_str(), nRequestTime, DateTimeStrFormat("%H:%M:%S", nRequestTime/1000000).c_str()); 434 | + LogPrint("net", "askfor %s %d (%s) peer=%d\n", inv.ToString().c_str(), nRequestTime, DateTimeStrFormat("%H:%M:%S", nRequestTime/1000000).c_str(), id);
sipa commented at 5:28 PM on May 18, 2014:Optional: the .c_str() could be dropped here.
rebroad commented at 11:38 AM on May 21, 2014:done
sipa commented at 5:34 PM on May 18, 2014: memberI don't really care about adding an optional and by-default-off method for listing IPs. For debugging it is useful, and nobody will turn it on for any reason but that.
The arguments against even providing the ability to link log messages from the same node are interesting. I'm not sure it's such a big deal. If it is, maybe we can list the peer=id message piece only when -logips is provided as well (and thus even stricten the default behaviour, which currently does list some IPs already)?
rebroad commented at 2:49 AM on June 28, 2014: contributorrebaed, and nits addressed.
in src/init.cpp:None in c033cd7386 outdated
287 | @@ -288,6 +288,7 @@ std::string HelpMessage(HelpMessageMode mode) 288 | strUsage += " -genproclimit=<n> " + _("Set the processor limit for when generation is on (-1 = unlimited, default: -1)") + "\n"; 289 | strUsage += " -help-debug " + _("Show all debugging options (usage: --help -help-debug)") + "\n"; 290 | strUsage += " -logtimestamps " + _("Prepend debug output with timestamp (default: 1)") + "\n"; 291 | + strUsage += " -logips " + _("Include IP addresses in debug output (default: 0)") + "\n";
Diapolo commented at 11:34 AM on July 1, 2014:Nit²: This needs to be placed above
-logtimestamps.Show nodeid instead of addresses (for anonymity) unless otherwise requested. 2e36866fecBitcoinPullTester commented at 4:47 AM on July 4, 2014: noneAutomatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/p3764_2e36866fecb7420cd73047a7aa762a6e5e225695/ for binaries and test log. This test script verifies pulls every time they are updated. It, however, dies sometimes and fails to test properly. If you are waiting on a test, please check timestamps to verify that the test.log is moving at http://jenkins.bluematt.me/pull-tester/current/ Contact BlueMatt on freenode if something looks broken.
laanwj merged this on Jul 4, 2014laanwj closed this on Jul 4, 2014laanwj referenced this in commit 00fa78c35b on Jul 4, 2014rebroad deleted the branch on Jul 5, 2014gmaxwell commented at 1:54 PM on July 15, 2014: contributorWas was this merged without adding the peer index to getpeerinfo? The peer numbers given in the logs are not useful.
DrahtBot locked this on Sep 8, 2021
github-metadata-mirror
This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-14 15:15 UTC
This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me