log: improve addrman logging #22839

pull mzumsande wants to merge 1 commits into bitcoin:master from mzumsande:202108_log_addrman changing 2 files +19 −15
  1. mzumsande commented at 10:09 pm on August 30, 2021: member

    The addrman helper functions GetNewBucket() and GetTriedBucket()

    1. log into the wrong category (BCLog::NET instead of BCLog::ADDRMAN)
    2. log too unspecifically - especially GetTriedBucket() gets called from many different places (e.g. Check_(), Serialize()), it seems sufficient to me logging these when moving an address from new to tried. Running a node with -checkaddrman=1and net logging currently results in a lot of repetitive log entries.

    This PR moves these log entries to Add_() and Good_() and also adds logging for Select_() (allowing statistics about New/Tried success probabilities), GetAddr_(), ClearNew() and MakeTried().

  2. DrahtBot added the label P2P on Aug 30, 2021
  3. DrahtBot added the label Utils/log/libs on Aug 30, 2021
  4. fanquake requested review from amitiuttarwar on Aug 31, 2021
  5. fanquake requested review from jnewbery on Aug 31, 2021
  6. theStack commented at 9:23 am on August 31, 2021: member
    Concept ACK
  7. jnewbery commented at 9:54 am on August 31, 2021: member
    Code review ACK 632e648fd8a4d7b71be45139a51c75ffff305a1b
  8. theStack approved
  9. theStack commented at 8:36 pm on September 2, 2021: member
    Code-review ACK 632e648fd8a4d7b71be45139a51c75ffff305a1b
  10. practicalswift commented at 1:48 pm on September 4, 2021: contributor
    Concept ACK
  11. DrahtBot commented at 8:53 pm on September 7, 2021: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    No conflicts as of last run.

  12. in src/addrman.cpp:574 in 632e648fd8 outdated
    570@@ -573,6 +571,7 @@ void CAddrMan::GetAddr_(std::vector<CAddress>& vAddr, size_t max_addresses, size
    571 
    572         vAddr.push_back(ai);
    573     }
    574+    LogPrint(BCLog::ADDRMAN, "Created getaddr response of size %d\n", vAddr.size());
    


    naumenkogs commented at 9:33 am on September 17, 2021:

    nits:

    1. Say “of size XYZ entries” (implying not bytes).
    2. Maybe something more specific like “Selected XYZ entries to be sent in response to GETADDR”

    mzumsande commented at 10:36 pm on October 2, 2021:
    Thanks - I forgot to answer this - will address it when I push next (i.e. after #22950 is merged and I’ll need to rebase anyway).

    naumenkogs commented at 7:16 am on October 4, 2021:
    Why so pessimistic? :) This already has 3 ACKs, so might bring this up at the meeting? Unless you explicitly want to avoid forcing a rebase on #22950

    jnewbery commented at 9:11 am on October 4, 2021:
    I think that since #22950 is so much larger and requires more time-consuming review, it makes sense to try to merge that first and rebase other PRs on top of it (rather than merge small PRs and force 22950 to be rebased/re-reviewed).

    mzumsande commented at 9:53 am on October 4, 2021:
    Yes, I agree with that. I’d much rather rebase this than go through all the (mostly mechanical) changes of 22950 another time.

    mzumsande commented at 5:20 pm on October 5, 2021:
    I took your second suggestion, thanks!
  13. naumenkogs commented at 9:34 am on September 17, 2021: member
    ACK 632e648fd8a4d7b71be45139a51c75ffff305a1b
  14. DrahtBot added the label Needs rebase on Oct 5, 2021
  15. mzumsande force-pushed on Oct 5, 2021
  16. mzumsande commented at 5:20 pm on October 5, 2021: member
    Rebased and addressed feedback by @naumenkogs .
  17. DrahtBot removed the label Needs rebase on Oct 5, 2021
  18. theStack approved
  19. theStack commented at 9:05 pm on October 5, 2021: member
    re-ACK 2ac363a1f731b06d3f3c8b2fd7a02d0f846bb39b
  20. in src/addrman.cpp:787 in 2ac363a1f7 outdated
    775@@ -780,7 +776,7 @@ std::vector<CAddress> AddrManImpl::GetAddr_(size_t max_addresses, size_t max_pct
    776 
    777         addresses.push_back(ai);
    778     }
    779-
    780+    LogPrint(BCLog::ADDRMAN, "Selected %d addrman entries to be sent in response to getaddr\n", addresses.size());
    


    vasild commented at 8:38 am on October 6, 2021:
    nit: a mixture of tenses is used: moving, adding, select, selected.

    mzumsande commented at 11:34 pm on October 7, 2021:
    I changed it to past in all added logs, but didn’t change all existing ones: Sometimes the log describes something that is about to be done after the log.
  21. in src/addrman.cpp:586 in 2ac363a1f7 outdated
    582@@ -589,8 +583,7 @@ void AddrManImpl::Good_(const CService& addr, bool test_before_evict, int64_t nT
    583             m_tried_collisions.insert(nId);
    584         }
    585     } else {
    586-        LogPrint(BCLog::ADDRMAN, "Moving %s to tried\n", addr.ToString());
    587-
    588+        LogPrint(BCLog::ADDRMAN, "Moving IP %s mapped to AS%i to tried bucket %i\n", addr.ToString(), addr.GetMappedAS(m_asmap), tried_bucket);
    


    vasild commented at 8:41 am on October 6, 2021:

    nit: might as well drop “IP”. “Moving 10.20.30.40” is also fine IMO.

    “Moving IP 2g5qfdkn2vvcbqhzcyvyiitg4ceukybxklraxjnu7atlhd22gdwywaid.onion” is funny because that is not an IP (address).


    mzumsande commented at 11:34 pm on October 7, 2021:
    Done as suggested.
  22. in src/addrman.cpp:658 in 2ac363a1f7 outdated
    654@@ -662,6 +655,7 @@ bool AddrManImpl::Add_(const CAddress& addr, const CNetAddr& source, int64_t nTi
    655             ClearNew(nUBucket, nUBucketPos);
    656             pinfo->nRefCount++;
    657             vvNew[nUBucket][nUBucketPos] = nId;
    658+            LogPrint(BCLog::ADDRMAN, "Adding IP %s mapped to AS%i to new bucket %i\n", addr.ToString(), addr.GetMappedAS(m_asmap), nUBucket);
    


    vasild commented at 8:52 am on October 6, 2021:

    Since we log the “add” operation, would it make sense to also log the “delete”?

    • Delete(): removes an entry from the new table
    • MakeTried(): may move an entry from tried to new.

    mzumsande commented at 11:47 pm on October 7, 2021:

    I added a log to MakeTried() as suggested.

    Delete()`: removes an entry from the new table

    It only removes from mapAddr and mapInfo: If Add() attempts to add an unknown addr, which collides with an existing, non-terrible one, Create() and Delete() are called even though nothing is ever actually added into any new bucket - I don’t think Delete() should log in this case. That’s why I added a log entry to ClearNew() instead of Delete():

  23. in src/addrman.cpp:583 in 2ac363a1f7 outdated
    582@@ -589,8 +583,7 @@ void AddrManImpl::Good_(const CService& addr, bool test_before_evict, int64_t nT
    583             m_tried_collisions.insert(nId);
    


    vasild commented at 9:09 am on October 6, 2021:

    Above, on line 581 there is a confusing log which says “moving”, but it is actually not moving anything - it only adds the ID to a list of collided IDs. The promoted entry remains in the new table (Good() fails to promote it to tried) and the existent entry remains in tried.

     0     if (test_before_evict && (vvTried[tried_bucket][tried_bucket_pos] != -1)) {
     1         // Output the entry we'd be colliding with, for debugging purposes
     2         auto colliding_entry = mapInfo.find(vvTried[tried_bucket][tried_bucket_pos]);
     3-        LogPrint(BCLog::ADDRMAN, "Collision inserting element into tried table (%s), moving %s to m_tried_collisions=%d\n", colliding_entry != mapInfo.end() ? colliding_entry->second.ToString() : "", addr.ToString(), m_tried_collisions.size());
     4+        LogPrint(
     5+            BCLog::ADDRMAN,
     6+            "Cannot move %s from new to tried due to a collision%s. Collisions=%u.",
     7+            addr.ToString(),
     8+            colliding_entry != mapInfo.end() ? " with " + colliding_entry->second.ToString() : "",
     9+            m_tried_collisions.size());
    10         if (m_tried_collisions.size() < ADDRMAN_SET_TRIED_COLLISION_SIZE) {
    11             m_tried_collisions.insert(nId);
    12         }
    13     } else {
    

    mzumsande commented at 11:50 pm on October 7, 2021:
    I changed this, and also moved the log call after the Insert into m_tried_collisions so that the logged m_tried_collisions.size() make more sense. “Cannot move” sounds a bit too final, because it might be moved a bit later when evaluating m_tried_collisions so I chose a slightly different wording.
  24. vasild approved
  25. vasild commented at 9:12 am on October 6, 2021: member

    ACK 2ac363a1f731b06d3f3c8b2fd7a02d0f846bb39b

    Some non-blocker comments below.

  26. jnewbery commented at 9:49 am on October 6, 2021: member
    utACK 2ac363a1f731b06d3f3c8b2fd7a02d0f846bb39b
  27. naumenkogs commented at 10:23 am on October 6, 2021: member
    ACK 2ac363a1f731b06d3f3c8b2fd7a02d0f846bb39b, but it’d be great to apply what vasild suggested.
  28. fanquake commented at 5:42 am on October 7, 2021: member

    Given this change is quite small, and re-ACKing is easy, I’d suggest addressing the outstanding review comments before we merge this. Rather than having a followup to change the same lines again.

    Also @naumenkogs I edited your ACK to remove the @ mention, otherwise it would have ended up in the merge message.

  29. mzumsande commented at 7:28 am on October 7, 2021: member
    Makes sense of course, I’ll push an update this evening.
  30. mzumsande force-pushed on Oct 7, 2021
  31. mzumsande force-pushed on Oct 7, 2021
  32. mzumsande commented at 11:59 pm on October 7, 2021: member
    I addressed @vasild’s suggestions, thanks for the review! Note that now this PR introduces more additional BCLog::ADDRMAN log calls. Although I originally had the intention to only log from the functions belonging to the public interface, I think that the entries in ClearNew and MakeTried can be helpful for debugging.
  33. in src/addrman.cpp:587 in 2ceab68079 outdated
    585             m_tried_collisions.insert(nId);
    586         }
    587+        // Output the entry we'd be colliding with, for debugging purposes
    588+        auto colliding_entry = mapInfo.find(vvTried[tried_bucket][tried_bucket_pos]);
    589+        LogPrint(BCLog::ADDRMAN, "Collision with %s while attempting to move %s to tried table. Collisions=%d\n",
    590+            colliding_entry != mapInfo.end() ? colliding_entry->second.ToString() : "",
    


    vasild commented at 7:07 am on October 8, 2021:

    nit: If the entry is not found, then the log would look like:

    0Collision with  while attempting...
    

    I think the entry will always be found, unless some addrman corruption has happened. The previous log would also look strange in that case.

    0        LogPrint(BCLog::ADDRMAN, "Collision%s while attempting to move %s to tried table. Collisions=%d\n",
    1            colliding_entry != mapInfo.end() ? " with " + colliding_entry->second.ToString() : "",
    

    Would produce either Collision while... or Collision with 1.2.3.4 while....


    naumenkogs commented at 8:03 am on October 8, 2021:
    This case was unclear for me either. Perhaps assert != end is too dangerous, but at least logging that something is wrong explicitly would be great?

    vasild commented at 8:15 am on October 8, 2021:
    Yeah, I also think an assert is too dangerous (and out of the scope of this PR). The methods Good_(), Attempt()_, Connected()_ and SetServices_() just cancel the current operation without disturbing the entire process (like assert() would do) if the unexpected happens (search for // if not found, bail out).

    MarcoFalke commented at 8:21 am on October 8, 2021:
    Assume could be used

    vasild commented at 9:07 am on October 8, 2021:

    Like this?

    0Assume(colliding_entry != mapInfo.end());
    1LogPrint(..., colliding_entry->second.ToString(), ...);
    

    in the adverse scenario, in production, the first line would do nothing and the second line would crash.


    jnewbery commented at 9:14 am on October 8, 2021:
    This seems like a reasonable improvement, but the behavior here is unchanged in this PR. Previously the log was also doing a test on colliding_entry != mapInfo.end(). This PR is currently focused entirely on logging improvements, so I think changing this behavior should be left for a different PR.

    mzumsande commented at 8:42 pm on October 11, 2021:
    Ok, I’ll leave this as is in this PR.
  34. vasild approved
  35. vasild commented at 7:11 am on October 8, 2021: member
    ACK 2ceab68079601072a8ab6f084e4b3123025d0463
  36. in src/addrman.cpp:478 in 2ceab68079 outdated
    474@@ -481,6 +475,7 @@ void AddrManImpl::ClearNew(int nUBucket, int nUBucketPos)
    475         assert(infoDelete.nRefCount > 0);
    476         infoDelete.nRefCount--;
    477         vvNew[nUBucket][nUBucketPos] = -1;
    478+        LogPrint(BCLog::ADDRMAN, "Removed %s from new bucket %i\n", infoDelete.ToString(), nUBucket);
    


    naumenkogs commented at 8:08 am on October 8, 2021:
    nit: If you decided to log a bucket, I assume the reason for that is to debug addrman logs in case something goes wrong. Then, logging a position would also be useful :)

    vasild commented at 8:20 am on October 8, 2021:

    Now, we are doing feature creep to this PR, so feel free to ignore. But, but, but… what about an uniform new[i][j] and tried[i][j] all over the place? E.g.

    0Moving 1.2.3.4 from tried[5][20] to new[4][50]
    

    mzumsande commented at 1:47 pm on October 9, 2021:
    That does sound useful! Might just take a couple more days until I get to it.

    mzumsande commented at 8:43 pm on October 11, 2021:
    I changed this as suggested by @vasild with the latest push.
  37. in src/addrman.cpp:530 in 2ceab68079 outdated
    526@@ -532,6 +527,7 @@ void AddrManImpl::MakeTried(AddrInfo& info, int nId)
    527         infoOld.nRefCount = 1;
    528         vvNew[nUBucket][nUBucketPos] = nIdEvict;
    529         nNew++;
    530+        LogPrint(BCLog::ADDRMAN, "Moved %s from tried to new bucket %i to make space\n", infoOld.ToString(), nUBucket);
    


    naumenkogs commented at 8:11 am on October 8, 2021:

    mzumsande commented at 8:44 pm on October 11, 2021:
    Done.
  38. naumenkogs commented at 8:15 am on October 8, 2021: member
    ACK 2ceab68079601072a8ab6f084e4b3123025d0463
  39. jnewbery commented at 9:14 am on October 8, 2021: member
    ACK 2ceab68079601072a8ab6f084e4b3123025d0463
  40. mzumsande force-pushed on Oct 11, 2021
  41. mzumsande commented at 8:47 pm on October 11, 2021: member
    Latest push: I added the bucket position everywhere where only the bucket was logged before and changed to a consistent notation new[i][j], tried[i][j].
  42. in src/addrman.cpp:725 in 12350e1e98 outdated
    721@@ -720,6 +722,7 @@ std::pair<CAddress, int64_t> AddrManImpl::Select_(bool newOnly) const
    722             assert(it_found != mapInfo.end());
    723             const AddrInfo& info{it_found->second};
    724             if (insecure_rand.randbits(30) < fChanceFactor * info.GetChance() * (1 << 30)) {
    725+                LogPrint(BCLog::ADDRMAN, "Selected %s from new to connect to\n", info.ToString());
    


    vasild commented at 10:18 am on October 13, 2021:
    This is the “tried” case but would log “new”. Below, in the “new” case we would log “tried”.

    mzumsande commented at 10:39 pm on October 13, 2021:
    Oops, thanks for catching that! Fixed now.
  43. mzumsande force-pushed on Oct 13, 2021
  44. vasild approved
  45. vasild commented at 8:36 am on October 14, 2021: member

    ACK e1d1788027e1339b0013fc8787e0251862d85117

    Thanks!

  46. in src/addrman.cpp:725 in e1d1788027 outdated
    721@@ -720,6 +722,7 @@ std::pair<CAddress, int64_t> AddrManImpl::Select_(bool newOnly) const
    722             assert(it_found != mapInfo.end());
    723             const AddrInfo& info{it_found->second};
    724             if (insecure_rand.randbits(30) < fChanceFactor * info.GetChance() * (1 << 30)) {
    725+                LogPrint(BCLog::ADDRMAN, "Selected %s from tried to connect to\n", info.ToString());
    


    jnewbery commented at 10:11 am on October 15, 2021:

    I don’t really like having internal addrman logs including assumptions about how the external interface function is being used. Here, I think it’s sufficient to log “Selected %s from tried”. The calling code is responsible for logging about what it does with the returned data.

    You could imagine in the future the Select() function being used for a purpose other than to get an address to connect to immediately, in which case this log would be misleading/incorrect.

    Same comment below for the “to be sent in response to getaddr” log.


    mzumsande commented at 10:28 pm on October 18, 2021:
    Changed. The reason I had those is to distinguish between GetAddr and Select, which both select addresses from addrman, but for a different purpose, and with a different methodology closely tied to the respective purpose. I removed the description of the purpose but kept the “getaddr” keyword to make that difference clear.
  47. jnewbery commented at 10:12 am on October 15, 2021: member

    ACK e1d1788027e1339b0013fc8787e0251862d85117

    One non-essential suggestion inline.

  48. log: improve addrman logging b65a25a846
  49. mzumsande force-pushed on Oct 18, 2021
  50. vasild approved
  51. vasild commented at 3:29 pm on October 19, 2021: member
    ACK b65a25a84666d41a0af4ad98ffadfa4ac802d1bb
  52. fanquake requested review from naumenkogs on Oct 20, 2021
  53. jnewbery commented at 8:38 am on October 20, 2021: member
    ACK b65a25a84666d41a0af4ad98ffadfa4ac802d1bb
  54. fanquake merged this on Oct 21, 2021
  55. fanquake closed this on Oct 21, 2021

  56. sidhujag referenced this in commit 2f6538e571 on Oct 21, 2021
  57. Fabcien referenced this in commit 1136c0b575 on Oct 21, 2022
  58. DrahtBot locked this on Oct 30, 2022
  59. mzumsande deleted the branch on Nov 3, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-10-06 19:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me