I2P network optimizations #26837

pull vasild wants to merge 3 commits into bitcoin:master from vasild:i2p_session_pool changing 3 files +54 −9
  1. vasild commented at 5:43 pm on January 6, 2023: contributor
    • Reuse an I2P transient session instead of discarding it if we failed to connect to the desired peer. This means we never used the generated address (destination), whose creation is not cheap. This does not mean that we will use the same address for more than one peer.
    • Lower the number of tunnels for transient sessions.
    • Explicitly specify the number of tunnels for persistent sessions instead of relying on the defaults which differ between I2P routers. This way we get consistent behavior with all routers.

    Alleviates: #26754

    (I have not tested this with i2pd, yet)

  2. DrahtBot commented at 5:43 pm on January 6, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK jonatack, mzumsande
    Concept ACK 1440000bytes, zzzi2p, sipa
    Approach ACK w0xlt

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #27114 (p2p: Allow whitelisting outgoing connections by brunoerg)
    • #26441 (rpc, p2p: add addpermissionflags RPC and allow whitelisting outbound by brunoerg)
    • #25515 (net, test: Virtualise CConnman and add initial PeerManager unit tests by dergoegge)
    • #24748 (test/BIP324: functional tests for v2 P2P encryption by stratospher)
    • #24545 (BIP324: Enable v2 P2P encrypted transport by dhruv)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. vasild commented at 5:53 pm on January 6, 2023: contributor

    If it is desired to rate limit the creation of new sessions, we should (for i2pacceptincoming=0 aka transient sessions):

    • keep track of the rate at which we create new sessions
    • when closing a connection, if the rate is too high, instead of discarding the session put it back to a pool of reused sessions
    • when creating new session, if the rate is too high and if the reused sessions pool is non-empty, then pick one from there

    Reading #26754 I get the impression that this is not necessary. Further, it would be a compromise with “one address per connection” and would be somewhat more complicated (to implement and maintain).

  4. in src/net.h:1142 in 57fd27fd36 outdated
    1137+     * of creating new ones in order to reduce the load on the I2P network.
    1138+     * Creating a session in I2P is not cheap, thus if this is not empty, then
    1139+     * pick an entry from it instead of creating a new session. If connecting to
    1140+     * a host fails, then the created session is put to this pool for reuse.
    1141+     */
    1142+    std::queue<std::unique_ptr<i2p::sam::Session>> m_i2p_sessions GUARDED_BY(m_i2p_sessions_mutex);
    


    vasild commented at 6:33 pm on January 6, 2023:
    This would benefit from #25390 by just adding Synced<...> m_i2p_sessions without a separate mutex.
  5. jonatack commented at 8:19 pm on January 6, 2023: contributor
    Thanks @vasild, will have a look and test with i2pd.
  6. ghost commented at 2:58 am on January 7, 2023: none
    Concept ACK
  7. vasild commented at 2:18 pm on January 7, 2023: contributor

    With i2pacceptincoming=0 onlynet=i2p and this PR (57fd27fd3621e782652c280d99263589fb3697a1):

    • In the first 5 minutes 10 sessions were created (= one session creation per 30 seconds). During that time 19 connection failures occurred, whose sessions were reused for subsequent attempts due to this PR. Without it, it would be 29 (10+19) instead of 10.
    • In the next 18 hours 264 sessions were created (= one session creation per 4 minutes). During that time 234 connection failures occurred.

    Without onlynet=i2p the numbers would be much lower since other networks would also be tried. @zzzi2p, does that look reasonable?

  8. in src/net.cpp:516 in 57fd27fd36 outdated
    509+                }
    510                 connected = i2p_transient_session->Connect(addrConnect, conn, proxyConnectionFailed);
    511+                if (!connected) {
    512+                    LOCK(m_i2p_sessions_mutex);
    513+                    m_i2p_sessions.emplace(i2p_transient_session.release());
    514+                }
    


    vasild commented at 2:30 pm on January 7, 2023:

    I am wondering if m_i2p_sessions can grow unchecked to the point of using too much memory and whether it should be capped.

    If this is run by a single thread then the size of m_i2p_sessions can be either 0 or 1 because in order to create a new session, the list must be empty and then if the connection fails it will be added to the list. If run by N threads, then the size can be at most N, if all of the threads race into it.

    I think that is ok with the current pattern of add 1, take 1. Just a note to the future - if the code is changed to add more entries to the list and it takes.


    jonatack commented at 6:08 pm on January 9, 2023:

    Maybe only emplace if under a max size.

    0-                if (!connected) {
    1+                if (!connected && m_i2p_sessions.size() < 10) {
    2                     LOCK(m_i2p_sessions_mutex);
    3                     m_i2p_sessions.emplace(i2p_transient_session.release());
    

    vasild commented at 12:38 pm on January 11, 2023:
    Fair enough, not too difficult to add a cap. Done.
  9. zzzi2p commented at 6:10 pm on January 7, 2023: none

    ACK, thank you very much.

    This helps my thesis that it was mainly a startup problem that would drive the router into congestion. The key is you’ve gone from 290 tunnels at startup down to 22, which is a massive massive improvement.

    A couple notes:

    • You should monitor/test the reliability of connections over transient tunnels with *.quantity=1. If it doesn’t meet your requirements, retest with *.quantity=2 to see if that improves things enough to be worth the additional resource usage.
    • You should monitor/test the performance of non-transient tunnels with *.quantity=3. If it doesn’t meet your requirements, try 4 or 5 to see if that improves things enough to be worth the additional resource usage.
    • Agreed you want to avoid a pool with lots of unused sessions for a long time, either by design or some check. Also maybe close all idle sessions once you get to 10 good ones? Up to you, you’re on the right track here.
    • Try to get an ack from i2pd’s @orignal
  10. orignal commented at 8:38 pm on January 7, 2023: none
    Looks good
  11. w0xlt commented at 1:51 pm on January 9, 2023: contributor
    Approach ACK
  12. jonatack commented at 5:11 pm on January 9, 2023: contributor

    Testing ATM with i2pd 2.45.0 now 2.45.0_1.

    Without onlynet=i2p the numbers would be much lower since other networks would also be tried.

    We discourage using only I2P enough in doc/i2p.md and this article (and Tor is so popular with node operators) that I wonder if anyone is running bitcoind only over I2P. Two that did reported it took 1-2 months for IBD.

    When running with all of the networks (clearnet/tor/i2p/cjdns), I rarely see more than 1 outbound I2P connection.

    For instance, spun up a node running -i2pacceptincoming=0 and only one single transient destination was created in the first hour.


    Update: now running a node with i2p listening off and onlynet=tor/i2p/cjdns without clearnet and seeing ~3 transient destinations created/hour with grep "Creating transient SAM session" ~/.bitcoin/debug.log.

  13. in src/net.h:959 in 57fd27fd36 outdated
    955@@ -955,7 +956,7 @@ class CConnman
    956     bool AlreadyConnectedToAddress(const CAddress& addr);
    957 
    958     bool AttemptToEvictConnection();
    959-    CNode* ConnectNode(CAddress addrConnect, const char *pszDest, bool fCountFailure, ConnectionType conn_type);
    960+    CNode* ConnectNode(CAddress addrConnect, const char *pszDest, bool fCountFailure, ConnectionType conn_type) EXCLUSIVE_LOCKS_REQUIRED(!m_i2p_sessions_mutex);
    


    jonatack commented at 5:14 pm on January 9, 2023:

    In commit 74fa583, would adding this lock annotation only to the ConnectNode declaration suffice?

    (The corresponding lock assertion AssertLockNotHeld(m_i2p_sessions_mutex); should be added to the function definition.)


    vasild commented at 12:55 pm on January 11, 2023:

    would adding this lock annotation only to the ConnectNode declaration suffice?

    No, all callers of EXCLUSIVE_LOCKS_REQUIRED(!m) must also be tagged with the same annotation. Otherwise:

    0net.cpp:1995:20: error: calling function 'ConnectNode' requires negative capability '!m_unused_i2p_sessions_mutex' [-Werror,-Wthread-safety-analysis]
    1    CNode* pnode = ConnectNode(addrConnect, pszDest, fCountFailure, conn_type);
    2                   ^
    

    Added AssertLockNotHeld() calls to methods marked with EXCLUSIVE_LOCKS_REQUIRED(!m_unused_i2p_sessions_mutex)

  14. in src/net.h:1131 in 57fd27fd36 outdated
    1126@@ -1126,6 +1127,20 @@ class CConnman
    1127      */
    1128     std::vector<CService> m_onion_binds;
    1129 
    1130+    /**
    1131+     * Mutex protecting m_i2p_sam_sessions.
    


    jonatack commented at 5:59 pm on January 9, 2023:
    0     * Mutex protecting m_i2p_sessions.
    

    The role of this mutex and of the queue it guards might be more clearly named something like m_unused_i2p_sessions and m_unused_i2p_sessions_mutex.


    vasild commented at 12:55 pm on January 11, 2023:
    Renamed.
  15. sipa commented at 9:54 pm on January 9, 2023: member
    Concept ACK
  16. mzumsande commented at 3:40 pm on January 10, 2023: contributor

    Concept ACK Not an i2p expert, so these questions are more to improve my own understanding:

    Why exactly is session creation not cheap? I guess it’s cheap locally, but the expensive part is that the new destination needs to be propagated through the network, others will probe for it etc.?

    Does the reduction to inbound.quantity=1 outbound.quantity=1 of this PR affect our ability to help the i2p network by routing other non-bitcoin traffic? Or would that got through seperate tunnels?

  17. orignal commented at 6:06 pm on January 10, 2023: none

    Why exactly is session creation not cheap?

    Every session creates few tunnels. Tunnels creation is not cheap, because asymmetric crypto and KDF for each hop.

  18. jonatack commented at 6:12 pm on January 10, 2023: contributor

    Why exactly is session creation not cheap?

    Per the OP in #26754: …some limit on the number of addresses (aka destinations or tunnels) built by the bitcoin client is necessary. I2P Destinations are designed to be long-lived, it’s not like in Tor where you can create tons of circuits. Waiting for tunnels to be built for each connection also adds a huge amount of delay to connection setup time. A high number of addresses will also result in excessive CPU and bandwidth usage by your router to build the tunnels for each. Additionally, other routers will reject your excessive tunnel building, which will increase your resource usage even more as your router retries. And, of course, the overhead applies to every connection attempt, not just every successful connection.

    Locally, I’m seeing between 45-100 seconds for a successful transient I2P connection to be established, based on the “Creating transient SAM session xyz” and “Transient SAM session xyz created” logging in src/i2p.cpp.

  19. zzzi2p commented at 0:42 am on January 11, 2023: none

    @mzumsande re: why not cheap, in addition to what orignal said:

    • a tunnel is a bandwidth commitment by 3 peers, who have finite capacity, for 10 minutes
    • the asymmetric crypto is both locally and at each peer
    • the tunnel build request and response themselves require bandwidth
    • if any peer rejects or drops the request, esp. in times of congestion, the whole build process begins again
    • each peer doesn’t “know” if the other peers agreed, and if it agrees must allocate capacity for the tunnel regardless

    you can see how careful congestion strategies in the router are required to prevent congestion collapse, where failed tunnel builds lead to more failed builds

    “local” or “client” tunnels created by SAM are different from “transit” or “participating” tunnels created by others, and for the most part, one won’t affect the other, until we approach bandwidth or CPU limits.

  20. vasild commented at 11:05 am on January 11, 2023: contributor

    You should monitor/test the reliability of connections over transient tunnels with *.quantity=1

    I ran two nodes for about 20 hours each - one with *.quantity=1 and one with *.quantity=3. I recorded the time when each I2P session was created and its duration (time between Creating transient SAM session X and Destroying SAM session X). Running with i2pacceptincoming=0 onlynet=i2p. Here are the results:

    1tunnel_timeline 3tunnel_timeline

    It looks ok - the points in the “1 tunnel” case are not visibly lower than in the “3 tunnel” case. I guess if connections are dropped frequently and unexpectedly due to low tunnel count, then the durations would be lower.

  21. i2p: reuse created I2P sessions if not used
    In the case of `i2pacceptincoming=0` we use transient addresses
    (destinations) for ourselves for each outbound connection. It may
    happen that we
    * create the session (and thus our address/destination too)
    * fail to connect to the particular peer (e.g. if they are offline)
    * dispose the unused session.
    
    This puts unnecessary load on the I2P network because session creation
    is not cheap. Is exaggerated if `onlynet=i2p` is used in which case we
    will be trying to connect to I2P peers more often.
    
    To help with this, save the created but unused sessions and pick them
    later instead of creating new ones.
    
    Alleviates: https://github.com/bitcoin/bitcoin/issues/26754
    b906b64eb7
  22. i2p: lower the number of tunnels for transient sessions
    This will lower the load on the I2P network. Since we use one transient
    session for connecting to just one peer, a higher number of tunnels is
    unnecessary.
    
    This was suggested in:
    https://github.com/bitcoin/bitcoin/issues/26754#issuecomment-1365449401
    https://github.com/bitcoin/bitcoin/issues/26754#issuecomment-1367356129
    
    The options are documented in:
    https://geti2p.net/en/docs/protocol/i2cp#options
    
    A tunnel is unidirectional, so even if we make a single outbound
    connection we still need an inbound tunnel to receive the messages sent
    to us over that connection.
    
    Alleviates: https://github.com/bitcoin/bitcoin/issues/26754
    801b405f85
  23. i2p: use consistent number of tunnels with i2pd and Java I2P
    The default number of tunnels in the Java implementation is 2 and in the
    C++ i2pd it is 5. Pick a mid-number (3) and explicitly set it in order
    to get a consistent behavior with both routers. Do this for persistent
    sessions which are created once at startup and can be used to open up
    to ~10 outbound connections and can accept up to ~125 incoming
    connections. Transient sessions already set number of tunnels to 1.
    
    Suggested in:
    https://github.com/bitcoin/bitcoin/issues/26754#issuecomment-1367356129
    https://geti2p.net/en/docs/api/samv3
    
    Alleviates: https://github.com/bitcoin/bitcoin/issues/26754
    3c1de032de
  24. vasild force-pushed on Jan 11, 2023
  25. vasild commented at 1:02 pm on January 11, 2023: contributor
    57fd27fd36...3c1de032de: address suggestions
  26. jonatack commented at 9:16 pm on January 11, 2023: contributor

    ACK 3c1de032de01e551992975eb374465300a655f44

    I ran two nodes for about 20 hours each - one with *.quantity=1 and one with *.quantity=3. It looks ok - the points in the “1 tunnel” case are not visibly lower than in the “3 tunnel” case.

    Nice! Were these two nodes running at the same time (to maybe reflect the same network conditions)?

  27. jonatack commented at 9:31 pm on January 11, 2023: contributor
    Testing with i2pd 2.45.0_1 (edit: 2.45.1 was just released; I upgraded, and I think confirm that I’m still seeing the i2pd issue). Will try to measure bandwidth before/after this change as described in #26838 (comment).
  28. vasild commented at 3:57 pm on January 12, 2023: contributor

    Were these two nodes running at the same time

    Yes, and also on the same (java) router.

     0#!/usr/local/bin/bash
     1
     2# Dump shell script to parse `debug.log` and produce an output like "session_start_seconds_since_startup session_duration" per line. Use like: ./parse.sh debug.log > durations3
     3
     4log=${1}
     5
     6b=""
     7
     8for sess in $(sed -En 's/^.*Creating transient SAM session ([^ ]+) .*$/\1/p' < $log) ; do
     9
    10    beg=$(sed -En "s/^([^ ]+).*Creating transient SAM session ${sess}.*$/\1/p" < $log)
    11    end=$(sed -En "s/^([^ ]+).*Destroying SAM session ${sess}.*$/\1/p" < $log)
    12    if [ -z "$end" ] ; then
    13        continue
    14    fi
    15    # convert a timestamp like 2023-01-09T16:28:40Z to number of seconds since epoch, this only works on freebsd, linux's data(1) does not accept -f IIRC
    16    beg_s=$(date -j -f "%Y-%m-%dT%TZ" $beg +%s)
    17    if [ -z "$b" ] ; then
    18        b=$beg_s
    19    fi
    20    end_s=$(date -j -f "%Y-%m-%dT%TZ" $end +%s)
    21    echo "$((beg_s - b)) $((end_s - beg_s))"
    22done
    
    0# gnuplot script to visualize the result from the above parsing script. Use like: gnuplot this_file.gnuplot
    1set terminal svg size 600,400 dynamic mouse standalone background "#ffffff"
    2set output "3tunnel_timeline.svg"
    3set title "Session duration - 3 tunnels"
    4set xlabel "Session creation time [minutes since startup]"
    5set ylabel "Session duration [minutes]"
    6set logscale y
    7set yrange [0:2000]
    8plot "durations3" using ($1/60):($2/60) title "" with points pointtype 7 pointsize 1 linecolor rgb "#ff0000"
    
  29. fanquake added the label Needs backport (24.x) on Jan 13, 2023
  30. jonatack commented at 7:02 pm on January 13, 2023: contributor

    Thanks @vasild.

    I’d be curious if anyone testing this branch sees this added logging return a value > 0 (it’s only been 0 for me so far with -i2pacceptincoming=0, -onlynet=i2p and several addnoded peers to see 10-14 i2p peers, thus m_unused_i2p_sessions is either empty or contains one session).

     0+++ b/src/net.cpp
     1@@ -500,11 +500,13 @@ CNode* CConnman::ConnectNode(CAddress addrConnect, const char *pszDest, bool fCo
     2                     LOCK(m_unused_i2p_sessions_mutex);
     3                     if (m_unused_i2p_sessions.empty()) {
     4+                        LogPrintf("m_unused_i2p_sessions size: 0\n");
     5                         i2p_transient_session =
     6                             std::make_unique<i2p::sam::Session>(proxy.proxy, &interruptNet);
     7                     } else {
     8                         i2p_transient_session.swap(m_unused_i2p_sessions.front());
     9                         m_unused_i2p_sessions.pop();
    10+                        LogPrintf("m_unused_i2p_sessions size after swap+pop: %i\n", m_unused_i2p_sessions.size());
    
  31. zzzi2p commented at 10:37 pm on January 13, 2023: none

    More fixes on the i2p/i2pd side: Neither project was stopping tunnel building if the client timed out and closed the control socket waiting for SESSION STATUS. Fixed in both projects.

    i2pd was waiting a minimum of 20 sec before responding with SESSION_STATUS. Reduced to 3 sec.

    Java i2p changes: https://github.com/i2p/i2p.i2p/commit/841e2771803ea999f881af4d959b7cc4848dd8d7 i2pd changes: https://github.com/PurpleI2P/i2pd/commit/7146a4dbaee7df1a165a18869593ce29089dde2b

  32. vasild commented at 1:40 pm on January 14, 2023: contributor

    I’d be curious if anyone testing this branch sees this added logging return a value > 0

    This comment #26837 (review):

    If run by N threads, then the size can be at most N, if all of the threads race into it.

    Let me elaborate. This is in ConnectNode() which is only called from OpenNetworkConnection() which can be called from more than one thread. If this code is run from just one thread then the size of the list can be either 0 or 1 and your logs will always print 0. If run by N threads it may happen that all of them execute this at the same time - check that the list is empty and create a new session, and then if all those N connections fail, then each thread will append its session to the list, making it with size N. It is ok, we run just a couple of threads (not 100s or 1000s, not even 10s).

    You could get a printout >0 - maybe if you addnode ... onetry at the same time as a normal connection is attempted from the b-opencon thread. I guess it would be very rare and difficult to reproduce.

  33. jonatack commented at 6:59 pm on January 19, 2023: contributor

    Thanks @vasild, that confirms what I was thinking.

    Java i2p changes: i2p/i2p.i2p@841e277 i2pd changes: PurpleI2P/i2pd@7146a4d

    Happy to test i2pd as soon as the new release is out. I’m still seeing the issue I reported with i2pd version 2.45.1 (0.9.57).

  34. fanquake commented at 10:40 am on February 16, 2023: member
    What are the next steps here? Does this just need more sanity checking from anyone who actually runs with i2p? I assume whatever upstream releases we may/may not have been waiting for have happened by now?
  35. vasild commented at 10:59 am on February 16, 2023: contributor
    Like usual, after this is deemed to have had enough review and testing it can be merged. It is not tied to or dependent on I2P releases.
  36. jonatack commented at 5:14 pm on February 16, 2023: contributor

    FWIW, I2P adoption by Bitcoin nodes seems to have been increasing significantly over the past 3 months. From 200-350 in November to over 1100 active peers now, possibly spurred in part by Umbrel and Raspiblitz adding I2P support in December.

    Anyone else like to have a look at testing/reviewing this PR?

  37. mzumsande commented at 8:17 pm on February 21, 2023: contributor

    Light ACK 3c1de032de01e551992975eb374465300a655f44

    I ran a -onlynet=i2p node with this branch for a while, with both options for -i2pacceptincoming and didn’t encounter any problems - also reviewed the code and checked that the changes make sense in light of the available i2p documentation and its recommendations.

    However, I’m not an i2p expert so I probably wouldn’t have spotted any more sophisticated problems if there were any.

    In light of #26754 (comment) it seems that backport may not be that urgent anymore.

  38. fanquake commented at 5:58 pm on February 22, 2023: member

    In light of #26754 (comment) it seems that backport may not be that urgent anymore.

    Yea, that does seem the case (no urgency), however I think pulling these into 24.x is still worthwhile. I’ll add the changes here to #26878 shortly.

  39. fanquake merged this on Feb 22, 2023
  40. fanquake closed this on Feb 22, 2023

  41. jonatack commented at 6:19 pm on February 22, 2023: contributor

    I’m still seeing the issue I reported with i2pd version 2.45.1 (0.9.57).

    No longer seeing the previously daily issue since a few days with i2pd version 2.46.0 and 2.46.1.

  42. vasild deleted the branch on Feb 23, 2023
  43. sidhujag referenced this in commit e35130eb81 on Feb 25, 2023
  44. fanquake removed the label Needs backport (24.x) on Feb 27, 2023
  45. fanquake commented at 2:10 pm on February 27, 2023: member
    Added to #26878 for backporting to 24.x.
  46. fanquake referenced this in commit 5027e93b6a on Feb 27, 2023
  47. fanquake referenced this in commit 29cdf42226 on Feb 27, 2023
  48. fanquake referenced this in commit ab3bd457cd on Feb 27, 2023
  49. glozow referenced this in commit c8c85ca16e on Feb 27, 2023
  50. bitcoin locked this on Feb 27, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-01 10:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me