[systemd] Use of discouraged network target

bubelov commented at 4:16 pm on November 13, 2022: contributor

Here are the current network hooks used in the Bitcoin Core systemd integration template:

https://github.com/bitcoin/bitcoin/blob/7ef730ca84bd71a06f986ae7070e7b2ac8e47582/contrib/init/bitcoind.service#L16-L18

Semantically, it makes sense, so I never questioned those lines when deploying new nodes, but here are some quotes from the official systemd website:

https://systemd.io/NETWORK_ONLINE/

$network / network-online.target is a mechanism that is required only to deal with software that assumes continuous network is available (i.e. of the simple not-well-written kind)

Services using the network should hence simply place an After=network.target stanza in their unit files, without Wants=network.target or Requires=network.target

After=network.target can be sure that it is stopped before the network is shut down when the system is going down. This allows services to cleanly terminate connections before going down, instead of losing ongoing connections leaving the other side in an undefined state.

If you are a developer, instead of wondering what to do about network.target, please just fix your program to be friendly to dynamically changing network configuration.

Watch rtnetlink and react to network configuration changes as they happen. This is usually the nicest solution, but not always the easiest.

It looks like the use of network-online.target is discouraged and network.target is a preferred option for any program which knows how to adapt to changing network conditions. Are there any good reasons not to follow the official recommendation?
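
For reference, the "watch rtnetlink" suggestion quoted above could look roughly like the sketch below. It is Linux only and is not taken from Bitcoin Core; the function names `parse_events` and `watch` are made up for illustration, and the constants are the ones from `<linux/rtnetlink.h>`. It only reports link and address changes, it does not decide what "online" means.

```python
import socket
import struct

# Linux rtnetlink constants (from <linux/rtnetlink.h>).
RTMGRP_LINK = 0x1
RTMGRP_IPV4_IFADDR = 0x10
RTMGRP_IPV6_IFADDR = 0x100
RTM_NAMES = {16: "RTM_NEWLINK", 17: "RTM_DELLINK", 20: "RTM_NEWADDR", 21: "RTM_DELADDR"}

# struct nlmsghdr: length, type, flags, sequence number, sender port id.
NLMSG_HDR = struct.Struct("=IHHII")


def parse_events(data):
    """Return the names of the rtnetlink messages packed in one datagram."""
    events, offset = [], 0
    while offset + NLMSG_HDR.size <= len(data):
        length, msg_type, _flags, _seq, _pid = NLMSG_HDR.unpack_from(data, offset)
        if length < NLMSG_HDR.size:
            break  # malformed header; stop rather than loop forever
        events.append(RTM_NAMES.get(msg_type, f"type {msg_type}"))
        offset += (length + 3) & ~3  # netlink messages are 4-byte aligned
    return events


def watch():
    """Block forever, printing link and address changes as they happen."""
    with socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, socket.NETLINK_ROUTE) as s:
        # Subscribe to the link and IPv4/IPv6 address multicast groups.
        s.bind((0, RTMGRP_LINK | RTMGRP_IPV4_IFADDR | RTMGRP_IPV6_IFADDR))
        while True:
            for event in parse_events(s.recv(65535)):
                print(event)
```

Calling `watch()` and then bringing an interface up or down should print the corresponding events. A real daemon would feed them into its connection logic instead of printing them.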

bubelov added the label Bug on Nov 13, 2022

willcl-ark commented at 8:17 am on November 14, 2022: member

After=network.target can be sure that it is stopped before the network is shut down when the system is going down. This allows services to cleanly terminate connections before going down, instead of losing ongoing connections leaving the other side in an undefined state.

Presumably After=network-online.target service is afforded the same graceful shutdown time before the network is brought down?

Anyway, speeding up system boot time certainly seems worthwhile, and Bitcoin Core can handle being offline just fine, so I think I’d agree that this change would be a net positive. My own systemd file for Bitcoind has always just contained a simple After=network.target and it’s worked for a few years without issue.

bubelov commented at 11:30 am on November 14, 2022: contributor

Presumably After=network-online.target service is afforded the same graceful shutdown time before the network is brought down?

Yea, my understanding is that it reverses the order of those targets during shutdown:

network-pre.target UP -> network.target UP -> network-online.target UP

network-online.target DOWN -> network.target DOWN -> network-pre.target DOWN

My own systemd file for Bitcoind has always just contained a simple After=network.target and it’s worked for a few years without issue.

That’s good to know, changing things is always risky. I’m doing some testing too.

Sjors commented at 11:05 am on November 15, 2022: member

See @luke-jr’s comment on the PR.

But also, we read anchors.dat very early in the process in order to reestablish two earlier connections. If we fail to connect because we’re still offline we immediately give up on them. That (slightly) increases the risk of an eclipse attack.

https://github.com/bitcoin/bitcoin/blob/48174c0f287b19931ca110670610bd03a03eb914/src/net.cpp#L1766-L1773

This could perhaps be made more robust, if we had a way of detecting online-ness, but arguably that’s better left to systemd to figure out.

bubelov commented at 12:01 pm on November 15, 2022: contributor

@Sjors it seems like systemd doesn’t really know what it means to be online either, and it delegates that to the network manager, giving us no hard guarantees. Their official recommendation is not to trust them blindly and to use rtnetlink instead, if possible.

A robust system boots up independently of external services. More specifically, if a network DHCP server does not react, this should not slow down boot on most setups, but only for those where network connectivity is strictly needed (for example, because the host actually boots from the network).

willcl-ark commented at 1:43 pm on November 15, 2022: member

But also, we read anchors.dat very early in the process in order to reestablish two earlier connections. If we fail to connect because we’re still offline we immediately give up on them. That (slightly) increases the risk of an eclipse attack.

This is something I didn’t consider and a good argument for leaving the service file as it currently is for now IMO.

It also leads me to wonder whether it would make sense to delay making our “best” (anchor) connections until after we have connected at least one other node, as a proxy for online-ness, specifically so that we can avoid them being missed like this…

prusnak commented at 9:56 pm on November 24, 2022: contributor

My own systemd file for Bitcoind has always just contained a simple After=network.target and it’s worked for a few years without issue.

And yet nix-bitcoin changed network.target to network-online.target recently in https://github.com/fort-nix/nix-bitcoin/pull/567, which fixed the following error:

bitcoind: libevent: getaddrinfo: address family for nodename not supported
bitcoind: Binding RPC on address 127.0.0.1 port 8332 failed.
bitcoind: Unable to bind any endpoint for

when dhcpcd and bitcoind were started in parallel.

prusnak commented at 9:58 pm on November 24, 2022: contributor

The change from network.target to network-online.target (in this repo) was introduced in https://github.com/bitcoin/bitcoin/commit/d9392b724cae53b7a16fa5f84ebe152eea496502 by @hebasto

bubelov commented at 4:47 am on November 25, 2022: contributor

Binding RPC on address 127.0.0.1 port 8332 failed

That’s interesting, and it seems to contradict this quote from the official systemd doc:

It is strongly recommended not to make use of this target too liberally: for example network server software should generally not pull this in (since server software generally is happy to accept local connections even before any routable network interface is up). Its primary purpose is network client software that cannot operate without network.

Binding to 127.0.0.1 sounds like a normal thing for “server software” to do. Why would it need to be online (whatever that means) to bind to a local port?
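
A quick way to probe this is to separate name resolution from the bind itself, since the reported log fails in getaddrinfo rather than in bind. The sketch below is only an experiment, not what bitcoind or libevent does; `try_bind_loopback` is a made-up helper. Whether resolver flags such as AI_ADDRCONFIG play a role in the nix-bitcoin failure is an assumption that nothing in this thread confirms.

```python
import socket


def try_bind_loopback(port=0, flags=socket.AI_PASSIVE):
    """Resolve 127.0.0.1, then bind a TCP socket to it; return (ok, detail)."""
    try:
        infos = socket.getaddrinfo(
            "127.0.0.1", port, socket.AF_UNSPEC, socket.SOCK_STREAM, 0, flags
        )
    except socket.gaierror as exc:
        # Resolution failed before any bind was attempted.
        return False, f"getaddrinfo: {exc}"
    family, socktype, proto, _canonname, sockaddr = infos[0]
    with socket.socket(family, socktype, proto) as s:
        try:
            s.bind(sockaddr)
        except OSError as exc:
            return False, f"bind: {exc}"
        return True, f"bound to {s.getsockname()}"
```

Running `try_bind_loopback()` and `try_bind_loopback(flags=socket.AI_PASSIVE | socket.AI_ADDRCONFIG)` early in boot, before dhcpcd has configured an address, would show which step (if any) depends on a configured interface.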

willcl-ark commented at 10:45 am on November 25, 2022: member

But also, we read anchors.dat very early in the process in order to reestablish two earlier connections. If we fail to connect because we’re still offline we immediately give up on them. That (slightly) increases the risk of an eclipse attack.

This seems like a good reason to keep using network-online.target. However, perhaps we should think about changing the anchor connection logic so that it does not give up on anchors until we have at least one remote connection, to avoid this more robustly.
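
To make the idea concrete, here is a minimal sketch of that rule. It is a hypothetical model, not Bitcoin Core's connection manager: `AnchorManager`, `other_outbound` and `on_anchor_failed` are names invented for illustration, and "at least one other outbound connection" is used as the proxy for being online.

```python
from dataclasses import dataclass, field


@dataclass
class AnchorManager:
    """Hypothetical model of anchor handling; not Bitcoin Core's actual code."""

    anchors: list
    other_outbound: int = 0  # non-anchor outbound peers currently connected
    dropped: list = field(default_factory=list)

    def on_anchor_failed(self, addr):
        """Give up on a failed anchor only once another peer shows we are online."""
        if addr not in self.anchors:
            return "unknown"
        if self.other_outbound >= 1:
            # We are demonstrably online, so the anchor itself is unreachable.
            self.anchors.remove(addr)
            self.dropped.append(addr)
            return "dropped"
        # Possibly still offline: keep the anchor and try again later.
        return "retry"
```

With `m = AnchorManager(["203.0.113.1:8333"])`, a failure while `m.other_outbound == 0` returns `"retry"` and keeps the anchor; once another peer is connected the same failure returns `"dropped"`. A real implementation would also need a retry limit or timeout so that anchors are not retried forever.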

pinheadmz commented at 1:54 pm on April 27, 2023: member

This issue is unlikely to be fixed in Bitcoin Core. We’ll close for now, but feel free to open another issue or pull request with a fix.

pinheadmz closed this on Apr 27, 2023

bitcoin locked this on Apr 26, 2024

[systemd] Use of discouraged network target #26493