- when there is no network connection or a failing / non-existent proxy passed to the client the debug.log was spamed with socket 10022 errors
- this is fixed by adding a check for hSocketMax != 0
- merged a WSAGetLastError() into the related printf()
fix "socket select error 10022" spam #1772
pull Diapolo wants to merge 1 commits into bitcoin:master from Diapolo:fix_socket_10022_spam changing 1 files +2 −3-
Diapolo commented at 12:15 PM on September 1, 2012: none
-
8207857f40
fix "socket select error 10022" spam
- when there is no network connection or a failing / non-existent proxy passed to the client the debug.log was spamed with socket 10022 errors - this is fixed by adding a check for hSocketMax != 0 - merged a WSAGetLastError() into the related printf()
-
laanwj commented at 6:51 AM on September 2, 2012: member
Maybe we should not call select at all when there is no work to do? (ie, hSocketMax == 0). Or am I overlooking something?
Edit: I think I am. Select without fds should effectively be a sleep() of the specified duration. So why is it returning errors at all?
-
Diapolo commented at 9:50 AM on September 2, 2012: none
select returns 10022, when the 3 fds parameters are 0, which is the case when we have no net or a faulty proxy was supplied.
-
laanwj commented at 10:24 AM on September 2, 2012: member
But does it still pause for the "timeout" duration in that case? Or is it a busy loop, erroring out immediately and causing 100% CPU usage?
-
Diapolo commented at 11:04 AM on September 2, 2012: none
int nSelect = select(hSocketMax + 1, &fdsetRecv, &fdsetSend, &fdsetError, &timeout);The timeout parameter defines the maximum time for
select()to wait, but when we have a failed net all fdsetX variables have their .fd_count set to 0 and so I assume select fails imediately.But we have a timeout some lines below
Sleep(timeout.tv_usec/1000);, which prevents a 100% CPU usage IMO.Edit: And I can confirm, that Qt nearly idles, when I have set a non working proxy.
-
laanwj commented at 11:30 AM on September 2, 2012: member
Ok, the timeout is already OK. Hadn't seen that.
-
in src/net.cpp:None in 8207857f40
785 | - if (hSocketMax != INVALID_SOCKET) 786 | + if ((hSocketMax != INVALID_SOCKET) && (hSocketMax != (SOCKET)0)) 787 | { 788 | - printf("socket select error %d\n", nErr); 789 | + printf("socket select error %d\n", WSAGetLastError()); 790 | for (unsigned int i = 0; i <= hSocketMax; i++)
laanwj commented at 5:26 PM on September 4, 2012:BTW this <= here is interesting.
hSocketMax is the maximum socket number. Which means that, if it is 0 (which it is initialized with), we still pass 1 into select (which has a parameter "maximum fd+1", not "maximum fd").
So what happens if there is nothing to do? hSocketMax will be 0, however 1 will be passed into select, even though no fds have been set with FD_SET.
Hmm the code is fishy. Maybe we should track the maximum of (hSocket+1) instead of hSocket, so we can distinguish the cases only fd 0 set and no fds set. This also means that we don't have to increase the number before passing it into select.
Diapolo commented at 7:44 PM on September 4, 2012:At least on Windows, the first parameter of select is ignored
nfds [in] Ignored. The nfds parameter is included only for compatibility with Berkeley sockets., which should be no cause for an error then.So hSocketMax is 0, when hListenSocket is 0 or all pnode->hSocket are 0. This results in all fdset vars to contain .fd_count == 0, which results in the select() 10022 error code (
The time-out value is not valid, or all three descriptor parameters were null.).The longer I think about it, the more I feel like NOT calling select() when all fdset vars are empty, which of course is perhaps more expensive in that part of the code, then simply ignoring 10022 - via a check for hSocketMax == 0, which is true, when there are simply none ;).
Edit: That is such a part in the code, where I'm really missing more documentation :-D!
laanwj commented at 7:51 PM on September 4, 2012:The point is that we currently cannot distinguish between no fds and a single fd, 0. Which is wrong. If you make hSocketMax to be max(hSocket+1) this is easy to detect, as the value will be 0 for no fds and 1 for the single fd 0 case. And as a plus, we then pass the correct nfds on UNIX in both cases.
(also, if you do that, you could decide to skip the select when hSocketMax==0. You cannot do so right now, as it has two meanings and thus is ambigious)
Diapolo commented at 8:14 PM on September 4, 2012:I'm thinking for minutes now, but have no clue, where you would like to place max(hSocket + 1) or do you mean max(hSocketMax, hSocket + 1)? And even then it makes no sense yet...
laanwj commented at 8:21 PM on September 4, 2012:Yes,
hSocketMax = max(hSocketMax, hListenSocket + 1); ... hSocketMax = max(hSocketMax, pnode->hSocket + 1);Then change the select line to (ie, remove the + 1)
int nSelect = select(hSocketMax, &fdsetRecv, &fdsetSend, &fdsetError, &timeout);And change this loop to < i.s.o <=:
for (unsigned int i = 0; i < hSocketMax; i++) FD_SET(i, &fdsetRecv);Only after that, you can check for hSocketMax==0 and be sure that it means "no sockets".
Diapolo commented at 8:50 PM on September 4, 2012:I found out that setting an invalid -proxy leads to not even entering the BOOST_FOREACH loops, as vhListenSocket and vNodes are empty. That means hSocketMax simply keeps the init value of 0, which leads to 10022 spam.
I also did some research for the first select()-parameter and found this:
<pre>The first parameter to select() is the maximum file descriptor that is set in the structs PLUS ONE. That is, if you have 20 file descriptors in the sets, and the maximum value a file descriptor has is 123, then the value passed as the first parameter must be 124.</pre>
This makes me think the + 1 before hSocketMax in select() is indeed correct and valid.
laanwj commented at 8:55 PM on September 4, 2012:Yes, that's what I am trying to say all the time. The +1 is valid, however it is better to do it at different places, so you can distinguish between the two conditions that I mentioned. Currently you can not reliably detect the condition that causes the error.
Diapolo commented at 9:23 PM on September 4, 2012:When I now read the first comment all that makes sense ^^ you are such a patient person :-P.
With your new code we would pass hSocketMax == 0 to select(), when -proxy is invalid (no BOOST_FOREACH pass), I'm not sure if this is valid. But you are right, when hSocketMax stays 0, we have no fds set, but I'm not sure if it can ever become 1. It would be 1, if vhListenSocket or vNodes is not empty and the contained sockets are == 0.
vhListenSocket is empty, when we are not listening and vNodes if we are not connected to any peers IMO.
jgarzik commented at 3:04 AM on September 5, 2012: contributorhSocketMax+1 is not valid, when zero FDs are in the fd set. hSocketMax+1 is valid when fds are set.
We should not be passing "nfds == 1" to select, when that is clearly not the case. That is the bug that must be fixed here.
We always pass a timeout, therefore select always has "work" even if zero fds are indicated.
BitcoinPullTester commented at 12:01 PM on September 5, 2012: noneAutomatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/8207857f401bc1a48f863be646c5a508a7cdfe9c for binaries and test log.
Diapolo closed this on Sep 5, 2012owlhooter referenced this in commit 362becbcce on Oct 10, 2018DrahtBot locked this on Sep 8, 2021
This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-20 00:16 UTC
More mirrored repositories can be found on mirror.b10c.me