When using bitcoin-cli gettxoutsetinfo, I see in the defailed logs that the RPC is being performed: it flushes transactions to the coindb (it only takes about 40ms todo that):
This is performed in an "httpworker.*" thread (there are 4 such threads by default for handling ANY incoming RPC request. Visibly that request has only been able to flush all supposed transactions, but it expects to find more and does not find them. It seems to wait forever for those missing transactions,.
This occurs when there are new incoming transactions in a forked block. Even after waiting more then 10 minutes (so that other blocks are detected and downloaded), it still waits. Most probably those transactions from the forked blocks will never be validated and are now in a dead branch of the chain (not the best chaintip).
So bitcoin-cli gettxoutsetinfo never gets any reply. I can press CTRL+C to terminate the bitcoin-cli client. But in the bitcoind daemon, the "httpworker.*" thread is still running in a tight loop that will never terminate. So we've got one of these 4 httpworkers stuck.
In htop, I see that "httpworker.1" is busy and uses all its alloted CPU time. There are still 3 other httpworkers ready to accept new RPC requests.
I can still perform some other RPC requests with bitcoin-cli (e.g. bitcoin-cli -getinfo, or bitcoin-cli -netinfo 4, or bitcoin-cli getchaintips... but if I try to use bitcoin-cli gettxoutsetinfo again, they also flush the transactions in the coindb, inside one of the available httpworkers, but here also it never termintes: now we've got 2 httpworkers busy. Here again press CTRL+C in the client.
In "htop" you can see these "httpworker.1" and "httpworker.2" constantly busy.
Now If I try running bitcoin-cli gettxoutsetinfo a third time, and a 4th time, they will also flush to disk (this does not take long, about 40ms each time). "htop" reveals now that ALL "httpworker.*" are busy.
At that time it becomes impossible to use any RPC request (not even just bitcoin-cli -getinfo or bitcoin-cli getchaintips): that RPC will not even be initiated: the HTTP listener sees it, it quues the request, the messge handler will pick it, but will not be able to submit it to an httpworker thread as they are all busy): these RPC requsets will stack and will never be serviced.
If I run bitcoin-cli stop, it will also not reply; we cannot stop the bitcoin daemon safely.
During that time, all other services are running normally: new incoming blocks are detected, downloaded and sync'ed, validated, indexed in the chainstate or in other indexes (including the coindb).
I can wait for one hour, or more, these httpworkers are still running and using all their CPU time without ever terminating. I dno't know what they are waiting for to terminate: if there are still non-validated transactions, or transactions that were already detected as being invalid as they are no longer in the "best chain". It should terminte earlier. But apparently it waits until a given number of transactions have all been processed and flushed (they are not flushed if they are invalid).
If I press CTRL+C or a SIGINT signal to the daemon, it will attemt to terminate threads. But it will wait indefinitely for the termination of "busy" httpworkers: what are they doing?
This occurs if there was a recent incoming block creating a fork will not validated (and that will never be validated: the fork has already been rejected by consensus).
How can we avoid "gettxoutsetinfo" getting stucked for extremely long, or possibly for ever in httpworkers, making the RPC service completely inaccessible, just because there was a chain fork? (this does not block other services, but not being able to service RPC requests means that invalidated forks in the chain are creating a DoS situation in deamons, making its RPC service unusable. So we can no longer monitor what is happening in the chain, and cannot safely stop the deamon to restart it.
Example of log:
2022-08-23T23:16:35.194268Z [http] [httpserver.cpp:240] [http_request_cb] [http] Received a POST request for / from 127.0.0.1:54349
2022-08-23T23:16:35.194652Z [httpworker.0] [rpc/request.cpp:176] [parse] [rpc] ThreadRPCServer method=gettxoutsetinfo user=__cookie__ peeraddr=127.0.0.1:54349
2022-08-23T23:16:35.232055Z [msghand] [logging/timer.h:57] [Log] [lock] Enter: lock contention cs_main, net_processing.cpp:4395 started
2022-08-23T23:16:35.265743Z [httpworker.0] [txdb.cpp:169] [BatchWrite] [coindb] Writing final batch of 0.51 MiB
2022-08-23T23:16:35.296377Z [httpworker.0] [txdb.cpp:171] [BatchWrite] [coindb] Committed 11263 changed transaction outputs (out of 16793) to coin database...
2022-08-23T23:16:35.296884Z [httpworker.0] [validationinterface.cpp:246] [ChainStateFlushed] [validation] Enqueuing ChainStateFlushed: block hash=000000000000000000095c6b72c9effaedbfe84f154e2ca16b764c6180e08bf3
2022-08-23T23:16:35.297289Z [scheduler] [validationinterface.cpp:246] [operator()] [validation] ChainStateFlushed: block hash=000000000000000000095c6b72c9effaedbfe84f154e2ca16b764c6180e08bf3
2022-08-23T23:16:35.297633Z [msghand] [logging/timer.h:57] [Log] [lock] Enter: lock contention cs_main, net_processing.cpp:4395 completed (64964μs)
2022-08-23T23:16:35.297959Z [scheduler] [logging/timer.h:57] [Log] [lock] Enter: lock contention cs_main, index/base.cpp:315 started
2022-08-23T23:16:35.298152Z [httpworker.0] [logging/timer.h:57] [Log] [lock] Enter: lock contention ::cs_main, kernel/coinstats.cpp:156 started
2022-08-23T23:16:35.299703Z [scheduler] [logging/timer.h:57] [Log] [lock] Enter: lock contention cs_main, index/base.cpp:315 completed (1564μs)
2022-08-23T23:16:35.299901Z [msghand] [logging/timer.h:57] [Log] [lock] Enter: lock contention cs_main, net_processing.cpp:4395 started
2022-08-23T23:16:35.300063Z [httpworker.0] [logging/timer.h:57] [Log] [lock] Enter: lock contention ::cs_main, kernel/coinstats.cpp:156 completed (1605μs)
And then after more than one hour, this [httpworker.0] does nothing I can see in the log
Notice "Committed 11263 changed transaction outputs (out of 16793) to coin database..." Vsibily there are more than 5000 transactions missing. Their validation will never come.
I can search the log for the chain "000000000000000000095c6b72c9effaedbfe84f154e2ca16b764c6180e08bf3": it never appears after that.
If I search the log instead before this attempted commit, I see`that this block came as:
2022-08-23T23:02:04.886231Z [msghand] [net_processing.cpp:2790] [ProcessMessage] [net] received: cmpctblock (8273 bytes) peer=23
2022-08-23T23:02:04.891223Z [msghand] [blockencodings.cpp:165] [InitData] [cmpctblock] Initialized PartiallyDownloadedBlock for block 000000000000000000095c6b72c9effaedbfe84f154e2ca16b764c6180e08bf3 using a cmpctblock of size 8273
2022-08-23T23:02:04.891774Z [msghand] [net_processing.cpp:2790] [ProcessMessage] [net] received: blocktxn (33 bytes) peer=23
2022-08-23T23:02:04.919826Z [msghand] [blockencodings.cpp:210] [FillBlock] [cmpctblock] Successfully reconstructed block 000000000000000000095c6b72c9effaedbfe84f154e2ca16b764c6180e08bf3 with 1 txn prefilled, 1329 txn from mempool (incl at least 1 from extra pool) and 0 txn requested
2022-08-23T23:02:04.941281Z [msghand] [validationinterface.cpp:257] [NewPoWValidBlock] [validation] NewPoWValidBlock: block hash=000000000000000000095c6b72c9effaedbfe84f154e2ca16b764c6180e08bf3
2022-08-23T23:02:04.950268Z [msghand] [flatfile.cpp:69] [Allocate] [validation] Pre-allocating up to position 0x3000000 in blk03167.dat
2022-08-23T23:02:05.137427Z [msghand] [validationinterface.cpp:251] [BlockChecked] [validation] BlockChecked: block hash=000000000000000000095c6b72c9effaedbfe84f154e2ca16b764c6180e08bf3 state=Valid
2022-08-23T23:02:05.221094Z [msghand] [validationinterface.cpp:218] [TransactionRemovedFromMempool] [validation] Enqueuing TransactionRemovedFromMempool: txid=9471839224754a14a3455846f2f98541ad655c9792257b9dfe7f19702305626f wtxid=2312cbe8d16869684071bd2c3ad12f8b5353ffe3c1e0d5b7b4cc18656d609c6d
2022-08-23T23:02:05.221599Z [scheduler] [validationinterface.cpp:218] [operator()] [validation] TransactionRemovedFromMempool: txid=9471839224754a14a3455846f2f98541ad655c9792257b9dfe7f19702305626f wtxid=2312cbe8d16869684071bd2c3ad12f8b5353ffe3c1e0d5b7b4cc18656d609c6d
2022-08-23T23:02:05.310848Z [msghand] [validation.cpp:2511] [UpdateTipLog] UpdateTip: new best=000000000000000000095c6b72c9effaedbfe84f154e2ca16b764c6180e08bf3 height=750818 version=0x20800004 log2_work=93.692777 tx=759000744 date='2022-08-23T23:01:31Z' progress=1.000000 cache=2.7MiB(11393txo)
2022-08-23T23:02:05.311293Z [msghand] [txmempool.cpp:736] [check] [mempool] Checking mempool with 91 transactions and 148 inputs
2022-08-23T23:02:05.317205Z [msghand] [validationinterface.cpp:227] [BlockConnected] [validation] Enqueuing BlockConnected: block hash=000000000000000000095c6b72c9effaedbfe84f154e2ca16b764c6180e08bf3 block height=750818
2022-08-23T23:02:05.317607Z [msghand] [validationinterface.cpp:199] [UpdatedBlockTip] [validation] Enqueuing UpdatedBlockTip: new block hash=000000000000000000095c6b72c9effaedbfe84f154e2ca16b764c6180e08bf3 fork block hash=00000000000000000004136e2034fb9e7d4ebe758567c6981840eec05b54f914 (in IBD=false)
2022-08-23T23:02:05.318393Z [scheduler] [validationinterface.cpp:227] [operator()] [validation] BlockConnected: block hash=000000000000000000095c6b72c9effaedbfe84f154e2ca16b764c6180e08bf3 block height=750818
2022-08-23T23:02:05.319058Z [msghand] [logging/timer.h:57] [Log] [lock] Enter: lock contention g_cs_orphans, net_processing.cpp:4395 started
2022-08-23T23:02:05.319917Z [msghand] [logging/timer.h:57] [Log] [lock] Enter: lock contention g_cs_orphans, net_processing.cpp:4395 completed (566μs)
That block temporarily made a switch for the best chaintip, but that chaintip is now orphan. And apparently it pollutes now the coindb and/or the mempool, and is never cleaned. It has been replaced by a better "confirmed" block containing 1330 transactions, much less than what was expected apparently by the "gettxoutsetinfo" RPC which apparently still expects to detect transactions that are orphaned.