RPC: gettxoutsetinfo correctly flushes transactions to the coindb, but then does not return any RPC reply, and keeps running #25912

verdy-p commented at 11:11 PM on August 23, 2022: none

When using bitcoin-cli gettxoutsetinfo, I see in the detailed logs that the RPC is being performed: it flushes transactions to the coindb (it only takes about 40 ms to do that).

This is performed in an "httpworker.*" thread (there are 4 such threads by default for handling ANY incoming RPC request). Visibly, that request has only been able to flush all supposed transactions, but it expects to find more and does not find them. It seems to wait forever for those missing transactions.

This occurs when there are new incoming transactions in a forked block. Even after waiting more than 10 minutes (so that other blocks are detected and downloaded), it still waits. Most probably those transactions from the forked blocks will never be validated and are now in a dead branch of the chain (not the best chaintip).

So bitcoin-cli gettxoutsetinfo never gets any reply. I can press CTRL+C to terminate the bitcoin-cli client. But in the bitcoind daemon, the "httpworker.*" thread is still running in a tight loop that will never terminate. So we've got one of these 4 httpworkers stuck.

In htop, I see that "httpworker.1" is busy and uses all its allotted CPU time. There are still 3 other httpworkers ready to accept new RPC requests.

I can still perform some other RPC requests with bitcoin-cli (e.g. bitcoin-cli -getinfo, or bitcoin-cli -netinfo 4, or bitcoin-cli getchaintips), but if I try to use bitcoin-cli gettxoutsetinfo again, it also flushes the transactions to the coindb, inside one of the available httpworkers, and here too it never terminates: now we've got 2 httpworkers busy. Here again, I press CTRL+C in the client.

In "htop" you can see these "httpworker.1" and "httpworker.2" constantly busy.

Now if I try running bitcoin-cli gettxoutsetinfo a third time, and a 4th time, they will also flush to disk (this does not take long, about 40 ms each time). "htop" now reveals that ALL "httpworker.*" threads are busy.

At that point it becomes impossible to use any RPC request (not even just bitcoin-cli -getinfo or bitcoin-cli getchaintips): that RPC will not even be initiated. The HTTP listener sees it and queues the request, and the message handler will pick it up, but will not be able to submit it to an httpworker thread as they are all busy. These RPC requests will stack up and will never be serviced.
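
The exhaustion described above can be reproduced in miniature. The following is only a sketch of the general failure mode (a fixed pool of workers, each occupied by a request that never returns), not of Bitcoin Core's actual HTTP server; `simulate_exhaustion`, `stuck_request` and `quick_request` are hypothetical names, and an `Event` stands in for whatever the stuck RPC is waiting on.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, wait


def simulate_exhaustion(num_workers=4, probe_timeout=0.5):
    """Occupy every worker of a fixed-size pool, then submit one more request.

    Returns (served_while_stuck, served_after_release).
    """
    release = threading.Event()                     # what the stuck requests wait on
    all_busy = threading.Barrier(num_workers + 1)   # workers + the submitting thread

    def stuck_request():
        all_busy.wait()      # report that this worker is now occupied
        release.wait()       # a real hang would never get past this point
        return "late reply"

    def quick_request():
        return "pong"

    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        stuck = [pool.submit(stuck_request) for _ in range(num_workers)]
        all_busy.wait()      # from here on, every worker is busy

        # The extra request is accepted and queued, but no worker can run it.
        quick = pool.submit(quick_request)
        done, _ = wait([quick], timeout=probe_timeout)
        served_while_stuck = quick in done

        # Unblock the workers so the demonstration can finish.
        release.set()
        served_after_release = quick.result(timeout=5) == "pong"
        for future in stuck:
            future.result(timeout=5)

    return served_while_stuck, served_after_release


if __name__ == "__main__":
    print(simulate_exhaustion())
```

With four workers and four stuck requests, the fifth request is queued but not served until a worker is released, which matches the symptom that even `bitcoin-cli stop` gets no reply.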

If I run bitcoin-cli stop, it will also not reply; we cannot stop the bitcoin daemon safely.

During that time, all other services are running normally: new incoming blocks are detected, downloaded and sync'ed, validated, indexed in the chainstate or in other indexes (including the coindb).

I can wait for one hour or more: these httpworkers are still running and using all their CPU time without ever terminating. I don't know what they are waiting for before terminating: whether there are still non-validated transactions, or transactions that were already detected as invalid because they are no longer in the "best chain". It should terminate earlier. But apparently it waits until a given number of transactions have all been processed and flushed (they are not flushed if they are invalid).

If I press CTRL+C or send a SIGINT signal to the daemon, it will attempt to terminate threads. But it will wait indefinitely for the termination of the "busy" httpworkers: what are they doing?

This occurs if there was a recent incoming block creating a fork that has not been validated (and that will never be validated: the fork has already been rejected by consensus).

How can we avoid "gettxoutsetinfo" getting stuck for an extremely long time, or possibly forever, in httpworkers, making the RPC service completely inaccessible, just because there was a chain fork? This does not block other services, but not being able to service RPC requests means that invalidated forks in the chain are creating a DoS situation in daemons, making their RPC service unusable. So we can no longer monitor what is happening in the chain, and cannot safely stop the daemon to restart it.

Example of log:

2022-08-23T23:16:35.194268Z [http] [httpserver.cpp:240] [http_request_cb] [http] Received a POST request for / from 127.0.0.1:54349
2022-08-23T23:16:35.194652Z [httpworker.0] [rpc/request.cpp:176] [parse] [rpc] ThreadRPCServer method=gettxoutsetinfo user=__cookie__ peeraddr=127.0.0.1:54349
2022-08-23T23:16:35.232055Z [msghand] [logging/timer.h:57] [Log] [lock] Enter: lock contention cs_main, net_processing.cpp:4395 started
2022-08-23T23:16:35.265743Z [httpworker.0] [txdb.cpp:169] [BatchWrite] [coindb] Writing final batch of 0.51 MiB
2022-08-23T23:16:35.296377Z [httpworker.0] [txdb.cpp:171] [BatchWrite] [coindb] Committed 11263 changed transaction outputs (out of 16793) to coin database...
2022-08-23T23:16:35.296884Z [httpworker.0] [validationinterface.cpp:246] [ChainStateFlushed] [validation] Enqueuing ChainStateFlushed: block hash=000000000000000000095c6b72c9effaedbfe84f154e2ca16b764c6180e08bf3
2022-08-23T23:16:35.297289Z [scheduler] [validationinterface.cpp:246] [operator()] [validation] ChainStateFlushed: block hash=000000000000000000095c6b72c9effaedbfe84f154e2ca16b764c6180e08bf3
2022-08-23T23:16:35.297633Z [msghand] [logging/timer.h:57] [Log] [lock] Enter: lock contention cs_main, net_processing.cpp:4395 completed (64964μs)
2022-08-23T23:16:35.297959Z [scheduler] [logging/timer.h:57] [Log] [lock] Enter: lock contention cs_main, index/base.cpp:315 started
2022-08-23T23:16:35.298152Z [httpworker.0] [logging/timer.h:57] [Log] [lock] Enter: lock contention ::cs_main, kernel/coinstats.cpp:156 started
2022-08-23T23:16:35.299703Z [scheduler] [logging/timer.h:57] [Log] [lock] Enter: lock contention cs_main, index/base.cpp:315 completed (1564μs)
2022-08-23T23:16:35.299901Z [msghand] [logging/timer.h:57] [Log] [lock] Enter: lock contention cs_main, net_processing.cpp:4395 started
2022-08-23T23:16:35.300063Z [httpworker.0] [logging/timer.h:57] [Log] [lock] Enter: lock contention ::cs_main, kernel/coinstats.cpp:156 completed (1605μs)

And then, after more than one hour, this [httpworker.0] does nothing I can see in the log.

Notice "Committed 11263 changed transaction outputs (out of 16793) to coin database..." Visibly there are more than 5000 transactions missing. Their validation will never come.

I can search the log for the string "000000000000000000095c6b72c9effaedbfe84f154e2ca16b764c6180e08bf3": it never appears after that.

If I search the log instead before this attempted commit, I see that this block came as:
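
For reference, the two counters in that log line can be extracted mechanically. The sketch below only does the arithmetic behind the "more than 5000" figure; it assumes nothing about what the second number means (whether the difference is missing transactions or simply entries that did not need writing is not established in this thread). `parse_commit_line` is a hypothetical helper name.

```python
import re

# Matches the "[coindb] Committed N changed transaction outputs (out of M)" message.
COMMIT_RE = re.compile(
    r"Committed (?P<changed>\d+) changed transaction outputs \(out of (?P<total>\d+)\)"
)


def parse_commit_line(line):
    """Return (changed, total, total - changed), or None if the line does not match."""
    match = COMMIT_RE.search(line)
    if match is None:
        return None
    changed = int(match.group("changed"))
    total = int(match.group("total"))
    return changed, total, total - changed


if __name__ == "__main__":
    line = (
        "2022-08-23T23:16:35.296377Z [httpworker.0] [txdb.cpp:171] [BatchWrite] [coindb] "
        "Committed 11263 changed transaction outputs (out of 16793) to coin database..."
    )
    print(parse_commit_line(line))  # (11263, 16793, 5530)
```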

2022-08-23T23:02:04.886231Z [msghand] [net_processing.cpp:2790] [ProcessMessage] [net] received: cmpctblock (8273 bytes) peer=23
2022-08-23T23:02:04.891223Z [msghand] [blockencodings.cpp:165] [InitData] [cmpctblock] Initialized PartiallyDownloadedBlock for block 000000000000000000095c6b72c9effaedbfe84f154e2ca16b764c6180e08bf3 using a cmpctblock of size 8273
2022-08-23T23:02:04.891774Z [msghand] [net_processing.cpp:2790] [ProcessMessage] [net] received: blocktxn (33 bytes) peer=23
2022-08-23T23:02:04.919826Z [msghand] [blockencodings.cpp:210] [FillBlock] [cmpctblock] Successfully reconstructed block 000000000000000000095c6b72c9effaedbfe84f154e2ca16b764c6180e08bf3 with 1 txn prefilled, 1329 txn from mempool (incl at least 1 from extra pool) and 0 txn requested
2022-08-23T23:02:04.941281Z [msghand] [validationinterface.cpp:257] [NewPoWValidBlock] [validation] NewPoWValidBlock: block hash=000000000000000000095c6b72c9effaedbfe84f154e2ca16b764c6180e08bf3
2022-08-23T23:02:04.950268Z [msghand] [flatfile.cpp:69] [Allocate] [validation] Pre-allocating up to position 0x3000000 in blk03167.dat
2022-08-23T23:02:05.137427Z [msghand] [validationinterface.cpp:251] [BlockChecked] [validation] BlockChecked: block hash=000000000000000000095c6b72c9effaedbfe84f154e2ca16b764c6180e08bf3 state=Valid
2022-08-23T23:02:05.221094Z [msghand] [validationinterface.cpp:218] [TransactionRemovedFromMempool] [validation] Enqueuing TransactionRemovedFromMempool: txid=9471839224754a14a3455846f2f98541ad655c9792257b9dfe7f19702305626f wtxid=2312cbe8d16869684071bd2c3ad12f8b5353ffe3c1e0d5b7b4cc18656d609c6d
2022-08-23T23:02:05.221599Z [scheduler] [validationinterface.cpp:218] [operator()] [validation] TransactionRemovedFromMempool: txid=9471839224754a14a3455846f2f98541ad655c9792257b9dfe7f19702305626f wtxid=2312cbe8d16869684071bd2c3ad12f8b5353ffe3c1e0d5b7b4cc18656d609c6d
2022-08-23T23:02:05.310848Z [msghand] [validation.cpp:2511] [UpdateTipLog] UpdateTip: new best=000000000000000000095c6b72c9effaedbfe84f154e2ca16b764c6180e08bf3 height=750818 version=0x20800004 log2_work=93.692777 tx=759000744 date='2022-08-23T23:01:31Z' progress=1.000000 cache=2.7MiB(11393txo)
2022-08-23T23:02:05.311293Z [msghand] [txmempool.cpp:736] [check] [mempool] Checking mempool with 91 transactions and 148 inputs
2022-08-23T23:02:05.317205Z [msghand] [validationinterface.cpp:227] [BlockConnected] [validation] Enqueuing BlockConnected: block hash=000000000000000000095c6b72c9effaedbfe84f154e2ca16b764c6180e08bf3 block height=750818
2022-08-23T23:02:05.317607Z [msghand] [validationinterface.cpp:199] [UpdatedBlockTip] [validation] Enqueuing UpdatedBlockTip: new block hash=000000000000000000095c6b72c9effaedbfe84f154e2ca16b764c6180e08bf3 fork block hash=00000000000000000004136e2034fb9e7d4ebe758567c6981840eec05b54f914 (in IBD=false)
2022-08-23T23:02:05.318393Z [scheduler] [validationinterface.cpp:227] [operator()] [validation] BlockConnected: block hash=000000000000000000095c6b72c9effaedbfe84f154e2ca16b764c6180e08bf3 block height=750818
2022-08-23T23:02:05.319058Z [msghand] [logging/timer.h:57] [Log] [lock] Enter: lock contention g_cs_orphans, net_processing.cpp:4395 started
2022-08-23T23:02:05.319917Z [msghand] [logging/timer.h:57] [Log] [lock] Enter: lock contention g_cs_orphans, net_processing.cpp:4395 completed (566μs)

That block temporarily made a switch for the best chaintip, but that chaintip is now orphaned. Apparently it now pollutes the coindb and/or the mempool, and is never cleaned up. It has been replaced by a better "confirmed" block containing 1330 transactions, far fewer than what the "gettxoutsetinfo" RPC apparently expected, since it still seems to expect to detect transactions that are orphaned.

verdy-p added the label Bug on Aug 23, 2022

ZaidanK commented at 3:31 AM on August 24, 2022: none

Hi, the command did take a while to return. Perhaps about 30 seconds. This is what I got,

{ "height": 750848, "bestblock": "000000000000000000042bfa933012137be4426a545056c51d75f8358306f099", "txouts": 83853482, "bogosize": 6255092170, "hash_serialized_2": "e8915ee90c5152753b10ab7e1b48d33f21d999a89668a3f5b948478b153ccff6", "total_amount": 19130092.17246361, "transactions": 49927698, "disk_size": 5119722459 }

What version are you running? I'm on Bitcoin Core RPC client version v23.99.0-b1a2021f7809-dirty

verdy-p commented at 4:26 AM on August 24, 2022: none

I'm using the same version as you. It was compiled from the current development branch in this Git repository, with debug (-g) and with -DDEBUG_LOCKCONTENTION (as instructed by other developers, because I submitted several other issues here).

The node is fully synchronized, running as a full node.

Running on standard Debian Bullseye, in a VM with 8 CPU cores (Intel Core i9), 32 GB RAM, and a dedicated physical storage volume for Bitcoin data, over a fast and reliable RAID5 array (on 6 physical disks), with 1 TB allocated only for this volume (there's a separate volume for the rest of the system or /home; all Bitcoin data goes into its own mounted partition, optimized for performance with 64 KB clusters).

I can reproduce it on my local PC, or when using a VM hosted on Azure or AWS. So I know that it is not a hardware issue. I have the same issue when running on a cheaper personal notebook (with an SSD), where Debian is installed with a basic desktop environment.

It has both connectivity over IPv4 and IPv6 (but I don't use Tor) and accepts incoming connections from peers on the Internet.

That VM only runs the kernel (init) and a single logged-in user running two bash sessions: one for running the bitcoin daemon (via gdb --args bitcoind -printtoconsole), the other for looking at what is happening (with "htop" to see active processes and threads, or to run 'bitcoin-cli' commands). I use the "-printtoconsole" option just to have a direct view of logs, without having to use some "tail -f debug.log" in another console; but the main reason is that there are some other outputs on stderr which are not written to "debug.log", coming from some dependent libraries, notably from MiniUPNPC which is used in its own thread, so I can see these messages synchronized with other logs. I've not disabled UPNP with the "upnp=0" option; it has no effect on my network, as UPNP is not enabled on the router, which is just configured with a single port forwarding for accepting incoming connections on TCP port 8333 with IPv4 or IPv6. I've disabled Tor, so "listenonion=0" is set in bitcoin.conf, and there's no option to enable I2P. The RPC service is configured to allow only connections using cookie authentication, and it only accepts local connections, with the "whitelist=127.0.0.1/32" option, and only from the same local Linux user, where this "rpc.cookie" file is accessible and not readable by any Linux user (except "root") other than the one used to run bitcoind or gdb.

I run bitcoind through gdb only to be able to pause it and inspect what is happening and get stack traces if needed. But the same occurs when letting it run outside gdb (then I can use the released version 23.0, or a compiled version 23.99 without debug (-g)). Note that I cannot "attach" gdb to an already running bitcoind, due to security restrictions: "ptrace" is not permitted across sessions with different controlling terminals or different parent processes. This is enforced by the kernel with its standard security settings, so gdb must be attached as the parent process and must run bitcoind itself. I will never run any Internet-connected program accepting incoming connections on any Linux where such restrictions on ptrace are not enforced by the kernel, at least with the "yama" security model or with the "SELinux" model. A hardened kernel is mandatory for running any VM in AWS and Azure when not running a Linux VM on costly dedicated hardware but on shared hosts, where you cannot run custom kernels, only supported kernels: I use the standard (securely signed) Debian Bullseye kernel proposed and supported by these cloud-hosting providers.

I also know now (see other bug reports) that one "bitcoin.conf" option (checkblockindex=1) has an issue and is not thread-safe, so I've disabled it (that bug was signaled by another user in 2019). That option is disabled by default in the release, but while searching for other bugs, I had activated it, but this causes the RPC service to hang as well.

I know that bitcoin-cli gettxoutsetinfo can take a while, but according to what is logged, the daemon takes only about 40 ms to complete the flush, then enters a tight loop that has not ended after more than 1 hour, and never replies to the bitcoin-cli client.

maflcko commented at 5:44 AM on August 24, 2022: member

Can you attach gdb when this deadlock happens and let us know where exactly the http thread is stuck?
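
One way to collect that information non-interactively is to have gdb attach, print a backtrace for every thread, and exit. The sketch below only builds the command line; `gdb_backtrace_command` is a hypothetical helper, the PID is a placeholder, and attaching to a running process may be refused on systems with ptrace restrictions.

```python
import shlex


def gdb_backtrace_command(pid):
    """Build a gdb invocation that dumps all thread stacks of `pid` and exits.

    -p attaches to a running process, -batch exits after the commands have run,
    and -ex runs the given gdb command ("thread apply all bt").
    """
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


if __name__ == "__main__":
    # 12345 is a placeholder; substitute the PID of the running bitcoind.
    print(shlex.join(gdb_backtrace_command(12345)))
```

In the output, the frames of the busy "httpworker" thread are the ones of interest.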

ZaidanK commented at 10:57 PM on August 24, 2022: none

I'm setting up an environment for testing this out right now. @verdy-p, Would you please make your bitcoin.conf available. Do you have the same problems on testnet as well?

Your configure command looks like configure -g -DDEBUG_LOCKCONTENTION, correct?

edit: Apologies, I got the following to work configure CXXFLAGS='-O0 -g -DDEBUG_CONTENTION'

verdy-p commented at 11:10 AM on August 25, 2022: none

I don't use testnet, just the regular chain, because I don't need to send any transaction on it or create blocks, and I'm not mining. The regular chain contains all interesting features, is highly active, and has many peers (using various software implementations). And bugs are occurring with this variety.

Note: I created a dedicated storage volume only for Bitcoin data (blocks, chainstate, indexes, plus the debug.log file), separated from the system files, home directory and /usr/local/ directories where the compiled software will be installed and running. This volume also contains some regular copies of the "chainstate" and "indexes" (4 indexes are enabled), just to avoid having to reindex everything after a crash, or to perform a new test, so I dimensioned it to 800 GiB in my Debian VM (currently about 450 GiB for blocks for a full node; adding the chainstate, indexes, and their backups, about 150 GB remains free on that volume).

The interesting thing is that you need to have more indexes enabled (txindex, coinstat, blockfilter). It seems that this bug mostly affects the coinstat index when the main chainstate is fully synced and no longer in IBD. But if you start bitcoind from scratch (no blocks or indexes) with the 4 indexes enabled, you'll see it hang extremely rapidly (in less than 10 minutes) if you run with at least 8 CPU cores in your VM (so that threads can effectively run concurrently with peer connections).

You can even hang it very rapidly by just using "bitcoin-cli -getinfo" repeatedly to run RPC requests (which will consume an http thread to consult the progress of the chainstate): it should not crash/hang the RPC service, but it does.

Note that initially I had also enabled "checkmempool=1" and "checkchainstate=1" (but now I know that the 2nd one has a problem and generates a deadlock, so I've disabled it, and there's still a problem).

On Windows, you don't need to configure anything: the default settings will make both the Qt version and the daemon version (running in a CMD console) hang the same way (as long as you have 8 CPU cores enabled). I could also get this hang with a lower number of CPUs. You can use the Task Manager to set the "CPU affinity" and reduce the number of cores you want for bitcoind; if running Windows in a VM, just configure the number of cores in your favorite hypervisor: it could be Hyper-V, Proxmox, or the internal hypervisors managed by AWS or Azure on their cloud. I did not test it with basic QEMU over Debian, or with Oracle VirtualBox. Proxmox, however, can only run on bare metal and won't let you run GUI apps easily without complex configuration, but it is fine as is for testing the daemon version, running in a "Windows Core" server installation. I think that on Windows such deadlocks are more likely, because Windows uses much lighter threads in its scheduler, whereas Linux uses "lightweight processes" (LWP) to support the POSIX thread library, with some additional containment; transitions between threads on Linux have a higher cost than on Windows, and mutexes on Windows are more "reactive" than on Linux.

Also I noted that bitcoind does not manage any priority between its threads, so threads are woken up in random order (on Windows as on Linux). And there's not even any code in bitcoind to insert randomized delays after releasing a mutex: the message handler uses mutexes too aggressively. It is awoken much too often, more than necessary, and if it acquires a lock on the main mutex, it may hold it most of the time, or will easily "steal" it from other threads, which will have to pause and will have difficulty being woken up. These waiting threads may still hold a lock on another mutex, creating the deadlock if other threads need it but then need to acquire the global mutex.

Note also that I run with "upnp=1" enabled, but as long as UPnP is not enabled on the network side, it should be safe because it should not acquire the lock on the list of peers (also used by the address manager to perform new outgoing connections, or when accepting lists of peer addresses from any peer). It is possible that the deadlock occurs when the address manager acquires the lock on the address list while it is still holding the global mutex on the chainstate in the same thread. This could be caused by the address manager being needed by the validator currently processing blocks but needing another outgoing connection, if existing peers are not replying to a request (e.g. for new incoming blocks).

So studying the lock contention (notably when there are different mutexes acquired in different orders) may reveal the deadlock problem.
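
To illustrate what "acquired in different orders" means, here is a minimal sketch of a lock-order checker: it records every ordered pair of locks held together and reports when the reverse pair shows up later. It only illustrates the general technique and does not mirror the DEBUG_LOCKORDER code in src/sync.cpp; `LockOrderChecker` and the lock names `lock_a` and `lock_b` are hypothetical.

```python
import threading


class LockOrderChecker:
    """Record the order in which named locks are taken and flag inversions."""

    def __init__(self):
        self._held = threading.local()   # per-thread stack of held lock names
        self._seen = set()               # (earlier, later) pairs observed so far
        self._guard = threading.Lock()   # protects _seen and inversions
        self.inversions = []             # (earlier, later) pairs seen in both orders

    def _stack(self):
        if not hasattr(self._held, "stack"):
            self._held.stack = []
        return self._held.stack

    def acquire(self, name):
        stack = self._stack()
        with self._guard:
            for earlier in stack:
                # The reverse order was seen before: potential deadlock.
                if (name, earlier) in self._seen:
                    self.inversions.append((earlier, name))
                self._seen.add((earlier, name))
        stack.append(name)

    def release(self, name):
        self._stack().remove(name)


if __name__ == "__main__":
    checker = LockOrderChecker()

    # First code path: lock_a, then lock_b.
    checker.acquire("lock_a")
    checker.acquire("lock_b")
    checker.release("lock_b")
    checker.release("lock_a")

    # Second code path: the opposite order.
    checker.acquire("lock_b")
    checker.acquire("lock_a")
    checker.release("lock_a")
    checker.release("lock_b")

    print(checker.inversions)  # [('lock_b', 'lock_a')]
```

An inversion does not prove that a deadlock happened, only that two code paths could block each other if they ran at the same time.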

Another thing I do not understand is why there is such a small static number (4) of "httpworker" threads for RPC. RPC is supposed to be used locally at low frequency (not from remotes), and workers could be created dynamically (and then deleted; we can just keep one running permanently in a thread pool, and just fix a large but sufficient maximum, even up to 256, maybe more, like on usual webservers, even if it exceeds the number of available CPU cores). Being able to tune the number of worker threads for each kind of thread would also allow interesting simulations for testing bitcoind (notably for lock contention) and for adapting bitcoind to more environments, with better scalability (e.g. not running on desktop or notebook PCs, but on large servers with many cores and a lot of memory, or running on small appliances and mobiles, though probably not as a "full node" as this requires too much local storage).

The RPC listener should also be running much earlier in the init sequence. For example, we should be able to use "bitcoin-cli logging '[]' '[]'", or the RPC request to suspend network activity, or to "stop" the daemon immediately (creating an "httpworker" if needed, even if the 4 existing ones are busy), even if some RPC requests (about the state of the chainstate, or the wallet as long as it is not loaded) are still not fully available; those should report their initialization status by immediately returning a suitable error.

Then adding an RPC to control thread pool sizes, or CPU affinity, could also allow performing a lot of simulations for "fuzz" tests (some tests will require a host with at least 8 cores to be able to run the simulation).

Finally, I suggest that the Bitcoin documentation allow any RPC request to send an "HTTP 100 Continue" status, which should be required if the execution of the RPC runs for more than 5 seconds. RPC clients should receive these notifications, also containing some informative status (as a JSON object) about what is happening and how these requests are progressing, for example the length of their work queue, such as the range (minimum/maximum) of heights to handle, the number of transactions to process, and some counters. At the same time, the workers should check whether they were instructed to bail out, or whether the RPC client has already disconnected or been interrupted and will no longer receive any reply. It's not reasonable for any RPC thread to take several minutes (e.g. "gettxoutsetinfo"), or maybe hours, to complete without anyone knowing anything about what it is doing. We should not need to run the daemon inside a debugger and send a SIGINT signal to suspend the whole process just to inspect the thread stacks with debugger commands, or to use some other "strace" tool on Linux, which does not work in every environment, since it can be restricted by basic kernel security.

sipa commented at 12:18 PM on August 25, 2022: member

I assume what's happening here is that bitcoind takes longer to compute the gettxoutsetinfo result than bitcoin-cli is willing to wait. You may want to increase the rpcclienttimeout option (the default is 15 minutes).
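
As a sketch of that suggestion: the helper below builds a bitcoin-cli command line with an explicit client timeout. The assumptions here are that -rpcclienttimeout is given in seconds and that 0 disables the timeout, as I understand the option's help text. `bitcoin_cli_command` is a hypothetical helper name.

```python
def bitcoin_cli_command(method, *params, client_timeout=900):
    """Build a bitcoin-cli argv with an explicit -rpcclienttimeout.

    client_timeout is in seconds; 900 is the 15-minute default mentioned above,
    and 0 is assumed to mean "wait indefinitely".
    """
    if client_timeout < 0:
        raise ValueError("client_timeout must be >= 0")
    return ["bitcoin-cli", f"-rpcclienttimeout={client_timeout}", method, *params]


if __name__ == "__main__":
    # Give gettxoutsetinfo up to one hour before the client gives up.
    print(" ".join(bitcoin_cli_command("gettxoutsetinfo", client_timeout=3600)))
```

Raising the client timeout only changes how long bitcoin-cli waits; it does not make the server-side computation any faster.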

verdy-p commented at 12:23 PM on August 25, 2022: none

I will also enable the "-DDEBUG_LOCKORDER" option (see "src/sync.h", "src/sync.cpp", also "test/sync_tests.cpp", "test/reverselock_tests.cpp") when compiling with "-g" and see what is happening when running in "gdb". The option "-DDEBUG_LOCKCONTENTION" does not seem sufficient.

ZaidanK commented at 4:35 PM on August 25, 2022: none

Hi Verdy,

So some of the options you have set are

chain=main
checkmempool=1
checkchainstate=1
txindex=1
coinstat=1
blockfilter=1

is that correct?

Making your bitcoin.conf available would be greatly appreciated.

To understand this problem better, I'm assuming gettxoutsetinfo is near-consistent across all nodes, with gettxoutsetinfo returning data on the txoutset and mempool (as an implicit txoutset)? Could anyone clarify?

maflcko commented at 7:10 AM on August 26, 2022: member

I'd say this is just a duplicate of #25897

verdy-p commented at 7:17 AM on August 26, 2022: none

Most of them are defaults, just commented for completeness.

I play with debug options, also now with compile options (-DDEBUG_LOCKCONTENTION, and recently -DDEBUG_LOCKORDER) (both used in Debian Bullseye or Windows 11)

Initially on my first installation on Windows, I did not configure anything and used the default with the released versions v22 and v23 (initially with Qt, but as it did not work, I used bitcoind from the commandline).

It is because I did not find any way to get it to run that I started to look at these options and at what their default settings were.

I had also tried to set the "alert_*" options, trying to spawn an external command to trace something else in an additional log file, but this was not the cause (those alerts were never launched, but I still had the same issues with Bitcoin Core hanging), so I have disabled them again. That's when I decided to run it on Linux, and also tried to run it on another machine (first on a NUC running Proxmox to execute a Debian Linux VM, then on AWS and Azure, where I still had the same hanging issues).

# UI options:
#
lang=fr_FR              # Set language, for example "fr_CA" (default: system locale)
resetguisettings=0      # Reset all settings changed in the GUI
min=0                   # Start minimized
splash=1                # Show splash screen on startup (default: 1)

# Chain selection options:
#
testnet=0               # Use the test chain

# General options:
#
#rootcertificates=-system-      # Set SSL root certificates for payment requests (default: -system-)
#choosedatadir=0                # Choose data directory on startup (default: 0)
datadir=/mnt/g/bitcoin          # Specify data directory
conf=bitcoin.conf               # Specify configuration file (default: bitcoin.conf)
dbcache=450                     # Set database cache size in MiB (4 to 16384, default: 300)
maxmempool=300                  # Keep the transaction memory pool below n MiB (default: 300)
maxsigcachesize=64              # MiB
dblogsize=4                     # Flush wallet database activity from memory to disk log every n megabytes (default: 100)
checkblocks=6                   # Number of blocks to check at startup (default: 6, 0=all)
checklevel=4                    # How thorough the block verification of -checkblocks is (0-4, default: 3)
checkblockindex=0               # Flag
reindex=0                       # Rebuild chain state and block index from the blk*.dat files on disk
reindex-chainstate=0            # Rebuild chain state from the currently indexed blocks
#prune=550                      # Reduce storage requirements by pruning (deleting) old blocks. This mode is incompatible with -txindex and -rescan. Warning: reverting this setting requires re-downloading the entire blockchain (default: 0=disable pruning blocks, 550=target size in MiB to use for block files).
txindex=0                       # Maintain a full transaction index, used by the getrawtransaction RPC call (default: 0)
blockfilterindex=1              # flag
coinstatsindex=1                # flag
peerblockfilters=1              # flag
peerbloomfilters=1              # Support filtering of blocks and transactions with bloom filters (default: 1)
checkmempool=1                  # Flag
mempoolexpiry=72                # Do not keep transactions in the mempool longer than n hours (default: 72)
persistmempool=1                # Flag
maxorphantx=100                 # Keep at most n unconnectable transactions in memory (default: 100)
par=0                           # Set the number of script verification threads (-4 to 16, 0=as many as cores, -4 to -1=leave that many cores free, default: 0)
# Execute command when the best block changes (%s in cmd is replaced by block hash)
#blocknotify="G:\Bitcoin\alert.cmd" Notif: %s
# Execute command when a relevant alert is received or we see a really long fork (%s in cmd is replaced by message)
#alertnotify="G:\Bitcoin\alert.cmd" Alert: %s

# RPC server options:
#
rpcthreads=4                    # Set the number of threads to service RPC calls (default: 4)
server=1                        # Accept command line and JSON-RPC commands
rest=0                          # Accept public REST requests (default: 0)
rpcserialversion=1              # Sets the serialization of raw transaction or block hex returned in non-verbose mode, non-segwit(0) or segwit(1) (default: 1)
rpccookiefile=rpc.cookie        # Location of the auth cookie (default: data dir)
rpcbind=127.0.0.1               # Bind to given address to listen for JSON-RPC connections. Use [host]:port notation for IPv6. This option can be specified multiple times (default: bind to all interfaces)
rpcport=8332                    # Listen for JSON-RPC connections on port (default: 8332 or testnet: 18332)
rpcallowip=127.0.0.1            # Allow JSON-RPC connections from specified source. Valid for ip are a single IP (e.g. 1.2.3.4), a network/netmask (e.g. 1.2.3.4/255.255.255.0) or a network/CIDR (e.g. 1.2.3.4/24). This option can be specified multiple times
#rpcuser=pve                    # Username for JSON-RPC connections
# Password for JSON-RPC connections
#rpcpassword=btc
# Username and hashed password for JSON-RPC connections. The field userpw comes in the format: USERNAME:SALT$HASH. A canonical python script is included in share/rpcuser. This option can be specified multiple times.
#rpcauth=pve:SALT$btc

# Block creation options:
#
#blockmaxweight=3000000         # Set maximum BIP141 block weight (default: 3000000)
#blockmaxsize=750000            # Set maximum block size in bytes (default: 750000)
#blockprioritysize=0            # Set maximum size of high-priority/low-fee transactions in bytes (default: 0)
#blockreconstructionextratxn=100        #

# Node relay options:
#
#bytespersigop=20       # Equivalent bytes per sigop in transactions for relay and mining (default: 20)
#datacarrier=1          # Relay and mine data carrier transactions (default: 1)
#datacarriersize=83     # Maximum size of data in data carrier transactions we relay and mine (default: 83)
#permitbaremultisig=1   # Relay non-P2SH multisig (default: 1)
whitelistrelay=1        # Accept relayed transactions received from whitelisted peers even when not relaying transactions (default: 1)
#whitelistforcerelay=1  # Force relay of transactions from whitelisted peers even if they violate local relay policy (default: 1)

# Connection options:
#
discover=1                      # Discover own IP addresses (default: 1 when listening and no -externalip or -proxy)
upnp=1                          # Use UPnP to map the listening port (default: 0)
natpmp=0                        # Use NAT-PMP to map the listening port (default: 0)
#externalip=ip                  # Specify your own public address
#bind=ip:port                   # Bind to the given address and always listen on it. Use [host]:port notation for IPv6
port=8333                       # Listen for connections on port (default: 8333 or testnet: 18333)
#proxy=ip:port                  # Connect through a SOCKS5 proxy
#proxyrandomize=1               # Randomize credentials for every proxy connection. This enables Tor stream isolation (default: 1)
listen=1                        # Accept incoming connections (default: 1 if no -proxy or -connect)
checkaddrman=0                  # Flag
dnsseed=1                       # Query for peer addresses via DNS lookup if low on addresses (default: 1 unless -connect)
forcednsseed=0                  # Always query for peer addresses via DNS lookup (default: 0)
dns=0                           # Allow DNS lookups for -addnode, -seednode and -connect (default: 1)
#onlynet=IPV4                   # Only connect to nodes in network net (IPv4, IPv6 or onion)
#seednode=1.2.3.4               # Connect to a node to retrieve peer addresses, then disconnect
#connect=104.248.143.83:8333    # Connect only to the specified node(s)
#addnode=1.2.3.4                # Add a node to connect to and attempt to keep the connection open
#asmap=ip_asn.map
listenonion=0                   # Automatically create a Tor hidden service (default: 1)
#onion=ip:port                  # Use a separate SOCKS5 proxy to reach peers via Tor hidden services (default: -proxy)
#torcontrol=127.0.0.1:9051      # Tor control port to use if onion listening is enabled (default: 127.0.0.1:9051)
#torpassword=                   # Tor control port password (default: empty)
maxconnections=125              # Maintain at most n connections to peers (default: 125)
#maxuploadtarget=               # Try to keep outbound traffic under the given target (in MiB per 24 h), 0 = no limit (default: 0)
#maxreceivebuffer=5000          # Maximum per-connection receive buffer, n*1000 bytes (default: 5000)
#maxsendbuffer=1000             # Maximum per-connection send buffer, n*1000 bytes (default: 1000)
#maxtimeadjustment=4200         # Maximum allowed median peer time offset adjustment. The local perspective of time may be influenced by peers, forward or backward, by this amount. (default: 4200 seconds)
#timeout=5000                   # Specify the connection timeout in milliseconds (minimum: 1, default: 5000)
#whitebind=addr                 # Bind to the given address and whitelist peers connecting to it. Use [host]:port notation for IPv6
whitelist=127.0.0.1/32          # Whitelist peers connecting from the given IP address (e.g. 1.2.3.4) or CIDR notated network (e.g. 1.2.3.0/24). Can be specified multiple times. Whitelisted peers cannot be DoS banned and their transactions are always relayed, even if they are already in the mempool, useful e.g. for a gateway
whitelist=[::1]/128             # Whitelist peers connecting from the given IP address (e.g. 1.2.3.4) or CIDR notated network (e.g. 1.2.3.0/24). Can be specified multiple times. Whitelisted peers cannot be DoS banned and their transactions are always relayed, even if they are already in the mempool, useful e.g. for a gateway
bantime=3600                    # Number of seconds to keep misbehaving peers from reconnecting (default: 86400, i.e. 24 hours)

# Wallet options:
#
#disablewallet=1                # Do not load the wallet and disable wallet RPC calls
#wallet=G:\Bitcoin\Perso        # Specify the directory of the wallet files (wallet.dat and wallet.log) (within the data directory) (default: none)
flushwallet=1                   # flag
#zapwallettxes=0                # Delete all wallet transactions and only recover those parts of the blockchain through -rescan on startup (1 = keep tx metadata, e.g. account owner and payment request information, 2 = drop tx metadata)
#walletbroadcast=1              # Make the wallet broadcast transactions (default: 1)
#spendzeroconfchange=1          # Spend unconfirmed change when sending transactions (default: 1)
#keypool=100                    # Set the key pool size to n (default: 100)
#fallbackfee=0.0002             # A fee rate (in BTC/kB) that will be used when fee estimation has insufficient data (default: 0.0002)
#mintxfee=0.00001               # Fees (in BTC/kB) smaller than this are considered zero fee for transaction creation (default: 0.00001)
#paytxfee=0.00001               # Fee (in BTC/kB) to add to transactions you send (default: 0.00001)
#txconfirmtarget=2              # If paytxfee is not set, include enough fee so transactions begin confirmation on average within n blocks (default: 2)
#walletnotify=cmd               # Execute command when a wallet transaction changes (%s in cmd is replaced by TxID)

# Debugging/testing options:
#
#uacomment=cmt                  # Append a comment to the user agent string
logips=1                        # Include IP addresses in debug output (default: 0)
logtimestamps=1                 # Prepend debug output with a timestamp (default: 1)
logtimemicros=1                 # Add microsecond precision to debug timestamps (default: 0)
logsourcelocations=1            # Prepend debug output with name of the originating source location (source file, line number and function name) (default: 0)
logthreadnames=1                # Prepend debug output with name of the originating thread (only available on platforms supporting thread_local) (default: 1)
#minrelaytxfee=0.00001          # Fees (in BTC/kB) smaller than this are considered zero fee for relaying, mining and transaction creation (default: 0.00001)
#maxtxfee=0.10                  # Maximum total fees (in BTC) to use in a single wallet transaction or raw transaction; setting this too low may abort large transactions (default: 0.10)
printtoconsole=0                # Send trace/debug info to the console instead of the debug.log file
shrinkdebugfile=0               # Shrink the debug.log file on client startup (default: 1 when no -debug)

#debug=0
#debug=category
# Output debugging information (default: 0, supplying category is optional).
# If category is not supplied or if category=1, output all debugging information.
# category can be:
#   rand, util, libevent, ipc, rpc, qt, bench,
#   addrman, net, i2p, http, proxy, tor,
#   leveldb, blockstorage, prune, reindex, mempool, mempoolrej,
#   validation, cmpctblock, coindb, selectcoins, zmq, walletdb.
#
#debug=rand
#debug=util
#debug=libevent
debug=ipc
debug=rpc
##debug=qt
#debug=bench
debug=lock
#
#debug=addrman
debug=net
##debug=i2p
debug=http
##debug=proxy
##debug=tor
#
#debug=leveldb
#debug=blockstorage
#debug=prune
#debug=reindex
debug=mempool
debug=mempoolrej
#
debug=validation
#debug=estimatefee
debug=cmpctblock
debug=coindb
#debug=selectcoins
#debug=zmq
#debug=walletdb
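
To make it easier to see which options in a file like the one above are actually active, here is a minimal Python sketch that lists the uncommented settings, including repeated ones such as debug=. It is only an approximation for reading such a file: `parse_conf` and `SAMPLE` are hypothetical names, and this is not the parser Bitcoin Core itself uses.

```python
from collections import defaultdict

def parse_conf(text):
    """Parse bitcoin.conf-style text into {key: [values, ...]} (simplified)."""
    options = defaultdict(list)
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop everything after '#'
        if not line or "=" not in line:
            continue  # blank line, pure comment, or not a key=value pair
        key, value = line.split("=", 1)
        options[key.strip()].append(value.strip())
    return dict(options)

# Hypothetical excerpt in the same style as the configuration above.
SAMPLE = """\
#debug=leveldb
debug=rpc
debug=http        # HTTP server
debug=coindb
logthreadnames=1  # thread names in the log
"""

conf = parse_conf(SAMPLE)
print(conf["debug"])           # enabled debug categories, in file order
print(conf["logthreadnames"])
```

Options that may appear several times are kept as lists, so every enabled debug category is preserved in order.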

verdy-p commented at 7:27 AM on August 26, 2022: none

Note that this RPC hanging issue is separate from #25897. It appears if you run "gettxoutsetinfo" too often: each call then hangs for hours and locks one of the 4 "httpworkers", until everything is blocked and even "bitcoin-cli stop" can no longer be used.
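
The worker exhaustion described here can be illustrated with a small Python simulation. This is only a sketch of the general mechanism (a fixed pool of 4 workers, each request that never returns occupying one of them), not Bitcoin Core's actual HTTP server code. The names `stuck_handler` and `quick_handler` are hypothetical, and the stuck handlers here block on an event rather than spinning on the CPU.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, TimeoutError

release = threading.Event()  # only used so that the demo can terminate

def stuck_handler():
    """Stands in for a request that never completes."""
    release.wait()
    return "finally done"

def quick_handler():
    """Stands in for a cheap request such as 'stop'."""
    return "ok"

pool = ThreadPoolExecutor(max_workers=4)                 # 4 workers, as in the report
stuck = [pool.submit(stuck_handler) for _ in range(4)]   # every worker is now busy
quick = pool.submit(quick_handler)                       # queued, no free worker

try:
    quick.result(timeout=0.5)
    starved = False
except TimeoutError:
    starved = True  # the cheap request cannot run while all workers are stuck

release.set()                        # unblock the stuck handlers
recovered = quick.result(timeout=5)  # the queued request now gets a worker
pool.shutdown()
print(starved, recovered)
```

With fewer than 4 stuck requests, the quick request would still be served, which matches the observation that other RPC calls keep working until the last worker is taken.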

So there are multiple issues, not caused by the same thing, or not with the same effect. In #25897, you don't need to use any bitcoin-cli RPC request to get the hang very rapidly, after just a few minutes (and extremely rapidly with ALL default settings on both Windows and Linux). It is simply impossible to complete the IBD without having to "-reindex" or "-reindex-chainstate" each time: indexes are not correctly flushed and get corrupted, chaintips are "disconnected", and to make any progress you need a complete backup of the "chainstate", created when you could still shut down bitcoin successfully.

To accelerate things, I also used a dedicated storage volume for all bitcoin data (about 800 GiB is sufficient for a full node, with all the blocks, indexes, and backups of indexes), formatted with LARGE cluster sizes (64 KiB initially, then 256 KiB) to make sure this was not caused by filesystem overhead and to significantly reduce fragmentation: after all, almost all files in the bitcoin data are large, over 1 MB. And I have eliminated all possible causes of I/O hardware issues by using a RAID5 volume over 6 physical disks (I also looked at the OS reports about them and ran filesystem checks, and nothing was detected in the system logs). Of course, none of this should be needed on a regular default installation. For daily use, though, I would still recommend a dedicated volume (not necessarily RAID5 on many disks); you should not need extra space for backups of indexes (most of the space is for the ~450 GiB of blocks, plus about 10 GiB for the chainstate index and some 20% free margin to avoid most filesystem overhead). I also recommend large cluster sizes, as this makes the PC much more responsive even when the volume is created on an SSD, and significantly reduces fragmentation and disk activity if you use a hard disk or an external USB device.
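
As a rough worked example of the sizing above, the following Python sketch adds the figures quoted in this comment. It assumes that the "20% free margin" means 20% on top of the data actually stored, which is one possible reading, and `volume_size_gib` is a hypothetical helper name.

```python
def volume_size_gib(blocks_gib, chainstate_gib, margin_percent):
    """Data size plus a free-space margin, in GiB (margin taken on top of the data)."""
    used = blocks_gib + chainstate_gib
    return used * (100 + margin_percent) / 100

# Figures quoted in the comment: ~450 GiB of blocks, ~10 GiB of chainstate, 20% margin.
minimum = volume_size_gib(450, 10, 20)
print(minimum)  # 552.0
```

Under that assumption, a volume without index backups would need roughly 552 GiB, so the 800 GiB volume used here leaves a bit under 250 GiB for the optional indexes and their backups.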

By contrast, I keep the default cluster size (typically 4 KiB, unless you use very large disks) on the volume that holds the OS and all other "home" files. Various people use larger cluster sizes on dedicated volumes that store their videos or some of their games, to get more performance for their large files (it does not necessarily waste space), or that store large databases, always to minimize filesystem overhead.

On Linux, I just chose the default "ext4" filesystem (it could have been "ext2", without journaling but less safe, or "zfs", or another). On Windows I chose the default "NTFS" filesystem. It could have been "ReFS", with additional checks and costs, or possibly "FAT32", with much less stability after crashes. "exFAT" is not needed at all: the 800 GiB volume is not so large that it requires it, and all files are well below 4 GiB each. Anyway, "NTFS" and "ext4" are fine for all disk sizes and all file sizes, and their journaling makes life safer after almost all crashes. I did NOT use "BTRFS" on Linux, which is known to have severe issues after crashes: it is almost impossible to repair, forcing you to reformat everything and lose all existing data. I have no opinion about "ZFS", which some Linux distributions offer, but I don't understand why some Linux distributions still support "BTRFS" and sometimes offer it by default.

On Linux, I also did not use "LVM" on the guest OS or on a native OS. I use "LVM" only on a Linux-based host OS, for installing a hypervisor needed to run a VM; likewise, on Windows, I use "StorageSpaces" for the storage of VMs used by a Windows-based hypervisor. These are quite common practices among hypervisor administrators, because they offer flexibility for managing, creating, deleting, moving and resizing the storage needed by multiple VMs. Some Linux distributions use "LVM" by default (even in their desktop environment): Bitcoin should work independently of such partitioning and filesystem schemes.

amovfx commented at 3:42 PM on August 28, 2022: none

Thanks for the thorough write up. Will check this out asap. Might be beyond my ability.

amovfx commented at 1:51 AM on September 9, 2022: none

I haven't forgotten about this. @verdy-p have you made any progress?

verdy-p commented at 12:50 PM on September 12, 2022: none

Now, with a completely downloaded database (fully indexed with all 4 indexes), there is no longer an issue. But all my attempts to start from zero data still fail rapidly. I have not been able to isolate the deadlock; it still occurs early during the IBD phase, on both Windows and Linux, with the version compiled as is from the existing Git sources, even if I add no settings beyond the defaults or the additional indexes. In any case, NONE of the existing released versions (signed binaries from the website) work: they show the same deadlock (or fatal data corruption in the chainstate), and it occurs really fast. This is just slightly less frequent if we compile it ourselves from the Git sources: there, at least, we avoid the corruption most of the time and can preserve the chainstate by just restarting with check level=4, without having to reindex everything. With the release binaries, even that solution does not work and it is impossible to complete the IBD safely: we constantly need to restart from scratch unless we have kept regular snapshots by stopping the daemon or the graphical version about every 5-10 minutes, because they always crash after about 15-20 minutes, sometimes sooner. And this is systematic on any system with 3 cores (or more) or 6 CPU threads (or more). If you run the daemon or the graphical version on a single vcore (using the process manager's CPU affinity on Windows, or the equivalent on Linux, or by running Linux in a VM with a single vcore), it does not crash, but it is extremely slow.
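
For the single-core workaround on Linux, one way to set the CPU affinity programmatically is Python's `os.sched_setaffinity`, sketched below. This is only an illustration: `pin_to_single_cpu` is a hypothetical helper, and the call is not available on Windows or macOS, where the sketch simply returns None.

```python
import os

def pin_to_single_cpu(pid=0):
    """Restrict process `pid` (0 = the current process) to a single CPU.

    Returns the CPU number chosen, or None where the call is unsupported.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # e.g. Windows or macOS
    cpu = min(os.sched_getaffinity(pid))  # pick one CPU we are already allowed to use
    os.sched_setaffinity(pid, {cpu})
    return cpu

cpu = pin_to_single_cpu()
print("pinned to CPU:", cpu)
```

On Linux, child processes started afterwards (for example with `subprocess`) inherit the affinity mask, so a daemon launched from such a script would also be confined to that one CPU.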

maflcko commented at 12:57 PM on September 12, 2022: member

It could also make sense to try to reproduce this in asan to check for some cases of undefined behavior.

Windows: https://docs.microsoft.com/en-us/cpp/sanitizers/asan
Linux: https://clang.llvm.org/docs/AddressSanitizer.html

verdy-p commented at 1:04 PM on September 12, 2022: none

My opinion is that there's some "Stack use after return" remaining, and that in a multithreaded environment with many concurrent connections opening and closing, there is likely an issue affecting one of the worker threads and causing data corruption in the indexes (plus the deadlock already detected in an unsafe data checker inside the Bitcoin code). I wonder if this will be solved in the "multiprocess" version that you're starting to implement (separate processes for separate services, communicating via IPC calls with the documented RPC protocol interface).

maflcko commented at 1:07 PM on September 12, 2022: member

My opinion is that there's some "Stack use after return"

it would be good to have the runtime traceback of valgrind or asan, otherwise there is nothing we can do

verdy-p commented at 1:14 PM on September 12, 2022: none

I don't know how to set this up (I have trouble finding a configuration that works with the existing config scripts).

maflcko commented at 1:27 PM on September 12, 2022: member

On Linux you simply pass --with-sanitizers=address to ./configure (you might also have to set CC=clang CXX=clang++). On Windows, I don't know.

maflcko commented at 8:27 PM on February 23, 2023: member

Is this still an issue with a recent version of Bitcoin Core? If yes, what are the steps to reproduce?

willcl-ark commented at 3:18 PM on April 10, 2024: member

The problem does not seem to be easily reproducible.

Please open a new issue (or leave a comment in here if you want this re-opened) if you experience the problem again.

willcl-ark closed this on Apr 10, 2024

bitcoin locked this on Apr 10, 2025