Potential deadlock on {cs_main, pnode->cs_vSend} #4493

issue ashleyholman opened this issue on July 9, 2014
  1. ashleyholman commented at 11:02 AM on July 9, 2014: contributor

    2014-07-09 10:10:33 POTENTIAL DEADLOCK DETECTED 2014-07-09 10:10:33 Previous lock order was: 2014-07-09 10:10:33 pnode->cs_vRecvMsg net.cpp:1541 2014-07-09 10:10:33 (1) cs_main main.cpp:3887 2014-07-09 10:10:33 (2) cs_vSend net.h:480 2014-07-09 10:10:33 Current lock order is: 2014-07-09 10:10:33 (2) pnode->cs_vSend net.cpp:1560 2014-07-09 10:10:33 (1) cs_main main.cpp:4387 2014-07-09 10:10:33 (1) cs_main main.cpp:1320

    The problem seems to be:

    • Many paths of execution will grab cs_main and then cs_vSend, eg. ProcessMessage("tx") locks cs_main and then can call pfrom->PushMessage if it rejects the tx. PushMessage() obtains pnode->cs_vSend. Another example: CheckBlock() will lock cs_main and then can call PushGetBlocks() which does a call to PushMessage() which locks pnode->cs_vSend.
    • As far as I can tell, there is just a single execution path that obtains these locks in the opposite order: ThreadMessageHandler() locks pnode->cs_vSend and then calls SendMessages() which locks cs_main in order to call IsInitialBlockDownload().

    The least painful fix seems to be to change the SendMessages() code to not need cs_main. Otherwise, ThreadMessageHandler() would need to lock cs_main prior to its SendMessages calls, which might be expensive use of the lock. Any ideas on making the IsInitialBlockDownload() check available to SendMessages() without it having to lock cs_main every time? Maybe caching it is possible?

  2. ashleyholman renamed this:
    Potential deadlock on {cs_main, cs_vSend}
    Potential deadlock on {cs_main, pnode->cs_vSend}
    on Jul 9, 2014
  3. ashleyholman commented at 11:18 AM on July 9, 2014: contributor

    Oh, I've just realised that SendMessages() does a TRY_LOCK() on cs_main first, and if it can't get the lock it returns early before calling IsInitialBlockDownload(). So does that mean its not a deadlock risk in actuality? Although the LOCK() calls are in the opposite order, there may be no chance of deadlock?

  4. laanwj commented at 1:12 PM on July 9, 2014: member

    Well as long as there is one place where both locks are acquired (in any order) that's not a TRY_LOCK, there could be a deadlock there. If they are all TRY_LOCKS then indeed a deadlock cannot happen, but maybe livelock or thread starvation could (if both threads refuse to give up the other lock after getting the other one fails).

    As for IsInitialBlockDownload: one idea would be to change it to return a global flag that is cleared only after the initial block download (+ check at initialization time). This would be more efficient than executing the checks every time as well.

  5. laanwj added the label Bug on Jul 31, 2014
  6. sipa commented at 5:56 PM on January 23, 2017: member

    I'm going to close this, assuming that it was fixed by #7846 or other locking changes in recent time. Feel free to reopen if other problems are found.

  7. sipa closed this on Jan 23, 2017

  8. mcelrath commented at 6:17 PM on January 30, 2017: none

    Seen today on 0.13.2:

    2017-01-30 18:03:10 receive version message: /Satoshi:0.13.1/: version 70014, blocks=1087141, us=38.121.165.30:32483, peer=8
    2017-01-30 18:03:10 AdvertiseLocal: advertising address 38.121.165.30:18333
    2017-01-30 18:03:10 connect() to [2001:0:9d38:90d7:2cfe:1245:ac28:731a]:18333 failed: Network is unreachable (101)
    2017-01-30 18:03:37 POTENTIAL DEADLOCK DETECTED
    2017-01-30 18:03:37 Previous lock order was:
    2017-01-30 18:03:37  pnode->cs_vRecvMsg  net.cpp:1877 (TRY)
    2017-01-30 18:03:37  (1) cs_main  main.cpp:5911
    2017-01-30 18:03:37  (2) cs_vSend  net.cpp:2585
    2017-01-30 18:03:37 Current lock order is:
    2017-01-30 18:03:37  (2) pnode->cs_vSend  net.cpp:1896 (TRY)
    2017-01-30 18:03:37  (1) cs_main  main.cpp:6483 (TRY)
    2017-01-30 18:03:37  (1) cs_main  main.cpp:1745
    2017-01-30 18:03:37 UpdateTip: new best=000000000000093a78ec61c44009ff74a44de94fd6e6b0fff2422804d180c082 height=1087142 version=0x20000000 log2_work=68.931904 tx=13040484 date='2017-01-30 18:03:19' progress=1.000000 cache=2.1MiB(9691tx) warning='1 of last 100 blocks have unexpected version'
    2017-01-30 18:03:51 UpdateTip: new best=00000000000005f96e01b15ac4cfb95859e3eecf7f4885274e1f24b3d7e3fde9 height=1087143 version=0x20000000 log2_work=68.931923 tx=13040485 date='2017-01-30 18:03:37' progress=1.000000 cache=2.1MiB(9692tx) warning='1 of last 100 blocks have unexpected version'
    2017-01-30 18:04:11 UpdateTip: new best=00000000000008bf5f8409d19de20f248f5b61e391071afd69bc9119efa3c0a4 height=1087144 version=0x20000000 log2_work=68.931941 tx=13040487 date='2017-01-30 18:03:51' progress=1.000000 cache=2.1MiB(9694tx) warning='1 of last 100 blocks have unexpected version'
    2017-01-30 18:04:35 connect() to [2001:0:9d38:6ab8:248d:2e3a:983b:b04a]:18333 failed: Network is unreachable (101)
    2017-01-30 18:05:58 POTENTIAL DEADLOCK DETECTED
    2017-01-30 18:06:00 Previous lock order was:
    2017-01-30 18:06:05  pnode->cs_vRecvMsg  net.cpp:1877 (TRY)
    2017-01-30 18:06:07  (1) cs_main  main.cpp:5911
    2017-01-30 18:06:13  (2) cs_vSend  net.cpp:2585
    2017-01-30 18:06:16 Current lock order is:
    2017-01-30 18:06:28  (2) pnode->cs_vSend  net.cpp:1896 (TRY)
    2017-01-30 18:06:31  (1) cs_main  main.cpp:6483 (TRY)
    2017-01-30 18:06:32  (1) cs_main  main.cpp:1745
    2017-01-30 18:08:58 UpdateTip: new best=000000000000095216f28d2cd715abee070cd83079411a93463ec61fa2a782eb height=1087145 version=0x20000000 log2_work=68.931959 tx=13040491 date='2017-01-30 18:05:29' progress=1.000000 cache=2.1MiB(9701tx) warning='1 of last 100 blocks have unexpected version'
    2017-01-30 18:08:58 POTENTIAL DEADLOCK DETECTED
    2017-01-30 18:08:58 Previous lock order was:
    2017-01-30 18:08:58  pnode->cs_vRecvMsg  net.cpp:1877 (TRY)
    2017-01-30 18:08:58  (1) cs_main  main.cpp:5911
    2017-01-30 18:08:58  (2) cs_vSend  net.cpp:2585
    2017-01-30 18:08:58 Current lock order is:
    2017-01-30 18:08:58  (2) pnode->cs_vSend  net.cpp:1896 (TRY)
    2017-01-30 18:08:58  (1) cs_main  main.cpp:6483 (TRY)
    2017-01-30 18:08:58  (1) cs_main  main.cpp:1745
    2017-01-30 18:08:58 UpdateTip: new best=000000000000024c5841a34b630828a27114c895ac7de8f1cb3376ec7d7a35a1 height=1087146 version=0x20000000 log2_work=68.931977 tx=13040493 date='2017-01-30 18:08:04' progress=1.000000 cache=2.1MiB(9703tx) warning='1 of last 100 blocks have unexpected version'
    

    Bitcoin did seem to sync successfully on testnet, and I'm not entirely sure if these message indicated a real problem?

  9. TheBlueMatt commented at 6:43 PM on January 30, 2017: member

    Yes, this is a spurious warning on many recent versions of Bitcoin Core, however, it is believed to have been "fixed" in master/will be fixed in 0.14.

  10. MarcoFalke locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-13 21:15 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me