Make mininode_lock non-reentrant #19178
pull jnewbery wants to merge 4 commits into bitcoin:master from jnewbery:2020-05-mininode-lock-reentrancy changing 2 files +5 −8-
jnewbery commented at 3:03 pm on June 5, 2020: memberThere’s no need for mininode_lock to be reentrant. Use a simpler non-recursive lock.
-
[tests] Only acquire lock once in p2p_compactblocks.py edae6075aa
-
[tests] Don't acquire mininode_lock twice in wait_for_broadcast() 9d80762fa0
-
[tests] Don't call super twice in P2PTxInvStore.on_inv() c67c1f2c03
-
[tests] Make mininode_lock non-reentrant
There's no need for mininode_lock to be reentrant. Use a simpler non-recursive lock.
-
DrahtBot added the label Tests on Jun 5, 2020
-
in test/functional/test_framework/mininode.py:495 in 62068381a3
491@@ -492,7 +492,7 @@ def test_function(): 492 # P2PConnection acquires this lock whenever delivering a message to a P2PInterface. 493 # This lock should be acquired in the thread running the test logic to synchronize 494 # access to any data shared with the P2PInterface or P2PConnection. 495-mininode_lock = threading.RLock() 496+mininode_lock = threading.Lock()
glozow commented at 6:48 pm on June 11, 2020:Seems likeLock
will try to acquire indefinitely by default (I’m looking at these docs). I tried out some re-acquiring behavior with this code and it hangs forever as expected… Just in case someone like me does some dumb deadlocky stuff, would it be a good idea to makemininode_lock
use something likelock.acquire(timeout=60)
and/orif lock.locked(): raise_assertion_error
?
glozow commented at 7:31 pm on June 11, 2020:Actually I have a question because I’m not 100% sure how the test_runner works - does it allot a certain amount of time for each test? It looks like it would run infinitely? How would it affect Travis if 1 test is in deadlock?
MarcoFalke commented at 8:27 pm on June 11, 2020:Travis is running our ci system, which calls the test_runner helper:
0 DOCKER_EXEC LD_LIBRARY_PATH=$DEPENDS_DIR/$HOST/lib ${TEST_RUNNER_ENV} test/functional/test_runner.py --ci $MAKEJOBS --tmpdirprefix "${BASE_SCRATCH_DIR}/test_runner/" --ansi --combinedlogslen=4000 ${TEST_RUNNER_EXTRA} --quiet --failfast
If a test takes a long time, it will show up in the log as
Remaining jobs: [foo_bar.py]
.If a test never finished, the virtual machine that the ci system runs on will eventually reboot and mark the run as failed.
jnewbery commented at 10:04 pm on June 11, 2020:would it be a good idea to make mininode_lock use something likelock.acquire(timeout=60) and/or if lock.locked(): raise_assertion_error
We use the
with <lock>:
syntax everywhere, which doesn’t allow ablocking
ortimeout
argument. I don’t think there’s a way to set a default time on a lock.I think if you did accidentally introduce a deadlock, you’d figure it out pretty easily. If your script was hanging, I think the best way would be to implement some way to attach pdb to the running process (eg https://stackoverflow.com/a/25329467/933705 or https://stackoverflow.com/a/147114/933705) and then inspect the stack trace.
(in fact, I think this would be generally useful, so perhaps someone should implement it :slightly_smiling_face: )
glozow commented at 11:57 pm on June 11, 2020:Ok this makes sense, thanks! I was worried I could accidentally stall Travis hahaha. Having some debugging help sounds useful, let me go look for a someone…in test/functional/test_framework/mininode.py:661 in 62068381a3
657@@ -658,8 +658,6 @@ def on_inv(self, message): 658 # save txid 659 self.tx_invs_received[i.hash] += 1 660 661- super().on_inv(message)
glozow commented at 6:50 pm on June 11, 2020:Good catch, I think this was from Amiti + me both adding it in separate PRs.glozow commented at 7:34 pm on June 11, 2020: memberConcept ACK, I just learned a lot about locks haha. I have a comment about timeouts because opening up the potential for deadlock scares me 🥶jnewbery commented at 10:06 pm on June 11, 2020: memberI wonder if we should actually change the Lock to be a Conditional Variable: https://docs.python.org/3/library/threading.html#condition-objects. A lot of what we do with the lock is polling for changes on objects, so having a notify() function that wakes the blocked thread could be much nicer.MarcoFalke commented at 10:31 pm on June 11, 2020: memberConcept ACK. Seems like a good first step for future cleanups. (e.g. making the lock a private member and then replacing the polling loops with cvs)in test/functional/test_framework/mininode.py:662 in 62068381a3
657@@ -658,8 +658,6 @@ def on_inv(self, message): 658 # save txid 659 self.tx_invs_received[i.hash] += 1 660 661- super().on_inv(message) 662- 663 def get_invs(self): 664 with mininode_lock:
gillichu commented at 6:01 pm on June 15, 2020:I’m a bit confused on whyget_invs
needs to grab the lock, buton_inv
doesn’t? I understand that in the case ofwait_for_broadcast
,self.wait_until
already acquires the lock, but I just wanted to confirm thaton_inv
already expects to be holding the lock on call.
jnewbery commented at 6:46 pm on June 15, 2020:Yes. All of theon_<msgtype>()
already hold the lock, since they’re called by theon_message()
method (specifically, thegetattr(self, 'on_' + msgtype)(message)
line)gillichu commented at 6:07 pm on June 15, 2020: noneConcept ACK, I learned more about how locks are built into the Bitcoin testing framework too. I’m not familiar with whether polling is a resource consuming activity/busy waiting in the current implementation, but in general use cases I know CV’s are a good replacement.MarcoFalke commented at 11:37 pm on June 15, 2020: memberShower thought: The lock could even be removed completely without a cv replacement if all code that previously required the lock was executed in the network thread. Not sure if this makes sense at all, so consider it a “shower thought”.jonatack commented at 9:29 am on June 16, 2020: memberACK 62068381a3b9c0MarcoFalke commented at 9:45 am on June 16, 2020: memberACK 62068381a3b9c065d81300be79abba7aecfdb41b 😃
Signature:
0-----BEGIN PGP SIGNED MESSAGE----- 1Hash: SHA512 2 3ACK 62068381a3b9c065d81300be79abba7aecfdb41b 😃 4-----BEGIN PGP SIGNATURE----- 5 6iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p 7pUg1zgv/Yt3dwdeZi7IRInaHKprMrQbGGBB/MvgXliI7NbdZZmVGTcONWCPMXWDA 8NRPTCD2oP4scNp2uRwFYr3WGXo8XrHQImByajBJWrgMxwiPkzHGYN0VHh9WUJiQt 94XQyQhtr6nNg3T9dPvTyM0aUblItMkh17dcnQoynH7FGSh7njbhD9m+Nr0yJFMaX 10pMeiIyMTbOIao2VNX1htUF5XN0twFjHNvFKHlHTyJxSyK/85vd2UPYo1eCzgBQwa 11Pq9Ax5/MUtHnD5lucuthH4IHC7HfGxoJ7FpQGTZWPTDH7LA9HPn5z1D+JUXf7b3t 12zOCvVNm1DSvBG47EcPEHDIBLvhT4AhNJOd+APWN4NTV7y2AiGZwufipvWbPuBjpS 13J2YTw57Y7CIbos3O8fGDuA0bNAR1mYNwBrOvyqXtxQaS6KV0WKecCtRbFrxoAxu5 14565+kQoGp4sjp3KAEalyPVu+td9orvKCejNRI5MloMcG55EKQ+3zLtQgGPoVBVx/ 155vm8xtfy 16=YJKb 17-----END PGP SIGNATURE-----
Timestamp of file with hash
236b7f082f677ae84f62c6b2ec461c0d23883d147cfecbaf72127c0e22631b33 -
MarcoFalke merged this on Jun 16, 2020MarcoFalke closed this on Jun 16, 2020
jnewbery deleted the branch on Jun 16, 2020sidhujag referenced this in commit 9be866b47d on Jul 7, 2020deadalnix referenced this in commit 2b1e2420f9 on Jan 22, 2021Fabcien referenced this in commit e7039654cd on Apr 23, 2021DrahtBot locked this on Feb 15, 2022
github-metadata-mirror
This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-23 21:12 UTC
This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me