I'm hoping to get some feedback on this. @gmaxwell mentioned that there is a 100ms sleep inside ThreadMessageHandler() which can delay the receiving and sending of network messages. Replacing that sleep with a semaphore should hopefully improve block propagation times.
I haven't experienced any connections stalling with this patch, so it seems to work, and the ThreadMessageHandler() loop executed about 10 times per second on average when I tested with 8 peers.
There is still a lot of overhead, since each pass of the loop scans all nodes for messages to send or receive, and I believe a pass will run every time post() is called on the semaphore. However, I haven't noticed any additional CPU load with this patch.
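To make the idea concrete, here is a minimal standalone sketch of a handler loop that blocks on a semaphore instead of sleeping for a fixed interval. It is not the code from this patch: the `Semaphore` class, the `RunHandlerDemo()` function, the integer "messages" and the negative shutdown sentinel are all hypothetical names made up for illustration, and the real ThreadMessageHandler() scans every node on each pass rather than popping a single queue.

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>

// Minimal counting semaphore built on a mutex and condition variable.
// Hypothetical: illustrative only, not the class used by the patch.
class Semaphore {
    std::mutex m_;
    std::condition_variable cv_;
    int count_ = 0;

public:
    // Signal that one unit of work is available.
    void post() {
        {
            std::lock_guard<std::mutex> lock(m_);
            ++count_;
        }
        cv_.notify_one();
    }

    // Block until at least one post() is pending, then consume it.
    void wait() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }
};

// Hypothetical stand-in for the message handler thread. Queues n_messages
// integers, posts once per message, and returns how many the handler
// processed. A negative value is used as a shutdown sentinel.
int RunHandlerDemo(int n_messages) {
    Semaphore sem;
    std::mutex queue_mutex;
    std::deque<int> queue;
    int handled = 0;

    std::thread handler([&] {
        for (;;) {
            sem.wait();  // wakes on post() instead of after a fixed 100ms sleep
            int msg;
            {
                std::lock_guard<std::mutex> lock(queue_mutex);
                msg = queue.front();
                queue.pop_front();
            }
            if (msg < 0) break;  // shutdown sentinel
            ++handled;           // one loop pass per post()
        }
    });

    // Producer side: enqueue each message, then the sentinel, posting each time.
    for (int i = 0; i <= n_messages; ++i) {
        {
            std::lock_guard<std::mutex> lock(queue_mutex);
            queue.push_back(i < n_messages ? i : -1);
        }
        sem.post();
    }

    handler.join();
    return handled;
}
```

In this sketch the loop body runs exactly once per post(), which is the same behaviour I described above as the source of the remaining overhead: if every pass is a full scan of all nodes, a burst of posts means a burst of scans.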
I have yet to test how this affects block propagation. Does anyone have a testing environment where you can roll out patches and measure their effect on block latency?