ci: no-longer exclude feature_block in TSAN job #20543

pull fanquake wants to merge 1 commits into bitcoin:master from fanquake:dont_exclude_feature_block_cirrus changing 2 files +2 −3
  1. fanquake commented at 4:15 AM on December 2, 2020: member

    The TSAN job is now running on Cirrus. Increase the allocated memory to the maximum allowed.

  2. fanquake added the label Tests on Dec 2, 2020
  3. fanquake commented at 4:58 AM on December 2, 2020: member

    feature_block has failed:

    2020-12-02T04:37:36.022000Z TestFramework (INFO): Accept a block with invalid opcodes in dead execution paths
    2020-12-02T04:37:36.134000Z TestFramework (INFO): Test re-orging blocks with OP_RETURN in them
    2020-12-02T04:37:36.912000Z TestFramework (INFO): Test a re-org of one week's worth of blocks (1088 blocks)
    2020-12-02T04:46:04.530000Z TestFramework (ERROR): Assertion failed
    Traceback (most recent call last):
      File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_framework.py", line 126, in main
        self.run_test()
      File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/feature_block.py", line 1278, in run_test
        self.send_blocks([block], True, timeout=2440)
      File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/feature_block.py", line 1410, in send_blocks
        self.helper_peer.send_blocks_and_test(blocks, self.nodes[0], success=success, reject_reason=reject_reason, force_send=force_send, timeout=timeout, expect_disconnect=reconnect)
      File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/p2p.py", line 631, in send_blocks_and_test
        self.sync_with_ping(timeout=timeout)
      File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/p2p.py", line 507, in sync_with_ping
        self.wait_until(test_function, timeout=timeout)
      File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/p2p.py", line 412, in wait_until
        wait_until_helper(test_function, timeout=timeout, lock=p2p_lock, timeout_factor=self.timeout_factor)
      File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/util.py", line 247, in wait_until_helper
        if predicate():
      File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/p2p.py", line 409, in test_function
        assert self.is_connected
    AssertionError
    2020-12-02T04:46:06.683000Z TestFramework (INFO): Stopping nodes
    stderr:
    Traceback (most recent call last):
      File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/authproxy.py", line 107, in _request
        self.__conn.request(method, path, postdata, headers)
      File "/usr/lib/python3.8/http/client.py", line 1255, in request
        self._send_request(method, url, body, headers, encode_chunked)
      File "/usr/lib/python3.8/http/client.py", line 1301, in _send_request
        self.endheaders(body, encode_chunked=encode_chunked)
      File "/usr/lib/python3.8/http/client.py", line 1250, in endheaders
        self._send_output(message_body, encode_chunked=encode_chunked)
      File "/usr/lib/python3.8/http/client.py", line 1049, in _send_output
        self.send(chunk)
      File "/usr/lib/python3.8/http/client.py", line 971, in send
        self.sock.sendall(data)
    BrokenPipeError: [Errno 32] Broken pipe
    During handling of the above exception, another exception occurred:
    Traceback (most recent call last):
      File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/feature_block.py", line 1417, in <module>
        FullBlockTest().main()
      File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_framework.py", line 149, in main
        exit_code = self.shutdown()
      File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_framework.py", line 278, in shutdown
        self.stop_nodes()
      File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_framework.py", line 526, in stop_nodes
        node.stop_node(wait=wait)
      File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_node.py", line 319, in stop_node
        self.stop(wait=wait)
      File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/coverage.py", line 47, in __call__
        return_val = self.auth_service_proxy_instance.__call__(*args, **kwargs)
      File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/authproxy.py", line 144, in __call__
        response, status = self._request('POST', self.__url.path, postdata.encode('utf-8'))
      File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/authproxy.py", line 113, in _request
        self.__conn.request(method, path, postdata, headers)
      File "/usr/lib/python3.8/http/client.py", line 1255, in request
        self._send_request(method, url, body, headers, encode_chunked)
      File "/usr/lib/python3.8/http/client.py", line 1301, in _send_request
        self.endheaders(body, encode_chunked=encode_chunked)
      File "/usr/lib/python3.8/http/client.py", line 1250, in endheaders
        self._send_output(message_body, encode_chunked=encode_chunked)
      File "/usr/lib/python3.8/http/client.py", line 1010, in _send_output
        self.send(msg)
      File "/usr/lib/python3.8/http/client.py", line 950, in send
        self.connect()
      File "/usr/lib/python3.8/http/client.py", line 921, in connect
        self.sock = self._create_connection(
      File "/usr/lib/python3.8/socket.py", line 808, in create_connection
        raise err
      File "/usr/lib/python3.8/socket.py", line 796, in create_connection
        sock.connect(sa)
    ConnectionRefusedError: [Errno 111] Connection refused
    
     node0 2020-12-02T04:44:06.247207Z [msghand] - Disconnect block: 2249.91ms 
     node0 2020-12-02T04:44:06.863029Z [msghand] UpdateTip: new best=107d0d7d666cc30a086b3346a220e0f71fe736988247e8c7227b7025c92e7993 height=378 version=0x00000004 log2_work=9.566054 tx=11656 date='2020-12-02T04:42:46Z' progress=1.000000 cache=1.6MiB(12001txo) 
     node0 2020-12-02T04:44:06.863218Z [msghand] Enqueuing BlockDisconnected: block hash=621dba2ca71816fc9d2e28994582593ea03aa4184638ee3962a7ed632e8b2ba8 block height=379 
     node0 2020-12-02T04:44:18.793214Z [msghand] - Disconnect block: 7207.89ms 
     node0 2020-12-02T04:44:53.834226Z [msghand] UpdateTip: new best=73b411695b1980e6ea88ae9fd02cddd51a71be59e988eee78243ffd0eaf60ccf height=377 version=0x00000004 log2_work=9.562242 tx=11653 date='2020-12-02T04:42:45Z' progress=1.000000 cache=1.6MiB(12001txo) 
     node0 2020-12-02T04:45:01.905174Z [msghand] Enqueuing BlockDisconnected: block hash=107d0d7d666cc30a086b3346a220e0f71fe736988247e8c7227b7025c92e7993 block height=378 
     test  2020-12-02T04:46:03.936000Z TestFramework.p2p (DEBUG): Closed connection to: 127.0.0.1:16244 
     test  2020-12-02T04:46:04.530000Z TestFramework (ERROR): Assertion failed 
                                       Traceback (most recent call last):
                                         File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_framework.py", line 126, in main
                                           self.run_test()
                                         File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/feature_block.py", line 1278, in run_test
                                           self.send_blocks([block], True, timeout=2440)
                                         File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/feature_block.py", line 1410, in send_blocks
                                           self.helper_peer.send_blocks_and_test(blocks, self.nodes[0], success=success, reject_reason=reject_reason, force_send=force_send, timeout=timeout, expect_disconnect=reconnect)
                                         File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/p2p.py", line 631, in send_blocks_and_test
                                           self.sync_with_ping(timeout=timeout)
                                         File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/p2p.py", line 507, in sync_with_ping
                                           self.wait_until(test_function, timeout=timeout)
                                         File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/p2p.py", line 412, in wait_until
                                           wait_until_helper(test_function, timeout=timeout, lock=p2p_lock, timeout_factor=self.timeout_factor)
                                         File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/util.py", line 247, in wait_until_helper
                                           if predicate():
                                         File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/p2p.py", line 409, in test_function
                                           assert self.is_connected
                                       AssertionError
     test  2020-12-02T04:46:06.528000Z TestFramework (DEBUG): Closing down network thread 
     test  2020-12-02T04:46:06.683000Z TestFramework (INFO): Stopping nodes 
     test  2020-12-02T04:46:06.683000Z TestFramework.node0 (DEBUG): Stopping node 
    

    I've added a commit to increase the memory of the container to the allowed maximum. Either that fixes it, or we update the comment to make it generic, as Travis is going away, and this is currently being run on Cirrus.
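
For readers following the traceback above: the run did not exhaust `timeout=2440`. The log shows the re-org step starting at 04:37:36 and the failure at 04:46:04, right after "Closed connection to: 127.0.0.1:16244", and the later `ConnectionRefusedError` on shutdown is consistent with the node process having gone away. The sketch below is a simplified illustration of that polling pattern, not the actual test framework code; `wait_until_sketch`, `FakePeer`, `pong_received`, and `poll_interval` are hypothetical names introduced for this example.

```python
import time


def wait_until_sketch(predicate, *, timeout=60, timeout_factor=1.0, poll_interval=0.05):
    """Poll predicate() until it returns True or the scaled timeout elapses.

    Simplified stand-in for the wait_until_helper seen in the traceback.
    """
    deadline = time.monotonic() + timeout * timeout_factor
    while time.monotonic() < deadline:
        if predicate():
            return
        time.sleep(poll_interval)
    raise AssertionError(f"predicate not true after {timeout * timeout_factor} seconds")


class FakePeer:
    """Hypothetical stand-in for a P2P test connection."""

    def __init__(self):
        self.is_connected = True
        self.pong_received = False

    def sync_with_ping(self, timeout=60):
        def test_function():
            # Mirrors the assertion in the traceback: a dropped connection
            # fails on the next poll instead of waiting for the timeout.
            assert self.is_connected
            return self.pong_received

        wait_until_sketch(test_function, timeout=timeout)


if __name__ == "__main__":
    peer = FakePeer()
    peer.is_connected = False  # simulate the node dropping the connection
    try:
        peer.sync_with_ping(timeout=2440)
    except AssertionError:
        print("failed immediately on disconnect, long before the timeout")
```

Under these assumptions, a large timeout only helps a slow node; it does nothing for a node that has dropped the connection.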

  4. fanquake commented at 5:51 AM on December 2, 2020: member

    The TSAN job is now passing. @MarcoFalke can you advise if it's ok for us to just bump the memory to 24GB? I think the only potential downside is that the TSAN job may not get scheduled immediately depending on the load on Cirrus's community containers.

  5. MarcoFalke commented at 7:22 AM on December 2, 2020: member

    Concept ACK. The changes are fine, but, unrelated to this pull, we should look into why bitcoind suddenly consumes more than 16GB of memory with TSAN enabled.

  6. MarcoFalke commented at 7:22 AM on December 2, 2020: member
  7. hebasto commented at 7:47 AM on December 2, 2020: member

    Concept ACK.

    "I think the only potential downside is that the TSAN job may not get scheduled immediately depending on the load on Cirrus's community containers."

    I saw similar behavior when the CPU count was bumped to its maximum.

  8. MarcoFalke commented at 8:05 AM on December 2, 2020: member

    If there are scheduling issues with one of the tasks, we could use compute credits for it.

  9. fanquake force-pushed on Dec 2, 2020
  10. ci: no-longer exclude feature_block in TSAN job
    The TSAN job is now running on Cirrus.
    Increase the allocated memory to the maximum allowed.
    2b356117e9
  11. fanquake force-pushed on Dec 2, 2020
  12. practicalswift commented at 9:46 AM on December 2, 2020: contributor

    Strong concept ACK

    More TSAN is better than less TSAN.

    And generally: if adding more testing hardware means more safety checking, that is typically a very good deal :)


    Aside:

    The same economic argument can be applied when doing capacity planning for fuzzing hardware farms: as a general rule, be very aggressive when allocating hardware resources to your long-term fuzzing jobs. It is typically a relatively cheap way to find bugs (assuming good fuzzing coverage, etc.).

    Some empirical results from Marcel Böhme (@mboehme) and Brandon Falk (@gamozolabs)'s excellent paper "Fuzzing: On the Exponential Cost of Vulnerability Discovery":

    We present counterintuitive results for the scalability of fuzzing. Given the same non-deterministic fuzzer, finding the same bugs linearly faster requires linearly more machines. For instance, with twice the machines, we can find all known bugs in half the time. Yet, finding linearly more bugs in the same time requires exponentially more machines. For instance, for every new bug we want to find in 24 hours, we might need twice more machines. Similarly for coverage. With exponentially more machines, we can cover the same code exponentially faster, but uncovered code only linearly faster. In other words, re-discovering the same vulnerabilities is cheap but finding new vulnerabilities is expensive. This holds even under the simplifying assumption of no parallelization overhead. We derive these observations from over four CPU years worth of fuzzing campaigns involving almost three hundred open source programs, two state-of-the-art greybox fuzzers, four measures of code coverage, and two measures of vulnerability discovery. We provide a probabilistic analysis and conduct simulation experiments to explain this phenomenon.
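
The two scaling claims in the abstract can be made concrete with a little arithmetic. The sketch below is a toy model only, not the paper's probabilistic analysis: it assumes perfect parallelization (as the abstract does) and takes the doubling factor from the abstract's illustrative "twice more machines" sentence. The function names and the example figures (10 machines, 24 hours) are hypothetical.

```python
def hours_to_find_same_bugs(base_hours, base_machines, machines):
    """Linear regime: the same bugs are found proportionally faster."""
    return base_hours * base_machines / machines


def machines_for_extra_bugs(base_machines, extra_bugs, growth=2):
    """Exponential regime: each additional bug in the same time budget
    multiplies the machine count by `growth` (2 in the abstract's example)."""
    return base_machines * growth ** extra_bugs


if __name__ == "__main__":
    # Doubling the fleet halves the time to rediscover the known bugs.
    print(hours_to_find_same_bugs(24, 10, 20))  # 12.0

    # Three more bugs within the same 24 hours needs 8x the machines.
    print(machines_for_extra_bugs(10, 3))  # 80
```

In this toy model, going from 10 to 20 machines halves the time to rediscover known bugs, while each additional new bug within the same 24 hours doubles the fleet again.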

  13. jonasschnelli commented at 9:49 AM on December 2, 2020: contributor

    utACK 2b356117e94f9ef27b67a8e98663f5d676f58c11 - checked the CI run and confirmed that feature_block runs: https://cirrus-ci.com/task/6008403543719936?command=ci#L3249

  14. MarcoFalke commented at 9:54 AM on December 2, 2020: member

    review ACK 2b356117e94f9ef27b67a8e98663f5d676f58c11

  15. MarcoFalke merged this on Dec 2, 2020
  16. MarcoFalke closed this on Dec 2, 2020

  17. sidhujag referenced this in commit 4c2e9849c1 on Dec 2, 2020
  18. dongcarl commented at 6:01 PM on December 19, 2020: member

    I'm getting a failure that is identical to the one fanquake got: https://cirrus-ci.com/task/5238486280175616?command=ci#L3462

  19. MarcoFalke commented at 6:17 PM on December 19, 2020: member

    @dongcarl The issue should fix itself after the next push.

  20. DrahtBot locked this on Feb 15, 2022
  21. fanquake deleted the branch on Nov 9, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-26 06:14 UTC
