test/refactor: use test deque to avoid quadratic iteration #33313

pull l0rinc wants to merge 1 commits into bitcoin:master from l0rinc:l0rinc/linearize_test_runner changing 1 files +6 −7
  1. l0rinc commented at 11:45 pm on September 4, 2025: contributor

    Extracted from #33141 (review).


    In Python, list pop(0) is linear, so consuming all items in the test results in quadratic iteration.

    Switching to collections.deque with popleft() expresses FIFO intent and avoids the O(n^2) path. Behavior is unchanged - for a few hundred items the perf impact is likely negligible.

  2. DrahtBot commented at 11:45 pm on September 4, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/33313.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK theStack, enirox001, maflcko, w0xlt
    Stale ACK kevkevinpal

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #33142 (test: Run bench sanity checks in parallel with functional tests by maflcko)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. in test/functional/test_runner.py:455 in 1500bb359e outdated
    448@@ -449,8 +449,7 @@ def main():
    449         print("Re-compile with the -DBUILD_DAEMON=ON build option")
    450         sys.exit(1)
    451 
    452-    # Build list of tests
    


    kevkevinpal commented at 11:04 am on September 5, 2025:
    Might make sense to replace this comment to explain why we’re using a deque instead of list since it might not be obvious to a reader as to why

    l0rinc commented at 6:42 pm on September 7, 2025:
    Done
  4. in test/functional/test_runner.py:471 in 1500bb359e outdated
    467@@ -469,15 +468,15 @@ def main():
    468             script = script + ".py" if ".py" not in script else script
    469             matching_scripts = [s for s in ALL_SCRIPTS if s.startswith(script)]
    470             if matching_scripts:
    471-                test_list.extend(matching_scripts)
    472+                test_queue += matching_scripts
    


    kevkevinpal commented at 11:15 am on September 5, 2025:

    deque also has an extend method, it might be preferable to use that instead since it is more readable

    https://docs.python.org/3/library/collections.html#collections.deque.extend


    l0rinc commented at 2:39 pm on September 5, 2025:
    Yeah, but two lines below we’re already using +=
  5. kevkevinpal commented at 11:19 am on September 5, 2025: contributor

    ACK 1500bb3

    Looks good to me, we’re replacing list with a deque and replacing pop(0) with popleft()

  6. in test/functional/test_runner.py:741 in 1500bb359e outdated
    737@@ -739,10 +738,10 @@ def get_next(self):
    738                               log_stdout,
    739                               log_stderr))
    740         if not self.jobs:
    741-            raise IndexError('pop from empty list')
    742+            raise IndexError('pop from empty queue')
    


    maflcko commented at 10:23 am on September 7, 2025:

    i don’t think this is right, as explained previously. This block of code is about self.jobs, which is a list, even after the changes here.

    It is just a sanity check, so a shorter way to write it would be assert(self.jobs) # Can not and must not be empty here.


    l0rinc commented at 6:42 pm on September 7, 2025:
    I remember you said that, but we’re not popping from the jobs - but the assert is better anyway, thanks for the hint, added you as coauthor.
  7. in test/functional/test_runner.py:744 in 1500bb359e outdated
    737@@ -739,10 +738,10 @@ def get_next(self):
    738                               log_stdout,
    739                               log_stderr))
    740         if not self.jobs:
    741-            raise IndexError('pop from empty list')
    742+            raise IndexError('pop from empty queue')
    743 
    744         # Print remaining running jobs when all jobs have been started.
    745-        if not self.test_list:
    746+        if not self.test_queue:
    


    maflcko commented at 10:28 am on September 7, 2025:

    I am not sure if this needs a rename.

    • Generally, we avoid putting the type name in the variable name
    • This just says that there is a list of tests (what exact type the list has should not matter for the code)
    • It creates a conflict with the other pull request

    So it would be better to leave this as it was previously, or at least not include the exact type in the name.


    l0rinc commented at 6:42 pm on September 7, 2025:
    I agree the type name is weird, but it’s wrong now that it’s not a lits (I understand you mean it’s a conceptual list, not a type, but I also meant a conceptual queue, not a list) - but we can adjust that later, I understand the conflict argument. Reverted the name
  8. maflcko commented at 10:29 am on September 7, 2025: member
    I don’t think this will ever be a measurable difference, but it seems fine to change. (Left some comments)
  9. l0rinc force-pushed on Sep 7, 2025
  10. w0xlt commented at 6:16 pm on September 9, 2025: contributor

    ACK https://github.com/bitcoin/bitcoin/pull/33313/commits/8c1f10233dc9a843147772c40973031f5bfdbb7c

    While the impact of this PR on a few hundred items is negligible, it’s still a good practice and is likely to scale more effectively as test coverage increases.

    For reference, the script below demonstrates the performance difference with 1,000,000 items on my machine:

    0List pop(0): 62.790 seconds
    1Deque popleft(): 0.022 seconds
    
     0import time
     1from collections import deque
     2
     3N = 1_000_000  # number of elements
     4
     5def benchmark_list():
     6    lst = list(range(N))
     7    start = time.perf_counter()
     8    while lst:
     9        lst.pop(0)  # O(n) per operation
    10    end = time.perf_counter()
    11    return end - start
    12
    13def benchmark_deque():
    14    dq = deque(range(N))
    15    start = time.perf_counter()
    16    while dq:
    17        dq.popleft()  # O(1) per operation
    18    end = time.perf_counter()
    19    return end - start
    20
    21if __name__ == "__main__":
    22    print(f"List pop(0): {benchmark_list():.3f} seconds")
    23    print(f"Deque popleft(): {benchmark_deque():.3f} seconds")
    
  11. DrahtBot requested review from kevkevinpal on Sep 9, 2025
  12. theStack approved
  13. theStack commented at 4:47 pm on September 10, 2025: contributor

    ACK 8c1f10233dc9a843147772c40973031f5bfdbb7c

    While I agree with others that very likely there won’t be a notable difference in run-time any time soon (if ever), it seems conceptually the right approach.

  14. enirox001 commented at 4:47 pm on September 11, 2025: contributor
    ACK https://github.com/bitcoin/bitcoin/commit/8c1f10233dc9a843147772c40973031f5bfdbb7c. This is a solid improvement. While the benefits may not be immediately obvious, it’s a valuable addition. I ran the functional tests and everything passed successfully.
  15. test/refactor: use test deque to avoid quadratic iteration
    Extracted from https://github.com/bitcoin/bitcoin/pull/33141#discussion_r2323012972.
    In Python, list `pop(0)` is linear, so consuming all items is quadratic.
    Switched to `collections.deque` with `popleft()` to express FIFO intent and avoid the O(n^2) path.
    Behavior is unchanged; for a few hundred items the perf impact is likely negligible.
    
    Ref: https://docs.python.org/3/tutorial/datastructures.html#using-lists-as-queues
    > While appends and pops from the end of list are fast, doing inserts or pops
    > from the beginning of a list is slow (because all of the other elements have
    > to be shifted by one).
    
    Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>
    75e6984ec8
  16. l0rinc force-pushed on Sep 11, 2025
  17. l0rinc commented at 8:10 pm on September 11, 2025: contributor
    Thanks for the reviews so far, I have rebased it after #33141, should be trivial to re-review
  18. theStack approved
  19. theStack commented at 4:28 pm on September 12, 2025: contributor
    re-ACK 75e6984ec8c6fa196ad78c11f454da506d7c8ff1
  20. DrahtBot requested review from enirox001 on Sep 12, 2025
  21. DrahtBot requested review from w0xlt on Sep 12, 2025
  22. enirox001 commented at 2:03 pm on September 15, 2025: contributor
    reACK 75e6984
  23. maflcko commented at 7:28 am on September 26, 2025: member

    lgtm ACK 75e6984ec8c6fa196ad78c11f454da506d7c8ff1

    This shouldn’t affect performance, but it seems fine as a style-cleanup/refactor.

  24. glozow merged this on Sep 26, 2025
  25. glozow closed this on Sep 26, 2025


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-10-10 18:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me