ci: remove commit count limit from `test-each-commit` and fail fast

l0rinc commented at 9:38 pm on January 22, 2026: contributor

Problem

test-each-commit currently tests only a limited number of ancestor commits in a PR, so failures introduced deeper in the commit stack might be missed.

Fix

Remove the max-count limit so test-each-commit runs the full build + unit + functional test flow on every non-head PR commit, while keeping the PR tip excluded because it is already covered by the normal CI jobs. It will also stop after the first failure to surface the root cause sooner and keep logs readable when testing ancestor commits.

Examples

Example failure 10 commits deep: https://github.com/l0rinc/bitcoin/actions/runs/21390976651/job/61577575033?pr=105 in https://github.com/l0rinc/bitcoin/pull/105
Example pass with >7 dummy commits: https://github.com/l0rinc/bitcoin/actions/runs/21392557521/job/61595159841?pr=106 in https://github.com/l0rinc/bitcoin/pull/106

Note: this PR has gone through a few iterations, the latest one just extends the existing job.

DrahtBot added the label Tests on Jan 22, 2026

DrahtBot commented at 9:38 pm on January 22, 2026: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	maflcko, hebasto, willcl-ark

If your review is incorrectly listed, please copy-paste <!–meta-tag:bot-skip–> into the comment that the bot should ignore.

l0rinc force-pushed on Jan 22, 2026

DrahtBot added the label CI failed on Jan 22, 2026

DrahtBot commented at 10:14 pm on January 22, 2026: contributor

🚧 At least one of the CI tasks failed. Task lint: https://github.com/bitcoin/bitcoin/actions/runs/21265870531/job/61204792016 LLM reason (✨ experimental): Lint failure: a file is not executable (ci-compile-each-commit-exec.py has 644 permission).

Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.
An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

l0rinc marked this as ready for review on Jan 22, 2026

DrahtBot removed the label CI failed on Jan 23, 2026

maflcko commented at 8:48 am on January 23, 2026: member

ci-test-each-commit-exec.py

unrelated note: I think this script is missing the functional test runner --failfast option. A failure will lead to a massive log output with all failures collected, which is hard to read and debug. It would be better to just print the first failure.

More generally, I wonder if it wouldn’t be easier to just run it on all commits. I guess the timeout could be raised to the max (6 hours), possibly using a cirrus runner.

If a pull request doesn’t finish after 6 hours on a 16 core machine, then it is probably too large and should be split up?

If it isn’t possible to split up, one or two reviewers can run the script locally?

l0rinc marked this as a draft on Jan 23, 2026

l0rinc force-pushed on Jan 23, 2026

DrahtBot added the label CI failed on Jan 23, 2026

l0rinc renamed this:
~~ci: compile remaining untested ancestor commits~~
ci: run unit + functional tests on all ancestor commits
on Jan 23, 2026

l0rinc commented at 5:13 pm on January 23, 2026: contributor

Thanks @maflcko, split the previous test-each-commit into two, updated PR title&description (kept old branch name though).

l0rinc marked this as ready for review on Jan 23, 2026

DrahtBot removed the label CI failed on Jan 23, 2026

maflcko commented at 8:17 am on January 26, 2026: member

Ah, with “it” I meant “pull request”, not “the ci task here”.

If a pull request doesn’t finish after 6 hours on a 16 core machine, then it is probably too large and should be split up?

Was there ever a pull request in history that wouldn’t compile and test all commits in 6 hours on a modern 16 core machine?

l0rinc commented at 8:20 am on January 26, 2026: contributor

Was there ever a pull request in history that wouldn’t compile and test all commits in 6 hours on a modern 16 core machine?

let’s find out

l0rinc commented at 8:21 am on January 26, 2026: contributor

the leveldb and crc32 updates don’t always compile, but I couldn’t find any other

edit: e.g. first commit of #33641 fails with util/env_posix.cc:857:24: error: no template named 'is_standard_layout_v' in namespace 'std'; did you mean 'is_standard_layout'? And 9a5d29711afcdc4609da4786673758e641958bb4 also fails with old cmake requirement.

maflcko commented at 8:32 am on January 26, 2026: member

the leveldb and crc32 updates don’t always compile, but I couldn’t find any other

edit: e.g. first commit of #33641 fails with util/env_posix.cc:857:24: error: no template named 'is_standard_layout_v' in namespace 'std'; did you mean 'is_standard_layout'? And 9a5d297 also fails with old cmake requirement.

That pull request has a single commit, so it seems impossible for it to time out. Also, if it timed-out the current CI task would time out as well.

If you want to actually test this, my recommendation would be to only look at pulls with more than 6 commits.

l0rinc commented at 8:54 am on January 26, 2026: contributor

you mean that f21162d8193319d3cdd43cecd66ee5389632533b was already skipped by CI?

maflcko commented at 9:08 am on January 26, 2026: member

you mean that f21162d was already skipped by CI?

The commit is both a child of a merge commit and also a subtree commit of a different repo. You can also check the behavior here https://github.com/bitcoin/bitcoin/actions/runs/21197620806/job/60976836032?pr=34363, which is another subtree bump.

l0rinc force-pushed on Jan 26, 2026

l0rinc renamed this:
~~ci: run unit + functional tests on all ancestor commits~~
ci: remove commit count limit from `test-each-commit`
on Jan 26, 2026

l0rinc commented at 4:05 pm on January 26, 2026: contributor

Thanks for the feedback, the latest version of the PR simply removes the 6 commit limit from test-each-commit job.

maflcko commented at 8:44 am on January 27, 2026: member

This is missing a test (sample CI output).

Also, when increasing the commit count to check, this should run on a larger runner, I’d say. Maybe ghcr.io/cirruslabs/ubuntu-runner-amd64:24.04-md with double the CPU threads?

l0rinc force-pushed on Jan 27, 2026

l0rinc commented at 9:11 am on January 27, 2026: contributor

This is missing a test (sample CI output).

Added back the reproducer examples to the PR description.

Maybe ghcr.io/cirruslabs/ubuntu-runner-amd64:24.04-md

Thanks, added this one with standard ubuntu fallback, similarly to the lint job.

double the CPU threads

It seems to me we’re already doing that: https://github.com/bitcoin/bitcoin/blob/fa6db67369fbf9b9f0ec839b898edd7ba8bfe31a/.github/ci-test-each-commit-exec.py#L59

Also applied a DrahtBot LLM suggestion.

l0rinc force-pushed on Jan 27, 2026

DrahtBot added the label CI failed on Jan 27, 2026

DrahtBot removed the label CI failed on Jan 27, 2026

maflcko commented at 4:44 pm on January 28, 2026: member

lgtm ACK 22a6d8924455f64a3addf361afffc021e1e4d6e8

in .github/workflows/ci.yml:94 in 22a6d89244 outdated

90@@ -98,7 +91,7 @@ jobs:
91           if test -n "$MERGE_BASE"; then
92             EXCLUDE_MERGE_BASE_ANCESTORS=^${MERGE_BASE}^@
93           fi
94-          echo "TEST_BASE=$(git rev-list -n$((${{ env.MAX_COUNT }} + 1)) --reverse HEAD $EXCLUDE_MERGE_BASE_ANCESTORS | head -1)" >> "$GITHUB_ENV"
95+          echo "TEST_BASE=$(git rev-list -n${{ github.event.pull_request.commits }} --reverse HEAD $EXCLUDE_MERGE_BASE_ANCESTORS | head -1)" >> "$GITHUB_ENV"

hodlinator commented at 9:11 am on February 3, 2026:

It seems like now that we are using the exact number of commits, we can drop EXCLUDE_MERGE_BASE_ANCESTORS along with the calculation of it and MERGE_BASE:

0          echo "TEST_BASE=$(git rev-list -n${{ github.event.pull_request.commits }} --reverse HEAD | head -1)" >> "$GITHUB_ENV"

in fact, we can probably simplify further to:

0          echo "TEST_BASE=$(git rev-parse HEAD~${{ github.event.pull_request.commits - 1 }})" >> "$GITHUB_ENV"

maflcko commented at 9:29 am on February 3, 2026:

no? subtree merges will still have to be skipped

hodlinator commented at 9:33 am on February 3, 2026:

Unless my LLM is hallucinating, maybe even this would be possible:

0          echo "TEST_BASE=${{ github.event.pull_request.commits[0].id }}" >> "$GITHUB_ENV"

l0rinc commented at 9:36 am on February 3, 2026:

I didn’t dare refactoring this more than needed - is it a nit or a blocker?

hodlinator commented at 9:36 am on February 3, 2026:

no? subtree merges will still have to be skipped

Why?

maflcko commented at 9:40 am on February 3, 2026:

no? subtree merges will still have to be skipped

Why?

The CI script is written for the Bitcoin Core repo and codebase, if it is run on a completely different repo and codebase, it will fail.

hodlinator commented at 9:53 am on February 3, 2026:

I didn’t dare refactoring this more than needed - is it a nit or a blocker?

Thought it might be valuable to clean some of this away. But seems like it’s still needed for subtree merges. Would be nice to clearly document that.

Honestly I don’t feel competent to review this change, even though it’s small and the test runs look valid (I checked them for off-by-ones). I would want to lean on my own understanding of the PR content and not on @maflcko’s. Hopefully someone with more familiarity can help out, I’ll focus on other issues for now.

maflcko commented at 10:05 am on February 3, 2026:

It should be trivial to prove me wrong by pushing a pull request with your changes + a subtree merge on top. If it passes, I am happy to ack.

It should also be trivial to test locally:

 0git checkout 26fbe10873e727c5f345a6130e819772a321d924 # subtree merge commit
 1
 2git rebase --exec 'echo hi' 34a5ecadd7203121b03ac692d22a752d2a364111 
 3Auto-merging .github/workflows/ci.yml
 4CONFLICT (content): Merge conflict in .github/workflows/ci.yml
 5Auto-merging CMakeLists.txt
 6CONFLICT (content): Merge conflict in CMakeLists.txt
 7error: could not apply 2fccbea3c8... Squashed 'src/secp256k1/' changes from d543c0d917..14e56970cb
 8hint: Resolve all conflicts manually, mark them as resolved with
 9hint: "git add/rm <conflicted_files>", then run "git rebase --continue".
10hint: You can instead skip this commit: run "git rebase --skip".
11hint: To abort and get back to the state before "git rebase", run "git rebase --abort".
12hint: Disable this message with "git config set advice.mergeConflict false"
13Could not apply 2fccbea3c8... # Squashed 'src/secp256k1/' changes from d543c0d917..14e56970cb

maflcko commented at 10:10 am on February 3, 2026:

I would want to lean on my own understanding of the PR content and not on @maflcko’s.

I support this, so if there are any doc changes, I am happy to ack that as well.

hodlinator commented at 10:22 am on February 3, 2026:

It should be trivial

It’s not trivial to me, I’ve never merged a subtree, I don’t see why git would produce conflicts when rebasing such commits.

My intuition is that one would want to run our tests on the commit that merges in a new version of secp256k1. I get that one doesn’t want to traverse down the history of commits in secp256k1 and test those though.

l0rinc commented at 10:28 am on February 3, 2026:

I’ve never merged a subtree

It’s messy, I also have shortcomings in this area - it’s why I wanted the change to focus on the part I do understand. I also want to clean this up if it’s easy to do, but this may not be that easy - do you mind if we do these non-trivial cleanups in a separate PR?

maflcko commented at 10:46 am on February 3, 2026:

It should be trivial

It’s not trivial to me, I’ve never merged a subtree, I don’t see why git would produce conflicts when rebasing such commits.

I think it is generally true for any merge commit, not just a subtree merge commit.

If a dev is merging other open pull requests into a draft pull request to build something on top, we similarly only want to build the fresh commits on top of the (octo) merge base.

My intuition is that one would want to run our tests on the commit that merges in a new version of secp256k1. I get that one doesn’t want to traverse down the history of commits in secp256k1 and test those though.

Right, I guess it would be nice if this was possible. Though, subtree bumps are usually stand-alone pull requests, and the risk should be minimal that they introduce bisect errors. Same for pull requests with “normal” merge commits: Merge commits are not allowed in final pull requests, so running the CI on them in drafts doesn’t serve a purpose.

maflcko commented at 10:53 am on February 3, 2026:

For reference, if you want to create a subtree merge commit, you can run e.g. git subtree pull --prefix src/leveldb https://github.com/bitcoin-core/leveldb-subtree bitcoin-fork --squash to get fabced56f650eb0bc326f427dba8f8ec96c7b517

maflcko commented at 10:59 am on February 3, 2026:

For reference, if we wanted to run on merge commits as well, it could be rewritten to something like:

 0# 1. Initial Check: If HEAD is a merge commit, return immediately.
 1#    (other CI tasks check the top).
 2if is_merge_commit(HEAD):
 3    return
 4
 5# 2. Always skip the top commit (other CI tasks check the top)
 6git checkout HEAD~
 7
 8while true:
 9    ci_exec()
10   
11    # If the commit we just executed was the first one in the PR, all is done.
12    if is_pr_first_commit():
13        break
14
15    # If the commit we just executed was a merge commit, all is done, because merge commits are not allowed in pull requests and if this was a subtree merge commit, it was just checked. (A pull request can only contain one (subtree) merge commit)
16    if is_merge_commit(HEAD):
17        break
18
19    # Continue
20    git checkout HEAD~
21end

However, I am not sure if this is clearer.

hebasto commented at 3:10 pm on February 3, 2026:

My intuition is that one would want to run our tests on the commit that merges in a new version of secp256k1. I get that one doesn’t want to traverse down the history of commits in secp256k1 and test those though.

I’m not opposed this idea, but it would be more productive to keep this PR scope limited to its initial intent.

hodlinator commented at 9:18 am on February 3, 2026: contributor

Reviewed 22a6d8924455f64a3addf361afffc021e1e4d6e8

While I feel comfortable renaming things in CI (#33909), I’m not knowledgeable about runners. So can’t leave more than a light A-C-K. But would like to see the current version simplified if we are indeed going ahead with it (see inline comment).

hebasto commented at 2:41 pm on February 3, 2026: member

Concept ACK.

in .github/workflows/ci.yml:90 in 22a6d89244

hebasto commented at 4:01 pm on February 3, 2026:

I don’t think this comment is relevant anymore.

l0rinc commented at 4:06 pm on February 3, 2026:

is it related to this change or should we do it in a follow-up?

hebasto commented at 4:12 pm on February 3, 2026:

nm, I realized that FETCH_DEPTH calculation is not modified in this change.

hebasto approved

hebasto commented at 4:04 pm on February 3, 2026: member

ACK 22a6d8924455f64a3addf361afffc021e1e4d6e8, I have reviewed the code and it looks OK.

I agree on that:

unrelated note: I think this script is missing the functional test runner --failfast option. A failure will lead to a massive log output with all failures collected, which is hard to read and debug. It would be better to just print the first failure.

maflcko commented at 4:14 pm on February 3, 2026: member

I agree on that:

unrelated note: I think this script is missing the functional test runner --failfast option. A failure will lead to a massive log output with all failures collected, which is hard to read and debug. It would be better to just print the first failure.

It could make sense to add a commit for that here. Otherwise, a new pull will have to be opened for this.

l0rinc force-pushed on Feb 3, 2026

l0rinc commented at 4:49 pm on February 3, 2026: contributor

Pushed https://github.com/bitcoin/bitcoin/blob/0ffb20dee17858e0f13b23fce3dd0ec06cbeb9a2/test/functional/test_runner.py#L420 in separate commit since it’s a behavioral change, explained the reasoning in the commit message, added @maflcko as coauthor. Also rebased since to make sure the changes from #34461 are included here.

l0rinc renamed this:
~~ci: remove commit count limit from `test-each-commit`~~
ci: remove commit count limit from `test-each-commit` and fail fast
on Feb 3, 2026

hebasto approved

hebasto commented at 5:22 pm on February 3, 2026: member

ACK d25dfa700307356df7cce75b7b90a2af259bc3ab.

DrahtBot requested review from maflcko on Feb 3, 2026

maflcko commented at 5:47 pm on February 3, 2026: member

review ACK d25dfa700307356df7cce75b7b90a2af259bc3ab 🍘

Signature:

0untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
1RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
2trusted comment: review ACK d25dfa700307356df7cce75b7b90a2af259bc3ab 🍘
3VKjOQYMGIONIbPMiVgB/zKykqkh9fm+LNoWztyrRoJHorqFj1yShI0ZR0lNoq4O9FEqVFjk6oMiogHcpZrFVBA==

maflcko commented at 5:29 pm on February 4, 2026: member

Maybe to add documentation about why the merge base is searched, not only how, the following diff could be included?

 0diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
 1index 1f9af64129..c6ec9f4942 100644
 2--- a/.github/workflows/ci.yml
 3+++ b/.github/workflows/ci.yml
 4@@ -75,6 +75,16 @@ jobs:
 5           # Checkout HEAD~ because it would be wasteful to rerun tests on the PR
 6           # head commit that are already run by other jobs.
 7           git checkout HEAD~
 8+          # Moreover, pull requests that contain a merge commit are generally
 9+          # draft pull requests that merge in other pull requests, so only
10+          # check the relevant commits after the last merge commit.  A merge
11+          # commit could also be a subtree merge commit, which may be
12+          # worthwhile to check. However, it is rare that the subtree merge
13+          # commit is not the top commit (which would be skipped anyway by this
14+          # task, because it is run by all other tasks). Also, `git rebase
15+          # --exec` does not work on merge commits, so if this was important to
16+          # check, the logic would have to be rewritten.
17+          #
18           # Figure out test base commit by listing ancestors of HEAD, excluding
19           # ancestors of the most recent merge commit, ordering them from oldest to
20           # newest, and taking the first one.

ci: remove commit count limit from `test-each-commit`

Extend `test-each-commit` to run on every non-head pull request commit.
The PR tip is excluded because it is already covered by other CI jobs.
Runner was changed to a performant cirrus runner.

Co-authored-by: MarcoFalke <*~=`'#}+{/-|&$^_@721217.xyz>

04c4d71008

ci: fail fast in test-each-commit script

Pass `--failfast` to the functional test runner in `.github/ci-test-each-commit-exec.py`.
Stop after the first failure to surface the root cause sooner and keep logs readable when testing ancestor commits.

Co-authored-by: MarcoFalke <*~=`'#}+{/-|&$^_@721217.xyz>

eb510f8678

l0rinc force-pushed on Feb 4, 2026

l0rinc commented at 7:13 pm on February 4, 2026: contributor

the following diff could be included?

Thanks, did a fixup for the first commit, reordered the whole comment block a bit, see: d25dfa700307356df7cce75b7b90a2af259bc3ab..eb510f8678ba2e5f2c2fad2a8b086ce93293de1a#diff-b803fcb7f1 Rebased again to make sure we’re running against the latest CI changes.

maflcko commented at 8:28 pm on February 4, 2026: member

I think it is easier to review if the block was not re-ordered and re-written at the same time. Anyway, fwiw:

lgtm ACK eb510f8678ba2e5f2c2fad2a8b086ce93293de1a 🕓

Signature:

0untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
1RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
2trusted comment: lgtm ACK eb510f8678ba2e5f2c2fad2a8b086ce93293de1a 🕓
3i5s181Yz65F3szc005svNTze+18aPXVjci5iO+IntDvB0+1TS2a2huHA9vvch2IYlg5O14gReUjiibrMd3hLDg==

DrahtBot requested review from hebasto on Feb 4, 2026

hebasto approved

hebasto commented at 2:01 pm on February 5, 2026: member

re-ACK eb510f8678ba2e5f2c2fad2a8b086ce93293de1a.

hebasto commented at 2:01 pm on February 5, 2026: member

cc @willcl-ark

in .github/workflows/ci.yml:61 in 04c4d71008

55@@ -56,12 +56,11 @@ jobs:
56           fi
57 
58   test-each-commit:
59-    name: 'test max 6 ancestor commits'
60-    runs-on: ubuntu-24.04
61+    name: 'test ancestor commits'
62+    needs: runners
63+    runs-on: ${{ needs.runners.outputs.provider == 'cirrus' && 'ghcr.io/cirruslabs/ubuntu-runner-amd64:24.04-md' || 'ubuntu-24.04' }}

willcl-ark commented at 8:07 pm on February 16, 2026:

Perhaps if we are going to use cirrus runners for this job we should (here, or in a follow-up) take advantage of the cirrus cache for ccache on the first build with:

0      - name: Configure environment
1        uses: ./.github/actions/configure-environment
2
3      - name: Restore caches
4        id: restore-cache
5        uses: ./.github/actions/restore-caches

and the finally at the end of the action

0      - name: Save caches
1        uses: ./.github/actions/save-caches

Would probably save ~5 mins on the first compilation.

maflcko commented at 7:17 am on February 17, 2026:

I don’t think we want pull request caches saved at all, so I don’t think this will work

willcl-ark commented at 8:53 am on February 17, 2026:

You are correct

in .github/workflows/ci.yml:90 in 04c4d71008

92+          # subtree merge commit is not the top commit (which
93+          # would be skipped anyway by this task, because it is
94+          # run by all other tasks). Also, `git rebase --exec`
95+          # does not work on merge commits, so if this was
96+          # important to check, the logic would have to be
97+          # rewritten.

willcl-ark commented at 8:09 pm on February 16, 2026:

Can’t help but feel that this is a little wasteful to have 30 lines of comments, of width 64 chars. If i was touching I think I’d join every other line to go with 120 chars or so… :) But this is only a style nit, feel free to ignore.

willcl-ark approved

willcl-ark commented at 8:47 pm on February 16, 2026: member

ACK eb510f8678ba2e5f2c2fad2a8b086ce93293de1a

The code, CI run, and example run(s) look correct to me.

Left two nits, which could be done here or in a followup.

fanquake merged this on Feb 17, 2026

fanquake closed this on Feb 17, 2026

ci: remove commit count limit from test-each-commit and fail fast #34383

Problem

Fix

Examples

Reviews

ci: remove commit count limit from `test-each-commit` and fail fast #34383