I’m sure anyone who runs the functional tests regularly has experienced occasional failures. Often the failed tests pass when you run them again. Occasionally a functional test fails repeatedly, which is more concerning. It is difficult to tell whether others are seeing the same flakiness or repeated failures, and whether it is worth spending time trying to understand and fix the issue.
In #25030, @MarcoFalke pointed out that some functional tests have been flaky for a long time now.
This issue is a first attempt to track the failures and repeated flakiness that contributors experience with particular functional tests. I’m not sure how best to organize this: whether there should be a table of functional tests that is regularly updated and edited, or whether individual contributors should just add comments below on which functional tests are failing or flaky for them.
For now, please comment below if you experience failures or flakiness with a particular functional test. If you spend any time trying to understand why it failed, feel free to add your thoughts on what you think is causing the problem.
Ideally, this issue will help identify which functional tests to prioritize for fixing, but we’ll see whether it proves useful. If it doesn’t, feel free to close it.