Add script_assets_test.json #27

pull sipa wants to merge 1 commits into bitcoin-core:master from sipa:master changing 1 files +1813 −0
  1. sipa commented at 7:00 PM on September 23, 2020: contributor

    This adds a blob for a large script unit test to qa-assets, as discussed in the IRC meeting from last week.

    It's used by a test want to add in the Taproot PR (see https://github.com/bitcoin/bitcoin/pull/19997/commits/3ef08f40cee0a773bd72c5169329004c02ec37bf for details), but is too large to just be included in the repository.

    While the current test data is specific to testing Taproot, the test isn't, and could easily be extended to include tests for all kinds of script logic.

    It would simplify things if it were merged here early, but if we don't want to do that before it's clear the Taproot PR will be merged, I'm also happy to keep it in my repo for now.

  2. Add script_assets_test.json 72735239fd
  3. maflcko approved
  4. maflcko commented at 11:08 AM on September 28, 2020: contributor

    Concept ACK

    ACK, but some (unrelated) questions, as I can't seem to find the IRC backlog.

    • Will the other json unit test data be moved here as well? What is the threshold?
    • Does make check pass when the json file is missing? I.e. is it optional?
    • Is there a script available to generate the json? (Similar to the fuzz script)
  5. sipa commented at 9:09 PM on October 1, 2020: contributor

    The IRC log on the topic:

    12:28:14< wumpus> #topic Size limit for data-driven unit tests (sipa)
    12:28:26< sipa> hi!
    12:29:26< sipa> in [#19953](/bitcoin-core-qa-assets/19953/) i've recently added a unit test with randomly-generated transaction validation success/failure cases, minimized using the fuzzing framework (it's not an actual fuzzer, all input is generated by a python script, but just minimized using the fuzz build)
    12:29:27< luke-jr> o hai thar sipa?
    12:29:29< gribble> https://github.com/bitcoin/bitcoin/issues/19953 | Implement BIP 340-342 validation (Schnorr/taproot/tapscript) by sipa · Pull Request [#19953](/bitcoin-core-qa-assets/19953/) · bitcoin/bitcoin · GitHub
    12:29:53< sipa> which i think is an interesting approach as it permits getting the kind of coverage you get by running the python functional test for days... weeks...
    12:30:09< sipa> it's 250 kB now, which seems in line with some of the other tests we have
    12:30:26< sipa> but i could extend this to test more things, and in particular more validation flags
    12:30:33< luke-jr> does that create a dep on fuzzing stuff for normal tests? :x
    12:30:40< sipa> luke-jr: nope, just a json file
    12:30:51< sipa> with the output of the whole fuzzing procedure
    12:31:00< luke-jr> aha
    12:31:21< sipa> anyway, trying to extend this, using the same approach, leaves me with things in the 1-2 MB range
    12:31:25< wumpus> I think we should be careful to not add too much data for tests to the repository, git is not that great for bulk data storage, though 250kB seems fine to me
    12:31:28< sipa> and i was wondering if that's acceptable
    12:31:53< sipa> there are many more tradeoffs possible, which can reduce that - or extend it - in exchange for coverage/development time
    12:32:06< sipa> but if people are like "1 MB is just fine", that would simplify things
    12:32:22< wumpus> another drawback of large files is that it generates huge diffs, and this isn't really reviewable
    12:32:24< jonasschnelli> I guess the runtime memory requirements are unchanged for that?
    12:32:34< sipa> jonasschnelli: yes, it's just a (very simple) unit test
    12:32:40< wumpus> jonasschnelli: yes, it's only used at test time
    12:32:40< sipa> it's also 1 MB of json which is presumably compressed quite a bit by git
    12:32:43< cfields> sipa: does it compress at all in git?
    12:32:47< vasild> the json contains ascii hex, what if we save it in binary? would be 2x reduce
    12:32:49< cfields> hah
    12:33:23< wumpus> but as it's part of a unit test it also can't easily be moved to another repository like the fuzz dataset one
    12:33:47< sipa> vasild: yes, possible - but if git compresses it already in a similar degree, leaving it in a more readable form has advantages
    12:33:59< luke-jr> is there a reason not to just make this part of the fuzzer build?
    12:34:08< sipa> luke-jr: it's not a fuzzer
    12:34:16< sipa> you can't run it as a fuzzer
    12:34:34< sipa> (it would immediately fail, as it's not testing random inputs)
    12:34:42< vasild> sipa: right, I guess disk space when checked out is irrelevant for such sizes
    12:34:57< luke-jr> sipa: but can't the 250k be generated at test-time?
    12:35:02< sipa> luke-jr: it took me days
    12:35:11< luke-jr> hmm
    12:35:14< sipa> (of CPU time)
    12:35:18-!- S3RK (S3RK) [~s3rk@47.246.66.112] has joined #bitcoin-core-dev
    12:35:23< luke-jr> a special make target?
    12:35:28< sipa> it's nondeterministic
    12:35:36< jnewbery> is https://github.com/bitcoin-core/qa-assets used for test assets?
    12:35:42< jonasschnelli> +1
    12:35:53< sipa> luke-jr: to be clear, this is a test that already runs as a functional test, but only for 1 minute
    12:35:54< wumpus> jnewbery: yes, that's what I meant, the only thing is that it takes an extra step then
    12:36:17< wumpus> someone who wants to read the test also needs that erpository checked out -- not a problem for the CI at leat
    12:36:27< sipa> luke-jr: the approach to extract a very-good-coverage unit test from it makes it a bit more accessible and reusable, and gives the same coverage as running the functional test 1000s of times
    12:36:46-!- AaronvanW (AaronvanW) [~AaronvanW@unaffiliated/aaronvanw] has joined #bitcoin-core-dev
    12:36:58-!- shesek [~shesek@164.90.217.137] has quit [Changing host]
    12:36:58-!- shesek (realname) [~shesek@unaffiliated/shesek] has joined #bitcoin-core-dev
    12:36:58-!- shesek (realname) [~shesek@unaffiliated/shesek] has joined #bitcoin-core-dev
    12:37:05< sipa> wumpus: that's reasonable, i guess
    12:37:21< sipa> skip the test if the json file isn't found
    12:37:27< sipa> or something like that
    12:37:36< wumpus> sipa: yes, sounds fair to me
    12:38:02 * luke-jr glares at boost for not supporting skips still (last I checked)
    12:38:25< wumpus> though there are already some ~250kB json files in the repo, for the tests, I don't think one more is that bad... but let's not make a habit out of it, and also, you're planning to add more data in the future
    12:38:32-!- promag (promag) [~promag@bl19-22-20.dsl.telepac.pt] has joined #bitcoin-core-dev
    12:38:36< ryanofsky> I like the qa assetts idea. Skipping the test if input not found is also similar to what we do for backwards compatibility tests
    12:38:38< sipa> luke-jr: "return true;" works great
    12:38:44< wumpus> yes
    12:38:55< luke-jr> sipa: except it gives the impression it passed?
    12:39:00< ryanofsky> No objection to 250kb either, though
    12:39:34-!- S3RK [~s3rk@47.246.66.112] has quit [Ping timeout: 260 seconds]
    12:39:37< sipa> luke-jr: "make check" also doesn't run the fuzz tests; does that give the impression they passed? :)
    12:39:54< sipa> there could be a "notice: qa-assets not found, so large data-driven tests are skipped"
    12:39:55< luke-jr> sipa: boost explicitly says the tests pass..
    12:40:14< sipa> luke-jr: in aggregate
    12:40:20< luke-jr> sipa: individually
    12:40:25< luke-jr> sipa: this is a problem for another test already IIRC
    12:40:29< wumpus> ok, I think we've given sipa quite some input on this for now, decision doesn't need to be made in the meeting, only ~20 minutes left, and 1 or 2 topics
    

    Will the other json unit test data be moved here as well? What is the threshold?

    Based on the discussion above I'd say somewhere between 250 kB and 1 MB.

    Does make check pass when the json file is missing? I.e. is it optional?

    Yes. I've tested that https://github.com/bitcoin/bitcoin/pull/19953 fails when it's given access to a bogus script_assets_test.json file, but succeeds when the file is missing.

    Is there a script available to generate the json? (Similar to the fuzz script)

    Still working on cleaning up the code for that. The generation code should just be included in master (through #19953 or a separate PR), but the minimization code is more annoying as it's using the fuzz test framework (and needs a fuzz test based build), but isn't a fuzz test on itself that could be included/run automatically.

  6. maflcko commented at 8:10 AM on October 2, 2020: contributor

    I think end users and developers would normally not clone the qa-assets repo, so they might get less test coverage. For end users this is already true, because they don't run the fuzz tests, nor the functional tests.

    For developers this shouldn't matter as long as the lack of coverage doesn't lead to a code bug. Though, with ci running on this, we'll at least see the bug before merge. I expect this to be unlikely and rare. If the missing coverage leads to too many bugs that aren't caught on dev machines, we should think about moving the file to the main repo.

    Until then, it seems fine to host here.

  7. maflcko merged this on Oct 2, 2020
  8. maflcko closed this on Oct 2, 2020


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin-core/qa-assets. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-15 09:25 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me