Verify the block filter hash when reading from disk. #19280

pull pstratem wants to merge 3 commits into bitcoin:master from pstratem:2020-06-14-blockfilterindex-checksums changing 5 files +79 −20
  1. pstratem commented at 0:19 am on June 15, 2020: contributor

    What’s stored in the block filter index data files is the block_hash and the encoded gcs filter.

    If there’s a disk error that corrupts the encoded filter there’s no way for the node to know.

    Clients that use the “header” chain will notice the error though.

    It takes 1ms to verify the hash on a filter with 10000 elements.

  2. fanquake added the label UTXO Db and Indexes on Jun 15, 2020
  3. MarcoFalke added the label P2P on Jun 15, 2020
  4. pstratem commented at 1:53 am on June 15, 2020: contributor
    the Travis failure doesn’t seem related
  5. jonatack commented at 9:17 am on June 15, 2020: member
    Code review ACK 180e094d52262322d525ccdef9, did not benchmark the performance cost.
  6. MarcoFalke commented at 10:24 am on June 15, 2020: member
    We don’t do this when reading blocks from disk and serving them to p2p. Also, with silent corruption on disk, wouldn’t it be better to abort the node?
  7. pstratem force-pushed on Jun 15, 2020
  8. pstratem commented at 4:18 pm on June 15, 2020: contributor
    @MarcoFalke that’s a good point, I’ve changed it so that we don’t run the checks in GCSFilter::GCSFilter when the encoded filter hash has been checked. That check is about twice as expensive as calculating the hash.
  9. pstratem commented at 4:31 pm on June 15, 2020: contributor
    I’ve added the benchmarks I used.
  10. DrahtBot commented at 7:46 pm on June 15, 2020: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #18011 (Replace current benchmarking framework with nanobench by martinus)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  11. Verify the block filter hash when reading from disk.
    Don't run the (relatively) expensive sanity check in GCSFilter constructor
    if we've checked the encoded filter hash.
    3f0be5abab
  12. Add DecodeGCSFilter and BlockFilterGetHash benchmarks.
    Also standardize on the BASIC filter parameters so we can compare between all the benchmarks.
    1b7c2a07c4
  13. Add DecodeCheckedGCSFilter benchmark 75e86b1884
  14. pstratem force-pushed on Jun 15, 2020
  15. ryanofsky approved
  16. ryanofsky commented at 8:56 pm on June 30, 2020: member
    Code review ACK 75e86b18846aa21a88463dfd3e710534afb6c104. If I’m understanding this correctly, it is replacing slower less comprehensive checks that filters contain the right number of elements with faster checks verifying complete hashes of the filters. It also adds a benchmark. So this seems better in every respect.
  17. in src/blockfilter.cpp:93 in 75e86b1884
     96-    for (uint64_t i = 0; i < m_N; ++i) {
     97-        GolombRiceDecode(bitreader, m_params.m_P);
     98-    }
     99-    if (!stream.empty()) {
    100-        throw std::ios_base::failure("encoded_filter contains excess data");
    101+    if (!filter_checked) {
    


    jonatack commented at 12:55 pm on July 4, 2020:

    3f0be5a If you retouch, returning early might be preferable and a simpler change (feel free to ignore).

    0+    if (filter_checked) return;
    1-    if (!filter_checked) {
    2-       .../...
    3-       }
    
  18. in src/index/blockfilterindex.cpp:158 in 75e86b1884
    153@@ -153,7 +154,10 @@ bool BlockFilterIndex::ReadFilterFromDisk(const FlatFilePos& pos, BlockFilter& f
    154     std::vector<unsigned char> encoded_filter;
    155     try {
    156         filein >> block_hash >> encoded_filter;
    157-        filter = BlockFilter(GetFilterType(), block_hash, std::move(encoded_filter));
    158+        uint256 result;
    159+        CHash256().Write(encoded_filter.data(), encoded_filter.size()).Finalize(result.begin());
    


    jonatack commented at 3:02 pm on July 4, 2020:
    This looks like a future candidate to become a Span.
  19. jonatack commented at 3:06 pm on July 4, 2020: member

    ACK 75e86b1 per git diff 180e094 75e86b1

    Ran comparative benchmarks built with --enable-bench CXXFLAGS="-O2".

    This branch rebased on master:

    0# Benchmark, evals, iterations, total, min, max, median
    1DecodeCheckedGCSFilter, 100, 50000, 4.28914, 7.30889e-07, 1.02493e-06, 8.58921e-07
    2
    3DecodeGCSFilter, 10, 5000, 16.2784, 0.000314371, 0.000337642, 0.000329746
    4
    5BlockFilterGetHash, 10, 10000, 15.4826, 0.000142116, 0.000163642, 0.000158849
    6
    7ConstructGCSFilter, 10, 1000, 22.988, 0.00217746, 0.00253763, 0.00227277
    8
    9MatchGCSFilter, 10, 50000, 18.8786, 3.56681e-05, 4.12871e-05, 3.7422e-05
    

    Master, with the standardized BASIC filter params. Ran twice:

    0# Benchmark, evals, iterations, total, min, max, median
    1ConstructGCSFilter, 10, 1000, 23.2998, 0.00226351, 0.00248678, 0.00233295
    2ConstructGCSFilter, 10, 1000, 23.2406, 0.00225238, 0.00251339, 0.00230733
    3
    4MatchGCSFilter, 10, 50000, 19.3379, 3.70859e-05, 4.26215e-05, 3.85519e-05
    5MatchGCSFilter, 10, 50000, 18.7122, 3.54355e-05, 3.90555e-05, 3.78147e-05
    
  20. DrahtBot added the label Needs rebase on Jul 30, 2020
  21. DrahtBot commented at 2:29 pm on July 30, 2020: member

    🐙 This pull request conflicts with the target branch and needs rebase.

    Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a “draft”.

  22. DrahtBot commented at 11:21 am on December 15, 2021: member
    • Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
    • Is it no longer relevant? ➡️ Please close.
    • Did the author lose interest or time to work on this? ➡️ Please close it and mark it ‘Up for grabs’ with the label, so that it can be picked up in the future.
  23. MarcoFalke commented at 1:53 pm on December 15, 2021: member
    This had two acks before conflicting. @pstratem are you still working on this or should someone pick this up?
  24. DrahtBot commented at 1:07 pm on March 21, 2022: member
    • Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
    • Is it no longer relevant? ➡️ Please close.
    • Did the author lose interest or time to work on this? ➡️ Please close it and mark it ‘Up for grabs’ with the label, so that it can be picked up in the future.
  25. MarcoFalke added the label Up for grabs on Mar 21, 2022
  26. MarcoFalke closed this on Mar 21, 2022

  27. fanquake removed the label Up for grabs on Apr 12, 2022
  28. fanquake referenced this in commit 5abbc9afec on Jul 7, 2022
  29. sidhujag referenced this in commit a42480f022 on Jul 8, 2022
  30. DrahtBot locked this on Apr 12, 2023

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-30 15:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me