Faster duplicate input check in CheckTransaction (alternative to #14387) #14397

pull sipa wants to merge 2 commits into bitcoin:master from sipa:201810_fastduplicate changing 3 files +112 −4
  1. sipa commented at 6:46 am on October 5, 2018: member

    This is a simple improvement to the performance of the duplicate input check in CheckTransaction.

    It includes the benchmark from #14387, for which it shows about a 2x speedup compared to master (while #14387 shows a 4.5x speedup, but with a lot more complexity).

  2. sipa force-pushed on Oct 5, 2018
  3. JeremyRubin commented at 6:53 am on October 5, 2018: contributor

    cr-ack 6d1779d – one of my earlier versions looked a lot like that.

    I’d also suggest another optimization to make the vector a prevector with say 20 outputs so we can avoid the allocation on most txns.

    May I suggest using adjacent_find with std::equal instead of your own loop? https://en.cppreference.com/w/cpp/algorithm/adjacent_find

  4. jnewbery commented at 7:04 am on October 5, 2018: member

    I prefer this to #14387, but I have the same question for @sipa as I asked Jeremy in that PR:

    This makes block propagation faster, but do we have an understanding of how much these milliseconds matter? Is there a way we can determine whether the increased complexity introduced is a reasonable price to pay for the performance win?

    Granted, the complexity added here is minimal compared to 14387, but I’d still like to understand the cost/benefit analysis.

  5. sipa commented at 7:06 am on October 5, 2018: member
    @jnewbery Yeah, that’s fair. I just wanted to show an alternative which is a smaller improvement but at much lower review complexity. I’m not convinced we need either approach.
  6. jamesob commented at 7:43 am on October 5, 2018: member
    For what it’s worth, bitcoinperf didn’t show any significant performance regression for IBD or reindex after the merge of #14247, so it’s not clear that either changeset is needed.
  7. MarcoFalke commented at 7:50 am on October 5, 2018: member
    @jamesob I think this is not so much about ibd but rather propagation
  8. JeremyRubin commented at 8:00 am on October 5, 2018: contributor

    @MarcoFalke correct.

    It’s something like 13ms slower.

  9. fanquake added the label Consensus on Oct 5, 2018
  10. DrahtBot commented at 10:42 am on October 5, 2018: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #14400 (Add Benchmark to test input de-duplication worst case by JeremyRubin)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  11. gmaxwell commented at 11:08 pm on October 5, 2018: contributor
    Even block propagation is no longer critical for this: When the prior optimization went in, we didn’t have relay-before-validate. I’m not opposed to this change– as it’s pretty straight forward to review– but since this particular validity criteria is pure (a function of the tx itself with no dependency on external state) effort might be better spent reorging things so that this test ends up covered by the wtxid validity caching which, although complementary, would likely give a bigger improvement.
  12. ryanofsky commented at 3:05 am on October 6, 2018: member
    Is there an explanation of what purpose the duplicate input check will serve if the connect block code is fixed to check for valid spends?
  13. gmaxwell commented at 4:43 pm on October 7, 2018: contributor
    @ryanofsky Making block connection alone check for duplicate inputs isn’t sufficient: The whole reason this code was initially added in PR #443 was to keep duplicate inputs out of the mempool– block connection did originally prevent the dupes. If both block-connection and mempool processing prevented duplicate inputs, then, indeed, there should be no reason for this code.
  14. jl2012 commented at 5:55 pm on October 10, 2018: contributor
    I think both set+insert (the existing one) and sort (this PR) are O(NlogN). So sort is faster as it is more optimized?
  15. sipa force-pushed on Oct 12, 2018
  16. sipa commented at 11:37 pm on October 12, 2018: member
    @jl2012 You get better memory locality and lower allocation overhead by representing things as a vector rather than a set. The set permits much faster updating though, but that’s not something we need here.
  17. Add Benchmark to test input de-duplication worst case 619a568e7c
  18. Faster duplicate checking in CheckTransaction 7a9ac18367
  19. sipa force-pushed on Oct 12, 2018
  20. in src/consensus/tx_verify.cpp:184 in 7a9ac18367
    181@@ -182,11 +182,14 @@ bool CheckTransaction(const CTransaction& tx, CValidationState &state, bool fChe
    182 
    183     // Check for duplicate inputs - note that this check is slow so we skip it in CheckBlock
    184     if (fCheckDuplicateInputs) {
    


    jl2012 commented at 7:55 am on October 13, 2018:
    Tx with 1 input is quite common. Would it be faster if we skip those?
  21. MarcoFalke referenced this in commit 327129f7a6 on Nov 25, 2018
  22. DrahtBot commented at 11:13 pm on November 25, 2018: member
  23. DrahtBot added the label Needs rebase on Nov 25, 2018
  24. MarcoFalke commented at 5:32 pm on April 19, 2019: member
    There hasn’t been much activity lately and the patch still needs rebase, so I am closing this for now. Please let me know when you want to continue working on this, so the pull request can be re-opened.
  25. MarcoFalke closed this on Apr 19, 2019

  26. laanwj removed the label Needs rebase on Oct 24, 2019
  27. Munkybooty referenced this in commit df8134b1a7 on Jul 30, 2021
  28. MarcoFalke locked this on Dec 16, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-19 06:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me