Hash Cache #8259

pull NicolasDorier wants to merge 2 commits into bitcoin:master from NicolasDorier:cachedhashes changing 9 files +207 −41
  1. NicolasDorier commented at 9:00 pm on June 24, 2016: contributor

    Some notes about the implementation:

    • It calculates the three midstate hashes as soon as a CheckSig segwit operation happens, whathever the SigHash,
    • It is possible that two different threads calculate the midstate hashes of a transaction twice,

    Befinits are:

    • hashcashes map access is limited so we don’t have too much lock contention
    • Fewer conditional branches in consensus code
    • Simple to review

    This commit is only for having a cache that is simple to review and understand. It is probably possible to fix the two first points above, but the code overhead is not worth it when our goal is only to fix the O(n²) issue.

    (rebased version of https://github.com/sipa/bitcoin/pull/70)

  2. Cache hashes c2ea4ddafc
  3. NicolasDorier force-pushed on Jun 24, 2016
  4. jl2012 commented at 8:01 pm on June 25, 2016: contributor
    This one is hopefully merged for versions with segwit defined on mainnet
  5. NicolasDorier renamed this:
    Cache hashes
    Hash Cache
    on Jun 25, 2016
  6. in src/script/sigcache.h: in c2ea4ddafc outdated
    35+        return true;
    36+    }
    37+    bool TrySet(uint256 txId, const CachedHashes& hashes)
    38+    {
    39+        LOCK(cs);
    40+        if(map.count(txId))
    


    dcousens commented at 3:33 am on June 27, 2016:

    You could avoid two look ups (IIRC) by comparing the size of the container before and after instead.

    Aka:

    0auto sizeBefore = map.size();
    1map.insert(txId, hashes);
    2return map.size() != sizeBefore;
    

    NicolasDorier commented at 10:31 am on June 27, 2016:
    Actually, I don’t think I really need TrySet to return a bool.
  7. in src/script/sigcache.h: in c2ea4ddafc outdated
    30+    {
    31+        LOCK(cs);
    32+        if(!map.count(txId))
    33+            return false;
    34+        *hashes = map[txId];
    35+        return true;
    


    dcousens commented at 3:33 am on June 27, 2016:

    Why not:

    0auto iter = map.find(txId);
    1if (iter == map.end()) return false;
    2*hashes = iter->second;
    3return true;
    

    edit: fixed early-exit


    NicolasDorier commented at 10:32 am on June 27, 2016:
    What do you mean by “fixed early-exit” ? Yes, I think your proposition is better as it makes only one lookup.

    dcousens commented at 4:44 am on June 28, 2016:

    What do you mean by “fixed early-exit” ?

    I accidentally inverted the early exit logic, I’ve since fixed it, but left the edit in case you read it in an email and followed up.

  8. dcousens commented at 3:35 am on June 27, 2016: contributor

    hashcashes

    hash caches?, had me confused for a second haha

    utACK c2ea4dd

  9. in src/script/interpreter.h: in c2ea4ddafc outdated
     97@@ -98,13 +98,30 @@ enum
     98 
     99 bool CheckSignatureEncoding(const std::vector<unsigned char> &vchSig, unsigned int flags, ScriptError* serror);
    100 
    101+class CachedHashes
    102+{
    103+public:
    104+    uint256 hashPrevouts,hashSequence,hashOutputs;
    


    dcousens commented at 3:37 am on June 27, 2016:
    trivial: maybe spacing between names?
  10. in src/main.h: in c2ea4ddafc outdated
    428@@ -426,6 +429,7 @@ class CScriptCheck
    429         std::swap(nFlags, check.nFlags);
    430         std::swap(cacheStore, check.cacheStore);
    431         std::swap(error, check.error);
    432+        std::swap(cachedHashesMap, check.cachedHashesMap);
    


    dcousens commented at 3:38 am on June 27, 2016:
    Would this need to enforce the lock on check?

    NicolasDorier commented at 2:17 pm on June 27, 2016:
    I don’t understand, you mean the internal lock of cachedHashesMap ? no, the lock is only meant to protect the internal map, which is not accessed during the swap.

    dcousens commented at 4:49 am on June 28, 2016:

    My logic was based on the presence of the lock implies multi-threaded access, and this function implies the entire memory is being swapped when swap() is called. To me, that would imply that it should be LOCK’d to avoid another thread reading it during this swap.

    Perhaps it isn’t needed for this case, but I just figured I’d ask.


    NicolasDorier commented at 8:23 am on June 28, 2016:
    Ah I see. Well, I don’t think we need it. The swap is always called before the closure is executed (which is only when we start having concurrent access)
  11. laanwj added the label Validation on Jun 27, 2016
  12. nits and micro optimisation for the cache hash map dc188d82da
  13. NicolasDorier commented at 9:25 am on June 28, 2016: contributor
    @dcousens addressed your nits in dc188d8.
  14. dcousens commented at 2:18 am on June 29, 2016: contributor
    utACK dc188d8
  15. laanwj added the label Needs backport on Jul 28, 2016
  16. in src/script/interpreter.cpp: in dc188d82da
    1109@@ -1110,35 +1110,46 @@ class CTransactionSignatureSerializer {
    1110 
    1111 } // anon namespace
    1112 
    1113-uint256 SignatureHash(const CScript& scriptCode, const CTransaction& txTo, unsigned int nIn, int nHashType, const CAmount& amount, SigVersion sigversion)
    1114+uint256 SignatureHash(const CScript& scriptCode, const CTransaction& txTo, unsigned int nIn, int nHashType, const CAmount& amount, SigVersion sigversion, CachedHashes* cache)
    


    jtimon commented at 8:31 pm on July 28, 2016:
    Perhaps here we can do the same trick we did with the script/sigcache to keep script/interpreter simpler and more reusable. ping @sipa

    jtimon commented at 8:34 pm on July 28, 2016:
    Or maybe move SignatureHash to BaseSignatureChecker. Just brainstorming.
  17. sipa commented at 10:41 pm on July 28, 2016: member

    Assume a transaction has many signatures. One is SIGHASH_SINGLE, all the others are SIGHASH_ALL | SIGHASH_ANYONECANPAY.

    The first computes and stores hashPrevouts. All the others will compute hashOutputs. However, after the first call, all TrySets will not do anything, as there is already a result in the cache, so it gets computed over and over again.

    I think CachedHashes needs a Merge method like this:

    0void Merge(const CachedHashes& hashes) {
    1    if (hashPrevouts.IsNull()) hashPrevouts = hashes.hashPrevout;
    2    if (hashSequence.IsNull()) hashSequence = hashes.hashSequence;
    3    if (hashOutputs.IsNull()) hashOutputs = hashes.hashOutputs;
    4}
    

    which can then be called from TrySet (instead of insert, use map[txid].Merge(hashes)).

  18. NicolasDorier commented at 11:08 pm on July 28, 2016: contributor

    so it gets computed over and over again

    That’s not true, I would be calculating the three hashes at the same time during the SIGHASH_SINGLE. Take a look at the SignatureHash method, I changed it to calculate the three hashes aggressively.

    I prefer not doing a merge. Without a merge my lock only have to protect the internal map as CachedHashes instances are read only. If we calculate the mid states lazily, I also need to be careful about locking at the CachedHashes instance level.

  19. sipa commented at 11:11 pm on July 28, 2016: member

    @NicolasDorier Oh, I see. I agree that the current code is fine in that case.

    I don’t understand the argument about the lock. The Merge function would also grab the lock, and be the only code that touches the map.

  20. NicolasDorier commented at 11:18 pm on July 28, 2016: contributor

    @sipa The Merge function as you did here can’t grab the lock, because the lock is at the cache map level, not at the CachedHashes instance level.

    But basically, if doing that way, I would need to grab the lock of the map around the Merge, as well as around any hash read of the HashedCaches instance. (so one can’t read and call merge on the HashedCaches at the same time) This would make lots of contention on a single lock. An alternative is to have one lock per HashedCaches… not sure if it is worth the complexity though, knowing that the only type of transaction which does not need the 3 midstate hashes are transaction which have no SIGHASH_ALL… this is very marginal case.

  21. NicolasDorier commented at 1:58 am on July 29, 2016: contributor
    Closing this one in favor of #8422 which calculate hashes lazily.
  22. NicolasDorier closed this on Jul 29, 2016

  23. MarcoFalke removed the label Needs backport on Jul 31, 2016
  24. DrahtBot locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-17 21:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me