Merkleize the utxo set dump (was: gettxoutsetinfo serializedhash doesn’t contain the keys?) #7758

laanwj opened this issue on March 28, 2016
  1. laanwj commented at 8:48 pm on March 28, 2016: member

    I’ve stared at the code of https://github.com/bitcoin/bitcoin/blob/master/src/txdb.cpp#L114 for a while, and can’t seem to discover where the key (transaction id for the following outputs) is serialized to the HashWriter. I don’t think this is happening at all.

    As I see it (though I could be confused), this is a problem: it means different transactions in the same position with the same outputs will potentially result in the same hash.

    Another slight issue (but more of a naming/documentation one) is that nSerializedSize is confusing: it’s not the size of the data hashed to hashSerialized, but a differently computed quantity.
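
The collision concern raised in the comment above can be illustrated with a minimal sketch. It does not reproduce the actual serialization in `txdb.cpp`: the function `hash_utxo_set`, the flag `include_key`, and the byte layout of the entries are all hypothetical, chosen only to show what happens when the transaction id is left out of the hashed stream.

```python
import hashlib

def hash_utxo_set(entries, include_key):
    """Hash (txid, serialized_outputs) pairs in iteration order.

    Hypothetical stand-in for the stats hasher; NOT the real format.
    """
    h = hashlib.sha256()
    for txid, outputs in entries:
        if include_key:
            h.update(txid)      # commit to the key (transaction id)
        h.update(outputs)       # commit to the value (the outputs)
    return h.hexdigest()

# Two different transactions that happen to carry identical outputs.
outputs = b"\x01" + (50_000).to_bytes(8, "little")
set_a = [(bytes([0xAA]) * 32, outputs)]
set_b = [(bytes([0xBB]) * 32, outputs)]

same_without_key = hash_utxo_set(set_a, False) == hash_utxo_set(set_b, False)
same_with_key = hash_utxo_set(set_a, True) == hash_utxo_set(set_b, True)
print(same_without_key, same_with_key)  # True False
```

With the key omitted, the two distinct sets are indistinguishable by hash; once the key is written to the hasher they are not.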

  2. laanwj added the label RPC on Mar 28, 2016
  3. sipa commented at 8:55 pm on March 28, 2016: member

    It seems that the old commit of mine (3499ce1e1ad87a86598d00b7124072c91ddad833) broke this in the chainstate obfuscation PR.

    Shows how often people actually use that function…

  4. laanwj commented at 9:08 pm on March 28, 2016: member
    Heh, I only noticed because I actually started using the hashed data in #7759.
  5. paveljanik commented at 7:22 am on March 29, 2016: contributor
    Maybe we can remove it completely? Or every developer keeps the old tree with unobfuscated chain data because of already existing tooling to parse files? ;-)
  6. laanwj commented at 7:32 am on March 29, 2016: member
    You’re not serious on removing utxo set stats I hope? IMO, we need more stats to see what is going on, and how resources are being used, not less.
  7. paveljanik commented at 7:34 am on March 29, 2016: contributor
    I’d like to find out the reason why no one noticed ;-) I actually use it myself…
  8. laanwj commented at 7:41 am on March 29, 2016: member

    I do use the hash sometimes to compare multiple nodes for correctness, especially after experimenting. Though I must admit to running (almost) master everywhere, so I could easily miss this change.

    The RPC test isn’t helpful for detecting this either :dragon_face:

    assert_equal(len(res[u'hash_serialized']), 64)
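
To make the weakness of a length-only assertion concrete, here is a small sketch. The helpers `length_only_check`, `hash_entries`, and `value_check` are hypothetical, and the hashing is a stand-in rather than the real `gettxoutsetinfo` computation. The point is only that any 64-character string passes a length check, while a comparison against an independently computed value does not.

```python
import hashlib

def length_only_check(res):
    # Mirrors the shape of the existing assertion: only the length is tested.
    return len(res["hash_serialized"]) == 64

def hash_entries(entries):
    # Hypothetical reference hash over (txid, outputs) pairs; NOT the real format.
    h = hashlib.sha256()
    for txid, outputs in entries:
        h.update(txid)
        h.update(outputs)
    return h.hexdigest()

def value_check(res, expected_entries):
    # Compare against a value recomputed from a known set of entries.
    return res["hash_serialized"] == hash_entries(expected_entries)

entries = [(bytes([0x11]) * 32, b"\x01\x02\x03")]
bogus = {"hash_serialized": "00" * 32}              # wrong, but 64 chars long
good = {"hash_serialized": hash_entries(entries)}

print(length_only_check(bogus), value_check(bogus, entries), value_check(good, entries))
```

A test in this style would need a deterministic chain state to compute the expected value from, which is an assumption about the test setup rather than something stated in the thread.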

  9. laanwj commented at 9:05 am on March 29, 2016: member

    Discussed on IRC:

    <sipa> wumpus: if we need to fix the serialization for gettxoutsetinfo... maybe we can replace it with a merkleized version?
    <wumpus> sipa: you mean a 6-month masters project for someone? :) or isn't it that bad to do?
    <sipa> wumpus: i don't mean UTXO commitments
    <wumpus> ohh!
    <wumpus> sure, would be good to have a better format
    <wumpus> I'm everything but married to the current one, and apparently we already broke it once without anyone noticing, so (as long as we mention it in the release notes) I'm not against breaking it again
    <sipa> but we can iterate over the utxo entries in order like now, but use an incremental merkle tree hasher (similar to the algorithm used by ComputeMerkleRoot and friends now)
    <wumpus> cool
    <sipa> the overhead would be the same, and you could make it answer queries for specific entries... and it could later just be converted to a commitment structure
    <wumpus> yes, after the memory improvement for the merkle tree hasher it doesn't have to store all the data in memory
    

    I think this makes sense, as we’ll have to break the compatibility anyway, let’s use the opportunity to move to a better format.
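
The incremental Merkle hasher mentioned in the IRC discussion can be sketched roughly as follows. This is an illustration, not the Bitcoin Core implementation: the class `IncrementalMerkle`, the leaf encoding `dsha256(txid + outputs)`, and the all-zero root for an empty set are assumptions made for the example. It follows the duplicate-the-last-node rule for odd levels and keeps one pending hash per tree level, so memory grows with the logarithm of the number of entries rather than with the set size.

```python
import hashlib

def dsha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

class IncrementalMerkle:
    """Streaming Merkle root with one stored hash per level (sketch)."""

    def __init__(self):
        self.count = 0
        self.inner = {}  # level -> root of a completed left subtree

    def add(self, leaf: bytes):
        h = leaf
        self.count += 1
        level = 0
        # Each trailing zero bit of count means a left sibling is waiting.
        while not (self.count >> level) & 1:
            h = dsha256(self.inner[level] + h)
            level += 1
        self.inner[level] = h

    def root(self) -> bytes:
        if self.count == 0:
            return bytes(32)  # assumed convention for an empty set
        count, level = self.count, 0
        while not (count >> level) & 1:
            level += 1
        h = self.inner[level]
        while count != (1 << level):
            h = dsha256(h + h)        # odd level: pair the node with itself
            count += 1 << level
            level += 1
            while not (count >> level) & 1:
                h = dsha256(self.inner[level] + h)
                level += 1
        return h

def naive_merkle_root(leaves):
    """Level-by-level reference that holds every hash in memory."""
    if not leaves:
        return bytes(32)
    nodes = list(leaves)
    while len(nodes) > 1:
        if len(nodes) % 2:
            nodes.append(nodes[-1])
        nodes = [dsha256(nodes[i] + nodes[i + 1]) for i in range(0, len(nodes), 2)]
    return nodes[0]

# Hypothetical leaf encoding: commit to both the key (txid) and the outputs.
entries = [(bytes([i]) * 32, b"outputs-%d" % i) for i in range(5)]
leaves = [dsha256(txid + outputs) for txid, outputs in entries]

tree = IncrementalMerkle()
for leaf in leaves:
    tree.add(leaf)
print(tree.root() == naive_merkle_root(leaves))  # True
```

Because each leaf here commits to the key as well as the outputs, this layout would also address the original problem of the transaction id missing from the hash. Answering queries for specific entries would additionally require recording the sibling hashes along a leaf's path, which this sketch does not do.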

  10. laanwj renamed this:
    `gettxoutsetinfo` serializedhash doesn't contain the keys?
    Merkleize the utxo set dump (was: `gettxoutsetinfo` serializedhash doesn't contain the keys?)
    on Mar 29, 2016
  11. laanwj referenced this in commit 088c270c29 on Apr 9, 2016
  12. laanwj referenced this in commit 64a292435a on Apr 15, 2016
  13. laanwj referenced this in commit 76212bbc6a on Apr 15, 2016
  14. lateminer referenced this in commit 51a43ca5fe on Jan 5, 2018
  15. laanwj closed this on Nov 22, 2019

  16. DrahtBot locked this on Dec 16, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-01-22 06:12 UTC
