rpc: dumptxoutset as sqlite file #24628

issue prusnak openend this issue on March 21, 2022
  1. prusnak commented at 5:23 pm on March 21, 2022: contributor

    There has been at least 3 efforts to expand the dumptxoutset command to be able to export the data as CSV:

    JSON export also has been requested: #21670

    Lately @theStack suggested that we can leverage the SQLite format for the export.

    I think this is the best approach since it is trivial to export both CSV and JSON from SQLite + you can access the file programatically via virtually any programming language:

    • sqlite3 -header -csv foo.db "select * from foo" > foo.csv
    • sqlite3 -json foo.db "select * from foo" > foo.json

    … and we already have SQLite as an optional dependency. Such format is also future proof (we can add new table columns if needed).

  2. prusnak added the label Feature on Mar 21, 2022
  3. jamesob commented at 4:35 pm on March 22, 2022: member
    Concept ACK; happy to review any related PRs.
  4. dunxen commented at 9:37 pm on April 19, 2022: contributor
    Sounds good! I can take a look at doing this
  5. dunxen commented at 12:35 pm on April 21, 2022: contributor
    So I should have a PR up for this by the end of the week.
  6. dunxen referenced this in commit eaa4e63dac on Apr 23, 2022
  7. dunxen referenced this in commit 21ee90dda3 on Apr 24, 2022
  8. dunxen referenced this in commit 84a6e28a85 on Apr 24, 2022
  9. dunxen referenced this in commit fad59d2425 on Apr 24, 2022
  10. dunxen referenced this in commit 889f6c719d on Apr 24, 2022
  11. dunxen referenced this in commit 2d95dbb489 on Apr 25, 2022
  12. theStack commented at 11:48 am on April 25, 2022: member

    Some thoughts that I wanted to share for discussion: a pretty simple alternative to directly support dumping in SQLite format in bitcoind would be to provide a small external tool (e.g. in contrib, or even outside the repository) that simply takes the compact-serialized UTXO set format as it created by dumptxoutset now as input and create the SQLite database out of that. The drawback of that approach is of course that it’s a two-step process that hence also takes more time and space (the latter at least temporarily), but on the other hand the user could more easily adapt the details of the created database (which columns do we even want? for what columns do we want indices? what metadata should be included in the database? etc.) without the need to recompile bitcoind or wait for another release. There could be lots of potential PRs coming that will try to bikeshed or add extra data to the output, like address strings, timestamps of the UTXO creation etc… where in the end the use-case and trade-off between time/space and amount of needed information per UTXO is highly individual. For example, many users probably only want the outputs (scriptPubKey and nAmount) to see “who owns what” and don’t care about the outpoints (prevoutHash and prevoutIndex).

    Such a utxos_converter tool could be made in a generic way that is easily extendable in a way that pretty much any output format can be provided, by having a interface that takes an init and a write_utxo_entry function which can be specified by its own needs.

    Thoughts?

    // EDIT: Nevermind, I just found out that for decompressing a P2PK scriptPubKey one needs secp256k1 (uncompressed pubkeys are stored compressed in the compact-serialized variant), so having this dependency doesn’t make it a “simple tool” at all.

  13. dunxen referenced this in commit 3a761ef464 on Apr 25, 2022
  14. dunxen referenced this in commit 2d95f24c35 on Apr 25, 2022
  15. sipa commented at 4:12 pm on April 25, 2022: member

    Nevermind, I just found out that for decompressing a P2PK scriptPubKey one needs secp256k1

    Why would a dump tool need to do anything with public keys except pass them through?

  16. theStack commented at 4:39 pm on April 25, 2022: member

    Nevermind, I just found out that for decompressing a P2PK scriptPubKey one needs secp256k1

    Why would a dump tool need to do anything with public keys except pass them through?

    See the DecompressScript function for special types 4 and 5 (that is, P2PK with uncompressed pubkeys). The pubkey is stored as compressed in the compact-serialized UTXO set and then uncompressed via secp256k1:

    https://github.com/bitcoin/bitcoin/blob/0342ae1d395ca82614f6d3b8fabb6a44403baf2a/src/compressor.cpp#L123-L135

  17. sipa commented at 8:33 pm on April 25, 2022: member
    Oh, I didn’t realize the current dump format uses the UTXO database script compression. That seems undesirable in any case, TBH, independent from the question of sqlite or anything else.
  18. theStack commented at 4:29 pm on April 26, 2022: member

    Oh, I didn’t realize the current dump format uses the UTXO database script compression. That seems undesirable in any case, TBH, independent from the question of sqlite or anything else.

    Agreed, the current dump format isn’t very helpful (at least not from a user’s perspective).

    FWIW, I’ve written a tool outlined in #24628 (comment) that converts the compact-serialized format to a SQLite database, ~200 LOC in golang: https://github.com/theStack/utxo_to_sqlite Most of the code is decompression routines taken from Bitcoin Core (VARINTs, amounts and scriptPubKeys). It currently skips P2PK outputs with uncompressed pubkeys (for the reasons stated above) and hence can only provide an incomplete UTXO set, but seems to work fine otherwise. After #24952 is merged, there will be no need for this anymore, but maybe someone finds this useful until then or interesting to play around with (I saw it as a welcome opportunity to get a bit more familiar with golang).

  19. dunxen referenced this in commit e3265b12ff on Apr 26, 2022
  20. dunxen referenced this in commit 39c392af35 on Apr 26, 2022
  21. dunxen referenced this in commit b441376db0 on Apr 28, 2022
  22. dunxen referenced this in commit 5d8b0fb6a2 on Apr 28, 2022
  23. dunxen referenced this in commit fb9b13a126 on May 2, 2022
  24. dunxen referenced this in commit 850c57fb8f on May 11, 2022
  25. dunxen referenced this in commit 1db3a52c5b on May 17, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-01-21 21:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me