Add utxo_to_csv.py tool #34324

sipa wants to merge 1 commit into bitcoin:master from sipa:202601_dumpcsv, changing 2 files (+380 −0)
  1. sipa commented at 9:37 pm on January 16, 2026: member

    This adds a new conversion tool, heavily based on utxo_to_sqlite.py, to produce a CSV file for the UTXO set.

    I created this because I wanted something that produced a greppable UTXO set, and am upstreaming it as it may benefit others.

    Unlike the sqlite version, it also dumps addresses/descriptors for the outputs.

  2. DrahtBot commented at 9:37 pm on January 16, 2026: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/34324.

    Reviews

    See the guideline for information on the review process. A summary of reviews will appear here.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #31560 (rpc: allow writing UTXO set to a named pipe, introduce dump_to_sqlite.sh script by theStack)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. in contrib/utxo-tools/utxo_to_csv.py:66 in 34b9bc172a
    61+
    62+def byte_to_base58(b, version):
    63+    "Compute the Base58Check encoding of an input byte array with given version."""
    64+    result = ''
    65+    b = bytes([version]) + b  # prepend version
    66+    b += hashlib.sha256(hashlib.sha256(b).digest()).digest() # append checksum
    



    sipa commented at 11:43 pm on January 16, 2026:
    Done!
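
    For reference, Base58Check appends only the first four bytes of the double-SHA256 digest as the checksum, whereas the hunk as quoted above appends the whole digest. The following is a minimal standalone sketch of the encoding, not the code from the PR; the name base58check_encode and its signature are chosen here for illustration.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check_encode(payload: bytes, version: int) -> str:
    """Base58Check-encode payload with a one-byte version prefix.

    Illustrative helper (hypothetical name), not the function from the PR.
    """
    data = bytes([version]) + payload
    # The checksum is the first 4 bytes of SHA256(SHA256(data)).
    data += hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]

    # Treat the bytes as one big-endian integer and convert it to base 58.
    value = int.from_bytes(data, "big")
    encoded = ""
    while value > 0:
        value, rem = divmod(value, 58)
        encoded = B58_ALPHABET[rem] + encoded

    # Every leading zero byte is written as a literal '1'.
    n_leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * n_leading_zeros + encoded


# An all-zero 20-byte hash with version 0 (mainnet P2PKH).
print(base58check_encode(bytes(20), 0))
```

    The leading-zero handling matters for mainnet P2PKH, where the version byte is 0 and every address therefore starts with '1'.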
  4. in contrib/utxo-tools/utxo_to_csv.py:204 in 34b9bc172a
    199+def scriptpubkey_to_descriptor(spk, network_string):
    200+    """Infer a descriptor for the specified scriptpubkey."""
    201+    if len(spk) == 25 and spk[0] == 0x76 and spk[1] == 0xa9 and spk[2] == 20 and spk[23] == 0x88 and spk[24] == 0xac:
    202+        return "addr(" + byte_to_base58(spk[3:23], P2PKH_VERSIONS[network_string]) + ")"
    203+    if len(spk) == 23 and spk[0] == 0xa9 and spk[1] == 20 and spk[22] == 0x87:
    204+        return "addr(" + byte_to_base58(spk[1:21], P2SH_VERSIONS[network_string]) + ")"
    


    l0rinc commented at 10:38 pm on January 16, 2026:

    based on https://github.com/bitcoin/bitcoin/blob/fa942332b40c97375af0722f32f7575bca3af819/src/script/solver.cpp#L149 we need to skip the second element as well:

            return "addr(" + byte_to_base58(spk[2:22], P2SH_VERSIONS[network_string]) + ")"
    

    sipa commented at 11:43 pm on January 16, 2026:
    Fixed, thanks.
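
    The off-by-one is easy to see from the P2SH layout: one opcode byte (OP_HASH160), one push-length byte (20), the 20-byte script hash, then OP_EQUAL. A small sketch, with constant and function names chosen here for illustration rather than taken from the PR:

```python
OP_HASH160 = 0xa9
OP_EQUAL = 0x87


def p2sh_script_hash(spk: bytes):
    """Return the 20-byte script hash if spk is a P2SH script, else None.

    Layout: OP_HASH160 <push 20> <20-byte hash> OP_EQUAL (23 bytes total).
    """
    if len(spk) == 23 and spk[0] == OP_HASH160 and spk[1] == 20 and spk[22] == OP_EQUAL:
        # Skip both the opcode (index 0) and the push length (index 1).
        return spk[2:22]
    return None


# Build a P2SH script around a recognizable dummy hash.
example_hash = bytes(range(20))
spk = bytes([OP_HASH160, 20]) + example_hash + bytes([OP_EQUAL])

print(p2sh_script_hash(spk) == example_hash)  # slice [2:22] recovers the hash
print(spk[1:21] == example_hash)              # slice [1:21] is shifted by one byte
```

    With spk[1:21] the result starts with the push-length byte and drops the last byte of the hash, so the encoded address would be wrong for every P2SH output.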
  5. in contrib/utxo-tools/utxo_to_csv.py:216 in 34b9bc172a outdated
    212+    if multi is not None:
    213+        keys, m = multi
    214+        return f"multi({m}," + ",".join(key.hex() for key in keys) + ")"
    215+    return "raw(" + spk.hex() + ")"
    216+
    217+def read_varint(f):
    


    l0rinc commented at 10:41 pm on January 16, 2026:
    I understand if we don’t want to dedup across test and helpers, but can we do that inside the same “UTXO” tools? #32116 (comment)

    sipa commented at 11:43 pm on January 16, 2026:
    I think if we want to deduplicate, it would be better to merge the two tools into one.

    l0rinc commented at 10:32 am on January 17, 2026:

    merge the two tools into one

    +1 for that

  6. DrahtBot added the label CI failed on Jan 16, 2026
  7. DrahtBot commented at 10:55 pm on January 16, 2026: contributor

    🚧 At least one of the CI tasks failed.
    Task lint: https://github.com/bitcoin/bitcoin/actions/runs/21081641168/job/60636466740
    LLM reason (✨ experimental): Lint failure: duplicate dictionary key “Testnet3” in contrib/utxo-tools/utxo_to_csv.py.

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  8. Add utxo_to_csv.py tool 21fdbe06af
  9. sipa force-pushed on Jan 16, 2026
  10. DrahtBot removed the label CI failed on Jan 17, 2026
  11. in contrib/utxo-tools/utxo_to_csv.py:143 in 21fdbe06af
    138+    ret = bech32_encode(encoding, hrp, [witver] + convertbits(witprog, frombits=8, tobits=5))
    139+    return ret
    140+
    141+# Address/descriptor encoding
    142+
    143+def decode_bare_multisig(spk):
    


    l0rinc commented at 12:08 pm on January 17, 2026:
    I ran the conversion locally. The multisig detection looks a bit loose, and I think some nonstandard scripts can get misclassified as multisig; see https://github.com/bitcoin/bitcoin/blob/5a0f49bd2661d82efe13740856764e4e17fc1d06/src/pubkey.h#L77-L79
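
    One way to tighten the check, assuming the linked pubkey.h lines are the rule that derives the expected key length from the header byte (0x02/0x03 for 33-byte compressed keys, 0x04/0x06/0x07 for 65-byte keys), is to require every pushed key to satisfy it. The helper names below are hypothetical and are not part of the PR:

```python
def pubkey_length_for_header(header: int) -> int:
    """Expected serialized public key length for a given first byte (0 if invalid)."""
    if header in (2, 3):
        return 33  # compressed
    if header in (4, 6, 7):
        return 65  # uncompressed or hybrid
    return 0


def is_plausible_pubkey(data: bytes) -> bool:
    """Size/header check only; does not verify that the point is on the curve."""
    return len(data) > 0 and pubkey_length_for_header(data[0]) == len(data)


print(is_plausible_pubkey(b"\x02" + bytes(32)))  # compressed header, 33 bytes
print(is_plausible_pubkey(b"\x05" + bytes(32)))  # unknown header byte
print(is_plausible_pubkey(b"\x04" + bytes(32)))  # uncompressed header, wrong length
```

    A bare-multisig decoder could then fall back to raw(...) whenever any pushed element fails is_plausible_pubkey, instead of reporting it as multi(...).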

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-01-27 06:13 UTC
