Return “all” from scantxoutset #14584

domob1812 opened this issue on October 26, 2018
  1. domob1812 commented at 9:30 am on October 26, 2018: contributor

    scantxoutset retrieves UTXOs matching certain criteria, based on a whitelist of desired “addresses” (in a wider sense). However, as far as I can see, there is no way to use this command to retrieve the full UTXO set via the JSON-RPC interface.

    I think that such functionality would be useful for certain (non-mainstream) applications, for instance block explorers, building rich lists and processing coin snapshots. I’ve seen multiple situations (and also done it myself) where applications like that either had to hack Bitcoin Core itself to add a custom-made RPC method for their task or read the full blockchain (rather than just the UTXO set) via RPC calls. The ability to process the full UTXO set with external tools hooking into the RPC interface would be useful.

    One big caveat with this is, of course, that the full UTXO set is very large and returning it from a single RPC call is probably not a good idea (if it is even possible). Thus we would need to make scantxoutset “step-able”. For instance, a new optional argument could specify the maximum number of results the caller wants to get. The RPC would then return those and leave the operation in a “paused” state. Follow-up RPC calls would be needed to either “abort” or “continue” the paused scan, with “continue” returning the next batch of results until the scan is done.
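To make the “step-able” idea concrete, here is a minimal sketch in Python. It is not Bitcoin Core code and not an existing RPC: the class `PausableScan`, its methods `start`, `cont` and `abort`, the `max_results` argument and the reply fields `unspents` and `done` are all hypothetical names chosen for illustration, and a plain Python list stands in for the UTXO set.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List, Optional


class PausableScan:
    """Toy model of a scan that hands out results in batches and pauses in between."""

    def __init__(self) -> None:
        # The iterator is the "paused" state kept between calls (hypothetical design).
        self._cursor: Optional[Iterator[Dict]] = None
        self._max_results = 0

    def start(self, utxos: Iterable[Dict], max_results: int) -> Dict:
        if max_results < 1:
            raise ValueError("max_results must be positive")
        if self._cursor is not None:
            raise RuntimeError("a scan is already in progress")
        self._cursor = iter(utxos)
        self._max_results = max_results
        return self.cont()

    def cont(self) -> Dict:
        if self._cursor is None:
            raise RuntimeError("no scan in progress")
        batch: List[Dict] = list(islice(self._cursor, self._max_results))
        # A short batch means the source is exhausted. If the total is an exact
        # multiple of max_results, the final call returns an empty batch instead.
        done = len(batch) < self._max_results
        if done:
            self._cursor = None
        return {"unspents": batch, "done": done}

    def abort(self) -> bool:
        # Returns True if there was a paused scan to discard.
        was_active = self._cursor is not None
        self._cursor = None
        return was_active


# Stand-in data; real UTXO entries would carry more fields.
fake_set = [{"txid": f"{i:064x}", "vout": 0, "amount": i} for i in range(5)]

scan = PausableScan()
reply = scan.start(fake_set, max_results=2)
batches = [reply["unspents"]]
while not reply["done"]:
    reply = scan.cont()
    batches.append(reply["unspents"])

print([len(b) for b in batches])
```

In this sketch the caller drives the loop and the server side only has to remember one cursor, which is why a single active scan at a time is assumed.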

    What do you think about such an extension? Would it be useful and fit into the general goal for scantxoutset (which seems to be wallet support at the moment)? Or is this already possible and I simply missed how to do it?

  2. fanquake added the label RPC/REST/ZMQ on Oct 26, 2018
  3. promag commented at 1:35 pm on October 26, 2018: member
    I think that pulling the entire set (even in chunks) periodically is bad design. Maybe it would be possible to have a “UTXO set log” to ease building the set externally.
  4. sipa commented at 2:34 pm on October 26, 2018: member
    I don’t think RPC is a very good fit for that purpose, given the large amount of overhead. @laanwj has worked on streaming the UTXO set over the REST interface before I think, but AFAIK had to wait for some libevent features to allow sending in multiple chunks.
  5. domob1812 commented at 2:49 pm on October 26, 2018: contributor
    @promag: The applications I have in mind are not based on continuously keeping an external version of the UTXO set up to date, but rather processing it infrequently. For instance, an external script that is run just once to produce a “rich list” of addresses. In that case, a “UTXO update event” won’t help (although it might be useful to have for other purposes).

    @sipa: Why would the overhead be larger than with the REST interface? I agree that it is a lot of data and thus at least splitting it into chunks is necessary (as per my suggestion). And of course, this is not something that most users will call - but it can be useful for some specific applications. But for someone who knows what they are doing, shouldn’t it be fine to fetch the UTXO set over RPC? That said, if it is possible to read it through REST (or that would be a better approach), then that would be equally useful for the purposes I have in mind.
  6. sipa commented at 3:01 pm on October 26, 2018: member

    @domob1812 By REST interface I mostly meant in binary form rather than the JSON encoding that RPC necessarily uses.

    But fair enough; no reason why it shouldn’t be possible - but the need for chunking makes it nontrivial to implement.

  7. laanwj commented at 4:11 pm on October 26, 2018: member

    +1 for using REST for this; it’s a stateless request-only interface, after all

    see #7759 for my PR, feel free to pick it up, it might be that the libevent issue is no longer an issue

    (quoting @sipa) “By REST interface I mostly meant in binary form rather than the JSON encoding that RPC necessarily uses.”

    Many of the REST calls have a format parameter, so you could make it stream in JSON format as well. The advantage of REST, using plain HTTP, is that you can just send a JSON record for each UTXO; it doesn’t have to be wrapped in a valid JSON-RPC envelope.
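As a rough illustration of why unwrapped per-UTXO records are convenient for a client, the following Python sketch reads one JSON object per line and processes each record as it arrives, without ever holding the whole set in memory. The line-delimited layout and the field names `txid`, `vout` and `amount` are assumptions made for this example, not the format of any existing Bitcoin Core endpoint; an in-memory `StringIO` stands in for the HTTP response body.

```python
import io
import json
from typing import Dict, Iterator, TextIO


def iter_utxo_records(stream: TextIO) -> Iterator[Dict]:
    """Yield one decoded record per non-empty line of the stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Stand-in for a streamed response body (hypothetical record layout).
body = io.StringIO(
    '{"txid": "aa", "vout": 0, "amount": 0.5}\n'
    '{"txid": "bb", "vout": 1, "amount": 1.25}\n'
)

# Records are consumed one at a time, so memory use stays constant.
total = sum(record["amount"] for record in iter_utxo_records(body))
print(total)
```

With a JSON-RPC envelope, by contrast, a typical client has to receive and parse the complete `result` value before it can look at the first entry.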

  8. domob1812 commented at 5:09 pm on October 26, 2018: contributor

    @sipa: I don’t think that a chunking implementation for scantxoutset would be too hard to do; keeping around the “currently active” scan between calls is not much different from the current state data that is already kept while a scan is in progress. (The data itself would be different, but not the general architecture.)

    @laanwj: Indeed, REST looks like a good interface for that - thanks for the pointer to your PR. This is not high priority for me (I mainly wanted to check this as a general idea with the community), but I may pick up your PR at some point in the future if I find time for it.

    However, by the same argument (“REST is a stateless request-only interface”), all of scantxoutset and many other RPC calls should be done through REST as well. RPC and REST are just two complementary interfaces that can both be convenient. (But I do agree that streaming the entire result through HTTP, if it can be made to work, is an elegant solution that avoids explicit chunking.)

  9. achow101 commented at 7:48 pm on October 26, 2022: member
    dumptxoutset was added in #16899
  10. achow101 closed this on Oct 26, 2022

  11. bitcoin locked this on Oct 26, 2023


This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-01-22 03:12 UTC
