Add a “tx output spender” index #24539

pull sstone wants to merge 2 commits into bitcoin:master from sstone:add-txospender-index changing 13 files +496 −1
  1. sstone commented at 7:32 pm on March 11, 2022: contributor

    This PR adds a new “tx output spender” index, which allows users to query which tx spent a given outpoint with the gettxspendingprevout RPC call that was added by #24408.

    Such an index would be extremely useful for Lightning, and probably for most layer-2 protocols that rely on chains of unpublished transactions.

    UPDATE: this PR is ready for review and issues have been addressed:

    • using a watch-only wallet instead would not work if there is a significant number of outpoints to watch (see #24539 (comment))
    • this PR does not require -txindex anymore

    This index is implemented as a simple k/v database (like other indexes):

    • keys are siphash(spent outpoint) (64 bytes)
    • values are list of spending tx positions( in case there are collisions which should be exceptionally uncommon)., using the same encoding as txindex (variable size, max is 12 bytes).

    The spending tx can optionally be returned by gettxspendingprevout (even it -txindex is not set).

  2. sstone force-pushed on Mar 11, 2022
  3. ryanofsky commented at 9:24 pm on March 11, 2022: contributor
    Concept ACK. It would be good know more about specific use-cases for this, if you can provide some links or more description. But the implementation seems very simple, so I don’t think adding this would be a maintenance burden.
  4. DrahtBot added the label Build system on Mar 11, 2022
  5. DrahtBot added the label RPC/REST/ZMQ on Mar 11, 2022
  6. DrahtBot added the label UTXO Db and Indexes on Mar 11, 2022
  7. DrahtBot added the label Validation on Mar 11, 2022
  8. maflcko commented at 8:01 am on March 12, 2022: member

    @ ryan, I think the idea is that there is one coin that is owned by more than one entity. There may be several (conflicting) pre-signed txs spending it and as soon as it is spent, all entities with ownership in it want to check which pre-signed tx spent the coin. If they use Bitcoin Core to monitor the coin, they may use:

    • A watch-only wallet (which requires a file on disk and BDB/Sqlite)
    • Walk the raw chain and raw mempool themselves (According to #24408 (comment) this is doable for the chain, but may not be for the mempool)
    • Use an index. A blockfilterindex might speed up scanning the chain, whereas this index directly gives the answer.
  9. sstone commented at 11:46 am on March 12, 2022: contributor

    Lightning channels are basically trees of unpublished transactions:

    • a commit tx that spends a shared onchain “funding” UTXO
    • HTLC txs that spend the commit tx (there can be competing HTLC txs that spend the same commit tx output: one for successful payments, one for timed out payments)

    There is also a penalty scheme built into these txs: if you cheat, you need to wait before you can spend your outputs but your channel peer can spend them immediately.

    And LN is also transitioning to a model where these txs pay minimum fees and can be “bumped” with RBF/CPFP.

    => it’s very useful for LN nodes to detect when transactions have been spent, but also how they’ve been spent and react accordingly (by extracting info from the spending txs for example), and walk up chains of txs that spend each other.

    As described above, there are tools in bitcoin core that we could use and would help a bit, and Lightning nodes could also create their own indexes independently or use electrum servers, block explorers, but this would make running LN nodes connected to your own bitcoin node much easier.

  10. michaelfolkson commented at 3:13 pm on March 12, 2022: contributor

    Concept ACK

    I think the use case is clear and if others think the maintenance burden is reasonable there doesn’t seem to be an obvious downside.

    This PR can be viewed as an “extension” to #24408

    Seems like review should be focused on #24408 first but assuming that gets merged, Concept ACK for this.

  11. DrahtBot commented at 9:25 pm on March 12, 2022: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK ryanofsky, michaelfolkson, mzumsande, glozow, aureleoules, achow101, TheCharlatan, hodlinator
    Approach ACK w0xlt

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #31483 (kernel: Move kernel-related cache constants to kernel cache by TheCharlatan)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  12. in src/init.cpp:869 in aa9f96370e outdated
    865@@ -857,6 +866,8 @@ bool AppInitParameterInteraction(const ArgsManager& args)
    866     if (args.GetIntArg("-prune", 0)) {
    867         if (args.GetBoolArg("-txindex", DEFAULT_TXINDEX))
    868             return InitError(_("Prune mode is incompatible with -txindex."));
    869+        if (args.GetBoolArg("-txospenderindex", DEFAULT_TXINDEX))
    


    mzumsande commented at 6:58 pm on March 13, 2022:
    Should use DEFAULT_TXOSPENDERINDEX

  13. in src/init.cpp:1411 in aa9f96370e outdated
    1407@@ -1397,6 +1408,9 @@ bool AppInitMain(NodeContext& node, interfaces::BlockAndHeaderTipInfo* tip_info)
    1408     if (args.GetBoolArg("-txindex", DEFAULT_TXINDEX)) {
    1409         LogPrintf("* Using %.1f MiB for transaction index database\n", cache_sizes.tx_index * (1.0 / 1024 / 1024));
    1410     }
    1411+    if (args.GetBoolArg("-txospenderindex", DEFAULT_TXINDEX)) {
    


    mzumsande commented at 6:59 pm on March 13, 2022:
    DEFAULT_TXOSPENDERINDEX here as well

  14. in src/rpc/rawtransaction.cpp:281 in aa9f96370e outdated
    276+                {
    277+                    {"txid", RPCArg::Type::STR_HEX, RPCArg::Optional::NO, "The outpoint transaction id"},
    278+                    {"index", RPCArg::Type::NUM, RPCArg::Optional::NO, "The outpoint output index"},
    279+                },
    280+                RPCResult{
    281+                    RPCResult::Type::STR_HEX, "", "The id of the transaction that spend the provided outpoint"},
    


    mzumsande commented at 7:38 pm on March 13, 2022:
    spends

  15. in src/index/txospenderindex.cpp:12 in aa9f96370e outdated
     7+
     8+constexpr uint8_t DB_TXOSPENDERINDEX{'s'};
     9+
    10+std::unique_ptr<TxoSpenderIndex> g_txospenderindex;
    11+
    12+/** Access to the txindex database (indexes/txindex/) */
    


    mzumsande commented at 7:39 pm on March 13, 2022:
    update comment

  16. in src/index/txospenderindex.h:11 in aa9f96370e outdated
     6+#define BITCOIN_INDEX_TXOSPENDERINDEX_H
     7+
     8+#include <index/base.h>
     9+
    10+/**
    11+ * TxIndex is used to look up transactions included in the blockchain by hash.
    


    mzumsande commented at 7:40 pm on March 13, 2022:
    update comment

  17. mzumsande commented at 9:25 pm on March 13, 2022: contributor

    Concept ACK - I can see why this could be helpful for lightning, the code is very similar to the txindex code.

    Some general thoughts:

    • getmempooltxspendingprevout in #24408 takes a list of outpoints, gettxospender takes a single outpoint. Would it make sense to have the same format here (I’d guess a common use case would be to check both RPCs with the same outpoint(s)?)
    • Taking this one step further, getrawtransaction queries transaction data from the mempool or from blocks (if txindex is enabled) in one RPC. Would it make sense to have something similar for outpoints, instead of two separate RPCs?
    • I think that the indexed data will not be deleted in case of a reorg (but possibly overwritten if an outpoint would be spent in a different tx as a result of a reorg). That means that a result from gettxospender could be stale. Would it make sense to delete entries or should we at least mention this possibility in the rpc doc?
    • What’s the reason for the dependency on -txindex? As far as I can see it’s not technical, so would knowing the fact that an outpoint was used in a tx (even if we can’t lookup the details of that tx further) be sufficient for some use cases?
  18. sstone force-pushed on Mar 14, 2022
  19. sstone commented at 10:47 am on March 14, 2022: contributor
    * `getmempooltxspendingprevout` in [#24408](/bitcoin-bitcoin/24408/) takes a list of outpoints, `gettxospender` takes a single outpoint. Would it make sense to have the same format here (I'd guess a common use case would be to check both RPCs with the same outpoint(s)?)
    

    Bitcoin supports batched RPC request so from a functional p.o.v it’s equivalent to passing a list of outpoints but it’s a bit easier to reason about imho. I think that the use case for getmempooltxspendingprevout is a bit different: you want to bump fees for a set of txs that are related to the same channel and want to understand who is spending what. You may also have txs with multiple inputs, and would pass them all together to getmempooltxspendingprevout.

    For gettxospender the use case is trying to understand why channels have been closed and react if needed, if we could pass in multiple outpoints they would likely be unrelated.

    * Taking this one step further, `getrawtransaction` queries transaction data from the mempool or from blocks (if txindex is enabled) in one RPC. Would it make sense to have something similar for outpoints, instead of two separate RPCs?
    

    Maybe, let’s see where the discussion in #24408 but to me it feels cleaner to have separate calls.

    * I think that the indexed data will not be deleted in case of a reorg (but possibly overwritten if an outpoint would be spent in a different tx as a result of a reorg). That means that a result from `gettxospender` could be stale. Would it make sense to delete entries or should we at least mention this possibility in the rpc doc?
    
    * What's the reason for the dependency on -txindex? As far as I can see it's not technical, so would knowing the fact that an outpoint was used in a tx (even if we can't lookup the details of that tx further) be sufficient for some use cases?
    

    We would almost always want to get the actual tx that spends our outpoint. My reasoning was that relying on -txindex makes gettxospender robust and consistent in the case of reorgs and is also simpler and easier to understand that storing block positions instead of txids.

  20. DrahtBot added the label Needs rebase on Mar 23, 2022
  21. sstone force-pushed on Mar 24, 2022
  22. DrahtBot removed the label Needs rebase on Mar 24, 2022
  23. sstone force-pushed on Mar 24, 2022
  24. glozow commented at 6:41 pm on April 1, 2022: member
    Concept ACK, nice
  25. laanwj removed the label Build system on Apr 14, 2022
  26. laanwj removed the label RPC/REST/ZMQ on Apr 14, 2022
  27. laanwj removed the label Validation on Apr 14, 2022
  28. in test/functional/rpc_txospender.py:31 in e021b6bc77 outdated
    26+            args.append("-whitelist=noban@127.0.0.1")
    27+
    28+        self.supports_cli = False
    29+
    30+    def skip_test_if_missing_module(self):
    31+        self.skip_if_no_wallet()
    


    glozow commented at 0:40 am on April 23, 2022:
    RPC tests shouldn’t require wallet, please use MiniWallet

    maflcko commented at 1:34 pm on April 24, 2022:

    I’ve translated the test to MiniWallet. If you take it, make sure to rebase on current master first, as it is using new MiniWallet functions.

    Also, I am removing not needed stuff. Let me know if you have any questions. And no need to mention me in the commit as author.

     0diff --git a/test/functional/rpc_txospender.py b/test/functional/rpc_txospender.py
     1index 1d7157e9bf..c063b4e502 100755
     2--- a/test/functional/rpc_txospender.py
     3+++ b/test/functional/rpc_txospender.py
     4@@ -10,57 +10,36 @@ from test_framework.test_framework import BitcoinTestFramework
     5 from test_framework.util import (
     6     assert_equal,
     7     assert_raises_rpc_error,
     8-    find_vout_for_address,
     9 )
    10+from test_framework.wallet import MiniWallet
    11+
    12 
    13 class GetTxoSpenderTest(BitcoinTestFramework):
    14     def set_test_params(self):
    15-        self.setup_clean_chain = True
    16         self.num_nodes = 2
    17         self.extra_args = [
    18             ["-txindex", "-txospenderindex"],
    19             ["-txindex"],
    20         ]
    21-        # whitelist all peers to speed up tx relay / mempool sync
    22-        for args in self.extra_args:
    23-            args.append("-whitelist=noban@127.0.0.1")
    24-
    25         self.supports_cli = False
    26 
    27-    def skip_test_if_missing_module(self):
    28-        self.skip_if_no_wallet()
    29-
    30-    def setup_network(self):
    31-        super().setup_network()
    32-        self.connect_nodes(0, 1)
    33-
    34     def run_test(self):
    35-        self.log.info("Prepare some coins for multiple *rawtransaction commands")
    36-        self.generate(self.nodes[1], 1)
    37-        self.generate(self.nodes[0], COINBASE_MATURITY + 1)
    38-        self.sync_all()
    39-        self.generate(self.nodes[0], 5)
    40+        self.wallet = MiniWallet(self.nodes[1])
    41+        self.wallet.rescan_utxos()
    42 
    43         self.gettxospender_tests()
    44 
    45     def gettxospender_tests(self):
    46-        addr = self.nodes[1].getnewaddress()
    47-        txid = self.nodes[0].sendtoaddress(addr, 10)
    48-        self.generate(self.nodes[0], 1)
    49-        vout = find_vout_for_address(self.nodes[1], txid, addr)
    50-        rawTx = self.nodes[1].createrawtransaction([{'txid': txid, 'vout': vout}], {self.nodes[1].getnewaddress(): 9.999})
    51-        rawTxSigned = self.nodes[1].signrawtransactionwithwallet(rawTx)
    52-        txId = self.nodes[1].sendrawtransaction(rawTxSigned['hex'])
    53-        self.generateblock(self.nodes[0], output=self.nodes[0].getnewaddress(), transactions=[rawTxSigned['hex']])
    54+        partially_spent_tx = self.wallet.send_self_transfer_multi(from_node=self.nodes[1], num_outputs=2)
    55+        num_out = 1
    56+        spending_txid = self.wallet.send_self_transfer(from_node=self.nodes[1], utxo_to_spend=partially_spent_tx["new_utxos"][num_out])["txid"]
    57+        self.generate(self.wallet, 1)
    58 
    59-        for n in [0, 1]:
    60-            self.log.info(f"Test gettxospender {'with' if n == 0 else 'without'} -txospenderindex")
    61+        self.log.info(f"Test gettxospender with -txospenderindex")
    62+        assert_equal(self.nodes[0].gettxospender(partially_spent_tx["txid"], num_out), spending_txid)
    63+        self.log.info(f"Test gettxospender without -txospenderindex")
    64+        assert_raises_rpc_error(-8, "This RPC call requires -txospenderindex to be enabled.", self.nodes[1].gettxospender, partially_spent_tx["txid"], num_out)
    65 
    66-            if n == 0:
    67-                assert_equal(self.nodes[n].gettxospender(txid, vout), txId)
    68-            else:
    69-                # Without -txospenderindex, expect to raise.
    70-                assert_raises_rpc_error(-8, "This RPC call requires -txospenderindex to be enabled.", self.nodes[n].gettxospender, txid, vout)
    71 
    72-if __name__ == '__main__':
    73+if __name__ == "__main__":
    74     GetTxoSpenderTest().main()
    

    sstone commented at 5:16 pm on April 25, 2022:
    Thanks! I’ve rebased on #24408 and used your code to extend its RPC test
  29. sstone force-pushed on Apr 25, 2022
  30. sstone commented at 5:14 pm on April 25, 2022: contributor
  31. DrahtBot added the label Needs rebase on Apr 26, 2022
  32. maflcko commented at 12:37 pm on April 27, 2022: member
    Looks like the test somehow went away? If #https://github.com/bitcoin/bitcoin/pull/24408 is merged soon, you can fix everything on the next rebase.
  33. sstone force-pushed on Apr 27, 2022
  34. sstone force-pushed on Apr 27, 2022
  35. sstone commented at 3:23 pm on April 27, 2022: contributor

    Looks like the test somehow went away? If ##24408 is merged soon, you can fix everything on the next rebase.

    Yes sorry about that, I’ve brought it back. I will also add another commit to move gettxspendingprevout out of rpc/mempool.cpp.

  36. DrahtBot removed the label Needs rebase on Apr 27, 2022
  37. luke-jr commented at 8:52 pm on April 29, 2022: member
    As with #24408, unless there’s a need for looking up an unpredictable outpoint after its spend is mined (why?), I think a watch-only wallet is the correct approach for this use case, and expecting users to build infinitely-growing indexes isn’t a good solution.
  38. in src/index/txospenderindex.cpp:8 in 7805cfd193 outdated
    0@@ -0,0 +1,60 @@
    1+#include <index/txospenderindex.h>
    


    luke-jr commented at 8:54 pm on April 29, 2022:
    Missing copyright/license header

    sstone commented at 6:14 pm on May 3, 2022:
    Thanks! I’ve applied your patch and updated python tests
  39. in src/init.cpp:1584 in 7805cfd193 outdated
    1587@@ -1574,6 +1588,16 @@ bool AppInitMain(NodeContext& node, interfaces::BlockAndHeaderTipInfo* tip_info)
    1588         }
    1589     }
    1590 
    1591+    if (args.GetBoolArg("-txospenderindex", DEFAULT_TXOSPENDERINDEX)) {
    1592+        if (!args.GetBoolArg("-txindex", DEFAULT_TXINDEX)) {
    1593+            return InitError(_("-txospenderindex requires -txindex."));
    


    luke-jr commented at 8:56 pm on April 29, 2022:
    Maybe(?) should SoftSetArg in param interaction
  40. in src/rpc/mempool.cpp:651 in 7805cfd193 outdated
    646+                }
    647+
    648+                prevouts.emplace_back(txid, nOutput);
    649+            }
    650+
    651+            bool f_txospenderindex_ready = g_txospenderindex ? g_txospenderindex->BlockUntilSyncedToCurrentChain() : false;
    


    luke-jr commented at 9:01 pm on April 29, 2022:

    Suggest moving this below where it’s actually needed/used.

    Note this is non-trivial: we shouldn’t re-“block” for each outpoint, and lock order might matter.


    sstone commented at 6:32 pm on May 3, 2022:

    That what I tried at first but could not get it to work without triggering a deadlock:

    0POTENTIAL DEADLOCK DETECTED
    1Previous lock order was:
    2(2) 'cs_main' in node/chainstate.cpp:30 (in thread 'qt-init')
    3(1) 'cs' in txmempool.cpp:709 (in thread 'qt-init')
    4Current lock order is:
    5(1) 'mempool.cs' in rpc/mempool.cpp:676 (in thread 'qt-rpcconsole')
    6(2) 'cs_main' in index/base.cpp:331 (in thread 'qt-rpcconsole')
    
  41. in test/functional/rpc_mempool_info.py:23 in 7805cfd193 outdated
    18+         self.num_nodes = 3
    19+         self.extra_args = [
    20+             ["-txindex", "-txospenderindex"],
    21+             ["-txindex", "-txospenderindex"],
    22+             ["-txindex"],
    23+         ]
    


    luke-jr commented at 9:01 pm on April 29, 2022:
    nit: Block indented 1 space too many

  42. in src/index/txospenderindex.h:1 in 7805cfd193 outdated
    0@@ -0,0 +1,46 @@
    1+// Copyright (c) 2017-2021 The Bitcoin Core developers
    


    luke-jr commented at 9:05 pm on April 29, 2022:
    nit: Wrong year range

  43. in src/rpc/mempool.cpp:672 in 7805cfd193 outdated
    667+                    // no spending tx in mempool, query txospender index
    668+                    uint256 spendingtxid;
    669+                    if(g_txospenderindex->FindSpender(prevout, spendingtxid)) {
    670+                        o.pushKV("spendingtxid", spendingtxid.GetHex());
    671+                    } else if (!f_txospenderindex_ready) {
    672+                        throw JSONRPCError(RPC_INVALID_ADDRESS_OR_KEY, "No spending tx for the provided outpoint. Transactions spenders are still in the process of being indexed ");
    


    luke-jr commented at 9:13 pm on April 29, 2022:
    nit: Space at the end instead of punctuation

    luke-jr commented at 9:14 pm on April 29, 2022:

    Doesn’t say which output triggers it, and blocks partial results.

    Suggest something like a “warnings” key on the output result.


    luke-jr commented at 9:46 pm on April 29, 2022:
    RPC_INVALID_ADDRESS_OR_KEY seems like the wrong code to use



  44. luke-jr commented at 10:47 pm on April 29, 2022: member
  45. in src/rpc/mempool.cpp:695 in 7805cfd193 outdated
    662+
    663+                const CTransaction* spendingTx = mempool.GetConflictTx(prevout);
    664+                if (spendingTx != nullptr) {
    665+                    o.pushKV("spendingtxid", spendingTx->GetHash().ToString());
    666+                } else if (g_txospenderindex) {
    667+                    // no spending tx in mempool, query txospender index
    


    w0xlt commented at 3:15 pm on May 1, 2022:

    Perhaps a new field could be added to the response indicating whether the transaction is in the mempool or not? Something like:

    0o.pushKV("in_mempool", true); // or false
    

    sstone commented at 6:35 pm on May 3, 2022:
    I’m not sure it is needed: caller will receive a list of tx ids and will call getrawtransaction which will include number of confirmations etc….
  46. w0xlt commented at 3:16 pm on May 1, 2022: contributor
    Approach ACK
  47. in src/index/txospenderindex.cpp:39 in 7805cfd193 outdated
    30+
    31+bool TxoSpenderIndex::DB::WriteSpenderInfos(const std::vector<std::pair<COutPoint, uint256>>& items)
    32+{
    33+    CDBBatch batch(*this);
    34+    for (const auto& tuple : items) {
    35+        batch.Write(std::make_pair(DB_TXOSPENDERINDEX, tuple.first), tuple.second);
    


    w0xlt commented at 3:18 pm on May 2, 2022:

    Is there any reason to use std::make_pair(DB_TXOSPENDERINDEX, tuple.first) as the key instead of just tuple.first ?

    DB_TXOSPENDERINDEX seems redundant.


    sstone commented at 6:37 pm on May 3, 2022:
    This is how all other indexes work even when there is just one possible key. Maybe because it makes it easier to add new record type later with a new key prefix ?
  48. in src/rpc/mempool.cpp:615 in 7805cfd193 outdated
    609+            {
    610+                {RPCResult::Type::OBJ, "", "",
    611+                {
    612+                    {RPCResult::Type::STR_HEX, "txid", "the transaction id of the spent output"},
    613+                    {RPCResult::Type::NUM, "vout", "the vout value of the spent output"},
    614+                    {RPCResult::Type::STR_HEX, "spendingtxid", /*optional=*/true, "the transaction id of the mempool transaction spending this output (omitted if unspent)"},
    


    w0xlt commented at 6:34 pm on May 2, 2022:

    This message needs to be changed, as the transaction may not be in the mempool, but in the blocks. This field is no longer optional in this PR.

    0                    {RPCResult::Type::STR_HEX, "spendingtxid", "the id of the transaction that spends the output"},
    

  49. w0xlt commented at 6:42 pm on May 2, 2022: contributor

    @sstone I created a branch that modifies this PR to include the in_mempool and block_height of the transaction that spends the output. The result is like this:

    0[
    1  {
    2    "txid": "c8de5b22e7dac049cd6a2628580876d74c16a58a886be2b4b1c744901f8e953e",
    3    "vout": 0,
    4    "spendingtxid": "117dd0df58201d4ac6c49472dc59208487caa7d79079a4bb32d85c16f4174e72",
    5    "in_mempool": false,
    6    "block_height": 88551
    7  }
    8]
    

    If you agree with this approach, feel free to use the code of that branch. https://github.com/w0xlt/bitcoin/tree/add_txospender_index_2

  50. bitcoin deleted a comment on May 3, 2022
  51. w0xlt commented at 7:00 pm on May 3, 2022: contributor

    There is no need for a new commit to address review comments. The last commit can be squashed.

    https://github.com/bitcoin/bitcoin/blob/master/CONTRIBUTING.md#squashing-commits

  52. sstone force-pushed on May 3, 2022
  53. w0xlt commented at 9:18 pm on May 4, 2022: contributor

    The static RPCHelpMan getindexinfo() RPC should also be updated.

     0static RPCHelpMan getindexinfo()
     1{
     2    // ...
     3    if (g_txindex) {
     4        result.pushKVs(SummaryToJSON(g_txindex->GetSummary(), index_name));
     5    }
     6
     7    if (g_coin_stats_index) {
     8        result.pushKVs(SummaryToJSON(g_coin_stats_index->GetSummary(), index_name));
     9    }
    10
    11+    if (g_txospenderindex) {
    12+        result.pushKVs(SummaryToJSON(g_txospenderindex->GetSummary(), index_name));
    13+    }
    14
    15    ForEachBlockFilterIndex([&result, &index_name](const BlockFilterIndex& index) {
    16        result.pushKVs(SummaryToJSON(index.GetSummary(), index_name));
    17    });
    18    // ...
    19}
    
  54. sstone force-pushed on May 5, 2022
  55. sstone commented at 8:21 pm on May 5, 2022: contributor

    The static RPCHelpMan getindexinfo() RPC should also be updated.

     0static RPCHelpMan getindexinfo()
     1{
     2    // ...
     3    if (g_txindex) {
     4        result.pushKVs(SummaryToJSON(g_txindex->GetSummary(), index_name));
     5    }
     6
     7    if (g_coin_stats_index) {
     8        result.pushKVs(SummaryToJSON(g_coin_stats_index->GetSummary(), index_name));
     9    }
    10
    11+    if (g_txospenderindex) {
    12+        result.pushKVs(SummaryToJSON(g_txospenderindex->GetSummary(), index_name));
    13+    }
    14
    15    ForEachBlockFilterIndex([&result, &index_name](const BlockFilterIndex& index) {
    16        result.pushKVs(SummaryToJSON(index.GetSummary(), index_name));
    17    });
    18    // ...
    19}
    

    Thanks! done in https://github.com/bitcoin/bitcoin/pull/24539/commits/d9d96f8b5dab7a617efb95c7a4c53114a7815abe

  56. sstone force-pushed on May 29, 2022
  57. DrahtBot added the label Needs rebase on Jul 18, 2022
  58. sstone force-pushed on Jul 20, 2022
  59. DrahtBot removed the label Needs rebase on Jul 20, 2022
  60. sstone force-pushed on Sep 23, 2022
  61. aureleoules commented at 1:36 pm on September 28, 2022: contributor

    As described above, there are tools in bitcoin core that we could use and would help a bit

    MarcoFalke and luke-jr suggested using a watch-only wallet for this use case. I don’t fully understand how Lightning works in depth/implementation. Do you know @sstone what would be the pros and cons of using a watch-only wallet for this?

    I haven’t fully synced the txospenderindex (height 568907) and I know the index is optional but it’s already ~60GB, so is it worth it for layer 2 node operators?

  62. sstone commented at 4:41 pm on October 3, 2022: contributor

    For most LN nodes watch-only wallets are probably enough: you would import all your channel outpoints, and check your wallet transactions for spending transactions (bitcoin core should include them). AFAIK you would do this by importing each channel outpoint in a single “address” descriptor (though it seems that there is currently no way to remove a descriptor ?).

    But I’m not sure yet how practical and efficient this would be for large nodes with 10s of 1000s of transactions (I would like to have bitcoin core do some kind of filtering before it returns wallet transactions).

  63. DrahtBot added the label Needs rebase on Oct 4, 2022
  64. sstone force-pushed on Oct 5, 2022
  65. DrahtBot removed the label Needs rebase on Oct 5, 2022
  66. sstone commented at 6:49 pm on October 12, 2022: contributor

    update: unless I’ve missed something, unless you have just a few transactions to monitor, using watch-only wallet would not work as is and even if it was modified it would still create serious operational issues:

    If we were to use a single watch-only wallet and import all the outpoints that we need to watch, with one descriptor per outpoint , we would still to to fetch all wallet transactions to get spending transactions even if we used labels when importing our descriptors because label filtering only works for incoming transactions, so this does not scale (and again there does not seem to be a way to remove descriptors once they’ve been imported).

    Since with LN we currently know which transactions would spend the outpoints that we are watching, we could also import them. It would solves our filtering problem but we would end up with a huge number of transaction in our wallet (millions of them).

    Another option would be using one watch-only wallet per channel. This would be much better from an indexing point of view but would scale even less because there would be one wallet database per channel.

    And even if we modified how watch-only wallets work to add an optional “spent-by” field to listtransactions (this would be fairly easy), using them would be a problem from an operational point of view: rebuilding a node can be done very quickly, as we don’t even need to restore the bitcoin wallet that we used before (we just need to have a backup): we can just point our system to a new bitcoin node with a fresh wallet and we’re good to go. With watch-only wallets we would first need to rebuild this wallet, which could take hours, before we’re operational again.

    => I still believe that indexing “spending” transactions is a better approach than watch-only wallets.

  67. aureleoules commented at 6:58 pm on October 12, 2022: contributor

    Thanks for the clarification @sstone.

    I still believe that indexing “spending” transactions is a better approach than watch-only wallets.

    Agreed, this is Concept ACK for me then.

    rebuilding a node can be done very quickly, as we don’t even need to restore the bitcoin wallet that we used before (we just need to have a backup): we can just point our system to a new bitcoin node with a fresh wallet and we’re good to go. With watch-only wallets we would first need to rebuild this wallet, which could take hours, before we’re operational again.

    The “new bitcoin node” would still need to have the txospenderindex constructed though right? Which also takes some time no?

  68. sstone commented at 8:05 am on October 13, 2022: contributor

    The “new bitcoin node” would still need to have the txospenderindex constructed though right? Which also takes some time no?

    Yes, you would need to run your bitcoin nodes with -txospenderindex and -txindex (but theses indexes are “generic” and completely independent from your application).

  69. in src/index/txospenderindex.cpp:39 in 16b5cb0b9d outdated
    34+
    35+bool TxoSpenderIndex::DB::WriteSpenderInfos(const std::vector<std::pair<COutPoint, uint256>>& items)
    36+{
    37+    CDBBatch batch(*this);
    38+    for (const auto& tuple : items) {
    39+        batch.Write(std::make_pair(DB_TXOSPENDERINDEX, tuple.first), tuple.second);
    


    aureleoules commented at 10:51 am on October 13, 2022:

    nit

    0    for (const auto& [outPoint, hash] : items) {
    1        batch.Write(std::make_pair(DB_TXOSPENDERINDEX, outPoint), hash);
    

    maflcko commented at 12:16 pm on October 13, 2022:
    we use snake_case, so if this suggestion is taken, it should be outpoint or out_point

    sstone commented at 8:48 am on October 20, 2022:
    Thanks! I’ll use your suggestions next time I rebase

  70. aureleoules approved
  71. aureleoules commented at 11:44 am on October 13, 2022: contributor

    LGTM. I tried this:

    0./src/bitcoin-cli -testnet gettxspendingprevout '[{"txid": "f52c6102a5e9b9d11ff5edf0d888009cdcad85b44a7e83f057cf934745d9dce2", "vout": 1}]'
    1[
    2  {
    3    "txid": "f52c6102a5e9b9d11ff5edf0d888009cdcad85b44a7e83f057cf934745d9dce2",
    4    "vout": 1,
    5    "spendingtxid": "cb891f936bb45b347495542ebc7f413db612424c5516e14faef0239fbedecee1"
    6  }
    7]
    

    The information returned is correct (https://blockstream.info/testnet/tx/f52c6102a5e9b9d11ff5edf0d888009cdcad85b44a7e83f057cf934745d9dce2?output:1).

    Here’s the result on master:

    0./src/bitcoin-cli -testnet gettxspendingprevout '[{"txid": "f52c6102a5e9b9d11ff5edf0d888009cdcad85b44a7e83f057cf934745d9dce2", "vout": 1}]'
    1[
    2  {
    3    "txid": "f52c6102a5e9b9d11ff5edf0d888009cdcad85b44a7e83f057cf934745d9dce2",
    4    "vout": 1
    5  }
    6]
    

    I verified that if the spending transaction is already in the mempool, gettxspendingprevout behaves the same on master and this PR.

    For info, the index is 6.7GB on testnet, and 110GB on mainnet.

    I would also suggest splitting the PR in 3 commits (index, rpc and test) for easier review.

  72. in src/init.cpp:893 in 16b5cb0b9d outdated
    889@@ -881,6 +890,8 @@ bool AppInitParameterInteraction(const ArgsManager& args, bool use_syscall_sandb
    890     if (args.GetIntArg("-prune", 0)) {
    891         if (args.GetBoolArg("-txindex", DEFAULT_TXINDEX))
    892             return InitError(_("Prune mode is incompatible with -txindex."));
    893+        if (args.GetBoolArg("-txospenderindex", DEFAULT_TXOSPENDERINDEX))
    


    maflcko commented at 12:18 pm on October 13, 2022:

    nit for new code: (only if you retouch)

    0        if (args.GetBoolArg("-txospenderindex", DEFAULT_TXOSPENDERINDEX)) {
    

  73. in src/rpc/mempool.cpp:680 in 16b5cb0b9d outdated
    671@@ -644,6 +672,30 @@ static RPCHelpMan gettxspendingprevout()
    672                 const CTransaction* spendingTx = mempool.GetConflictTx(prevout);
    673                 if (spendingTx != nullptr) {
    674                     o.pushKV("spendingtxid", spendingTx->GetHash().ToString());
    675+                } else if (mempool_only.value_or(false)) {
    676+                    // do nothing
    677+                } else if (g_txospenderindex) {
    678+                    // no spending tx in mempool, query txospender index
    679+                    uint256 spendingtxid;
    680+                    if(g_txospenderindex->FindSpender(prevout, spendingtxid)) {
    


    maflcko commented at 12:19 pm on October 13, 2022:

    nit (only if you retouch):

    0                    if (g_txospenderindex->FindSpender(prevout, spendingtxid)) {
    

  74. maflcko commented at 12:20 pm on October 13, 2022: member
    didn’t review, but left some nits (no need to address them, unless you push for other reasons)
  75. Pantamis commented at 4:50 pm on November 18, 2022: none

    I did not review the code but tested the PR at commit 972728ab72b63ad074ffc1f2d8275d0c05672fa8 and it works.

    It would be great to reduce the size of the index because it is well larger than txindex:

    0bitcoin/indexes $ du -h --max-depth=1
    1119G	./txospenderindex
    238G	./txindex
    3157G	.
    

    Instead of storing outpoint (32 + 4 bytes) -> txid (32 bytes) it could be outpointhash (32 bytes) -> txid (32 bytes) to gain 4 bytes per txos (or even using RIPEMD160 to gain 4 + 12 bytes per txo).

    The txospenderindex stores redondantly the funding_txid as key for each transaction’s outputs. It could be made almost half its current size if it was possible to store with the following schema:

    funding_txid (32 bytes) -> concat_txids (32 * number_of_outpoints_in_funding_txid bytes)

    where concat_txid would be all the spending txid concatenated in the order of the vout of the funding transaction or 32 bytes at 0 if the outpoint is unspent. To get the spending transaction of an outpoint txid:vout, you would just have to read the bytes 32 * vout to 32 * (vout+1) of the concat_txids given by the index. However this requires modifying the value at a given key during the sync of the index, I don’t know if it is an issue.

    It could be nice to provide more informations on the spending transaction too, like blockheight of spending transaction and the index of the input spending the outpoint in the spending transaction althought not necessary since these information can be found by using txindex.

  76. sstone commented at 2:16 pm on November 21, 2022: contributor

    It would be great to reduce the size of the index because it is well larger than txindex: … Instead of storing outpoint (32 + 4 bytes) -> txid (32 bytes) it could be outpointhash (32 bytes) -> txid (32 bytes) to gain 4 bytes per txos (or even using RIPEMD160 to gain 4 + 12 bytes per txo).

    I went for a simple scheme that is easy to use and did not try to optimize for space as it would still be O(n) so there’s not much to be gained unless I’m missing something. It could be made smaller by using blockheight | txindex |outputindex (8 bytes) instead of txids ((this is how we define “short channel ids” in LN), but it would be harder to use.

    The txospenderindex stores redondantly the funding_txid as key for each transaction’s outputs. It could be made almost half its current size if it was possible to store with the following schema:

    funding_txid (32 bytes) -> concat_txids (32 * number_of_outpoints_in_funding_txid bytes)

    where concat_txid would be all the spending txid concatenated in the order of the vout of the funding transaction or 32 bytes at 0 if the outpoint is unspent. To get the spending transaction of an outpoint txid:vout, you would just have to read the bytes 32 * vout to 32 * (vout+1) of the concat_txids given by the index. However this requires modifying the value at a given key during the sync of the index, I don’t know if it is an issue.

    Same reason as above, I’ll have a look but I’m not sure the gain is worth the added complexity.

    It could be nice to provide more informations on the spending transaction too, like blockheight of spending transaction and the index of the input spending the outpoint in the spending transaction althought not necessary since these information can be found by using txindex.

    Yes, I supposed that users would call gettransaction to get additional info on the spending tx

  77. in src/rpc/mempool.cpp:715 in 16b5cb0b9d outdated
    685+                        } else {
    686+                            UniValue warnings(UniValue::VARR);
    687+                            warnings.push_back("txospenderindex is still being synced.");
    688+                            o.pushKV("warnings", warnings);
    689+                        }
    690+                    }
    


    achow101 commented at 9:38 pm on November 21, 2022:

    In 16b5cb0b9d343de1080ec4761bd8f20d668059ee “Add a “tx output spender” index”

    I think we should be warning if the spending txid was found, but the index was not yet synced? It seems like it would be possible for a spend to have been recorded in the index, but was removed/changed by a reorg that has not been synced by the index yet.


  78. in src/index/txospenderindex.cpp:103 in 16b5cb0b9d outdated
    44+bool TxoSpenderIndex::CustomAppend(const interfaces::BlockInfo& block)
    45+{
    46+    std::vector<std::pair<COutPoint, uint256>> items;
    47+    items.reserve(block.data->vtx.size());
    48+
    49+    for (const auto& tx : block.data->vtx) {
    


    achow101 commented at 9:40 pm on November 21, 2022:

    In 16b5cb0b9d343de1080ec4761bd8f20d668059ee “Add a “tx output spender” index”

    This should skip coinbase transactions.


  79. in src/init.cpp:1584 in 16b5cb0b9d outdated
    1533@@ -1520,6 +1534,16 @@ bool AppInitMain(NodeContext& node, interfaces::BlockAndHeaderTipInfo* tip_info)
    1534         }
    1535     }
    1536 
    1537+    if (args.GetBoolArg("-txospenderindex", DEFAULT_TXOSPENDERINDEX)) {
    1538+        if (!args.GetBoolArg("-txindex", DEFAULT_TXINDEX)) {
    1539+            return InitError(_("-txospenderindex requires -txindex."));
    


    achow101 commented at 9:43 pm on November 21, 2022:

    In 16b5cb0b9d343de1080ec4761bd8f20d668059ee “Add a “tx output spender” index”

    I don’t think it’s necessary to require that txindex when enabling txospenderindex. There isn’t anything in the txospenderindex that inherently requires the txindex (i.e. it isn’t storing information specific to the txindex). I get that it isn’t as useful without the txindex enabled, but that isn’t really a reason to restrict it.

    Removing this restriction would also allow txospenderindex to be enabled with pruning.


    sstone commented at 4:17 pm on November 22, 2022:
    I did not see a use case for txospenderindex without txindex as you would always want you fetch the spending tx but I guess people could want to know that a tx was confirmed and spent and nothing more.

    sstone commented at 2:29 pm on December 2, 2022:
  80. in src/index/txospenderindex.h:25 in 16b5cb0b9d outdated
    20+    class DB;
    21+
    22+private:
    23+    const std::unique_ptr<DB> m_db;
    24+
    25+    bool AllowPrune() const override { return false; }
    


    achow101 commented at 9:44 pm on November 21, 2022:

    In 16b5cb0b9d343de1080ec4761bd8f20d668059ee “Add a “tx output spender” index”

    I don’t see a technical reason for why the txospenderindex should not work with pruning (other than the current requirement of -txindex). It doesn’t store any information related to the block files.


    sstone commented at 4:21 pm on November 22, 2022:
    I will experiment with not requiring txindex and pruning support and add a separate commit for that.

    sstone commented at 2:30 pm on December 2, 2022:
  81. in src/index/txospenderindex.h:30 in 16b5cb0b9d outdated
    23+    const std::unique_ptr<DB> m_db;
    24+
    25+    bool AllowPrune() const override { return false; }
    26+
    27+protected:
    28+    bool CustomAppend(const interfaces::BlockInfo& block) override;
    


    achow101 commented at 9:48 pm on November 21, 2022:

    In 16b5cb0b9d343de1080ec4761bd8f20d668059ee “Add a “tx output spender” index”

    I think this will need to implement CustomRewind as well so that reorgs can be handled. Otherwise, as it is now, a reorg that removes a spend and does not replace it with a conflicting one will leave the original spend in the index. However it would be incorrect to say that that txo has been spent as in the main chain following the reorg, the txo was not spent.

    So CustomRewind is needed in order to remove those records.

    A test for this condition would also be nice.


    sstone commented at 6:09 pm on November 23, 2022:

    done in https://github.com/bitcoin/bitcoin/pull/24539/commits/5957f992b7afe20dca250c398bb28ad0ae52f868

    I may have missed something, but it seems that if the chain tip is reorged out, CustomRewind won’t be called until a new block is added to the chain. Until then, our txospenderindex (and I guess this might be true for other indexes) will return stale results. In our case, you’ll be able to see that the spending tx has 0 confirmations and could be replaced, but it makes me wonder how usable this index is if you cannot tell if spending txs are confirmed or not. So yes technically txospenderindex does not require txindex but could it really be usable in practice without it ?

  82. in src/test/txospenderindex_tests.cpp:1 in 16b5cb0b9d outdated
    0@@ -0,0 +1,77 @@
    1+// Copyright (c) 2017-2021 The Bitcoin Core developers
    


    achow101 commented at 9:52 pm on November 21, 2022:

    In 16b5cb0b9d343de1080ec4761bd8f20d668059ee “Add a “tx output spender” index”

    Wrong date range here.


  83. in src/test/txospenderindex_tests.cpp:53 in 16b5cb0b9d outdated
    48+    // Transaction should not be found in the index before it is started.
    49+    for (const auto& outpoint : spent) {
    50+        BOOST_CHECK(!txospenderindex.FindSpender(outpoint, txid));
    51+    }
    52+
    53+    // BlockUntilSyncedToCurrentChain should return false before txindex is started.
    


    achow101 commented at 9:53 pm on November 21, 2022:

    In 16b5cb0b9d343de1080ec4761bd8f20d668059ee “Add a “tx output spender” index”

    s/txindex/txospenderindex


  84. in test/functional/rpc_mempool_info.py:74 in 16b5cb0b9d outdated
    69         result = self.nodes[0].gettxspendingprevout([ {'txid' : confirmed_utxo['txid'], 'vout' : 0}, {'txid' : txidA, 'vout' : 1} ])
    70         assert_equal(result, [ {'txid' : confirmed_utxo['txid'], 'vout' : 0, 'spendingtxid' : txidA}, {'txid' : txidA, 'vout' : 1, 'spendingtxid' : txidC} ])
    71+        result = self.nodes[1].gettxspendingprevout([ {'txid' : confirmed_utxo['txid'], 'vout' : 0}, {'txid' : txidA, 'vout' : 1} ])
    72+        assert_equal(result, [ {'txid' : confirmed_utxo['txid'], 'vout' : 0}, {'txid' : txidA, 'vout' : 1} ])
    73+        result = self.nodes[2].gettxspendingprevout([ {'txid' : confirmed_utxo['txid'], 'vout' : 0}, {'txid' : txidA, 'vout' : 1} ])
    74+        assert_equal(result, [ {'txid' : confirmed_utxo['txid'], 'vout' : 0, 'warnings': ['txospenderindex is unavailable.']}, {'txid' : txidA, 'vout' : 1, 'warnings': ['txospenderindex is unavailable.']} ])
    


    achow101 commented at 9:59 pm on November 21, 2022:

    In 16b5cb0b9d343de1080ec4761bd8f20d668059ee “Add a “tx output spender” index”

    I don’t see why the spends wouldn’t be in nodes 1 and 2’s mempools, other than that this is a race.


    sstone commented at 4:11 pm on November 22, 2022:
  85. achow101 commented at 10:01 pm on November 21, 2022: member
    Concept ACK
  86. sstone force-pushed on Nov 22, 2022
  87. sstone force-pushed on Nov 23, 2022
  88. sstone force-pushed on Dec 2, 2022
  89. sstone force-pushed on Dec 2, 2022
  90. in src/test/txospenderindex_tests.cpp:74 in 47e60750e7 outdated
    69+
    70+    // shutdown sequence (c.f. Shutdown() in init.cpp)
    71+    txospenderindex.Stop();
    72+
    73+    // Let scheduler events finish running to avoid accessing any memory related to txindex after it is destructed
    74+    SyncWithValidationInterfaceQueue();
    


    maflcko commented at 2:27 pm on December 5, 2022:

    nit if you retouch: Looks like the order was changed in the other tests.

    0    // It is not safe to stop and destroy the index until it finishes handling
    1    // the last BlockConnected notification. The BlockUntilSyncedToCurrentChain()
    2    // call above is sufficient to ensure this, but the
    3    // SyncWithValidationInterfaceQueue() call below is also needed to ensure
    4    // TSAN always sees the test thread waiting for the notification thread, and
    5    // avoid potential false positive reports.
    6    SyncWithValidationInterfaceQueue();
    7
    8    // Shutdown sequence (c.f. Shutdown() in init.cpp)
    9    coin_stats_index.Stop();
    

    See 861cb3fadce88cfaee27088185a48f03fb9dafe7


  91. DrahtBot added the label Needs rebase on Dec 6, 2022
  92. sstone force-pushed on Dec 7, 2022
  93. DrahtBot removed the label Needs rebase on Dec 7, 2022
  94. sstone force-pushed on Jan 27, 2023
  95. DrahtBot added the label Needs rebase on Apr 3, 2023
  96. sstone force-pushed on Apr 3, 2023
  97. DrahtBot removed the label Needs rebase on Apr 3, 2023
  98. sstone force-pushed on Apr 24, 2023
  99. DrahtBot added the label CI failed on May 13, 2023
  100. sstone force-pushed on May 19, 2023
  101. sstone force-pushed on May 19, 2023
  102. DrahtBot removed the label CI failed on May 21, 2023
  103. DrahtBot added the label CI failed on May 31, 2023
  104. sstone force-pushed on Jun 5, 2023
  105. DrahtBot removed the label CI failed on Jun 5, 2023
  106. DrahtBot added the label CI failed on Jul 15, 2023
  107. sstone force-pushed on Aug 7, 2023
  108. in src/index/txospenderindex.cpp:81 in 8368a74f53 outdated
    76+
    77+    do {
    78+        CBlock block;
    79+        if (!m_chainstate->m_blockman.ReadBlockFromDisk(block, *iter_tip)) {
    80+            return error("%s: Failed to read block %s from disk",
    81+                         __func__, iter_tip->GetBlockHash().ToString());
    


    maflcko commented at 7:36 am on August 8, 2023:
    0            return error("Failed to read block %s from disk",
    1                         iter_tip->GetBlockHash().ToString());
    

    nit: For new code there is no need to add __func__, because users can set -logsourcelocations if they want. Also, missing include #include "logging.h" for error(), see CI failure?


    sstone commented at 6:51 am on August 9, 2023:
  109. sstone force-pushed on Aug 8, 2023
  110. DrahtBot removed the label CI failed on Aug 8, 2023
  111. DrahtBot added the label CI failed on Sep 13, 2023
  112. sstone force-pushed on Sep 18, 2023
  113. DrahtBot removed the label CI failed on Sep 18, 2023
  114. sstone force-pushed on Oct 3, 2023
  115. DrahtBot added the label CI failed on Oct 4, 2023
  116. DrahtBot removed the label CI failed on Oct 4, 2023
  117. TheCharlatan commented at 4:45 pm on December 11, 2023: contributor
    Concept ACK
  118. DrahtBot added the label CI failed on Jan 14, 2024
  119. sstone force-pushed on Feb 1, 2024
  120. DrahtBot removed the label CI failed on Feb 1, 2024
  121. DrahtBot commented at 12:18 pm on March 15, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/21097591816

  122. DrahtBot added the label CI failed on Mar 15, 2024
  123. sstone force-pushed on Mar 19, 2024
  124. sstone force-pushed on Mar 19, 2024
  125. DrahtBot removed the label CI failed on Mar 19, 2024
  126. DrahtBot added the label Needs rebase on Mar 20, 2024
  127. sstone force-pushed on Mar 21, 2024
  128. DrahtBot removed the label Needs rebase on Mar 21, 2024
  129. DrahtBot added the label CI failed on Jun 4, 2024
  130. DrahtBot commented at 1:54 am on June 4, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/22929252121

  131. sstone force-pushed on Jun 4, 2024
  132. DrahtBot removed the label CI failed on Jun 4, 2024
  133. DrahtBot commented at 8:32 pm on June 14, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/25775869882

  134. DrahtBot added the label CI failed on Jun 14, 2024
  135. sstone force-pushed on Jun 19, 2024
  136. DrahtBot removed the label CI failed on Jun 19, 2024
  137. in src/index/txospenderindex.cpp:32 in 40e44e96c3 outdated
    27+TxoSpenderIndex::DB::DB(size_t n_cache_size, bool f_memory, bool f_wipe) : BaseIndex::DB(gArgs.GetDataDirNet() / "indexes" / "txospenderindex", n_cache_size, f_memory, f_wipe)
    28+{
    29+}
    30+
    31+TxoSpenderIndex::TxoSpenderIndex(std::unique_ptr<interfaces::Chain> chain, size_t n_cache_size, bool f_memory, bool f_wipe)
    32+    : BaseIndex(std::move(chain), "txospenderindex"), m_db(std::make_unique<TxoSpenderIndex::DB>(n_cache_size, f_memory, f_wipe))
    


    hodlinator commented at 8:28 pm on August 5, 2024:
    nit: Inconsistent style, should probably have initializer list on own line in both cases.
  138. in src/index/txospenderindex.cpp:36 in 2f80cd2d5f outdated
    31+TxoSpenderIndex::TxoSpenderIndex(std::unique_ptr<interfaces::Chain> chain, size_t n_cache_size, bool f_memory, bool f_wipe)
    32+    : BaseIndex(std::move(chain), "txospenderindex"), m_db(std::make_unique<TxoSpenderIndex::DB>(n_cache_size, f_memory, f_wipe))
    33+{
    34+}
    35+
    36+TxoSpenderIndex::~TxoSpenderIndex() {}
    


    hodlinator commented at 6:42 am on August 7, 2024:

    nit: Consider using default syntax to indicate that you only included this in the .cpp file so that m_db can be destructed:

    0TxoSpenderIndex::~TxoSpenderIndex() = default;
    

    TxIndex already does this.

    (Inspired by #19988 (review)).


  139. in src/index/txospenderindex.cpp:185 in 40e44e96c3 outdated
    102+}
    103+
    104+bool TxoSpenderIndex::FindSpender(const COutPoint& txo, uint256& tx_hash) const
    105+{
    106+    return m_db->Read(std::make_pair(DB_TXOSPENDERINDEX, txo), tx_hash);
    107+}
    


    hodlinator commented at 6:56 am on August 7, 2024:

    Suggestion: use optional to promote error checking or [[nodiscard]] bool and document non-const “out”-references by variable naming or comment.

    0std::optional<Txid> TxoSpenderIndex::FindSpender(const COutPoint& txo) const
    1{
    2    uint256 tx_hash_out;
    3    if (m_db->Read(std::pair{DB_TXOSPENDERINDEX, txo}, tx_hash_out))
    4        return Txid::FromUint256(tx_hash_out);
    5    return std::nullopt;
    6}
    

  140. in src/index/txospenderindex.cpp:7 in 40e44e96c3 outdated
    0@@ -0,0 +1,110 @@
    1+// Copyright (c) 2022 The Bitcoin Core developers
    2+// Distributed under the MIT software license, see the accompanying
    3+// file COPYING or http://www.opensource.org/licenses/mit-license.php.
    4+
    5+#include <chainparams.h>
    6+#include <common/args.h>
    7+#include <index/disktxpos.h>
    


    hodlinator commented at 7:10 am on August 7, 2024:

    No longer used?

    0#include <common/args.h>
    

  141. in src/index/txospenderindex.cpp:42 in 40e44e96c3 outdated
    37+
    38+bool TxoSpenderIndex::DB::WriteSpenderInfos(const std::vector<std::pair<COutPoint, uint256>>& items)
    39+{
    40+    CDBBatch batch(*this);
    41+    for (const auto& [outpoint, hash] : items) {
    42+        batch.Write(std::make_pair(DB_TXOSPENDERINDEX, outpoint), hash);
    


    hodlinator commented at 7:19 am on August 7, 2024:

    Could use pair{} for brevity here and elsewhere (https://stackoverflow.com/a/41521422).

    0        batch.Write(std::pair{DB_TXOSPENDERINDEX, outpoint}, hash);
    

  142. in src/index/txospenderindex.h:35 in 40e44e96c3 outdated
    30+    bool CustomRewind(const interfaces::BlockKey& current_tip, const interfaces::BlockKey& new_tip) override;
    31+
    32+    BaseIndex::DB& GetDB() const override;
    33+
    34+public:
    35+    /// Constructs the index, which becomes available to be queried.
    


    hodlinator commented at 7:22 am on August 7, 2024:

    Superfluous comment copied from txindex.h.

  143. in src/index/txospenderindex.h:38 in 40e44e96c3 outdated
    33+
    34+public:
    35+    /// Constructs the index, which becomes available to be queried.
    36+    explicit TxoSpenderIndex(std::unique_ptr<interfaces::Chain> chain, size_t n_cache_size, bool f_memory = false, bool f_wipe = false);
    37+
    38+    // Destructor is declared because this class contains a unique_ptr to an incomplete type.
    


    hodlinator commented at 7:24 am on August 7, 2024:

    nit: Explain why, not what:

    0    // Destroys unique_ptr to an incomplete type.
    
  144. in src/index/txospenderindex.h:46 in 40e44e96c3 outdated
    40+
    41+    bool FindSpender(const COutPoint& txo, uint256& tx_hash) const;
    42+};
    43+
    44+/// The global txo spender index. May be null.
    45+extern std::unique_ptr<TxoSpenderIndex> g_txospenderindex;
    


    hodlinator commented at 7:37 am on August 7, 2024:

    nit: I realize this is copied from txindex.h, but could be made less redundant:

    0/// May be null.
    1extern std::unique_ptr<TxoSpenderIndex> g_txospenderindex;
    
  145. in src/rpc/mempool.cpp:686 in 40e44e96c3 outdated
    681@@ -654,6 +682,36 @@ static RPCHelpMan gettxspendingprevout()
    682                 const CTransaction* spendingTx = mempool.GetConflictTx(prevout);
    683                 if (spendingTx != nullptr) {
    684                     o.pushKV("spendingtxid", spendingTx->GetHash().ToString());
    685+                } else if (mempool_only.value_or(false)) {
    686+                    // do nothing
    


    hodlinator commented at 7:41 am on August 7, 2024:

    Suggestion: Clarify the conditional branches and state of mempool_only later in the code:

    0                    // do nothing if only mempool querying is allowed
    
  146. in src/rpc/mempool.cpp:699 in 40e44e96c3 outdated
    694+                            UniValue warnings(UniValue::VARR);
    695+                            warnings.push_back("txospenderindex is still being synced.");
    696+                            o.pushKV("warnings", warnings);
    697+                        }
    698+                    } else if (!f_txospenderindex_ready) {
    699+                        if (mempool_only.has_value()) {  // NOTE: value is false here
    


    hodlinator commented at 7:48 am on August 7, 2024:

    nit: Clarification here and a few lines below:

    0                        if (mempool_only.has_value()) {  // NOTE: caller explicitly set value to false
    
  147. in src/rpc/mempool.cpp:603 in 40e44e96c3 outdated
    597@@ -597,6 +598,11 @@ static RPCHelpMan gettxspendingprevout()
    598                     },
    599                 },
    600             },
    601+            {"options", RPCArg::Type::OBJ, RPCArg::Optional::OMITTED, "",
    602+                {
    603+                    {"mempool_only", RPCArg::Type::BOOL, RPCArg::DefaultHint{"true if txospenderindex unavailable, otherwise false"}, "If true, txospenderindex will not be used even if mempool lacks a relevant spend. If false and txospenderindex is unavailable, throws an exception if any outpoint lacks a mempool spend."},
    


    hodlinator commented at 8:11 am on August 7, 2024:

    Description could be simplified:

    0                    {"mempool_only", RPCArg::Type::BOOL, RPCArg::DefaultHint{"true if txospenderindex unavailable, otherwise false"}, "If false and mempool lacks relevant spend, txospenderindex will be used (throws exception if unavailable)."},
    
  148. in src/rpc/mempool.cpp:670 in 40e44e96c3 outdated
    663@@ -641,6 +664,11 @@ static RPCHelpMan gettxspendingprevout()
    664                 prevouts.emplace_back(txid, nOutput);
    665             }
    666 
    667+            bool f_txospenderindex_ready{false};
    668+            if (g_txospenderindex && !mempool_only.value_or(false)) {
    669+                f_txospenderindex_ready = g_txospenderindex->BlockUntilSyncedToCurrentChain();
    670+            }
    


    hodlinator commented at 8:15 am on August 7, 2024:

    nit: Could make const:

    0            const bool f_txospenderindex_ready = !mempool_only.value_or(false) &&
    1                g_txospenderindex && g_txospenderindex->BlockUntilSyncedToCurrentChain();
    

  149. in src/index/txospenderindex.cpp:127 in 40e44e96c3 outdated
    79+        CBlock block;
    80+        if (!m_chainstate->m_blockman.ReadBlockFromDisk(block, *iter_tip)) {
    81+            LogError("Failed to read block %s from disk\n", iter_tip->GetBlockHash().ToString());
    82+            return false;
    83+        }
    84+        std::vector<COutPoint> items;
    


    hodlinator commented at 8:44 am on August 7, 2024:

    nit: Similarly to CustomAppend:

    0        std::vector<COutPoint> items;
    1        items.reserve(block.vtx.size());
    


    hodlinator commented at 9:32 pm on October 1, 2024:
  150. in src/index/txospenderindex.cpp:187 in 40e44e96c3 outdated
    105+{
    106+    return m_db->Read(std::make_pair(DB_TXOSPENDERINDEX, txo), tx_hash);
    107+}
    108+
    109+
    110+BaseIndex::DB& TxoSpenderIndex::GetDB() const { return *m_db; }
    


    hodlinator commented at 1:25 pm on August 7, 2024:
    (nit: Double newline is inconsistent with the rest of the file)

  151. in src/rpc/mempool.cpp:691 in 40e44e96c3 outdated
    686+                    // do nothing
    687+                } else if (g_txospenderindex) {
    688+                    // no spending tx in mempool, query txospender index
    689+                    uint256 spendingtxid;
    690+                    if (g_txospenderindex->FindSpender(prevout, spendingtxid)) {
    691+                        o.pushKV("spendingtxid", spendingtxid.GetHex());
    


    hodlinator commented at 1:50 pm on August 7, 2024:

    If other FindSpender suggestion is taken:

    0                   if (auto spending_txid{g_txospenderindex->FindSpender(prevout)}) {
    1                        o.pushKV("spendingtxid", spending_txid->GetHex());
    

  152. in src/index/txospenderindex.cpp:16 in 40e44e96c3 outdated
     8+#include <index/txospenderindex.h>
     9+#include <logging.h>
    10+#include <node/blockstorage.h>
    11+#include <validation.h>
    12+
    13+constexpr uint8_t DB_TXOSPENDERINDEX{'s'};
    


    hodlinator commented at 10:33 pm on August 7, 2024:

    (Saw this was somewhat covered in #24539 (review)). Maybe document here?

    0// "Namespace" within LevelDB, separate from the one used by BaseIndex.
    1constexpr uint8_t DB_TXOSPENDERINDEX{'s'};
    


    hodlinator commented at 7:32 pm on August 22, 2024:
    Better than my suggestion, thanks!

    hodlinator commented at 9:23 pm on October 1, 2024:
  153. hodlinator commented at 10:48 pm on August 7, 2024: contributor

    Concept ACK 40e44e96c38593210877f0b5d062e71a8dd292a5

    Can see how this is useful for L2s. Is a variant of this PR known to be used in production?

    I do agree with @Pantamis in #24539 (comment) that having it take 3x the space of TxIndex seems unfortunate, would prefer it used CDiskTxPos for the values. Edit: Would require disallowing pruning again.

    Commits could be squashed/rewritten. Seems like the latter two are responses to PR feedback rather than “telling a story”. Previously pointed out: #24539 (comment)

    PR summary still has “(so -txospenderindex requires -txindex)” which doesn’t match current implementation.

  154. hebasto added the label Needs CMake port on Aug 16, 2024
  155. sstone force-pushed on Aug 21, 2024
  156. sstone force-pushed on Aug 21, 2024
  157. DrahtBot added the label CI failed on Aug 21, 2024
  158. DrahtBot commented at 4:30 pm on August 21, 2024: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/29067777506

    Make sure to run all tests locally, according to the documentation.

    The failure may happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  159. DrahtBot removed the label CI failed on Aug 21, 2024
  160. in src/rpc/mempool.cpp:686 in 20b259332f outdated
    678@@ -654,6 +679,35 @@ static RPCHelpMan gettxspendingprevout()
    679                 const CTransaction* spendingTx = mempool.GetConflictTx(prevout);
    680                 if (spendingTx != nullptr) {
    681                     o.pushKV("spendingtxid", spendingTx->GetHash().ToString());
    682+                } else if (mempool_only.value_or(false)) {
    683+                    // do nothing, caller has selected to only query the mempool
    684+                } else if (g_txospenderindex) {
    685+                    // no spending tx in mempool, query txospender index
    686+                    if (auto spendingTxId{g_txospenderindex->FindSpender(prevout)}) {
    


    hodlinator commented at 9:55 pm on August 21, 2024:
    https://github.com/bitcoin/bitcoin/blob/master/doc/developer-notes.md#coding-style-c calls for snake_case and Txid is the name of the type in this codebase so spending_txid would fit better.
  161. in src/index/txospenderindex.cpp:26 in 20b259332f outdated
    21+
    22+    bool WriteSpenderInfos(const std::vector<std::pair<COutPoint, uint256>>& items);
    23+    bool EraseSpenderInfos(const std::vector<COutPoint>& items);
    24+};
    25+
    26+TxoSpenderIndex::DB::DB(size_t n_cache_size, bool f_memory, bool f_wipe) : BaseIndex::DB(gArgs.GetDataDirNet() / "indexes" / "txospenderindex", n_cache_size, f_memory, f_wipe)
    


    hodlinator commented at 10:00 pm on August 21, 2024:

    nit: I like how you responded to #24539 (review) but my primary intention was actually doing:

    0TxoSpenderIndex::DB::DB(size_t n_cache_size, bool f_memory, bool f_wipe)
    1    : BaseIndex::DB(gArgs.GetDataDirNet() / "indexes" / "txospenderindex", n_cache_size, f_memory, f_wipe)
    
  162. in src/test/txospenderindex_tests.cpp:67 in 20b259332f outdated
    62+        BOOST_REQUIRE(time_start + timeout > SteadyClock::now());
    63+        UninterruptibleSleep(std::chrono::milliseconds{100});
    64+    }
    65+    for (size_t i = 0; i < spent.size(); i++) {
    66+        auto txId{txospenderindex.FindSpender(spent[i])};
    67+        BOOST_CHECK(txId.has_value() && txId.value() == spender[i].GetHash());
    


    hodlinator commented at 10:08 pm on August 21, 2024:

    nit: Should also use either snake_case tx_id or preferably txid to be consistent with the Txid type.

    Could rely on .value() throwing exception if not set and avoid the naming altogether.

    + Slightly better for error messages etc to use BOOST_CHECK_EQUAL.

    0        BOOST_CHECK_EQUAL(txospenderindex.FindSpender(spent[i]).value(), spender[i].GetHash());
    

    hodlinator commented at 7:29 pm on August 31, 2024:

    ( If support for printing std::optional is added as in pending PR 16545, one could even write:

    0        BOOST_CHECK_EQUAL(txospenderindex.FindSpender(spent[i]), spender[i].GetHash());
    

    )

  163. sstone commented at 6:53 am on August 22, 2024: contributor

    I do agree with @Pantamis in #24539 (comment) that having it take 3x the space of TxIndex seems unfortunate, would prefer it used CDiskTxPos for the values. Edit: Would require disallowing pruning again.

    To make the index more efficient we could replace the indexed outpoint with an 8 byte value {block height|tx pos|output index} and use the block height where the tx is spent instead of the full spending tx id, reducing the size of an index entry from (32 + 4 + 32) to (8 + 4) bytes, though it would become less generic and harder to use (especially without -txindex).

  164. DrahtBot added the label CI failed on Aug 29, 2024
  165. DrahtBot commented at 6:43 am on August 29, 2024: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/29068484289

    Make sure to run all tests locally, according to the documentation.

    The failure may happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  166. maflcko removed the label Needs CMake port on Aug 29, 2024
  167. DrahtBot added the label Needs rebase on Sep 2, 2024
  168. sstone force-pushed on Sep 9, 2024
  169. sstone force-pushed on Sep 9, 2024
  170. DrahtBot removed the label Needs rebase on Sep 9, 2024
  171. DrahtBot removed the label CI failed on Sep 9, 2024
  172. hodlinator commented at 1:01 pm on September 12, 2024: contributor

    I put some time into implementing 3 variations on different personal branches:

    Variant 1: unchanged_key_CDiskTxPos / baf57fb3a45c97d6cfb531b9252c97d4f27ad7ae

    uint256 value -> CDiskTxPos

    Still have key of COutPoint: 32 + 4 = 36 bytes old value - uint256: 32 bytes new value - CDiskTxPos: 4 + 4 + 4 = 12 bytes

    36+32 -> 36+12 => 29% theoretical size reduction.

    Variant 2: slow_key_CDiskTxPos / 1d3fd04a60ca5c28d8bef6ce4f611a583667dc2e

    Minimize key + CDiskTxPos value

    old key - COutPoint: 32 + 4 = 36 bytes new key - uint64: 8 bytes old value - uint256: 32 bytes new value - CDiskTxPos: 4 + 4 + 4 = 12 bytes

    36+32 -> 8+12 => 71% theoretical size reduction.

    Uses expensive way of computing keys: looking up through BlockManager and reading the block data from disk.

    Apologies for using “tx_index” both for the TxIndex database and the location of a transactions within a given block.

    Variant 3: minimal_key_value / ced505834dd1349bccbceeac3bf5abaa2327bda0

    Minimize key & value footprint

    old key - COutPoint: 32 + 4 = 36 bytes new key - uint64: 8 bytes old value - uint256: 32 bytes new value - uint32: 4 bytes

    36+32 -> 8+4 => 82% theoretical size reduction, matching #24539 (comment).

    Uses expensive way of computing keys: looking up through BlockManager and reading the block data from disk.

    TxoSpenderIndex::FindSpender uses questionable way of finding the correct block.

    Conclusion

    Not an expert on the BlockManager family of API’s so there may well be better implementations.

    The first one of just switching the value uint256 -> CDiskTxPos feels quite feasible. Should have similar performance as TxIndex::FindTx() for the TxoSpenderIndex::FindSpender()-lookups.

    The 2 later variants do significantly decrease the amount of theoretical space used, but I’m guessing the cost of computing the key would be somewhat prohibitive.

    If run in an environment where hardware cost is negligible, but speed of lookup is critical, keeping the current approach of full uint256 values may be preferable.

    (The size reductions have not been confirmed on chain syncs and remain theoretical for now).

    (All 3 explored variations contain the same 2 base refactoring commits de14caa0df5b90208400b96e5b1b34752f4d57c0 & 44e4e4e88fb10c3fe44ebe238f88da369a7572f0).

    Edit: Managed to fix functional tests for variant 2 with an additional commit but variant 3 is still flaky despite being patched up.

  173. antonilol commented at 2:40 pm on September 12, 2024: none

    minimal_key_value / ced5058

    Minimize key & value footprint

    This one (the third) is very similar to the ‘spending’ index of electrs (https://github.com/romanz/electrs), only different being how the key is calculated, electrs uses the first 8 bytes of the txid (as uint64) + vout, because the block height or tx position might not be known.

    Needing to parse whole blocks to find a transaction for every request impacts lookup time and requires all blocks to be available (so either disallows pruning or needs to request blocks from peers, which hurts privacy).

  174. DrahtBot added the label CI failed on Sep 12, 2024
  175. hodlinator commented at 9:35 pm on September 12, 2024: contributor

    This one (the third) is very similar to the ‘spending’ index of electrs (https://github.com/romanz/electrs), only different being how the key is calculated, electrs uses the first 8 bytes of the txid (as uint64) + vout, because the block height or tx position might not be known.

    Needing to parse whole blocks to find a transaction for every request impacts lookup time and requires all blocks to be available (so either disallows pruning or needs to request blocks from peers, which hurts privacy).

    Ah, I see: https://github.com/romanz/electrs/blob/875fb61951e3b337c9e4e468734e119ff58c0482/src/types.rs#L119-L124 Interesting variant. Just using 8 bytes from the txid + vout does allow for a lot of transactions before collisions start becoming frequent, and avoids the expensive block lookup. People do grind txid prefixes like https://mempool.space/tx/b10c0000004da5a9d1d9b4ae32e09f0b3e62d21a5cce5428d4ad714fb444eb5d so they could intentionally force a lot of equal prefixes, but that’s probably negligible.

  176. hodlinator commented at 10:33 pm on September 12, 2024: contributor

    New variant inspired by @antonilol with same disk space requirements as my 2nd variant above (avoiding my messy and not 100% functional block-lookup-by-height code from variant 3 for now). Much cleaner, removing requirement for additional locking and removes dependencies on BlockManager and TxIndex - on parity with variant 1 but with the disk space savings of variant 2!

    Edit 1: Drawback - does not handle key-collisions yet. :D Edit 2: Now contains tentative collision handling solution without specific tests. Prefix collisions should be very rare but do fall back to full outpoint keys (32 + 4 bytes).

    Variant 4: txidprefix+vout_CDiskTxPos / 3bc5c8dbffd1ef529eb12bae5f5722c99cc17c69 (+ c2d0d97f467649748a7f21af3b1e58b74da12b49 for collision handling)

    txout prefix + CDiskTxPos value

    old key - COutPoint: 32 + 4 = 36 bytes new key - (COutPoint.hash.u64_prefix + COutPoint.vout) -> uint64: 8 bytes old value - uint256: 32 bytes new value - CDiskTxPos: 4 + 4 + 4 = 12 bytes

    36+32 -> 8+12 => 71% theoretical size reduction.

  177. DrahtBot removed the label CI failed on Sep 15, 2024
  178. DrahtBot commented at 10:28 am on September 24, 2024: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/29873383730

    Make sure to run all tests locally, according to the documentation.

    The failure may happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  179. DrahtBot added the label CI failed on Sep 24, 2024
  180. Add a "tx output spender" index
    Adds an outpoint -> txid index, which can be used to find which transactions spent a given output.
    This is extremely useful for Lightning and more generally for layer-2 protocol that rely on chains of unpublished transactions.
    If enabled, this index will be used by `gettxspendingprevout` when it does not find a spending transaction in the mempool.
    e41b0f550f
  181. sstone force-pushed on Sep 27, 2024
  182. DrahtBot removed the label CI failed on Sep 27, 2024
  183. antonilol commented at 11:45 am on September 27, 2024: none
    How hard is it for attackers to deliberately insert collisions? Is it worth to (try to) counteract this?
  184. sstone commented at 12:32 pm on September 27, 2024: contributor

    How hard is it for attackers to deliberately insert collisions? Is it worth to (try to) counteract this?

    The index is currently based on suggestions by @hodlinator and @antonilol and the key for a given outpoint is the last 8 bytes of the outpoint’s tx hash, represented as an 8 bytes integer, to which we add the outpoint’s output index. => to generated collisions attackers need to generate transactions which hash to values that have the same first 8 bytes. I guess that would not be that difficult.

    I think it may be countered by using a 64 bits hash mixed with a random key initialised when the index is created (so different nodes would use different keys), something like siphash.

  185. sstone commented at 12:36 pm on September 27, 2024: contributor

    Another option that I’m looking at is simply to replace and extend txindex and create 2 indexes:

    • txid (32 bytes) -> tx position (8 bytes) (same as txindex)
    • tx position (8 bytes) + output index (4 bytes) -> spending tx position (8 bytes)

    No more collisions, and we can return the full tx and not just its hash, but not as efficient as this version (which can also return the full tx)l

  186. sstone force-pushed on Oct 1, 2024
  187. sstone commented at 7:28 pm on October 1, 2024: contributor

    Updated with the following changes:

    • key is now siphash(k, spent outpoint) (8 bytes) where k is a random value created when the index is created, to defend against potential collisions attacks. Value is still a tx position (8 bytes)
    • gettxspendingprevout will return the full spending tx if option include_tx is set
  188. antonilol commented at 8:45 pm on October 1, 2024: none

    That’s a good way to prevent attackers from inserting collisions in everyone’s database, the chance of collision will be dependent on k, which is unknown to an attacker and different for everyone (except by chance or people copying each other’s database). Shouldn’t 8 bytes of randomness be enough (instead of 16)? Assuming that siphash’s output is uniformly distributed, more randomness than the output size (8 bytes) feels unnecessary.

    Also, the rpc description should be updated, https://github.com/sstone/bitcoin/blob/add-txospender-index/src/rpc/mempool.cpp#L589 (couldn’t add a github review comment because it is too far away from a changed line)

  189. in src/index/txospenderindex.cpp:68 in 93e1cafc3f outdated
    63+    CDBBatch batch(*m_db);
    64+    for (const auto& outpoint : items) {
    65+        std::vector<CDiskTxPos> positions;
    66+        std::pair<uint8_t, uint64_t> key{DB_TXOSPENDERINDEX, CreateIndex(outpoint)};
    67+        if (!m_db->Read(key, positions)) {
    68+            LogWarning("%s: could not read expected entry", __func__);
    


    hodlinator commented at 9:26 pm on October 1, 2024:

    Manually adding __func__ to new log lines is discouraged since -logsourcelocations CLI arg was added.

    0            LogWarning("Could not read expected entry");
    

    Here and elsewhere in the file.


  190. in src/index/txospenderindex.cpp:51 in 93e1cafc3f outdated
    46+        std::pair<uint8_t, uint64_t> key{DB_TXOSPENDERINDEX, CreateIndex(outpoint)};
    47+        if (m_db->Exists(key)) {
    48+            if (!m_db->Read(key, positions)) {
    49+                LogError("%s: Cannot read current state; tx spender index may be corrupted\n", __func__);
    50+            }
    51+        }
    


    hodlinator commented at 9:31 pm on October 1, 2024:

    Maybe avoid double-lookup in Release, depending on how unlikely/harmful you feel this is?

    0        if (!m_db->Read(key, positions)) {
    1            Assume(!m_db->Exists(key)); // Cannot read current state; tx spender index may be corrupted
    2        }
    

    sstone commented at 6:18 pm on October 6, 2024:
    Collisions are supposed to be extremely rare, I’d rather keep the check as is.
  191. in src/index/txospenderindex.cpp:148 in 93e1cafc3f outdated
    143+    } while (new_tip_index != iter_tip);
    144+
    145+    return true;
    146+}
    147+
    148+bool TxoSpenderIndex::ReadTransaction(const CDiskTxPos& postx, CTransactionRef& tx) const
    


    hodlinator commented at 9:33 pm on October 1, 2024:

    nit: More consistent with type name and snake_case variable naming convention.

    0bool TxoSpenderIndex::ReadTransaction(const CDiskTxPos& tx_pos, CTransactionRef& tx) const
    

  192. in src/index/txospenderindex.h:22 in 93e1cafc3f outdated
    17+ */
    18+class TxoSpenderIndex final : public BaseIndex
    19+{
    20+private:
    21+    std::unique_ptr<BaseIndex::DB> m_db;
    22+    std::pair<uint64_t, uint64_t> siphash_key;
    


    hodlinator commented at 9:36 pm on October 1, 2024:

    Should at least have m_-prefix (possibly also distinguishing it as a “salt” instead of a “key”, but I understand key is consistent with SipHash terminology).

    0    std::pair<uint64_t, uint64_t> m_siphash_salt;
    

    It’s unfortunate that SaltedOutpointHasher is not directly usable here. Would require serialization support at the very least.


  193. in src/index/txospenderindex.h:23 in 93e1cafc3f outdated
    18+class TxoSpenderIndex final : public BaseIndex
    19+{
    20+private:
    21+    std::unique_ptr<BaseIndex::DB> m_db;
    22+    std::pair<uint64_t, uint64_t> siphash_key;
    23+    uint64_t CreateIndex(const COutPoint& vout) const;
    


    hodlinator commented at 9:37 pm on October 1, 2024:
    Consider renaming function to CreateKey since underlying LevelDB is key/value store. “Index” also collides with the TxoSpenderIndex name itself.

  194. hodlinator commented at 10:40 pm on October 1, 2024: contributor

    Reviewed 93e1cafc3f8a7f5b36e495d7eb68c19acdab5809

    Concept: I am curious about the justification for including this Index in Bitcoin Core vs implementing it as a third party index, especially since it is not required by anything inside of Core itself. It keeps the index in close sync with the chain state and de-duplicates effort among L2-implementations, but adds maintenance burden in Core.

    Implementation: Appreciate the fresh SipHash take. Solves the DoS problem and has some nerd-sniping potential. Also nice having include_tx, as CDiskPos already implies deserialization of the full tx. Feared the approach of storing values with colliding keys together, but turned out quite smooth.


    sstone:

    Value is still a tx position (8 bytes)

    I believe CDiskPos is at least 12 bytes CDiskTxPos::nTxOffset + FlatFilePos::nFile + FlatFilePos::nPos. This is assuming there is no additional padding for alignment nor per-element size information in LevelDB. Since values are currently vector<CDiskPos> there should at the very least be an extra byte for the compressed size of the vector itself.


    antonilol:

    Shouldn’t 8 bytes of randomness be enough (instead of 16)? Assuming that siphash’s output is uniformly distributed, more randomness than the output size (8 bytes) feels unnecessary.

    According to https://en.wikipedia.org/wiki/SipHash key size is 128-bit as standard and that’s what is currently supported in Bitcoin Core. Having a smaller key might increase DoS risk to much after all? + The key is only stored in one entry in LevelDB. + This type of computation using data that is cache-coherent (8 bytes next to the first 8 bytes of the key) should be exceedingly cheap.

  195. antonilol commented at 9:43 am on October 2, 2024: none

    According to https://en.wikipedia.org/wiki/SipHash key size is 128-bit as standard and that’s what is currently supported in Bitcoin Core. Having a smaller key might increase DoS risk to much after all?

    • The key is only stored in one entry in LevelDB.
    • This type of computation using data that is cache-coherent (8 bytes next to the first 8 bytes of the key) should be exceedingly cheap.

    You are right, siphash is designed for a 128 bit key, so lets keep it 128 bit then. It is not a cryptographic hash according to that wikipedia page, so I could be wrong about the output being uniformly distributed. Also, the cost of the extra 8 bytes is minimal/negligible.

  196. Use smaller keys and values
    Key was 36 bytes (txid: 32 bytes, output index: 4 bytes) and is now 8 bytes: the siphash of the spent outpoint, keyed with a random key that is created when the index is created (to avoid collision attacks).
    Value was 32 bytes (txid: 32 bytes), and is now a list of tx positions (9 bytes unless there are collisions which should be extremely rare).
    00ea901253
  197. sstone force-pushed on Oct 6, 2024
  198. sipa commented at 6:21 pm on October 6, 2024: member

    SipHash with a 128-bit key seems like the correct choice for this.

    It is indeed not cryptographic, in the sense that it is not practically collision-resistant (no 64-bit hash function can be, as that inherently allows a 2^32 birthday attack).

    However, SipHash is effectively as secure and unpredictable as a 64-bit hash function can be, and is specifically designed for preventing attackers from triggering hash table collisions at scale, given that the salt is a uniform 128-bit key unknown to the attacker.

  199. sstone commented at 6:24 pm on October 6, 2024: contributor

    Concept: I am curious about the justification for including this Index in Bitcoin Core vs implementing it as a third party index, especially since it is not required by anything inside of Core itself. It keeps the index in close sync with the chain state and de-duplicates effort among L2-implementations, but adds maintenance burden in Core.

    I understand that adding new indexes must be justified. This one is simple and quite efficient now, the code is compact, isolated and easy to understand, and imho there are more and more apps/protocols that work by creating/updating transactions that are not published that would benefit from such an index.

    For applications that use Bitcoin Core’s wallet (this is is our case,) and benefit from all the features that have been and are being developed (fee management, fee bumping, coin selection, package relay, descriptors, …), adding a custom tool to build an external index would be sub-optimal from a dev and also an operational point of view, and people may prefer to use this PR instead.

  200. willcl-ark commented at 3:29 pm on October 14, 2024: member

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-21 15:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me