Implementation of SwiftSync #34004

pull rustaceanrob wants to merge 9 commits into bitcoin:master from rustaceanrob:swiftsync-v0 changing 15 files +723 −11
  1. rustaceanrob commented at 9:50 am on December 4, 2025: contributor

    The original protocol writeup can be found here. See the Delving Bitcoin post for initial work and discussion.

    Motivation

    While many coins are available in cache when spent, every cache miss requires a trip to the disk. Memory can be scarce on single-board machines, so cache misses are expected to be far more frequent there. A trip to disk is all the more severe on low-end hardware, so any reduction in disk interactions can speed up initial block download (IBD) for resource-limited devices. One such example may be a block template provider serving a home miner.

    Intuition

    The vast majority of coins created are later spent. As a consequence, many coins are written to disk only to be removed later. The SwiftSync protocol alleviates the problem by recording which coins have been added and removed in a small, in-memory aggregate. This allows intermediate coin states to be omitted until SwiftSync has completed. The state of the aggregate is verified at a predetermined chain height, and IBD continues as usual thereafter.

    For any chain height, we know that all outputs - all inputs = UTXO set, which is equivalent to all outputs - UTXO set = all inputs. The goal is to verify that the set all outputs - UTXO set is equal to the set of all inputs. Addition and subtraction modulo a field offers a way to do so: if we add every element of a set and then subtract every element of the same set, in any order, we arrive at zero. In the current implementation, the set elements being compared are hashed outpoints.
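
    The identity can be written out explicitly. This is a sketch of the check, where the symbols H (the salted hash), m (the modulus), O_h, I_h and U_h are notation introduced here for illustration and do not appear in the PR:

```latex
\begin{align*}
O_h &= \text{outpoints created up to height } h, \quad
I_h = \text{outpoints spent up to height } h, \quad
U_h = O_h \setminus I_h \\
A_h &= \sum_{o \in O_h \setminus U_h} H(o) \;-\; \sum_{i \in I_h} H(i) \pmod{m} \\
O_h \setminus U_h = I_h &\;\Longrightarrow\; A_h \equiv 0 \pmod{m}
\end{align*}
```

    The implication only runs in one direction: a zero aggregate indicates that the sets match with high probability rather than with certainty, which is presumably why the elements are hashed with a salt before being summed.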

    The SwiftSync protocol consists of two structures. The first is the aggregate, which records which outpoints have been added (created) or subtracted (spent). Next, notice that the UTXO set must be reported in order to verify all outputs - UTXO set = all inputs. SwiftSync introduces a hints file, which may be provided to the client at startup. This hints file is nothing more than a pointer to each unspent output: it records the block height and the indexes of the outputs in that block that will remain unspent. By using this file, a client may omit outpoints that will be in the UTXO set when updating the aggregate. Note that this file is not trusted. If a malicious actor provides a faulty UTXO set, the set comparison above will fail, and the client may begin a reindex.

    Implementation

    Aggregate

    The hash aggregate takes outpoints, hashes them with a salted hasher, and updates four 64-bit limbs with wrapping (modulo 2^64) arithmetic. The choice of four 64-bit limbs with no carries between limbs was made to: 1. minimize code complexity within the aggregate; 2. conform to existing APIs, namely uint256. After adding and subtracting some arbitrary list of outpoints, we may check whether the sets were equivalent with IsZero.
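
    A minimal sketch of the limb arithmetic described above. The names ToyAggregate, Digest, Mix and ToyHash are hypothetical and not taken from the PR, and the mixer is a simple stand-in for the salted hasher rather than the hash the branch actually uses. Unsigned overflow wraps modulo 2^64 in C++, so each limb can be updated independently with no carries:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a hashed outpoint: four 64-bit words.
using Digest = std::array<uint64_t, 4>;

// Toy 64-bit mixer (splitmix64 finalizer). NOT the salted hasher from the PR.
inline uint64_t Mix(uint64_t x)
{
    x += 0x9e3779b97f4a7c15ULL;
    x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
    x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
    return x ^ (x >> 31);
}

// Derive a toy digest from a (txid placeholder, output index) pair and a salt.
inline Digest ToyHash(uint64_t txid_word, uint32_t vout, uint64_t salt)
{
    Digest d{};
    uint64_t seed = Mix(txid_word ^ salt) ^ Mix(vout + salt);
    for (std::size_t i = 0; i < d.size(); ++i) {
        seed = Mix(seed + i);
        d[i] = seed;
    }
    return d;
}

// Four independent limbs, each updated with wrapping uint64_t arithmetic.
class ToyAggregate
{
    std::array<uint64_t, 4> m_limbs{};

public:
    // Record a created outpoint: add its digest limb by limb.
    void Add(const Digest& d)
    {
        for (std::size_t i = 0; i < m_limbs.size(); ++i) m_limbs[i] += d[i];
    }
    // Record a spent outpoint: subtract its digest limb by limb.
    void Spend(const Digest& d)
    {
        for (std::size_t i = 0; i < m_limbs.size(); ++i) m_limbs[i] -= d[i];
    }
    // True when everything added has also been spent.
    bool IsZero() const
    {
        return m_limbs == std::array<uint64_t, 4>{};
    }
};
```

    Adding three digests and spending the same three in a different order leaves every limb at zero, so IsZero() returns true; spending only two of them leaves at least one nonzero limb.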

    Hintsfile

    The hints file is a map of block height to the indexes of the outputs in that block that will be in the UTXO set. To construct this file for a particular UTXO set state, each block must be read and queried to find the outputs in that block that remain unspent. Because this is a read-only operation, this may be done on parallel threads in the future. To allow for parallelization, the hints file contains a “contents” section, which denotes the height and corresponding file position to find the hints for that block.

    To save space when representing the indexes that remain unspent in a block, an easy choice is to store the difference between the last coin that was unspent and the next one in the block. AFAICT this is a version of run-length encoding. These differences are encoded using CompactSize to further save space; the run lengths should normally fit into at most 16 bits, and often 8 bits.
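
    The gap encoding can be sketched as follows. ToGaps, FromGaps and CompactSizeLength are hypothetical helper names, and measuring the first gap from index zero is an assumption; the branch may define the first element differently. CompactSizeLength follows Bitcoin's CompactSize rules (one byte below 253, then three, five or nine bytes):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert sorted output indexes into the gaps between consecutive entries.
// Assumption: the first gap is measured from index 0.
std::vector<uint64_t> ToGaps(const std::vector<uint64_t>& unspent_indexes)
{
    std::vector<uint64_t> gaps;
    gaps.reserve(unspent_indexes.size());
    uint64_t prev = 0;
    for (uint64_t index : unspent_indexes) {
        assert(index >= prev); // input must be sorted ascending
        gaps.push_back(index - prev);
        prev = index;
    }
    return gaps;
}

// Inverse of ToGaps: a running sum recovers the absolute indexes.
std::vector<uint64_t> FromGaps(const std::vector<uint64_t>& gaps)
{
    std::vector<uint64_t> indexes;
    indexes.reserve(gaps.size());
    uint64_t running = 0;
    for (uint64_t gap : gaps) {
        running += gap;
        indexes.push_back(running);
    }
    return indexes;
}

// Number of bytes a value occupies under CompactSize encoding.
std::size_t CompactSizeLength(uint64_t v)
{
    if (v < 253) return 1;
    if (v <= 0xFFFF) return 3;
    if (v <= 0xFFFFFFFF) return 5;
    return 9;
}
```

    For unspent indexes {3, 4, 10, 300} the gaps are {3, 1, 6, 290}; the first three take one byte each and the last takes three.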

    Parameter Interaction

    A side effect of omitting intermediate UTXO states while performing IBD is that the “undo data” cannot be derived. There are already configurations that do not have full undo data history, namely pruned nodes. To avoid additional complexity in this PR, this version of accelerated IBD is only made available to pruned nodes, as the undo data would have been removed regardless of this setting.

    Other parameter interactions, not yet implemented:

    • assumeutxo: Startup should fail if both are configured. A hints file will have no effect, as the UTXO set is already present.
    • assumevalid: A natural choice is to only allow SwiftSync when using assumevalid, and to only allow a hints file that commits to the assumevalid block. Similar to assumeutxo, a hash of the hints file may be committed to in the binary, which would be resistant to griefing attacks.

    Testing [Linux example]

    In your bitcoin directory, download the hints from the test server:

    curl -o signet.hints http://utxohints.store/hints/signet
    

    Verify the hash of the file:

    echo "2b625392ecb2adba2b5f9d7b00c8f3a8adca0d3c  signet.hints" | sha1sum --check -
    

    Build the branch:

    cmake -B build && cmake --build build -j $(nproc)
    

    Remove existing chain state:

    rm -rf ~/.bitcoin/signet/
    

    Provide the file at startup:

    ./build/bin/bitcoind --chain=signet --utxohints=./signet.hints --prune=550
    

    Other networks:

    • Testnet4: endpoint http://utxohints.store/hints/testnet4, expected checksum 7acbb25b651ffc5ebc59c46fd065a510c149f00e testnet4.hints

    Open Questions

    • Given the outcome of SwiftSync is pass/fail, and there is no use of intermediate states, should blocks be persisted to disk at all during the protocol?
    • To what degree should a hints file be committed to in the binary? No commitment allows a server to provide hints for an arbitrary height, which is convenient for the client but opens up the possibility of a griefing attack. Of course, the server risks losing reputation for serving faulty files.

    Future Iteration

    This is the first iteration of the SwiftSync proposal; further optimizations may follow. Namely, because set elements may be added to or subtracted from the aggregate in any order, blocks may be downloaded in parallel from multiple peers. If you are interested in benchmarking the performance of a fully parallel version, see the Rust implementation.

  2. DrahtBot commented at 9:50 am on December 4, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/34004.

    Reviews

    See the guideline for information on the review process. A summary of reviews will appear here.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #32741 (rpc: Add optional peer_ids param to filter getpeerinfo by waketraindev)
    • #32317 (kernel: Separate UTXO set access from validation functions by sedited)
    • #31974 (Drop testnet3 by Sjors)
    • #31774 (crypto: Use secure_allocator for AES256_ctx by davidgumberg)
    • #28690 (build: Introduce internal kernel library by sedited)
    • #25665 (refactor: Add util::Result failure types and ability to merge result values by ryanofsky)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

    LLM Linter (✨ experimental)

    Possible typos and grammar issues:

    • “…hint file may take a few hours to build.” “Make sure to use no RPC timeout (bitcoin-cli -rpcclienttimeout=0)” -> “…hint file may take a few hours to build. Make sure to use no RPC timeout (bitcoin-cli -rpcclienttimeout=0)” [The two adjacent string literals are concatenated without an intervening space, producing “build.Make” which is hard to read; add a space between sentences.]

    2025-12-17

  3. DrahtBot added the label CI failed on Dec 4, 2025
  4. DrahtBot commented at 11:05 am on December 4, 2025: contributor

    🚧 At least one of the CI tasks failed. Task ASan + LSan + UBSan + integer: https://github.com/bitcoin/bitcoin/actions/runs/19924702497/job/57121471145 LLM reason (✨ experimental): swiftsync_tests failed due to an unsigned integer overflow (runtime error) in swiftsync.cpp.

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  5. rustaceanrob force-pushed on Dec 4, 2025
  6. in src/swiftsync.cpp:28 in 183fd9aa2e outdated
    14+    std::span<unsigned char> salt;
    15+    GetStrongRandBytes(salt);
    16+    m_salted_hasher.write(std::as_writable_bytes(salt));
    17+}
    18+
    19+void Aggregate::Add(const COutPoint& outpoint)
    


    rustaceanrob commented at 12:53 pm on December 4, 2025:
    To suppress the UB sanitizer here I added a __attribute__(no_sanitize ... )), which caused a CI failure on Windows. Is there 1. a wrapping addition/subtraction API I am not aware of, or 2. a way to suppress UB across all targets?

    maflcko commented at 2:02 pm on December 4, 2025:

    You may need to use a suppressions file, see test/sanitizer_suppressions. They may be used as follows:

    export LSAN_OPTIONS="suppressions=$(pwd)/test/sanitizer_suppressions/lsan"
    export TSAN_OPTIONS="suppressions=$(pwd)/test/sanitizer_suppressions/tsan:halt_on_error=1:second_deadlock_stack=1"
    export UBSAN_OPTIONS="suppressions=$(pwd)/test/sanitizer_suppressions/ubsan:print_stacktrace=1:halt_on_error=1:report_error_type=1"
    
  7. rustaceanrob force-pushed on Dec 4, 2025
  8. DrahtBot removed the label CI failed on Dec 4, 2025
  9. in src/init.cpp:988 in fcdce16cf1 outdated
    982@@ -981,6 +983,10 @@ bool AppInitParameterInteraction(const ArgsManager& args)
    983         if (args.GetBoolArg("-reindex-chainstate", false)) {
    984             return InitError(_("Prune mode is incompatible with -reindex-chainstate. Use full -reindex instead."));
    985         }
    986+    } else {
    987+        if (args.IsArgSet("-utxohints")) {
    988+            return InitError(_("UTXO hints cannot be used without pruned mode."));
    


    l0rinc commented at 8:11 am on December 5, 2025:
    This seems like a very serious limitation currently, I don’t see a huge difference between this and just loading the final AssumeUTXO state. If this is targeted to very low-memory environments, AssumeUTXO would be even better. I’m not a fan of AssumeUTXO, but it still seems better than just pruned SwiftSync.
  10. in src/swiftsync.cpp:32 in fcdce16cf1 outdated
    53+        m_file << height;
    54+        m_file << dummy_file_pos;
    55+    }
    56+}
    57+
    58+bool HintsfileWriter::WriteNextUnspents(const std::vector<uint64_t>& unspent_offsets, const uint32_t& height)
    


    l0rinc commented at 8:15 am on December 5, 2025:
    any reason for passing height (and a few other similar primitives) as (const) reference?

    rustaceanrob commented at 1:00 pm on December 15, 2025:
    No. Are all primitives expected to be passed by value? I don’t see a harm in making them const, but maybe I am missing something.

    sipa commented at 1:05 pm on December 15, 2025:
    C++ Core Guidelines say to pass things that are up to 2-3 words by value, for performance reasons: https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#rf-in

    l0rinc commented at 5:59 pm on December 15, 2025:

    for performance reasons

    yes, and it’s also easier to reason about local state

  11. in src/swiftsync.cpp:51 in fcdce16cf1 outdated
    49+    m_file << FILE_MAGIC;
    50+    m_file << FILE_VERSION;
    51+    m_file << preallocate;
    52+    for (uint32_t height = 1; height <= preallocate; height++) {
    53+        m_file << height;
    54+        m_file << dummy_file_pos;
    


    l0rinc commented at 8:16 am on December 5, 2025:
    why do we need a dummy during file writing?
  12. in src/swiftsync.cpp:89 in fcdce16cf1
    84+    m_file >> version;
    85+    if (version != FILE_VERSION) {
    86+        throw std::ios_base::failure("HintsfileReader: Unsupported file version.");
    87+    }
    88+    m_file >> m_stop_height;
    89+    for (uint32_t index = 1; index <= m_stop_height; index++) {
    


    l0rinc commented at 8:16 am on December 5, 2025:
    any reason for starting indexes at 1? nit: ++index usage is more common

    rustaceanrob commented at 2:55 pm on December 12, 2025:
    Side-effect of a past commit. Fixed.
  13. in src/swiftsync.h:33 in fcdce16cf1
    28+ * subtracting according to if the outpoint was added or spent.
    29+ * */
    30+class Aggregate
    31+{
    32+private:
    33+    uint64_t m_limb0{}, m_limb1{}, m_limb2{}, m_limb3{};
    


    l0rinc commented at 8:18 am on December 5, 2025:
    any reason for storing 64 bit chunks instead of e.g. arith_uint256?
  14. in src/swiftsync.h:72 in fcdce16cf1 outdated
    71+class HintsfileReader
    72+{
    73+private:
    74+    AutoFile m_file;
    75+    uint32_t m_stop_height;
    76+    std::unordered_map<uint32_t, uint64_t> m_height_to_file_pos;
    


    l0rinc commented at 8:19 am on December 5, 2025:
    nit: file pos could be size_t instead
  15. in src/swiftsync.h:84 in fcdce16cf1 outdated
    83+        std::ignore = m_file.fclose();
    84+    }
    85+    /** Read the hints for the specified block height. */
    86+    std::vector<uint64_t> ReadBlock(const uint32_t& height);
    87+    /** The height this file encodes up to. */
    88+    uint32_t StopHeight() { return m_stop_height; };
    


    l0rinc commented at 8:19 am on December 5, 2025:
    what should happen when -stopatheight is smaller than this target? We can’t stop before the swiftsync target…
  16. in src/validation.cpp:2595 in fcdce16cf1 outdated
    2564@@ -2535,14 +2565,18 @@ bool Chainstate::ConnectBlock(const CBlock& block, BlockValidationState& state,
    2565     int nInputs = 0;
    2566     int64_t nSigOpsCost = 0;
    2567     blockundo.vtxundo.reserve(block.vtx.size() - 1);
    2568+    std::optional<swiftsync::BlockHints> hints;
    


    l0rinc commented at 8:22 am on December 5, 2025:
    hmmm, when swiftsync isn’t active we shouldn’t have dangling state available in validation.cpp - it’s already huge, can we make sure all of the state is encapsulated? This bothers me the most, I don’t want the state sprinkled around (like we have with AssumeUTXO), it should be easy to disregard it when it’s not active.
  17. in src/init.cpp:1388 in fcdce16cf1 outdated
    1383@@ -1377,6 +1384,10 @@ static ChainstateLoadResult InitAndLoadChainstate(
    1384         }
    1385     };
    1386     auto [status, error] = catch_exceptions([&] { return LoadChainstate(chainman, cache_sizes, options); });
    1387+    if (utxo_hints.has_value()) {
    1388+        LogInfo("Applying UTXO hints from file");
    


    l0rinc commented at 1:00 pm on December 9, 2025:
    we could log the file path here.
  18. in src/init.cpp:1808 in fcdce16cf1 outdated
    1800@@ -1790,11 +1801,34 @@ bool AppInitMain(NodeContext& node, interfaces::BlockAndHeaderTipInfo* tip_info)
    1801     bool do_reindex{args.GetBoolArg("-reindex", false)};
    1802     const bool do_reindex_chainstate{args.GetBoolArg("-reindex-chainstate", false)};
    1803 
    1804+    std::optional<swiftsync::HintsfileReader> utxo_hints;
    1805+    if (args.IsArgSet("-utxohints")) {
    1806+        fs::path path = fs::absolute(args.GetPathArg("-utxohints"));
    1807+        if (!fs::exists(path)) {
    1808+            LogError("Provided UTXO file does not exist.");
    


    l0rinc commented at 1:01 pm on December 9, 2025:
    same, it would be helpful for these errors to show the invalid path
  19. in src/test/swiftsync_tests.cpp:18 in fcdce16cf1
    13+Txid txid_1{Txid::FromHex("bd0f71c1d5e50589063e134fad22053cdae5ab2320db5bf5e540198b0b5a4e69").value()};
    14+Txid txid_2{Txid::FromHex("b4749f017444b051c44dfd2720e88f314ff94f3dd6d56d40ef65854fcd7fff6b").value()};
    15+Txid txid_3{Txid::FromHex("ee707be5201160e32c4fc715bec227d1aeea5940fb4295605e7373edce3b1a93").value()};
    16+COutPoint outpoint_1{COutPoint(txid_1, 2142)};
    17+COutPoint outpoint_2{COutPoint(txid_2, 99328)};
    18+COutPoint outpoint_3{COutPoint(txid_3, 5438584)};
    


    l0rinc commented at 1:06 pm on December 9, 2025:
    COutPoint outpoint_1{txid_1, 2142};
    COutPoint outpoint_2{txid_2, 99328};
    COutPoint outpoint_3{txid_3, 5438584};
    
  20. in src/test/swiftsync_tests.cpp:14 in fcdce16cf1 outdated
    10+#include <cstdio>
    11+
    12+namespace {
    13+Txid txid_1{Txid::FromHex("bd0f71c1d5e50589063e134fad22053cdae5ab2320db5bf5e540198b0b5a4e69").value()};
    14+Txid txid_2{Txid::FromHex("b4749f017444b051c44dfd2720e88f314ff94f3dd6d56d40ef65854fcd7fff6b").value()};
    15+Txid txid_3{Txid::FromHex("ee707be5201160e32c4fc715bec227d1aeea5940fb4295605e7373edce3b1a93").value()};
    


    l0rinc commented at 1:35 pm on December 9, 2025:

    hmmm, the value will fail during runtime for invalid values, we could add a

    consteval explicit transaction_identifier(std::string_view hex_str) : m_wrapped{uint256{hex_str}} {}
    

    constructor which would allow us to have compile-time checked values as:

    Txid txid_1{"bd0f71c1d5e50589063e134fad22053cdae5ab2320db5bf5e540198b0b5a4e69"};
    Txid txid_2{"b4749f017444b051c44dfd2720e88f314ff94f3dd6d56d40ef65854fcd7fff6b"};
    Txid txid_3{"ee707be5201160e32c4fc715bec227d1aeea5940fb4295605e7373edce3b1a93"};
    

    rustaceanrob commented at 1:03 pm on December 15, 2025:
    PR’d #34063
  21. in src/swiftsync.cpp:74 in fcdce16cf1 outdated
    76+HintsfileReader::HintsfileReader(AutoFile& file) : m_file(file.release())
    77+{
    78+    std::array<uint8_t, 4> magic{};
    79+    m_file >> magic;
    80+    if (magic != FILE_MAGIC) {
    81+        throw std::ios_base::failure("HintsfileReader: This is not a hint file.");
    


    l0rinc commented at 1:46 pm on December 9, 2025:
    these failures could be exercised in the provided unit test

    brunoerg commented at 7:53 pm on December 11, 2025:

    these failures could be exercised in the provided unit test

    +1. Some unkilled mutants for obvious lack of testing:

    --- a/src/swiftsync.cpp
    +++ b/muts-pr-34004-swiftsync-cpp/swiftsync.mutant.25.cpp
    @@ -82,7 +82,7 @@ HintsfileReader::HintsfileReader(AutoFile& file) : m_file(file.release())
         }
         uint8_t version{};
         m_file >> version;
    -    if (version != FILE_VERSION) {
    +    if (1==0) {
             throw std::ios_base::failure("HintsfileReader: Unsupported file version.");
         }
         m_file >> m_stop_height
    
    diff --git a/src/swiftsync.cpp b/muts-pr-34004-swiftsync-cpp/swiftsync.mutant.22.cpp
    index bd028031f0..f0e04a9014 100644
    --- a/src/swiftsync.cpp
    +++ b/muts-pr-34004-swiftsync-cpp/swiftsync.mutant.22.cpp
    @@ -77,7 +77,7 @@ HintsfileReader::HintsfileReader(AutoFile& file) : m_file(file.release())
     {
         std::array<uint8_t, 4> magic{};
         m_file >> magic;
    -    if (magic != FILE_MAGIC) {
    +    if (1==0) {
             throw std::ios_base::failure("HintsfileReader: This is not a hint file.");
         }
    
  22. in src/swiftsync.cpp:105 in fcdce16cf1 outdated
    126+
    127+BlockHints Context::ReadBlockHints(const int& nHeight)
    128+{
    129+    BlockHints hints{m_hint_reader->ReadBlock(nHeight)};
    130+    return hints;
    131+}
    


    l0rinc commented at 1:47 pm on December 9, 2025:
    These don’t seem to be exercised in the unit test
  23. in src/swiftsync.h:60 in fcdce16cf1
    55+public:
    56+    // Create a new hint file writer that will encode `preallocate` number of blocks.
    57+    HintsfileWriter(AutoFile& file, const uint32_t& preallocate);
    58+    ~HintsfileWriter()
    59+    {
    60+        std::ignore = m_file.fclose();
    


    l0rinc commented at 1:52 pm on December 9, 2025:

    hah, I didn’t know about this trick. We usually cast to void in these cases:

            (void)m_file.fclose();
    

    rustaceanrob commented at 2:56 pm on December 12, 2025:
    Changed to void cast for style consistency
  24. in src/swiftsync.h:59 in fcdce16cf1 outdated
    58+    ~HintsfileWriter()
    59+    {
    60+        std::ignore = m_file.fclose();
    61+    }
    62+    // Write the next hints to file.
    63+    bool WriteNextUnspents(const std::vector<uint64_t>& unspent_offsets, const uint32_t& height);
    


    l0rinc commented at 1:52 pm on December 9, 2025:
    instead of const vector reference we could likely pass std::span<uint64_t> instead
  25. in src/swiftsync.h:98 in fcdce16cf1 outdated
     97+    std::unordered_set<uint64_t> m_unspent_outputs_index;
     98+    uint64_t m_index{};
     99+
    100+public:
    101+    BlockHints(const std::vector<uint64_t>& unspent_offsets);
    102+    bool IsCurrOutputUnspent() { return m_unspent_outputs_index.contains(m_index); };
    


    l0rinc commented at 1:55 pm on December 9, 2025:

    Could be marked const and noexcept:

        bool IsCurrOutputUnspent() const noexcept { return m_unspent_outputs_index.contains(m_index); };
    

    note: this also seems untested

  26. in src/swiftsync.h:27 in fcdce16cf1
    22+ * This class is intentionally left opaque, as internal changes may occur,
    23+ * but all aggregates will have the concept of "adding" and "spending" an
    24+ * outpoint.
    25+ *
    26+ * The current implementation uses a salted SHA-256 hash and updates two
    27+ * 64-bit integers by taking the first 16 bytes of the hash and adding or
    


    l0rinc commented at 1:57 pm on December 9, 2025:
     * The current implementation uses a salted SHA-256 hash and updates
     * four 64-bit integers by taking the first 16 bytes of the hash and adding or
    
  27. in src/swiftsync.cpp:44 in fcdce16cf1
    39+    auto hash = (HashWriter(m_salted_hasher) << outpoint).GetSHA256();
    40+    m_limb0 -= hash.GetUint64(0);
    41+    m_limb1 -= hash.GetUint64(1);
    42+    m_limb2 -= hash.GetUint64(2);
    43+    m_limb3 -= hash.GetUint64(3);
    44+}
    


    l0rinc commented at 2:07 pm on December 9, 2025:

    The choice of 4, 64-bit limbs with no carries between limbs was made to: 1. minimize code complexity

    Wouldn’t a 256-bit group with carry propagation retain associativity, commutativity, identity (zero), inverses (negation), etc? The aggregate “add all created, subtract all spent, check for zero” works in either, I think we could simplify to:

    class Aggregate
    {
        arith_uint256 m_state{};
        HashWriter m_salted_hasher{};

    public:
        Aggregate();
        bool IsZero() const { return m_state == arith_uint256{}; }
        void Create(const COutPoint& outpoint);
        void Spend(const COutPoint& outpoint);
    };
    

    and

    void Aggregate::Create(const COutPoint& outpoint)
    {
        m_state += UintToArith256((HashWriter(m_salted_hasher) << outpoint).GetSHA256());
    }

    void Aggregate::Spend(const COutPoint& outpoint)
    {
        m_state -= UintToArith256((HashWriter(m_salted_hasher) << outpoint).GetSHA256());
    }
    8}
    

    It may perform a bit more calculations, but it’s a lot simpler and I doubt this will be the bottleneck.

    note: If subtraction is called Spend, this should rather be called: Create

  28. in src/swiftsync.h:97 in fcdce16cf1 outdated
     96+private:
     97+    std::unordered_set<uint64_t> m_unspent_outputs_index;
     98+    uint64_t m_index{};
     99+
    100+public:
    101+    BlockHints(const std::vector<uint64_t>& unspent_offsets);
    


    l0rinc commented at 2:14 pm on December 9, 2025:
    it’s not intuitive to convert a vector to BlockHints, can we make the constructor explicit?
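
    The effect of marking the constructor explicit can be sketched in isolation. ImplicitHints and ExplicitHints are hypothetical stand-ins, not the BlockHints class from the branch, and they store the offsets as given instead of decoding them:

```cpp
#include <cstdint>
#include <type_traits>
#include <unordered_set>
#include <vector>

// Without `explicit`, a vector silently converts to the hints type,
// e.g. when passed to a function that expects ImplicitHints.
class ImplicitHints
{
    std::unordered_set<uint64_t> m_unspent;

public:
    ImplicitHints(const std::vector<uint64_t>& offsets)
        : m_unspent(offsets.begin(), offsets.end()) {}
    bool Contains(uint64_t index) const { return m_unspent.count(index) != 0; }
};

// With `explicit`, the caller has to spell out the construction.
class ExplicitHints
{
    std::unordered_set<uint64_t> m_unspent;

public:
    explicit ExplicitHints(const std::vector<uint64_t>& offsets)
        : m_unspent(offsets.begin(), offsets.end()) {}
    bool Contains(uint64_t index) const { return m_unspent.count(index) != 0; }
};

// The difference is visible at compile time.
static_assert(std::is_convertible<std::vector<uint64_t>, ImplicitHints>::value,
              "implicit conversion is allowed");
static_assert(!std::is_convertible<std::vector<uint64_t>, ExplicitHints>::value,
              "implicit conversion is rejected");
```

    Direct construction such as ExplicitHints hints{offsets}; still compiles; only the silent vector-to-hints conversion is ruled out.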
  29. in src/swiftsync.h:93 in fcdce16cf1 outdated
    92+ * Stateful data structure that walks over the outputs of a block, determining the spent-ness.
    93+ */
    94+class BlockHints
    95+{
    96+private:
    97+    std::unordered_set<uint64_t> m_unspent_outputs_index;
    


    l0rinc commented at 2:18 pm on December 9, 2025:
    Do I understand correctly that instead of a compressed bitset, signalling which of the ordered transactions was (un)spent, we’re storing the unspent txs explicitly as a set? I understand that we expect old blocks to be mostly spent, but I have a hard time believing that’s the most optimal way to store and retrieve this information. It seems to me we could have a layered approach - first a bitset showing which blocks are completely spent/unspent and for the remaining ones we can likely store them in a compressed/sparse bitset (skipping uniform values automatically). I understand this is “just” the first implementation, but the file format should preferably be stable.

    rustaceanrob commented at 10:05 am on December 17, 2025:
    See comment, I will investigate this in the coming weeks.
  30. in src/swiftsync.h:118 in fcdce16cf1 outdated
    117+    /** Apply the hints from reader to this context. */
    118+    void ApplyHints(HintsfileReader reader);
    119+    /** The entire block history must be aggregated for accelerated IBD. */
    120+    void StartingFromGenesis() { m_is_starting_from_genesis = true; };
    121+    /** Accelerated IBD has completed. */
    122+    void Completed() { m_is_complete = true; };
    


    l0rinc commented at 2:20 pm on December 9, 2025:

    The past tense is confusing regarding its side-effect

        void Complete() { m_is_complete = true; };
    

    note: we should assert this was started and not completed as a sanity check

  31. in src/swiftsync.h:116 in fcdce16cf1 outdated
    115+    Aggregate m_aggregate{};
    116+    Context() = default;
    117+    /** Apply the hints from reader to this context. */
    118+    void ApplyHints(HintsfileReader reader);
    119+    /** The entire block history must be aggregated for accelerated IBD. */
    120+    void StartingFromGenesis() { m_is_starting_from_genesis = true; };
    


    l0rinc commented at 2:20 pm on December 9, 2025:

    Same as below:

        void StartFromGenesis() { m_is_starting_from_genesis = true; };
    

    note: we should assert this isn’t the case already as a sanity check

  32. in src/validation.cpp:2059 in fcdce16cf1 outdated
    2026@@ -2026,6 +2027,33 @@ void UpdateCoins(const CTransaction& tx, CCoinsViewCache& inputs, CTxUndo &txund
    2027     AddCoins(inputs, tx, nHeight);
    2028 }
    2029 
    2030+void UpdateCoinsWithHints(const CTransaction& tx, CCoinsViewCache& inputs, const CBlockIndex& pindex, swiftsync::Aggregate& agg, swiftsync::BlockHints& hints)
    2031+{
    2032+    const bool is_coinbase = tx.IsCoinBase();
    2033+    if (is_coinbase && IsBIP30Unspendable(pindex.GetBlockHash(), pindex.nHeight)) {
    


    l0rinc commented at 2:23 pm on December 9, 2025:
    My understanding is that we can only run this in the assumevalid range - would it help if we buried the bip30 checks behind assumevalid as well (as attempted in #33817), or is it orthogonal since they already invalidate the addition/subtraction and need to be explicitly checked anyway?
  33. in src/validation.cpp:2077 in fcdce16cf1 outdated
    2046+    for (uint64_t index = 0; index < tx.vout.size(); index++) {
    2047+        COutPoint outpoint = COutPoint(txid, index);
    2048+        if (!hints.IsCurrOutputUnspent() && !tx.vout[index].scriptPubKey.IsUnspendable()) {
    2049+            agg.Add(outpoint);
    2050+        } else {
    2051+            inputs.AddCoin(outpoint, Coin(tx.vout[index], pindex.nHeight, is_coinbase), is_coinbase);
    


    l0rinc commented at 2:27 pm on December 9, 2025:
    this should likely be EmplaceCoinInternalDANGER instead
  34. in src/validation.cpp:2048 in fcdce16cf1 outdated
    2043+        }
    2044+    }
    2045+    const Txid& txid{tx.GetHash()};
    2046+    for (uint64_t index = 0; index < tx.vout.size(); index++) {
    2047+        COutPoint outpoint = COutPoint(txid, index);
    2048+        if (!hints.IsCurrOutputUnspent() && !tx.vout[index].scriptPubKey.IsUnspendable()) {
    


    l0rinc commented at 2:37 pm on December 9, 2025:
    why are we calling IsUnspendable just to return immediately in AddCoin? We could skip these at the beginning of the loop for both branches. We should also skip this from the hints file of course?
  35. in src/validation.cpp:2034 in fcdce16cf1 outdated
    2026@@ -2026,6 +2027,33 @@ void UpdateCoins(const CTransaction& tx, CCoinsViewCache& inputs, CTxUndo &txund
    2027     AddCoins(inputs, tx, nHeight);
    2028 }
    2029 
    2030+void UpdateCoinsWithHints(const CTransaction& tx, CCoinsViewCache& inputs, const CBlockIndex& pindex, swiftsync::Aggregate& agg, swiftsync::BlockHints& hints)
    2031+{
    2032+    const bool is_coinbase = tx.IsCoinBase();
    2033+    if (is_coinbase && IsBIP30Unspendable(pindex.GetBlockHash(), pindex.nHeight)) {
    2034+        for (uint64_t index = 0; index < tx.vout.size(); index++) {
    


    l0rinc commented at 2:42 pm on December 9, 2025:
    nit: would prefer using brace init and ++ consistently throughout the PR
  36. in src/validation.cpp:2658 in fcdce16cf1
    2654+            UpdateCoins(tx, view, i == 0 ? undoDummy : blockundo.vtxundo.back(), pindex->nHeight);
    2655+        }
    2656+    }
    2657+    if (swiftsync_active && m_swiftsync_ctx.StopHeight() == (uint32_t)pindex->nHeight) {
    2658+        m_swiftsync_ctx.Completed();
    2659+        if (m_swiftsync_ctx.m_aggregate.IsZero()) {
    


    l0rinc commented at 2:43 pm on December 9, 2025:

    IsZero requires context to understand why it means “valid.” What we’re checking is: do all created outputs equal all spent inputs. Consider tracking created and spent sums separately, then comparing for equality:

    bool IsBalanced() const { return m_created == m_spent; }
    

    This makes the invariant self-evident: all created outputs must equal all spent inputs.

  37. in src/swiftsync.cpp:26 in fcdce16cf1
    21+Aggregate::Aggregate()
    22+{
    23+    std::span<unsigned char> salt;
    24+    GetStrongRandBytes(salt);
    25+    m_salted_hasher.write(std::as_writable_bytes(salt));
    26+}
    


    l0rinc commented at 2:50 pm on December 9, 2025:

    we already have a dedicated SaltedOutpointHasher which just xors all four 64 bit chunks: https://github.com/bitcoin/bitcoin/blob/bdb8eadcdc193f398ebad83911d3297b5257e721/src/crypto/siphash.cpp#L177

    Wouldn’t that suffice (given that it’s what we’re already using for the map and we don’t expect it to have any collisions in our case)? And even if it would collide, SipHash basically guarantees that it can’t be weaponized since these inputs already have to satisfy an extremely high bar, we can’t just easily grind these values, especially since it’s salted.

    It would also allow storing the created and spent as

    class Aggregate
    {
        arith_uint256 m_created{};
        arith_uint256 m_spent{};
        SaltedOutpointHasher m_hasher{};

    public:
        bool IsBalanced() const { return m_created == m_spent; }
        void Create(const COutPoint& outpoint) { m_created += m_hasher(outpoint); }
        void Spend(const COutPoint& outpoint) { m_spent += m_hasher(outpoint); }
    };
    

    and this way the created and spent sums couldn’t even overflow (which in itself is a collision threat).


    rustaceanrob commented at 3:01 pm on December 12, 2025:
    I have used this exact suggestion and added you as a commit co-author. Indeed, the salt is the primary defense against grinding here. I was not sure and discussed out of band, but I think siphash should be faster. As far as the “balancing” concept, this avoids having to invert a field element (subtracting an element is addition of the inverse), so I think this is strictly an improvement. Thanks for the suggestion. I was not aware of arith_uint256 prior to your review.

    l0rinc commented at 10:06 pm on December 12, 2025:

    added you as a commit co-author

    You need to add a coauthor with a fixed format, e.g. https://github.com/bitcoin/bitcoin/commit/8e4c66d0a7a0911c10dced0d6dd60ca7bd9545af

  38. in src/rpc/blockchain.cpp:42 in fcdce16cf1 outdated
    38@@ -39,6 +39,7 @@
    39 #include <script/descriptor.h>
    40 #include <serialize.h>
    41 #include <streams.h>
    42+#include <swiftsync.h>
    


    l0rinc commented at 2:57 pm on December 9, 2025:
    so when are we calling it swift and when accelerated?

    rustaceanrob commented at 3:01 pm on December 12, 2025:
    Changed the “accelerated” log to “SwiftSync” for consistency
  39. in src/test/swiftsync_tests.cpp:55 in fcdce16cf1
    50+        BOOST_CHECK(writer.WriteNextUnspents(unspent_block_2, 2));
    51+    }
    52+    FILE* file{fsbridge::fopen(temppath, "rb")};
    53+    AutoFile afile{file};
    54+    swiftsync::HintsfileReader reader{afile};
    55+    BOOST_CHECK(reader.StopHeight() == 4);
    


    l0rinc commented at 2:58 pm on December 9, 2025:
    you could use BOOST_CHECK_EQUAL for primitive comparison, it gives better errors
  40. in src/test/swiftsync_tests.cpp:46 in fcdce16cf1 outdated
    42+    const fs::path temppath = fsbridge::AbsPathJoin(gArgs.GetDataDirNet(), fs::u8path("test.hints"));
    43+    {
    44+        FILE* file{fsbridge::fopen(temppath, "wb")};
    45+        AutoFile afile{file};
    46+        swiftsync::HintsfileWriter writer{afile, 4};
    47+        BOOST_CHECK(writer.WriteNextUnspents(unspent_block_1, 1));
    


    l0rinc commented at 3:06 pm on December 9, 2025:
    if this is still the setup phase of the test, it could be a BOOST_REQUIRE
  41. in src/swiftsync.cpp:42 in fcdce16cf1 outdated
    63+    m_file.seek(cursor, SEEK_SET);
    64+    m_file << height;
    65+    m_file << curr_pos;
    66+    // Next append the positions of the unspent offsets in the block at this height.
    67+    m_file.seek(curr_pos, SEEK_SET);
    68+    WriteCompactSize(m_file, unspent_offsets.size());
    


    l0rinc commented at 3:14 pm on December 9, 2025:
    CompactSize is wasteful for values of 253 or greater, which take at least 3 bytes. Consider WriteVarInt for cases where a significant portion of the values need more than 1 byte
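To make the size trade-off concrete, here is a small sketch of the encoded lengths, written from my understanding of the CompactSize and VarInt formats in Bitcoin Core's `serialize.h`. The helper names `CompactSizeBytes` and `VarIntBytes` are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Encoded length of a CompactSize: 1 byte below 253, otherwise a 1-byte
// marker followed by a 2-, 4-, or 8-byte integer.
constexpr std::size_t CompactSizeBytes(std::uint64_t n)
{
    if (n < 253) return 1;
    if (n <= 0xFFFF) return 3;
    if (n <= 0xFFFFFFFF) return 5;
    return 9;
}

// Encoded length of a base-128 VarInt: 7 payload bits per byte, with one
// subtracted per continuation byte so every value has a unique encoding.
constexpr std::size_t VarIntBytes(std::uint64_t n)
{
    std::size_t bytes = 1;
    while (n > 0x7F) {
        n = (n >> 7) - 1;
        ++bytes;
    }
    return bytes;
}
```

Under these rules VarInt is one byte shorter for values from 253 to 16511, but one byte longer for values from 128 to 252, so which encoding wins depends on the actual distribution of offsets.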
  42. in src/swiftsync.cpp:82 in fcdce16cf1 outdated
    103+    uint64_t num_unspents = ReadCompactSize(m_file);
    104+    std::vector<uint64_t> offsets{};
    105+    offsets.reserve(num_unspents);
    106+    for (uint64_t i = 0; i < num_unspents; i++) {
    107+        offsets.push_back(ReadCompactSize(m_file));
    108+    }
    


    l0rinc commented at 3:20 pm on December 9, 2025:
    we’re currently “decompressing” the data during read, instead of during access - it should be possible to read the whole file data into the vector without interpreting it and decipher them when the data is needed. Please see the way we’ve implemented it in https://github.com/l0rinc/bitcoin/pull/11/commits/8398cbc72c55047467bfef687582d89b1627793c#diff-c754d7c547c3523356e9a430b5448ba384fe135cf3f83508a7d0384d30c9d2d0R45-R54
  43. l0rinc changes_requested
  44. l0rinc commented at 3:40 pm on December 9, 2025: contributor

    I went through the change once to get a feel for it, left a ton of questions and suggestions. Many of my suggestions apply to many places of the PR - I didn’t comment on each occurrence. Some are very low-level, some very high - please put them into context. As far as I’m concerned there are a few blockers here - it should indeed be a draft for now.

    First, the storage assumes sparse blocks throughout the mainchain history - which is unlikely to be the case, I think we could achieve better storage if we used a mixed storage mechanism based on which is cheaper: an index-to-boolean map for each unspent tx or a bitset of the complete block. Given how easy it is to implement a compressed sparse bitset, I think we should investigate that. Could you plot the blockchain by height vs unspent tx for us?
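A rough sketch of the per-block choice described above, under a deliberately simple cost model: a fixed number of bytes per listed index versus one bit per output. `BlockEncoding`, `BitsetBytes`, `IndexListBytes`, and `CheaperEncoding` are hypothetical names, and the default of 2 bytes per index is an assumption, not a measured value.

```cpp
#include <cstddef>
#include <vector>

enum class BlockEncoding { IndexList, Bitset };

// One bit per output in the block, rounded up to whole bytes.
inline std::size_t BitsetBytes(std::size_t num_outputs)
{
    return (num_outputs + 7) / 8;
}

// Simplified cost of listing each unspent index explicitly.
inline std::size_t IndexListBytes(std::size_t num_unspent, std::size_t bytes_per_index)
{
    return num_unspent * bytes_per_index;
}

// Pick whichever representation is smaller for this block.
inline BlockEncoding CheaperEncoding(const std::vector<bool>& unspent, std::size_t bytes_per_index = 2)
{
    std::size_t num_unspent = 0;
    for (const bool is_unspent : unspent) {
        if (is_unspent) ++num_unspent;
    }
    return IndexListBytes(num_unspent, bytes_per_index) < BitsetBytes(unspent.size())
               ? BlockEncoding::IndexList
               : BlockEncoding::Bitset;
}
```

A real format would also need a per-block tag recording which encoding was chosen, and that overhead is not counted here.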

    It also bothers me greatly that this can only seed pruned nodes - it limits its use a lot, it’s hard to argue what this provides over current AssumeUTXO. Is it possible to carve out features from this change and progressively introduce chunks into the codebase - e.g. introduce hints file but use it to optimize the current storage to warm the cache based on whether we expect those to end up on disk or not. It would be a pure optimization without any validation change - in worst case it would just make the code slower. And it would already get our foot in the door - I don’t expect this change to be merged soon, we should expect pushback.

    In its current state even I have strong objections against it: the logic is spread around the validation code, I want to be able to review this in isolation instead of sprinkled around consensus code. As @sedited also mentioned in a deleted comment: could we separate the gist into a coins cache like @andrewtoth did in #31132 to encapsulate its effect? We should expect some refactoring changes before this PR.

  45. rustaceanrob commented at 2:18 pm on December 10, 2025: contributor
    Thanks a lot for the thorough review. I will respond both high level and to review comments over the coming days.
  46. in src/validation.cpp:2643 in fcdce16cf1 outdated
    2612@@ -2579,13 +2613,15 @@ bool Chainstate::ConnectBlock(const CBlock& block, BlockValidationState& state,
    2613         // * legacy (always)
    2614         // * p2sh (when P2SH enabled in flags and excludes coinbase)
    2615         // * witness (when witness enabled in flags and excludes coinbase)
    2616-        nSigOpsCost += GetTransactionSigOpCost(tx, view, flags);
    2617-        if (nSigOpsCost > MAX_BLOCK_SIGOPS_COST) {
    2618-            state.Invalid(BlockValidationResult::BLOCK_CONSENSUS, "bad-blk-sigops", "too many sigops");
    2619-            break;
    2620+        if (!swiftsync_active) {
    


    brunoerg commented at 7:58 pm on December 11, 2025:

    Unkilled mutant:

    diff --git a/src/validation.cpp b/muts-pr-34004-validation-cpp/validation.mutant.16.cpp
    index 3d84576e7d..efc5037570 100644
    --- a/src/validation.cpp
    +++ b/muts-pr-34004-validation-cpp/validation.mutant.16.cpp
    @@ -2613,7 +2613,7 @@ bool Chainstate::ConnectBlock(const CBlock& block, BlockValidationState& state,
             // * legacy (always)
             // * p2sh (when P2SH enabled in flags and excludes coinbase)
             // * witness (when witness enabled in flags and excludes coinbase)
    -        if (!swiftsync_active) {
    +        if (1==0) {
                 nSigOpsCost += GetTransactionSigOpCost(tx, view, flags);
                 if (nSigOpsCost > MAX_BLOCK_SIGOPS_COST) {
                     state.Invalid(BlockValidationResult::BLOCK_CONSENSUS, "bad-blk-sigops", "too many sigops");
    @@ -6571,4 +6571,4 @@ std::pair<int, int> ChainstateManager::GetPruneRange(const Chainstate& chainstat
         int prune_end = std::min(last_height_can_prune, max_prune);
     
         return {prune_start, prune_end};
    
  47. in src/init.cpp:1810 in fcdce16cf1 outdated
    1800@@ -1790,11 +1801,34 @@ bool AppInitMain(NodeContext& node, interfaces::BlockAndHeaderTipInfo* tip_info)
    1801     bool do_reindex{args.GetBoolArg("-reindex", false)};
    1802     const bool do_reindex_chainstate{args.GetBoolArg("-reindex-chainstate", false)};
    1803 
    1804+    std::optional<swiftsync::HintsfileReader> utxo_hints;
    1805+    if (args.IsArgSet("-utxohints")) {
    1806+        fs::path path = fs::absolute(args.GetPathArg("-utxohints"));
    1807+        if (!fs::exists(path)) {
    


    brunoerg commented at 8:06 pm on December 11, 2025:

    We could exercise these errors (file doesn’t exist and open failure) in the functional tests. An (obvious) unkilled mutant:

    diff --git a/src/init.cpp b/muts-pr-34004-init-cpp/init.mutant.1.cpp
    index fb7336f9ae..59de0f6b7a 100644
    --- a/src/init.cpp
    +++ b/muts-pr-34004-init-cpp/init.mutant.1.cpp
    @@ -1812,7 +1812,7 @@ bool AppInitMain(NodeContext& node, interfaces::BlockAndHeaderTipInfo* tip_info)
             AutoFile afile{file};
             if (afile.IsNull()) {
                 LogError("Failed to open UTXO hint file.");
    -            return false;
    +            return true;
             }
    
  48. in src/rpc/blockchain.cpp:3481 in fcdce16cf1 outdated
    3480+        }
    3481+        FlatFilePos file_pos = curr->GetBlockPos();
    3482+        std::unique_ptr<CBlock> pblock = std::make_unique<CBlock>();
    3483+        bool read = node.chainman->m_blockman.ReadBlock(*pblock, file_pos, curr->GetBlockHash());
    3484+        if (!read) {
    3485+            throw JSONRPCError(RPC_DATABASE_ERROR, "Block could not be read from disk.");
    


    brunoerg commented at 8:33 pm on December 11, 2025:
    This error isn’t covered by any test and could be addressed in feature_swiftsync. Perhaps by deleting a block file before calling the generatetxohints RPC.
  49. brunoerg commented at 9:34 pm on December 11, 2025: contributor
    I’ve run a mutation analysis on this PR. Unkilled mutants should be killed (covered) by a unit or functional test (if it makes sense). However, I noticed that there are parts of the code that don’t even have test coverage (corecheck.dev would have shown it but did not generate the report for this PR) and should have it. As soon as more tests are added to this PR, I can re-run the analysis.
  50. in test/functional/feature_swiftsync.py:42 in fcdce16cf1 outdated
    37+        ## Coinbase outputs are treated differently by the SwiftSync protocol as their inputs are ignored.
    38+        ## To ensure the hash aggregate is working correctly, we also create non-coinbase transactions.
    39+        self.log.info(f"Generating {BASE_BLOCKS} blocks to a source node")
    40+        self.generate(mini_wallet, BASE_BLOCKS, sync_fun=self.no_op)
    41+        self.log.info(f"Sending {NUM_SWIFTSYNC_BLOCKS} self transfers")
    42+        for i in range(NUM_SWIFTSYNC_BLOCKS):
    


    brunoerg commented at 9:35 pm on December 11, 2025:

    nit:

            for _ in range(NUM_SWIFTSYNC_BLOCKS):
    
  51. in src/swiftsync.h:39 in fcdce16cf1
    34+    HashWriter m_salted_hasher{};
    35+
    36+public:
    37+    Aggregate();
    38+    /** Is the internal state zero, representing the empty set. */
    39+    bool IsZero() const { return m_limb0 == 0 && m_limb1 == 0 && m_limb2 == 0 && m_limb3 == 0; }
    


    brunoerg commented at 10:23 pm on December 11, 2025:

    Unkilled mutant:

    diff --git a/src/swiftsync.h b/muts-pr-34004-swiftsync-h/swiftsync.mutant.0.h
    index 2c60e96895..ae229d18d8 100644
    --- a/src/swiftsync.h
    +++ b/muts-pr-34004-swiftsync-h/swiftsync.mutant.0.h
    @@ -36,7 +36,7 @@ private:
     public:
         Aggregate();
         /** Is the internal state zero, representing the empty set. */
    -    bool IsZero() const { return m_limb0 == 0 && m_limb1 == 0 && m_limb2 == 0 && m_limb3 == 0; }
    +    bool IsZero() const { return m_limb0 == 0 || m_limb1 == 0 && m_limb2 == 0 && m_limb3 == 0; }
         /** Add an outpoint created in a block. */
         void Add(const COutPoint& outpoint);
         /** Spend an outpoint used in a block. */
    @@ -132,4 +132,4 @@ public:
         uint32_t StopHeight() { return m_hint_reader->StopHeight(); };
     };
    
  52. in src/rpc/blockchain.cpp:3409 in fcdce16cf1 outdated
    3404+        "Build a file of hints for the state of the UTXO set at a particular height.\n"
    3405+        "The purpose of said hints is to allow clients performing initial block download"
    3406+        "to omit unnecessary disk I/O and CPU usage.\n"
    3407+        "The hint file is constructed by reading in blocks sequentially and determining what outputs"
    3408+        "will remain in the UTXO set. Network activity will be suspended during this process, and the"
    3409+        "hint file may take a few hours to build."
    


    DrahtBot commented at 8:28 am on December 12, 2025:

    LLM Linter (✨ experimental)

    Possible typos and grammar issues:

    "initial block download""to" -> "initial block download" "to" [Two adjacent string literals were concatenated without a space, producing "downloadto". Insert a space between them.]
    "determining what outputs""will" -> "determining what outputs" "will" [Two adjacent string literals were concatenated without a space, producing "outputswill". Insert a space between them.]
    "hint file may take a few hours to build.""Make" -> "hint file may take a few hours to build." "Make" [Two adjacent string literals were concatenated without a space, producing "build.Make". Insert a space or newline between them.]
    
  53. Eunovo commented at 9:11 am on December 12, 2025: contributor

    Since this SwiftSync implementation uses assumevalid assumptions, it makes sense to reason about attacks that apply to assumevalid here as well. What happens if an attacker gives a SwiftSync node a malicious hint file and eclipses the node so that it is denied an honest chain? Assumevalid uses:

    I think we need to include these checks as well when doing SwiftSync with assumevalid assumptions. Unless, for some reason, SwiftSync with assumevalid is not vulnerable to the same attacks.

  54. rustaceanrob force-pushed on Dec 12, 2025
  55. rustaceanrob referenced this in commit c62f93b631 on Dec 12, 2025
  56. rustaceanrob referenced this in commit 9b62dadd00 on Dec 13, 2025
  57. rustaceanrob referenced this in commit 5ac3579520 on Dec 13, 2025
  58. rustaceanrob commented at 1:09 pm on December 15, 2025: contributor

    It also bothers me greatly that this can only seed pruned nodes - it limits its use a lot, it’s hard to argue what this provides over current AssumeUTXO. Is it possible to carve out features from this change and progressively introduce chunks into the codebase - e.g. introduce hints file but use it to optimize the current storage to warm the cache based on whether we expect those to end up on disk or not

    I entertained this idea for a while. In the current format, the only thing we get is a guarantee some outputs should go straight to disk and minimize time in the cache. While this might not make things slower, I don’t think it will offer a significant improvement. One iteration on this idea is to use time-to-live values similar to the UTXO draft BIP. So, something like 10 represents this output will be spent soon and 110 means this output will be spent a bit later, etc. This would have no benefit to the SwiftSync protocol itself, but could potentially offer an improvement in cache efficiency.

    Regarding the pruned-only feature. This would not be the case if there was a P2P message for undo-data. It would allow undo-data to be written in an un-trusted state and later confirmed as valid with a check of IsBalanced. Not to mention it would allow for full validation. Perhaps downloading undo-data should be required regardless of script checks, as 1. the changes to validation.cpp should be minimal in terms of omitted amount checks (especially with #32317) 2. it would support pruned, assume-valid, and fully validating nodes with no discrepancies between the code-paths for each of these options. I think it would be worthwhile for me to demonstrate this approach on a separate PR.

    Regarding the current format of the hintsfile, AFAICT the linked Delving Bitcoin post explored the difference in encodings, and found essentially no difference between the bitset encoding and the run-lengths. Given that std::vector<bool> is an optimized data structure, it makes sense to opt for the 0/1 format. I have no strong opinion here.

  59. rustaceanrob force-pushed on Dec 15, 2025
  60. rustaceanrob force-pushed on Dec 16, 2025
  61. DrahtBot added the label Needs rebase on Dec 16, 2025
  62. rustaceanrob commented at 10:03 am on December 17, 2025: contributor

    found essentially no difference in the bitset encoding and using the run-length

    This turned out not to be true. Using bit-packing as shown in the current state of the PR, the encoding was far worse. The 0/1 encoding up to height 928000 resulted in a hintsfile of 438Mb, whereas the run-length encoding came in at 173Mb. I think 438Mb is not acceptable, so a hybrid approach is likely required. Constructing either file took around the same time. I will do a more thorough analysis of chain history to see when each encoding style is appropriate.
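For readers comparing the two encodings, here is a minimal sketch of both on a single block's flags. It assumes one plausible reading of "run-lengths" (the number of spent outputs preceding each unspent output); the PR's actual on-disk format may differ, and `PackBits`, `RunLengths`, and `FromRunLengths` are hypothetical helper names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// 0/1 encoding: one bit per output, least significant bit first.
inline std::vector<std::uint8_t> PackBits(const std::vector<bool>& unspent)
{
    std::vector<std::uint8_t> out((unspent.size() + 7) / 8, 0);
    for (std::size_t i = 0; i < unspent.size(); ++i) {
        if (unspent[i]) out[i / 8] |= static_cast<std::uint8_t>(1u << (i % 8));
    }
    return out;
}

// Run-length encoding: for each unspent output, the count of spent outputs
// since the previous unspent output (or since the start of the block).
inline std::vector<std::uint64_t> RunLengths(const std::vector<bool>& unspent)
{
    std::vector<std::uint64_t> runs;
    std::uint64_t gap = 0;
    for (const bool is_unspent : unspent) {
        if (is_unspent) {
            runs.push_back(gap);
            gap = 0;
        } else {
            ++gap;
        }
    }
    return runs;
}

// Inverse of RunLengths. num_outputs restores any trailing spent outputs.
// at() throws std::out_of_range if the runs do not fit in num_outputs.
inline std::vector<bool> FromRunLengths(const std::vector<std::uint64_t>& runs, std::size_t num_outputs)
{
    std::vector<bool> unspent(num_outputs, false);
    std::size_t pos = 0;
    for (const std::uint64_t gap : runs) {
        pos += static_cast<std::size_t>(gap);
        unspent.at(pos) = true;
        ++pos;
    }
    return unspent;
}
```

The bitset costs a fixed bit per output regardless of content, while the run-length list grows with the number of unspent outputs, which is consistent with one encoding winning on mostly-spent blocks and the other on mostly-unspent ones.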

  63. rustaceanrob force-pushed on Dec 17, 2025
  64. DrahtBot removed the label Needs rebase on Dec 17, 2025
  65. DrahtBot added the label CI failed on Dec 17, 2025
  66. fanquake referenced this in commit e5c600dc0e on Dec 17, 2025
  67. swiftsync: Add aggregate class
    This class allows for the comparison of two sets `A` and `B`. Elements
    of `A` are added to a wrapping integer, and elements of `B` are also
    added to a wrapping integer. If the two integers are equivalent, then
    sets `A` and `B` are equivalent. In this case we say they are
    "balanced".
    
    The client-side salt is required to prevent the generalized birthday
    attack, whereby an attacker may choose a targeted hash collision and
    reduce the search space by a square root of the original security.
    
    Co-authored-by: l0rinc <pap.lorinc@gmail.com>
    d370f6a706
  68. rustaceanrob force-pushed on Dec 17, 2025
  69. test: swiftsync: Add aggregate test
    Introduce a trivial example of creating and spending outpoints in
    arbitrary order.
    1ff59a1540
  70. swiftsync: Add hintfile reader/writer
    The set of all inputs ever created and the set of all outputs are not
    equivalent sets. Naturally, some outputs will be unspent. To update the
    aggregate correctly, we require "hints" as to what outputs will remain
    in the UTXO set at a particular height in the chain. These hints allow a
    client to verify `inputs = outputs - UTXOs` for a terminal block height.
    
    The hints are encoded using a bitset, where a `0` represents an output
    that will be spent, and a `1` represents an output that will be unspent.
    
    A header section is added so multiple blocks may be encoded
    concurrently. The header section denotes the block height and file
    position.
    27c4af3adb
  71. test: swiftsync: Add roundtrip hintfile read/write
    Assert that writing elements to file in an arbitrary order and reading
    them back in an arbitrary order does not fail.
    77b2518e99
  72. rpc: Generate UTXO set hints
    Generate hints for the location of UTXOs in each block. This RPC allows
    for a rollback of the chainstate to an arbitrary height. Currently, the
    hintfile is made in sequential order, but this may be done in parallel
    as a future improvement.
    a59210b111
  73. swiftsync: Add unified context class
    This combines the aggregate and hints into a single class for consumers
    to interact with. The SwiftSync protocol is only possible if: the entire
    block history is being downloaded from genesis, the protocol is not
    already completed, and there are hints available to use. If all of these
    conditions are met within the internal context, SwiftSync is possible.
    48b88f0f53
  74. init: Add hintfile path arg
    Allow a user to pass a file path to UTXO hints, failing for any errors
    encountered.
    
    Namely, undo-data cannot be written during this implementation of
    SwiftSync. A partial compensation for this is to only allow hints when
    using `-prune`, as undo-data will be deleted anyway.
    
    A SwiftSync context is applied to the active chainstate. If a hints file
    is present, it is passed to the active chainstate for use later.
    d019f9abe7
  75. validation: Use UTXO hints in `ConnectBlock`
    During the SwiftSync protocol, inputs are not fetch-able from the coins
    view. This means the following cannot be checked until the terminal
    state may be verified as valid:
    1. inputs do not attempt to spend more coins than available
    2. script validity
    3. sigops
    4. subsidy + fees = coinbase
    
    At the terminal SwiftSync height, we verify the history of transaction
    inputs and outputs that have been fetched indeed correspond to the
    resulting UTXO set. In the case of success, IBD continues as usual. In
    the case of a failure, the UTXO set present does _not_ correspond to the
    block history received. In this situation, the program should crash and
    indicate to the user that they may recover the correct UTXO set with a
    `-reindex`.
    
    Of note, writing undo-data is not possible until the UTXO set present on
    disk is verified as correct.
    ba76ef67fa
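The control flow described in this commit message can be sketched as a small state machine. This is an illustration under stated assumptions, not the PR's code: `ToySwiftSyncState`, `SwiftSyncStatus`, and `OnBlockConnected` are hypothetical names, and the two sums stand in for the aggregate.

```cpp
#include <cstdint>

enum class SwiftSyncStatus { InProgress, Verified, Mismatch, Inactive };

// Hypothetical stand-in for the SwiftSync context attached to a chainstate.
struct ToySwiftSyncState {
    std::uint32_t stop_height;
    std::uint64_t created_sum{0};
    std::uint64_t spent_sum{0};
    bool completed{false};
};

// Called after each block is connected.
inline SwiftSyncStatus OnBlockConnected(ToySwiftSyncState& state, std::uint32_t height)
{
    // After the terminal height has been handled, regular validation applies.
    if (state.completed) return SwiftSyncStatus::Inactive;
    // Before the terminal height, input-dependent checks stay deferred.
    if (height < state.stop_height) return SwiftSyncStatus::InProgress;
    // At the terminal height, the deferred work is settled by one comparison.
    state.completed = true;
    return state.created_sum == state.spent_sum ? SwiftSyncStatus::Verified
                                                : SwiftSyncStatus::Mismatch;
}
```

In this sketch a `Mismatch` result corresponds to the failure case above, where the node stops and the user is told to recover with `-reindex`.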
  76. test: swiftsync: Add integration test
    The following test asserts that a client may sync to the network tip
    using a hints file and may reorganize thereafter. Furthermore, it tests
    a faulty hints file cannot be used to leave a client in an invalid
    state, and any client given a bad file may recover with a reindex.
    86d009d000
  77. rustaceanrob force-pushed on Dec 17, 2025
  78. in src/test/swiftsync_tests.cpp:58 in 77b2518e99 outdated
    53+                block_one.PushLowBit();
    54+            };
    55+        }
    56+        BOOST_CHECK(writer.WriteNextUnspents(block_one, 1));
    57+        swiftsync::BlockHintsWriter block_three{};
    58+        for (const bool hint : unspent_block_3) {
    


    brunoerg commented at 11:06 pm on December 17, 2025:
    77b2518e9903d0b1fffe1ca7cfd5f7e191af0b5a: nit: We could avoid this duplicated loop (we basically have the same thing for each unspent block) by creating an “util” function that does it.
  79. in test/functional/feature_swiftsync.py:49 in 86d009d000
    44+            self.generate(full_node, nblocks=1, sync_fun=self.no_op)
    45+        self.log.info("Creating hints file")
    46+        result = full_node.generatetxohints(GOOD_FILE)
    47+        hints_path = result["path"]
    48+        self.log.info(f"Created hints file at {hints_path}")
    49+        assert_equal(full_node.getblockcount(), NUM_SWIFTSYNC_BLOCKS + BASE_BLOCKS)
    


    brunoerg commented at 11:27 pm on December 17, 2025:
    86d009d0009a53bbb42923b1c1c062f11499d481: You could also assert that the block count is the same as result["height"].
  80. in src/init.cpp:482 in d019f9abe7 outdated
    478@@ -478,6 +479,7 @@ void SetupServerArgs(ArgsManager& argsman, bool can_listen_ipc)
    479     argsman.AddArg("-alertnotify=<cmd>", "Execute command when an alert is raised (%s in cmd is replaced by message)", ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);
    480 #endif
    481     argsman.AddArg("-assumevalid=<hex>", strprintf("If this block is in the chain assume that it and its ancestors are valid and potentially skip their script verification (0 to verify all, default: %s, testnet3: %s, testnet4: %s, signet: %s)", defaultChainParams->GetConsensus().defaultAssumeValid.GetHex(), testnetChainParams->GetConsensus().defaultAssumeValid.GetHex(), testnet4ChainParams->GetConsensus().defaultAssumeValid.GetHex(), signetChainParams->GetConsensus().defaultAssumeValid.GetHex()), ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);
    482+    argsman.AddArg("-utxohints=<path>", "Accelerate initial block download with the assistance of a UTXO hint file.", ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS),
    


    brunoerg commented at 11:32 pm on December 17, 2025:
    d019f9abe70ba9b488baac3454e9c853b04fe885: We could point out here that it cannot be used without pruned mode.
  81. brunoerg commented at 11:58 pm on December 17, 2025: contributor

    Regarding the pruned-only feature. This would not be the case if there was a P2P message for undo-data. It would allow undo-data to be written in an un-trusted state and later confirmed as valid with a check of IsBalanced. Not to mention it would allow for full validation.

    Are you intending to work on this P2P message? Is this part of any specification?

  82. in src/test/swiftsync_tests.cpp:28 in 86d009d000
    23+std::vector<bool> unspent_block_3{true, false, false, true, false, true, true, true, true, false, true, true, true, true, true, true};
    24+} // namespace
    25+
    26+BOOST_FIXTURE_TEST_SUITE(swiftsync_tests, BasicTestingSetup);
    27+
    28+BOOST_AUTO_TEST_CASE(swiftsync_aggregate_test)
    


    brunoerg commented at 12:04 pm on December 18, 2025:

    1ff59a15407cb5fdc2ae06358e45134166db2448: Besides a unit test for the aggregate, I think a fuzz test would be nice, e.g.:

    // Copyright (c) 2025-present The Bitcoin Core developers
    // Distributed under the MIT software license, see the accompanying
    // file COPYING or http://www.opensource.org/licenses/mit-license.php.

    #include <random.h>
    #include <test/fuzz/FuzzedDataProvider.h>
    #include <test/fuzz/fuzz.h>
    #include <test/fuzz/util.h>
    #include <test/util/setup_common.h>
    #include <swiftsync.h>

    FUZZ_TARGET(swiftsync_aggregate_test)
    {
        SeedRandomStateForTest(SeedRand::ZEROS);
        FuzzedDataProvider fuzzed_data_provider(buffer.data(), buffer.size());
        swiftsync::Aggregate agg{};
        bool good_data{true};
        LIMITED_WHILE(good_data && fuzzed_data_provider.ConsumeBool(), 500) {
            const std::optional<COutPoint> out_point{ConsumeDeserializable<COutPoint>(fuzzed_data_provider)};
            if (!out_point) {
                good_data = false;
                return;
            }

            if (fuzzed_data_provider.ConsumeBool()) {
                agg.Spend(*out_point);
            } else {
                agg.Create(*out_point);
            }
        }
        (void)agg.IsBalanced();
    }
    
  83. rustaceanrob commented at 8:06 pm on December 18, 2025: contributor

    Are you intending to work on this P2P message? Is this part of any specification?

    Yeah I plan on creating a patch to add this, either here or separately. Notably, the UTREEXO BIPs specify serving undo data, however it is also grouped with proofs that SwiftSync would not require. I will suggest as a comment that these messages may be appropriate to split into undo-data and proofs, as there are a number of use cases for a network message for this data. The easiest approach would be to use the current compressed serialization format and send that directly over the wire.

  84. 0xB10C referenced this in commit b1593ac930 on Dec 28, 2025
  85. exd02 commented at 11:25 pm on January 6, 2026: none

    @rustaceanrob

    Possible issue with generatetxohints when passing rollback parameter

    I am testing the generatetxohints RPC on my full node (performing IBD, currently at block 912483) and observed possibly unexpected behavior when the optional rollback parameter is provided:

    Setup / Command

    ./build/bin/bitcoin-cli \
      -datadir=/run/media/exd/hd1/Bitcoin \
      --rpcclienttimeout=0 \
      generatetxohints ~/teste/test.hints 10000
    

    Behaviour

    • When no block height is provided:
      • The hint file is generated successfully.
    • When a block height is provided:
      • The output file is test.hints.incomplete (and it’s empty)

    Debugging details

    I attached gdb to bitcoind and set a breakpoint at rpc/blockchain.cpp:3457 (I believe that if there is an error it should be here, because the problem only occurs when params[1] is defined):

    if (!request.params[1].isNull()) { // <----- Breakpoint here
        const CBlockIndex* invalidate_index = ParseHashOrHeight(request.params[1], *node.chainman);
        invalidate_index = WITH_LOCK(::cs_main, return node.chainman->ActiveChain().Next(invalidate_index));
        rollback.emplace(*node.chainman, *invalidate_index);
    }
    
    # attached the debugger
    gdb build/bin/bitcoind
    (gdb) set args -datadir=/run/media/exd/hd1/Bitcoin -debug=rpc
    (gdb) start
    (gdb) break rpc/blockchain.cpp:3457
    (gdb) continue

    # called generatetxohints from cli
    ./build/bin/bitcoin-cli -datadir=/run/media/exd/hd1/Bitcoin --rpcclienttimeout=0 generatetxohints ~/teste/test.hints 10000

    # gdb output
    Thread 4 "b-httpworker.0" hit Breakpoint 2, generatetxohints()::$_0::operator()(RPCHelpMan const&, JSONRPCRequest const&) const (request=..., this=<optimized out>, self=...) at ./rpc/blockchain.cpp:3457
    3457	    if (!request.params[1].isNull()) {
    (gdb) n
    3458	        const CBlockIndex* invalidate_index = ParseHashOrHeight(request.params[1], *node.chainman);
    (gdb) n
    3459	        invalidate_index = WITH_LOCK(::cs_main, return node.chainman->ActiveChain().Next(invalidate_index));
    (gdb) n
    3460	        rollback.emplace(*node.chainman, *invalidate_index);
    (gdb) n

    # after this point, the RPC execution does not continue its expected flow. Instead, the node resumes normal tip validation (UpdateTip logs continue), while the generatetxohints RPC appears to stall. The output file remains as `test.hints.incomplete`, stays empty, and is never finalized. No explicit error is returned.
    2026-01-06T23:05:37Z UpdateTip: new best=00000000000000000001c4803c902cfab517056bd8191e1d5870b2c58c36d3c2 height=912483 version=0x2001e000 log2_work=95.797984 tx=1234594715 date='2025-08-30T23:36:43Z' progress=0.953049 cache=15.6MiB(115671txo)
    ...
    

    Question

    Am I using generatetxohints incorrectly, or is this an unintended behavior when the rollback parameter is provided?


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-01-07 18:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me