Add a getutxos command to the p2p protocol #4351

pull mikehearn wants to merge 1 commits into bitcoin:master from mikehearn:getutxo changing 4 files +89 −2
  1. mikehearn commented at 4:40 pm on June 16, 2014: contributor

    Introduction

    The getutxos command allows querying of the UTXO set given a set of outpoints. It has a simple implementation and the results are not authenticated in any way. Despite this, there are times when it is a useful capability to have. I believe @jgarzik also has a use case for this, though I don’t know what it is.

    As a motivating example I present Lighthouse, an app I’m writing that implements assurance contracts:

    http://blog.vinumeris.com/2014/05/17/lighthouse/

    Lighthouse works by collecting pledges, which contain an invalid transaction signed with SIGHASH_ANYONECANPAY. Once sufficient pledges are collected to make the combination valid, we say the contract is complete and it can be broadcast onto the network, spending the pledged outputs. Before that occurs however, a pledge can be revoked and the pledged money redeemed by double spending the pledged output. For instance you might want to do this if it becomes clear not enough people care about the assurance contract for it to reach its goal in a timely manner, or if you simply need the money back due to some kind of cashflow crunch.

    It is convenient to be able to see when a pledge has been revoked, so the user interface can be updated, and so when the final contract is created revoked pledges can be left out. For this purpose “getutxos” is used.

    Protocol

    The getutxos message takes a boolean which controls whether outputs in the memory pool are considered, and a vector of COutPoint structures. It returns a bitmap with the same number of bits as outputs specified rounded up to the nearest 8 bits, and then a list of CTxOut structures, one for each set bit in the bitmap. The bitmap encodes whether the UTXO was found (i.e. is indeed unspent).
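
    The response layout described above can be sketched in a few lines of Python. This is only an illustration, not the patch's serialization code: the names encode_hits and decode_hits are hypothetical, and the bit order (least significant bit of the first byte for the first outpoint) is an assumption that the description does not state.

```python
from typing import List, Optional, Tuple


def encode_hits(results: List[Optional[bytes]]) -> Tuple[bytes, List[bytes]]:
    """Pack one lookup result per requested outpoint into (bitmap, outputs).

    results[i] is the serialized output for outpoint i, or None if it is
    spent or unknown. Assumed bit order: bit i lives in byte i // 8 at
    position i % 8, least significant bit first.
    """
    bitmap = bytearray((len(results) + 7) // 8)  # round up to whole bytes
    outputs = []
    for i, out in enumerate(results):
        if out is not None:
            bitmap[i // 8] |= 1 << (i % 8)
            outputs.append(out)
    return bytes(bitmap), outputs


def decode_hits(n: int, bitmap: bytes, outputs: List[bytes]) -> List[Optional[bytes]]:
    """Inverse of encode_hits: re-associate outputs with the n queried outpoints."""
    if len(bitmap) != (n + 7) // 8:
        raise ValueError("bitmap length does not match number of outpoints")
    decoded: List[Optional[bytes]] = []
    pos = 0
    for i in range(n):
        if bitmap[i // 8] & (1 << (i % 8)):
            if pos >= len(outputs):
                raise ValueError("fewer outputs than set bits")
            decoded.append(outputs[pos])
            pos += 1
        else:
            decoded.append(None)
    if pos != len(outputs):
        raise ValueError("more outputs than set bits")
    return decoded
```

    A client would call decode_hits with the number of outpoints it asked about, so a reply whose bitmap or output count does not line up with the request is rejected before any signature checking.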

    Authentication

    The results of getutxos are not authenticated. This is because the obvious way to do this requires the work maaku has been doing on UTXO commitments to be merged, activated by default, miners to upgrade and a forking change made to enforce their accuracy. All this is a big project that may or may not ever come to fruition.

    For the Lighthouse security model however, this doesn’t matter much. The reason is that the pledge transactions you’re getting (which may be malicious) don’t come from the P2P network. They come in the form of files either from a simple rendezvous server, or e.g. a shared folder or email attachments. The people sending these files have no way to influence the choice of peers made by the app. Once the outputs are returned, they are used to check the signatures on the pledge, thus verifying that the pledge spends the UTXO returned by the P2P network.

    So we can be attacked in the following ways:

    • The pledge may be attempting to pledge non-existent outputs, but this will be detected if the majority of peers are honest.
    • The peers may be malicious and return a wrong or bogus output, but this will be detected when the signature is checked, except for the value (!) but we want to fix this by including the value into the sighash at some point anyway because we need it to make the TREZOR efficient/faster.
    • The peers may bogusly claim no such UTXO exists when it does, but this would result in the pledge being seen as invalid. When the project creator asks the pledgor why they revoked their money, and learns that in fact they never did, the bogus peers will be detected. No money is at risk as the pledges cannot be spent.
    • If the pledgor AND all the peers collaborate (i.e. the pledgor controls your internet connection) then they can make you believe you have a valid pledge when you don’t. This would result in the app getting “jammed” and attempting to close an uncloseable contract. No money is at risk and the user will eventually wonder why their contract is not confirming. Once they get to a working internet connection the subterfuge will be discovered.
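
    The first bullet relies on a majority of peers being honest. One way an app might cross-check replies is sketched below. It is not taken from Lighthouse: the function name and the shape of the input (a mapping from a peer identifier to that peer's reply for one outpoint) are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def majority_answer(answers: Dict[str, Optional[bytes]]) -> Tuple[Optional[bytes], List[str]]:
    """Return (reply, dissenters) for the reply given by a strict majority.

    A reply of None stands for "not found / spent". Raises ValueError when
    no reply is backed by more than half of the peers that answered.
    """
    if not answers:
        raise ValueError("no peers answered")
    reply, votes = Counter(answers.values()).most_common(1)[0]
    if 2 * votes <= len(answers):
        raise ValueError("no strict majority among peers")
    # Peers that disagreed with the majority are candidates for disconnection.
    dissenters = sorted(peer for peer, r in answers.items() if r != reply)
    return reply, dissenters
```

    A vote like this only helps with the first three bullets. It offers nothing against the last case, where the attacker controls every peer the app can reach.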

    There is a final issue: the answer to getutxos can of course change the instant the result is generated, thus leading you to construct an uncloseable transaction if the process of revocation races with the construction. The app can detect this by watching for either a reject message, or an inv requesting one of the inputs that is supposed to be in the UTXO set (i.e. the remote peer thinks it’s an orphan). This can then cause the app to re-request the UTXO set and drop the raced pledge.

    In practice I do not anticipate such attacks are likely to occur, as they’re complicated to pull off and it’s not obvious what the gain is.

    There may be other apps that wish to use getutxos, with different security models. They may find this useful despite the lack of UTXO commitments, and the fact that the answer can change a moment later, if:

    • They are connecting to a trusted peer, i.e. localhost.
    • They trust their internet connection and peer selection, i.e. because they don’t believe their datacenter or ISP will commit financial fraud against them, or they are tunnelling via endpoints they trust like a randomly chosen Tor exit.
    • They are not using the response for anything important or worth attacking, like some kind of visualisation.

    Upgrade

    If enforced UTXO commitments are added to the block chain in future, it would make sense to rev the P2P protocol to add the proofs (merkle branches) to the response.

    Testing

    I attempted to write unit tests for this, but Core has no infrastructure for building test chains. The miner_tests.cpp code does it, but at the cost of not allowing any other unit test to do so, as it doesn’t reset or clean up the global state afterwards! I tried to fix this and ended up down a giant rabbit hole.

    So instead I’ve tested it with a local test app I wrote, which also exercises the client side part in bitcoinj.

    BIP

    If the code is ACKd then I will write a short BIP explaining the new message.

    Philosophy

    On IRC I have discussed this patch a little bit before. One objection that was raised is that we shouldn’t add anything to the P2P protocol unless it’s unattackable, because otherwise it’s a sharp knife that people might use to cut themselves.

    I personally disagree with this notion for the following reasons.

    Firstly, many parts of the P2P protocol are not completely unattackable: malicious remote nodes can withhold broadcast transactions, invent fictional ones (you’d think they’re orphans), miss out Bloom filter responses, send addr messages for IP’s that were never announced, etc. We shouldn’t hold new messages to a standard existing messages don’t meet.

    Secondly, even with UTXO commitments in the block chain, given the sick state of mining this only requires a collaboration of two people to undo, although that failure would be publicly detectable which is at least something. But anyway, there’s a clean upgrade path if/when UTXO authentication becomes available.

    Thirdly, we have a valid use case that’s actually implemented. This isn’t some academic pie in the sky project.

    Finally, Bitcoin is already the sharpest knife imaginable. I don’t think we should start rejecting useful features on the grounds that someone else might screw up with them.

    If the above analysis ends up not holding for some reason, and people do get attacked due to the lack of authentication, then Lighthouse and other apps can always fall back to connecting to trusted nodes (perhaps over SSL). But I would rather optimistically assume success up front and see what happens than pessimistically assume the worst and centralise things up front.

  2. petertodd commented at 8:20 pm on June 16, 2014: contributor

    Why is there absolutely no privacy at all in this feature? You could easily search by prefix rather than being forced to always give the peer the exact outputs you are interested in. (recall how leveldb queries work re: the iterators)

    Also, re: security, Lighthouse is particularly bad as lying about UTXO’s - falsely claiming they don’t exist/are spent when they are unspent - can certainly lead to serious exploits where clients are fooled into thinking an assurance contract is not fully funded when in fact it is over-funded, leading to large fees being paid to miners. You’ve not only got a potential exploit, you’ve got a strong financial motivation to exploit that exploit.

  3. petertodd commented at 8:24 pm on June 16, 2014: contributor
    One last thing: needs a NODE_GETUTXO service bit - having an unencrypted copy of the UTXO set is definitely a service that not all nodes can be expected to have. (recall @gmaxwell’s clever suggestion of self-encrypting the UTXO set to avoid issues around storage of problematic data)
  4. mikehearn commented at 8:44 pm on June 16, 2014: contributor

    If the app thinks a pledge is revoked it won’t be included in the contract that is broadcast, so it can’t lead to overpayment.

    Re: encrypted UTXO set. That makes no sense. Nodes must be able to do this lookup internally to operate. Gregory’s suggestion was to obfuscate the contents on disk only to avoid problems with silly AV scanners, not that the node itself can’t read its own database.

    There is no prefix filtering because that would complicate the implementation considerably. You are welcome to implement such an upgrade in a future patch, if you like.

  5. petertodd commented at 8:57 pm on June 16, 2014: contributor

    > If the app thinks a pledge is revoked it won’t be included in the contract that is broadcast, so it can’t lead to overpayment.

    The attacker would of course broadcast the pledges themselves; pledges are public information.

    > Re: encrypted UTXO set. That makes no sense.

    I was thinking in the case where privacy is implemented, but actually on second thought my complaint is invalid for this implementation as you’re not returning the UTXO data associated with the UTXO.

    > There is no prefix filtering because that would complicate the implementation considerably. You are welcome to implement such an upgrade in a future patch, if you like.

    You can query leveldb with the prefix, get a cursor in return, then just scan the cursor until the end of the prefix range. There’s no good reason to create yet more infrastructure with zero privacy, and the lack of privacy makes attacking specific targets without being detected much easier.
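
    The seek-then-scan lookup described here can be illustrated with an in-memory stand-in for an ordered key-value store. This sketch uses a sorted Python list and bisect rather than LevelDB itself; prefix_scan and the key layout are hypothetical.

```python
import bisect
from typing import Iterator, List


def prefix_scan(sorted_keys: List[bytes], prefix: bytes) -> Iterator[bytes]:
    """Yield every key that starts with prefix, in key order.

    Mimics an ordered-store cursor: seek to the first key >= prefix, then
    advance until a key no longer carries the prefix.
    """
    i = bisect.bisect_left(sorted_keys, prefix)  # the "seek" step
    while i < len(sorted_keys) and sorted_keys[i].startswith(prefix):
        yield sorted_keys[i]
        i += 1
```

    Under such a scheme the client would send only a prefix, receive every matching entry, and discard the ones it does not care about locally, so the peer learns a range rather than the exact outpoints of interest.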

  6. mikehearn commented at 9:00 pm on June 16, 2014: contributor

    Pledges are not public. You’re making assumptions about the design without understanding it.

    Your second statement is nonsensical. The code does return “the UTXO data associated with the UTXO”, what else would it do?

    Your third statement is something I already know: I am the one who implemented LevelDB for Bitcoin. My point stands. This patch is what it is. If you’d like it to be better feel free to contribute code to make it so.

  7. petertodd commented at 10:17 pm on June 16, 2014: contributor

    > Pledges are not public. You’re making assumptions about the design without understanding it.

    Either pledges are public information and can be attacked, or they are not and some single user is running the crowdfund, (the project owner) in which case the overhead of just using existing systems is not a big deal. In particular, in the “single project owner design” all pledges can easily be added to a single bloom filter and the chain scanned to keep the state of spent/unspent up to date at the same low cost as keeping an SPV wallet up-to-date. (remember that users are going to be pledging specific amounts and/or using a specific wallet for their pledges, so in the vast majority of cases you’ll need to create a transaction output for that pledge, which means the bloom filter behavior is identical to that of a standard SPV wallet)

    Relying on pledges not being public information for your security is a rather risky design with difficult to predict consequences. Easy to imagine, for instance, a user publishing a list of pledge amounts showing the progress of their campaign, and that list being used for the attack. (trick project owner into publishing an invalid tx with an input spent, record signatures, then make it look like other pledges are now spent and get more pledges) Even a “multiple utxo’s is one pledge” design can easily fall to an attacker who just guesses what UTXO’s are probably part of the pledge based on amounts pledged. Again, all attacks that are much more difficult to pull off if the app isn’t giving away exact info on what transaction outputs it’s looking for. (although UTXO anonymity does suffer from the inherent problem that UTXO size grows indefinitely, so your k-anonymity set is much weaker than it looks as many entries can be ignored due to old age - another argument for the bloom filter alternative)

    > Your second statement is nonsensical. The code does return “the UTXO data associated with the UTXO”, what else would it do?

    It can return only a spent/unspent bit. What’s the use-case for requiring more than that? It’s easy to foresee this encouraging applications to abuse the UTXO set as a database. (back to the policy question: do we really want to redefine NODE_NETWORK as being able to provide UTXO data to peers as well?)

  8. laanwj commented at 5:50 am on June 17, 2014: member

    Looks OK to me, implementation-wise. Talking of testing, if you cannot integrate testing into the unit tester suite for some reason I think at least some Python script should be included in qa/... to be able to test this functionality. Such a test script could create a node, import a bootstrap file, and launch getutxos queries at it.

    I think we are at the point that we need to define an extension mechanism - whether that’s done with a NODE_* bit or some other way doesn’t matter. That would encourage experimentation, and I think this is an excellent use-case for such. No need to force all NODE_NETWORK nodes above a certain version to provide a specific query service. Alternative implementations (obelisk, btcd) may or may not want to implement this, and may want to experiment with their own extensions. Then the bootstrapping network needs a way to find only nodes that provide a certain extension. Let’s not repeat the bloom debacle.

  9. petertodd commented at 6:34 am on June 17, 2014: contributor

    @laanwj I suggested awhile back we use a simple bitmask:

    0x0000000000000001.testnet-seed-mask.bitcoin.petertodd.org
    

    Returning all seeds with at least NODE_NETWORK set. There’s most likely to be a relatively small number of combinations people use, so DNS caching will still work fine. (though I’m no DNS expert) Of course, as always relying heavily on seeds is foolish, so just setting up app-specific seeds probably makes more sense in many cases and lets the authors of those apps implement whatever feature tests they need to ensure they’re serving up seeds that actually support the features required by the apps. (e.g. right now if there ever exist NODE_BLOOM-using nodes on the network and someone does a bloom IO attack against the nodes returned by the seeds, maliciously or by accident, you’ll easily wind up with only NODE_BLOOM-supporting nodes being returned, breaking anything relying on bloom filters)
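
    The bitmask naming scheme above can be sketched as follows. The helper names are hypothetical, and the hostname format (a 0x prefix followed by 16 hex digits, here lowercase) is inferred from the single example in this comment rather than from any seeder's documentation.

```python
NODE_NETWORK = 1 << 0  # service bit 0 in the P2P protocol


def seed_hostname(mask: int, base: str) -> str:
    """Build a seed hostname that asks for peers advertising every bit in mask."""
    if not 0 <= mask < (1 << 64):
        raise ValueError("service mask must fit in 64 bits")
    return f"0x{mask:016x}.{base}"


def has_services(services: int, mask: int) -> bool:
    """True if a peer's advertised services include every bit in mask."""
    return (services & mask) == mask
```

    The "at least" in the description corresponds to has_services: a peer advertising extra bits still matches, while a peer missing any requested bit does not.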

    Also, as an aside it’d be reasonable to set aside a few service bits for experimental usage, with an understanding that occasional conflicts will be inevitable. In my replace-by-fee implementation that uses preferential peering to let replace-by-fee nodes find each other quickly I have:

    // Reserve 24-31 for temporary experiments
    NODE_REPLACE_BY_FEE = (1 << 26)
    

    https://github.com/petertodd/bitcoin/blob/f789d6d569063fb92d1ca6d941cc29034a7f19ef/src/protocol.h#L66
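
    As a small illustration of the convention above, the check below tests whether a flag is a single service bit inside the range set aside for experiments. Only the 24-31 range and NODE_REPLACE_BY_FEE come from the snippet; the helper name is hypothetical.

```python
EXPERIMENTAL_BITS = range(24, 32)  # bits 24-31 inclusive, per the comment above
NODE_REPLACE_BY_FEE = 1 << 26


def is_experimental(flag: int) -> bool:
    """True if flag is exactly one bit and that bit is in the reserved range."""
    single_bit = flag > 0 and (flag & (flag - 1)) == 0
    return single_bit and (flag.bit_length() - 1) in EXPERIMENTAL_BITS
```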

  10. laanwj commented at 6:51 am on June 17, 2014: member
    @petertodd The problem with service bits is that there is only a very limited number of them. It would, IMO, be better to have a string namespace defining extensions. A new version of the network protocol could add a command that returns a list of strings defining the supported extensions and maybe even per-extension versions. It’s still very simple and conflicts could be much more easily avoided.
  11. petertodd commented at 6:59 am on June 17, 2014: contributor

    @laanwj Well we’ve got 64 of them, 56 if you reserve some for experiments; I don’t see us using up that many all that soon. A string namespace thing can be added in the future for sure, but I just don’t see the short-term, or even medium-term, need. After all, NODE_BLOOM was AFAIK the first fully fleshed out proposal to even use a single service bit, with the closest runner up being @sipa’s thoughts on pruning.

    That said, strings, and especially UUIDs, (ugh) would definitely reduce the politics around them.

  12. laanwj commented at 7:04 am on June 17, 2014: member
    It’s not about fear of running out but about reducing the need for central coordination. Anyhow, let’s stop hijacking this thread. Using a service bit in this case is fine with me.
  13. petertodd commented at 7:07 am on June 17, 2014: contributor
    @laanwj Agreed.
  14. mikehearn commented at 9:44 am on June 17, 2014: contributor

    I don’t think the attack you have in mind works.

    Let’s assume that pledges are public for a moment, e.g. because the user chooses to publish them or collect them in a way that inherently makes them public, like people attaching them to forum posts. I don’t fully get what attack you have in mind, but I think you’re saying if you can control the internet connection of the fundraiser for an extended period of time, you could ensure they don’t close the contract as early as possible and continue to solicit pledges. Then Dick Dastardly steps in, takes all the pledges and steals the excess by working with a corrupt miner.

    But this attack makes no sense. If the pledges are public any of the legitimate pledgors can also observe the contract’s state and close it. The attacker has no special privileges. Unless you control the internet connection of all of them simultaneously and permanently, the attack cannot work: legitimate users will stop pledging once they see it’s reached the goal and then either close it themselves, or ask the owner via a secure channel why they aren’t doing so.

    What you’re talking about is only an issue if the pledges are NOT public, but the attacker is able to obtain them all anyway, AND control the user’s internet connection so they do not believe the contract is closeable AND they continue to solicit funds and raise money. Given that Lighthouse includes an integrated Tor client and can therefore tunnel through your control anyway, I don’t think this is a realistic scenario.

    > It can return only a spent/unspent bit. What’s the use-case for requiring more than that?

    It’s explained in the nice document I wrote above, please read it! Then you would actually know what data is returned and why.

    @laanwj As far as I know only Peter thinks Bloom filtering was a debacle: if we had added a service bit, all nodes on the network would set it except for one or two that don’t follow the protocol properly anyway, so who knows what they would do. If a node doesn’t wish to support this simple command they can just not increase their protocol version past 90000. If they do, it should be only a few lines of code to add. As you note, using a service bit means implementations can be found but there are only a handful of them to go around, and not using a service bit means some entirely new mechanism must be designed which is way out of scope for this patch.

    Re: qa tests, as far as I can tell they are only capable of covering the JSON-RPC interface. We don’t seem to have any testing infrastructure for doing P2P tests except for the pull tester. I could try adding some code to that, but that code is maintained in the bitcoinj repository.

  15. laanwj commented at 9:51 am on June 17, 2014: member

    @mikehearn Binding features to version numbers assumes a linear, centralized progression. It means that everyone that implements A also needs to implement B even though they are unrelated. I don’t think this is desirable anymore.

    And as said above, using a service bit is fine with me. I do think we need another mechanism for signalling extensions to the protocol in the future, but for now we’re stuck with that.

  16. mikehearn commented at 10:04 am on June 17, 2014: contributor

    OK, I can add a service bit, although AFAIK nobody actually has any code that searches out nodes with a particular bit? I’m not sure Core does and bitcoinj definitely doesn’t. But that can be resolved later.

    The question of optionality in standards is one with a long history, by the way. The IETF has a guide on the topic here:

    http://tools.ietf.org/html/rfc2360#section-2.10

    Deciding when to make features optional is tricky; when PNG was designed, there were (iirc) debates over whether gz compression should be optional or mandatory. The feeling was that if it were optional, at least a few implementations would be lazy and skip it, then in order to ensure that their images rendered everywhere PNG creators would always avoid using it, thus making even more implementations not bother, and in the end PNGs would just end up bigger for no good reason: just because a few minority implementors didn’t want to write a bit of extra code. So they made gz compression mandatory. Another feature that GIF had (animation) was made optional and put into a separate MNG format (later another attempt, APNG). Needless to say, the situation they feared did happen and animated images on the web today are all GIFs.

    So I will add a service bit, even though this feature is so trivial everyone could implement it. If it were larger and represented a much bigger cost, I’d be much keener on the idea. As is, I advise caution - simply making every feature from now on optional is not necessarily good design. The tradeoffs must be carefully balanced.

  17. maaku commented at 10:14 am on June 17, 2014: contributor
    NODE_NETWORK is a hack. It is conflating two things: storing the whole block chain, and storing the current UTXO set. These are orthogonal things. I think there should be a service bit here, but the meaning is not constrained to just a ‘getutxos’ call. NODE_NETWORK should be split into NODE_ARCHIVAL and NODE_UTXOSET, with the latter eventually indicating presence of other things as well, such as a future p2p message that returns utxo proofs.
  18. laanwj commented at 10:17 am on June 17, 2014: member

    I agree it would be preferable for everyone to agree and do the same thing, but that makes progress incredibly difficult. From my (maybe over-cynical) view of the bitcoin community that means that nothing new ever happens. There’s always some reason not to agree with a change, it could be some perceived risk, disagreement on the feature set or how the interface should look, or even paranoid fantasies.

    Having optional features could mean the difference between something like this, which is useful but not absolutely perfect, being merged, or nothing being done at all. So I also advise caution on trying to push it to the entire network with a version bump.

  19. maaku commented at 10:20 am on June 17, 2014: contributor
    @laanwj how else do you indicate presence of this one particular p2p message except by version bump? That’s what the version field is for.
  20. mikehearn commented at 10:23 am on June 17, 2014: contributor

    Ah, you’re right, that’s why software projects have maintainers instead of requiring universal agreement from whoever shows up :) There will always be people who disagree or want something better (but don’t want to do the work). Sometimes those disagreements will make sense, and other times they will be bike shedding.

    If we look at projects like the kernel, it’s successful partly because Linus lets debates run for a while, he develops opinions and then if things aren’t going anywhere he steps in and moves things forward. Bitcoin has worked the same way in the past with @gavinandresen doing that, and I hope we will retain good project leadership going forward.

    Gavin, what are your thoughts on protocol extensibility / optionality? As it seems nobody has problems with the code in this patch itself.

  21. maaku commented at 10:32 am on June 17, 2014: contributor
    @mikehearn it would be better imho if the return value included the height and hash of the best block. That would help you figure out what is going on when you get different answers from peers, and parallels the information returned by a future getutxos2 that returns merkle proofs.
  22. mikehearn commented at 10:46 am on June 17, 2014: contributor
    Good idea! I’ll implement that this afternoon or tomorrow.
  23. petertodd commented at 11:03 am on June 17, 2014: contributor
    @mikehearn Sybil attacking the Bitcoin network really isn’t all that hard; I really hope Lighthouse doesn’t blindly trust the DNS seeds like so much other bitcoinj code does. re: having getutxos return actual UTXO’s vs. spent/unspent, I see nothing in the design of Lighthouse that prevents pledges from containing the transactions required to prove the UTXO data. Also, last I talked to Gregory Maxwell about the issue he had strong opinions that NODE_BLOOM was the right idea - he did after all ask me to implement it. Warren Togami also is in that camp. (and asked me to re-base the patch and submit the BIP)
  24. laanwj commented at 11:11 am on June 17, 2014: member

    @mikehearn It may have been that way in the past, but Bitcoin Core is not the only node implementation anymore. Don’t confuse leadership over this project with leadership over the global P2P network, which has various other actors as well now.

    Edit: another concrete advantage of an optional-feature approach is that features can be disabled again if they either prove to be not so useful for what they were imagined for, or the implementation causes problems, or a later extension provides a better alternative. Locking it to >= a protocol version means every version in the future is expected to implement it.

  25. mikehearn commented at 11:49 am on June 17, 2014: contributor
    @laanwj New version numbers can mean anything, including “feature X is no longer supported”. So I don’t think we need service bits for that.
  26. sipa commented at 11:51 am on June 17, 2014: member

    We’ve talked about it, and I’m sure you’re aware of my opinion already, but I’ll still repeat it here to offer for wider discussion.

    I do not believe we should encourage users of the p2p protocol to rely on unverifiable data. Anyone using ‘getutxos’ is almost certainly not maintaining a UTXO set already, and thus not doing chain verification, so not able to check that whatever a peer claims in response to getutxos is remotely meaningful. As opposed to other data SPV clients use, this does not even require faking PoW.

    Yes, there are other parts of the protocol that are largely unverified. Addr messages for new peers, the height at startup, requesting the mempool contents, … But those are either being deprecated (like the height at startup), or have infrastructure in place to minimize the impact of false data. In contrast, I do not see any use of getutxos where the result can be verified; if you’re verifying, you don’t need it. Bitcoin works as zero-trust as possible, and I believe improving upon that should be a goal of the core protocol.

    Of course, that does not mean that the ability to request UTXO data is useless. I just don’t believe it should be part of the core protocol.

    I think the problem is that in some cases, there are very good reasons to connect to a particular (set of) full node, and trusting its responses. For example, when you have different Bitcoin-related services running within a (local and trusted) network, connected to the outside world using a bitcoind “gateway”. In this case, you are using bitcoind as a service for your system, rather than as a pure p2p node.

    So far, we have kept services provided through RPC separate from those provided through P2P. This often makes sense, but is not standardized, is not very efficient, and is inconvenient when most of the data you need is already done through P2P.

    My proposal would therefore be to add “trusted extensions” to the P2P protocol. They would only be available to trusted clients of a full node (through IP masking, different listening port, maybe host authentication at some point, …). I’ve seen several use cases for these:

    • When a local network wallet rebroadcasts transactions, you want the gateway to rebroadcast as well. Default current behavior is to only relay the first time you see a transaction.
    • You want local clients to bypass rate limiting systems, without triggering DoS banning (currently done for localhost, which is broken for Tor).
    • Some functionality really is only available when you have a trusted bitcoind. Mempool acceptance checking for detecting conflicting wallet transactions is one, getutxos is another. Mechanisms for these could be available but only to trusted clients.

    This may be controversial (and probably needs a separate mailinglist/issue), as it could all be done through a separate out-of-band non-P2P protocol, or just RPC.

    Comments?

  27. mikehearn commented at 11:58 am on June 17, 2014: contributor

    I think I put all my comments on that in the original writeup. Yes, in the ideal world everything would be perfect and authenticated by ghash.io ;) However we do not live in such a world and are dragging ourselves towards it one step at a time.

    BTW, on height in version message being “deprecated”, that’s the first I’ve heard of this. SPV clients use it. If someone wants to deprecate that they’re welcome to update all the clients that require it. But let’s discuss that in a separate thread.

  28. sipa commented at 12:03 pm on June 17, 2014: member
    Oh, not in the protocol. I just mean that full nodes don’t use it at all anymore. I wish it didn’t exist in the first place, but it’s too unimportant to bother changing in the protocol.
  29. mikehearn commented at 12:13 pm on June 17, 2014: contributor
    Ah, OK.
  30. petertodd commented at 12:30 pm on June 17, 2014: contributor

    @sipa +1

    There’s nothing wrong with trust. We’d like everything to be decentralized, but we don’t live in a perfect world so occasionally we introduce trust to solve problems that we don’t have decentralized solutions for yet. We did that in the payment protocol because we had no other way to authenticate payments; we should be doing that in UTXO lookup, because we have no other way to authenticate UTXO’s. (yet)

    We also have a responsibility to design systems that naturally lead to safe implementations that are robust against attack. This patch is anything but that on multiple levels - even little details like how it gives you 100% unauthenticated UTXO data rather than just a “known/unknown” response encourage inexperienced programmers to take dangerous shortcuts, like relying on your untrusted peer(s) for utxo data corresponding to a tx rather than at least getting actual proof via the tx and its merkle path to the block header. (PoW proof) Equally the presence of vulnerable targets encourages attackers to sybil attack the Bitcoin network to exploit those targets - we don’t want to encourage that.
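
    The merkle-path check mentioned above follows the usual Bitcoin construction: hash the transaction id together with each sibling on the path, using double SHA-256, until the root is reached. The sketch below shows only that folding step; the function names are hypothetical, hashes are assumed to be in internal byte order, and comparing the result against a block header (and checking that header's proof of work) is left out.

```python
import hashlib
from typing import List


def dsha256(data: bytes) -> bytes:
    """Double SHA-256, as used for Bitcoin transaction and merkle hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root_from_branch(txid: bytes, branch: List[bytes], index: int) -> bytes:
    """Fold a merkle branch up to the root.

    index is the transaction's position in the block. At each level its low
    bit says whether the running hash is the right child (1) or the left
    child (0) of the parent node.
    """
    h = txid
    for sibling in branch:
        if index & 1:
            h = dsha256(sibling + h)
        else:
            h = dsha256(h + sibling)
        index >>= 1
    return h
```

    A client holding the transaction, its branch and the block header would accept the output data only if the computed root equals the header's merkle root.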

    That said, I’m not convinced we need to add trusted extensions to the Bitcoin Core P2P protocol; that functionality already exists in the form of Electrum among others. UTXO’s can be looked up easily, you can authenticate the identity of the server you’re talking to via SSL, and it is already used for that purpose by a few applications. (e.g. the SPV colored coin client ChromaWallet) A client implementation is simple(1) and Electrum supports things like merkle paths where possible to reduce the trust in the server(s) to a minimum. Why reinvent the wheel?

    1. https://github.com/bitcoinx/ngcccbase/blob/master/ngcccbase/services/electrum.py
  31. laanwj commented at 12:37 pm on June 17, 2014: member

    @sipa Agreed - getutxos is in the same category of ‘information queries from trusted node’ as the mempool check for unconfirmed/conflicted transactions that an external wallet could use.

    Regarding the height in version messages: yes, nodes have lied about this, resulting in ‘funny’ information in the UI so we don’t use it anymore, not even behind a median filter. See #4065.

  32. mikehearn commented at 1:17 pm on June 17, 2014: contributor

    I have a section in the commit message about philosophy for a reason - this discussion is now firmly in the realm of the philosophical.

    There have been cases in the past few years where people loudly proclaimed that something should not be done because of $ATTACK or $CONCERN, then we did it, and so far things worked out OK. A good example of this was SPV clients in general, a few people said:

    • Nodes will silently drop transactions just because they can, so every client should use a trusted server.
    • People will DoS the network by requesting lots of blocks just because they can, so Bloom filters should be disabled by default (litecoin did this)
    • People will sybil the network and make clients believe in non-existent mempool transactions, so everyone should just use a trusted server.

    In fact none of these things have happened, the concerns were overblown. Will they happen in future? Maybe! But also maybe not. So far we benefited tremendously: pushing SPV forward was the right call.

    When any change is proposed it’s natural and human to immediately come up with as many objections as possible. The reason is, if we object now and something does go wrong, we can make ourselves look smart and wise by saying “told you so”. But if we object and nothing goes wrong, people usually forget about it and move on. This gives a huge incentive to consider only risks and not benefits, it gives huge incentives to try and shoot things down. Sometimes people call this stop energy, a term coined by Dave Winer: http://www.userland.com/whatIsStopEnergy - I see it here. Nobody above is talking about the considerable benefits of a fully decentralised assurance contract app. Instead people are focusing only on costs, costs like “maybe someone will do something dumb”, which is always a concern with Bitcoin.

    Now there are two possible outcomes here:

    1. Although I have explained why various attacks are not a concern above, let’s say my analysis is wrong somehow and someone finds a way to exploit the lack of block chain authentication on getutxos and causes problems for my users. Let’s also say that other ways to fix the problem, like using Tor and cross-checking nodes don’t work. In that case I will have to fall back to using a set of trusted nodes instead and people can say “told you so”. I’m sure they will enjoy it.

    2. In fact the concerns are overblown and nobody mounts successful attacks, either because it’s too hard, or because there’s no benefit, or because by the time someone finds a way to do it and decides they want to the world has moved on and e.g. we have UTXO commitments or simply Bitcoin assurance contracts are irrelevant for some reason.

    In the latter case, it’s a repeat of Bloom filtering so far - we will have benefited! More decentralisation! More simplicity!

    The argument being made here is, let’s just assume failure and skip straight to the centralised trust based solution. Or more subtly, let’s set up a hypothetical straw developer who we assume does something dumb, and use that as a reason to not add features.

    I have a different idea - let’s add this feature and see what happens. Maybe it turns out to be useless and people get attacked too much in practice, in which case it would fall out of use and in future could be removed from the protocol with another version bump. Or maybe it works out OK, eventually gets extended to contain UTXO proofs despite the lack of real-world attacks, and the story has a happy ending.

  33. gavinandresen commented at 1:38 pm on June 17, 2014: contributor

    Sorry @sipa, I agree with Mike: let’s add this feature.

    RE: service bits versus version numbers: In my experience, APIs/protocols fail when they wimp out and make lots of things optional. It becomes impossible to test the 2^N combinations of N optional features once N is greater than… oh, two.

    The unspent transaction output set is something every ‘full’ node should know, so I see no reason to do a service bit over bumping the version number.

    RE: fears that lazy programmers will Architect In Bad Ways: “better is better.” Letting SPV clients query the state of a full node’s UTXO set is useful functionality. And simple is generally more secure than complex.

  34. sipa commented at 3:02 pm on June 17, 2014: member

    RE: fears that lazy programmers will Architect In Bad Ways: “better is better.” Letting SPV clients query the state of a full node’s UTXO set is useful functionality. And simple is generally more secure than complex.

    I don’t disagree at all that it is useful. I even gave an extra use case for it (mempool conflict checking).

    I just want to not make the distinction between the p2p system and services offered by full nodes fuzzier.

    Getutxos is not costly (as currently implemented) and I’m not particularly worried about DoS attacks that could result from it. I’m worried about providing a service that the ecosystem grows to rely upon, making it harder to change implementations (gmaxwell’s idea of provable deniability of chainstate data through encrypting UTXOs is a nice example).

    If you’re going to use data that needs trusting a full node, fine. Let’s just make sure people actually trust it.

  35. gmaxwell commented at 4:04 pm on June 17, 2014: contributor

    This doesn’t appear incompatible with the txout set encryption. The idea there is to key the utxo set with some hash of the txid:vout and encrypt the data with some different hash of the txid:vout, thus the node itself does not have the data needed to decrypt the txout until the moment it’s needed. Since this would provide the txid’s it would still work even if it returned the data… though for the motivation of the encrypted txouts we might prefer to not receive the txid until strictly needed, and instead do query by hash for this kind of spendability.

    That said, I consider serving additional unauthenticated data strongly inadvisable. It risks incentivizing sybil attacks against the network and we’ve already seen people (apparently) trying to attack miners in the past with nodes lying about the time, so these kinds of attack are not just theoretical. We should be moving in the opposite direction in the core protocol, not making it worse. And if we do provide facilities which are not necessary for the basic operation of the system they should be behind service flags so we have the freedom to abandon them later without instantly breaking any node that calls them.

    Trusted services are already offered by electrum nodes— which have authenticated and encrypted connections and a curated node database which may prevent sybil attacks, at the expense of a more centralized dependency— which should be acceptable here, since the argument was that the data doesn’t need to be authenticated at all. Why can’t this use the existing electrum infrastructure for quasi-trusted wallet data?

    Is the mempool bool really the right design? ISTM that nodes that want to know if the txout is confirmed or in the mempool are going to need to query all of them twice.

  36. maaku commented at 5:55 pm on June 17, 2014: contributor

    My +1 goes to both @sipa and @mikehearn on this. This is a trusted call, and we are giving people enough rope to shoot themselves in the foot. Ideally stuff like this should be disabled by default and/or placed behind a special authenticated connection. But that is a separate issue necessitating a separate pull request; this gets my ACK once best block hash & height is added.

    When I first heard about this at the conference I thought this was crazy – we need to be implementing trustless mechanisms for these things! If there is a usecase now, then let that drive development on, e.g. UTXO commitments, and let’s do this the right way. I still feel that way generally. However in this particular case I am more than willing to make an exception: Lighthouse will be transformative to bitcoin development, and is exactly the ideal platform for crowdfunding work on trustless tech. So I’m okay with merging this now, and reaping the benefits for bitcoin while also working on all the improvements mentioned.

    The UTXO commitments I’m working on are currently developer-time limited. There’s a working Python implementation and complete test vectors, as well as an optimized commitment mechanism (#3977, which needs a rebase), and C++ serialization code for the proofs. All we need is the C++ code for updating proofs, and the LevelDB backend. So if there are other bitcoind hackers out there interested in doing this The Right Way, contact me. However it requires a soft-fork, so rolling it out necessitates some degree of developer consensus, community education, and miner voting process (or the beneficence of ghash.io…), all of which together requires as much as a year if past experience is a judge. Lighthouse, on the other hand, can do good for bitcoin crowdfunded development right now.

    @gavinandresen It is only temporarily the case that the full UTXO set is something every full node needs to know. With either TxO or UTxO commitments it becomes possible to prepend spentness proofs to block and transaction propagation messages, at which point nodes are free to drop (portions of) the UTXO set. There is consensus we are heading in a direction which enables this, just not consensus over TxO vs UTxO and the exact details of the data structure.

  37. jgarzik commented at 9:52 pm on June 17, 2014: contributor

    <vendor hat: on> This duplicates multiple other open source projects such as Insight, which provides the same queries and more: https://github.com/bitpay/insight-api

    Running Insight is trivial for anyone running bitcoind. Anyone not running bitcoind can probably ask or find someone trusted who is already running such a server.

    I’m just not seeing a driving use case here [that is not already filled by existing software]. You don’t have hordes asking for this feature; and if people are asking for this feature, it is easy to point them to an existing project that can roll this out instantly.

    (Because, remember, you cannot start using this functionality even if you merge the PR today)

  38. laanwj commented at 6:32 am on June 18, 2014: member

    @jgarzik Using insight for this seems overkill, as it needs no extra indexes - bitcoind has the required information ready. If this had been a request to add a private RPC call instead of a public P2P network message, IMO it would have been an easy ACK.

    @sipa

    I don’t disagree at all that it is useful. I even gave an extra use case for it (mempool conflict checking). I just want to not make the distinction between the p2p system and services offered by full nodes fuzzier.

    @gmaxwell

    We should be moving in the opposite direction in the core protocol, not making it worse. And if we do provide facilities which are not necessary for the basic operation of the system they should be behind service flags so we have the freedom to abandon them later without instantly breaking any node that calls them.

    Right, that was my idea as well. It can be useful but if you want to offer these kinds of ‘courtesy’ services, they should be separate - either behind a service bit or such or a separate network like Electrum. Not mandatory extensions that the Bitcoin P2P network is stuck with forever. (as you say, you can’t remove features you have introduced with a version: all existing software will interpret >=version as ‘has this and this feature’ no matter what is specified later).

    As @maaku says, NODE_NETWORK already is a hack that conflates different ‘services’. I’d prefer to make this part of a new service and add a NODE_QUERIES bit or such.

  39. mikehearn commented at 7:28 am on June 18, 2014: contributor

    It does not make sense to run a full blown block explorer for something that all bitcoind’s can serve without any additional indexes.

    @drak There is no danger to people’s money. Have you read my original post or the analysis above? Even if you disagree for some reason, you just... don’t run an app that uses the message.

    @maaku Thanks, it’s nice to see someone weigh the potential benefits too. I’ll try and make some time to test/play with your UTXO commitments work in future.

    I do not believe adding such messages is “going in the wrong direction”. Telling people to pick some trusted people and trust them is the wrong direction. This is the same reasoning by which we’d have no SPV clients at all because some theoretical attack exists, so we should all just use Electrum and presumably if one or two Electrum server operators turn out to be corrupt, we ask governments to regulate them? Or remove them from a magic list and wait until they come back under another identity? Am I supposed to ask Electrum operators to submit to an exchange-style passport examination? Or just assume because I met the guy in a bar once and he seemed OK he must be trustworthy? I don’t think “use Electrum” is quite the silver bullet it’s being made out to be. Actually I suspect that randomly picking a few nodes out of 8000 (especially via a Tor circuit using one of their ~1000 available exits) and cross checking their answers stands just as good a chance of being robust.

    Additionally, let’s be realistic about what UTXO commitment authentication in this context really means: we have something close to a miner cartel today. If we assume that nodes are untrustworthy and it’s the proofs that matter, then those proofs could be mis-generated by a grand total of two or three people; hardly a quantum leap in trustlessness. UTXO proofs make sense on the assumption we can fix the mining market though, so we should still optimistically do them.

    Now, look, there’s genuine and reasonable disagreement here. But we should all be willing to consider the idea that we’re wrong. I faced that possibility above: if attacks happen, then I will have to rewrite things to use some trusted nodes (probably bitcoin p2p tunnelled over SSL to nodes either run by myself, or by people I know). So I know what my plan is. But if you’re against this feature, what if you are wrong? What if the attacks don’t happen? Then you have made a real Bitcoin app less decentralised, and set a powerful precedent that nobody should propose new features for Bitcoin unless every imaginable risk can be mitigated, including risks borne by developers who haven’t come along yet. This practically ensures nobody ever tries to make the protocol better. That’s a big cost we should take very seriously.

    Anyway, I’m encouraged enough by support from Gavin and Mark that I’ll go ahead and add the height/block hash to the message. I’ve already added the service bit so I think that meets what @laanwj wants. When the code is merged I’ll write the BIP.

  40. petertodd commented at 8:30 am on June 18, 2014: contributor

    @sipa made a great point above in that getutxos is fundamentally even worse than bloom filters in terms of trust because there is absolutely no security at all. At least CMerkleBlock’s have strong assurances that a given transaction exists even if there’s no current way to know if you’ve seen all transactions. (which incidentally is something we can even get reasonably good assurance for via random sampling without a soft-fork) Similarly it’s knowledge that only gets better every time you connect to an honest node, even after connecting to a dishonest one - the set of transactions you know about and block headers you know about can only be added to.
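
For readers unfamiliar with why a merkle path gives that assurance, here is a minimal sketch of checking a transaction hash against a block's merkle root. It uses a plain sibling-list branch rather than the partial merkle tree encoding that CMerkleBlock actually serializes, and the function names are hypothetical. The hashing (double SHA-256 over the concatenated pair, duplicating the last hash on odd-length levels) follows Bitcoin's merkle tree construction.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Double SHA-256, as used for Bitcoin transaction and merkle hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Compute the merkle root of a non-empty list of leaf hashes."""
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last hash on odd levels
        level = [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def root_from_branch(leaf: bytes, branch: list[bytes], index: int) -> bytes:
    """Fold a leaf up to the root using its sibling hashes.

    `index` is the leaf's position in the block; its low bit at each level
    says whether the running hash is the right (1) or left (0) child.
    """
    h = leaf
    for sibling in branch:
        h = dsha256(sibling + h) if index & 1 else dsha256(h + sibling)
        index >>= 1
    return h


def verify_inclusion(leaf: bytes, branch: list[bytes], index: int, root: bytes) -> bool:
    """True if the branch links the leaf to the given merkle root."""
    return root_from_branch(leaf, branch, index) == root
```

A client that already holds the block header (and so the merkle root, backed by proof of work) can run `verify_inclusion` on data from an untrusted peer. A getutxos reply has no equivalent check, which is the contrast being drawn above.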

    getutxos doesn’t have any of that assurance. There’s no proof whatsoever and there’s no way to reconcile conflicting responses. You can handwave and say you’ll cross check answers, but that’s assuming you even have a set of nodes to randomly pick from - you don’t. Fundamentally your root of trust in this design is the DNS seeds you started to learn nodes from. Compromise those nodes and your SPV will never learn about an honest node and you’re screwed. But unlike Electrum because there is no authentication anywhere in the P2P network not only are you trusting that root of trust, you’re trusting your ISP, you’re trusting Tor exit node operators, etc. It gets even worse when you remember that “compromising” the DNS seeds can mean nothing more than just running a few dozen fast nodes and DoS attacking the other nodes so yours end up at the top of the DNS seed lists.

    The whole rant about “What do you do if an Electrum node operator is dishonest?” is particularly bizarre: obviously you update the software to stop using that node. It’ll happen once in a while, it’ll be detected, and you have a very clear and simple procedure to follow if it does. That’s why Electrum itself has a config screen that lets you pick what node(s) you want to trust. (and cross-checks them too) Just like Tor it relies on the fact that the operators of the service are well known. (incidentally, over half the Tor bandwidth is provided by people/organizations the Tor project knows, and that’s a number that’s watched carefully)

    Why are we wasting our time on this very insecure design with no privacy? Electrum already exists, use it in the short term. As for the long term @maaku or someone else will finish a proper authenticated solution and we can add a secure NODE_GETUTXO service. Heck, the saddest part of all this is the use-case initially presented - Lighthouse - simply doesn’t need NODE_GETUTXO and would work perfectly well using bloom filters as I explained above.

    The success of Bitcoin’s decentralization is based on cryptographic proof, not blindly trusting strangers.

  41. laanwj commented at 10:00 am on June 18, 2014: member
    Talking of Electrum: At some point in a mailing list discussion the author of Electrum was also interested in UTXO queries. Though the exact semantics were different: a query of transactions by output/address instead of txout point (see https://www.mail-archive.com/bitcoin-development@lists.sourceforge.net/msg04744.html). That would make it possible to implement an Electrum server on top of bitcoind - although in that case it doesn’t matter whether the calls get added to RPC or ‘trusted extensions’ on P2P.
  42. BitcoinPullTester commented at 10:06 am on June 18, 2014: none
    Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/93a2b928dca7d1d2b2aa17f991a53a99ee02fe86 for binaries and test log. This test script verifies pulls every time they are updated. It, however, dies sometimes and fails to test properly. If you are waiting on a test, please check timestamps to verify that the test.log is moving at http://jenkins.bluematt.me/pull-tester/current/ Contact BlueMatt on freenode if something looks broken.
  43. jgarzik commented at 11:44 am on June 18, 2014: contributor

    “NODE_NETWORK already is a hack that conflates different ‘services’”

    It’s not a hack. It continues to mean NODE_FULL, whatever that means. For years, clients really only cared about the binary question “full node: yes or no?” It is not a hack to provide what the clients want.

    For the purposes of this PR, you can bump the protocol version and things will be just fine; as noted, other full nodes must provide UTXO query ability in order to verify incoming traffic anyway.

    The trust remains, however. That is why existing solutions are attractive. Existing solutions handle the trust issue.

    I would be more comfortable if we had UTXO commitments and other features to help the trust issue.

  44. laanwj commented at 12:45 pm on June 18, 2014: member

    @jgarzik Yes it is obvious how NODE_NETWORK is defined now and in the past, but going forward it’s less clear what services a full node should offer. Or even what a “full node” is. Internally it needs to verify UTXOs but that doesn’t mean it needs to offer that capability to random other nodes on the network. It could be said that a full node, to maintain the network

    • needs to verify transactions and blocks before forwarding them
    • serves blocks to bootstrap other nodes

    All kinds of extra services for (SPV) clients are less clear-cut. And even these two could in principle be separate, for example a node could decide to stop serving blocks after a bandwidth quota has been filled. That’s why NODE_NETWORK is a hack. It says what the node is, but what other users of the network really want to know is what services does it offer (hence “service bits” not “identity bits”).
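
The difference between gating a feature on a service bit and gating it on the protocol version can be sketched as below. `NODE_NETWORK` is the existing first service bit; `NODE_QUERIES` is only the name floated earlier in this thread, and both its bit position and the `GETUTXOS_VERSION` number are made-up values for illustration.

```python
NODE_NETWORK = 1 << 0   # existing service bit: "full node"
NODE_QUERIES = 1 << 1   # hypothetical bit and position, for illustration only

GETUTXOS_VERSION = 70003  # hypothetical protocol version that introduced the message


def supports_queries_by_service_bit(services: int) -> bool:
    """A peer opts in explicitly; it can drop the bit later without a version change."""
    return bool(services & NODE_QUERIES)


def supports_queries_by_version(version: int) -> bool:
    """Every peer at or above the version is assumed to support the feature forever."""
    return version >= GETUTXOS_VERSION


# A future full node that has stopped answering UTXO queries:
peer_services = NODE_NETWORK
peer_version = 70010

print(supports_queries_by_service_bit(peer_services))  # False: the peer never advertised it
print(supports_queries_by_version(peer_version))       # True: old clients will still ask
```

With version gating, the second check keeps returning true for all later versions, which is the "stuck with it forever" concern raised above. With a service bit, the peer simply stops advertising the capability.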

  45. jgarzik commented at 2:22 pm on June 18, 2014: contributor

    NODE_NETWORK specifies quite precisely what the node offers. It is not a hack, it is very specific.

    Folks are attempting to redefine it as a hack, simply because the model of lumping all new features under NODE_NETWORK is not sustainable moving forward.

    Anyway, this is semantics, moving on, to a more on-topic issue:

    There is clearly a desire to generalize the P2P network. Make it a network with a forest of nodes, each of which carrying somewhat-bitcoin-related services. But, e.g., remote querying of the UTXO set is clearly not needed for bitcoin operation.

    Therefore, this proposal smells similar to others, which attempt to turn the bitcoin P2P network into a general decentralized-money-and-query-services network. That is a mistake.

  46. jtimon commented at 2:56 pm on June 18, 2014: contributor

    What about implementing this using @jgarzik’s #2844 instead of the p2p protocol? Seems like the right place to put trusted calls. If/when in the future committed utxo’s (or some other mechanism) allows these calls to be trust-less, they can be moved to the p2p protocol. Probably I’m missing the point of putting it directly in the protocol.

    Being accepted directly in the protocol (maybe with plans of extending it later with additional proofs), it raises the question of the version number vs service bits vs service strings vs service strings with their own version or whatever… But that’s not something unique to this PR and I think the mailing list is more appropriate for that discussion.

  47. mikehearn commented at 3:06 pm on June 18, 2014: contributor

    @jgarzik Bitcoin is a money-services-and-query network! Maybe in future it won’t be, but people have been paper napkinning entirely new network architectures for years, so it’d be a mistake to hold up changes to the existing protocol because someone wrote an email imagining some radical evolution somewhere. We need to work with what we have today. So IMHO theoretical debates about what the perfect Bitcoin design is belong on the mailing list, or blog posts, or on a forum somewhere, not on a random pull request.

    Current status: I have received exactly two comments on the code itself, from Wladimir and Mark, both of which are resolved.

  48. sipa commented at 3:21 pm on June 18, 2014: member

    I agree that discussion about what the properties and services offered by the network should be should not be done on this pull request.

    On the other hand, I think extensions to the P2P protocol itself also deserve discussion on the mailing list, perhaps before implementing it…

  49. petertodd commented at 3:23 pm on June 18, 2014: contributor
    @mikehearn A bad idea with a working implementation is still a bad idea; the actual code is the least difficult part of doing Bitcoin development. It shouldn’t be a surprise that people who believe the idea is fundamentally bad haven’t bothered wasting their time reviewing the code. Secondly that we’re seeing strongly held fundamental disagreements about whether the idea is good at all strongly suggests this pull-req should be shelved until consensus on the idea itself can be reached - most likely on the mailing list.
  50. jgarzik commented at 3:24 pm on June 18, 2014: contributor

    @mikehearn You elided “generalized”. The current network supports operations necessary to maintain the full node network, and service SPV clients. That is a focused, specific purpose. It is not a generalized query network, with query functions added because it was easy to add to bitcoind.

    The use case mentioned can be easily handled using Insight or another existing UTXO query software.

    This is a significant change to the operation of the network, moving away from the focused purpose of maintaining the chain and servicing SPV nodes. If the community wants to do that… fine. But not appropriate material for a PR.

  51. jgarzik commented at 3:24 pm on June 18, 2014: contributor

    @jtimon I would ACK such a patch. Localized queries are fine by me.

    It is adding the query to the P2P network without thinking it through that is the problem.

    ETA: I have a similar use case @mikehearn auctions: https://github.com/jgarzik/auctionpunk It has the same problem: You make a pledge (sign a transaction), but it might be double-spent away etc. by the end of the auction. You do indeed need to query the UTXO. I would never add such query facility to the P2P network, though.

  52. petertodd commented at 3:29 pm on June 18, 2014: contributor
    @jgarzik CoinJoin implementations also have that need, and again we’re pursuing alternate ways to solve the issue. (trusted server w/ Dark Wallet, general anti-DoS mechanisms for decentralized CoinJoin)
  53. jgarzik commented at 3:31 pm on June 18, 2014: contributor

    @petertodd It is poor form to make assumptions about what others have reviewed, or not. The code in this PR was reviewed by myself and is straightforward.

    That does not change any of the larger issues.

  54. petertodd commented at 3:36 pm on June 18, 2014: contributor
    @jgarzik By “code review” I mean the whole process, which includes discussing it - at least ACKing it. I did as much review as you did, but I wouldn’t call that the full process of code review.
  55. mikehearn commented at 3:39 pm on June 18, 2014: contributor

    We’re going around in circles now.

    @jgarzik Re: Insight - as already pointed out, it makes no sense to run a full blown very expensive block explorer just for access to the UTXO set. I know it’s made by your employer but it’s really just not relevant here at all.

    This is not a major change to the network, that’s melodramatic. It’s a tiny patch. It may not match with your personal vision of what the network should be in future, but the Bitcoin protocol has always been about whatever is needed for a decentralised financial system: remember that in the early days it not only had the current functions (including distributed network status alerts), but also had the foundations of an entire P2P marketplace in it! Not to mention that SIGHASH_ANYONECANPAY and other unused features were obviously put there for a reason. This idea that the network should be as minimal as possible and we should rely on trusted third parties for everything else is very much the newcomer here; that’s not how the project has historically operated.

    We have a more fundamental and deeper rooted problem here. Bitcoin cannot operate anymore as a consensus driven project because some people (like @petertodd) reliably object to everything they did not invent and others like to say “no” to anything that anyone can possibly imagine a better solution for, regardless of whether that other solution exists or will be implemented in any reasonable time.

    Indeed, anyone reading this thread would correctly conclude that this community does not have ANY coherent vision of what Bitcoin should become.

    For this reason the only decision I really care about here is that of @gavinandresen. Someone needs to (re)set a clear vision and set of technical principles, then absurd threads like this one can be avoided in future.

  56. jgarzik commented at 3:48 pm on June 18, 2014: contributor

    @mikehearn: “the Bitcoin protocol has always been about whatever is needed for a decentralised financial system”. This is your vision, sure.

    If you just want to have a private conversation with Gavin, then take it to private email. If you wish to participate in an open source project, you are going to have to suffer through opinions other than your own.

    Ignoring 100% of @petertodd ’s messages in this thread, even ignoring my own, there are plenty of valid criticisms you are just handwaving away.

    This belongs on the mailing list. It is obviously not an easy ACK.

    It is a very strong candidate for closing this PR, until discussion happens on the list and consensus is reached there.

  57. mikehearn commented at 3:53 pm on June 18, 2014: contributor

    I’m happy to suffer others’ opinions ;) But the outcome obviously can’t be based on consensus (there are no non-trivial changes to bitcoin that have had consensus in recent times).

    Which criticisms do you have in mind? Actually I think I addressed all of them, either in the original writeup or my followup comments. Certainly all the code comments were addressed. The others are why I included a “philosophy” section.

  58. petertodd commented at 4:10 pm on June 18, 2014: contributor
    @mikehearn Of course, the beautiful thing is that we don’t need consensus: you can always create a Bitcoin Core fork for people who want to volunteer to provide decentralized and unauthenticatable services to others if you can’t get consensus that doing so is a good idea. You’re welcome to copy the preferential peering code from my replace-by-fee branch so those nodes can find each other more easily. Do this and perhaps it will be easier in the future to get consensus among the whole community that this is in fact a good idea, particularly if the mechanism gets widespread use. It’d be easy to implement the even more general txindex lookup that I recall you wanting before as a NODE_TXINDEX service in that fork too. It’d also make it easier to implement things like proof-of-passport to (perhaps) give some assurance that your peers for these services aren’t sybil attacking you - all things that can easily be done in a fork you’re leading the development of.
  59. jtimon commented at 4:37 pm on June 18, 2014: contributor
    @mikehearn you didn’t address my suggestion about supporting this from a REST call extending @jgarzik’s PR #2844. Wouldn’t that be enough for your purposes? I understand that using another software with its own database and indexes may not be satisfactory for you (especially when you have all the data you need here). But this would be in bitcoind without being part of the p2p protocol (yet, maybe in the future).
  60. jgarzik commented at 4:44 pm on June 18, 2014: contributor

    A direct query interface (e.g. HTTP REST) makes much more sense for the outlined use case(s), due to the trust issues.

    It is difficult to fathom that a completely untrusted interface will be used remotely across a network by a KickStarter fund administrator, when a trusted local interface is also available.

  61. mikehearn commented at 4:56 pm on June 18, 2014: contributor
    @jtimon I didn’t address that because it’s the same suggestion as “use Electrum” or “use your own trusted nodes” which has been discussed a lot above. The wire format (p2p or http) makes no real difference.
  62. jtimon commented at 5:03 pm on June 18, 2014: contributor
    Yes, since it is a trusted call you will have to use your own trusted nodes. If in the future there’s committed utxo then it makes sense to put it in the p2p protocol since you don’t require trust anymore (you trust the most work). Probably I’m missing something with the references to Electrum. I didn’t propose that and I’m not sure what Electrum has to do with this PR.
  63. mikehearn commented at 5:16 pm on June 18, 2014: contributor
    Please read this PR from beginning to end, as like I said, this discussion has started to go round in circles. My commit message has a complete discussion of authentication in it, along with why I don’t think the current design is a problem for my use case, and there’s more discussion in the followups which is where Electrum is mentioned.
  64. maaku commented at 5:37 pm on June 18, 2014: contributor
    @jtimon, Electrum has as much or as little to do with this PR as the REST api. @mikehearn intends to use this with untrusted hosts: query the utxoset from multiple peers, using both direct connections and multiple TOR exit nodes. He does some analysis of the attack vectors in the OP. The unstated assumption (and a fair one) is that running a full node is just not an option.
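
    The multi-peer consistency check described above could be sketched roughly as follows. This is only an illustration: `PeerAnswer` and `CrossCheck` are hypothetical names, not part of the patch, and agreement among peers does not prove the answer is true (all of them could be controlled by one attacker).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical summary of one peer's getutxos answer for a single outpoint.
struct PeerAnswer {
    bool unspent;   // peer claims the output is still unspent
    int64_t value;  // claimed output value in satoshis (ignored if spent)
};

// Accept an answer only if at least minPeers responded and all of them agree.
// On success the agreed answer is copied to `agreed` and true is returned.
bool CrossCheck(const std::vector<PeerAnswer>& answers,
                std::size_t minPeers,
                PeerAnswer& agreed)
{
    if (answers.empty() || answers.size() < minPeers) return false;
    const PeerAnswer& first = answers.front();
    for (const PeerAnswer& a : answers) {
        if (a.unspent != first.unspent) return false;
        // Values only need to match when the output is claimed unspent.
        if (a.unspent && a.value != first.value) return false;
    }
    agreed = first;
    return true;
}
```

    A caller would collect one `PeerAnswer` per connection (direct and via Tor) and treat any disagreement as "unknown" rather than picking a side.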
  65. mikehearn commented at 6:23 pm on June 18, 2014: contributor

    @drak I have not handwaved, I have given careful analysis of why I think this change is fine, and why I feel the risk is low enough to make an attempt worthwhile. The proposed alternatives are not better. They are merely the same thing but less decentralised, on the assumption that the approach I’d like to try will fail.

    Do you have a specific concern with my analysis?

    If you are concerned about the lack of miner authentication then great, please help @maaku with his project, as that’s a big task and he could use the help. But also please understand the basic counterargument: perfect is the enemy of the good, and at any rate given the state of the mining market “perfect” in this case is very far from actually being perfect.

    Note that we’re going to have this exact same discussion again in a month or two anyway when floating fees is integrated. SPV nodes need to find out what the fees are. There’s a low(er) trust way, but it’s a big project, so the first pass will probably just be “poll the peers and average”. The same objections will come up, the same arguments will be made. So we need to get this out of our system at some point - might as well be now.

  66. petertodd commented at 7:46 am on June 19, 2014: contributor
    @mikehearn Yes, perfect is the enemy of good. Using Electrum isn’t perfect, but it is good; certainly better than trusting random strangers with no authentication and a whole multitude of attack vectors.
  67. mikehearn commented at 8:40 am on June 19, 2014: contributor
    We disagree on that, fundamentally. As this new command costs nothing (ignoring pie in the sky future total redesigns that nobody has even started coding), how about, we merge this in, you build apps your way based on Electrum or heck just build apps period, and I’ll build apps my way. Then those apps can compete on decentralisation and security. Experience will tell us who is right. Sound fair?
  68. petertodd commented at 9:06 am on June 19, 2014: contributor

    @mikehearn Note how I’m not the one asking to add highly controversial features to Bitcoin Core and complaining that people don’t just merge them in instantly. You’re welcome to add these features (and more!) to a fork and convince people to volunteer to provide those services.

    Secondly, the whole “Peter never gets anything done” thing is a rather dull insult - I am involved in multiple projects as an advisor to people building apps and spent a great deal of time helping those people build good apps. (e.g. reality keys, multiple colored coins projects, mastercoin, counterparty, coinkite, dark wallet, paypub, etc.) You might as well criticize Gavin for not writing enough apps in his new role as Chief Scientist.

    re: competing: I was thinking of sybil attacking testnet in the near future with the bloom data hiding attack. Any objections to the test?

  69. mikehearn commented at 9:24 am on June 19, 2014: contributor

    Sybil attacking the tiny testnet where you run one of the DNS seeds isn’t an interesting experiment - but if you want to break a resource other developers rely on and annoy them in the process in order to prove not much, go ahead.

    A better experiment is this. I have in my pocket a phone. In 48 hours I will look at what nodes it’s connected to. If you own the majority of them and can keep control of that phone for, say, a week, then that would be a more useful data point.

  70. petertodd commented at 9:33 am on June 19, 2014: contributor

    @mikehearn Obviously I’m going to be leaving the DNS seed untouched; wouldn’t be a very interesting attack otherwise. Besides, we already know that my node going down was enough to make bitcoinj-using software fail because it all had a single point of failure.

    re: a bloom filter attack, all (?) SPV clients using them only ask for a given block one time right? (although from multiple peers at once right?)

  71. mikehearn commented at 9:43 am on June 19, 2014: contributor

    Alright, so go sybil attack a tiny network without touching your DNS seed then. What are you trying to prove? That attacking small networks is cheap? You could outmine the testnet quite cheaply too, thus rendering UTXO proofs also irrelevant. Nobody disputes that attacks are possible if someone puts enough effort in; the difference between us is the observation that:

    a) The gains are not obvious, so in reality nobody is incentivised to do it and so far nobody has. If your “incentive” is to win a pissing match on a github thread, that doesn’t tell us much about the behaviour of the system over the long term, does it?

    b) The main Bitcoin network is quite large and so attacking it takes a fair amount of resources, quite possibly more resources than the benefit, which as above, is often unclear anyway.

    Anyway, this is a waste of time. Yes, we know attacks exist. They always have. Demoing one in a test environment doesn’t show us much. What helps is practical experience from the real world, and adaptation to real attacks.

    Let me rephrase my question: as this feature costs nothing, why don’t you go advise others not to use it, and let their products compete on decentralisation and security instead? Seems like the simplest resolution of your complaint.

  72. petertodd commented at 9:53 am on June 19, 2014: contributor

    why don’t you go advise others not to use it, and let their products compete on decentralisation and security instead?

    Obviously adding this feature to Bitcoin Core is inherently putting our stamp of approval on it and will be taken as advice to use it.

  73. mikehearn commented at 10:07 am on June 19, 2014: contributor

    Oh, well good thing everyone always follows such advice then, that must explain why there are no wallets that have created their own client/server protocol :)

    The BIP can explain the tradeoffs and attacks involved just as it does for Bloom filtering and HD wallets, etc. It can also link to this discussion. I’m sure developers can come to their own conclusions, as they have done many times already.

    @drak It’s not ad hominem to point out that Peter doesn’t write apps, even he agrees with that statement. Now, I asked you for specific concerns that have not already been brought up and addressed; do you have any? The only remaining concern that is unaddressed is “some developer might use it without understanding it simply because it’s there”, but I can’t address that because we’re arguing about the behaviour of hypothetical people who aren’t here to say otherwise. It’s also an argument so vague it could be invoked for all of Bitcoin, and would hold, but that’s no reason not to build Bitcoin.

    Anyway, I’m not going to argue this any further. I think all possible aspects of this have been explored by now. No successful open source project is run by having random people turn up and vote on every change, ultimately there must be a chain of command.

  74. maaku commented at 10:13 am on June 19, 2014: contributor

    If unanimous consensus from this group of commentators is required in order to get something merged, then perhaps bitcoin should be forked. At least we’d then be able to get some work done. A culture of what-ifism goes nowhere.

    The text of the pull request lays out a very straight forward use case and analysis of attack vectors. Does anyone have any specific comments or concerns on what is actually being proposed within that context?

  75. petertodd commented at 10:57 am on June 19, 2014: contributor
    @maaku This isn’t “what-ifism”, this is what you expect from competent developers working on a project where mistakes are extremely expensive. (heck, I personally have spoken to people with a sum total of a low five figures of losses from zeroconf doublespends) If anything, Bitcoin Core development is particularly bad because we’re setting standards that other developers will use in turn - unauthenticated data services like getutxos with no privacy invite dangerous designs with a very large and unpredictable attack surface.
  76. mikehearn commented at 12:40 pm on June 19, 2014: contributor

    Oh, I’m going to break my “end of argument” rule just once, to outline another use case for getutxo that I didn’t mention before (because it’s not relevant quite yet), and maybe this one will interest @sipa and change his mind :)

    Floating fees are coming, and as part of that SPV clients need to know what the fee levels are so they can craft transactions. Gavin and I have discussed how to do this, and this is what I proposed: we start with a simple getfees that is just poll peers and average. Of course, the usual caveats apply: it’s attackable if someone can control your internet connection (and/or Tor), so it’d be nice if we had a lower trust method.

    So we also have an algorithm that SPV wallets could use to estimate fees themselves. It assumes that we have forked the chain to implement tx v3, that tx v3 puts the output value into the scriptSig during signing along with the scriptPubKey (this was proposed a long time ago by etothepi), and finally that some non-trivial number of people have upgraded and are making v3 transactions. OK.

    1. When an SPV client has started up and noisy Bloom filters were sent to the peers, the wallet watches out for v3 transactions being newly broadcast across the network. For these transactions the wallet records the current time/chain height, and also records the result of a getutxo query for the spent outputs. This data isn’t authenticated by anything of course, but that’s OK because all we’ll do with it is store it for a while.
    2. The wallet adds these false positives back into its Bloom filter so it finds out when they confirm.
    3. Once they confirm, the wallet now has an SPV-level proof that the transaction is valid and thus, that its connected outputs were also valid. They aren’t unspent anymore so getutxo would no longer work, but that’s OK because we saved the data before. Now we run the scriptSigs in the transactions using the stored UTXO data, thus authenticating it.
    4. We can use the newly authenticated data to know the values of the inputs and calculate the fee. Using the time between our initial recording and when the tx confirmed, we can calculate a fee estimate.

    This doesn’t allow us to calculate priority, which also matters. For that we’d need to record the chain height of a UTXO in the UTXO set, so when a SPV wallet requests it, it can also get a Merkle proof showing the depth of all those UTXOs, allowing calculation of priority as well. If the remote node can’t provide such a Merkle branch we assume the dependencies are also unconfirmed and thus the priority is zero.

    This algorithm is not unattackable. However it’s still an upgrade over poll-and-median. Note that this doesn’t (I think) require any changes to getutxo over what I’ve coded up here. All it requires is more data being covered by the signature hash.
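
    As a rough sketch of step 4 in the list above, assuming the stored input values have already been authenticated as in step 3 (not modelled here): the fee is the sum of input values minus the sum of output values, and the confirmation delay is the time between first seeing the broadcast and seeing it in a block. The names `Observation`, `FeeSample` and `MakeFeeSample` are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// What the wallet stored when it first saw the broadcast (steps 1 and 2).
struct Observation {
    int64_t firstSeenTime;              // unix seconds at first sighting
    std::vector<int64_t> inputValues;   // values reported by getutxo, in satoshis
    std::vector<int64_t> outputValues;  // values of the transaction's own outputs
    std::size_t sizeBytes;              // serialized transaction size
};

// One data point for a fee estimator.
struct FeeSample {
    int64_t feePerKb;          // satoshis per 1000 bytes
    int64_t secondsToConfirm;  // delay between broadcast and confirmation
};

// Builds a sample once the transaction has confirmed. Returns false if the
// stored data is inconsistent (e.g. outputs exceed inputs).
bool MakeFeeSample(const Observation& obs, int64_t confirmTime, FeeSample& out)
{
    const int64_t in = std::accumulate(obs.inputValues.begin(),
                                       obs.inputValues.end(), int64_t{0});
    const int64_t spent = std::accumulate(obs.outputValues.begin(),
                                          obs.outputValues.end(), int64_t{0});
    if (obs.sizeBytes == 0 || in < spent || confirmTime < obs.firstSeenTime)
        return false;
    out.feePerKb = (in - spent) * 1000 / static_cast<int64_t>(obs.sizeBytes);
    out.secondsToConfirm = confirmTime - obs.firstSeenTime;
    return true;
}
```

    A wallet would feed many such samples into whatever estimator it uses; how to aggregate them is left open here.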

  77. maaku commented at 3:40 pm on June 19, 2014: contributor

    Chain height is in the UTXO set (along with tx nVersion and coinbase boolean flag). Height at the very least should be returned by getutxos…

  78. laanwj commented at 4:46 pm on June 19, 2014: member

    @drak Right, an open source project depends on random people contributing and reviewing changes.

    Having said that, I still haven’t heard any arguments that this change is risky to either those running Bitcoin Core or to the Bitcoin P2P network itself. Querying the UTXO database is quite cheap (by design) so it does not pose more DoS risk than the other P2P protocol messages. It doesn’t pose more of a privacy leak to the node operator than for example the “mempool” message.

    Also it is behind a service bit now, so node implementations are not forced to implement this call if they don’t want to, for whatever reason.

    But I don’t think it warrants this cry-out about insecurity. It can be used in insecure ways, sure, but on the other hand it’s possible to shoot yourself in the foot with bitcoin in lots of ways already…
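
    For illustration, a client could gate its queries on the service bit mentioned above along these lines. The bit position used for `NODE_GETUTXOS` is an assumption made for this sketch (it is not stated in this thread), and `Peer` and `PeersSupportingGetUtxos` are hypothetical names.

```cpp
#include <cstdint>
#include <vector>

// ASSUMPTION: bit 1 of the services field; check the BIP before relying on it.
constexpr uint64_t NODE_GETUTXOS = 1ULL << 1;

// Hypothetical view of a connected peer: an id plus the services it advertised.
struct Peer {
    int id;
    uint64_t services;
};

// Returns the ids of peers that advertise the getutxos service bit, so that
// the message is never sent to nodes that chose not to implement it.
std::vector<int> PeersSupportingGetUtxos(const std::vector<Peer>& peers)
{
    std::vector<int> ids;
    for (const Peer& p : peers) {
        if ((p.services & NODE_GETUTXOS) != 0) ids.push_back(p.id);
    }
    return ids;
}
```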

  79. mikehearn commented at 4:47 pm on June 19, 2014: contributor
    @maaku Ah! So what’s needed for priority calculation is already there. Excellent. Loading the block to calculate the Merkle branch is extra work that no client would use today. I’d rather add that later, once txv3 has happened and all the complicated client-side code to do low-trust fee/prio calculation is implemented so we know it works.
  80. maaku commented at 5:06 pm on June 19, 2014: contributor
    Well that triggered some thoughts about this pull request though. I would consider prepending the tx.nVersion and block height fields to the response (using VarInt serialization for minimal size), for two reasons. First, the nHeight lets you look up the transaction with a bloom filter request, as mentioned. Second, adding both tx.nVersion and nHeight prior to the output itself future-proofs the response against a later hard-fork change of the serialization format – I have code lying around somewhere that modifies the serialization primitives to take CChainParams and nHeight as parameters for exactly this reason, to enable serialization changes in a future hard-fork.
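
    To make the VarInt suggestion above concrete, here is a minimal sketch of the variable-length integer encoding as I understand it from Bitcoin Core’s serialization code (7 bits per byte, most significant group first, continuation bit set on all but the last byte). `EncodePrefix` and the field order are assumptions for illustration only and are not necessarily the wire format the patch ended up with.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Append n to `out` using the MSB-first base-128 VarInt encoding.
void WriteVarInt(std::vector<unsigned char>& out, uint64_t n)
{
    unsigned char tmp[(sizeof(n) * 8 + 6) / 7];
    int len = 0;
    while (true) {
        tmp[len] = static_cast<unsigned char>((n & 0x7F) | (len ? 0x80 : 0x00));
        if (n <= 0x7F) break;
        n = (n >> 7) - 1;  // the -1 gives every value exactly one encoding
        len++;
    }
    do {
        out.push_back(tmp[len]);  // emit most significant group first
    } while (len--);
}

// Decode one VarInt starting at `pos`; returns false on truncated input.
bool ReadVarInt(const std::vector<unsigned char>& in, std::size_t& pos, uint64_t& n)
{
    n = 0;
    while (pos < in.size()) {
        const unsigned char ch = in[pos++];
        n = (n << 7) | (ch & 0x7F);
        if (ch & 0x80) n++;
        else return true;
    }
    return false;
}

// Hypothetical per-output prefix: tx version followed by block height.
std::vector<unsigned char> EncodePrefix(uint32_t txVersion, uint32_t height)
{
    std::vector<unsigned char> out;
    WriteVarInt(out, txVersion);
    WriteVarInt(out, height);
    return out;
}
```

    With this encoding a version of 1 takes one byte and a height around 300000 takes three, compared with four bytes each for fixed-width 32-bit fields.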
  81. mikehearn commented at 5:12 pm on June 19, 2014: contributor

    Currently the protocol doesn’t let you look up block by height, you always have to provide a hash. I guess you’re assuming we’d fix that?

    Good point about serialisation format and version prefixing. I’ll modify the patch to do that. Other than new script opcodes is there much chance of a txout format change in future? I can’t think of any proposals. But it can’t really hurt either.

  82. maaku commented at 5:19 pm on June 19, 2014: contributor

    You can get a header by height, and compute the hash from that, right? But if this becomes common, it’s probably something worth fixing.

    Regarding upgrades, if we’re going to hard-fork anyway to increase the block size, then there is some house-keeping we could do to e.g. make the serialization smaller. Potential larger changes I could see happening in bitcoin include switching to merklized scripts, or a divisibility improvement (e.g. switching to decimal64).

  83. mikehearn commented at 5:32 pm on June 19, 2014: contributor
    Can you? getheaders takes a block locator and a stop hash. Height is I think hardly ever used in the protocol except in the ver announcement. Unless I’m missing something obvious ….
  84. mikehearn commented at 5:40 pm on June 19, 2014: contributor
    Ah yes MASTs. Good point. OK, sure. I’ll work on that tomorrow. Should be a simple change. Thanks again for the useful feedback.
  85. jgarzik commented at 1:57 am on June 20, 2014: contributor

    @laanwj The change is risky for anyone (a) using this feature, but without (b) extensive multi-node quality testing of data. We are rolling out the easy part, with the Hard Part either TBD or DIY.

    IOW, the Path Of Least Resistance here is quite dangerous. Doing the easy thing gets you zero trust data.

  86. laanwj commented at 8:25 am on June 20, 2014: member

    Doing the easy thing always gets you into pain with bitcoin. Assuming immutable txids? You get swindled. Assuming that zero-conf double-spends never happen? You get swindled. Using an online wallet? You might get swindled. And so on… When developing bitcoin software one has to be extremely careful. Adding an obscure, optional query function to the P2P protocol is hardly going to make this worse.

    Anyhow, let’s turn this to a constructive discussion: who is going to submit a pull that implements this in a secure, trustable way?

  87. mikehearn commented at 11:10 am on June 20, 2014: contributor

    @laanwj The commit description already explains what has to be done to make it what I think you’re asking for i.e. authenticated by miners (which let’s face it, is not hugely meaningful at the moment). But as I explained it requires UTXO commitments, a huge upgrade to all of Bitcoin. The pull request you’re asking for is in reality a large series of complex pull requests and upgrades that would take according to @maaku at least a year, assuming the work is completed at all. Hence why this patch works the way it does.

    I do feel like most things brought up so far were discussed in the commit description. It’s not like this patch lacks authentication because I hate security. But large amounts of code have a cost. That’s why I’m working on assurance contracts in the first place - as a way to help us raise the money for large, complex, expensive projects like UTXO commitments.

  88. jtimon commented at 7:26 pm on June 22, 2014: contributor
    It’s hard to follow the thread with so much lateral discussion, but it seems to me that the only controversial thing about this PR is whether to put the new functionality in the p2p protocol, in the RPC interface or in an HTTP REST API. I still don’t know what the perceived disadvantages of the REST option are. For example, how would Lighthouse be affected by using HTTP instead of the p2p protocol as the source of the trusted UTXO messages?
  89. sipa commented at 7:40 pm on June 22, 2014: member
    The question does not seem to be “what transport protocol to use for communication with a trusted bitcoind instance”. It is rather “Should untrusted bitcoind instances provide unverifiable information”.
  90. Diapolo commented at 10:14 pm on June 22, 2014: none
    @sipa If that is the main question I vote NO, that should be avoided!
  91. maaku commented at 11:03 pm on June 22, 2014: contributor

    I don’t think that’s a fair assessment of the issue here. Doing getutxos the “proper” way involves at least a few more months of work on UTXO commitments, then an equally lengthy discussion period where we build consensus among developers, and finally the very lengthy process of convincing miners to follow through with a soft fork, one which causes mining pool software to take a considerable performance hit in validating shares. The proper way is going to be a long and arduous process.

    In the meantime, there are applications such as Lighthouse which need this capability now. The message is unauthenticated and therefore dangerous if you don’t know what you are doing – just like many other p2p messages! – but @mikehearn does a good job in the OP of laying out what those risks are and how to avoid them. We shouldn’t be playing nanny with the p2p protocol and blocking or disabling features because people might hurt themselves.

    About this time next year, I hope that we are having a discussion about deprecating ‘getutxos’ once a fully authenticated ‘getutxos2’ is deployed, using UTXO hash tree commitments. But until then, there is significant good that can be done with this unauthenticated version.

  92. jtimon commented at 9:45 am on June 23, 2014: contributor

    “what transport protocol to use for communication with a trusted bitcoind instance”.

    I say jgarzik’s http rest, maybe optionally authenticating the consumer of the api.

    “Should untrusted bitcoind instances provide unverifiable information”

    I agree the answer is no, but there’s another question.

    “Should TRUSTED bitcoind instances provide unverifiable information?”

    And I think the answer here is yes, definitely. But some of your comments about optionally maintaining additional indexes in bitcoin core, @sipa, leave me with doubts, and kind of contradict assumptions I’ve been making about how the core and the wallet would be separated. Anyway, this is getting off-topic; I’ll write about that on the mailing list instead of here…

    @maaku “I hope that we are having a discussion about deprecating ‘getutxos’ once a fully authenticated ‘getutxos2’ is deployed, using UTXO hash tree commitments. But until then, there is significant good that can be done with this unauthenticated version.”

    So what’s the disadvantage of having a rest getutxo first that is replaced/complemented with a p2p getutxo later as opposed to having 2 versions of the getutxo p2p message? My question is not “how to solve the potential security problems of using the unverifiable utxo message?”, my question is “why can’t you do this with rest?”.

  93. petertodd commented at 2:54 pm on June 23, 2014: contributor

    So with height and blockhash being returned, you can use getutxo in conjunction with getblock to get proof the TXO at least existed at some point. Additionally, prefix queries can be added in a backwards-compatible way. That’s enough to be useful authenticated data for some applications so I’m going to say ACK.

    Besides, it’ll be good fun exploiting the insecure software that’ll inevitably get written depending on this.

  94. maaku commented at 3:41 pm on June 23, 2014: contributor
    @jtimon Neither JSON-RPC nor the proposed HTTP REST API is exposed on public ports. How does the Lighthouse app, which has no local copy of bitcoind, access either?
  95. jgarzik commented at 3:45 pm on June 23, 2014: contributor
    @maaku The HTTP REST API is as public or private as you want it to be. It only exports public data, and is not a control plane like RPC.
  96. maaku commented at 4:12 pm on June 23, 2014: contributor

    Sure but how does the SPV node find the HTTP REST API port?

  97. jtimon commented at 7:34 pm on June 23, 2014: contributor
    Ok, RPC then.
  98. in src/main.cpp: in 93a2b928dc outdated
    4209@@ -4144,6 +4210,7 @@ bool static ProcessMessage(CNode* pfrom, string strCommand, CDataStream& vRecv)
    4210     else
    4211     {
    4212         // Ignore unknown commands for extensibility
    4213+        printf("Unknown command type: %s\n", strCommand.c_str());
    


    rebroad commented at 4:02 am on June 27, 2014:
    LogPrint()?
  99. mikehearn commented at 4:18 pm on June 30, 2014: contributor

    @rebroad Thanks for the LogPrint fix.

    OK, I rebased and added coin height and version as @maaku suggested.

    Here’s my current understanding; @sipa agrees the fee algorithm I outlined above would work once we include value under the signature hash, Peter also sees authenticated use cases with the new data, Gavin is for this, @laanwj seems in agreement too as he sees the security arguments. So I’d like to ask the maintainers for another consideration.

  100. sipa commented at 5:06 pm on June 30, 2014: member

    Can we at least start a thread on the mailing list about this? Protocol changes affect more than just one client.

    Regarding the fee estimation: I’d rather see opting in to getting txins spent along with relayed transactions (which, in ordinary cases, is authenticated information indeed), rather than through an extra roundtrip with getutxos, with the race condition that a block spending the coin in between deletes the txo.

  101. mikehearn commented at 5:31 pm on June 30, 2014: contributor
    @sipa We could, but is there an advantage to fragmenting the discussion? I think by now every possible angle has been talked about. What should I say on the mailing list thread?
  102. in src/version.h: in 4261be03f0 outdated
    25@@ -26,7 +26,7 @@ extern const std::string CLIENT_DATE;
    26 // network protocol versioning
    27 //
    28 
    29-static const int PROTOCOL_VERSION = 70002;
    30+static const int PROTOCOL_VERSION = 90001;
    


    sipa commented at 5:47 pm on June 30, 2014:
    Why not 70003? Protocol versions are intended to be independent from client versions.

    mikehearn commented at 5:52 pm on June 30, 2014:
    OK, changing ..
  103. jtimon commented at 6:01 pm on June 30, 2014: contributor
    I think sipa’s point is that not all developers of bitcoin nodes follow bitcoind development. And since this would be a protocol change and not just a bitcoind change, some feedback from developers of other implementations could be useful. Again, I would say the only controversial thing is p2p message vs RPC message. Although I’m still having trouble understanding the cases where the p2p message has an advantage over the RPC/REST version, for the record, I’m not opposed to this change; I was just trying to understand it better.
  104. mikehearn commented at 6:10 pm on June 30, 2014: contributor
    Alright, I’ll mail the list. I’ll try and boil down this enormous centi-thread :)
  105. sipa commented at 6:17 pm on June 30, 2014: member
    @mikehearn Thanks! It’s partially my fault for continuing the discussion here - it should probably have been done mostly on the mailing list in the first place.
  106. laanwj commented at 6:44 am on July 1, 2014: member
    I think it now makes sense to post a BIP draft to the mailing list and point here for the implementation. (the OP here looks like a BIP already and contains the usage scenarios and pitfalls so there’s not much to be done)
  107. mikehearn commented at 10:14 am on July 16, 2014: contributor

    int -> uint32_t

    thanks to @jrick for noticing.

  108. jgarzik commented at 12:13 pm on July 16, 2014: contributor

    I do like that it adds and advertises NODE_GETUTXOS, as it should.

    This feature should be made conditional on a command line option, so that it can be turned off.

  109. petertodd commented at 12:18 pm on July 16, 2014: contributor
    @jgarzik There should also be an easily available “chaos monkey” option that makes getutxos requests return invalid and/or misleading data, both to make it easy to test the robustness of applications using it and to make the point that it’s very easy to use it in an unverified way. Similarly we should do the same for bloom filters. (others too, but those are the two major offenders)
  110. jgarzik commented at 1:53 pm on July 16, 2014: contributor
    Speaking of, this wants tests before going into bitcoind.
  111. davecgh commented at 5:36 pm on July 16, 2014: contributor

    I also like the NODE_GETUTXOS flag. I’d like to see @petertodd’s pull request that added a NODE_BLOOM flag revisited for the same reason.

    I plan to make a more thorough response on the mailing list, but in short, this is not a feature I’m interested in implementing in btcd because it’s too insecure as currently proposed for my tastes. We don’t need more services out there requesting data that can be easily faked only to notice months later they’re missing a bunch of coins.

    Even though the BIP purports to address the security issues, all it really says is “Yes, this is insecure. Don’t care.” In my opinion, the things necessary to make a request such as this secure should be done first, before trying to get this easily abused feature merged mainline.

  112. jgarzik commented at 5:52 pm on July 16, 2014: contributor

    I am forced to agree with @davecgh RE “yes, this is insecure, don’t care”

    I will not NAK this change, but I do give it a solid “-1”

    AFAICT, all other P2P commands fall into one of a few categories:

    • P2P connection maintenance
    • Sending and receiving data which we may verify back to a root of trust (genesis block, eventually).

    “getutxos” is different. “getutxos” just sends data openly acknowledged as impossible to verify securely [at this time].

    Even the humble “addr” message can be said to be data that is taken and verified (if by heuristics and general observation of remote node operation over time).

    IMO this is an unprecedented addition.

  113. petertodd commented at 6:34 pm on July 16, 2014: contributor
    @jgarzik Note that the patch has been changed to respond with a block height and block hash, which lets you in turn look up the transaction and merkle path to the block header. That does let you verify fully, which is why I changed my initial NACK to ACK. Equally in the case of things like CoinJoin, you don’t particularly care whether or not a given UTXO is or is not actually spent; all you care about is whether the peer you’re going to send a tx to claims it is.

    @davecgh NODE_BLOOM is just one of those issues that got broad support, but had a very vocal minority against it. If you like the idea, just give your users a config option to disable bloom filters; advertising NODE_BLOOM is just being nice.
  114. jgarzik commented at 6:38 pm on July 16, 2014: contributor
    @petertodd height only. No block hash AFAICS.
  115. in src/main.cpp: in c8aacf2ecf outdated
    3561+    //
    3562+    // Also the answer could change the moment after we give it. However some apps can tolerate
    3563+    // this, because they're only using the result as a hint or are willing to trust the results
    3564+    // based on something else. For example we may be a "trusted node" for the peer, or it may
    3565+    // be checking the results given by several nodes for consistency, and it of course may
    3566+    // run the UTXOs returned against scriptSigs of transactions obtained elsewhere.
    


    petertodd commented at 6:45 pm on July 16, 2014:
    Point out here that nHeight lets you obtain transactions elsewhere.
  116. petertodd commented at 6:51 pm on July 16, 2014: contributor
    @jgarzik Right, misremembered. Anyway, height is sufficient to look up the block hash and still recover the block.
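The verification path described in this exchange (height, then block hash, then transaction and merkle path) can be sketched as follows. The thread does not spell out the procedure, so this is only an illustration of the standard Bitcoin merkle construction (double SHA-256, last hash duplicated on odd-sized levels); the function names `merkle_root_from_branch` and `build_branch` are hypothetical and not taken from the patch. Note that such a check shows a transaction was included in a block; it says nothing about whether the output was spent afterwards.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Double SHA-256, as used for Bitcoin transaction and merkle hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root_from_branch(leaf: bytes, branch: list, index: int) -> bytes:
    """Recompute the merkle root from a leaf hash, its sibling hashes, and its position."""
    h = leaf
    for sibling in branch:
        if index & 1:
            h = dsha256(sibling + h)   # we are the right child
        else:
            h = dsha256(h + sibling)   # we are the left child
        index >>= 1
    return h


def build_branch(leaves: list, index: int):
    """Helper for the demo: return (root, branch) for the leaf at `index`."""
    branch = []
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])    # duplicate the last hash on odd levels
        branch.append(level[index ^ 1])
        level = [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index >>= 1
    return level[0], branch


# Demo with three made-up transaction hashes.
leaves = [dsha256(bytes([i])) for i in range(3)]
root, branch = build_branch(leaves, 2)
print(merkle_root_from_branch(leaves[2], branch, 2) == root)
```

A client would compare the recomputed root against the merkle root in a block header it already trusts at the reported height.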
  117. jrick commented at 7:20 pm on July 16, 2014: none
    If we’re going with the “there may be enough authentication with the block height/hash”, perhaps all mempool queries should be removed as well.
  118. mikehearn commented at 1:59 pm on July 23, 2014: contributor

    Tests are added here:

    https://github.com/bitcoinj/bitcoinj/commit/d737581c61a0512e96711430a6eef955a3689435

    It revealed a regression introduced by re-org changes that left the mempool UTXO view in a bogus state, so the pull tester changes there won’t pass until #4575 is merged.

  119. laanwj added the label P2P on Jul 31, 2014
  120. laanwj commented at 7:26 am on August 8, 2014: member
    Needs a reference to BIP 64 which was allocated for this in https://github.com/bitcoin/bips/pull/88
  121. Add a getutxos command to the p2p protocol. It allows querying of the UTXO set
    given a set of outpoints.
    da2ec100f3
  122. mikehearn commented at 12:04 pm on August 11, 2014: contributor
    Added reference to BIP 64 and discussion of how to use the included height.
  123. BitcoinPullTester commented at 12:19 pm on August 11, 2014: none
    Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/p4351_da2ec100f3681176f60dec6dc675fc64147ade3a/ for binaries and test log. This test script verifies pulls every time they are updated. It, however, dies sometimes and fails to test properly. If you are waiting on a test, please check timestamps to verify that the test.log is moving at http://jenkins.bluematt.me/pull-tester/current/ Contact BlueMatt on freenode if something looks broken.
  124. laanwj commented at 12:41 pm on August 15, 2014: member

    Seems this is ready now.

    I’d like a last round of ACKs here.

  125. gavinandresen commented at 5:20 pm on August 15, 2014: contributor
    Untested ACK
  126. maaku commented at 6:53 pm on August 15, 2014: contributor

    You could save some bytes by using VARINT for the nTxVer, nHeight, and chainActive.Height() serializations.

    Regardless, ACK.
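To illustrate the saving suggested here, the following sketch encodes integers in the variable-length format that, as an assumption, this comment refers to: Bitcoin Core's VARINT serialization (base-128, most significant group first, with a subtract-one step between groups). The function name `write_varint` and the example height of 315000 are illustrative only.

```python
def write_varint(n: int) -> bytes:
    """Encode a non-negative integer in the base-128 VARINT format (MSB-first)."""
    groups = []
    while True:
        # Every byte except the last one emitted on the wire carries the 0x80 flag.
        groups.append((n & 0x7F) | (0x80 if groups else 0x00))
        if n <= 0x7F:
            break
        n = (n >> 7) - 1
    return bytes(reversed(groups))


FIXED_INT_BYTES = 4  # size of a fixed-width 32-bit field

for label, value in (("nTxVer", 1), ("nHeight", 315000)):
    encoded = write_varint(value)
    print(label, encoded.hex(), "saves", FIXED_INT_BYTES - len(encoded), "bytes")
```

Under these assumptions a version of 1 takes one byte and a six-digit height takes three, so each returned entry shrinks by a few bytes.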

  127. btcdrak commented at 8:09 pm on August 24, 2014: contributor
    How far away is this from being merged? Untested ACK.
  128. laanwj merged this on Aug 25, 2014
  129. laanwj closed this on Aug 25, 2014

  130. laanwj referenced this in commit 11270ebde4 on Aug 25, 2014
  131. jgarzik commented at 12:40 pm on August 25, 2014: contributor

    WTF? getutxos can obviously be abused, turning bitcoind into a fileserver.

    I thought the IRC discussion was to hold this, until such an obvious flaw was fixed?

    NAK.

  132. mikehearn commented at 12:51 pm on August 25, 2014: contributor

    No, it can’t, at least not sanely. I explained this to Gregory when he emailed me about it but I’ll do so again here.

    To fetch a piece of data using this protocol you must upload a hash and an integer, so 36 bytes. You get back whatever the max amount of data is that can be put in an output. Using a standard CHECKMULTISIG output that’s more than 36 bytes but not much more, especially if you don’t want to destroy the money so need at least one key to be real.

    So to download a file this way you’d end up uploading a significant fraction of what you download. What’s more, to fetch data you’d need a file full of COutPoint structures, which would be a significant percentage of the size of the real file itself. At that point you may as well just distribute the real file, or a torrent file or …. basically anything would work better than this.

    But if for some reason someone desperately wants to use the block chain as a file server, they already have a better option. They can just mark their transactions with an OP_RETURN output and then distribute their file URL as a set of block heights, then download the blocks (either Bloom filtered for extra efficiency or just not filtered, seeing as anyone who does this obviously doesn’t care about efficiency).

    In short, trying to serve a file with this feature would result in a bizarre hack that was dramatically worse than any other way of storing files, including just using regular block downloads. It’s not something that’s worth worrying about.
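The bandwidth argument above can be put into numbers. The 36 bytes per outpoint (a 32-byte hash plus a 4-byte index) come from the comment; the output layout is an assumption made here for illustration: a 1-of-3 bare CHECKMULTISIG output with 33-byte compressed keys, one of which is real so the money stays spendable.

```python
OUTPOINT_BYTES = 32 + 4   # transaction hash + output index, as stated in the comment
PUBKEY_BYTES = 33         # assumption: compressed public keys


def multisig_data_bytes(total_keys: int, real_keys: int = 1) -> int:
    """Bytes of arbitrary data that fit in the key slots not holding a real key."""
    return (total_keys - real_keys) * PUBKEY_BYTES


def upload_fraction(data_bytes_per_output: int) -> float:
    """Bytes uploaded per byte of useful data downloaded."""
    return OUTPOINT_BYTES / data_bytes_per_output


data = multisig_data_bytes(total_keys=3)
print(data, round(upload_fraction(data), 2))
```

With these assumed sizes the requester uploads a bit more than half a byte for every useful byte received, which is the "significant fraction" the comment describes.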

  133. laanwj commented at 1:19 pm on August 25, 2014: member

    @jgarzik “I thought the IRC discussion was…” I know of no IRC discussion about that. I saw only ACKs here and a don’t care from you:

    I will not NAK this change, but I do give it a solid “-1”

    This issue has been open for months, and people changed their opinion at some point that this was either a good thing to have or not-too-bad.

    And now it’s suddenly NAK? Can you give details on the possible exploit?

  134. jgarzik commented at 1:40 pm on August 25, 2014: contributor

    (summarized from IRC) @mikehearn “No, it can’t, at least not sanely.” Translation: Yes, under certain conditions.

    I’m currently running some numbers on txouts in the chain. “36 bytes for 36 bytes” is a handwave. @laanwj Yes, status changed based on this new information.

  135. petertodd commented at 4:22 pm on August 25, 2014: contributor

    @jgarzik So the efficient way to use the UTXO set for storage with getutxos is to create a transaction with a large number of outputs, all storing data, and then increment the vout index to retrieve the data. That technique works even with p2pkh and p2sh outputs. With the 100KB limit on transactions you can get ~3000 P2SH outputs, or 60KB of data, and you can of course also chain transactions to store more than that if needed. It’s worth remembering that using the Bitcoin network for data storage has excellent anti-censorship and reliability properties that other methods simply can’t replicate. It’s not a hack by any means.

    A second use-case for getutxos is timestamping. Waiting for a confirmation is highly impractical, so a user-friendly timestamping tool can create an unspendable UTXO and immediately store the txid:vout. When the timestamp needs to be validated later you simply use getutxos to find the blockheight, optionally getting the actual transaction to properly verify it. Again, the improvement for the user is to avoid reliance on centralized TXO index servers by forcing everyone to be an index server for you.
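The storage figures quoted in this comment can be reproduced with a little arithmetic. A serialized P2SH output is 8 bytes of value, a 1-byte script length, and a 23-byte script (OP_HASH160, a 20-byte push, OP_EQUAL). The sketch treats the "100KB limit" as 100,000 bytes and ignores the transaction's inputs and header, so the output count is an upper bound.

```python
VALUE_BYTES = 8
SCRIPT_LEN_BYTES = 1
P2SH_SCRIPT_BYTES = 23           # OP_HASH160 <20-byte hash> OP_EQUAL
HASH_BYTES = 20                  # data carried per output, hidden in the hash slot
MAX_TX_BYTES = 100_000           # the "100KB limit" from the comment

p2sh_output_bytes = VALUE_BYTES + SCRIPT_LEN_BYTES + P2SH_SCRIPT_BYTES
max_outputs_upper_bound = MAX_TX_BYTES // p2sh_output_bytes


def data_bytes(n_outputs: int) -> int:
    """Total data stored when each output carries one 20-byte hash."""
    return n_outputs * HASH_BYTES


print(p2sh_output_bytes, max_outputs_upper_bound, data_bytes(3000))
```

The upper bound is slightly above the roughly 3000 outputs mentioned, and 3000 outputs at 20 bytes each give the 60KB figure.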

  136. mikehearn commented at 4:34 pm on August 25, 2014: contributor

    To download that 60kb of data you’d need to upload 102kb of data, which you have to get from somewhere. It makes no difference from an anti-censorship perspective.

    Now can we please focus on writing code to solve problems we have today?

  137. petertodd commented at 4:37 pm on August 25, 2014: contributor

    @mikehearn Like I said, the bandwidth for that kind of use-case is completely irrelevant. The advantage is the very strong resistance to censorship and high reliability.

    People use bittorrent all the time, even though to download 60KB of data you end up having to upload about 60KB of data…

  138. petertodd commented at 4:45 pm on August 25, 2014: contributor

    @mikehearn Oh, wait, I misread your reply:

    To download that 60kb of data you’d need to upload 102kb of data, which you have to get from somewhere. It makes no difference from an anti-censorship perspective.

    Again, by incrementing the vout deterministically and keeping the txid the same you only need a single txid. Basically I’d create a tx with data-encoding outputs 0…n, give out the txid as the “locator” for the data, and the retrieval routine uses getutxos to get txid:0, txid:1, …, txid:n, thus recovering the whole file. If I can’t fit that data in a single transaction, I just add a level of indirection/chaining to store it in multiple transactions.
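A minimal sketch of the locator idea described above: starting from one txid, the retrieval side derives every outpoint itself. The serialization assumed here is the usual wire form of an outpoint (32-byte hash in internal byte order, then a little-endian 32-bit index); the function names and the dummy txid are hypothetical.

```python
import struct


def serialize_outpoint(txid_hex: str, vout: int) -> bytes:
    """32-byte hash (byte-reversed from its hex display form) + 4-byte little-endian index."""
    return bytes.fromhex(txid_hex)[::-1] + struct.pack("<I", vout)


def outpoints_for_locator(txid_hex: str, n_outputs: int) -> list:
    """Expand a single txid 'locator' into outpoints txid:0 .. txid:n_outputs-1."""
    return [serialize_outpoint(txid_hex, i) for i in range(n_outputs)]


locator = "ab" * 32                       # dummy txid, for illustration only
outpoints = outpoints_for_locator(locator, 3000)
print(len(bytes.fromhex(locator)), sum(len(o) for o in outpoints))
```

The locator that has to be distributed is 32 bytes, while the requests sent to peers still carry 36 bytes per outpoint; both sides of the disagreement in this exchange are visible in those two numbers.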

  139. maaku commented at 5:12 pm on August 25, 2014: contributor

    How does getutxos make that any worse than the current state? You could just as easily use block hash and a bloom filter to identify the transaction.

  140. petertodd commented at 5:20 pm on August 25, 2014: contributor
    @maaku That may be correct now, but what we’re worried about is limiting our options in the future. If we implement it now and an ecosystem of apps depends on it removing the functionality in the future to deal with abuse will be much more difficult. We may be better off with those applications using functionality that can’t be abused; note how as I mentioned earlier Lighthouse - as an example - has multiple alternatives to using getutxos.
  141. mikehearn commented at 5:27 pm on August 25, 2014: contributor

    It doesn’t, but Peter hates Bloom filtering too remember?

    Given that BitTorrent and other systems exist, I doubt this will ever be a major problem. But if it does become one, UTXOs can be deleted any time the community wants. If you know they’re unspendable you can delete them immediately without any coordination with other peers. Given that the point is censorship resistance that’d usually be public broadcasted data anyway, so finding them should not be very hard. If they are spendable, you need a chain fork to do so, but if the weight of such abuse was seriously hurting Bitcoin as a financial system then getting consensus to do that would happen.

    Peter is undoubtably now trying to come up with an even more convoluted reason to object. Please, don’t. If you’re so concerned about abuse today go implement some anti-DoS code and make yourself useful for a change.

    Also please stop repeating this nonsense about what my app does or does not need. Do you really think I’d have bothered if there was no point to it? Saying some app or another doesn’t need this feature is correct in an entirely useless nitpicky way, for the same reason nothing “needs” Bloom filtering - it can always just download all the data and calculate the answer itself. But then performance would be unusably bad. Back here in the real world, where performance matters, such a feature makes the difference between something being practical or not.

  142. petertodd commented at 5:39 pm on August 25, 2014: contributor

    @mikehearn

    It doesn’t, but Peter hates Bloom filtering too remember?

    Please apologise for that inflammatory statement and keep personal attacks out of this discussion. As you know my opposition to bloom filters has nothing to do with this issue; prefix filters are just as easy, if not easier, to use for the purpose of data publication.

  143. btcdrak commented at 5:48 pm on August 25, 2014: contributor
    I was told this morning on IRC that this PR is the same as the RPC version of gettxout.
  144. jgarzik commented at 6:08 pm on August 25, 2014: contributor
    @btcdrak except, you know, it is available to the world vs. just the system owner… :)
  145. maaku commented at 6:08 pm on August 25, 2014: contributor

    Drak, there are some minor differences but yes it is basically a p2p form of gettxout. That doesn’t make it unnecessary however – the whole point is that you would use getutxos when you don’t have access to a full node from which to query gettxout.

    Peter, the fact is this pull request or a future one using Merkle tree commitments is not making the situation any worse than it already is, and provides significant benefit. We must be utilitarian about these decisions. The argument for this pull request has been laid out in great detail: it makes possible entirely new services which would be very beneficial to bitcoin. An argument against based on enabling a DoS vector which is less efficient than existing attack vectors we have no intention of closing is not very convincing…

  146. btcdrak commented at 6:12 pm on August 25, 2014: contributor
    Frankly, given the unauthenticated nature of p2p, I think this patch is dangerous. Reasons seem well covered by others above. I can’t see a valid reason for not running a full node when you need access to UTXO whatever your project is. Sorry, this PR is clearly ill thought.
  147. petertodd commented at 6:17 pm on August 25, 2014: contributor
    @maaku We do intend on closing off this DoS attack vector, via pruning. Exactly what form that takes is still uncertain, but there’s no reason to limit our options yet, particularly when applications like Lighthouse have good alternatives to using getutxos that can be used while we take the time to better understand the landscape. Equally many of the “entirely new services” made possible may not be things we actually want to encourage on Bitcoin. @btcdrak A good alternative for many of the use-cases where getutxos is reasonably secure - including Lighthouse - would be to have tx rejection messages include information on what inputs have already been spent. The worst that can happen there is your peer rejects the transaction; whether or not they should have is irrelevant as they can always block propagation.
  148. jgarzik commented at 6:21 pm on August 25, 2014: contributor
    The open idea on IRC was to return spent-ness.
  149. petertodd commented at 6:22 pm on August 25, 2014: contributor
    @jgarzik Right, which I think we should reject due to being completely unauthenticated - encourages dangerous practices.
  150. sipa commented at 6:59 pm on August 25, 2014: member

    The txout data and spentness are independent data sets in my opinion.

    As this PR implements now, it is possible to authenticate the txout data partially (assuming standard transactions, knowing the input spending the utxo validates the script part), but not possible to validate the spentness. A node can lie both ways about the spentness, though requiring the full txout to be returned makes it more expensive (needs a txindex).

    They also have different use cases. It seems to me that lighthouse really only needs spentness - the full txout data just makes it more expensive to lie. Other examples mentioned before (e.g. the fee estimation) only need the txout data, not the spentness.

    I think that lumping them together makes it harder to reason about.

  151. jgarzik commented at 4:30 pm on August 26, 2014: contributor
    IRC report: <dhill> so getutxos makes bitcoind send messages larger than the max 32MB
  152. dajohi commented at 4:46 pm on August 26, 2014: contributor

    ReadMessage: message payload is too large - header indicates 2892934254 bytes, but max message payload is 33554432 bytes.

    Just some thoughts:

    1. bitcoind should not attempt to send messages that exceed the max message payload (currently 32MB)
    2. bitcoind should ignore duplicate outpoints in a getutxos request.
    3. perhaps using MAX_INV_SZ (50000) is too high for a getutxos request limit. Perhaps it needs its own define which is much smaller.

    I produced this by using a small script and btcwire testnet tx bd1f9401a9c284a04353f925276af62f23f452d297eb2cc582d037064b2a795f, and getutxos requesting outpoint 1 … 50,000 times (the limit).
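The numbers in this report can be checked with a short calculation. The constants below are taken from the comment (50,000 outpoints per request, a 32MB payload cap, and the payload size from the error message); the 36 bytes per outpoint and the `dedupe` helper are additions for illustration, and the request size ignores the message header and count prefix.

```python
OUTPOINT_BYTES = 36                   # 32-byte hash + 4-byte index
MAX_INV_SZ = 50_000                   # request limit named in the comment
MAX_MESSAGE_PAYLOAD = 33_554_432      # 32MB cap from the error message
REPORTED_RESPONSE = 2_892_934_254     # payload size from the error message

request_bytes = MAX_INV_SZ * OUTPOINT_BYTES
amplification = REPORTED_RESPONSE / request_bytes
times_over_cap = REPORTED_RESPONSE / MAX_MESSAGE_PAYLOAD


def dedupe(outpoints):
    """Drop repeated outpoints while keeping first-seen order (suggestion 2 above)."""
    return list(dict.fromkeys(outpoints))


print(request_bytes, round(amplification), round(times_over_cap))
print(len(dedupe([("txid", 1)] * MAX_INV_SZ)))
```

A request of about 1.8MB produced a response more than a thousand times larger, and removing duplicates alone would have collapsed this particular request to a single lookup.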

  153. laanwj commented at 6:23 am on August 27, 2014: member
    I would have hoped this kind of testing was done in all the time that this was still a pull request. But it’s clear that there are still too many gotchas, going to revert.
  154. mikehearn commented at 9:09 am on August 27, 2014: contributor

    MAX_INV_SZ is used for getdata as well, which returns entire transactions, and that doesn’t seem to remove duplicate requests either. There’s a manual check against the size of the send buffer on the getdata code path which would be fairly easy to duplicate here, but this sort of network code is easy to duplicate.

    At any rate, it should be an easy fix. @laanwj Why don’t you wait for me to fix it instead? It’s not a big surprise that things in master get more testing than things that are not.

  155. sipa commented at 9:15 am on August 27, 2014: member
    getdata returns the results as individual tx/block messages, which are throttled based on space in network send buffers, so the same problem does not exist there.
  156. btcdrak commented at 9:18 am on August 27, 2014: contributor
    @mikehearn @laanwj Better to revert this. Clearly this PR needs a lot of work and testing first.
  157. mikehearn commented at 9:22 am on August 27, 2014: contributor

    Yes, that’s what I said.

    As CCoin is a fixed size structure and the size of the bitmap is equal to the size of the inputs, just keeping track of the space required to send and stopping if there’d be insufficient space is good enough. We could remove duplicates too but it doesn’t seem necessary if there’s a simple counter.

    With respect to reverting things - no, reverting any change that someone finds a bug in is not a sane strategy. If we were about to do a major release it’d be different, but then we shouldn’t be merging features at all. Applied consistently this strategy would have resulted in every major change to Bitcoin being rejected. Recent example: using this patch I discovered a serious regression in re-org handling that was the result of (I think) headers first. Rolling back all of headers first would have been a mistake for all kinds of reasons, not least of which is that I wasn’t intending to find that bug and wouldn’t have done so if the buggy code had still been sitting on a branch.

  158. btcdrak commented at 9:29 am on August 27, 2014: contributor

    @mikehearn For a new feature like this, recently merged (just yesterday!), a revert is absolutely the right course of action. It means if the feature is fixed and merged in again there is one clean merge, one unit. It makes the history much more understandable. From what I can see, we also need more discussion about the rationale of this PR. Clearly there has been a lot of ‘hand waving’ and not enough research, which when someone actually did, uncovered some pretty nasty issues. Throwing caution to the wind is not right for Bitcoin Core. I still don’t see why this isn’t done over RPC personally.

    Looking at history, other PRs have been reverted pending further investigation, I don’t see why this PR needs special treatment.

  159. sipa commented at 9:37 am on August 27, 2014: member

    Keeping track of the output size constructed would indeed remove the problem. But then what to do to inform the peer? Split the result in two? Truncate it? DoS ban the peer? At least some thought is needed.

    I would prefer just to only return spentness. No matter what, that data is not authenticated, and can’t be by the setup envisioned here. Adding the full txout data just complicates things (and yes, makes it a bit more expensive to lie, but your system doesn’t require peers to be honest, right?). Most concerns about potential incentive shifts and DoS potential are gone that way.

    Of course, I would still prefer us not to provide unauthenticated access to UTXO information at all. If there is no way to prevent peers lying, either you don’t care about the truth, or you’re better off using a (set of) central servers, either trusted or with reputation to lose.

  160. mikehearn commented at 9:55 am on August 27, 2014: contributor

    It can just be truncated. The result bitmap is supposed to be the same length as the input query. If it’s not then you know the result has been truncated. I don’t think any normal client would ever hit this case anyway. Not complicated.

    Can we please stop going over the rationale again and again and again? @btcdrak the RPC stack doesn’t even try to handle resource usage attacks so your suggestion would make things worse rather than better. @sipa The patch is implemented in this way for a reason. My app runs the scripts to try and ensure you can sign for the output you are pledging. Because a pledge is not a valid transaction there is no other way to test this: you cannot broadcast it and see what happens (and proxying transactions would just allow clients to get you banned anyway in the current architecture). The assumption is that your network peers (randomly chosen) are not the same as the people sending you pledges. This is a very realistic and reasonable assumption.

    Both of the above things have been explained multiple times over the past few months including in the description of the patch. I really don’t know what I can do here to make things clearer. Is there a problem with my writing style or something? I get the overwhelming impression this entire community of people comes to strong opinions on my work yet does not read a word I’ve written, and it’s unbelievably frustrating.

    I will add some more DoS controls to this feature, although we should all remember that Bitcoin Core can be DoSd in lots of different ways - this is hardly changing the status quo.
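The truncation approach described in this comment (count the bytes as the response is built, stop when the budget is used up, and let the client notice a short bitmap) might look roughly like the sketch below. This is not the code from the follow-up fix; `answer_query`, `was_truncated`, and the dictionary standing in for the UTXO set are hypothetical. Hits are kept as a list of booleans and the packing of the bitmap into bytes is left out.

```python
def answer_query(outpoints, utxos, max_bytes):
    """Answer outpoints in order, stopping before the byte budget would be exceeded."""
    hits = []       # one entry per outpoint actually answered
    outputs = []    # serialized outputs for the entries that were found
    used = 0
    for outpoint in outpoints:
        txout = utxos.get(outpoint)
        cost = len(txout) if txout is not None else 0
        if used + cost > max_bytes:
            break   # truncate: the remaining outpoints get no entry at all
        used += cost
        hits.append(txout is not None)
        if txout is not None:
            outputs.append(txout)
    return hits, outputs


def was_truncated(query, hits):
    """Client-side check: fewer answers than questions means the reply was cut short."""
    return len(hits) < len(query)


utxos = {("aa", 1): b"x" * 40}            # stand-in UTXO set with one 40-byte output
query = [("aa", 1)] * 5                   # the same outpoint requested five times
hits, outputs = answer_query(query, utxos, max_bytes=100)
print(hits, was_truncated(query, hits))
```

With a 100-byte budget only two of the five duplicate requests are served, and the requester can tell from the shorter answer list that it needs to ask again for the rest.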

  161. petertodd commented at 11:59 am on August 27, 2014: contributor

    @mikehearn re: “not reading a word I’ve written”, you’re doing the exact same thing: @btcdrak made clear above that he believed you should be running a full node: “I can’t see a valid reason for not running a full node when you need access to UTXO whatever your project is.”; his suggestion of using RPC is for local nodes where resource usage attacks are irrelevant.

    In any case, stop trying to deflect technical criticisms with responses based on personal disputes and personal attacks.

    Additionally remember that when there exists a technical concern about a proposal of local and global DoS attacks and economics/resource usage issues, submitting a patch with a fairly obvious DoS attack in it is a strong indication that the submitter hasn’t thought through the consequences. We don’t have the time to consider in detail every single pull-req, and for that matter, fix all the bugs in them, so it’s only reasonable that finding such an issue be considered a strong indication that the patch should be rejected/reverted for now.

    I will add some more DoS controls to this feature, although we should all remember that Bitcoin Core can be DoSd in lots of different ways - this is hardly changing the status quo.

    It’s adding to a yet unsolved problem. Don’t be surprised if people are reluctant to dig the hole deeper when we don’t know if we’re ever going to find a ladder out.

  162. mikehearn commented at 3:42 pm on August 27, 2014: contributor

    There’s a fix here:

    #4770

  163. genjix commented at 4:28 pm on August 27, 2014: none
    wow what a stupid change, all the more reason why we can’t have a single group making unilateral decisions on one codebase. I had never heard of this. I think you guys need to stop trying to throw all this crap into the Bitcoin protocol, and focus on keeping it small + focused. Bloated software overextends itself introducing security flaws through new attack surfaces.
  164. mikehearn commented at 4:37 pm on August 27, 2014: contributor

    I actually can’t make my bitcoind crash even when firing many such bogus queries in parallel, even without the fix. Memory usage does go up, and if you don’t have much RAM that could be a problem, but it drops down again immediately after.

    With the extra ten lines to track bytes used the code is clearly better, but it’s way overkill to go running to reddit and claim it’s an “easy way to crash bitcoind”. Heck our networking code doesn’t even have a limit to aim for. Bitcoind could run out of RAM on some systems just by handling a lot of clients, or if there was a lot of transactions in the mempool.

  165. petertodd commented at 4:46 pm on August 27, 2014: contributor
    Lots of nodes out there without all that much RAM. Other DoS attacks are prevented by existing resource limits, e.g. tx fees, coin-age, etc. and “handling lots of clients” is something we already have limits on via the connection limits. I should know - I’ve spent a lot of time looking for and fixing cheap DoS attacks. (e.g. the sigops one I fixed) getutxos is unique in how easy it is to use to crash systems at no cost. @mikehearn You’d be smart to just own up and say “oops, I screwed that up” rather than trying to make excuses. Heck, I personally screwed up a bit by ACKing the patch without noticing that flaw.
  166. laanwj commented at 4:52 pm on August 27, 2014: member

    I had never heard of this

    It is not as if this was kept secret. To be fair, the BIP was posted to the bitcoin development mailing list, https://sourceforge.net/p/bitcoin/mailman/message/32590257/ The BIP was reviewed, finalized and merged at some point, https://github.com/bitcoin/bips/pull/88 This pull has been open for months as well. You could have heard of this, and given your opinion in all of those instances…

  167. mikehearn commented at 5:09 pm on August 27, 2014: contributor

    Peter, I wrote a patch right? I’m grateful that Dave did this testing and found this problem. I am less grateful that he then ran to reddit and started complaining about how badly unit tested Bitcoin Core is (blame Satoshi for that one, if he must).

    I think it’s an open question about how such things can be caught in future. Any future change could result in large temporary memory usage without anyone noticing. The lack of any definition for “large” makes this harder - as I said, I can’t actually make my testnet node crash even when repeating the conditions that were given. So how to find this sort of thing systematically is tricky. Ideally Bitcoin Core would print a warning if its memory usage went over a certain amount, but we don’t have that.

    One solution that will definitely NOT work is blaming people for being imperfect programmers. Core has shipped fatal bugs before and will do so again. The right solution is usually to ask “how can we stop such mistakes systematically”?

  168. genjix commented at 5:10 pm on August 27, 2014: none
    There’s a ton of silly discussion on that mailing list which consists of “who has the most stamina to invest in arguments” which I don’t have time for. Therefore I cannot sift through all that looking for the gems of important discussion to register my single objection. @mikehearn you can avoid mistakes by taking features out, not trying to stuff more features in (which we don’t need). This is a case of developers going mad for features they want, which should be happening on another layer of Bitcoin, not the core protocol which should stay pure and focused. I see very little impetus for real implementation work happening apart from odd bugfixes or cosmetic changes, and a lot of “THE NEXT BIG THING” spurred by corporations who see Bitcoin as the new payments innovation instead of wanting to protect the integrity, security and values of Bitcoin. I’m a protocol conservative.
  169. btcdrak commented at 6:32 pm on August 27, 2014: contributor
    Reverted in 70352e11c0194fe4e71efea06220544749f4cd64
  170. dgenr8 commented at 7:18 pm on September 1, 2014: contributor

    Nothing about this change is harmful enough to violate the process. A BIP was even merged, for crying out loud.

    It would be great if core supported optional and p2p-queryable indexes for everything. An optional way to authenticate data served would also be great. Lack of these extra features should not doom this change. They could be in a layer maintained by the core project. There is absolutely no reason to punt stuff like this to third parties if open-source developers want to create it.

  171. btcdrak commented at 7:22 pm on September 1, 2014: contributor
    @dgenr8 A BIP getting merged doesn’t make it a standard, it just starts it in the ‘draft’ workflow status: https://github.com/bitcoin/bips/blob/master/bip-0001/process.png
  172. gmaxwell commented at 7:49 pm on September 1, 2014: contributor
    There are several bad bips which (IMO) no one should ever use, the BIP process doesn’t tell you if something is good or not… it just specifies it.
  173. jonasschnelli referenced this in commit b755435cfb on Dec 3, 2014
  174. jonasschnelli referenced this in commit 9d330d9746 on Dec 3, 2014
  175. jonasschnelli referenced this in commit 00e22e0d4a on Dec 3, 2014
  176. jonasschnelli referenced this in commit fdae2cd7d0 on Dec 3, 2014
  177. jonasschnelli referenced this in commit bc44d8271e on Dec 3, 2014
  178. jonasschnelli referenced this in commit 4110e00dfc on Dec 3, 2014
  179. jonasschnelli referenced this in commit 0fdcb23fe8 on Dec 3, 2014
  180. jonasschnelli referenced this in commit c83935887e on Dec 3, 2014
  181. jonasschnelli referenced this in commit eb523648f1 on Dec 3, 2014
  182. jonasschnelli referenced this in commit 9a9d38c4d9 on Mar 4, 2015
  183. jonasschnelli referenced this in commit 8b75c3a77f on Mar 4, 2015
  184. jonasschnelli referenced this in commit 97ee866549 on Apr 21, 2015
  185. rebroad referenced this in commit 8d47d40a51 on Sep 1, 2016
  186. rebroad referenced this in commit 32b173285b on Sep 1, 2016
  187. rebroad referenced this in commit 671f1ad664 on Sep 1, 2016
  188. bmstaten commented at 11:48 am on November 17, 2018: none
    How do you know which Node to use B.S.
  189. DrahtBot locked this on Sep 8, 2021
