Breaking User Space #14363

issue nopara73 opened this issue on October 2, 2018
  1. nopara73 commented at 12:34 AM on October 2, 2018: none

    At the time of writing approximately 86.51% of the Bitcoin network is vulnerable to CVE-20181-7144. It has been two weeks since 0.16.3 was tagged with the vulnerability fix.

    I believe the occasional breaking of user space is one of the contributing factor to this situation where things can be improved. I believe marking an RPC method deprecated and removing it within a few new releases may not be sufficient, since most developers do not code directly against Bitcoin Core, rather they take an RPC client implementation, written by someone else in their respective programming language. These projects tend to be unmaintained, incomplete, and sometimes even buggy.
    These developers, in such an environment may consider upgrading their Core nodes to be too risky.

    I know it is hard living together with legacy code and immature API design decisions, but a "never break the user space" policy may result in a more secure and stable environment.

  2. RHavar commented at 4:51 PM on October 2, 2018: contributor

    This is rather reaching. An astonishingly few amount of nodes are used for rpc purposes, and even fewer of those are going to be in a position where they can't easily upgrade. Their contribution to the amount of vulnerable nodes is negligible at best.

    A much directer way of supporting the goal of having users patch their nodes quickly would be to have had the patch back-ported for previous releases much faster.

    Although I do think core is a bit aggressive in how they deprecate stuff (and require using a deprecated flag) which makes upgrading a bit more painful than it should be. And I have been bitten really badly by some instability before (namely changes in the edge-cases of ListSinceBlock which broke my application about 3 times) to the point I had to stop even using it. However, I would argue a lot of that pain is caused by poor documentation (some stuff literally has no documentation of what it should do so it's anyones guess what is correct behavior) and a lack of tests (to catch changes/regressions) and ultimately a lack of developers.

    I think increasing the maintenance burden by imposing a "never break user space" will just strictly make things worse.

  3. sipa commented at 3:14 AM on October 3, 2018: member

    I don't think you can ascribe the reason for users not upgrading to breaking RPC changes. This is the reason why backports exists, and we maintain old versions for critical vulnerabilities.

    I think it is far more likely that a large portion of non-upgraded nodes are in fact not being used for anything, and their operators don't care about what chain they accept.

  4. MarcoFalke added the label Brainstorming on Oct 3, 2018
  5. MarcoFalke commented at 3:58 AM on October 3, 2018: member

    Link to related material: https://bitcoincore.org/en/lifecycle/

  6. jnewbery commented at 6:54 AM on October 3, 2018: member

    I agree with @sipa. @nopara73 - you make an argument for backporting critical fixes to previous versions, which in fact we already do.

    Although I do think core is a bit aggressive in how they deprecate stuff (and require using a deprecated flag) which makes upgrading a bit more painful than it should be. @RHavar - I'm a bit surprised by this sentence. I introduced the deprecatedrpc flag in #11031. The idea is that downstream users have at least one release where both the old API and new API are valid. To use the old API, the user must take action (enable the flag). That means they can't not be aware that the old API is being deprecated in the next version. The alternative would be just to have a written warning, which could easily be missed by users.

    Is the deprecatedrpc burdensome for users? If so, do you have a suggestion for a better way to deprecate old APIs?

  7. RHavar commented at 7:20 AM on October 3, 2018: contributor

    @jnewbery I understand the theory, it's just that in practice it's kind of painful

    Taking a recent example, I just published a little program which uses signrawtransaction. But that's been deprecated and effectively renamed to signrawtransactionwithwallet in 0.17.

    Ok, cool, I'm aware of that. So what should I do? Use the signrawtransactionwithwallet but now it has a hard dependency on 0.17. Or should I continue to use signrawtransaction but require running with -deprecatedrpc=signrawtransaction in 0.17 ? This is a problem that all the bindings need to deal with too now, and there's no good solution (unless you expect them to do "feature detection" and dynamically switch between the two, but at that point it raises that question of was is really worth creating all that complexity for applications, just so you're sure they're aware of your deprecation schedule?) .

    The fact is this sort of deprecation stops allowing smooth upgrades, and honestly -- I don't think you can justify it the aggressiveness of it. It's my experience that a WARN_ONCE style deprecation system is vastly superior and smoother. Yes, it's easier to ignore -- but those people will end up hurting themselves -- for everyone else, everything just goes smoother.

    And probably more importantly: if the system uses deprecation warnings I can just do the upgrade, and then migrate away from the deprecated methods at my leisure. While with the deprecated-method-hidden-behind-a-flag system, I need to upfront know exactly which rpc methods my application is using (which might not be obvious or easy to find, especially so if I'm using libraries/code I didn't write myself) and most likely pressured into doing a more complex simultaneous release of my code + bitcoin-core as no one likes turning on deprecated flags :P

    I also think deprecation schedule can be slower unless there's some tangible benefit for it (e.g. cleaning up the code to make it easier to work with). An aggressive schedule for renaming signrawtransaction to signrawtransactionwithwallet seems a bit overzealous.

    (And I do think this stuff benefits by a more case-by-case analysis, some stuff can't go fast enough for sure)

  8. laanwj commented at 9:36 AM on October 3, 2018: member

    There's been a long time it's been overly cautious with RPC API changes, in some cases for years—like with getinfo.

    Maybe it's going too fast now, but I expect this to be a moment of temporary breakage, as there is a lot going on—such as the change to multi-wallet, which necessitates some RPC changes to keep sanity.

    It will slow down again.

    I think increasing the maintenance burden by imposing a "never break user space" will just strictly make things worse.

    Yes, this is just not realistic with the current state of things. Everyone involved is over-worked already, I somehow don't see that changing. These kind of compromises have to be made.

    But also: this project is open source; if you really want longer backwards compatibility, you could get involved, build backwards compatibility wrappers, make test cases for them, maintain them in the long run.

  9. laanwj commented at 9:38 AM on October 3, 2018: member

    The OP is nonsense, by the way. We didn't spend time rolling backport releases for two different historical branches (0.14.3, 0.15.2) which are all the affected major releases just to get bullshit complaints like this, sorry. These versions have no breaking RPC changes.

  10. kristapsk commented at 9:45 AM on October 3, 2018: contributor

    Taking a recent example, I just published a little program which uses signrawtransaction. But that's been deprecated and effectively renamed to signrawtransactionwithwallet in 0.17.

    Ok, cool, I'm aware of that. So what should I do? Use the signrawtransactionwithwallet but now it has a hard dependency on 0.17. Or should I continue to use signrawtransaction but require running with -deprecatedrpc=signrawtransaction in 0.17 ?

    I solve these by trying the new RPC call and, if it fails, fallbacking to old one.

  11. laanwj commented at 9:58 AM on October 3, 2018: member

    I solve these by trying the new RPC call and, if it fails, fallbacking to old one.

    Though doing that every time might result in twice the number of RPC calls, which can be a serious overhead for frequent calls or when the arguments are large. Other ways would be:

    • call both the methods without arguments—then look at the error code. You get the help response (-1) when it is available, or else a method deprecated (-32), or a missing method (-32601) error, store this result in your wrapper or application logic and call the method that is available (I thought about calling help instead but I don't think this will give the deprecation information, as the deprecation check if after the help check!)
    • get getnetworkinfo().version, store it, and compare it against a threshold value to see what method to use from then on
  12. promag commented at 1:13 PM on October 3, 2018: member

    Is the deprecatedrpc burdensome for users? If so, do you have a suggestion for a better way to deprecate old APIs?

    IIUC the suggestion is to wait more releases before removing deprecated code?

    IMO we are doing the right thing. We don't force updates, we don't force breaking changes. If you want to update you better take the proper measures to keep the system working. Maybe keep 2 nodes (current + new), check client libraries, create a middleware to handle API changes..

  13. RHavar commented at 4:16 PM on October 3, 2018: contributor

    There's been a long time it's been overly cautious with RPC API changes, in some cases for years—like with getinfo.

    I guess it's not often someone one says "good job", but that's really a great example of where (from an external perspective) it was done really well, and allowed for a really smooth transition.

    Other ways would be:

    Of course it's all possible, just almost no one is going to put in the time and energy to do so. It's pretty easy to see by looking at bitcoin bindings, and seeing they pretty much all break now with the upgrade to 0.17.

    I don't like to complain too loudly about this stuff, as I know it's all volunteer driven and I'm certainly not volunteering. But a less painful deprecation schedule for would make things go smoother, especially when there's no maintenance burden on core for leaving it in for a while. (e.g. something like signrawtransaction is probably not very burdensome to be left in, so I think should be deprecated a lot slower than something that's actually causing pain).

    --

    A bit of a shower-thought, but instead hiding stuff behind deprecated-flag -- it'd could be nicer to have something like a deprecated use log. The first time you use a deprecated features, it appends to the log.

    Then on startup the software warns about the deprecated shit you've been using, and if any of them have since been removed it gives an error (until the person clears the deprecated log).

  14. kristapsk commented at 4:38 PM on October 3, 2018: contributor

    get getnetworkinfo().version, store it, and compare it against a threshold value to see what method to use from then on

    That was my initial plan, to do so. But I got confused that there are getnetworkinfo.version and getwalletinfo.version, also could not find so fast which RPC APIs with which parameters are supported with which version, so trying one call after another was a simpler way for me (faster to develop). Probably some document somewhere with list of all calls, with possible parameter changes and from / to version numbers would be a good idea.

  15. promag commented at 4:46 PM on October 3, 2018: member

    Sorry but if you want to use the latest version as soon as possible then act accordingly - read release notes, keep track of release candidates, test client side, update 3rd party software...

  16. nopara73 commented at 6:33 PM on October 3, 2018: none

    While I brought up RPC as an example, it was not my intention to limit the scope of this conversation to it.

  17. promag commented at 6:46 PM on October 3, 2018: member

    @nopara73 what you mean beside RPC? Configurations?

  18. nopara73 commented at 7:21 PM on October 3, 2018: none

    @promag Yes configuration, I was debugging for hours today what the heck went wrong when updated my regtest core node factory for 0.17. I know I am an incompetent idiot for not reading the release notes properly, and only trying to fix stuff those broke introduced by the RPC change.

  19. jb55 commented at 9:21 PM on October 3, 2018: member

    As a general idea I think it's a useful one.

    An API surface that could be reliably depended on until the heat death of the universe and that never breaks is useful and shouldn't be dismissed.

    It then comes down to what API boundary should be considered kernel space and what should be considered user space? As long as that boundary is made explicit, then issues like the ones described above could be reduced.

  20. sipa commented at 1:24 AM on October 4, 2018: member

    I don't think anyone is dismissing that breaking changes are not an issue - but they are sometimes necessary (we essentially didn't do any for years, which locked us into some very bad design choices forever), and I believe we're doing what we can within reason to limit their impact:

    • RPC changes/configuration changes are rare events, only in major releases, and generally not things that affect non-expert usage.
    • Incompatibilities are always documented in the release notes.
    • You can use the deprecatedrpc functionality to retain old behavior while still upgrading (making you acknowledge that the behavior will be removed in a later release).
    • For critical issues, backports to old supported releases are created.

    There are individual concerns here that can be addressed - for example the difficulties around deprecatedrpc that @RHavar brings up, and perhaps an alternative can be suggested.

    But can you please not use CVE-2018-17144 to start a rant about your personal preferences in dealing with compatibilities? No, I don't think the "slow" adoption of a new release has anything to do with breaking changes (I doubt any significant percentage of nodes are used via RPC at all, and for those that do, we have backports).

  21. jnewbery commented at 1:26 AM on October 4, 2018: member

    This is a problem that all the bindings need to deal with too now, and there's no good solution (unless you expect them to do "feature detection" and dynamically switch between the two, but at that point it raises that question of was is really worth creating all that complexity for applications, just so you're sure they're aware of your deprecation schedule?) .

    Yes, this is a breaking API change. The client does need to know about it. The deprecatedrpc just means that you get an extra release as warning. Whether it happens in 0.17 or 0.18, your client will need to change behaviour based on server version.

    I also think deprecation schedule can be slower unless there's some tangible benefit for it (e.g. cleaning up the code to make it easier to work with). An aggressive schedule for renaming signrawtransaction to signrawtransactionwithwallet seems a bit overzealous.

    The API isn't changed simply because developers like breaking the API for users. There was very tangible benefit to breaking signrawtransaction into wallet and non-wallet methods. See #10570 for rationale. It was also part of a larger effort to remove the circular dependencies between server and wallet (https://github.com/bitcoin/bitcoin/issues/7965). This was long-term technical debt inherited from V0.1 when wallet and node code were completely mixed up main.cpp.

    Please believe me that developers take API stability very seriously (eg count the uses of IsDeprecatedRPCEnabled() in https://github.com/bitcoin/bitcoin/blob/0.17/src/wallet/rpcwallet.cpp for logic added to support the accounts API while users transition to the labels API, or the tests in https://github.com/bitcoin/bitcoin/blob/0.17/test/functional/rpc_deprecated.py and https://github.com/bitcoin/bitcoin/blob/0.17/test/functional/wallet_labels.py). However, that consideration needs to be weighed against the benefits to all users when we pay down technical debt and make the product more secure and maintainable.

    There's been a long time it's been overly cautious with RPC API changes, in some cases for years—like with getinfo.

    I guess it's not often someone one says "good job", but that's really a great example of where (from an external perspective) it was done really well, and allowed for a really smooth transition.

    I disagree. This was not a good job. Supporting legacy APIs for many years that prevent us from doing useful work like separating wallet from node hurts the security of all users. And even if we take years to remove a legacy API we still get complaints (eg #11671).

    Maybe it's going too fast now, but I expect this to be a moment of temporary breakage, as there is a lot going on—such as the change to multi-wallet, which necessitates some RPC changes to keep sanity.

    Yes - I've been aggressive in removing legacy code and paying down technical debt over the last couple of releases. Much of this technical debt goes back to v0.1. In the short term it may be painful for a small subset of users, but in the long run everyone benefits.

  22. nopara73 commented at 8:42 AM on October 4, 2018: none

    But can you please not use CVE-2018-17144 to start a rant about your personal preferences in dealing with compatibilities? No, I don't think the "slow" adoption of a new release has anything to do with breaking changes (I doubt any significant percentage of nodes are used via RPC at all, and for those that do, we have backports).

    If you have an explanation on why node operators aren't updating, then can you share it? If you don't, then that means there are a number of different reasons why it isn't happening, and it is reasonable to assume that "one of the contributing factors to it" is that over the past few years you taught your users that updating is not a simple download and go job, thus when in doubt and short on time, inaction is the action they take, until they "get to it."

  23. MarcoFalke commented at 8:50 AM on October 4, 2018: member

    See #14363 (comment)

    There shouldn't be any breaking rpc changes in minor releases.

  24. nopara73 commented at 8:58 AM on October 4, 2018: none

    There shouldn't be any breaking rpc changes in minor releases.

    That's beside the point. For example a store node operator, running BTCPay won't keep up with all the complexity of Bitcoin Core. Their logic is simply: "Last time I updated Core, our server was down for 12 hours, until I figured out how to downgrade it. This update should be safe, but I wouldn't take all the shit again from the owner."

  25. sipa commented at 11:41 AM on October 4, 2018: member

    If you have an explanation on why node operators aren't updating, then can you share it?

    Most people who operate a node (sadly) just don't care enough to keep up to date. This is true for all software without automatic update system (and we very intentionally don't have that).

    Also, it doesn't necessarily matter. We can't actually observe what percentage of economically relevant nodes has updated. I believe (but this is just a guess, and I have no strong evidence) that among the number of nodes that are actually used (through RPC or otherwise indirectly affecting validation of transactions that are economically relevant to the owner), the percentage of upgraded nodes is much higher.

    Please. There is a time and place to discuss backward compatibility, but can you not turn into a drama? The rate of upgrading is completely expected. In fact, I think it's much better now than it was years ago, when breaking changes were far more rare.

  26. nopara73 commented at 11:58 AM on October 4, 2018: none

    Please. There is a time and place to discuss backward compatibility, but can you not turn into a drama?

    ACK. Maybe it would be smarter to bring this up sometimes later in another context.

    Also I have a feeling you were replying to my rantish comment, I've edited right away after posting, because I realized I went overboard with it. (Edits are often missed by those who are interacting with GitHub over email.)
    image

  27. nopara73 closed this on Oct 4, 2018

  28. promag commented at 12:00 PM on October 4, 2018: member

    For example a store node operator, running BTCPay won't keep up with all the complexity of Bitcoin Core.

    Why not? I'd say they should.

    our server was down for 12 hours, until I figured out how to downgrade it

    No serious user/service should blindly upgrade even if there are no breaking changes. Upgrades should be tested, data should be backup, plan B should be in place, etc..

  29. DrahtBot locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-13 15:15 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me