Code style PRs after v0.18 branch split

jnewbery commented at 4:43 pm on February 22, 2019: member

We’ve adopted a bunch of code style guidelines over the last couple of years.

According to sipa’s comment:

In my view, writing a particular guideline in the document implies that we’re as a project committing to (eventually) adopting it. Choosing a guideline that mismatches the existing code is giving ourself extra work in fixing it (and while that is ongoing, live with the inconsistencies in the codebase).

The question is whether the suggested style is enough of an improvement over the existing guidelines that it warrants all that work.

We’d eventually like to converge on the new style, but since our developer guidelines state “Do not submit patches solely to modify the style of existing code.”, it’ll be a very long time (or more likely, never) until all code is migrated to the new style.

I’d like to suggest that we take a more proactive approach to change (parts of) the codebase to the new style. This is beneficial because those parts of the codebase can act as models for new contributors, and we could perhaps enable linters on them to enforce style and avoid regressing. We could start with one or two files or modules to migrate.

The obvious time to do this is immediately after a major release: the focus running up to a release branch is on reviewing, so the PR queue is drained, and many of the PRs that are still open will already require rebase. For example, if we decide to do this for the wallet after the 0.18 branch split, there are currently only 18 PRs open that are tagged “Wallet”, not tagged “Needs rebase” and succeeding checks.

Put another way: every line that needs to be changed to converge will be changed eventually, causing rebases for other PRs. Concentrating those changes to times when there are fewest PRs open minimized total rebase pain (although increases it in the short run). There are also economies of scale to just doing a single style rebase over trickling out the changes so there are many small style rebases.

I can see arguments both ways, and would be interested in hearing others’ thoughts.

promag commented at 4:52 pm on February 22, 2019: member

My 2 cents, I tend to support this if:

there is support from maintainers to push changes “fast”
style it’s well documented
ideally backed by linters or alike

I don’t like deep git blames but I can live with those.

But there’s also red code so..

MarcoFalke commented at 4:58 pm on February 22, 2019: member

I wouldn’t object doing it for specific modules (as you suggest), for example GUI, rpc, wallet or tests. Though, when it comes to net_processing, validation or consensus code, I’d rather not touch it at all, since style fixes could change behaviour (even if done through a formatter that is supposed to only change whitespace). Reviewing such a patch to make sure it doesn’t change the logic might not be worth the effort.

About adding a linter: Different versions of formatters generally produce differently formatted output, so I am not sure if we can force every developer to install the same tool in the same version.

MarcoFalke added the label Brainstorming on Feb 22, 2019

MarcoFalke added the label Refactoring on Feb 22, 2019

MarcoFalke added the label Docs on Feb 22, 2019

MarcoFalke commented at 5:04 pm on February 22, 2019: member

Related comment copy-paste from #6839 (comment)

A couple thoughts:

When changing existing code we must not just trust clang-format repeatability to verify that no bugs were introduced, we should test via checks for identical object files. This will take a little effort because some macros insert line numbers, and these will need to be bypassed.

To use clang format normativeily we must end up specifying a particular version that all contributors must use, as clang format changes its behavior. This is irritating, and disruptive to contributing. It could be made much better with good automation like cfield’s depbuilder work, to help people get a particular version of clang-format; though due to the flowing API breakage in C++ code, I don’t know how long an older version will be maintainable.

Another concern is which style to specify. The most common behavior in Bitcoin Core today has stylistic elements which have been objectively shown to increase software defect rate and confuse review– e.g. unbraced multi-line IFs. We should not make any disruptive reformating changes unless we are quite confident that the style is what we want to live with.

We could avoid disrupting the workflow of existing patches by only reformatting (and then requiring it) on new files and substantial rewrites. I think that would be a better path.

To the extent that there is hope to address bikeshedding via this, I beg anyone thinking that to give up that thought. Bikeshedding is not a result of any particular coding style or consistency, bikeshedding arises from a lack of perspective which can only be addressed by experience, community norms, and having more important things to worry about.

Stylistic consistency has actual benefits; it aids newcomers in their contributions because it is easier for them to make sure their work is okay on styleistic grounds; though this may be offset by having to install some particular version of clang-format before they can get started. It eases review because the uniformity creates better expectations; but reformating makes looking at the history harder, which harms review. Good style choices (as opposed to merely consistent) have, at times, been shown to lower defect rates in software– but there is not a universal opinion on what choices are good. So there are benefits to be had, but they must be carefully managed with other risks and costs. And the magnitude of the potential benefits, I think, are enough to say that this subject should have a very low priority in general.

sdaftuar commented at 7:18 pm on February 23, 2019: member

In general I think that being proactive about style changes is the wrong risk-reward tradeoff for this project. If we’re looking to come up with examples of good style that we can point new contributors to, surely we can do that in other ways, such as by having more extensive examples that fit our style guidelines in our docs?

practicalswift commented at 1:23 am on February 24, 2019: contributor

Very strong concept ACK

In some areas it is better to have a decision; any decision, followed consistently than to have no decision and continued debate. Code style is one such area.

Using an automatically enforced canonical form will free up reviewer time and energy. It will also eliminate the need for code style cleanup PR:s.

In addition to freeing up reviewer time it will also potentially free up PR author time since this will allow developers to set-up pre-commit hooks and have their code auto-formatted before commit.

In summary: after this one-time change no contributor or reviewer will ever have to think about this ever again.

I suggest using black for Python code and clang-format for C++ code. (And git hyper-blame in the local toolset to ignore the change commit when blaming.)

Suggested scripted-diff:s and the corresponding Travis checks:

Python:

0# scripted-diff
1pip3 install --upgrade black
2black .
3
4# travis check to guard against regressions
5black --check --diff .

FWIW, Python PEP-8 health before and after black:

0$ flake8 --ignore=E203,E302,E501,E712,E713,E722,E731,F632,F841,W503,W605 . | wc -l
11868
2$ black .
3$ flake8 --ignore=E203,E302,E501,E712,E713,E722,E731,F632,F841,W503,W605 . | wc -l
40

C++:

 0# scripted-diff
 1apt install clang-format
 2git ls-files -- ":(exclude)depends/" ":(exclude)src/leveldb/" \
 3    ":(exclude)src/secp256k1/" ":(exclude)src/univalue/" \
 4    ":(exclude)src/crypto/ctaes/" "*.cpp" "*.h" | \
 5    xargs clang-format -style=file -i
 6
 7# travis check to guard against regressions
 8git ls-files -- ":(exclude)depends/" ":(exclude)src/leveldb/" \
 9    ":(exclude)src/secp256k1/" ":(exclude)src/univalue/" \
10    ":(exclude)src/crypto/ctaes/" "*.cpp" "*.h" | \
11    xargs clang-format -style=file -i
12git diff --exit-code

sipa commented at 2:36 am on February 24, 2019: member

NACK, sorry, I very much understand the sentiment and would also prefer being able to stick to more consistency. However, in this project, I don’t think it is the right choice for prioritize reviewer and developer time.

luke-jr commented at 5:58 am on February 24, 2019: member

NACK also. I think it would be more productive to just stop caring about code style unless it’s actively harmful in some way (eg, mis-indents or the braces-or-sameline policy). We can have guidelines, but it’s not really a good use of time to have back-and-forth over style that at the end of the day doesn’t matter.

laanwj commented at 8:37 am on February 24, 2019: member

Agree with @sipa and @luke-jr’s sentiment here. Please, let’s spend our work on fixing user side issues, it’s not like we’ve run out of issues and PRs and the only thing left is to polish it up now.

practicalswift commented at 9:58 am on February 24, 2019: contributor

@sipa Out of curiosity, what changed since #4498? :-) @luke-jr @laanwj I think we all agree that caring about code style is largely a waste of time. The point is that status quo means that we’ll keep wasting that time. On the other hand, by doing the suggested one-time change using black and clang-format and enforcing it mechanically via a Travis check (+ corresponding pre-commit hook locally) we’ll never have to think about this ever again. We will have literally no back-and-forths over style and thus no time will be wasted in the future, no?

gmaxwell commented at 10:56 am on February 24, 2019: contributor

I feel like we have already unintentionally swung fairly far towards a trade-off that optimizes for various more aesthetically oriented ideas of ‘good code’ over more prosaic definitions of software quality like stability, innovation, or general functionality…

[I say aesthetically intentionally: very little coding-standard advice has ever been experimentally verified to reduce the defect rate of software or improve development speed. I support our coding standards and also support the spread of their application, but we should not deceive ourselves into thinking that we know they provide a substantial level of benefit beyond aesthetically.]

There is already a large number of changes going on already that disrupt longer living PRs that are improving the software in ways that are unambiguously visible to users, to the point where multiple (previously) high volume contributors have cited it as a reason for decreasing their involvement in the project.

My own view is that this pattern is exacerbated by the fact that changes that are intended to have no effect feel a lot safer, so they are more attractive to make (more stuff done), easier to convince people to merge and as a result end up in a fast-lane compared to substantive changes that carry some known risk, require more specialized review, have review delayed due to changeset instability due to style nits, and are harder to mechanically rebase.

Because the most dangerous change is an unsafe change falsely believed to be intrinsically safe many refactors even carry a “coerced review” cost– where it looks sure to get merged quickly but where someone feel forced to review because it might not actually be as safe as people think. I’ve reviewed more than a few changes that I thought were entirely pointless, due to this. Doing a little bit of work you don’t care about is a cost of collaboration, but we need to manage that cost. Coding style changes are probably among the most extreme in terms of the small amount of benefit they provide users vs the amount of collateral damage they do to other development.

There is another option to wasting time going back and forth over style: We could simply stop cutting ourselves, by being more relaxed about it. – exactly as Luke-jr says. Ignore the smallest of nits, make style advice in a purely advisory way (as in, feel free to adjust if you happen to rebase, otherwise don’t bother), move style adjustments– if any– until right before merging and not as soon as a PR opens. We’ve operated in that fashion before and I think the project accomplished a lot more at that time and was more enjoyable. (Likely not all due to coding style nits, but I would be surprised if it wasn’t a factor.)

If I thought we were currently hitting a good stride, I wouldn’t have any particular qualm with hitting parts of the system with formatting cleanups especially if an effort was made to specifically avoid disrupting other work. In a really healthy state we would have have a resource surplus to absorb some of the disruption that sort of work would create but I don’t feel we do now. At this time I don’t even think it is worth considering.

(I also wouldn’t apply the above reasoning to a specific narrow changes with a clear and strong benefit profile. If someone were to a case could be made for going and– say– adding missing bracing, based on concrete information in research and past experience in our own codebase about defects being hidden by missing braces I would think it was worth considering. … But not “everything in the style guide because its in the style guide”)

gmaxwell commented at 11:03 am on February 24, 2019: contributor

As an aside, something to keep in mind is that refactors “immediately after a major release” impose a substantial cost on fix backports. I also agree with the principle that refactors bunched up is better then spread out, but more/frequent vs fewer/less often is another critical dimension too.

[my first thought on seeing this thread before seeing wumpus’ response was to send him a pleading that if we were to do this, we’d do it in a limited window and strongly discourage any refactors outside of that window, in trade.]

jnewbery commented at 3:38 pm on February 28, 2019: member

Thanks everyone for the thoughtful input. Needless to say, I don’t think this is a burning priority, but I do think that we could reduce overall disruption and improve the developer experience by being more strategic in how we tackle code style. A few comments:

a couple of comments above mention appropriateness for this project. I don’t think considering this project as a uniform whole is a useful model. The project is made up of many different parts, each with their own properties, which should be considered separately. For example, changing code style in the rpc code is a vastly different trade-off from touching net_processing.cpp, validation.cpp or EvalScript(). That’s why I suggested attempting this for a small number of files or self-contained module.
All the comments above have been from long-time contributors. You all have the great advantage of knowing the history of the codebase and having context on why a coin in the collection of coins is called a CCoinsCacheEntry, but a single coin is a Coin (why not CCoin?), or why wallets contain nWalletVersion, TxSpends and m_location, or 100 other quirks of the code. You have the disadvantage that it’s difficult for you to see the codebase with the eyes of a new contributor, where the lack of consistency is quite confusing. I found naming in the bitcoin core codebase very confusing when I started looking at the code 2 years ago. Arguably it’s now even worse because we’ve partially adopted a new style and so there’s even more inconsistency.
I maintain that given that we do have code style guidelines, doing a one-time fixup reduces the overall workload and improves contributor experience and the quality of review. For example, if I touch a single line if block in my PR, am I obligated to add braces? Either way, I’m likely to get review comments saying “add braces” or “don’t change unrelated code style”. Starting from a point where the code matches the style guide would avoid these wasted review cycles.
Finally, I’ve received out-of-band feedback from two separate contributors who say that they want to comment on this issue but feel like the debate has been shut down and they wouldn’t be listened to. That in itself is troubling.

practicalswift commented at 7:32 pm on February 28, 2019: contributor

@jnewbery Thanks for the summary. FWIW I agree 100%.

I find the last point extremely troubling and sad.

When evaluating arguments, we should ask ourselves a.) if the premises are true and b.) if the premises provide enough logical support for the conclusion. Nothing more. Nothing less.

If some contributors feel like their best arguments won’t be evaluated that way we risk ending up in very dangerous territory.

MarcoFalke commented at 8:54 pm on March 4, 2019: member

I think the current guideline of refactoring/changing the style only when you have a feature, bugfix or tests that build on top of those changes is reasonable and works well. With that, you are effectively asking the common reviewers and commiters in the “module” you change for approval. Assume you want to do something to ./src/wallet/ and get approval from a bunch of recent wallet-reviewers, because they all like it (for whatever reason), the change should be fine to go in.

moneyball commented at 1:12 am on March 23, 2019: contributor

I read through this PR just now and had the following impressions: a) I thought good arguments were made by several people NACKing b) Since most of the early arguments expressed were NACKs I did feel like NACK was appropriate, but that is because I thought the arguments presented were good, not because I had the impression the debate was shut down or that others’ views wouldn’t be listened to. c) I then read @jnewbery’s follow-up response and felt it made the best arguments for ACK, especially the point about making it easier for new contributors. I am optimistic that if others post good arguments for ACK that they will be weighed.

I would encourage any contributor to speak up if they have a good argument for or against this or any change.

laanwj commented at 8:11 am on March 23, 2019: member

I found naming in the bitcoin core codebase very confusing when I started looking at the code 2 years ago.

For what it’s worth I agree that the code base is intimidating to newcomers. I think this is mostly due to lack of documentation of the architecture, and lack of clear interfaces (though this is improving, for sure!) between components. I don’t think surface-level aesthetics is very important here. Whether a variable is called nFee or n_fee or braces wrap around is not going to affect understanding as much as whether, say, the function it is in has a good description and Doxygen documentation for all arguments. And high-level documentation of how things work and interact, would be even nicer.

What kind of frustrates me, personally, is that we didn’t manage to clearly separate out the consensus parts. Not to a library (which would be useful for rust-bitcoin et al), but also not even to a separate directory. I was a bit shocked by #15618 which snuck in a (luckily harmless) change into validation.cpp in a completely unrelated PR.

If we need any more linters it should be to check for unexpected changes to consensus code, for example! This is dangerous territory.

IMO, the goal of this project is to maintain the correct consensus on the bitcoin network. All else is secondary. Again, I feel that the focus of some people on surface-level code style issues is mistaken.

doing a one-time fixup reduces the overall workload and improves contributor experience and the quality of review

If it would really be that, it would be great. However I don’t believe this would actually be a one-time change. I’m afraid it will just encourage people to be even more nitpicky, both in filing “fixup” PRs and commenting in PRs about code style issues.

Assume you want to do something to ./src/wallet/ and get approval from a bunch of recent wallet-reviewers

Yep. Ideally we would automate this through CODEOWNERS (https://github.blog/2017-07-06-introducing-code-owners/).

gmaxwell commented at 10:01 am on March 23, 2019: contributor

My view hasn’t changed.

I’m dubious of the claim that adherence to code style is a material impediment to many new contributors, and to the extent that it is the fact that PRs get style nitpicked all to heck is probably a greater deterrent than consistency.

For example, if I touch a single line if block in my PR, am I obligated to add braces? Either way, I’m likely to get review comments saying “add braces” or “don’t change unrelated code style”.

I think this is more of a symptom of the current review process that a massive style change would not resolve– yes, the instances of questions of if you’re responsible to make a style update to a piece of code may be reduced somewhat but I think the complaint is just a symptom that the review is inconsistent and focused on minutia (and focused on minutia in a way that feels oppressive rather than enabling). That underling issue wouldn’t be resolved by the refactor, but just be moved to any of the hundreds of other bits of minutia review that go on (like &/* hugging types or names or whatever).

For all these things– “CCoinsCacheEntry, but a single coin is a Coin (why not CCoin?), or why wallets contain nWalletVersion, TxSpends and m_location”– I don’t know and I have to look them up each time. But that will be true regardless of what they’re named. Anyone just guessing a variable from the name alone would always be taking a significant risk. I’m doubtful of any contributor’s (even experienced ones) ability to craft code in a way which will be automatically by its structure/style apparent to anyone else (even their future selves)– better documentation helps, cleaner interfaces help. Variable naming? once you are past the point of actively misleading names (or meaningless single characters) there are pretty diminishing returns. We specify a style so that you can just fall back on it without having to think and result in something that isn’t too bad. The style is a thinking aid and a time saver but not itself a independent measure of quality.

I’m not completely unreceptive to the view that there are different approaches and these factors are more important to some people than others… and I do appreciate the efforts to improve the software in ways that are themselves not very meaningful to me. But again I reiterate: the background level of refactors, style changes, cleanups, and other related activity is actively repelling multiple long term contributors, myself included. I beg: give it a bit of a rest, we have so many other things that are crying for our attention and our resolve. We should try to find some initiatives that we all feel more exited about, success with them would make it easier to work on other things… but right now a big style change is just not a unifying effort.

laanwj commented at 10:11 am on March 23, 2019: member

I think this is more of a symptom of the current review process that a massive style change would not resolve– yes, the instances of questions of if you’re responsible to make a style update to a piece of code may be reduced somewhat but I think the complaint is just a symptom that the review is inconsistent and focused on minutia

Yes, to be clear, as a contributor you’re never forced to take all review comments into account. It’s perfectly fine to say that you don’t want to do something in a PR if you feel it is outside the scope of the change, including/especially if it seems extremely nitpicky. What can hold up merge is if you don’t respond to them at all, because it looks like “author hasn’t got to it yet”.

lucayepa commented at 11:47 pm on March 23, 2019: contributor

On one side, as stated by some long term contributors, a lot of time is wasted in style related comments, issues and PRs. On the other side, we have confused contributors and, maybe, we have less contribution, because of style doubts.

I think this can be solved with policies that have a long grace period (say five years). In order not to waste any time on style, the policies can be written on a wiki, like the release notes, or on a different project, not with our code, so that the editing of the style guidelines is not a PR itself. This way, every time we have some comment on style, we can link the wiki and spend less time on this kind of comments. On the other side, a new contributor (or an old, as well) has a guideline to follow.

The policies will have two states: “grace period” and “enforced”. During the grace period, contributors are requested (but not forced) to write new code compliant with the rules. At the end of the grace period of each rule, we will decide if migrating existing code or not (something like #15612 and #15638).

Example of rules can be (I’m not advocating these rules; they are only examples, taken from Google C++ style guide):

“Objects with static storage duration are forbidden, unless they are trivially destructible.”
“We do not use exceptions.”
“Each line of text in your code should be at most 80 characters long.”

This approach (rule -> long grace period -> enforcement decided (or not) at the end of the long period) will freed a lot of time wasted on this type of recurring discussions.

Just my two cents.

flack commented at 9:40 am on March 24, 2019: contributor

Maybe I’m missing something, but doesn’t the current policy (style guide only applies to new code) pretty much guarantee that you get the worst of both worlds? i.e. you get all the review overhead of people wanting to make new code conform to it, and none of the (perceived) benefits, since the codebase will be just as inconsistent as it was before.

If the style guide was enforced, you could just have some tool format your code before committing and then forget about it. Or if there was no style guide at all, you could just commit what you think is best. But the way it is now, you have to know and follow all these rules (and you’ll get review notes if you don’t), but still filter out all the rule violations in the files you are working on since “it’s been like this before”

ryanofsky commented at 12:26 pm on March 24, 2019: member

Maybe I’m missing something, but doesn’t the current policy (style guide only applies to new code) pretty much guarantee that you get the worst of both worlds?

This matches my experience where projects that have a more consistent style also tend have less discussion about style. Because of this, some of gmaxwell’s and laanwj’s arguments to me sound something like, “I don’t want to go to the doctor today because I’m feeling sick.”

That said, I think whether or not this proposal is a good idea depends on the details of the proposal. I strongly agree with Greg’s “if we were to do this, we’d do it in a limited window and strongly discourage any refactors outside of that window,” and he’s right that merging style PRs immediately after branching would make backporting unnecessarily difficult. I also like Marco’s and Wladimir’s ideas about treating validation code differently.

To me an good proposal look something like:

There will be a 2 week window a few weeks after each release (roughly every 6 months) where we consider merging style and refactoring-only PRs.
Developers will be free to submit, review, and ACK style PRs outside the window, but the PRs will only be considered for merging outside the window if they help implement a user-visible feature or make a measurable performance improvement.
PRs should avoid changing validation and non-validation code at the same time, and there will be a much higher bar for accepting style changes to validation code.

lucayepa commented at 1:15 pm on March 24, 2019: contributor

“I don’t want to go to the doctor today because I’m feeling sick.”

Agree. It seems something that we are procrastinating without a reason.

There will be a 2 week window a few weeks after each release (roughly every 6 months) where we consider merging style and refactoring-only PRs.

Disagree. This will block the development for two weeks every six months. For about nothing.

I prefer my idea of a long grace period for each rule. We are plenty of time to have every PR with the new code before of the enforcement, so we don’t need to backport anything.

ryanofsky commented at 2:38 pm on March 24, 2019: member

There will be a 2 week window a few weeks after each release (roughly every 6 months) where we consider merging style and refactoring-only PRs.

Disagree. This will block the development for two weeks every six months. For about nothing.

In case there is a misunderstanding, the idea is not that cleanup PRs should be developed in these two weeks. More the opposite: they should be discussed, implemented, posted, reviewed, and ACKed before this time and just merged in the 2 week window.

ajtowns commented at 6:38 am on May 22, 2019: member

I’m really not a fan of having code style guidelines that don’t really match how the code is currently written; particularly the class/member variable naming when things like “nSequence” are protocol-level terms. In particular, having multiple styles within a single function/class seems like it makes things harder, rather than easier. On the other hand, spending lots of cycles changing the code to match the guidelines seems high risk and low reward.

So, personally, I’d rather revisit #10461 and add something like “hungarian-style with camel case is fine for local variables, public/protected class members and function parameters if it matches nearby/related code; prefixes are “n” for number, “p[class]” for pointers, […].”

Otherwise, how about changing all (non-const, non-static) globals to actually have the g_ prefix? I could see that potentially being a refactor to make things more consistent that actually has a worthwhile risk/reward payoff (since it’s then easier to find globals, and harder to accidently shadow/alias them via locals or class members)?

jnewbery commented at 3:13 pm on May 23, 2019: member

I’m really not a fan of having code style guidelines that don’t really match how the code is currently written; particularly the class/member variable naming when things like “nSequence” are protocol-level terms. In particular, having multiple styles within a single function/class seems like it makes things harder, rather than easier.

Couldn’t agree more.

I’d rather […] add something like “hungarian-style with camel case is fine for local variables…

Bleurgh. I don’t think this helps now that we have a mixed style codebase

Otherwise, how about changing all (non-const, non-static) globals to actually have the g_ prefix?

Obviously I’m in favour of this :slightly_smiling_face:

MarcoFalke commented at 3:35 pm on May 23, 2019: member

Obviously I’m in favour of this slightly_smiling_face

We use globals for everything (even the mempool is a global) and prefixing all of those with g_ is hard to review because we shadow those globals locally. (E.g. bce301875bcfd09760a3fd87e104c724d2768c47). I’d prefer to transition to code without globals.

jnewbery commented at 4:08 pm on May 23, 2019: member

prefixing all of those with g_ is hard to review because we shadow those globals locally

Surely it’s better to review the changes in a PR that is focused solely on removing the shadowing than to leave them there as a landmine for future PRs?

MarcoFalke commented at 4:46 pm on May 23, 2019: member

Either keeping the shadow or removing it can be landmine, which is why I’d prefer to remove the mempool as a global. Though, that seems off-topic for this “code style PRs” issue.

MarcoFalke commented at 1:00 am on April 27, 2020: member

Closing for now. The discussion can continue and is not “censored” or “shut off”, but I think keeping this open will distract from other issues.

MarcoFalke closed this on Apr 27, 2020

jnewbery commented at 3:41 am on April 27, 2020: member

ACK closing out old stale issues for good housekeeping.

rebroad commented at 7:49 pm on November 28, 2021: contributor

IMHO #22829, for example (as it sneaked in a variable name change), ought to have been NACKed, if I understand the sentiment of this issue correctly. These sorts of changes create more work overall due to the effort to rebase changes based on variable names that change, when this time could instead have been spend on adding or improving features.

JeremyRubin commented at 8:38 pm on December 15, 2021: contributor

i also had this experience with #23546 recently, which is annoying since the conflicts are depends-on for a larger PR.

overall i still think we’d be best to rip the band-aid and just clang-format the whole damn thing.

some ideas for how to do it safely:

write a tool that can easily re-write a branch with clang-format on every patch.
write a tool that can compare deterministic build binaries before/after clang format (can we make that deterministic)?
write a bot that sends an email to all devs with a simple to read shell script that reworks all their open PRs with tool 1
clang-format the whole project, file-by-file in commits to permit bisecting any issues.
move on with our lives

With respect to renaming of variables/class names, we could also include in the script renaming for C-prefixed classes if we want to… certain things like “nSequence” we probably can’t rename given their extensive use in protocol documentation outside of this repo.

MarcoFalke commented at 8:53 pm on December 15, 2021: member

i also had this experience with #23546 recently, which is annoying since the conflicts are depends-on for a larger PR.

The conflicts caused by this pull are all trivial to solve, most of them just causing a conflict because an adjacent line was touched. It should cost less than 10 seconds to solve them by hand. With the right tool, even less than 10 milliseconds, as explained in #22954 (comment) .

overall i still think we’d be best to rip the band-aid and just clang-format the whole damn thing.

I tend to disagree. clang-format on a large scale will need to be reviewed very carefully. It is rare, but does happen that clang-format produces code that is almost impossible to read after a reformat.

write a tool that can compare deterministic build binaries before/after clang format (can we make that deterministic)?

I’ve seen binaries change with just a single empty line added or removed in a header file. https://github.com/bitcoin-core/bitcoin-maintainer-tools#build-for-compare might be a good start, but I think developing a tool that is actually good at this will be tricky.

Also, 23546 was a clang-tidy change, not clang-format ;)

JeremyRubin commented at 9:59 pm on December 15, 2021: contributor

i though clang-tidy changed the variable names, which means I will have to change the variable names too. I don’t mean to complain though, I’m not upset, I can rebase, just need to get to it (and that code was a bit tricky since the way those tests are written is really brittle).

I’m overall for things like clang-tidy, clang-format, etc. I want to spend less than 0 brain cycles thinking about this, and we are experiencing a lot of burnout which I view losing 1 contributor and qualified regular reviewer over as a much bigger security risk than clang-format introducing a bug.

jnewbery commented at 11:19 am on December 16, 2021: member

I’m going to lock the conversation on this issue. I opened it to propose a specific action (do a significant code style cleanup after the v0.18 branch-off). It’s now spread into general discussion about clang-tidy, clang-format, people’s grievances about rebasing their own PRs, etc. I think it’s fine to have those discussions somewhere, but we can spare the inboxes of the people who commented on this issue nearly 4 years ago.

jnewbery locked this on Dec 16, 2021

Code style PRs after v0.18 branch split #15465