Coding style update with clang-format #4498

pull sipa wants to merge 1 commits into bitcoin:master from sipa:clangformat changing 2 files +82 −45
  1. sipa commented at 4:25 pm on July 9, 2014: member

    This pull request just to show what the effect of running a recent clang-format-3.5 (from clang’s PPA; the version in the Trusty repository misses some features still) would be on our source code.

    For now, I’ve set the column limit to 0, which means it should respect existing line breaking decisions, but as you can see the result is still fairly arbitrary (it breaks and unbreaks in several place).

    I would prefer using the normal mode where an actual column limit is specified (it breaks quite intelligently, with different penalty scores for breaking in several places), but that deals very badly with the IMPLEMENT_SERIALIZE macros.

    This is not intended for merging. If we decide to adopt this policy and its effects, I’d rerun it just before release candidates, and then hopefully afterwards more frequently, but now it would obviously break every pull request in existence, with 0 benefit.

  2. sipa commented at 4:58 pm on July 9, 2014: member
    Even with column limit 0, IMPLEMENT_SERIALIZE doesn’t work well. See addrman.h, where it suddenly loses two levels of indentation, starting inside the serialize implementation, up until the end of the file.
  3. in src/addrman.h: in 24f4aa15b4 outdated
    52         READWRITE(*pthis);
    53         READWRITE(source);
    54         READWRITE(nLastSuccess);
    55-        READWRITE(nAttempts);
    56-    )
    57+        READWRITE(nAttempts);)
    


    jgarzik commented at 5:32 pm on July 9, 2014:
    Yuck (as @sipa already commented)
  4. in src/addrman.h: in 24f4aa15b4 outdated
    311-                    {
    312-                        int nIndex = mapUnkIds[*it2];
    313-                        READWRITE(nIndex);
    314-                    }
    315-                }
    316+        int nUBuckets = ADDRMAN_NEW_BUCKET_COUNT;
    


    jgarzik commented at 5:33 pm on July 9, 2014:
    Something odd is going on with indentation here. From reading the diff (ie. possibly not the whole story), it looks like it un-indented one level too much.
  5. in src/addrman.h: in 24f4aa15b4 outdated
    466-    {
    467-         nKey.resize(32);
    468-         RAND_bytes(&nKey[0], 32);
    469+{
    470+    nKey.resize(32);
    471+    RAND_bytes(&nKey[0], 32);
    


    jgarzik commented at 5:34 pm on July 9, 2014:
    Ah, I see what’s going on. It’s not keeping the indentation from the header file’s method implementation.
  6. kazcw commented at 9:04 pm on July 9, 2014: contributor

    Putting curlies inside the parens gets reasonable clang-format results:

    0IMPLEMENT_SERIALIZE({
    1    READWRITE(stuff);
    2})
    
  7. sipa commented at 3:49 pm on July 10, 2014: member

    @kazcw Look at addrman.h, where even that construct fails in a very odd way (we lose indentation, even after the IMPLEMENT_SERIALIZE ends).

    And when setting an explicit line length, it’s even worse.

  8. in src/serialize.h: in 24f4aa15b4 outdated
    436@@ -334,8 +437,8 @@ I ReadVarInt(Stream& is)
    437     }
    438 }
    439 
    440-#define FLATDATA(obj)  REF(CFlatData((char*)&(obj), (char*)&(obj) + sizeof(obj)))
    441-#define VARINT(obj)    REF(WrapVarInt(REF(obj)))
    442+#define FLATDATA(obj) REF(CFlatData((char*)&(obj), (char*)&(obj) + sizeof(obj)))
    443+#define VARINT(obj) REF(WrapVarInt(REF(obj)))
    


    luke-jr commented at 3:54 pm on July 10, 2014:
    Does it need to break aligned lines? :(
  9. in src/.clang-format: in 24f4aa15b4 outdated
    35+PenaltyBreakComment: 300
    36+PenaltyBreakFirstLessLess: 120
    37+PenaltyBreakString: 1000
    38+PenaltyExcessCharacter: 1000000
    39+PenaltyReturnTypeOnItsOwnLine: 200
    40+PointerAlignment: Left
    


    luke-jr commented at 3:55 pm on July 10, 2014:
    As much as I wish PointerAlignment Left made sense, C/C++ sadly are designed with it Right semantically.

    sipa commented at 3:58 pm on July 10, 2014:
    I agree, but I don’t feel like bikeshedding.

    jgarzik commented at 4:01 pm on July 10, 2014:
    AFAICT, our existing codebase uses ‘Right’

    sipa commented at 4:02 pm on July 10, 2014:
    It has a mix. According to @laanwj the majority seems to use Left.

    sipa commented at 4:03 pm on July 10, 2014:
    Don’t just look at these files. Much code that I wrote uses Right, which is a lot in addrman.h and main.cpp.

    jgarzik commented at 4:08 pm on July 10, 2014:

    I disagree with @laanwj ’s assessment. The majority of new code is Right, looking at all files in src/*.{cpp,h}

    Of course, heh, there are nutters out there such as,

    0CDB::CDB(const char *pszFile, const char* pszMode) :
    

    laanwj commented at 9:19 am on July 14, 2014:
    I don’t care either way, just pick one.

    sipa commented at 9:12 pm on July 28, 2014:
    Left results in 1000 lines less diff. Left it is.
  10. kazcw commented at 5:29 pm on July 10, 2014: contributor

    @sipa that’s not what I’m seeing in addrman.h (with or without ColumnLimit):

     0public:
     1    IMPLEMENT_SERIALIZE({
     2        CAddress* pthis = (CAddress*)(this);
     3        READWRITE(*pthis);
     4        READWRITE(source);
     5        READWRITE(nLastSuccess);
     6        READWRITE(nAttempts);
     7    })
     8    void Init()
     9    {
    10        nLastSuccess = 0;
    

    I’m using a clang from yesterday [4a3ff91], so it could be a recent change. I do see behavior that looks like what you’re describing when I open with ({ and close with )}.

  11. sipa commented at 5:31 pm on July 10, 2014: member
    @kazcw That one works. Look at the larger one in CAddrMan.
  12. sipa commented at 5:37 pm on July 10, 2014: member
    Upgrading; let’s see.
  13. kazcw commented at 5:40 pm on July 10, 2014: contributor
    @sipa: I see, that block makes clang-format just give up. Maybe clang-format has a nesting limit for blocks within apparent function calls or something. I found that it works if you use just ({ and }) rather than the current (({{ and }})), though.
  14. sipa commented at 2:23 pm on July 17, 2014: member
    Rebased on #4508, and changed pointer alignment to right.
  15. jgarzik commented at 2:26 pm on July 17, 2014: contributor
    I retract my previous position. I think pointer* and reference& should cuddle with the type, rather than the name.
  16. jgarzik commented at 2:28 pm on July 17, 2014: contributor

    It is disappointing that we lose multi-column alignments, e.g.

    0- SIGHASH_FOO          = (second column nicely aligned)
    1+ SIGHASH_FOO = (left aligned)
    
  17. sipa commented at 2:30 pm on July 17, 2014: member
    @jgarzik Agree. it retains it in some places (for comments eg), but not always.
  18. jgarzik commented at 2:36 pm on July 17, 2014: contributor

    We lose indented cpp

    0-# error "Bitcoin cannot be compiled without assertions."
    1+#error "Bitcoin cannot be compiled without assertions."
    
  19. jgarzik commented at 2:47 pm on July 17, 2014: contributor
    Overall I think it is a net-win.
  20. jtimon commented at 3:40 pm on July 17, 2014: contributor
    I also prefer pointer* ptr and reference& ref over pointer *ptr and reference &ref. It is more clear to me since being a pointer is really part of type. But I don’t think it is important enough to change it unless you’re already touching that line.
    Anyway, this is bike shedding, we should just always use one or the other.
  21. Diapolo commented at 8:04 pm on July 17, 2014: none
    I love the intention of this pull, I really love how it feels to have a unified style all over the code! Only small drawback is that (as @jgarzik calls them), multi-column alignments are lost, which are nice for readability. @sipa @laanwj What about the include order policy? This won’t be fixed by this, right? If we intent to merge this, am I allowed to fix the includes again or not?
  22. sipa commented at 8:09 pm on July 17, 2014: member
    So this is purely about indentation/whitespace. Anything that changes anything but a space or newline is outside of scope here. That doesn’t mean we can have a policy about other things too - it just won’t be automatically enforceable.
  23. laanwj commented at 8:15 pm on July 17, 2014: member

    It is disappointing that we lose multi-column alignments, e.g.

    • SIGHASH_FOO = (second column nicely aligned)
    • SIGHASH_FOO = (left aligned)

    I’m happy that we lose manual alignments like that. No more code changes that change surrounding lines to ‘realign’ when the point is just to add one another constant (that happens to be longer than the rest). Or pulls that ‘fix’ the silly column alignment afterwards.

  24. sipa commented at 8:55 pm on July 17, 2014: member

    Added more examples. Ugly things:

    • chainparams.cpp: the list of seed IPs gets expanded tot 1 per line.
    • uint256.h: one of the friend operators ends up concatenated to the previous line (perhaps confused by « »?).
    • The usual inconsistent splitting/rejoining of long lines.
  25. laanwj commented at 7:21 am on July 18, 2014: member

    chainparams.cpp: the list of seed IPs

    As this is auto-generated data, it makes sense to move it to an external include file that is not affected by clang-format. Talking about autogenerated files, bitcoinstrings.cpp should also be excluded.

  26. sipa commented at 11:24 am on July 27, 2014: member

    Can I have some ACKs on the general effect of these?

    If accepted, I’ll remove the example change here and just commit the .clang-format file.

    Next steps: apply the changes one by one to infrequently-changing files. The majority of them will probably need to wait until close to release candidate stage, as to not affect pull requests.

  27. jgarzik commented at 2:12 pm on July 27, 2014: contributor
    ACK general direction
  28. jtimon commented at 2:43 am on July 28, 2014: contributor

    ACK on general effect, absolutely. The roadmap seems reasonable too. And even if we will need to eventually break some PR, I wouldn’t worry too much since the conflicts in the rebase should be trivial to resolve.

    Now excuse me for my bike shedding… By changing a couple of parameters:

    0PointerAlignment: Left
    1SpacesBeforeTrailingComments: 1
    

    We get less changes in the example files as shown in https://github.com/jtimon/bitcoin/compare/clang?expand=1 Well, I also initialize chainparams.cpp::pnSeed in a single line… The spaces change seems pretty obvious but probably the pointer alignment is more controversial. I propose to leave tastes aside and use minimal changes required to impose a uniform style as criterion of quality for the clang-format file. That would make the transition less painful. Maybe what is true in these example files isn’t true in the whole project. But although apparently both alignments are used for local variables and parameters, left seems to be consitently used in return values, castings and template instanciations (for example, map<uint256, CBlockIndex*>::iterator instead of map<uint256, CBlockIndex *>::iterator). Well, as said this is bike shedding, but I think minimizing unnecessary changes makes sense as a criterion, doesn’t it?

  29. jtimon commented at 2:59 am on July 28, 2014: contributor
    Oh, also added BOOST_REVERSE_FOREACH to ForEachMacros
  30. laanwj commented at 7:42 am on July 28, 2014: member
    (untested :p) ACK on general effects
  31. Diapolo commented at 10:03 am on July 28, 2014: none
    (untested) ACK
  32. sipa commented at 6:33 pm on July 28, 2014: member
    @jtimon: thanks for backing up that suggestion with actual numbers (less changes to the code is indeed what I was aiming for mostly at this point).
  33. sipa commented at 7:40 pm on July 28, 2014: member

    Added BOOST_REVERSE_FOREACH to ForEachMacros and changed SpacesBeforeTrailingComments to 1.

    Then tried both PointerAlignment: Left and Right for src/*.{h,cpp}

    • Left: -8195 lines +8206 lines
    • Right: -9124 lines +9135 lines

    I’m switching to: PointerAlignment: Left as well.

  34. Update coding style and add .clang-format 2887bffcfd
  35. sipa commented at 8:09 pm on July 28, 2014: member
    Ready for review. I’ve updated coding-style.md as well. Note that I’ve removed the variable/class naming style as well - it wasn’t being followed strictly anymore anyway.
  36. sipa renamed this:
    Preview: clang-format result
    Coding style update with clang-format
    on Jul 28, 2014
  37. sipa commented at 8:18 pm on July 28, 2014: member
  38. BitcoinPullTester commented at 8:21 pm on July 28, 2014: none
    Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/p4498_2887bffcfdc138f53c60091ab2906790971c9af5/ for binaries and test log. This test script verifies pulls every time they are updated. It, however, dies sometimes and fails to test properly. If you are waiting on a test, please check timestamps to verify that the test.log is moving at http://jenkins.bluematt.me/pull-tester/current/ Contact BlueMatt on freenode if something looks broken.
  39. jgarzik commented at 8:59 pm on July 28, 2014: contributor
    ACK
  40. jtimon commented at 9:17 pm on July 28, 2014: contributor
    ACK. It would be nice to add a “format” make target later.
  41. jtimon commented at 9:39 pm on July 28, 2014: contributor
    By the way, I’ve tried to find an alternative to @kazcw’s solution to implement_serialize on http://clang.llvm.org/docs/ClangFormatStyleOptions.html with no luck, but the curlies inside the parenthesis don’t look bad in that case in my opinion. Also, AllowShortBlocksOnASingleLine: true would be good in terms of changes, but in that case I think I would prefer not to think about what’s more appropriate for each case, which is precisely what you’re trying to avoid when automatizing the style.
  42. sipa commented at 9:41 pm on July 28, 2014: member
    I’m going to try to replace the IMPLEMENT_SERIALIZE macros with just (templated) class methods.
  43. laanwj added the label Improvement on Jul 31, 2014
  44. laanwj merged this on Aug 6, 2014
  45. laanwj closed this on Aug 6, 2014

  46. laanwj referenced this in commit 8833acc4c9 on Aug 6, 2014
  47. Diapolo commented at 8:19 pm on August 6, 2014: none
    Thanks!
  48. kostaz commented at 7:45 am on September 14, 2014: contributor

    Hi guys,

    I’ve proposed the coding style changes in the Fix coding style with uncrustify [#4891](/bitcoin-bitcoin/4891/). I’ve used uncrustify tool, but the tool is not important, the coding style is. :-) Anyway, the configuration file for the uncrustify tool is in the mentioned pull merge.

    The rationale for most coding style changes are Stroustrup, Google coding style and common visual and readability sense.

    By your generous permission, I’d like to discuss the coding style changes here and if they can be made automatic using git clang-format. The main purpose of the proposed coding style is to make the code more readable and, hence, bring more developers to the Bitcoin Core.

    The changes are…


    Always add curly braces.


    Always add curly braces to for, if, while, do-while, try-catch, boost loops and any possible construct.

    Example:

    0     if (pnId)
    1+    {
    2         *pnId = nId;
    3+    }
    

    Example:

    0 std::string CUnsignedAlert::ToString() const
    1 {
    2     std::string strSetCancel;
    3-    BOOST_FOREACH(int n, setCancel)
    4
    5+    BOOST_FOREACH (int n, setCancel)
    6+    {
    7         strSetCancel += strprintf("%d ", n);
    8+    }
    9+
    

    Example:

    0                 if (fThread)
    1+                {
    2                     boost::thread t(runCommand, strCmd); // thread runs free
    3+                }
    4                 else
    5+                {
    6                     runCommand(strCmd);
    7+                }
    

    Always put curly braces on the new line.


     0 static bool findSighashFlags(int& flags, const string& flagStr)
     1 {
     2     flags = 0;
     3
     4-    for (unsigned int i = 0; i < N_SIGHASH_OPTS; i++) {
     5-        if (flagStr == sighashOptions[i].flagStr) {
     6
     7+    for (unsigned int i = 0; i < N_SIGHASH_OPTS; i++)
     8+    {
     9+        if (flagStr == sighashOptions[i].flagStr)
    10+        {
    11             flags = sighashOptions[i].flags;
    

    Surround curly-braced code with blank lines.


    Example:

    0         boost::mutex::scoped_lock lock(mutex);
    1-        if(!size) return;
    2+
    3+        if (!size)
    4+        {
    5+            return;
    6+        }
    7+
    8         const size_t base_addr = reinterpret_cast<size_t>(p);
    9         const size_t start_page = base_addr & page_mask;
    

    Example:

     0 int64_t CTransaction::GetValueOut() const
     1 {
     2     int64_t nValueOut = 0;
     3-    BOOST_FOREACH(const CTxOut& txout, vout)
     4+
     5+    BOOST_FOREACH (const CTxOut& txout, vout)
     6     {
     7         nValueOut += txout.nValue;
     8+
     9         if (!MoneyRange(txout.nValue) || !MoneyRange(nValueOut))
    10+        {
    11             throw std::runtime_error("CTransaction::GetValueOut() : value out of range");
    12+        }
    13     }
    14+
    15     return nValueOut;
    

    Example:

     0         vout.size(),
     1         nLockTime);
     2+
     3     for (unsigned int i = 0; i < vin.size(); i++)
     4+    {
     5         str += "    " + vin[i].ToString() + "\n";
     6+    }
     7+
     8     for (unsigned int i = 0; i < vout.size(); i++)
     9+    {
    10         str += "    " + vout[i].ToString() + "\n";
    11+    }
    12+
    13     return str;
    14 }
    

    No one-liners.


    Example:

     0-    void SetNull() { hash = 0; n = (uint32_t) -1; }
     1-    bool IsNull() const { return (hash == 0 && n == (uint32_t) -1); }
     2+    void SetNull()
     3+    {
     4+        hash = 0;
     5+        n = (uint32_t) -1;
     6+    }
     7+
     8+    bool IsNull() const
     9+    {
    10+        return (hash == 0 && n == (uint32_t) -1);
    11+    }
    

    Example:

     0-    CInPoint() { SetNull(); }
     1-    CInPoint(const CTransaction* ptxIn, uint32_t nIn) { ptx = ptxIn; n = nIn; }
     2-    void SetNull() { ptx = NULL; n = (uint32_t) -1; }
     3-    bool IsNull() const { return (ptx == NULL && n == (uint32_t) -1); }
     4+    CInPoint()
     5+    {
     6+        SetNull();
     7+    }
     8+
     9+    CInPoint(const CTransaction* ptxIn, uint32_t nIn)
    10+    {
    11+        ptx = ptxIn;
    12+        n = nIn;
    13+    }
    14+
    15+    void SetNull()
    16+    {
    17+        ptx = NULL;
    18+        n = (uint32_t) -1;
    19+    }
    20+
    21+    bool IsNull() const
    22+    {
    23+        return (ptx == NULL && n == (uint32_t) -1);
    24+    }
    

    Maximum one blank line.


    Example:

    0     }
    1 };
    2
    3-
    4 class CBlock : public CBlockHeader
    5 {
    6 public:
    

    Could you please help to produce the .clang-format file to test this coding style?

    Thanks, — Kosta

  49. jtimon commented at 4:25 pm on September 14, 2014: contributor

    To change the .clang-format file, you may want to read this:

    http://clang.llvm.org/docs/ClangFormatStyleOptions.html

    First of all, I wouldn’t change anything on it until we have applied the current style to the whole project. Second, you cannot apply a style to the whole project at one because that would break practically all open PRs. To me the biggest benefit of automating the style is not having to think about style, you just follow the rules automatically. Now I’ll comment on the proposed changes:

    -Always add curly braces. -Always put curly braces on the new line.

    I don’t think these increase readability. Is python less readable than C for not having curly braces? I don’t think so.

    -Surround curly-braced code with blank lines. I don’t like to be strict about this. Sometimes it make sense (semantically) to say, divide a function in 3 blocks, those blocks don’t necessarily coincide with curly braces.

    -No one-liners. We discussed this and we decided that allowing them for now would cause less changes.

    -Maximum one blank line. I like this. If you want more separation you can add a comment. I’m not sure what’s the clang option for this though. I think the current behavior is to turn N > 2 empty lines into 2 empty lines.

  50. sipa commented at 6:38 pm on September 14, 2014: member

    I disagree that coding style specifics matter much for readability. It’s mostly convention - and we should mostly be aiming at consistency. People will always have different preferences about coding style.

    If there’s enough people who prefer one style to another, I guess we can change things. But the purpose here is mostly having one style that everyone agrees on, which means the things like indentation changes after refactors can be done automatically, with much easier review requirements (just redo the reformatting locally, and check that the result is the same).

    For most of the choices in the style, the decisions are just based on minimizing changes to the existing code. Because actually reformatting things will take time and won’t be trivial. Just doing all changes all at once will break every single pull request in a very painful way. While redoing formatting afterwards is trivial. So we’ll likely want to delay reformatting changes to just before release (candidate) when big changes are in. Even then, smaller changes are better as not every PR gets in within one release cycle.

  51. kostaz commented at 1:33 pm on September 15, 2014: contributor

    Sipa, it is pity that you think that the coding style is not related to the readability. After all, the code is there to read it, change and write a new one. The C++ source code matters only to humans who mostly read it! Hence, IMHO, more readable code brings more people. Hmmm… exactly like the book - if you can read book easily, you keep reading it. Otherwise, you put it off for a while / for eternity.

    :-(

    Regarding the existing .clang-format file, why not add the git pre-commit hook to run git clang-format?

  52. laanwj commented at 1:44 pm on September 15, 2014: member

    @sipa mentioned a fun thing once: once the entire source code is formatted using clang-format, you could locally format your source after checking out however suits you best, then before checking in re-format it with the ‘canonical’ style file of the project.

    Anyhow - different people have different opinions on what coding style is most readable. That’s subjective. But it is a fact that a consistent coding style is more efficient. Any time saved not worrying about whether a certain { should be at the end or the beginning, or the else, and the eternal discussions, is worth it.

  53. sipa commented at 6:12 pm on September 15, 2014: member
    I fully agree that the source code is there for humans. Humans are however known to have different preferences.
  54. MarcoFalke locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-03 10:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me