Skip to content

Conversation

@sipa
Copy link
Member

@sipa sipa commented Jul 9, 2014

This pull request just to show what the effect of running a recent clang-format-3.5 (from clang's PPA; the version in the Trusty repository misses some features still) would be on our source code.

For now, I've set the column limit to 0, which means it should respect existing line breaking decisions, but as you can see the result is still fairly arbitrary (it breaks and unbreaks in several place).

I would prefer using the normal mode where an actual column limit is specified (it breaks quite intelligently, with different penalty scores for breaking in several places), but that deals very badly with the IMPLEMENT_SERIALIZE macros.

This is not intended for merging. If we decide to adopt this policy and its effects, I'd rerun it just before release candidates, and then hopefully afterwards more frequently, but now it would obviously break every pull request in existence, with 0 benefit.

@sipa
Copy link
Member Author

sipa commented Jul 9, 2014

Even with column limit 0, IMPLEMENT_SERIALIZE doesn't work well. See addrman.h, where it suddenly loses two levels of indentation, starting inside the serialize implementation, up until the end of the file.

src/addrman.h Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yuck (as @sipa already commented)

@kazcw
Copy link
Contributor

kazcw commented Jul 9, 2014

Putting curlies inside the parens gets reasonable clang-format results:

IMPLEMENT_SERIALIZE({
    READWRITE(stuff);
})

@sipa
Copy link
Member Author

sipa commented Jul 10, 2014

@kazcw Look at addrman.h, where even that construct fails in a very odd way (we lose indentation, even after the IMPLEMENT_SERIALIZE ends).

And when setting an explicit line length, it's even worse.

src/serialize.h Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it need to break aligned lines? :(

@kazcw
Copy link
Contributor

kazcw commented Jul 10, 2014

@sipa that's not what I'm seeing in addrman.h (with or without ColumnLimit):

public:
    IMPLEMENT_SERIALIZE({
        CAddress* pthis = (CAddress*)(this);
        READWRITE(*pthis);
        READWRITE(source);
        READWRITE(nLastSuccess);
        READWRITE(nAttempts);
    })
    void Init()
    {
        nLastSuccess = 0;

I'm using a clang from yesterday [4a3ff91], so it could be a recent change. I do see behavior that looks like what you're describing when I open with ({ and close with )}.

@sipa
Copy link
Member Author

sipa commented Jul 10, 2014

@kazcw That one works. Look at the larger one in CAddrMan.

@sipa
Copy link
Member Author

sipa commented Jul 10, 2014

Upgrading; let's see.

@kazcw
Copy link
Contributor

kazcw commented Jul 10, 2014

@sipa: I see, that block makes clang-format just give up. Maybe clang-format has a nesting limit for blocks within apparent function calls or something. I found that it works if you use just ({ and }) rather than the current (({{ and }})), though.

@sipa
Copy link
Member Author

sipa commented Jul 17, 2014

Rebased on #4508, and changed pointer alignment to right.

@jgarzik
Copy link
Contributor

jgarzik commented Jul 17, 2014

I retract my previous position. I think pointer* and reference& should cuddle with the type, rather than the name.

@jgarzik
Copy link
Contributor

jgarzik commented Jul 17, 2014

It is disappointing that we lose multi-column alignments, e.g.

- SIGHASH_FOO          = (second column nicely aligned)
+ SIGHASH_FOO = (left aligned)

@sipa
Copy link
Member Author

sipa commented Jul 17, 2014

@jgarzik Agree. it retains it in some places (for comments eg), but not always.

@jgarzik
Copy link
Contributor

jgarzik commented Jul 17, 2014

We lose indented cpp

-# error "Bitcoin cannot be compiled without assertions."
+#error "Bitcoin cannot be compiled without assertions."

@jgarzik
Copy link
Contributor

jgarzik commented Jul 17, 2014

Overall I think it is a net-win.

@jtimon
Copy link
Contributor

jtimon commented Jul 17, 2014

I also prefer pointer* ptr and reference& ref over pointer *ptr and reference &ref. It is more clear to me since being a pointer is really part of type. But I don't think it is important enough to change it unless you're already touching that line.
Anyway, this is bike shedding, we should just always use one or the other.

@Diapolo
Copy link

Diapolo commented Jul 17, 2014

I love the intention of this pull, I really love how it feels to have a unified style all over the code!
Only small drawback is that (as @jgarzik calls them), multi-column alignments are lost, which are nice for readability.

@sipa @laanwj What about the include order policy? This won't be fixed by this, right? If we intent to merge this, am I allowed to fix the includes again or not?

@sipa
Copy link
Member Author

sipa commented Jul 17, 2014

So this is purely about indentation/whitespace. Anything that changes anything but a space or newline is outside of scope here. That doesn't mean we can have a policy about other things too - it just won't be automatically enforceable.

@laanwj
Copy link
Member

laanwj commented Jul 17, 2014

It is disappointing that we lose multi-column alignments, e.g.

  • SIGHASH_FOO = (second column nicely aligned)
  • SIGHASH_FOO = (left aligned)

I'm happy that we lose manual alignments like that. No more code changes that change surrounding lines to 'realign' when the point is just to add one another constant (that happens to be longer than the rest). Or pulls that 'fix' the silly column alignment afterwards.

@sipa
Copy link
Member Author

sipa commented Jul 17, 2014

Added more examples. Ugly things:

  • chainparams.cpp: the list of seed IPs gets expanded tot 1 per line.
  • uint256.h: one of the friend operators ends up concatenated to the previous line (perhaps confused by << >>?).
  • The usual inconsistent splitting/rejoining of long lines.

@laanwj
Copy link
Member

laanwj commented Jul 18, 2014

chainparams.cpp: the list of seed IPs

As this is auto-generated data, it makes sense to move it to an external include file that is not affected by clang-format. Talking about autogenerated files, bitcoinstrings.cpp should also be excluded.

@sipa
Copy link
Member Author

sipa commented Jul 27, 2014

Can I have some ACKs on the general effect of these?

If accepted, I'll remove the example change here and just commit the .clang-format file.

Next steps: apply the changes one by one to infrequently-changing files. The majority of them will probably need to wait until close to release candidate stage, as to not affect pull requests.

@jgarzik
Copy link
Contributor

jgarzik commented Jul 27, 2014

ACK general direction

@jtimon
Copy link
Contributor

jtimon commented Jul 28, 2014

ACK on general effect, absolutely. The roadmap seems reasonable too. And even if we will need to eventually break some PR, I wouldn't worry too much since the conflicts in the rebase should be trivial to resolve.

Now excuse me for my bike shedding...
By changing a couple of parameters:

PointerAlignment: Left
SpacesBeforeTrailingComments: 1

We get less changes in the example files as shown in https://github.com/jtimon/bitcoin/compare/clang?expand=1
Well, I also initialize chainparams.cpp::pnSeed in a single line...
The spaces change seems pretty obvious but probably the pointer alignment is more controversial.
I propose to leave tastes aside and use minimal changes required to impose a uniform style as criterion of quality for the clang-format file. That would make the transition less painful.
Maybe what is true in these example files isn't true in the whole project. But although apparently both alignments are used for local variables and parameters, left seems to be consitently used in return values, castings and template instanciations (for example, map<uint256, CBlockIndex*>::iterator instead of map<uint256, CBlockIndex *>::iterator).
Well, as said this is bike shedding, but I think minimizing unnecessary changes makes sense as a criterion, doesn't it?

@jtimon
Copy link
Contributor

jtimon commented Jul 28, 2014

Oh, also added BOOST_REVERSE_FOREACH to ForEachMacros

@laanwj
Copy link
Member

laanwj commented Jul 28, 2014

(untested :p) ACK on general effects

@Diapolo
Copy link

Diapolo commented Jul 28, 2014

(untested) ACK

@sipa
Copy link
Member Author

sipa commented Jul 28, 2014

@jtimon: thanks for backing up that suggestion with actual numbers (less changes to the code is indeed what I was aiming for mostly at this point).

@sipa
Copy link
Member Author

sipa commented Jul 28, 2014

Added BOOST_REVERSE_FOREACH to ForEachMacros and changed SpacesBeforeTrailingComments to 1.

Then tried both PointerAlignment: Left and Right for src/*.{h,cpp}

  • Left: -8195 lines +8206 lines
  • Right: -9124 lines +9135 lines

I'm switching to: PointerAlignment: Left as well.

@sipa
Copy link
Member Author

sipa commented Jul 28, 2014

Ready for review. I've updated coding-style.md as well. Note that I've removed the variable/class naming style as well - it wasn't being followed strictly anymore anyway.

@sipa sipa changed the title Preview: clang-format result Coding style update with clang-format Jul 28, 2014
@sipa
Copy link
Member Author

sipa commented Jul 28, 2014

@BitcoinPullTester
Copy link

Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/p4498_2887bffcfdc138f53c60091ab2906790971c9af5/ for binaries and test log.
This test script verifies pulls every time they are updated. It, however, dies sometimes and fails to test properly. If you are waiting on a test, please check timestamps to verify that the test.log is moving at http://jenkins.bluematt.me/pull-tester/current/
Contact BlueMatt on freenode if something looks broken.

@jgarzik
Copy link
Contributor

jgarzik commented Jul 28, 2014

ACK

@jtimon
Copy link
Contributor

jtimon commented Jul 28, 2014

ACK.
It would be nice to add a "format" make target later.

@jtimon
Copy link
Contributor

jtimon commented Jul 28, 2014

By the way, I've tried to find an alternative to @kazcw's solution to implement_serialize on http://clang.llvm.org/docs/ClangFormatStyleOptions.html with no luck, but the curlies inside the parenthesis don't look bad in that case in my opinion.
Also, AllowShortBlocksOnASingleLine: true would be good in terms of changes, but in that case I think I would prefer not to think about what's more appropriate for each case, which is precisely what you're trying to avoid when automatizing the style.

@sipa
Copy link
Member Author

sipa commented Jul 28, 2014

I'm going to try to replace the IMPLEMENT_SERIALIZE macros with just (templated) class methods.

@laanwj laanwj merged commit 2887bff into bitcoin:master Aug 6, 2014
laanwj added a commit that referenced this pull request Aug 6, 2014
2887bff Update coding style and add .clang-format (Pieter Wuille)
@Diapolo
Copy link

Diapolo commented Aug 6, 2014

Thanks!

@kostaz
Copy link
Contributor

kostaz commented Sep 14, 2014

Hi guys,

I've proposed the coding style changes in the Fix coding style with uncrustify #4891.
I've used uncrustify tool, but the tool is not important, the coding style is. :-)
Anyway, the configuration file for the uncrustify tool is in the mentioned pull merge.

The rationale for most coding style changes are Stroustrup, Google coding style and common visual and readability sense.

By your generous permission, I'd like to discuss the coding style changes here and if they can be made automatic using git clang-format. The main purpose of the proposed coding style is to make the code more readable and, hence, bring more developers to the Bitcoin Core.

The changes are...


Always add curly braces.


Always add curly braces to for, if, while, do-while, try-catch, boost loops and any possible construct.

Example:

     if (pnId)
+    {
         *pnId = nId;
+    }

Example:

 std::string CUnsignedAlert::ToString() const
 {
     std::string strSetCancel;
-    BOOST_FOREACH(int n, setCancel)

+    BOOST_FOREACH (int n, setCancel)
+    {
         strSetCancel += strprintf("%d ", n);
+    }
+

Example:

                 if (fThread)
+                {
                     boost::thread t(runCommand, strCmd); // thread runs free
+                }
                 else
+                {
                     runCommand(strCmd);
+                }

Always put curly braces on the new line.


 static bool findSighashFlags(int& flags, const string& flagStr)
 {
     flags = 0;

-    for (unsigned int i = 0; i < N_SIGHASH_OPTS; i++) {
-        if (flagStr == sighashOptions[i].flagStr) {

+    for (unsigned int i = 0; i < N_SIGHASH_OPTS; i++)
+    {
+        if (flagStr == sighashOptions[i].flagStr)
+        {
             flags = sighashOptions[i].flags;

Surround curly-braced code with blank lines.


Example:

         boost::mutex::scoped_lock lock(mutex);
-        if(!size) return;
+
+        if (!size)
+        {
+            return;
+        }
+
         const size_t base_addr = reinterpret_cast<size_t>(p);
         const size_t start_page = base_addr & page_mask;

Example:

 int64_t CTransaction::GetValueOut() const
 {
     int64_t nValueOut = 0;
-    BOOST_FOREACH(const CTxOut& txout, vout)
+
+    BOOST_FOREACH (const CTxOut& txout, vout)
     {
         nValueOut += txout.nValue;
+
         if (!MoneyRange(txout.nValue) || !MoneyRange(nValueOut))
+        {
             throw std::runtime_error("CTransaction::GetValueOut() : value out of range");
+        }
     }
+
     return nValueOut;

Example:

         vout.size(),
         nLockTime);
+
     for (unsigned int i = 0; i < vin.size(); i++)
+    {
         str += "    " + vin[i].ToString() + "\n";
+    }
+
     for (unsigned int i = 0; i < vout.size(); i++)
+    {
         str += "    " + vout[i].ToString() + "\n";
+    }
+
     return str;
 }

No one-liners.


Example:

-    void SetNull() { hash = 0; n = (uint32_t) -1; }
-    bool IsNull() const { return (hash == 0 && n == (uint32_t) -1); }
+    void SetNull()
+    {
+        hash = 0;
+        n = (uint32_t) -1;
+    }
+
+    bool IsNull() const
+    {
+        return (hash == 0 && n == (uint32_t) -1);
+    }

Example:

-    CInPoint() { SetNull(); }
-    CInPoint(const CTransaction* ptxIn, uint32_t nIn) { ptx = ptxIn; n = nIn; }
-    void SetNull() { ptx = NULL; n = (uint32_t) -1; }
-    bool IsNull() const { return (ptx == NULL && n == (uint32_t) -1); }
+    CInPoint()
+    {
+        SetNull();
+    }
+
+    CInPoint(const CTransaction* ptxIn, uint32_t nIn)
+    {
+        ptx = ptxIn;
+        n = nIn;
+    }
+
+    void SetNull()
+    {
+        ptx = NULL;
+        n = (uint32_t) -1;
+    }
+
+    bool IsNull() const
+    {
+        return (ptx == NULL && n == (uint32_t) -1);
+    }

Maximum one blank line.


Example:

     }
 };

-
 class CBlock : public CBlockHeader
 {
 public:

Could you please help to produce the .clang-format file to test this coding style?

Thanks,
--- Kosta

@jtimon
Copy link
Contributor

jtimon commented Sep 14, 2014

To change the .clang-format file, you may want to read this:

http://clang.llvm.org/docs/ClangFormatStyleOptions.html

First of all, I wouldn't change anything on it until we have applied the current style to the whole project.
Second, you cannot apply a style to the whole project at one because that would break practically all open PRs.
To me the biggest benefit of automating the style is not having to think about style, you just follow the rules automatically.
Now I'll comment on the proposed changes:

-Always add curly braces.
-Always put curly braces on the new line.

I don't think these increase readability. Is python less readable than C for not having curly braces?
I don't think so.

-Surround curly-braced code with blank lines.
I don't like to be strict about this. Sometimes it make sense (semantically) to say, divide a function in 3 blocks, those blocks don't necessarily coincide with curly braces.

-No one-liners.
We discussed this and we decided that allowing them for now would cause less changes.

-Maximum one blank line.
I like this. If you want more separation you can add a comment. I'm not sure what's the clang option for this though. I think the current behavior is to turn N > 2 empty lines into 2 empty lines.

@sipa
Copy link
Member Author

sipa commented Sep 14, 2014

I disagree that coding style specifics matter much for readability. It's mostly convention - and we should mostly be aiming at consistency. People will always have different preferences about coding style.

If there's enough people who prefer one style to another, I guess we can change things. But the purpose here is mostly having one style that everyone agrees on, which means the things like indentation changes after refactors can be done automatically, with much easier review requirements (just redo the reformatting locally, and check that the result is the same).

For most of the choices in the style, the decisions are just based on minimizing changes to the existing code. Because actually reformatting things will take time and won't be trivial. Just doing all changes all at once will break every single pull request in a very painful way. While redoing formatting afterwards is trivial. So we'll likely want to delay reformatting changes to just before release (candidate) when big changes are in. Even then, smaller changes are better as not every PR gets in within one release cycle.

@kostaz
Copy link
Contributor

kostaz commented Sep 15, 2014

Sipa, it is pity that you think that the coding style is not related to the readability. After all, the code is there to read it, change and write a new one. The C++ source code matters only to humans who mostly read it! Hence, IMHO, more readable code brings more people. Hmmm... exactly like the book - if you can read book easily, you keep reading it. Otherwise, you put it off for a while / for eternity.

:-(

Regarding the existing .clang-format file, why not add the git pre-commit hook to run git clang-format?

@laanwj
Copy link
Member

laanwj commented Sep 15, 2014

@sipa mentioned a fun thing once: once the entire source code is formatted using clang-format, you could locally format your source after checking out however suits you best, then before checking in re-format it with the 'canonical' style file of the project.

Anyhow - different people have different opinions on what coding style is most readable. That's subjective. But it is a fact that a consistent coding style is more efficient. Any time saved not worrying about whether a certain { should be at the end or the beginning, or the else, and the eternal discussions, is worth it.

@sipa
Copy link
Member Author

sipa commented Sep 15, 2014

I fully agree that the source code is there for humans. Humans are however known to have different preferences.

@maflcko maflcko mentioned this pull request Feb 25, 2019
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants