Skip to content

add util/check-format.pl for checking adherence to C coding style#10363

Closed
DDvO wants to merge 115 commits intoopenssl:masterfrom
siemens:openssl-format-check
Closed

add util/check-format.pl for checking adherence to C coding style#10363
DDvO wants to merge 115 commits intoopenssl:masterfrom
siemens:openssl-format-check

Conversation

@DDvO
Copy link
Contributor

@DDvO DDvO commented Nov 5, 2019

When contributing PRs to OpenSSL it typically happens that the code does not (fully) adhere to the coding style guidelines at https://www.openssl.org/policies/codingstyle.html and to further implicit conventions that are not (yet) part of the rules stated there.

In particular when we contributed larger chunks of code, reviewers typically found many formatting issues that are usually trivial to resolve but tend to cause extra overhead and distract from more important issues.

There is already a tool that is supposed to automatically (re-)format code: util/openssl-format-source, yet this appears rather outdated and not really practical to use.
So I spent some effort on a new Perl tool that aims at checking adherence at least to most of the OpenSSL formatting rules we are aware of.

This new tool is in preliminary state: it is incomplete and yields some false positives.
Still it has already proved pretty useful for detecting most typical formatting glitches.

@richsalz
Copy link
Contributor

richsalz commented Nov 5, 2019

Cool! I think part of this PR should be to remove the old script.

@mattcaswell
Copy link
Member

Nice!

@levitte
Copy link
Member

levitte commented Nov 5, 2019

I agree with @richsalz

@DDvO
Copy link
Contributor Author

DDvO commented Nov 5, 2019

Pleased to hear!
I invite you to try it out and potentially give more detailed feedback.

@levitte
Copy link
Member

levitte commented Nov 5, 2019

I tried it on crypto/store/*.[ch], and got a few interesting nits:


crypto/store/loader_file.c:225:indent=16!={3,20}:                 || PKCS12_verify_mac(p12, NULL, 0)) {

Here's the line with a bit of context:

if (PKCS12_verify_mac(p12, "", 0)
|| PKCS12_verify_mac(p12, NULL, 0)) {
pass = "";

So it seems that this scripts want to make the extra indent "mandatory", while this isn't policy (and is still subject for debate). We allow that extra indentation, so this script should currently accept both styles, i.e. both this:

            if (PKCS12_verify_mac(p12, "", 0)
                || PKCS12_verify_mac(p12, NULL, 0)) {
                pass = "";

and this:

            if (PKCS12_verify_mac(p12, "", 0)
                    || PKCS12_verify_mac(p12, NULL, 0)) {
                pass = "";

crypto/store/loader_file.c:281:indent=5!=1:      p12_end:

We have this kind of hanging indent of labels all over the place, i.e. a hang back of 3 columns, not 4. This came to be with our util/openssl-format-source with the auxilliary file util/indent.pro, where you find this:

The indent manual explains it like this:

-iln, --indent-labeln
Set offset for labels to column n.
See INDENTATION.

Why this was a good idea at the time, I cannot remember.


crypto/store/loader_file.c:1109:indent=8!=4:         : PEM_read_bio(bp, pem_name, pem_header, data, len);

So this is a complete mystery to me. This is the line with a bit of context:

int i = secure
? PEM_read_bio_ex(bp, pem_name, pem_header, data, len,
PEM_FLAG_SECURE | PEM_FLAG_EAY_COMPATIBLE)
: PEM_read_bio(bp, pem_name, pem_header, data, len);


crypto/store/loader_file.c:1419:indent=8!={4,4}:         "file",
crypto/store/loader_file.c:1420:indent=8!={4,4}:         NULL,
crypto/store/loader_file.c:1421:indent=8!={4,4}:         file_open,
crypto/store/loader_file.c:1422:indent=8!={4,4}:         file_ctrl,
crypto/store/loader_file.c:1423:indent=8!={4,4}:         file_expect,
crypto/store/loader_file.c:1424:indent=8!={4,4}:         file_find,
crypto/store/loader_file.c:1425:indent=8!={4,4}:         file_load,
crypto/store/loader_file.c:1426:indent=8!={4,4}:         file_eof,
crypto/store/loader_file.c:1427:indent=8!={4,4}:         file_error,
crypto/store/loader_file.c:1428:indent=8!={4,4}:         file_close

This seems to be some kind of fault in how indentation is calculated. Here are those lines plus a little extra:

static OSSL_STORE_LOADER file_loader =
{
"file",
NULL,
file_open,
file_ctrl,
file_expect,
file_find,
file_load,
file_eof,
file_error,
file_close
};

So er, the actual complaint would really be that the braces are on a line of their own... right?

@DDvO
Copy link
Contributor Author

DDvO commented Nov 6, 2019

I tried it on crypto/store/*.[ch], and got a few interesting nits:

Thanks for trying it out and your detailed feedback!
Meanwhile I've handled all issues your reported: fix false positives and allow for both possibilities where the specification is unclear.

So it seems that this scripts want to make the extra indent "mandatory", while this isn't policy (and is still subject for debate). We allow that extra indentation, so this script should currently accept both styles

Done.

We have this kind of hanging indent of labels all over the place, i.e. a hang back of 3 columns, not 4. This came to be with our util/openssl-format-source with the auxilliary file util/indent.pro,

I've relaxed the check for label indents to allow them both at column 1 and (current_indent-3).

crypto/store/loader_file.c:1109:indent=8!=4:         : PEM_read_bio(bp, pem_name, pem_header, data, len);

So this is a complete mystery to me.

This was a false positive and went away with my today's improvements.
I know that handling of conditional expressions ( ? : ) is still to be improved.

This seems to be some kind of fault in how indentation is calculated. Here are those lines plus a little extra:

static OSSL_STORE_LOADER file_loader =
{
"file",
NULL,
file_open,
file_ctrl,
file_expect,
file_find,
file_load,
file_eof,
file_error,
file_close
};

So er, the actual complaint would really be that the braces are on a line of their own... right?

This was due to incomplete detection of complex multi-line assignments involving braces; I've strongly improved this (also for the rare case of constant value decls within enum).

@richsalz
Copy link
Contributor

richsalz commented Nov 6, 2019

This work is amazing.

@beldmit
Copy link
Member

beldmit commented Nov 6, 2019

Great job!

@DDvO
Copy link
Contributor Author

DDvO commented Nov 6, 2019

Thanks!
I was quite a piece of work to get it that far.

@DDvO
Copy link
Contributor Author

DDvO commented Nov 6, 2019

I've just fixed two small further issues,
reducing false positives to zero also for crypto/{cmp,crmf}/*.[ch]

@DDvO
Copy link
Contributor Author

DDvO commented Nov 7, 2019

I've improved the handling of comments and string literals
and added TODOs on multi-line string literals.

@mspncp
Copy link
Contributor

mspncp commented Nov 7, 2019

Note: looking at the util directory, many of the scripts are named using imperative style ("find doc nits", "merge err lines", "format source", "find unused errs", etc.) and not a (typical german?) nominal style as in "openssl format check". Also most of those scripts don't have a .pl extension.

How about just calling it openssl-check-format?

@mspncp
Copy link
Contributor

mspncp commented Nov 7, 2019

Or openssl-check-coding-style?

@DDvO
Copy link
Contributor Author

DDvO commented Nov 7, 2019

I like openssl-check-format,
while "check coding style" would be too broad, at least for the current state of the tool.

@DDvO
Copy link
Contributor Author

DDvO commented Nov 7, 2019

Should I remove the .pl suffix?

@DDvO
Copy link
Contributor Author

DDvO commented Nov 7, 2019

Is there sufficient agreement to remove the old script (openssl-format-source)?

@DDvO
Copy link
Contributor Author

DDvO commented Nov 7, 2019

I've meanwhile resolved the TODOs on handling multi-line string literals.

@mspncp
Copy link
Contributor

mspncp commented Nov 7, 2019

Should I remove the .pl suffix?

It‘s not a request, only a suggestion based on recent habits. Unless the script works and is intended to be used on Windows too, I see no reason to keep it.

@richsalz
Copy link
Contributor

richsalz commented Nov 7, 2019

I think the "openssl-" prefix wastes needless space. After all, this is in the openssl source distribution. :) I like "check-formatting" or something like that. Yes, remove the .pl extension. As for removing the old file, an OMC member said yes; #10363 (comment)

@DDvO
Copy link
Contributor Author

DDvO commented Nov 7, 2019

check-format is even shorter than check-formatting :)
Since the script should work on Windows as well (as long as Perl is installed), I suggest keeping the .pl suffix.

So my proposal is to go for check-format.pl.

@richsalz
Copy link
Contributor

richsalz commented Nov 7, 2019

One last attempt on the .pl extension: Most development is on unix-like platforms; those windows developers can put "perl" in front of the command-line. :)

@mspncp
Copy link
Contributor

mspncp commented Nov 7, 2019

So my proposal is to go for check-format.pl.

O.k. It does have the advantage that the script can be invoked on Windows without prefixing it with perl.

@DDvO
Copy link
Contributor Author

DDvO commented Nov 7, 2019

I'm also quite in favor of not putting an extra burden on those poor guys (sometimes including myself) developing in Windows ;)

For Unix use, we have the first line referring to Perl, and on Windows the name extension usually does the trick.
Moreover, there are various Perl scripts in util/ that do have the .pl ending as well.

@richsalz
Copy link
Contributor

richsalz commented Nov 7, 2019

:)

@DDvO DDvO changed the title add util/openssl-format-check.pl add util/check-format.pl for checking adherence to C coding style Nov 7, 2019
@mspncp
Copy link
Contributor

mspncp commented Nov 8, 2019

cool. Adding a unit test for a utility script seems to be unprecedented.

@DDvO
Copy link
Contributor Author

DDvO commented Nov 8, 2019

Yeah, given the non-trivial nature of the thing,
I felt the need for tests assuring that there are only known false negatives. I'll add test cases for more interesting indentation cases in complex expressions etc. soon.
On the other hand, false positives are rather obvious when running the tool on actual source files, so I do not plan to add test cases for these.

BTW, meanwhile I've further reduced both false positives and false negatives; some more of this is to come.

@DDvO
Copy link
Contributor Author

DDvO commented Jan 29, 2020

@levitte, should I add a report for one-letter names other than i and j to the tool?

@DDvO
Copy link
Contributor Author

DDvO commented Jan 29, 2020

BTW, this PR is still waiting for a formal review.

@levitte
Copy link
Member

levitte commented Jan 29, 2020

I only looked at this one to understand what you were on to in #10363 (comment)

apps/crl.c:287:no '{' on same line after '} else':                } else

@levitte
Copy link
Member

levitte commented Jan 29, 2020

@levitte, should I add a report for one-letter names other than i and j to the tool?

No, I think that's a bit over the top. There's more over the place, such as the local pointers p and q, and there are places where you see a and b (typical for cmp functions), or k for a third index, etc. we will drive ourselves crazy... some things can be left to the humans to judge.

@DDvO
Copy link
Contributor Author

DDvO commented Jan 29, 2020

We could automatically check for names that definitely should not be used, as mentioned by @bernd-edlinger: #10725 (comment), namely
l (lower case el) and I (upper case i) and O (upper case o).

@levitte
Copy link
Member

levitte commented Jan 29, 2020

That, I think is a better idea. Advice against specifics instead of disallowing all but just a few. Yes.

@DDvO
Copy link
Contributor Author

DDvO commented Jan 29, 2020

Done. The tool now reports 5 one-letter variables 'l' in apps/*.c

@kaduk
Copy link
Contributor

kaduk commented Jan 29, 2020

There's more over the place, such as the local pointers p and q, and there are places where you see a and b (typical for cmp functions), or k for a third index, etc.

Don't forget s for SSL objects!

@DDvO
Copy link
Contributor Author

DDvO commented Jan 29, 2020

Don't forget s for SSL objects!

There is also x for X509 objects, sigh.

@levitte
Copy link
Member

levitte commented Jan 29, 2020

See "etc" 😉

@DDvO
Copy link
Contributor Author

DDvO commented Jan 29, 2020

I don't like those one-char variable names at all:
searching for all occurrences of such a variable can become a nightmare of false positives.

@richsalz
Copy link
Contributor

So maybe don't worry about one-character variables for now. The phrase "diminishing returns" comes to mind.

@levitte
Copy link
Member

levitte commented Jan 29, 2020

I don't like those one-char variable names at all:

Noted. foo, bar, baz it is, then 😉

@kaduk
Copy link
Contributor

kaduk commented Jan 29, 2020

I don't like those one-char variable names at all:

Noted. foo, bar, baz it is, then wink

Or qux, quux, quuux. "Now which was quux and which one was quuux again?"

@DDvO
Copy link
Contributor Author

DDvO commented Jan 30, 2020

Sorry to disappoint your creativity 😉
We already have on the list in #10725:

Use brief but descriptive names for variables, types, functions, etc.

(Yet surely this is something a tool cannot check.)

Copy link
Member

@levitte levitte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On pure principle, we should have this script. It's possible that it needs another tweak or two, I don't know and haven't really checked thoroughly, but...

@levitte levitte added approval: done This pull request has the required number of approvals branch: master Applies to master branch labels Mar 7, 2020
@openssl-machine openssl-machine removed the approval: done This pull request has the required number of approvals label Mar 8, 2020
@openssl-machine
Copy link
Collaborator

This pull request is ready to merge

@openssl-machine openssl-machine added the approval: ready to merge The 24 hour grace period has passed, ready to merge label Mar 8, 2020
openssl-machine pushed a commit that referenced this pull request Mar 9, 2020
aims at checking most of https://www.openssl.org/policies/codingstyle.html
and various requirements not yet explicitly stated there - see also #10725

add util/check-format.pl and its self-tests in
util/check-format-test-{positives,negatives}.c
remove util/openssl-format-source

Reviewed-by: Richard Levitte <[email protected]>
Reviewed-by: David von Oheimb <[email protected]>
(Merged from #10363)
@DDvO
Copy link
Contributor Author

DDvO commented Mar 9, 2020

Pushed (using ghmerge --nobuild --squash 10363 levitte @DDvO).

Thanks for all feedback by @levitte and many others.

@DDvO DDvO closed this Mar 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approval: ready to merge The 24 hour grace period has passed, ready to merge branch: master Applies to master branch

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants