lint: Check for use of NULL instead of nullptr #14205

practicalswift commented at 9:40 AM on September 12, 2018: contributor

Add linter: Check for use of NULL instead of nullptr.

As suggested by @HashUnlimited in #14203 :-)

Technical rationale:

nullptr cannot be confused with an int. nullptr also has a well-specified (very restrictive) type, and thus works in more scenarios where type deduction might do the wrong thing on NULL or 0.

Thus: using nullptr instead of NULL consistently will avoid a certain class of bugs.

fanquake added the label Tests on Sep 12, 2018

[wallet.cpp] replace NULL with nullptr

came with cebefba0855cee7fbcb9474b34e6779369e8e9ce

a linter would be nice to detect those

ab7a46c4cf

practicalswift force-pushed on Sep 12, 2018

MarcoFalke commented at 12:51 PM on September 12, 2018: member

Hmm, doing linting with overly complicated regexes seems fragile. Couldn't this be done by a compiler warning or by running the AST?

practicalswift commented at 1:59 PM on September 12, 2018: contributor

@MarcoFalke Perhaps this simple linter can be replaced by a more advanced AST based solution in the future should we encounter any problems? I don't oppose an AST based solution, but I'm afraid it might introduce significant running time to the linting job – don't you think?

FWIW, this linter has a running time of 16 milliseconds – git grep is blazingly fast! :-)

[^A-Za-z0-9_"]NULL[^A-Za-z_"] means:

Match any char which is not A-Z, a-z, 0-9, _ or ", then
Match NULL, then
Match any char which is not A-Z, a-z, 0-9, _ or "

Should I add that as a comment to make the regexp easier to understand?

practicalswift force-pushed on Sep 12, 2018

MarcoFalke commented at 3:25 PM on September 12, 2018: member

Note that several linters already take a couple of seconds and the travis instance takes sometimes up to one minute to boot. Building the AST shouldn't take much longer than that? Also, it could be built once and reused for all linters.

Note that with "fragile" I mean "works fine right now, but breaks after some refactoring and has to be fixed up with another confusing regex that works fine right then".

donaloconnor commented at 5:56 PM on September 12, 2018: contributor

Would this fail for a CPP comment? Eg: // Add string NULL character to end of buffer

practicalswift commented at 8:17 PM on September 12, 2018: contributor

@donaloconnor Nope, that is taken care of – see the grep -vE :-)

promag commented at 8:37 PM on September 12, 2018: member

We could remove all NULL occurrences:

src/qt/paymentserver.cpp:        // Don't log NULL certificates
src/torcontrol.cpp:        // Prefer NULL, otherwise SAFECOOKIE. If a password is provided, use HASHEDPASSWORD
src/torcontrol.cpp:            LogPrint(BCLog::TOR, "tor: Using NULL authentication\n");
src/wallet/wallet.cpp:        return NULL;

And then just disallow all NULL?

BTW, the following would not be detected: return NULL; // Using NULL :-).

practicalswift commented at 9:07 PM on September 12, 2018: contributor

@promag Removing the NULL occurences and then keeping the git grep as is but removing the grep -vE?

promag commented at 9:11 PM on September 12, 2018: member

Yup, with grep -vE we have like a blacklist, beside being fragile.

Anyway, I'm -0 on this matter, no strong opinion really.

DrahtBot commented at 12:25 AM on September 13, 2018: contributor

Note to reviewers: This pull request conflicts with the following ones:

#10973 (Refactor: separate wallet from node by ryanofsky)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

HashUnlimited commented at 5:50 AM on September 13, 2018: contributor

trying to find a "simple" regex I failed there:

tor control_tests.cpp:

        "AUTH METHODS=NULL",
        "AUTH", "METHODS=NULL");
        "METHODS=NULL", {
            {"METHODS", "NULL"},

lint: Check for use of NULL instead of nullptr 719937b626

practicalswift force-pushed on Sep 13, 2018

practicalswift commented at 6:29 AM on September 13, 2018: contributor

Updated version to address raised concerns.

I've now also added a comment clarifying the meaning of the POSIX regex [^abc]:

The POSIX regular expression [^abc] matches any character except a, b and c.

To make sure the linter is robust against false positives I've back tested this by stepping back iteratively in the git history – running git reset --hard HEAD~N repeatedly and confirming after each step back – all the way back to when we replaced NULL with nullptr. No false positives were found, so I don't think we'll have to worry about false positives in the wild.

Please re-review.

laanwj commented at 8:09 AM on September 13, 2018: member

Hmm, doing linting with overly complicated regexes seems fragile. Couldn't this be done by a compiler warning or by running the AST?

I agree, and, IMO, this is not worth it. using NULL instead of nullptr will not result in any potential bugs, which is the only valid reason to add linters IMO

laanwj commented at 8:12 AM on September 13, 2018: member

as i've said before, I'm not happy with piling up an increasingly large heap of shell scripts running fragile regexps over the source code

donaloconnor commented at 8:16 AM on September 13, 2018: contributor

I'd argue NULL could (in rare cases) cause incorrect overload functions to be called (int vs T*) but I don't think the extra lint step is necessary. These things should be caught in reviews anyway :-)

practicalswift commented at 8:28 AM on September 13, 2018: contributor

using NULL instead of nullptr will not result in any potential bugs, which is the only valid reason to add linters IMO

nullptr cannot be confused with an int. nullptr also has a well-specified (very restrictive) type, and thus works in more scenarios where type deduction might do the wrong thing on NULL or 0.

Thus: using nullptr instead of NULL consistently will avoid a certain class of bugs.

practicalswift commented at 9:04 AM on September 13, 2018: contributor

These things should be caught in reviews anyway :-)

While one certainly would hope so I wouldn't bet my money on expert human reviewing alone. We need both.

My automated analysis has uncovered numerous real bugs that were not caught in human review: everything from boring/mundane things like the use of uninitialized variables, reliance on undefined behaviour etc and all the way up to things like a buffer overflow in the networking code (reported responsibly of course).

Even the most severe of these bugs were introduced by some of the most highly skilled developers we have in the project and their code was reviewed by multiple equally skilled developers. I've seen it proven time after time after time: automated analysis finds issues not picked up by expert human reviewers.

Compared to other security critical C++ projects I think we vastly underuse the tools that are typically used to automatically catch bugs in new code before the code hits master (examples: no ASAN build in Travis, no static analyzer jobs in Travis, etc.). I hope to fix that :-)

Obviously automated analysis is no panacea, but we should use it as a complement to expert human reviewing. Remember: computers are quick, accurate and never bore – let's use that to our advantage! :-)

donaloconnor commented at 9:44 AM on September 13, 2018: contributor

While one certainly would hope so I wouldn't bet my money on expert human reviewing alone. We need both.

Indeed! That's why I said should catch these things. ;-)

My automated analysis has uncovered numerous real bugs that were not caught in human review: everything from boring/mundane things like the use of uninitialized variables, reliance on undefined behaviour etc and all the way up to things like buffer overflows in the networking code (reported responsibly of course).

I don't disagree! These class of bugs are common and dangerous. Static analysis is very important for this class of project. I am referring to specifially this PR - NULL -> nillptr. I would think it's one of the less important checks condering it's affects are rare. And even easier to catch in a review so chances are low. I'm no expert on this project but would it make more sense to run SA nightly? I need to get familiar with Travis and the build system

Compared to other security critical C++ projects I think we vastly underuse the available tools that are typically used to automatically catch bugs in newly introduced code before they hit master (examples: no ASAN build in Travis, no static analyzer jobs in Travis, etc.). I hope to fix that :-)

I agree with this :-)

practicalswift commented at 9:51 AM on September 13, 2018: contributor

@donaloconnor Agreed! My point was simply that human review didn't catch the mentioned severe issues so I wouldn't count on it to catch less severe issues like type deduction doing the wrong thing due to NULL being used instead of nullptr :-)

practicalswift commented at 7:52 AM on September 14, 2018: contributor

Added technical rationale in OP since it was questioned if using nullptr instead of NULL consistently could avoid any potential bugs.

scravy commented at 8:13 AM on September 14, 2018: contributor

I am wondering – there are quite a few shell/python linters checking different things. The one presented here in this PR seems oddly specific to me. Wouldn't a tool like checkcpp or clang-tidy do a much better (and broader) job?

I guess it could be configured like the shellcheck tool we're already using to lint shell script code style to not execute certain checks which would fail a zillion times in the code base (missing explicit in a constructor, static assignment might throw exception, etc.).

practicalswift commented at 8:19 AM on September 14, 2018: contributor

@scravy Absolutely! Based on previous discussions regarding review automation I've simply assumed that @MarcoFalke and/or @laanwj would object to having broader static analyzers like cppcheck, clang-tidy or Clang Static Analyzer run as part of Travis, but if I'm mistaken about that I'd be glad to put in the required work to add it :-) @MarcoFalke and @laanwj, can you clarify your (preliminary) positions so that I don't put in work on something that has near zero chance of getting merged? :-)

MarcoFalke commented at 12:38 PM on September 14, 2018: member

Personally I wouldn't object running a cpp checker, but we used to have problems where purely stylistic issues (such as shadowing to only name one example) caused massive disruption of progress on the project as a whole. So clearly we don't want to run cpp checkers for stylistic issues or on issues that have false positives. Other than that, I don't object running them to check for critical cpp programming issues.

laanwj commented at 12:53 PM on September 14, 2018: member

Fully agree with @MarcoFalke, also as I've said before: if it catches real errors or wacky dangerous code, it's good to have.

This also means that the static check shouldn't need to much accommodation or hand-holding: don't want to change the code to work around peculiarities of a specific tool.

If the gain is questionable, say for spelling errors or cosmetic issues (as I consider this one to be), by all means run it locally and file a PR, say, every month that fixes things, but don't hook it into travis.

scravy commented at 1:08 PM on September 14, 2018: contributor

cppcheck's goal is to actually improve security of software by first and foremost identifying undefined behavior.

To quote:

Cppcheck is a static analysis tool for C/C++ code. It provides unique code analysis to detect bugs and focuses on detecting undefined behaviour and dangerous coding constructs. The goal is to detect only real errors in the code (i.e. have very few false positives).

[...]

CVEs that was found using Cppcheck:

CVE-2017-1000249 : file : stack based buffer overflow<br>This was found by Thomas Jarosch using Cppcheck. The cause is a mistake in a condition.

CVE-2013-6462 : 23 year old stack overflow in X.org that was found with Cppcheck. CVE-2012-1147 : readfilemap.c in expat before 2.1.0 allows context-dependent attackers to cause a denial of service (file descriptor consumption) via a large number of crafted XML files..

These CVEs are shown when you google "cppcheck CVE". Feel free to compare the search results with other static analysis tools.

Security experts recommend that static analysis is used. And using several tools is the best approach from a security perspective.

Other tools like clang-tidy can cleary identify issues like NULL vs nullptr (which I do not consider just a stylistic issue due to type safety), but in a more straight forward way then a bunch of possibly fragile handcrafted scripts. They can be configured just like the existing shellscript checker to only check for certain conditions.

MarcoFalke commented at 5:07 PM on September 20, 2018: member

A conversation with 24 comments is too much noise for a simple stylistic change like this. These stylistic changes are better catched during review and too late to fix up after review due to the massive overhead involved in pull requests.

MarcoFalke closed this on Sep 20, 2018

jnewbery commented at 7:49 PM on September 20, 2018: member

can you clarify your (preliminary) positions so that I don't put in work on something that has near zero chance of getting merged? :-) @practicalswift - I think as a general rule, contributors should rate-limit their refactor PRs. There are currently ~550 open issues on this project. Contributor and reviewer time spent on getting that number down would be more valuable than yet more refactor PRs.

Code quality is obviously important. At the same time there's a general feeling that the continual code churn and PR traffic that results from all these refactor PRs is slowing down the project and deflecting reviewer/maintainer focus from more important issues.

MarcoFalke commented at 7:59 PM on September 20, 2018: member

@practicalswift I feel guilty myself for doing too much refactoring, but what helps me to keep the number down is to rate-limit myself by postponing every pull request by one day or week. If I still feel the improvement is worth it after that time, I submit it for review.

practicalswift deleted the branch on Apr 10, 2021

bitcoin locked this on Aug 18, 2022