ARROW-1629: [C++] Add miscellaneous DCHECKs and minor changes based on infer tool output#1151
ARROW-1629: [C++] Add miscellaneous DCHECKs and minor changes based on infer tool output#1151wesm wants to merge 6 commits intoapache:masterfrom
Conversation
…elper functions internal / non-exported Change-Id: I4b91ecce99dde4e87dcf70f026df1af17d2f067f
Change-Id: I2ef789a21cdcaf1573480841af1ac3c45e747133
Change-Id: I75c624b0ca1f0daf3940c4cd0c4ec5497a04285f
Change-Id: Ib30ff3d4784983342eca31405b8916dd4474be03
Change-Id: I4e01f340e12d9aa6d319aa77dc05b9bfe6bfa970
|
+1 the plasma related changes look good to me! I'm very glad it caught the memory leak. I'd like to do some modernizations on the Plasma code base, in particular getting rid of fling and using explicit MemoryMappedFile from https://github.com/apache/arrow/blob/84e5e02fbf412c979387b0a53b0ad0c6d5c5e790/cpp/src/arrow/io/file.h instead of shipping file descriptors between processes; this will make it harder to do the cleanups properly since we cannot rely on the reference counting from the OS any more to clean the files up after they are not used by anyone any more, but it will make it possible to have a native Java client and potentially other native bindings. I wonder what your thoughts on this tradeoff are. |
Change-Id: Icb0567eef9f2fac5284171c4371e3c3d22b91fc5
|
+1 |
…n infer tool output This was an interesting journey through some esoterica. I went through all the warnings/errors that infer (fbinfer.com) outputs and made changes if it seemed warranted. Some of the checks might be overkill. See https://gist.github.com/wesm/fc6809e4f4aaef3ecfeb21b8123627bc for a summary of actions on each warning Most of the errors that Infer wasn't happy about were already addressed by DCHECKs. This was useful to go through all these cases -- in nearly all cases the null references are impossible or would be the result of an error on behalf of the application programmer. For example: we do not do array boundschecking in most cases in production builds, but these boundschecks are included in debug builds to assist with catching bugs caused by improper use by application developers. As a matter of convention, we should not use error `Status` to do parameter validation or asserting pre-conditions that are the responsibility of the library user. If parameter validation is required in binding code (e.g. Python), then this validation should happen in the binding layer, not in the core C++ library. There are some other cases where we have a `std::shared_ptr<T>` out variable with code like: ``` RETURN_NOT_OK(Foo(..., &out)); out->Method(...); ``` Here, infer complains that `out` could contain a null pointer, but our contract with developers is that if `Foo` returns successfully that `out` is non-null. Interestingly, infer doesn't like some stack variables that are bound in C++11 lambda expressions. I noted these in the gist with `LAMBDA`. Author: Wes McKinney <[email protected]> Closes apache#1151 from wesm/fix-infer-issues and squashes the following commits: f285be9 [Wes McKinney] Restore code paths for empty chunked arrays for backwards compat 5aa86ce [Wes McKinney] More DCHECK esoterica / tweaks based on infer report 22c5d36 [Wes McKinney] Address a couple more infer warnings 75131a6 [Wes McKinney] Some more minor infer fixes 5ff3e3a [Wes McKinney] Compilation fix 05316ce [Wes McKinney] Fix miscellaneous things that infer does not like. Make some Python helper functions internal / non-exported
This was an interesting journey through some esoterica. I went through all the warnings/errors that infer (fbinfer.com) outputs and made changes if it seemed warranted. Some of the checks might be overkill.
See https://gist.github.com/wesm/fc6809e4f4aaef3ecfeb21b8123627bc for a summary of actions on each warning
Most of the errors that Infer wasn't happy about were already addressed by DCHECKs. This was useful to go through all these cases -- in nearly all cases the null references are impossible or would be the result of an error on behalf of the application programmer. For example: we do not do array boundschecking in most cases in production builds, but these boundschecks are included in debug builds to assist with catching bugs caused by improper use by application developers.
As a matter of convention, we should not use error
Statusto do parameter validation or asserting pre-conditions that are the responsibility of the library user. If parameter validation is required in binding code (e.g. Python), then this validation should happen in the binding layer, not in the core C++ library.There are some other cases where we have a
std::shared_ptr<T>out variable with code like:Here, infer complains that
outcould contain a null pointer, but our contract with developers is that ifFooreturns successfully thatoutis non-null.Interestingly, infer doesn't like some stack variables that are bound in C++11 lambda expressions. I noted these in the gist with
LAMBDA.