Skip to content

Conversation

@fzaiser
Copy link
Contributor

@fzaiser fzaiser commented Aug 16, 2022

Description of changes:

The following assertion currently fails in Kani: assert_eq!(Vec::<u8>::new(), Vec::<u8>::new()). The reason is that an empty vector allocates a dangling pointer, which is then passed to memcmp, but zero bytes will be compared. CBMC does not allow dangling pointers there even if they are not read (see #1489).

Therefore this PR adds a hook to intercept calls to memcmp and if the number of bytes to compare is zero it does not call memcmp but just checks that the pointers are not null. We should also check that they are aligned, but I'm not sure how to find the alignment at this stage of the translation.

Resolved issues:

Resolves #1489

Call-outs:

I'm not happy with this special casing. It would be good to have a more general solution for other C functions as well. An example is memcpy, but I wasn't able to produce a test case with safe Rust code. Suggestions welcome.

Testing:

  • How is this change tested? Added a regression test.

  • Is this a refactor change? No.

Checklist

  • Each commit message has a non-empty body, explaining why the change was made
  • Methods or procedures are documented
  • Regression or unit tests are included, or existing tests cover the modified code
  • My PR is restricted to a single feature or bugfix

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.

@fzaiser fzaiser requested a review from a team as a code owner August 16, 2022 16:03
Copy link
Contributor

@YoshikiTakashima YoshikiTakashima left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than the comments, LGTM

tcx.decl_temp_variable(second.typ().clone(), Some(second), Location::none());
let is_count_zero = count_var.clone().is_zero();
// TODO: we should also check that the pointers have correct alignment.
// At this point, we don't have the Rust types available anymore, however, so this is more difficult.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could use Instance to get the parameter types.

Copy link
Contributor Author

@fzaiser fzaiser Aug 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out that the pointers that memcmp accepts are *const u8, so their alignment is 1, so there is no need to check alignment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point! :)

let second = fargs.remove(0);
let count = fargs.remove(0);
let (count_var, count_decl) =
tcx.decl_temp_variable(count.typ().clone(), Some(count), Location::none());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use loc instead of Location::none()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, done.

impl<'tcx> GotocHook<'tcx> for MemCmp {
fn hook_applies(&self, tcx: TyCtxt<'tcx>, instance: Instance<'tcx>) -> bool {
let name = with_no_trimmed_paths!(tcx.def_path_str(instance.def_id()));
name.contains("memcmp")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if this check is too open...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you were right :D It made the tests fail. I fixed it to match the exact name (core::slice::cmp::memcmp).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We used to have hooks for a bunch of memory operations. I do remember they had to also take into consideration the top module being named std::. @danielsn might have better insight why that was the case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The modules get re-exported. So we often see it either as "std" or "core", depending on whether it was re-exported.

impl<'tcx> GotocHook<'tcx> for MemCmp {
fn hook_applies(&self, tcx: TyCtxt<'tcx>, instance: Instance<'tcx>) -> bool {
let name = with_no_trimmed_paths!(tcx.def_path_str(instance.def_id()));
name == "core::slice::cmp::memcmp"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the outdated discussion, can you also compare it against "std::slice::cmp::memcmp".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

@celinval celinval left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

@fzaiser fzaiser merged commit 04c5a74 into model-checking:main Aug 19, 2022
@RalfJung
Copy link

We should also check that the pointers are aligned, but I'm not sure how to find the alignment of a type at this stage.

memcmp works on u8, so the alignment requirement is trivial.

However this will accept some code that Miri rejects (like the 2nd example here), when the pointers have provenance that makes this a use-after-free or an out-of-bounds access. I don't know if and how Kani tracks provenance, so I am not sure if Kani can check these things.

@zhassan-aws
Copy link
Contributor

It might be possible to strengthen the check:

let is_first_ok = first_var.clone().is_nonnull();

to exclude pointers that were deallocated (and thus have provenance), e.g.

let is_first_ok = first_var.clone().is_nonnull() && first_var.clone().is_not_deallocated();

Perhaps, CBMC's memory primitives can be used for determining if the pointer was deallocated?

@RalfJung
Copy link

RalfJung commented Aug 21, 2022

For completeness' sake, here is another example Miri considers invalid, using an out-of-bounds pointer rather than a deallocated pointer:

fn make_oob() -> *const u8 { unsafe {
    let b = Box::into_raw(Box::new(0u8));
    // Leaking the box, so it remains allocated
    (&*b as *const u8).wrapping_offset(42)
} }

fn main() { unsafe {
    // out-of-bounds is UB
    memcmp(make_oob(), make_oob(), 0) ;
} }

@fzaiser
Copy link
Contributor Author

fzaiser commented Aug 23, 2022

memcmp works on u8, so the alignment requirement is trivial.

@RalfJung Yeah, I realized this too and changed the code, but forgot to update the PR description.

However this will accept some code that Miri rejects (like the 2nd example #1489 (comment)), when the pointers have provenance that makes this a use-after-free or an out-of-bounds access. I don't know if and how Kani tracks provenance, so I am not sure if Kani can check these things.

Thanks for the examples, @RalfJung! I checked the CBMC documentation: CBMC does track an object ID and offset for each valid pointer, which I believe would correspond to provenance. (I'm not sure what happens with pointer-integer casts though.) However, it does not consider dangling pointers to be valid, so it does not follow Rust's stance of there being an implicit zero-sized allocation.

So as I understand it, we would have to convince the CBMC people to change their semantics of validity of pointers to support this in Kani.

@fzaiser
Copy link
Contributor Author

fzaiser commented Aug 23, 2022

Since this particular issue is mitigated, but we should find a better solution, I opened a new issue for it: #1574

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Investigate failed check in memcmp

6 participants