-
Notifications
You must be signed in to change notification settings - Fork 134
Mitigate issue with dangling pointers and memcmp
#1526
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
YoshikiTakashima
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than the comments, LGTM
| tcx.decl_temp_variable(second.typ().clone(), Some(second), Location::none()); | ||
| let is_count_zero = count_var.clone().is_zero(); | ||
| // TODO: we should also check that the pointers have correct alignment. | ||
| // At this point, we don't have the Rust types available anymore, however, so this is more difficult. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could use Instance to get the parameter types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Turns out that the pointers that memcmp accepts are *const u8, so their alignment is 1, so there is no need to check alignment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point! :)
| let second = fargs.remove(0); | ||
| let count = fargs.remove(0); | ||
| let (count_var, count_decl) = | ||
| tcx.decl_temp_variable(count.typ().clone(), Some(count), Location::none()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use loc instead of Location::none()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, done.
| impl<'tcx> GotocHook<'tcx> for MemCmp { | ||
| fn hook_applies(&self, tcx: TyCtxt<'tcx>, instance: Instance<'tcx>) -> bool { | ||
| let name = with_no_trimmed_paths!(tcx.def_path_str(instance.def_id())); | ||
| name.contains("memcmp") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if this check is too open...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you were right :D It made the tests fail. I fixed it to match the exact name (core::slice::cmp::memcmp).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We used to have hooks for a bunch of memory operations. I do remember they had to also take into consideration the top module being named std::. @danielsn might have better insight why that was the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The modules get re-exported. So we often see it either as "std" or "core", depending on whether it was re-exported.
| impl<'tcx> GotocHook<'tcx> for MemCmp { | ||
| fn hook_applies(&self, tcx: TyCtxt<'tcx>, instance: Instance<'tcx>) -> bool { | ||
| let name = with_no_trimmed_paths!(tcx.def_path_str(instance.def_id())); | ||
| name == "core::slice::cmp::memcmp" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the outdated discussion, can you also compare it against "std::slice::cmp::memcmp".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
celinval
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
memcmp works on u8, so the alignment requirement is trivial. However this will accept some code that Miri rejects (like the 2nd example here), when the pointers have provenance that makes this a use-after-free or an out-of-bounds access. I don't know if and how Kani tracks provenance, so I am not sure if Kani can check these things. |
|
It might be possible to strengthen the check:
to exclude pointers that were deallocated (and thus have provenance), e.g. let is_first_ok = first_var.clone().is_nonnull() && first_var.clone().is_not_deallocated();Perhaps, CBMC's memory primitives can be used for determining if the pointer was deallocated? |
|
For completeness' sake, here is another example Miri considers invalid, using an out-of-bounds pointer rather than a deallocated pointer: fn make_oob() -> *const u8 { unsafe {
let b = Box::into_raw(Box::new(0u8));
// Leaking the box, so it remains allocated
(&*b as *const u8).wrapping_offset(42)
} }
fn main() { unsafe {
// out-of-bounds is UB
memcmp(make_oob(), make_oob(), 0) ;
} } |
@RalfJung Yeah, I realized this too and changed the code, but forgot to update the PR description.
Thanks for the examples, @RalfJung! I checked the CBMC documentation: CBMC does track an object ID and offset for each valid pointer, which I believe would correspond to provenance. (I'm not sure what happens with pointer-integer casts though.) However, it does not consider dangling pointers to be valid, so it does not follow Rust's stance of there being an implicit zero-sized allocation. So as I understand it, we would have to convince the CBMC people to change their semantics of validity of pointers to support this in Kani. |
|
Since this particular issue is mitigated, but we should find a better solution, I opened a new issue for it: #1574 |
Description of changes:
The following assertion currently fails in Kani:
assert_eq!(Vec::<u8>::new(), Vec::<u8>::new()). The reason is that an empty vector allocates a dangling pointer, which is then passed tomemcmp, but zero bytes will be compared. CBMC does not allow dangling pointers there even if they are not read (see #1489).Therefore this PR adds a hook to intercept calls to
memcmpand if the number of bytes to compare is zero it does not callmemcmpbut just checks that the pointers are not null. We should also check that they are aligned, but I'm not sure how to find the alignment at this stage of the translation.Resolved issues:
Resolves #1489
Call-outs:
I'm not happy with this special casing. It would be good to have a more general solution for other C functions as well. An example is
memcpy, but I wasn't able to produce a test case with safe Rust code. Suggestions welcome.Testing:
How is this change tested? Added a regression test.
Is this a refactor change? No.
Checklist
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.