-
Notifications
You must be signed in to change notification settings - Fork 134
Add option to restrict vtable function pointers #536
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add option to restrict vtable function pointers #536
Conversation
7f075ae to
400eb8b
Compare
celinval
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to make this change less CBMC dependent? Split between collecting potential candidates and doing the CBMC translation after that.
|
@celinval Agreed on the goal. I tried to keep the totally CBMC-specific logic to within |
|
@avanhatt I was thinking about whether we should split this file: compiler/rustc_codegen_rmc/src/context/vtable_ctx.rs |
500fb23 to
9c09cd8
Compare
9c09cd8 to
4b33922
Compare
fe4fb66 to
4c60685
Compare
ad92aca to
06d823d
Compare
|
@celinval many rebases and switching to serde later, this is now ready for re-review! The CBMC specifics are now in an entirely different crate. |
|
|
||
| /// Add a given call site for a virtual function, incremementing the call | ||
| /// site index. | ||
| fn add_call_site(&mut self, trait_name: String, method: usize, function_location: String) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a lot of things of type "String"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the tostring needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not anymore at least
library/rmc_restrictions/src/lib.rs
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interned string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
library/rmc_restrictions/src/lib.rs
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the format of this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's just the name now, renamed the variable to make clear
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the right place to put the clone?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we talked about this in a chime sync, this is just the set of assumptions on the type, so returning None is no less safe than unimplemented! but let's us not crash.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand. Can you please add a comment to the code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you intern the string these clones become unnecessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we using a serde_json Value instead of the actual struct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IS this the best name for this flag?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you like emit_vtable_restrictions better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment makes me expect a list of files, not the data from inside them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd expect linking and serializing to be independent steps?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's a single call to serde_json::to_value(&output).unwrap() so I don't feel strongly about that, but happy to move if you prefer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It just feels weird given the function name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this guaranteed to be stable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, but there is no safe/stable way to mangle the vtable AFAIK, and the test would fail if it were to change.
scripts/cargo-rmc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.restrictions.json
library/rmc_restrictions/src/lib.rs
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
InternedString
scripts/rmc.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the new whitespace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We check python formatting with autopep8 on changed files, it inserted the whitepace
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah didn't realize quite how copious it was, might need to move some of these to doc comments to make it happy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a strong reason to even make this a flag? Can we emit these files unconditionally?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main reasons are that currently emitting restrictions currently (1) requires wrapper functions at ever vtable call, because CBMC uses fragile naming conventions for specifying the function call site, and (2) the restrictions could be incomplete if the Rust code is doing certain types of casts that we don't handle, meaning the restriction set would be too small.
The 3rd even more transient reason is that the CBMC PR has not landed, so restrictions fail on our minimum required CBMC version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we get that broken into an issue, with e.g. links to CBMC issues or PRs we depend on?
This change is awesome, and I'd like to make sure we have a well-lit path to making it the always-on default. e.g. do we know what the "certain types of casts that we don't handle" are, and can we document that?
(This is obviously not a blocker to merging, I just don't want it accidentally dropped or anything!)
ed04501 to
19bb530
Compare
danielsn
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved with comments
| } | ||
| } | ||
|
|
||
| struct InternedStringVisitor; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should really go with InternedString itself.
| version = "0.1.0" | ||
| dependencies = [ | ||
| "rmc_restrictions", | ||
| "rustc_data_structures", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are we using from here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The compiler's FxHashMap, not strictly necessary/could be replaced with a vanilla HashMap
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If easy, could do that here; if not, lets just do that as a cleanup PR
| ) { | ||
| assert!(self.emit_vtable_restrictions); | ||
| let site = CallSite { | ||
| trait_method: TraitDefinedMethod { trait_name: trait_name, vtable_idx: method }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't need to say trait_name: trait_name, trait_name would do
| ) | ||
| .into(); | ||
|
|
||
| // We only have the Gotoc type, we need to normalize to match the MIR type. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any risk here if MIR changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Less so now that we have the type_map, we are normalizing the MIR on both sides instead of relying on string editing.
| ) | ||
| .as_stmt(Location::none()); | ||
|
|
||
| // Declare symbol for the single, self parameter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not for this PR, but this feels like something that would happen often enough to be worth a helper function
| let key = TraitDefinedMethod { trait_name, vtable_idx: method }; | ||
|
|
||
| if let Some(possibilities) = self.possible_methods.get_mut(&key) { | ||
| possibilities.push(imp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any reason we might want to keep this list ordered?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not currently.
| } | ||
|
|
||
| /// Add a given call site for a virtual function, incremementing the call | ||
| /// site index. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does it increment the index??
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment was stale, removed.
celinval
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't finished reviewing this but so far I haven't seen anything critical. Once @danielsn approves this, I'm ok with merging and making the changes as a follow up PR.
Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand. Can you please add a comment to the code?
| if let Some(fn_symbol) = self.symbol_table.lookup(&fn_name) { | ||
| if self.vtable_ctx.emit_vtable_restrictions { | ||
| // Add to the possible method names for this trait type | ||
| self.vtable_ctx.add_possible_method( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would move the if statement to be inside the add_possible_method() implementation. This will make the code cleaner and less error prone.
| .codegen_unimplemented( | ||
| format!("drop_in_place for {}", drop_sym_name).as_str(), | ||
| Type::empty(), | ||
| let drop_sym = self.ensure(&drop_sym_name, |ctx, name| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the drop case. Would it make more sense for us to replace the function call with the unimplemented assertion instead of generating the function body?
|
|
||
| struct InternedStringVisitor; | ||
|
|
||
| impl<'de> serde::Deserialize<'de> for InternedString { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please add a unit test for this?
|
Will address followups in #671, merging to unblock Ted's linked PR. |
Add vtable restriction option, need CBMC PR to use
Add vtable restriction option, need CBMC PR to use
Description of changes:
RMC can currently take a long time to verify code with lots of virtual function pointer calls (or, never terminate). This is in part because CBMC uses a very general heuristic to reason about function pointers: it assumes any function of the right type signature could be a target. CBMC does allow you to restrict possible function targets via a command line option and auxiliary json file: http://cprover.diffblue.com/md__home_travis_build_diffblue_cbmc_doc_architectural_restrict-function-pointer.html
Note, using this functionality requires this branch of CBMC, diffblue/cbmc#6376, tracking issue to update RMC once that merges: #671
For virtual function pointer through a vtable, we can often do much better than CBMC's default heuristic. In particular, we can virtually guarantee (unless the user transmutes the vtable memory) that a call through a vtable will be one of the trait-defined method for that trait type that we have previously seen in codegen. Because CBMC asserts(false) if the function pointer is not actually within the restricted set, this optimization is sound.
This is especially helpful for
drop_in_placedestructor calls through a vtable. Many functions have types that matchdropsince it takes a single self argument and returns unit. In a Firecracker example, we have seen 300+ function targets identified by the default CBMC heuristic for a single virtual drop call, which otherwise renders CBMC verification basically intractable (think: a massive, 300+ armifblock after CBMC performs function pointer elimination).This PR add a
VtableCtxobject that build a map from virtual call sites to possible trait-defined method implementations, then writes out the restrictions as a JSON file.You can invoke this on single Rust files by passing a
--restrict-vtablesflag tormc, or on full crates with dependencies by adding the following to your harness script (which should eventually be handled bycargo rmc:Example:
Say we have two traits that both implement a
noisemethod,AnimalandOther. Without the vtable restriction flag, any call tonoiseand any implicitdropon a dynamic trait object would unroll to every implementation of either trait:Now, we see each call site only considers the implementations that match that trait type:
We also get the benefit of CBMC inserting an
assert(0)at the end of the block instead of just anassume(0).Resolved issues:
Resolves #521
Call-outs:
Testing:
Checklist
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.