Skip to content

Conversation

@giltho
Copy link
Contributor

@giltho giltho commented Aug 6, 2022

Description of changes:

Right now, because of CBMC, Kani produces a x = nondet() for a Deinit(x).
Progressing on #920, I'm making this small PR to mark these statements using a comment.

Call-outs:

I don't think this is the optimal way of doing it, so I'm happy with any suggestion.
Right now, I added a field in one of the Location:: kinds, the one that corresponds to my use case, and then I set it when required, but the way it's done isn't very systematic

Testing:

I'm not sure how I should be testing this inside Kani.
I've parsed these with Kanillian and successfully compiled them to the right poison statement instead.

Checklist

  • Each commit message has a non-empty body, explaining why the change was made
  • Methods or procedures are documented
  • Regression or unit tests are included, or existing tests cover the modified code
  • My PR is restricted to a single feature or bugfix

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.

@giltho giltho requested a review from a team as a code owner August 6, 2022 01:52
@danielsn
Copy link
Contributor

danielsn commented Aug 6, 2022

Why not create a poison expression, that translates to nondet plus comment when you lower to irep?

@giltho
Copy link
Contributor Author

giltho commented Aug 6, 2022

I'm ok with that, but it means actively taking the decision to get stmt.rs and cbmc out of sync and so I went with a softer solution first
But happy to update!

@giltho
Copy link
Contributor Author

giltho commented Aug 7, 2022

Replacing the Nondet expression with Deinit does raise one question: what happens with the NondetTransformer ?

  1. I don't understand if and when it's called. If it was, I would expect no Nondet expressions to ever be sent to me by Kani, but it's not the case
  2. If I add a poison value but want to maintain the invariant that this tranformer currently has, I need to also change every call to poison to a call to some poison function that would have the same behaviour. I will implement that function, but do note that it probably means creating 2 nondet functions for each type that both needs Poison and Nondet

@giltho
Copy link
Contributor Author

giltho commented Aug 8, 2022

Ok fixed the various comments from @danielsn

That being said, after rereading my PR with a fresh post-weekend mind, I'm a bit unhappy with a Poison expression.

The argument against Poison as an expression

My opinion is that Poison should not be an expression, because it is not a valid value. Values are mathematical entities that result from evaluating an expression. A good definition is given by the UCG WG.

Poison is precisely not a value because it cannot result from an expression evaluation, and it cannot be read from memory. It can only exist in memory.
That's why MIR has a deinit statement and not an uninit value that is assigned to a place.

Another good description is given by Robbert Krebbers in the description of the CH2O memory model. Poison bits are part of a value representation in memory, but not part of the value itself.

The argument in favor of Poison as an expression

CBMC has this surprising behaviour of 1) explicitely having paddings as part of the type definitions and 2) explicitely writing nondet in the padding as part of Struct expressions. This conflicts with what I said above for C, and in that sense, CBMC is not C. I'm not sure of the theoretical and practical ramifications of being allowed to produce poison expressions, since the whole point of poison is that it cannot be read in the first place, but it seems that, to work properly in CBMC, you'd need a poison expression. That probably means that you need to add, every read, a check that the produce value isn't poison or something similar

@giltho
Copy link
Contributor Author

giltho commented Aug 8, 2022

Note that this behaviour of expliciting the value of padding might create discrepancies between C and GotoC.
Take the following example

struct S {
   u8 x;
   [24 bit padding]
   u32 y;
}
struct S s = { 0; Poison; 4 }
u8* y = ((u8*) s) + 1;
*y = 2;
y++;
*y = 2;
y++
*y = 3;

Variable s now contains the equivalent of the u8 array [0, 1, 2, 3, 4].

However, if you then write

struct S t = s

then actually, you shouldn't be allowed to read the padding bits and you'll only be writing poison in there. That means, t should be the equivalent of the "u8" array [0, P, P, P, 4]

The natural semantics for a language that explicits the value of padding would be to simply copy the value of s in t and therefore not have the poison in the right place.

},
/// `__nondet()`
Nondet,
/// Poison corresponds to a region of memory that
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Poison is more than uninitialized, right? Or are we narrowing down the concept for now?

I do wonder if we should state here that CBMC doesn't currently support poison and this gets translated to nondet.

It might be helpful to add a link to diffblue/cbmc#7014.

@celinval
Copy link
Contributor

celinval commented Aug 9, 2022

Ok fixed the various comments from @danielsn

That being said, after rereading my PR with a fresh post-weekend mind, I'm a bit unhappy with a Poison expression.

The argument against Poison as an expression

My opinion is that Poison should not be an expression, because it is not a valid value. Values are mathematical entities that result from evaluating an expression. A good definition is given by the UCG WG.

Poison is precisely not a value because it cannot result from an expression evaluation, and it cannot be read from memory. It can only exist in memory. That's why MIR has a deinit statement and not an uninit value that is assigned to a place.

Another good description is given by Robbert Krebbers in the description of the CH2O memory model. Poison bits are part of a value representation in memory, but not part of the value itself.

The argument in favor of Poison as an expression

CBMC has this surprising behaviour of 1) explicitely having paddings as part of the type definitions and 2) explicitely writing nondet in the padding as part of Struct expressions. This conflicts with what I said above for C, and in that sense, CBMC is not C. I'm not sure of the theoretical and practical ramifications of being allowed to produce poison expressions, since the whole point of poison is that it cannot be read in the first place, but it seems that, to work properly in CBMC, you'd need a poison expression. That probably means that you need to add, every read, a check that the produce value isn't poison or something similar

I totally agree with your point of view. What if you change poison to be a statement instead of an expression? Poison(place) would take an address as its only argument.

This matches the DeInit API as well as the initial API suggested here: diffblue/cbmc#7014.

@giltho giltho changed the title Add a comment on assignments resulting from Deinit statements Add Poison to the Kani-GotoC AST Aug 10, 2022
@giltho giltho self-assigned this Aug 11, 2022
@giltho
Copy link
Contributor Author

giltho commented Aug 17, 2022

Updated to a deinit statement.
This is not ready to be merged for two reasons:

  • First I need to sync my Kanillian branch with it to make sure it works as expected
  • I need to understand when the NondetTransformer is used and why. Right now, even when the NondetTransformer is used, the final code will contain Nondet expressions because the lowering is done at the very last moment. Is it ok? Is it not?

}

pub fn deinit(self, loc: Location) -> Stmt {
Stmt::deinit(self, loc)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this require that expr be an lvalue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but I don't know how to check for that.
Assignment checks for type equality but that's it, even though the left-hand side has to be an lvalue
image

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be nice to add an "is l-value" check to cprover bindings, but obviously not in this pr...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Annoyingly, enabling it on assign causes the stdlib test to crash. Investigating.

@giltho
Copy link
Contributor Author

giltho commented Aug 17, 2022

I have one broken test in the regression test suite, I'll try to figure out what's wrong, but I'm not sure how to approach that
That was a local nonsense

Copy link
Contributor

@tedinski tedinski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I basically asked for some comment updates, and then it lgtm...

Stmt::assign(self, rhs, loc)
}

pub fn deinit(self, loc: Location) -> Stmt {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We definitely want a comment here, especially as "deinit" is not a c notion really, and this is "cprover_bindings"

That's kind of why I thought this should be called "poison" but I don't remember if that was decided against for some reason?

Also, I don't know if this is me reading it weird, but I have the comment note that this is a shorthand/helper (perhaps by example? Like: "Construct a deinit statement from an expression by writing expr.deinit()"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's because @celinval (correct me if I'm wrong) pointed out that "poison" in this form is only intended to model uninitialized memory. So calling it poison is too vague and deinit makes more sense. I tend to agree with that

}
}

pub fn with_comment<T: Into<InternedString>>(self, c: T) -> Self {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool. doc comment? I know I'm the one who vaguely waved you in the direction of "putting a comment on an irep" but did you dig into exactly how this works? It'd be good to explain I think...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That one is a bit blurry actually. It will add a comment directly to the irep.
As of now, comments are usually in the "location" sub, and here I'm making the decision to put it outside.
Not sure that's the best option, but the location is handled by Expr::to_irep not ExprValue::to_irep, so I can't propagate it up without a big refactoring that would clutter the entire thing for one use case.

I do think we need a more standard way of writing comments like that for alternative back-ends

@tedinski
Copy link
Contributor

I need to understand when the NondetTransformer is used and why. Right now, even when the NondetTransformer is used, the final code will contain Nondet expressions because the lowering is done at the very last moment. Is it ok? Is it not?

NondetTransformer isn't actively used, and was part of an earlier effort to make --gen-c emit more runnable/parsable C code. If it just rewrites deinit away, I think it'd be fine (just stick a comment in there). We might remove it later...

Copy link
Contributor

@tedinski tedinski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, LGTM, let's merge.

  1. It looks like Daniel is looking into adding an l-value assertion, but I think that addition can go in that PR when a possible problem there is resolved?
  2. Let's open a quick follow-up issue about how we're attaching comments, to go double check next week that our approach seems sensible.

@giltho giltho merged commit cae7aca into model-checking:main Aug 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

No open projects
Status: Done
Status: Done

Development

Successfully merging this pull request may close these issues.

4 participants