Contributor

@romainbrenguier romainbrenguier commented Dec 31, 2016

Preprocessing of goto-programs for the string refinement.
This pull request includes the changes from #374 and #549 (merged), which should be merged first.

Contributor

@smowton smowton left a comment

Broadly looks good, three main flavours of change to make:

(1) Some possible factoring in preprocess to consider
(2) Run cpplint.py to find code style problems
(3) Add commentary / function block comments to explain the reasoning behind some of the less-obvious code.

Regarding (3), cpplint.py will demand that you include a full docblock for every one of axioms_for_concat and axioms_for_substring and .... However I suspect if you consult with @forejtv / @peterschrammel they might be willing to accept a docblock for the /family/ of functions rather than lots of tedious repetition.

Contributor

Unify this with java_bytecode/java_bytecode_convert_method.cpp::tmp_variable

Contributor

Can you factor these three nearly-identical functions?

Contributor

Code style -- use cpplint.py

Contributor

To help people reading generated VCCs, give this a more descriptive name

Contributor

Again, descriptive name would be good

Contributor

Comment on what this function does / what its purpose is, and consider a more descriptive name

Contributor

Comment on what sort of simplification this can do

Contributor

What does this do? Subst...itution, I guess? But from what, to what? Looking below, I guess this could be renamed "instantiate_quantifier" or "substitute_quantifier" or similar? What is f for?

Contributor

This comment should make sense in context of a block comment up top, to tell us what 'the element' is, and what it's being added to, and so on

Contributor

Explain, here or somewhere else, what's happening here: will this be called repeatedly to decrement the quantifier towards its lower bound, or is this a one-shot instantiate-with-upper-bound-minus-one?

@romainbrenguier romainbrenguier force-pushed the string-refine-preprocessing branch 2 times, most recently from 9b1e639 to f5a6cfd Compare January 17, 2017 10:52
@romainbrenguier romainbrenguier force-pushed the string-refine-preprocessing branch 6 times, most recently from 0625ef1 to b69129b Compare February 2, 2017 10:34
@romainbrenguier romainbrenguier force-pushed the string-refine-preprocessing branch from b69129b to ff66780 Compare February 2, 2017 15:18
@romainbrenguier romainbrenguier force-pushed the string-refine-preprocessing branch from 4c26c59 to 2cd801d Compare February 3, 2017 09:09
Member

@peterschrammel peterschrammel left a comment

I've only looked over the first part of it. My comments apply to the whole PR, though. I'll review again once it has been rebased and cleaned up.

Member

Please do that.

Member

Make it a static class member.
These are candidates to go into a util class at some point.

Member

cpplint

Member

function comment block missing

Contributor Author

changed this here 918d97a

Member

use a ranged for
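
A minimal sketch of what this suggestion means, not taken from the code under review: `total_length` and its list of names are hypothetical. A C++11 range-based for replaces the explicit iterator loop shown in the comment.

```cpp
#include <cstddef>
#include <list>
#include <string>

// Hypothetical helper (not from the PR): sums the lengths of all names.
std::size_t total_length(const std::list<std::string> &names)
{
  std::size_t total=0;
  // Iterator version being replaced:
  //   for(std::list<std::string>::const_iterator it=names.begin();
  //       it!=names.end(); ++it)
  //     total+=it->size();
  for(const auto &name : names)
    total+=name.size();
  return total;
}
```

The `const auto &` form avoids copying each element; drop the `const` only if the loop body needs to modify the elements.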

Member

use id2string

Member

code commented out? Fix, remove or use #if 0 and explain why

Member

why is this code commented out?

Member

Please do so

@romainbrenguier romainbrenguier force-pushed the string-refine-preprocessing branch 6 times, most recently from 334ac5f to 3fd299a Compare February 10, 2017 12:51
Member

Strictly speaking, this is forbidden: it would introduce a dependency of goto-programs on solvers. So the immediate guess is that string_expr.h should go into util/. However, it has dependencies on solvers and java_bytecode, and I don't understand why they are needed. In my opinion, string_expr.h should be language-independent and be moved into util/; for this to happen, the includes of bv_refinement.h (seems obsolete) and refined_string_type.h must be removed. The function to_refined_string_type should be defined in refined_string_type.h in the style of similar functions in std_expr.h.

Note: Investigating all these unwanted dependencies, I noticed that solvers depends on ansi-c (in qbf and flattening) and also on java_bytecode (in the string solver), which is not "clean". Also, goto-programs has dependencies on ansi-c, which should not be the case if we take the view of a goto-program being language-independent. All this is related to our multi-language facilities requiring some refactoring.
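
For readers unfamiliar with the cast-helper style mentioned for to_refined_string_type, here is a rough, self-contained sketch. The `typet` and `refined_string_typet` classes below are simplified stand-ins written for illustration only, and the string id "refined_string" is an assumption; the real classes and id constants in the code base differ.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Simplified stand-in for a generic type node (assumption, not the real class).
class typet
{
public:
  explicit typet(std::string id) : id_(std::move(id)) {}
  const std::string &id() const { return id_; }

private:
  std::string id_;
};

// Simplified stand-in for the refined string type (assumption).
class refined_string_typet : public typet
{
public:
  refined_string_typet() : typet("refined_string") {}
};

// Cast helper: check the id first, then downcast by reference.
inline const refined_string_typet &to_refined_string_type(const typet &type)
{
  assert(type.id()=="refined_string");
  return static_cast<const refined_string_typet &>(type);
}
```

Keeping the helper next to the class definition means callers only need the one header, which is the point of the dependency cleanup discussed above.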

Collaborator

@peterschrammel would you mind opening an issue to track those (undesirable) dependencies? I'm aware of some of them, but as you have done a proper investigation it would be great if you could note down the details of your findings so that we can all work on them. Thanks!!

Contributor Author

string_expr.h could be moved to util but we need refined_string_type.h (this is the type of string_exprt, and maybe these should be renamed to make that obvious). To avoid dependencies on java_bytecode and such, some static methods of refined_string_type should be moved out of this class to more appropriate places.

Contributor Author

@peterschrammel do you prefer to have these changes (on string_expr and refined_string_type) in this pull request or should it be a new one?

Member

Consider using #523 (once it has been merged) for adding fresh variables.

Member

cpplint (check whole file)

Contributor Author

I didn't get any error from cpplint on this file

Member

assignments (check whole file)

Member

cpplint

Member

std::size_t

Member

cpplint

@peterschrammel
Member

Looks much cleaner now. Please squash the commits when finalising.

Collaborator

Nit picking: most files place the include of the class-specific header as the last #include (I can't argue this is good or bad).

Collaborator

I would suggest to use the facilities from #523 (which implies that #523 should get merged asap).

Contributor Author

I added a TODO note there, in case this PR is merged before

Collaborator

It seems some poking and pushing on #523 is required...

Collaborator

The use of '#' as separator may be dangerous, as it is used in the SSA encoding. Code shouldn't be doing this, but may be relying on only a single '#' existing...

Contributor Author

I used $ instead (8196d90), it seems that's what #523 uses
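
To illustrate the concern, here is a small sketch under stated assumptions: `fresh_name` and `strip_suffix` are hypothetical helpers, not functions from this PR or from #523. The second one mimics a consumer that treats everything after the first '#' as a suffix.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: builds a fresh identifier of the form "prefix$N".
std::string fresh_name(const std::string &prefix, std::size_t counter)
{
  return prefix+"$"+std::to_string(counter);
}

// Hypothetical consumer that assumes an identifier contains at most one '#'
// and treats everything from that character on as a suffix to drop.
std::string strip_suffix(const std::string &identifier)
{
  const std::size_t pos=identifier.find('#');
  return pos==std::string::npos ? identifier : identifier.substr(0, pos);
}
```

With '$' as the separator, `strip_suffix(fresh_name("tmp", 1)+"#2")` still yields `tmp$1`. Had the fresh name been `tmp#1`, the same call would cut it down to `tmp` and the counter would be lost.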

Collaborator

return tmp_symbol.symbol_expr();

Collaborator

Hardcoding ID_java looks scary to me. Why does that work? (Though, in fact, it might as well just not matter.)

Contributor Author

As mentioned in the next comment, these preprocessing functions are only for Java, so that's why there shouldn't be any problem there.

Collaborator

If this is Java only, shouldn't all the files go in the java_bytecode folder? Language-specific parts should be as clearly separated from generic ones as possible.

Contributor Author

From the point of view of dependencies, we need goto-programs but not anything from java_bytecode. If code in java_bytecode is allowed to use things from goto-programs then yes we could move it there, but I'm not sure it's the case. Maybe @peterschrammel has an opinion on this?

Collaborator

How is this code being invoked, i.e., where is the call that says "now let's do string preprocessing"? That might help to understand where it fits best.

Contributor Author

Invocation of the preprocessing is not part of this pull request but of this one: #429 (in particular here: https://github.com/diffblue/cbmc/pull/429/files#diff-41db988013f70caad45ce489749c7df6R921)
This is because the original pull request was too big and thus was split into several parts.

Collaborator

Thanks a lot for the pointer (I was aware it had been split out, but couldn't find it on the spot)! This certainly suggests that goto-programs is the right place, and that one should rather think about how to make this extensible to other languages? (Which does not imply any request to do so right away, but just bringing up the thought and possibly placing some TODO notes.)

Collaborator

The explicit use of irep_idt should not be necessary.

Collaborator

With C++11 you should be able to initialize that map using { { id, value}, {id, value} , ... } as a more concise syntax.

Contributor Author

It seems like it works for maps but not unordered maps
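
For reference, std::unordered_map does have an initializer-list constructor in standard C++11, so the brace syntax is expected to work whenever a hash is available for the key type. The sketch below uses std::string keys and made-up entries; with a key type that has no std::hash specialization, a hash functor would have to be passed as a template argument, and whether that was the obstacle here is only a guess.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical table with made-up entries; std::string stands in for the
// identifier type used in the PR.
const std::unordered_map<std::string, std::string> string_functions=
{
  {"length", "string_length_func"},
  {"concat", "string_concat_func"},
  {"substring", "string_substring_func"}
};

// Returns the mapped name, or an empty string if there is no entry.
std::string lookup(const std::string &name)
{
  const auto it=string_functions.find(name);
  return it==string_functions.end() ? std::string() : it->second;
}
```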

Collaborator

Shouldn't that be a std::unordered_map?

Collaborator

Likely these should be std::unordered_map as well

Member

agreed

Collaborator

As above: std::unordered_map

Member

agreed

@romainbrenguier romainbrenguier force-pushed the string-refine-preprocessing branch 3 times, most recently from 3af4fb8 to d35d9ca Compare February 15, 2017 14:48
@romainbrenguier
Contributor Author

I squashed the commits, but this now depends on #549 which moves string_exprt to util

  const std::string &signature)
{
  if(function_name==ID_cprover_string_copy_func)
    make_string_copy(goto_program, i_it, lhs, arguments[0], location);
Member

are you sure there is an argument?

  goto_programt::targett &i_it,
  const std::list<code_assignt> &va)
{
  auto i=va.begin();
Member

are you sure the list is not empty?
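
One way to address this kind of question is to check for emptiness before dereferencing `begin()`. The following is a simplified sketch, not the PR's code: `insert_assignments` is hypothetical, `int` stands in for the assignment type, and a vector stands in for the program being extended.

```cpp
#include <list>
#include <vector>

// Hypothetical sketch: copies the assignments into `out`, treating the first
// element separately as the reviewed loop does. Returns false, leaving `out`
// untouched, when there is nothing to insert.
bool insert_assignments(const std::list<int> &va, std::vector<int> &out)
{
  if(va.empty())
    return false; // dereferencing va.begin() here would be undefined behaviour

  auto i=va.begin();
  out.push_back(*i); // the first element takes the place of the current one
  for(++i; i!=va.end(); ++i)
    out.push_back(*i); // the remaining elements are appended after it
  return true;
}
```

An assertion such as `assert(!va.empty());` is the alternative when an empty list can only result from a programming error.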

if(i_it->is_function_call())
{
  code_function_callt &function_call=to_code_function_call(i_it->code);
  for(auto arg : function_call.arguments())
Collaborator

const auto &arg

Contributor Author

this particular one cannot be made const as we modify it

Collaborator

Oh, I'm sorry, but not knowing the precise rules for auto just yet: are you sure you end up with a reference and not with a copy? In the latter case you can obviously modify it, but it won't have much of an effect... I'd suggest auto &arg to be on the safe side.
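
The difference can be checked with a small, self-contained sketch (the function names and the vector of strings are illustrative, not from the PR). Plain `auto` deduces a value type, so the loop variable is a copy and changes to it are lost; `auto &` binds to the element itself.

```cpp
#include <string>
#include <vector>

// Loop variable is a copy: the container is left unchanged.
void rename_by_copy(std::vector<std::string> &args)
{
  for(auto arg : args)
    arg+="_renamed";
}

// Loop variable is a reference: the elements are modified in place.
void rename_by_reference(std::vector<std::string> &args)
{
  for(auto &arg : args)
    arg+="_renamed";
}
```

Calling `rename_by_copy` on `{"a", "b"}` leaves the vector as it was, while `rename_by_reference` turns it into `{"a_renamed", "b_renamed"}`.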

assert(is_java_string_type(expr.type()) ||
       is_java_string_builder_type(expr.type()));
typet object_type=ns.follow(expr.type());
assert(object_type.id()==ID_struct);
Collaborator

This assertion is covered by the subsequent to_struct_type

typet object_type=ns.follow(expr.type());
assert(object_type.id()==ID_struct);
const struct_typet &struct_type=to_struct_type(object_type);
for(auto component : struct_type.components())
Collaborator

const auto &component


exprt string_refine_preprocesst::make_cprover_string_assign(
  goto_programt &goto_program,
  goto_programt::targett &i_it,
Collaborator

Would it be possible to find a more descriptive name for this parameter?

i_it->code=*i;
for(i++; i!=va.end(); i++)
{
  i_it=goto_program.insert_after(i_it);
Collaborator

neither source_location nor function is properly set for these newly inserted instructions
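
A rough sketch of the fix being asked for, using simplified stand-ins rather than the real program types: `instructiont`, `programt` and `insert_after_with_metadata` below are hypothetical and only model the idea of copying the location and function from the instruction that precedes the insertion point.

```cpp
#include <iterator>
#include <list>
#include <string>

// Simplified stand-in for a program instruction (assumption).
struct instructiont
{
  std::string code;
  std::string source_location;
  std::string function;
};

using programt=std::list<instructiont>;

// Inserts a new instruction after `it`, copying source location and
// function from `it` so the new instruction is not left without them.
programt::iterator insert_after_with_metadata(
  programt &program,
  programt::iterator it,
  const std::string &code)
{
  const instructiont new_instruction{code, it->source_location, it->function};
  return program.insert(std::next(it), new_instruction);
}
```

Whether copying from the preceding instruction is the right source of metadata for every inserted instruction is a design decision for the PR author; the sketch only shows that the fields are filled in at insertion time.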


if(object_size.is_nil())
  debug() << "string_refine_preprocesst::make_string_assign "
          << "got nil object_size" << eom;
Collaborator

Is it nevertheless ok to continue?

@tautschnig
Collaborator

From my point of view #523 is the main blocker here. (Actually, the only one, for me.)

@romainbrenguier romainbrenguier force-pushed the string-refine-preprocessing branch from 7aea419 to f14f165 Compare March 15, 2017 16:44
@romainbrenguier
Contributor Author

Now that #523 is merged, I rebased and used this function to declare fresh arrays and strings. See the commit f14f165

Collaborator

Please add // NOLINT(runtime/explicit) to this line to silence the linter on this invocation of the constructor from the two-parameter one.

Member

@peterschrammel peterschrammel left a comment

This looks ready to go after some minor polishing

Member

!.empty()

Member

assert !arguments().empty()

Member

assert !arguments().empty()

Member

assert !arguments().empty()

Member

!.empty()

Member

assert !arguments().empty()

Member

return on next line

Member

could use const & here

Member

could use const & here

Member

could use const & here

This avoids dependencies between goto-programs and solvers.

We also removed the function is_unrefined_string_type, which should not be needed in the string solver.

Also removed mentions of Java string types and unrefined types in the solver. These mentions should not be necessary, since we are not supposed to do anything Java-specific. The solver should only have to deal with refined string types and possibly char arrays.

This is a more appropriate location for this module since it's used both in the preprocessing of goto-programs and the string solver.

Refined string type should be used instead as we are trying to be language independent.
@romainbrenguier romainbrenguier force-pushed the string-refine-preprocessing branch from ff69281 to 5cb0e5e Compare March 17, 2017 22:41
Member

@peterschrammel peterschrammel left a comment

Looks good to me now.

@tautschnig tautschnig removed their assignment Mar 20, 2017
@kroening
Collaborator

kroening commented Apr 4, 2017

Why did refined_string_type.h move into util/?

@romainbrenguier
Contributor Author

@kroening the refined_string_typet class is used both in the preprocessing of goto-program and the string-solver when the --refine-strings option is activated. Moving it to util avoids dependencies between goto-programs and solvers

@romainbrenguier
Contributor Author

I'm closing this PR which is obsolete compared to what is in the test-gen-support branch
