This project is read-only.

Verification ease: cast to void pointer and (more generally) palatable specification for adding an element to a set

May 18, 2010 at 9:38 AM

Hi all,

apologies that this post is a bit lengthy, it's not about something needs to be fixed urgently but more "general discussion" ...

Yesterday we discovered that when observing 

#include "vcc.h"
typedef struct A_str {
    int dummy;
} A_t;

void test(A_t *a) {
   obj_t o;
   void *v;
   int *i;
   o = (obj_t)a;
   v = (void *)a;
   i = (int *)a;
}

in BPL ("vcc /t") the three casts to obj_t, void *, and int * are converted in the following way:

 

// o := a;
L#o := $ptr(^A_str, P#a);
... 
// v := a;
L#v := $ref($ptr(^A_str, P#a))
...
// i := (int32_t*)a;
L#i := $ref($ptr_cast($ptr(^A_str, P#a), ^^i4));

So obviously casts to void are not translated to a "ptr_cast" (credits to this observation are to Thorsten).

 

Admittedly, I guess the rationale for slipping the cast in the BPL translation can be found in C99 6.3.2.3 "A pointer to void may be converted to or from a pointer to any incomplete or object type. A pointer to any incomplete or object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer."

However, this feature can lead to confusing situations like in here where we attempted had to use set of void pointers as specification objects:

#include "vcc.h"

typedef struct data_str {
    int i;
} data_t;

typedef struct manager_str {
    int dummy;
    spec( ptrset s; )
} manager_t;

void *callee(manager_t *man)
ensures(
  result != 0 ==> set_eq(man->s, set_union(old(man->s), SET((void*)result)))
  )
;

void caller(manager_t *man)
{
   data_t *e;
   spec(state_t s0 = current_state();)
   e =  callee(man);
   assert(e != 0 ==> set_eq(man->s, set_union(in_state(s0, man->s), SET((void*)e))));
}

Where verification gives us:

p:\tool\vcc\sandbox\elementary\result\callee-result-void.c(23,13) : error VC9500: 
Assertion 'e != 0 ==> _vcc_set_eq(man->s,_vcc_set_union(_vcc_in_state(s0, man->s),
_vcc_create_set(_vcc_dummy_set_element(), (void*)e)))' did not verify.

This is caused because the the cast to "(void*)" visible in the C code is dropped in the BPL translation. Of course an easy work-around is to work with sets of "unsigned int *" instead of "void *" (in the above example simply replace all occurrences of "SET((void*)" with "SET((unsigned int*)").

This brings us right to some points that we would like to see discussed:

(1) in the above specification, what is most "prover-friendly", e.g. for the prover does it make a difference in performance whether what the type of any pointer is that it reasons about? (Or is the type-checking completely orthogonal (say, stage 1, BPL translation phase) to the prover (say, stage 2, Boogie/z3 phase)?).

(2) the above specification is a simple transition consisting of adding an element to a set. As later we might want to assert invariants on the set, we have packed it into a "struct data_str". Is there any specification style that people consider "more natural" (and/or more performant) for a simple transition like this?

Thanks and cheers, Holger (and Thorsten)

May 18, 2010 at 10:11 PM

VCC removes (void*) casts because CCI used to add them in some places, where VCC doesn’t like them. In particular this applies to arguments of free(), which takes void*, and thus arguments are implicitly casted to void*. VCC “overrides” free() with its own version that is type-safe, and these casts confuse it. This can be in principle fixed, so please report a bug, but don’t hold your breath.

Nevertheless, the ptrset is the set of pointers along with their types, i.e., pairs of (reference,type). I’m not quite sure what you’re trying to achieve, but in general it is better to use maps to Booleans than the ptrset. Such maps can be made strongly typed. The downside is that you need to define your own functions for the usual set operations, but on the other hand, it’s fairly simple with lambda expressions.

As for (1), Z3 will see the type, because it just sees a set of pairs. Does that answer the question?

As for (2), there is an integer packed in data_t, not a set. Are you talking about manager? Then I guess it’s fine to have invariants there to specify what might happen to the set.

Hope this helps,

Michal

May 19, 2010 at 2:56 PM

Hello Michal,

as for removal of (void*) now moved to issue tracker (http://vcc.codeplex.com/WorkItem/View.aspx?WorkItemId=4669), priority low (just document it). Many thanks for the elaborations on specification style (thinking about it, maps may indeed seem more natural when predicates are simple - gives me an incentive to look closer at the maps ... ). As for (1) "Z3 will see the type": just for general understanding: so the type-checking is both within VCC (front-end, e.g. VC0018 error) and Z3 (back-end)? I so far had thought usually type-checking would be done rather early (in dragon book terms: during semantic analysis).

Cheers, Holger

May 19, 2010 at 3:28 PM

For the record, here is the map version of the above snippet,

#include "vcc.h"
typedef struct data_str {
    int i;
} data_t;

spec (
typedef struct manager_str {
    int dummy;
    bool set_f[int*];
} manager_t; );

void *callee(spec(manager_t *man))
ensures(result != 0 ==> man->set_f[(int*)result]);

void caller(spec(manager_t *man))
{
   data_t *e;
   spec(state_t s0 = current_state();)
   e =  callee(man);
   assert(e != 0 ==> man->set_f[(int*)e]);
}
Greetings from Sergey (who moreover reminded me that the doubly-linked list example List.c in the VCC examples does it similar), Holger

May 19, 2010 at 7:54 PM

The type-checking, in the sense of C standard, is of course done in the compiler. However, type correctness in C doesn’t really guarantee much in terms of runtime because of casts. For this reason the verification doesn’t really depend on the C type correctness of the input program. Instead, at every memory access we’re checking if it dynamically has the type we’re accessing it with. This is actually hidden in the check for thread-locality, which we would need to do anyhow.

Note that type casts in VCC do not involve any proof obligations, you can cast as much as you want, it’s only when you access the memory the type comes into play.

Below you’ll find a fragment from the upcoming tutorial, which gives another shot at explaining this:

Talking about instances of structs in C is a tricky business.  In plain C a

structure type generally just gives some guidelines how to interpret arrays of

bytes.  Some programming languages, like Java or ML, have a much more

disciplined view of memory: a program allocates an object of given type, not

merely $n$ bytes of memory, and later there is no way to change the type

assigned to that memory location.  Most C programs also follow this model, most

of the time: when a function gets a pointer to \vcc{struct Foo}, it usually

doesn't expect to find data corresponding to some \vcc{struct Bar} there.

Still, there are rare situations where the program needs to change type

assignment of a pointer.  The most common is in the memory allocator, which

needs to create and destroy objects of arbitrary types from arrays of bytes in

its memory pool.  Therefore, the general rule in VCC is that programs are

forced to follow the strict Java-like type discipline, except for places where

explicit annotations indicate reinterpretations of type assignment (these

annotations are explained in \secref{reint}).

In (sequential) Java programs it is usually enough to check that \vcc{o} is

non-null in order to deduce that accessing \vcc{o.f} is safe.  This is not the

case is C, particularly not in concurrent C.  The pointer \vcc{o} might be

non-null, but still point to an invalid (for example unallocated) memory

location.  VCC inserts an assertion in front of every memory access, \vcc{*p}

in the program.  This assertion will check that memory location pointed to by

\vcc{p} is currently allocated (which is something any sound C verifier would

need to do), but also that this memory location \emph{is currently assigned

type} \vcc{T}, where \vcc{T} is the statically known type of \vcc{*p}.  For

example:

\begin{VCC}

struct A { int x; };

struct A *a;

// ...

_(assert \valid((int*) &a->x)) // inserted by VCC

tmp = a->x;

\end{VCC}

VCC checks that the \vcc{&a->x} points to currently valid \vcc{int}.  There are

restriction on type assignment (to be explained in \secref{memmodel}), which

guarantee that C pointers behave like type-safe, Java-like objects.  Note that

your program can still access memory almost arbitrarily, but it requires

additional annotations.  Normally there isn't much need to think about the

validity of pointers, as it is part of the thread-locality/writeability check

described previously (\secref{writes}).

May 21, 2010 at 2:11 PM
Hello Michal, thanks for the nice clarification. Holger