Length of lists

Jul 27, 2012 at 1:20 PM

Looking at the list example used in the tutorial, I was wondering how one would go about trying to verify a function returning the length of the list. A sample implementation would be something like:

int list_length(List *list)
{
    Node *head = list->head;
    Node *current = head;
    int count = 0;
    while (current != NULL)
    {
        current = current->next;
        count++;
    }
    return count;
}

Storing an integer within the list doesn't seem to work at first glance, as there is no way to know the variable 'count' of the implementation equals the length variable of the list. Also, since maps are infinite, I don't think we can derive the length from the idx map. Does anybody have any ideas how to do this within the current defnition of list in the tutorial?

Using an inductive type instead would probably work, but that requires rewriting the other example functions of the tutorial as well.

Developer
Aug 23, 2012 at 9:58 PM

The list in the tutorial is not, strictly speaking, supposed to be a list at all; it is supposed to be a list representation of a set. It is not intended to expose this representation. In that sense, what you really want to compute is not the length of the list but the size of the set.

The natural way to define the size of the set (without exposing the representation) is with a recursive function that iterates over all ints and counts those ints x s.t. val[x]. Because the range of ints is bounded, this can be given a termination measure, and so you can define this function. You can also prove by induction that the size of a set is one more than the size of the set with an element of the set removed.

Unfortunately, for this particular list, your function doesn't count the size of the set, because the list invariant allows a datum to occur multiple times in the list. You can eliminate this problem by strengthening the invariant of the list to have at most one node representing any datum.

Now, to prove your code above, you just need to maintain a ghost variable that keeps track of the set of values you haven't seen yet. Your invariant is that the size of this set plus count is the size of the set represented by the list, and that every element in this set has an index greater than the index of the node you are currently at. (This also gives the termination measure for your loop above.)

If you have trouble proving this, give me a holler (ernie.cohen@microsoft.com) and I'll send code.

If you actually wanted to expose the list representation of a list (e.g., as a sequence), then a different approach is called for: you should define a recursive function that gives the length of the list starting at any node of the list. To prove that this terminates, you need a measure for the recursion. The current list is unfortunate in this regard, because it uses increasing natural numbers as indices. You could keep in a ghost field of the list a bound on the maximum index of the list, which would give you the necessary measure. Or you can do things the hard way, and prove that the physical address space contains a bounded number of list nodes (which you can prove because there is an injective map from node pointers to physical addresses).