Formal Assertions for C and C++

Haim Zamir, 1991

 

Abstract

A technique for incorporating formal assertions compatible with both C and C++ is presented. It is shown that use of these techniques aid in detecting an important class of bugs caused by the failure of modules and their clients to live up to their declared responsibilities.

 

Introduction

The formal assertion tools described here are the product of an attempt to transform the concept of  “programming by contract” into a concrete C and C++ programming feature. The approach taken is to provide some software specification elements to be permanently embedded within the implementation of a software module.

This paper describes the services provided by CrFormal.h, a header file compatible with both C and C++. The source listings that accompany this article provide a full implementation of formal assertions described below.

 


Existing tools

The ANSI C assert() macro

ANSI C provides standard support for assertions [K&R 88]. It provides the header file assert.h which defines a macro that masquerades as a function of the type:

   void assert(int expression)

 

Here is an example of its use:

#include <assert.h>

int strlen(char *p)

{

   int val;

   assert(p != NULL);

   /* not responsible for result it p == NULL */
   /* .... set val to length of string */

   return val;

}

 

The macro has two definitions. By default, the macro is replaced by code to this effect:

if(!(p != NULL)) {

   fprintf(stderr,
      "File %s; Line %d ## Assertion failed: p != NULL\n",
      __FILE__, __LINE__)

   abort();

}

 

Failure of the assertion produces output like this on a console-type application:

   File filename; Line nnn ## Assertion failed: p != NULL

If, at compile time, a definition of NDEBUG is detected, the macro and the accompanying expression produce no code whatsoever.

      This standard macro is helpful, but it is not without its problems:

It is hardwired to produce console-style output. We need assertions that produce output for applications and device drivers as well.

If the assertion is violated, the program immediately terminates. Although the assertion message shows us the location of the violated assertion in the source code, that isn’t necessarily where the bug is. In the above example, the client is responsible for providing valid arguments.

Plain vanilla assertions lack a formality of application, and therefore tend be be used informally. This results in both under-application where they are badly needed, and frequent misuse[1] where they are inappropriate.

 

What are assertions?

Assertions are fundamentally debugging statements. They express assumptions made by the program. When these assumptions do not hold true, the program breaks.

      Assertions are necessary, since every function cannot take responsibility to produce reasonable results under all conditions[2]. This is particularly true of low-level functions. A higher-level function typically has the obligation to provide reasonable arguments to each function it calls.

      Assertions can be switched on and off with compile-time definitions. This is typical of other debugging statements. They are also cleaner, since they use far fewer ugly  #ifdef’s than other techniques.

 

Formal Assertions

Formal assertions spell out and enforce the contract between a module and its clients. Conceptually, formal assertions are part of the specification of a module, no different from stating which data types are to be passed as parameters or returned by a function[3].

·        Just as the compiler enforces that certain types must be passed as parameters and returned at compile time;

·        And (in C++), the linker effectively enforces the conformance of library functions to their stated interfaces (type-safe linkage);

·        Formal assertions point out violations of the programming contract at run-time.

 

When the module is compiled with formal assertions enabled, the module halts the program with a meaningful message if either the module or one of its clients fails to fulfill its obligations.

 

Require and Ensure

Preconditions and postconditions are grouped together into clauses. Were it possible for us to graft language-level support for formal assertions onto C or C++, the require clauses would appear as part of the function prototype, and might look something like this:

   require {

      expr;

      expr;

   }

 

Unfortunately, we cannot do this as gracefully as we would like without language-level support. Therefore, we created preprocessor macros require and ensure to mimic the Eiffel keywords of the same name as closely as possible, and to provide headings for our clauses  where they are most visible, —at the top of the function definition

 

Here is strlen() presented again with formal assertions:

 

#include "CrFormal.h"

int strlen(char *p)

{

   int val;

   require {   // client must satisfy these conditions

      precondition(p != NULL);    for

   } endreq;

   ensure {    // we guarantee the following

      postcondition(val >= 0);  

   } endens;

   // .... calculate string size and assign it to val here

   ensReturn(int, val);
}

 

In addition to macros require and ensure, we have added something to terminate each clause: endreq and endens. The endreq macro always expands to nothing, it is provided only for symmetry with the endens macro, which works in tandem with ensReturn, to guarantee that postconditions are checked just prior to returning a value, regardless of where the ensure clause is positioned.

 

Class Invariants

Suppose that we have a dynamic array class of the following description:

 

#include "CrFormal.h"

class Array {

public:

   Array()             {iCount = 0; iData = NULL;}

   ~Array();

   int Count()           { return iCount; }

   int Add(void *p);              // add item, return index

   void Remove(int index);    // remove item at index

   void *Nth(int index);        // get pointer to nth item

protected:

   void **iData;

   int iCount;

   errcode Illegal() const;      // checks invariants

};

 

All methods of our class might want to verify the integrity of the instance before performing any work in debugging mode. This verification should be done both before and after the invocation of every method. How is this accomplished? Define a class invariant!

 

The class invariant describes the properties of an object that remain constant[4], we want to be able to specify them all in one place, and check them automatically on entry and exit to every method. For this purpose we specify the “Illegal” method.

 

 

Array::~Array()

{ 

   require {} endreq;

   if(iData)   DisposHandle(iData);

}

 

int Array::Add(void *p)

{

   Handle h;

   int index = -1;

   require {} endreq;

   ensure {} endens;

 

   if(iCount == 0 ) {

      h = NewHandle(sizeof(void *));

      if(h == NULL) goto end;      // can’t do it

      iData = (void **)h;

   } else {

      SetHandleSize((Handle)iData, iCount * sizeof(void *));

      if(MemError()) goto end;  // can’t get more memory

   }

 

   index = iCount++;  

   (*iData)[index] = p;

 

   end:

   ensReturn(int, index);
} // end Add()

 

errcode Array::Illegal() const

{

   errcode err = NO_ERR;

 

   if(!validNonNullPointer(this))         // is this a pointer

      err = kArrayErr_BadInstancePointer;

   else if(!validPointer(iData))            // iData null or a good ptr

      err = kArrayErr_BadDataPointer;

   else if(iCount < 0)                   // no negative counts allowed

      err = kArrayError_BadCount;

   else if((iData == NULL) != (iCount == 0)) // are ptr, count in sync?

      err kArrayError_InconsistentVariables;

   return err;

}

 

The class invariant above states explicitly that iCount should never be less than zero, and that iData should only be NULL when iCount is zero, and at other times it should appear to be a valid pointer (since it points to the beginning of an allocation, it should follow platform address rules)

 

Limitations

Some features are best provided at the language level. They are far more gracefully incorporated that way. We must do the best that we can, however, where necessary language features are missing.

      CrFormal.h uses the require and ensure clauses as opportunities to do class invariant checking. Where these clauses are missing, invariants also cannot be checked.

      The ensReturn(type, expr) and ensReturnVoid macros can only be used once in any function. Neither can be used in absence of an ensure clause. This does not pose a problem for those whose convention it is to provide only a single exit point for any function. It is generally not a good practice to rely on programming conventions when adding language-like features. At least in this case, a compilation error occurs if a construct is missing or employed too many times. There is no protection from raw return statements in addition to a ensReturn statement, however.

      Two versions of the ensReturn (type, expr) macro are included in CrFormal.h. One uses a temporary variable of the specified type to prevent the expression from being evaluated twice. This definition makes the world safe for side effects, but it is not perfect. I use the second definition, and avoid side effects in ensReturn statements.

 

Conclusion

“Programming by contract” is an critical concept for building robust software. However, as a concept, or as a documentation technique alone, it will have no impact on software development.

      An unenforceable contract has little value. Formal assertions simultaneously express and enforce the terms of the module-client contract. The formal assertions techniques specified here have a user definable hook to allow them to operate in either a traditional shell tool or under a full-blown application running under a GUI.

 

Definition of terms

Each of the terms below refers to other terms also defined here in italics.

 

Assertion

An assertion is a statement that evaluates a boolean condition that in one state allows execution to proceed unhindered, and in another produces an exception. Assertion checking is can be compile-switched in for debugging purposes, but generates no code in a shippable product.

      Ansi C provides its own standard assertion macro, assert(expr). This article describes formal assertions that operate in the context of an event driven application with a GUI.

 

Formal assertion

A formal assertion is an assertion that has a prescribed formal relationship to the semantics of a function. Preconditions and postconditions are examples of formal assertions. Put in more contract-like terms, they represent disclaimers and guarantees, respectively.

 

Invariant

An invariant is an assertion that describes properties of a data structure that never change. When enabled, class invariants are checked upon entering and exiting each method. If an invariant is violated upon entering a method, then the integrity of that instance was violated prior to calling the method. If the invariant is violated upon exiting a method, then the method did not fulfill its obligation to preserve the integrity of the object instance.

 

Postcondition

A postcondition is a formal assertion  that expresses a responsibility of the function to fulfill. A postcondition violation indicates that either the function does not work correctly, or that the assertion is incorrectly specified. Postconditions assume that all preconditions have been satisfied.

 

Precondition

A precondition is a formal assertion that expresses a responsibility of the client that must be fulfilled prior to calling the function for the function to operate correctly.

 

Acknowledgements

The concepts and the model for their usage were borrowed from the Eiffel programming language [Meyer 88]. This work was done in support of developing reliable core architecture for multimedia systems for MacroMind, Incorporated, San Francisco, California.

 

References

[Meyer 88] Bertrand Meyer,  “Systematic Approaches to Software Construction,” Object Oriented Software Construction,   Prentice Hall, 1988, pp. 111-164.

[K&R 88] Brian W. Kernighan, Dennis M. Ritchie, The C Programming Language, (second edition), Prentice Hall,   1988, pp. 253-254.

[Wing 90] Jeanette M. Wing, “Using Larch to Specify Avalon/C++ Objects,” IEEE Trans. Software Eng.,  Special Issue on Formal Methods in Software Engineering, vol. 16, no. 9,  September 1990, pp. 1076-1088.

 



[1] Where are they inappropriate? We have seen a lot of code which substitutes assertions for valid failure conditions, (i.e. checking if a memory allocation succeeded). Assertions should specify the logic of the program. In otherwords failure of an assertion is a bug in the program. Lack of memory is not a bug in the program, it is a real world possibility that must be accounted for in release versions of any program.

[2] See [MEYER 88] for a discussion of why defensive programming is bad. Redundant code is written because of unclear specifications, which act to obscure more serious bugs. We want a program with a bug to fail definitively at the earliest possible moment (in order of preference compile time, link time, load time, run-time). Defensive code mysteriously partially compensates for and obscures bugs under many conditions, making serious bugs more costly to find, by deferring their detection, and removing the symptoms from the source of the bug.

[3] In practice, formal assertions are part of the implementation.

[4]  If the object has multiple valid states, then it checks to guarantee that it conforms to one of these valid states. If there are object methods that are only to be used when the object is in a particular state, that should be expressed in a precondition. If the purpose of a method is to put the object into a different state, conformity to that particular state is a postcondition of the method.