[Figure caption: Another example program.]

A generalization of Karr’s algorithm in another direction is the use of polyhedra instead of affine spaces for approximately representing sets of program states; the classic reference is Cousot and Halbwachs’ paper [6]. Besides affine equalities, polyhedra can also express affine inequalities such as $3\mathbf{x}_1 + 5\mathbf{x}_2 \le 7\mathbf{x}_3$. Since the lattice of polyhedra has infinite height, widenings must be used to ensure termination of the analysis (see [2] for a recent discussion), which makes the approach unsuitable for precise analyses. Sets of affine inequalities, however, make it possible to relate the values of variables before and after a procedure call (a relational analysis in the terminology of Cousot), and thus allow a natural interprocedural generalization. Nevertheless, a relational analysis that uses affine spaces or polyhedra to approximate the relational semantics of procedures is not precise enough to detect all valid affine relations in a program with procedures.

For a simple example see Figure 2. The true relational semantics of procedure P is described by the formula $(x = x_0) \lor (x = 2x_0 - 2)$, where $x_0$ represents the initial and $x$ the final value of the variable. The best approximation of this relation by an affine space or polyhedron is described by the formula true. This approximation of P’s semantics is clearly too weak to detect that the affine relation $\mathbf{x} = 2$ is valid at program point 2 in procedure Main.

The paper is organized as follows. In Section 2, we formally introduce the programs to be analyzed together with their semantics. In Section 3, we introduce affine relations and their weakest preconditions along a program run, and explain our algorithm for this special case. In Section 4, we generalize our approach to deal with arbitrary polynomial relations of bounded degree. In Section 5, we extend our approach to procedures with local variables, and in Section 6 we show how to take affine preconditions into account completely.

2 Affine Programs

We model programs by systems of non-deterministic flow graphs that can recursively call each other as in Figure 1.

Let $X = \{\mathbf{x}_1, \ldots, \mathbf{x}_k\}$ be the set of (global) variables the program operates on. We use $\mathbf{x}$ to denote the column vector¹ of variables $\mathbf{x} = (\mathbf{x}_1, \ldots, \mathbf{x}_k)^t$. We assume that the variables take values in a fixed field $\mathbb{F}$. In practice, $\mathbb{F}$ is the field of rational numbers. A state assigning values to the variables is then conveniently modeled by a $k$-dimensional (column) vector $x = (x_1, \ldots, x_k)^t \in \mathbb{F}^k$; $x_i$ is the value assigned to variable $\mathbf{x}_i$. Note that we distinguish variables and their values by using a different font. For a state $x$, a variable $\mathbf{x}_i$ and a value $c \in \mathbb{F}$, we write $x[\mathbf{x}_i \mapsto c]$ for the state $(x_1, \ldots, x_{i-1}, c, x_{i+1}, \ldots, x_k)^t$.

¹ The superscript “t” denotes the transpose operation, which mirrors a matrix at the main diagonal and changes a row vector into a column vector (and vice versa).

We assume that the basic statements in the program are either affine assignments of the form $\mathbf{x}_j := t_0 + \sum_{i=1}^{k} t_i \mathbf{x}_i$ (with $t_i \in \mathbb{F}$ for $i = 0, \ldots, k$ and $\mathbf{x}_j \in X$) or non-deterministic assignments of the form $\mathbf{x}_j := ?$ (with $\mathbf{x}_j \in X$). Assignments $\mathbf{x}_j := \mathbf{x}_j$ have no effect on the program state. They are also called skip statements and are omitted in pictures. Non-deterministic assignments $\mathbf{x}_j := ?$ represent a safe abstraction of statements in a source program that our analysis cannot handle precisely, for example assignments $\mathbf{x}_j := t$ with non-affine expressions $t$, or read statements read($\mathbf{x}_j$).
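The notation introduced so far can be made concrete with a minimal Python sketch. It is only an illustration: the names `State`, `update`, `AffineAssign`, `Havoc` and `skip` are hypothetical and do not come from the paper, and the field is assumed to be the rationals, modeled with `fractions.Fraction`.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Tuple

# A state (x_1, ..., x_k): one rational value per program variable.
State = Tuple[Fraction, ...]


def update(x: State, i: int, c: Fraction) -> State:
    """Return the state x[x_i -> c]; i is 1-based as in the text."""
    return x[: i - 1] + (c,) + x[i:]


@dataclass(frozen=True)
class AffineAssign:
    """Affine assignment x_j := t_0 + sum_i t_i * x_i (hypothetical encoding)."""
    j: int                        # 1-based index of the assigned variable
    t0: Fraction                  # constant term t_0
    coeffs: Tuple[Fraction, ...]  # (t_1, ..., t_k)

    def rhs(self, x: State) -> Fraction:
        """Value of the right-hand side in state x."""
        return self.t0 + sum((t * v for t, v in zip(self.coeffs, x)), Fraction(0))


@dataclass(frozen=True)
class Havoc:
    """Non-deterministic assignment x_j := ? (hypothetical encoding)."""
    j: int


def skip(j: int, k: int) -> AffineAssign:
    """The skip statement x_j := x_j, written as an affine assignment over k variables."""
    unit = tuple(Fraction(int(i == j)) for i in range(1, k + 1))
    return AffineAssign(j=j, t0=Fraction(0), coeffs=unit)
```

A source statement such as a non-affine assignment to the second variable would be abstracted to `Havoc(2)` under this encoding.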

Let $\mathsf{Stmt}$ be the set of basic statements.

A program comprises a finite set $\mathsf{Proc}$ of procedure names that contains a distinguished procedure Main. Execution starts with a call to Main. Each procedure name $p \in \mathsf{Proc}$ is associated with a control flow graph $G_p = (N_p, E_p, A_p, e_p, r_p)$ that consists of:

• a set $N_p$ of program points;
• a set of edges $E_p \subseteq N_p \times N_p$;
• a mapping $A_p : E_p \to \mathsf{Stmt} \cup \mathsf{Proc}$ that annotates each edge with a basic statement of the form described above or a procedure call;
• a special entry (or start) point $e_p \in N_p$; and
• a special return point $r_p \in N_p$.

We assume that the program points of different procedures are disjoint: $N_p \cap N_q = \emptyset$ for $p \neq q$. This can always be enforced by renaming program points. We write $N$ for $\bigcup_{p \in \mathsf{Proc}} N_p$, $E$ for $\bigcup_{p \in \mathsf{Proc}} E_p$, and $A$ for $\bigcup_{p \in \mathsf{Proc}} A_p$. We agree that $\mathsf{Base} = \{e \mid A(e) \in \mathsf{Stmt}\}$ is the set of base edges and $\mathsf{Call}_p = \{e \mid A(e) \equiv p\}$ is the set of edges that call procedure $p$.

The core part of our algorithm can be understood as a precise abstract interpretation of a constraint system characterizing the program executions that reach program points. We represent program executions or runs by sequences of affine assignments. Formally, a run $r$ is a finite sequence

$$r \equiv s_1; \ldots; s_m$$

of assignments $s_i$ of the form $\mathbf{x}_j := t$ where $\mathbf{x}_j \in X$ and $t \equiv t_0 + \sum_{i=1}^{k} t_i \mathbf{x}_i$ for some $t_0, \ldots, t_k \in \mathbb{F}$. We write $\mathsf{Runs}$ for the set of runs. The set of runs reaching program point $u \in N$ can be characterized as the least solution of a system of subset constraints on run sets (see, e.g., [19] for a similar approach for explicitly parallel programs).

We start by defining the program executions of base edges $e$ in isolation. If $e$ is annotated by an affine assignment, i.e., $A(e) \equiv \mathbf{x}_j := t$, it gives rise to a single execution: $S(e) = \{\mathbf{x}_j := t\}$. The effect of base edges $e$ annotated by a non-deterministic assignment $\mathbf{x}_j := ?$ is captured by all runs that assign some value from $\mathbb{F}$ to $\mathbf{x}_j$:

$$S(e) = \{\mathbf{x}_j := c \mid c \in \mathbb{F}\}$$

Thus, we capture the effect of non-deterministic assignments by collecting all constant assignments.

Next, we characterize same-level runs. Same-level runs of procedures capture complete runs of procedures in isolation. As auxiliary sets we consider same-level runs of program nodes, i.e., those runs that reach a program point $u$ in a procedure $p$ from a call to $p$ on the same level, i.e., after all procedures called by $p$ have terminated. The same-level runs of procedures and program nodes are the smallest solution of the constraint system S:

$$
\begin{array}{lll}
[\mathrm{S1}] & S(q) \supseteq S(r_q) & \\
[\mathrm{S2}] & S(e_q) \supseteq \{\varepsilon\} & \\
[\mathrm{S3}] & S(v) \supseteq S(u)\,;\,S(e) & \text{if } e = (u, v) \in \mathsf{Base} \\
[\mathrm{S4}] & S(v) \supseteq S(u)\,;\,S(p) & \text{if } e = (u, v) \in \mathsf{Call}_p
\end{array}
$$

where “$\varepsilon$” denotes the empty run, and the operator “;” denotes concatenation of run sets.
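To see how such a least solution arises, the following Python sketch performs a plain fixpoint iteration over the constraints of system S, truncated to runs of at most `max_len` assignments so that the iteration terminates. It is an illustration under assumptions, not the algorithm of the paper: the encoding of flow graphs, the function name `same_level_runs` and the toy program are hypothetical, statements are kept as opaque strings, non-deterministic assignments are left out because they contribute infinitely many runs, and procedure names are assumed to be distinct from program point names.

```python
from typing import Dict, List, Set, Tuple

Run = Tuple[str, ...]                     # a run: sequence of assignment labels
Edge = Tuple[str, str, Tuple[str, str]]   # (u, v, ("base", stmt)) or (u, v, ("call", proc))
Proc = Tuple[str, str, List[Edge]]        # (entry point, return point, edges)


def same_level_runs(procs: Dict[str, Proc], max_len: int) -> Dict[str, Set[Run]]:
    """Iterate the constraints of system S until nothing changes (runs cut off at max_len)."""
    S: Dict[str, Set[Run]] = {q: set() for q in procs}
    for entry, ret, edges in procs.values():
        for point in [entry, ret] + [n for u, v, _ in edges for n in (u, v)]:
            S.setdefault(point, set())
        S[entry].add(())                                   # S(e_q) contains the empty run
    changed = True
    while changed:
        changed = False
        for q, (entry, ret, edges) in procs.items():
            new: Dict[str, Set[Run]] = {q: set(S[ret])}    # S(q) contains S(r_q)
            for u, v, (kind, what) in edges:
                # Base edge: the single run consisting of the statement.
                # Call edge: all same-level runs of the called procedure.
                step = {(what,)} if kind == "base" else S[what]
                ext = {r + s for r in S[u] for s in step if len(r) + len(s) <= max_len}
                new.setdefault(v, set()).update(ext)
            for key, runs in new.items():
                if not runs <= S[key]:
                    S[key] |= runs
                    changed = True
    return S


# Hypothetical toy program: Main sets x and calls P; P either skips
# or executes x := 2x - 2 and then calls itself recursively.
toy: Dict[str, Proc] = {
    "Main": ("m0", "m2", [("m0", "m1", ("base", "x:=2")), ("m1", "m2", ("call", "P"))]),
    "P": ("p0", "p1", [("p0", "p1", ("base", "x:=x")),
                       ("p0", "p2", ("base", "x:=2x-2")),
                       ("p2", "p1", ("call", "P"))]),
}
runs = same_level_runs(toy, max_len=3)
```

Because every run is built by concatenating shorter runs, the truncation loses nothing below the bound: `runs["P"]` contains exactly the same-level runs of P with at most three assignments.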

By [S1], the set of same-level runs of a procedure $q$ comprises all same-level runs reaching the return point of $q$. By [S2], the set of same-level runs of the entry point of a procedure contains the empty run. By [S3] and [S4], a same-level run for a program point $v$ is obtained by considering an ingoing edge $e = (u, v)$. In both cases, we concatenate a same-level run reaching $u$ with a run corresponding to the edge. If $e$ is a base edge, we concatenate with a run from $S(e)$. If $e$ is a call to a procedure $p$, we take a same-level run of $p$.

Next, we characterize the runs that reach program points.

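They are the smallest solution of the constraint system R:

$$
\begin{array}{lll}
[\mathrm{R1}] & R(\mathit{Main}) \supseteq \{\varepsilon\} & \\
[\mathrm{R2}] & R(p) \supseteq R(u) & \text{if } (u, \_) \in \mathsf{Call}_p \\
[\mathrm{R3}] & R(u) \supseteq R(p)\,;\,S(u) & \text{if } u \in N_p
\end{array}
$$

By [R1], the procedure Main is reachable by the empty path. By [R2], every procedure $p$ is reachable by a path reaching a call of $p$. By [R3], we obtain a run reaching a program point $u$ in some procedure $p$ by composing a run reaching $p$ with a same-level run reaching $u$.

So far, we have furnished procedural flow graphs with a symbolic operational semantics only, by describing the sets of sequences of assignments possibly reaching program points. Each of these runs gives rise to a transformation of the underlying program state $x \in \mathbb{F}^k$. Every assignment statement $\mathbf{x}_j := t$ induces a state transformation $[\![\mathbf{x}_j := t]\!] : \mathbb{F}^k \to \mathbb{F}^k$ given by

$$[\![\mathbf{x}_j := t]\!]\, x = x[\mathbf{x}_j \mapsto t(x)]$$

where $t(x)$ is the value of term $t$ in state $x$. This definition is inductively extended to runs: $[\![\varepsilon]\!] = \mathrm{Id}$, where $\mathrm{Id}$ is the identical mapping, and $[\![ra]\!] = [\![a]\!] \circ [\![r]\!]$.

The state transformation of an affine assignment $\mathbf{x}_j := t_0 + \sum_{i=1}^{k} t_i \mathbf{x}_i$ is an affine transformation. Hence, it can be written in the form $[\![\mathbf{x}_j := t]\!]\, x = Ax + b$ with a matrix $A \in \mathbb{F}^{k \times k}$ and a (column) vector $b \in \mathbb{F}^k$. More specifically, $A$ and $b$ have the form indicated below:

$$
A = \begin{pmatrix} \begin{matrix} I_{j-1} & 0 \end{matrix} \\ \begin{matrix} t_1 & \cdots & t_k \end{matrix} \\ \begin{matrix} 0 & I_{k-j} \end{matrix} \end{pmatrix}
\qquad
b = \begin{pmatrix} 0 \\ t_0 \\ 0 \end{pmatrix}
\qquad (1)
$$

Here, $I_i$ is the unit matrix with $i$ rows and columns, and $0$ denotes zero matrices and vectors of appropriate dimension. In $b$, $t_0$ appears as the $j$-th component.

As a composition of affine transformations, the state transformer of a run is an affine transformation as well. For any run $r$, let $A_r \in \mathbb{F}^{k \times k}$ and $b_r \in \mathbb{F}^k$ be such that $[\![r]\!]\, x = A_r x + b_r$.

An affine relation is an equation $a_0 + a_1 \mathbf{x}_1 + \cdots + a_k \mathbf{x}_k = 0$ with coefficients $a_i \in \mathbb{F}$; geometrically, it describes a hyper-plane in the $k$-dimensional vector space $\mathbb{F}^k$. Such a relation can be represented as a polynomial of degree at most 1 (namely, the left-hand side) or, equivalently, as a column vector $a = (a_0, \ldots, a_k)^t$. In particular, the set of all affine relations forms an $\mathbb{F}$-vector space which is isomorphic to $\mathbb{F}^{k+1}$. The vector $y \in \mathbb{F}^k$ satisfies the affine relation $a$ iff $a_0 + a' \cdot y = 0$, where $a' = (a_1, \ldots, a_k)^t$ and “$\cdot$” denotes scalar product. We write $y \models a$ to denote this fact. Geometrically, this means that the point $y$ is an element of the hyper-plane described by $a$.

The affine relation $a$ is valid after a single run $r$ iff $[\![r]\!]\, x \models a$ for all $x \in \mathbb{F}^k$, i.e., iff $a_0 + a' \cdot [\![r]\!]\, x = 0$ for all $x \in \mathbb{F}^k$; $x$ represents the unknown initial state. Thus, $a_0 + a' \cdot [\![r]\!]\, x = 0$ is the weakest precondition for validity of the affine relation $a$ after run $r$. We have

$$
\begin{array}{lll}
 & a_0 + a' \cdot [\![r]\!]\, x = 0 & \\
\text{iff} & a_0 + a' \cdot (A_r x + b_r) = 0 & \text{(choice of } A_r \text{ and } b_r\text{)} \\
\text{iff} & (a_0 + a' \cdot b_r) + a' \cdot (A_r x) = 0 & \text{(linearity, rearrangement)} \\
\text{iff} & (a_0 + a' \cdot b_r) + (A_r^t a') \cdot x = 0 & \text{(law } x \cdot (Ay) = (A^t x) \cdot y \text{ from linear algebra)}
\end{array}
$$

From this characterization we see that the weakest precondition is again an affine relation. Even better: the mapping that assigns to each affine relation its weakest precondition before run $r$ is the linear map described by the following $(k+1) \times (k+1)$ matrix $W_r$:

$$
W_r = \begin{pmatrix} 1 & b_r^t \\ 0 & A_r^t \end{pmatrix}
\qquad (2)
$$

In particular, we have proved that for every $x \in \mathbb{F}^k$:

$$[\![r]\!]\, x \models a \quad \text{iff} \quad x \models W_r\, a \qquad (3)$$

Thus, the matrix $W_r$ provides us with a finite description of the weakest precondition transformer for affine relations of a single program execution $r$.

Note that the only affine relation which is true for all program states is the relation $0 = (0, \ldots, 0)^t$.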

Thus, the affine relation $a$ is valid after run $r$ iff $W_r\, a = 0$, because the initial state is arbitrary. Accordingly, the affine relation $a$ is valid at a program point $u$ iff it is valid after all runs $r \in R(u)$. Summarizing, we have:

LEMMA 1. The affine relation $a \in \mathbb{F}^{k+1}$ is valid at program point $u$ iff $W_r\, a = 0$ for all $r \in R(u)$.

Thus, the set $W = \{W_r \mid r \in R(u)\}$ gives us a handle to solve the validity problem for affine relations. The problem is that we do not know how to represent $W$ in a finitary way, let alone how to compute it. At this point, we recall from linear algebra that the set of $(k+1) \times (k+1)$ matrices again forms an $\mathbb{F}$-vector space.
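For a single run, the criterion of Lemma 1 can be checked mechanically. The Python sketch below builds the weakest precondition matrix of one affine assignment (the identity matrix with column $j$ replaced by $(t_0, t_1, \ldots, t_k)^t$, which is what (1) and (2) give for a one-statement run) and multiplies these matrices along a run. The product order $W_{s_1} \cdots W_{s_m}$ is obtained from (2) by block multiplication and is stated here as a derived fact rather than quoted from the text. The encoding of assignments and all function names are hypothetical, and rationals are modeled with `fractions.Fraction`.

```python
from fractions import Fraction
from typing import List, Sequence, Tuple

Vector = List[Fraction]
Matrix = List[List[Fraction]]
# Hypothetical encoding: (j, [t_0, t_1, ..., t_k]) stands for x_j := t_0 + sum_i t_i * x_i.
# Integer coefficients suffice for the example; they are converted to Fraction.
Assign = Tuple[int, Sequence[int]]


def identity(n: int) -> Matrix:
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def mat_mul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum((a[i][l] * b[l][j] for l in range(len(b))), Fraction(0))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_vec(a: Matrix, v: Vector) -> Vector:
    return [sum((row[l] * v[l] for l in range(len(v))), Fraction(0)) for row in a]


def wp_of_assignment(stmt: Assign, k: int) -> Matrix:
    """W = [[1, b^t], [0, A^t]] for one assignment: identity with column j set to (t_0..t_k)^t."""
    j, t = stmt
    w = identity(k + 1)
    for i in range(k + 1):
        w[i][j] = Fraction(t[i])
    return w


def wp_of_run(run: Sequence[Assign], k: int) -> Matrix:
    """W_r for r = s_1; ...; s_m, computed as W_{s_1} * ... * W_{s_m}."""
    w = identity(k + 1)
    for stmt in run:
        w = mat_mul(w, wp_of_assignment(stmt, k))
    return w


def valid_after(run: Sequence[Assign], a: Vector, k: int) -> bool:
    """The relation a = (a_0, ..., a_k) is valid after the run iff W_r a = 0."""
    return all(c == 0 for c in mat_vec(wp_of_run(run, k), a))
```

With one variable, the run `[(1, [2, 0]), (1, [-2, 2])]` encodes x := 2 followed by x := 2x - 2; `valid_after` accepts the relation x - 2 = 0, given as `[Fraction(-2), Fraction(1)]`, and rejects x - 3 = 0. For the empty run, only the zero relation passes, matching the remark that it is the only relation true in all states.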

