Graph Types

Nils Klarlund* & Michael I. Schwartzbach†
{klarlund,mis}@daimi.aau.dk
Aarhus University, Department of Computer Science,
Ny Munkegade, DK-8000 Århus, Denmark
* The author is supported by a fellowship from the Danish Research Council.
† The author is partially supported by the Danish Research Council, DART Project (5.21.08.03).

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
ACM-20th PoPL-1/93-S.C., USA
© 1993 ACM 0-89791-561-5/93/0001/0196...$1.50

Abstract

Recursive data structures are abstractions of simple records and pointers. They impose a shape invariant, which is verified at compile-time and exploited to automatically generate code for building, copying, comparing, and traversing values without loss of efficiency. However, such values are always tree shaped, which is a major obstacle to practical use.

We propose a notion of graph types, which allow common shapes, such as doubly-linked lists or threaded trees, to be expressed concisely and efficiently. We define regular languages of routing expressions to specify relative addresses of extra pointers in a canonical spanning tree. An efficient algorithm for computing such addresses is developed. We employ a second-order monadic logic to decide well-formedness of graph type specifications. This logic can also be used for automated reasoning about pointer structures.

1 Introduction

Recursive data types are abstractions of structures built from simple records and pointers. The values of a recursive data type form a set of pointer structures that all obey a common shape invariant. The advantage of this approach is twofold:

• the validity of the invariant can be statically verified at compile-time, which contributes to the correctness of programs; and

• the invariant can be exploited to automatically generate code for such tasks as copying, comparing, and traversing values.

Recursive data types originate from the seventies [7] and have become ubiquitous in modern typed functional languages such as ML [8] and MIRANDA [10], but they may also be employed in PASCAL-like imperative languages. Their benefits are substantial, but they also impose limitations; in particular, the values of recursive data types will always be tree shaped. In this paper we present a natural generalization, graph types, which allows a large variety of graph shaped values, including (doubly-chained) cyclic lists, leaf-to-root-linked trees, leaf-linked trees, and threaded trees.

The key idea is to allow only graphs with a backbone, which is a canonical spanning tree. All extra edges must depend functionally on this backbone. The extra edges are specified by a language of regular routing expressions, which give relative addresses within the backbone. We show that construction of such graph values, along with all relevant manipulations, can happen efficiently in linear time. We introduce a decidable monadic logic of graph types, which allows automatic derivation of some constant time operations such as concatenation of doubly-linked lists.

There have been other attempts to describe graph-shaped values. Our proposal, however, allows exact descriptions of a more general class of types, and it does so using an intuitive notation that is very close to existing concepts in programming languages.

This summary is kept in an informal, explanatory style. Formal definitions and algorithms are included in the appendix.

2 Data Types
For this presentation, a (recursive) data type D is a special kind of tree grammar. The non-terminals are called types. There is a distinguished main type, which in examples is always the one mentioned first; the others are merely auxiliary. A production

    T → v(a1: T1, ..., an: Tn)

of D, where T and the Ti's are types, declares a variant v of type T containing data fields named a1, ..., an; we say that the production declares a type-variant (T: v).

For each type, the possible variants must be mutually distinct; thus (T: v) uniquely determines the production. Moreover, for each type-variant, the data fields must be mutually distinct.

The values of a data type are essentially the derivation trees of the underlying context-free grammar, starting with the main type. They are implemented as pointer trees, but the programmer will never directly manipulate these pointers. Each node of such a pointer tree is an instance of a variant of a type. A formal definition of the values of a data type is given in section A1 of the appendix.

As a simple example, consider the following data type, which specifies a type of simple integer lists

    L → nonempty(head: Int, tail: L)
      → empty()

We can think of the type Int as being a data type specified as

    Int → 0() | 1() | 2() | ...

We allow implicit variants as a form of syntactic sugar. If the sets of data fields are distinct for all variants, then the explicit variants are not needed; we may think of the names as being a concatenation of the field names. Thus, we may instead write

    L → (head: Int, tail: L)
      → ()

Programming with Data Types

When a data type has been specified, it gives rise to a number of operations in the programming language. First of all, there is a language for denoting constant values. For the above lists, one may write down

    L(head: 11, tail: (head: 12, tail: (head: 13, tail: ())))

for the list of type L with elements 11, 12, and 13. If x is a variable containing a value of type L, then x.tail.tail.head specifies the address of a subtree, in this case of type Int. In a functional language this would always denote the corresponding value; in an imperative language there is the usual distinction between l- and r-values. The comparison x = y is always defined for two values of type L. If x is a value of type L, then the boolean expression is(x,v) yields true exactly when x is of variant v. In an imperative language, the value assignment x := y is present, possibly accompanied by the swap x :=: y which exchanges two subtrees without copying. Values of data types are traversed by recursive functions or procedures. Thus, explicit pointers are never used in this approach.

There is no intrinsic loss of efficiency. Constants can be built, copied, compared, and traversed in optimal linear time, and addresses are accessed in constant time. Thus, if one really wants tree shaped values, then only advantages of data types are to be seen.

Shortcomings of Data Types

The main drawback of data types is the limited shapes of values that they allow. For the above simple lists, values always look as follows (an empty record is pictured as a ground symbol)

    [Figure: a value of the list type L]

However, it is a common optimization to want an extra pointer to gain constant time access to the last element of the list. Thus, the values should instead have the following shape

    [Figure: the same list with an extra pointer to its last element]

These are not trees and, hence, cannot be specified by data types. Until now, there has been no solution to this problem. The only possibility has been to revert to the often perilous use of explicit pointers.

3 Graph Types

We introduce the notion of graph types, which form a conceptually simple extension of data types. They allow graph shaped values while retaining the efficiency and ease of use. There are two key insights to our solution:

• while being graphs, the values all have a backbone, which is a canonical spanning tree; and

• the remaining edges are all functionally determined by this backbone.
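The second insight can be illustrated with the list example from the previous section. The following is a minimal Python sketch, not part of the proposed notation: the names `Cell`, `build`, and `last_cell` are hypothetical, and `None` stands in for the empty variant `()`. It shows that the extra pointer to the last element needs no independent storage decisions, because it is a function of the backbone alone.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cell:
    """Nonempty variant of the list type L; None plays the role of the empty variant ()."""
    head: int
    tail: Optional["Cell"] = None


def build(items):
    """Build the backbone (a plain tree of cells) from a Python sequence."""
    node = None
    for value in reversed(items):
        node = Cell(value, node)
    return node


def last_cell(first):
    """Derive the extra pointer from the backbone alone: follow tail fields to the final cell."""
    node = first
    while node is not None and node.tail is not None:
        node = node.tail
    return node


backbone = build([11, 12, 13])
last = last_cell(backbone)  # the derived, non-tree edge
```

Because `last` is recomputed from the backbone, it can never disagree with the list it belongs to; this is the property that graph types enforce for every extra edge.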
Many, but not all, sets of graphs fit this mold; we give examples of both kinds.

A graph type extends a data type by having routing fields as well as data fields. Productions now look like

    T → v(... ai: Ti ... aj: Tj[R] ...)

Here ai is a normal data field but aj is a routing field. It is distinguished by having an associated routing expression R. A graph type has an underlying data type, which is obtained by removing the routing fields. The backbones of the graph type values are simply the values of this data type. Routing expressions describe relative addresses within the backbone. The complete graph type value is obtained by using the routing expressions to evaluate the destinations of the routing fields.

Routing expressions are regular expressions over a language of directives, which describe navigation within a backbone. Directives include move up to the parent (from a specific child) (↑ or ↑a), move down to a specific child (↓a), and verify a property of the current node, where properties include this is the root (^), this is a leaf ($), and this is (a specific variant of) a specific type (T or (T: v)). A routing expression defines the destination indicated by the corresponding routing field if its regular language contains precisely one sequence of successful directives leading to a node in the tree. A graph type is well-formed if every routing expression always defines a unique destination. Section A2 of the appendix gives formal definitions of these concepts.

To make a convincing case for this new mechanism, we need to demonstrate the following facts:

• many useful families of structures can be easily specified;

• values can be manipulated at run-time similarly to values of data types, and without loss of efficiency; and

• well-formedness of graph type specifications can be decided at compile-time.

4 Examples

We now show that many common pointer structures have simple specifications as graph types. The examples are all well-formed, which can be easily seen in each case. In pictures of values, we use the convention that pointers from data fields are solid, whereas those from routing fields are dashed. The root of the underlying spanning tree, or backbone, is indicated by a solid pointer with no origin.

The list with a pointer to the last element looks like

    H → (first: L, last: L[↓first ↓tail* $ ↑])
    L → (head: Int, tail: L)
      → ()

A typical value is

    [Figure: a typical value of type H]

The routing expression ↓first ↓tail* $ ↑ for the last field contains the following directives: move down along the first pointer (↓first); follow the tail pointers until a leaf is reached (↓tail* $); then back up once (↑). This is the destination of the last pointer.

A cyclic list looks like

    C → (next: C)
      → (next: C[↑* ^])

A typical value is

    [Figure: a typical value of type C]

The routing expressions contain the following directives: move up to the root.

A doubly-linked cyclic list looks like

    D → (next: D, prev: D[↑ + ^ ↓next* $])
      → (next: D[↑* ^], prev: D[↑ + ^])

A typical value is

    [Figure: a typical value of type D]
Directives are more complicated here; they use the nondeterministic union operator on regular expressions (+) to express context-dependent choices. For example, consider the prev field of the first variant. According to the routing expression ↑ + ^ ↓next* $ of this field, we must either move up, or, if we are at the root, follow next pointers to the leaf.

A binary tree in which all leaves are linked to the root looks like

    R → (left, right: R)
      → (root: R[↑* ^])

A typical value is

    [Figure: a typical value of type R]

A binary tree in which all the leaves are joined in a cyclic list looks like

    J → (left, right: J)
      → (next: J[STEP $])

where STEP abbreviates ↑right* (↑left ↓right + ^) ↓left*. A typical value is

    [Figure: a typical value of type J]

A binary tree with red or black leaves, in which those of the same color are joined in a cyclic list, looks like

    K → (left, right: K)
      → red(next: K[BLACK* RED])
      → black(next: K[RED* BLACK])

where RED abbreviates STEP (K: red) and BLACK abbreviates STEP (K: black). We shall abstain from showing a typical value of this type.

Finally, a binary tree in which all nodes are threaded cyclically in post-order looks like

    T → (left, right: T, post: T[POST])
      → (post: T[POST])

where POST abbreviates ↑right + ↑left ↓right ↓left* $ + ^ ↓left* $. A typical value is

    [Figure: a typical value of type T]
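The way a routing expression selects a destination can be made concrete with a small interpreter. The following Python sketch is an illustration under stated assumptions rather than the algorithm of the paper: backbones are nested dictionaries, addresses are tuples of field names, and expressions are nested tuples with the hypothetical tags `seq`, `alt`, `star`, `up`, `down`, `root`, and `leaf`. It computes the set of destinations by plain set propagation, which is simple but not the linear-time table construction described later.

```python
def node(variant, **fields):
    """A backbone node: a variant name plus its data fields (routing fields are omitted)."""
    return {"variant": variant, "fields": fields}


def lookup(root, addr):
    """Follow an address (tuple of field names) from the root of the backbone."""
    n = root
    for f in addr:
        n = n["fields"][f]
    return n


def step(root, addr, d):
    """Apply one directive at addr; return the new address, or None if the directive fails."""
    kind = d[0]
    if kind == "up":      # ("up", None) is the bare up-arrow, ("up", "a") requires child a
        if not addr or (d[1] is not None and addr[-1] != d[1]):
            return None
        return addr[:-1]
    if kind == "down":    # ("down", "a")
        return addr + (d[1],) if d[1] in lookup(root, addr)["fields"] else None
    if kind == "root":    # the test ^
        return addr if addr == () else None
    if kind == "leaf":    # the test $
        return addr if not lookup(root, addr)["fields"] else None
    raise ValueError(d)


def dests(root, expr, starts):
    """All addresses reachable from `starts` along some route in the language of expr."""
    op = expr[0]
    if op == "seq":
        cur = set(starts)
        for e in expr[1:]:
            cur = dests(root, e, cur)
        return cur
    if op == "alt":
        return set().union(*(dests(root, e, starts) for e in expr[1:]))
    if op == "star":
        seen, frontier = set(starts), set(starts)
        while frontier:
            frontier = dests(root, expr[1], frontier) - seen
            seen |= frontier
        return seen
    return {a for a in (step(root, s, expr) for s in starts) if a is not None}


# H(first: (head: 11, tail: (head: 12, tail: (head: 13, tail: ()))))
lst = node("empty")
for value in (13, 12, 11):
    lst = node("nonempty", head=node(str(value)), tail=lst)
h = node("H", first=lst)
LAST = ("seq", ("down", "first"), ("star", ("down", "tail")), ("leaf",), ("up", None))

# D(next: (next: (next: ()))), with the prev expression of the first variant
d3 = node("D", next=node("D", next=node("D")))
PREV = ("alt", ("up", None),
        ("seq", ("root",), ("star", ("down", "next")), ("leaf",)))

last_dest = dests(h, LAST, {()})
prev_at_root = dests(d3, PREV, {()})
prev_inside = dests(d3, PREV, {("next",)})
```

In each case the result is a singleton set, which is the unique destination property required of a well-formed graph type: `last` leads to the final list cell, `prev` at the root wraps around to the last node, and `prev` elsewhere leads to the parent.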
At a first glance such specifications may seem daunting, but at least to the authors they quickly became familiar. The use of abbreviations, such as STEP and POST above, may improve legibility and promote reuse of routing expressions. Complicated pointer structures may give rise to complicated graph type specifications. However, it is fair to say that the complexity of the graph type specification correlates well with this inherent complexity, in the same way that a verbal or pictorial description would.

Not all families of graph shaped values can be specified by graph types. First of all, they must be deterministic, in the sense that all edges must be functions of some underlying spanning tree. This precludes such things as a pointer from the root to some node in the tree. But even all deterministic situations cannot be specified. Consider a generalized tableau structure on a grid, in which there must be an edge from a point to the one immediately below, if they are both present.

    [Figure: a generalized tableau structure on a grid]

A graph type cannot represent such graphs, since the variant at a given node is dependent on whether there is a downward pointing edge. Thus the variant is dependent on the rest of the graph, something we cannot specify in a context-free grammar.

5 Programming

So far, we have seen that many families of pointer structures can be captured as the values of graph types. We must also demonstrate that they can be used for programming in a manner similar to that for data types.

An obvious problem with having graph shaped values is that the recursive traversal may be problematic; how can we avoid cycles? However, for graph types we have the canonical spanning tree of the underlying data value. Thus, many of the simple techniques can be inherited in a straightforward manner. For example, the algorithm for comparing two graph values is exactly the same as for the underlying data values; the routing fields are just ignored.

The syntax for constants is also the same as for the underlying data type. The values of the routing fields are then computed automatically. The example values of the previous section are specified as constants as follows:

    H(first: (head: 11, tail: (head: 12, tail: (head: 13, tail: ()))))
    C(next: (next: (next: ())))
    D(next: (next: (next: ())))
    R(left: (left: (), right: ()), right: ())
    J(left: (left: (), right: ()), right: ())
    T(left: (left: (), right: ()), right: ())

Note that the expressions for the C- and D-values are identical, as are those for the R-, J-, and T-values.

Routing fields can be read just like data fields; they also point to subtrees of the canonical spanning tree. It is, of course, not possible to assign directly to a routing field.

Copying (sub)values happens in two steps. First, the underlying spanning tree is copied; second, the values of the routing fields must be reevaluated. Consider for example the leaf-to-root-linked tree. If a subtree is copied, then the leaves must now point to the new root of that tree.

If a data field in a graph value is assigned, then several routing fields in both the surrounding tree and the new graft may have to change. Consider for example the red-black leaf-linked trees. If a leaf is changed from red to black, then it must be removed from one cyclic list and inserted in another. A simple way of handling this is to reevaluate all routing fields, but that is undesirable since the surrounding tree may be large and the graft may be small. A similar problem exists for the swapping of subtrees. We must develop an algorithm for detecting the routing fields that are required to be updated.

In summary, many of the required algorithms are inherited from the underlying data structure. However, we must be able to evaluate all routing fields in only combined linear time, and for assignment we need to detect those routing fields that must be updated.

Evaluating Routing Fields

Backbones can clearly be constructed in linear time. Given a backbone, it is possible to evaluate all routing fields in combined linear time.
First, each routing expression in the graph type is translated into an equivalent nondeterministic automaton. This translation is linear. Next, a table is constructed that for each node α and for each automaton state q of each automaton A contains a pointer. Intuitively, if this pointer is not nil, it indicates a node β reachable from α by a sequence w of directives such that automaton A, upon reading w, may end up in a final state at node β. This table is calculated in linear time by an algorithm described in the appendix. When the table has been constructed, the destination of a routing field at α is given as the pointer found in an entry (α, q0) of the table, where q0 is an initial state of the automaton representing the routing expression.

Detecting Required Updates

Sometimes when a change occurs, it is sufficient to update routing fields for only a small part of the value. For example, this happens when swapping subtrees of values of type J, the type of leaf-linked binary trees. Consider the situation after the subtrees rooted at addresses α and β have been swapped:

    [Figure: a leaf-linked tree after the subtrees at α and β have been swapped]

Here only the next pointers at α′, α″, β′, and β″ need to be updated. If we assume that J is made doubly-linked by adding a routing field prev, it would often be less costly to locate the four nodes {α′, α″, β′, β″} after the change and reevaluate their next fields than evaluating all routing expressions in the backbone from scratch. In fact, with this approach we can guarantee that the time to locate fields in need of updating is proportional to the total length of the paths that lead to these fields, in this case of the paths from α to α′, from α to α″, from β to β′, and from β to β″.

To generate these paths, we consider each node incident on a backbone edge that changes (above, it would be α, β, and their parents). Each automaton state at such a node can be followed backwards (towards possible origins, that is, routing fields whose routes go through the node) and forwards (towards a possible destination). Above, this involves finding four destinations and four origins. For example, when considering α, we obtain two origins, the next fields of α′ and α″, and their corresponding destinations. We shall shortly see how further optimizations are possible.

Note, however, that for some graph types the number of paths to follow may be proportional to the size n of the value. This happens for example for the root linked trees of type R described earlier when a new root is added to an existing tree. In this case there is no gain in using the techniques described in this section compared to the algorithm for updating all routing fields.

Monadic Logic and Well-Formedness

The monadic second-order logic on graph types is a logical formalism that allows several important properties about graph types to be expressed. Our logic permits quantification over values of graph types, addresses, and sets of addresses. In section A4 of the appendix, we define the logic formally and show that it is decidable.

In this logic we can formulate questions such as "What is the type-variant of a node α in a value x?" or "Is there a walk in a value x from node α to node β according to a routing expression R?" The question of whether a graph type is well-formed can also be expressed in the logic, as it is shown in section A4 of the appendix. Thus this question is decidable. Similarly, questions about types, such as comparing values, Val G1 ⊆ Val G2, where G1 and G2 are graph types, are decidable.

Although much can be expressed in the monadic second-order logic on graph types, there are simple operations that cannot. For example, one cannot represent the result of replacing a subtree with another subtree (although certain properties of the result may be expressible).

Access Optimization

In the example of updating routing fields in leaf-linked trees, we saw that only four fields needed to be updated. It is not hard to see that calculating the destination of each such routing field is not necessary. For example, the new value of the next field at α′ is the old value of the next field at β′. Thus, when the four routing fields have been located, the updates can take place in constant time by properly permuting the values of known next pointers. Such use of the values of routing fields is called access optimization.
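The permutation of next pointers can be sketched directly on the cyclic leaf list. The Python fragment below is an illustration only: it models the leaves as names in a dictionary, assumes the four affected leaves have already been located, and assumes the two swapped leaf ranges are disjoint and not adjacent. The function names `cyclic_next` and `swap_by_permuting` are hypothetical.

```python
def cyclic_next(leaves):
    """Recompute every next field from scratch (the expensive baseline)."""
    return {leaf: leaves[(i + 1) % len(leaves)] for i, leaf in enumerate(leaves)}


def swap_by_permuting(nxt, pred_a, last_a, pred_b, last_b):
    """Update only four next fields after two leaf ranges have traded places.

    pred_a / pred_b are the leaves just before each range, and last_a / last_b
    are the final leaves of each range. Assumes the ranges are not adjacent.
    """
    nxt[pred_a], nxt[pred_b] = nxt[pred_b], nxt[pred_a]
    nxt[last_a], nxt[last_b] = nxt[last_b], nxt[last_a]


# Leaves in left-to-right order; the ranges [a1, a2] and [b1] are swapped.
before = ["x", "a1", "a2", "y", "b1", "z"]
after = ["x", "b1", "y", "a1", "a2", "z"]

nxt = cyclic_next(before)
swap_by_permuting(nxt, pred_a="x", last_a="a2", pred_b="y", last_b="b1")
```

After the call, `nxt` agrees with a full reevaluation on the swapped tree, yet only four entries were written and no routing expression was evaluated.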
The formal reasoning behind access optimization can be formulated in monadic logic. For example, the question "Is the value of the next field at α′ in the new graph the same as the value of the next field at β′ in the old graph?" can be expressed, and the answer "yes" can be computed.

In general, a strategy for access optimization is to compare values contained in nodes already located to the destination of paths that arise in the detection of required updates. This involves trying out different combinations of paths that are followed explicitly and testing whether other needed destinations or origins can be found in constant time. Thus one can formulate a minimization problem for finding the least number of paths that need to be followed in order to carry out an update, and this problem is decidable.

For doubly-linked lists of type D, such reasoning allows the automatic generation of optimal, constant-time code for concatenating lists, without having the programmer specify any pointer operations.

6 Related Work

Decidability of logics of graphs has been studied extensively; see [4] for references to the classical results that the monadic second order logic is decidable on finite trees and for extensions to more general graphs. The hyperedge-replacement grammars of [4] and similar context-free graph rewriting formalisms describe much larger classes of graphs than our graph types. An important result of [4] is that any property expressed in second-order monadic logic on graphs is decidable on hyperedge-replacement grammars. We could have used this result to derive our decidability result; but the translation into context-free graph grammars appears to be more mathematically complex than our approach. Although mathematically interesting, context-free graph grammars tend to be hard to understand; this is likely the reason why, to our knowledge, they have not been used for describing types in programming languages.

Closer in spirit to our approach are the feature grammars and algebras; see [5] for references. These formalisms are built on the view that features (corresponding to our record fields) are partial functions that identify attributes. Not being based on tree structures, features allow the description of self-referential data structures. As opposed to our approach, the values designated are not guided by any expressions.

The programming languages in [1, 2] and [3] use similar ideas and permit circular data structures. A restriction of this work is that such circular references may only point to nodes labeled syntactically with a marker. Since the number of markers is finite, this language precludes the modeling of e.g. doubly-linked lists or leaf-linked trees, but allows root-linked trees.

The ADDS notation in [6] allows the description of the main abstract properties of pointer structures through the concepts of dimensions and directions. The motivation is to make static analysis more feasible through (non-invasive) program annotations. With the ADDS notation one cannot specify the exact shape of values, and manipulations still rely on explicit pointer operations.

The techniques for evaluating routing fields are similar to algorithms for reevaluating attributed grammars [9], but to our knowledge the algorithms for updating a tree of a grammar whose attributes are nodes in the tree have not been described before.

Acknowledgments

Thanks to the anonymous referees for their helpful comments.

References

[1] H. Aït-Kaci and R. Nasr. Logic and inheritance. In Proc. 13th ACM Symp. on Princ. of Programming Languages, pages 219-228, 1986.

[2] H. Aït-Kaci and R. Nasr. Login: A logic programming language with built-in inheritance. Journal of Logic Programming, 3:185-215, 1986. Journal version of [1].

[3] H. Aït-Kaci and A. Podelski. Towards a meaning of life. In Jan Maluszyński and Martin Wirsing, editors, Proceedings of the 3rd International Symposium on Programming Language Implementation and Logic Programming (Passau, Germany), pages 255-274. Springer-Verlag, LNCS 528, August 1991.

[4] B. Courcelle. The monadic second-order logic of graphs I. Recognizable sets of finite graphs. Information and Computation, 85:12-75, 1990.

[5] J. Dörre and W.C. Rounds. On subsumption and semiunification in feature algebras. In Proc. IEEE Symp. on Logic in Computer Science, pages 300-310, 1990.

[6] L. Hendren, J. Hummel, and A. Nicolau. Abstractions for recursive pointer data structures: Improving the analysis and transformation of imperative programs. In Proc. SIGPLAN'92 Conference on Programming Language Design and Implementation, pages 249-260. ACM, 1992.
[7] C.A.R. Hoare. Recursive data structures. International Journal of Computer and Information Sciences, 4:2:105-132, 1975.

[8] Robin Milner, Mads Tofte, and Robert Harper. The Definition of Standard ML. MIT Press, 1990.

[9] T. Reps. Incremental evaluation for attribute grammars with unrestricted movement between tree modifications. Acta Informatica, 25, 1986.

[10] D.A. Turner. Miranda: A non-strict functional language with polymorphic types. In Proc. Conference on Functional Programming Languages and Computer Architecture, pages 1-16. Springer-Verlag (LNCS 201), 1985.

Appendix: Formal Definitions

This appendix contains the formal definitions of the concepts introduced. They may be used to elucidate and substantiate the contents of the preceding summary.

A1: Data Types

Associated with a data type \(D\) we have some notation. The main type is denoted \(\mathrm{Main}\,D\). By \(T_D\) we denote the set of types. By \(V_D\) we denote the set of all variants in \(D\); by \(V_D T\) we denote the set of variants of type \(T\). By \(F_D\) we denote the set of all data fields in \(D\); by \(F_D(T{:}v)\) we denote the set of data fields of type \(T\) and variant \(v\), i.e., for the type-variant declaration above, \(F_D(T{:}v) = \{a_1, \ldots, a_n\}\). By \(T_D(T{:}v)a\) we denote the type of the data field \(a\) in variant \(v\) of type \(T\), i.e., for the type-variant above, \(T_D(T{:}v)a_i = T_i\).

An address \(\alpha\) is an element of \(F_D^*\). The values of \(D\) form the set \(\mathrm{Val}\,D\) of partial functions \(x : F_D^* \rightharpoonup T_D \times V_D\) such that

• \(\mathrm{dom}\,x\) is finite and prefix closed;

• \(x(\epsilon) = (\mathrm{Main}\,D{:}v)\), for some \(v\); and

• for all \(\alpha \in \mathrm{dom}\,x\), if \(x(\alpha) = (T{:}v)\) then \(v \in V_D T\) and \(\alpha.a \in \mathrm{dom}\,x \Leftrightarrow a \in F_D(T{:}v) \wedge T_D(T{:}v)a = T'\), where \(x(\alpha.a) = (T'{:}v')\) for some \(v'\).

Intuitively, the addresses in \(\mathrm{dom}\,x\) serve as pointer values.

A2: Graph Types and Routing Expressions

While \(F_G\) still denotes all fields, we use \(F_G^d\) to denote the data fields, and \(F_G^r\) to denote the routing fields. We use the notation \(R_G(T{:}v)a\) to denote the routing expression associated with the routing field \(a\) in variant \(v\) of type \(T\).

The graph type has an underlying data type \(\mathrm{Data}\,G\) which is obtained by removing all the routing fields. The routing expressions must all be defined on \(\mathrm{Data}\,G\), as described below.

Given a data type \(D\), define the alphabet \(\Delta\) that consists of directives (letters) \(\text{\^{}}\); \(\$\); \(\uparrow\); \(\uparrow a\) and \(\downarrow a\), where \(a \in F_D\); \(T\) and \((T{:}v)\), where \(T \in T_D\) and \(v \in V_D T\). Given \(x \in \mathrm{Val}\,D\) we define the step relation \(\rightarrow_x\) on \(\mathrm{dom}\,x \times \Delta \times \mathrm{dom}\,x\) by the following transitions:

\[
\begin{array}{ll}
\epsilon \xrightarrow{\text{\^{}}}_x \epsilon & \\
\alpha \xrightarrow{\$}_x \alpha & \text{if } \alpha \text{ is a leaf in } x \\
\alpha \xrightarrow{T}_x \alpha & \text{if } x(\alpha) = (T{:}v) \text{ for some } v \\
\alpha \xrightarrow{(T:v)}_x \alpha & \text{if } x(\alpha) = (T{:}v) \\
\alpha.a \xrightarrow{\uparrow}_x \alpha & \\
\alpha.a \xrightarrow{\uparrow a}_x \alpha & \\
\alpha \xrightarrow{\downarrow a}_x \alpha.a &
\end{array}
\]

When \(\alpha \xrightarrow{d}_x \beta\), we say that \(\beta\) is reached from \(\alpha\) by directive \(d\). Note that \(\beta\) such that \(\alpha \xrightarrow{d}_x \beta\) is uniquely defined, if it exists, by the values of \(\alpha\) and \(d\).

A route \(p = d_1 \ldots d_n\) is a word over \(\Delta\). A walk along \(p\) in \(x\) from \(\alpha \in \mathrm{dom}\,x\) to \(\beta \in \mathrm{dom}\,x\) is the unique sequence, if it exists, \(\alpha_0, \ldots, \alpha_n\) such that \(\alpha_0 = \alpha\), \(\alpha_n = \beta\), and \(\alpha_{i-1} \xrightarrow{d_i}_x \alpha_i\) for all \(i\), \(1 \le i \le n\). The walk is denoted \(\alpha \xrightarrow{p}_x \beta\).

A routing expression \(R\) on \(D\) is a regular expression over \(\Delta\). We construct regular expressions using operators \(+\) (union), \(\cdot\) (concatenation), and \({}^*\) (iteration). The regular language defined by \(R\) is denoted \(L(R)\).

Given \(x\), \(R\) and an origin \(\alpha \in \mathrm{dom}\,x\), a destination is a \(\beta \in \mathrm{dom}\,x\) such that \(\alpha \xrightarrow{p}_x \beta\) for some route \(p \in L(R)\). The set of all destinations is denoted \(\mathrm{Dest}_x(R, \alpha)\). If this set is a singleton we say that \(R\) at \(\alpha\) in \(x\) has the unique destination property.

Intuitively, the routing expressions specify where the pointers in the routing fields should lead to. A graph type is only well-formed when all such expressions always have the unique destination property and always lead to subtrees of the specified types.
The values of a well-formed graph type \(G\) form the set \(\mathrm{Val}\,G\) of finite graphs. There is a graph for every value \(x\) in the underlying data type. Given \(x \in \mathrm{Val}\,\mathrm{Data}\,G\) we construct a graph whose nodes are \(\mathrm{dom}\,x\), the set of addresses in \(x\). The edges, which are labeled by field names, come in two flavors: data edges and routing edges. The data edges provide the canonical spanning tree (the backbone) and are defined as

\[
\{\, \alpha \xrightarrow{a} \alpha.a \mid \alpha.a \in \mathrm{dom}\,x \,\}.
\]

The routing edges are defined as

\[
\{\, \alpha \xrightarrow{a} \beta \mid a \in F_G^r\,x(\alpha),\ R_G\,x(\alpha)\,a = R,\ \mathrm{Dest}_x(R, \alpha) = \{\beta\} \,\}.
\]

In this graph, addresses in \(F_G^*\) (both data and routing fields) are defined.

A3: Evaluating Routing Fields

Here we give the details of the algorithm mentioned in Section 5. We are given a backbone \(x\) and a collection of nondeterministic finite-state automata representing all routing expressions in the graph grammar. For an automaton \(A\) with transition relation \(\rightarrow_A\) and a word \(w = d_0 \ldots d_n \in \Delta^*\), we write \(q \xrightarrow{w}_A q'\) to denote that there exist \(q_0, \ldots, q_{n+1}\) such that \(q_0 = q\), \(q_{n+1} = q'\), and \(q_0 \xrightarrow{d_0}_A q_1 \cdots \xrightarrow{d_n}_A q_{n+1}\).

Our goal is to build a table \(\mathit{Tbl}\) such that for each node \(\alpha\) in \(x\) and for each automaton \(A\) and each state \(q\) of \(A\), the value of \(\mathit{Tbl}(\alpha, q)\) is a node \(\beta\), if it exists, such that for some \(w \in \Delta^*\), \(\alpha \xrightarrow{w}_x \beta\) and \(q \xrightarrow{w}_A q_F\), where \(q_F\) is a final state of \(A\); if no such node exists then \(\mathit{Tbl}(\alpha, q) = \mathit{nil}\).

The algorithm below employs a queue \(Q\) to calculate \(\mathit{Tbl}\):

1. \(\mathit{Tbl}(\alpha, q) := \mathit{nil}\), for all nodes \(\alpha\) in \(x\) and all automata states \(q\)
2. make \(Q\) empty
3. for all \((\alpha, q)\), where \(q\) is a final state:
   (a) \(\mathit{Tbl}(\alpha, q) := \alpha\)
   (b) insert \((\alpha, q)\) in \(Q\)
4. while \(Q\) is non-empty:
   (a) delete an element \((\alpha, q)\) from \(Q\)
   (b) for all \((\beta, q')\) such that \(\mathit{Tbl}(\beta, q') = \mathit{nil}\) and for some \(d\), \(q' \xrightarrow{d}_A q\) and \(\beta \xrightarrow{d}_x \alpha\):
       i. \(\mathit{Tbl}(\beta, q') := \mathit{Tbl}(\alpha, q)\)
       ii. insert \((\beta, q')\) in \(Q\)

Note that each entry \((\alpha, q)\) is considered only at most once and that Step 4.(b) involves only the node \(\alpha\) and its immediate neighbors, thus a number of nodes that depends on the grammar only. We conclude that the algorithm runs in linear time as a function of the size of \(x\).

With the well-formedness criterion it is not hard to see that the destination of a routing field at \(\alpha\) is the node \(\beta\) if and only if there exists an initial state \(q\) of the corresponding automaton such that \(\mathit{Tbl}(\alpha, q) = \beta\).

A4: The Monadic Logic of Graph Types

The monadic second-order logic of graph types, denoted M2LGT, is used to express certain properties of graph types. We first introduce a simpler logic, monadic second-order logic of data types, denoted M2LDT.

Fix a data type \(D\). We define the M2LDT on \(D\) as follows. There are two kinds of second-order variables, value variables and address set variables. A value variable \(x\) denotes a value of \(D\). An address set variable \(M\) denotes a set of addresses of \(D\). Such variables can be combined with \(\cup\), \(\cap\) and \(\emptyset\) to form address set expressions. The set of addresses of \(x\) is denoted \(\mathrm{dom}\,x\), which is also a set expression. A first-order variable \(\alpha\), also called an address variable, denotes an address of \(D\). That \(\alpha\) is an address in \(M\) is expressed as the formula \(\alpha \in M\).

A value variable \(x\) of type \(D\) is introduced by an existential quantification \(\exists_D x\) or a universal quantification \(\forall_D x\). Variables that denote addresses or sets of addresses are introduced by usual existential (\(\exists\)) or universal (\(\forall\)) quantification.

The formulas of the logic are obtained by combining quantification, \(\wedge\) (and), \(\vee\) (or), \(\neg\) (negation) with the following basic formulas:

\[
\begin{array}{ll}
\mathrm{is}\,\text{\^{}}(\alpha) & \alpha = \epsilon \\
\mathrm{is}_x\,\$(\alpha) & \alpha \text{ is a leaf in } x \\
\mathrm{is}_x\,(T{:}v)(\alpha) & x(\alpha) = (T{:}v) \\
\mathrm{is}_x\,T(\alpha) & x(\alpha) = (T{:}v) \text{ for some } v \\
\mathrm{is}_x\,\mathrm{walk}(\alpha, R, \beta) & \exists p \in L(R) : \alpha \xrightarrow{p}_x \beta \\
\alpha = \beta & \\
\mathcal{E}_1 = \mathcal{E}_2 & \\
\mathcal{E}_1 \subseteq \mathcal{E}_2 & \\
\alpha \in \mathcal{E} & \\
\alpha = \beta.a & \\
\alpha = \beta ; a & a \in F_D\,x(\beta) \text{ and } \alpha = \beta.a
\end{array}
\]

where \(\mathcal{E}\) and the \(\mathcal{E}_i\)'s are address set expressions. The formulas have the obvious meanings, e.g. \(\mathrm{is}_x\,(T{:}v)(\alpha)\) is true iff the type-variant at address \(\alpha\) in \(x\) is \((T{:}v)\).
204
The formulaa = /3; a is true if a = ~.a field of the type variant at a in z.
and a is a
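The table-filling steps can be read as a backward breadth-first search over pairs (node, state). The following Python sketch illustrates this reading under stated assumptions; it is not the authors' implementation. The encodings are hypothetical: `moves` holds triples `(beta, d, alpha)` for $\beta \Rightarrow^{d}_{x} \alpha$, `delta` holds triples `(q_prev, d, q)` for $q' \to^{d}_{A} q$, and `None` stands for nil. The sketch enqueues every newly filled pair so that it is processed in turn.

```python
from collections import deque


def build_table(nodes, moves, states, delta, finals):
    """Fill Tbl by searching backwards from the final states.

    nodes  -- backbone nodes (hypothetical encoding: any hashable values)
    moves  -- triples (beta, d, alpha): directive d leads from beta to alpha
    states -- automaton states
    delta  -- triples (q_prev, d, q): the automaton steps from q_prev to q on d
    finals -- final states of the automaton
    """
    # Index both relations by their target so predecessors are cheap to find.
    node_preds = {}
    for beta, d, alpha in moves:
        node_preds.setdefault(alpha, []).append((d, beta))
    state_preds = {}
    for q_prev, d, q in delta:
        state_preds.setdefault((d, q), []).append(q_prev)

    # Steps 1 and 2: every entry starts as nil, the queue starts empty.
    tbl = {(a, q): None for a in nodes for q in states}
    queue = deque()

    # Step 3: in a final state the empty word suffices, so a node reaches itself.
    for a in nodes:
        for q in finals:
            tbl[(a, q)] = a
            queue.append((a, q))

    # Step 4: propagate known destinations to predecessor pairs.
    while queue:
        alpha, q = queue.popleft()
        for d, beta in node_preds.get(alpha, []):
            for q_prev in state_preds.get((d, q), []):
                if tbl[(beta, q_prev)] is None:
                    tbl[(beta, q_prev)] = tbl[(alpha, q)]
                    queue.append((beta, q_prev))
    return tbl


# Illustrative input: a three-cell list 0 -> 1 -> 2 and an automaton
# accepting exactly the word "next next".
nodes = [0, 1, 2]
moves = {(0, "next", 1), (1, "next", 2)}
states = ["s0", "s1", "s2"]
delta = {("s0", "next", "s1"), ("s1", "next", "s2")}
finals = {"s2"}

tbl = build_table(nodes, moves, states, delta, finals)
```

With this input, `tbl[(0, "s0")]` is `2`, because two `next` steps from node 0 end in node 2, while `tbl[(1, "s0")]` stays `None`, because node 1 has only one successor left. Each pair enters the queue at most once, so the work is bounded by the number of pairs times the number of matching predecessor steps.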
Expressing well-formedness

The following formula in M2LDT expresses that a graph type $G$ is well-formed:

[formula not recoverable from the source]

where $D = \mathrm{Data}\,G$ is the underlying data type; $\exists!$ is an abbreviation for "there exists a unique"; and AND is an abbreviation expressing the conjunction obtained by expanding over the corresponding indices.

Decidability of M2LDT

Theorem 1 M2LDT is decidable.

Proof M2LDT is decidable by an easy reduction to M2LkSFT, the monadic second-order logic of $k$ successors on finite trees. The latter logic has set variables, such as $X$, denoting subsets of $\{1, \ldots, k\}^*$, and first-order variables, such as $\alpha$, denoting elements of $\{1, \ldots, k\}^*$. In addition there is a successor function $.j$ for each $j \in \{1, \ldots, k\}$, and connectives and quantifiers as above.

We will indicate how formulas of M2LDT involving a data type $D$ can be translated into M2LkSFT. We let $k$ be $|F_D|$, the number of different field names in $D$, and we rename field names as $1, \ldots, k$. An $x$ in $D$ introduced by a quantifier $\exists_D x : \phi$ is translated into the formula $\exists X^{d}, X^{T}_{1}, \ldots, X^{T}_{n_1}, X^{V}_{1}, \ldots, X^{V}_{n_2} : g \wedge \phi'$, where $X^{d}$ expresses $\mathrm{dom}\,x$; $X^{T}_{1}, \ldots, X^{T}_{n_1}$ express the type at position $\alpha$ by the bit pattern $(\alpha \in X^{T}_{1}, \ldots, \alpha \in X^{T}_{n_1})$ (here $n_1 = \log|T_D|$); $X^{V}_{1}, \ldots, X^{V}_{n_2}$ express the variant at position $\alpha$ by the bit pattern $(\alpha \in X^{V}_{1}, \ldots, \alpha \in X^{V}_{n_2})$ (here $n_2 = \log|V_D|$); $\phi'$ is the translation of $\phi$; and $g$ is a formula expressing that $x$ is a value of $D$ according to the conditions on derivation trees given in Section 1. Address set variables are simply translated into set variables, and address variables into first-order variables.

Most of the basic formulas are now easy to express. For example, $\alpha = \beta;a$ is translated into $\alpha = \beta.a \wedge \alpha \in X^{d}$; this formula is equivalent to $\alpha = \beta.a \wedge a \in F^{D}_{x}(\beta)$ since $x \in \mathrm{Val}\,D$.

The basic formula $\mathrm{is}_x\mathrm{walk}(\alpha, \beta, R)$ is more difficult. Here we encode the working of $A_R$, the automaton equivalent to $R$, on $x$ by a formula that guesses, for each address, the subset of states that are accessible by a partial run (which is like a run except that the last state need not be final) starting at $\alpha$. This collection of subsets can be coded using $|A_R|$ set variables. We must then write an M2LkSFT formula expressing that all states in a subset have a predecessor under the transition relation (unless the state is initial and in the subset at $\alpha$). This condition alone is not sufficient. We must also write down a condition that ensures that the collection of subsets is minimal with respect to the previous condition; technically, we are calculating a least fixed-point in order to ensure that all states are reachable from initial states at $\alpha$. The details of this translation are omitted. □

Logic of graph types

The monadic second-order logic of graph types, M2LGT, has the same syntax as M2LDT.

Theorem 2 M2LGT is decidable.

Proof The translation into M2LkSFT differs only for the formula $\alpha = \beta;a$. If $a$ is a routing field of the variant at $\beta$ in $x$, then the translation must express that $\mathrm{is}_x\mathrm{walk}(\beta, \alpha, R)$ holds, where $R$ is the routing expression associated with $a$ at $\beta$. We omit the details. □
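Both decidability proofs rest on the predicate $\mathrm{is}_x\mathrm{walk}(\alpha, \beta, R)$. Operationally, the least fixed-point mentioned in the proof of Theorem 1 is forward reachability over pairs (address, state). The Python sketch below illustrates that reading and makes no claim about the logical encoding itself. The names `moves`, `delta`, `initials` and `finals` are hypothetical encodings of the backbone steps and of the automaton $A_R$.

```python
def is_walk(alpha, beta, moves, delta, initials, finals):
    """Decide whether some word accepted by the automaton leads from alpha to beta.

    moves    -- triples (src, d, dst): directive d leads from src to dst
    delta    -- triples (q, d, q_next): the automaton steps from q to q_next on d
    initials -- initial states of the automaton
    finals   -- final states of the automaton
    """
    succ = {}
    for src, d, dst in moves:
        succ.setdefault(src, []).append((d, dst))
    trans = {}
    for q, d, q_next in delta:
        trans.setdefault((q, d), []).append(q_next)

    # Least fixed point: start with the initial states at alpha and close the
    # set under simultaneous backbone and automaton steps. Grouped by address,
    # `reached` is the collection of state subsets accessible by partial runs.
    reached = {(alpha, q) for q in initials}
    frontier = list(reached)
    while frontier:
        node, q = frontier.pop()
        for d, nxt in succ.get(node, []):
            for q_next in trans.get((q, d), []):
                if (nxt, q_next) not in reached:
                    reached.add((nxt, q_next))
                    frontier.append((nxt, q_next))

    # The walk exists iff a final state is reachable at beta.
    return any((beta, q) in reached for q in finals)


# Illustrative input: a three-cell list 0 -> 1 -> 2 and an automaton for next*.
list_moves = {(0, "next", 1), (1, "next", 2)}
star_delta = {("s", "next", "s")}
```

For this input, `is_walk(0, 2, list_moves, star_delta, {"s"}, {"s"})` holds, while the same query from 2 to 0 does not, since the list offers no backward step.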

