
Graph Types

Nils Klarlund* & Michael I. Schwartzbach†
{klarlund,mis}@daimi.aau.dk
Aarhus University, Department of Computer Science,
Ny Munkegade, DK-8000 Århus, Denmark

Abstract

Recursive data structures are abstractions of simple records and pointers. They impose a shape invariant, which is verified at compile-time and exploited to automatically generate code for building, copying, comparing, and traversing values without loss of efficiency. However, such values are always tree shaped, which is a major obstacle to practical use.

We propose a notion of graph types, which allow common shapes, such as doubly-linked lists or threaded trees, to be expressed concisely and efficiently. We define regular languages of routing expressions to specify relative addresses of extra pointers in a canonical spanning tree. An efficient algorithm for computing such addresses is developed. We employ a second-order monadic logic to decide well-formedness of graph type specifications. This logic can also be used for automated reasoning about pointer structures.

1 Introduction

Recursive data types are abstractions of structures built from simple records and pointers. The values of a recursive data type form a set of pointer structures that all obey a common shape invariant. The advantage of this approach is twofold:

• validity of the invariant can be statically verified at compile-time, which contributes to the correctness of programs; and

• the invariant can be exploited to automatically generate code for such tasks as copying, comparing, and traversing values.

Recursive data types originate from the seventies [7] and have become ubiquitous in modern typed functional languages such as ML [8] and MIRANDA [10], but they may also be employed in PASCAL-like imperative languages. Their benefits are substantial, but they also impose limitations; in particular, the values of recursive data types will always be tree shaped. In this paper we present a natural generalization, graph types, which allows a large variety of graph shaped values, including (doubly-chained) cyclic lists, leaf-to-root-linked trees, leaf-linked trees, and threaded trees.

The key idea is to allow only graphs with a backbone, which is a canonical spanning tree. All extra edges must depend functionally on this backbone. The extra edges are specified by a language of regular routing expressions, which give relative addresses within the backbone. We show that construction of such graph values, along with all relevant manipulations, can happen efficiently in linear time. We introduce a decidable monadic logic of graph types, which allows automatic derivation of some constant time operations such as concatenation of doubly-linked lists.

There have been other attempts to describe graph-shaped values. Our proposal, however, allows exact descriptions of a more general class of types, and it does so using an intuitive notation that is very close to existing concepts in programming languages.

This summary is kept in an informal, explanatory style. Formal definitions and algorithms are included in the appendix.

* The author is supported by a fellowship from the Danish Research Council.
† The author is partially supported by the Danish Research Council, DART Project (5.21.08.03).

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
ACM-20th PoPL-1/93-S.C., USA
© 1993 ACM 0-89791-561-5/93/0001/0196...$1.50

2 Data Types

For this presentation, a (recursive) data type D is a special kind of tree grammar. The non-terminals are called types. There is a distinguished main type, which

in examples is always the one mentioned first; the others are merely auxiliary. A production

    T → v(a1: T1, ..., an: Tn)

of D, where T and the Ti's are types, declares a variant v of type T containing data fields named a1, ..., an; we say that the production declares a type-variant (T: v). For each type, the possible variants must be mutually distinct; thus (T: v) uniquely determines the production. Moreover, for each type-variant, the data fields must be mutually distinct.

The values of a data type are essentially the derivation trees of the underlying context-free grammar, starting with the main type. They are implemented as pointer trees, but the programmer will never directly manipulate these pointers. Each node of such a pointer tree is an instance of a variant of a type. A formal definition of the values of a data type is given in section A1 of the appendix.

As a simple example, consider the following data type, which specifies a type of simple integer lists

    L → nonempty(head: Int, tail: L)
      → empty()

We can think of the type Int as being a data type specified as

    Int → 0() | 1() | 2() | ...

We allow implicit variants as a form of syntactic sugar. If the sets of data fields are distinct for all variants, then the explicit variants are not needed; we may think of the names as being a concatenation of the field names. Thus, we may instead write

    L → (head: Int, tail: L)
      → ()

Programming with Data Types

When a data type has been specified, it gives rise to a number of operations in the programming language. First of all, there is a language for denoting constant values. For the above lists, one may write down

    L(head: 11, tail: (head: 12, tail: (head: 13, tail: ())))

for the list of type L with elements 11, 12, and 13. If x is a variable containing a value of type L, then x.tail.tail.head specifies the address of a subtree, in this case of type Int. In a functional language this would always denote the corresponding value; in an imperative language there is the usual distinction between l- and r-values. The comparison x = y is always defined for two values of type L. If x is a value of type L, then the boolean expression is(x,v) yields true exactly when x is of variant v. In an imperative language, the value assignment x := y is present, possibly accompanied by the swap x :=: y which exchanges two subtrees without copying. Values of data types are traversed by recursive functions or procedures. Thus, explicit pointers are never used.

There is no intrinsic loss of efficiency in this approach. Constants can be built, copied, compared, and traversed in optimal linear time, and addresses are accessed in constant time. Thus, if one really wants tree shaped values, then only advantages are to be seen.

Shortcomings of Data Types

The main draw-back of data types is the limited shapes of values that they allow. For the above simple lists, values always look as follows (an empty record is pictured as a ground symbol)

[Figure: the list 11, 12, 13 as a chain of records ending in an empty record]

However, it is a common optimization to want an extra pointer to gain constant time access to the last element of the list. Thus, the values should instead have the following shape

[Figure: the same list with an extra pointer to the last element]

These are not trees and, hence, cannot be specified by data types. Until now, there has been no solution to this problem. The only possibility has been to revert to the often perilous use of explicit pointers.

3 Graph Types

We introduce the notion of graph types, which form a conceptually simple extension of data types. They allow graph shaped values while retaining the efficiency and ease of use. There are two key insights to our solution:

• while being graphs, the values all have a backbone, which is a canonical spanning tree; and

• the remaining edges are all functionally determined by this backbone.

Many, but not all, sets of graphs fit this mold; we give examples of both kinds.

A graph type extends a data type by having routing fields as well as data fields. Productions now look like

    T → v(... ai: Ti ... aj: Tj[R] ...)

Here ai is a normal data field but aj is a routing field. It is distinguished by having an associated routing expression R. A graph type has an underlying data type, which is obtained by removing the routing fields. The backbones of the graph type values are simply the values of this data type. Routing expressions describe relative addresses within the backbone. The complete graph type value is obtained by using the routing expressions to evaluate the destinations of the routing fields.

Routing expressions are regular expressions over a language of directives, which describe navigation within a backbone. Directives include move up to the parent (from a specific child) (↑ or ↑a), move down to a specific child (↓a), and verify a property of the current node, where properties include this is the root (^), this is a leaf ($), and this is (a specific variant of) a specific type (T or (T: v)). A routing expression defines the destination indicated by the corresponding routing field if its regular language contains precisely one sequence of successful directives leading to a node in the tree. A graph type is well-formed if every routing expression always defines a unique destination. Section A2 of the appendix gives formal definitions of these concepts.

To make a convincing case for this new mechanism, we need to demonstrate the following facts:

• many useful families of structures can be easily specified;

• values can be manipulated at run-time similarly to values of data types, and without loss of efficiency; and

• well-formedness of graph type specifications can be decided at compile-time.

4 Examples

We now show that many common pointer structures have simple specifications as graph types. The examples are all well-formed, which can be easily seen in each case. In pictures of values, we use the convention that pointers from data fields are solid, whereas those from routing fields are dashed. The root of the underlying spanning tree, or backbone, is indicated by a solid pointer with no origin.

The list with a pointer to the last element looks like

    H → (first: L, last: L[↓first ↓tail* $ ↑])
    L → (head: Int, tail: L)
      → ()

A typical value is

[Figure: a typical value of type H; the dashed last pointer leads to the final list element]

The routing expression ↓first ↓tail* $ ↑ for the last field contains the following directives: move down along the first pointer (↓first); follow the tail pointers until a leaf is reached (↓tail* $); then back up once (↑). This is the destination of the last pointer.

A cyclic list looks like

    C → (next: C)
      → (next: C[↑* ^])

A typical value is

[Figure: a typical value of type C; the dashed next pointer of the last element leads back to the root]

The routing expression contains the following directives: move up to the root.

A doubly-linked cyclic list looks like

    D → (next: D, prev: D[↑ + ^ ↓next* $])
      → (next: D[↑* ^], prev: D[↑ + ^])

A typical value is

[Figure: a typical value of type D with solid next pointers and dashed prev and next pointers]

Directives are more complicated here; they use the nondeterministic union operator on regular expressions (+) to express context-dependent choices. For example, consider the prev field of the first variant. According to the routing expression ↑ + ^ ↓next* $ of this field, we must either move up, or, if we are at the root, follow next pointers to the leaf.

A binary tree in which all leaves are linked to the root looks like

    R → (left, right: R)
      → (root: R[↑* ^])

A typical value is

[Figure: a typical value of type R; every leaf has a dashed root pointer to the root]

A binary tree in which all the leaves are joined in a cyclic list looks like

    J → (left, right: J)
      → (next: J[STEP $])

where STEP abbreviates ↑right* (↑left ↓right + ^) ↓left*. A typical value is

[Figure: a typical value of type J; the leaves are joined by dashed next pointers]

A binary tree with red or black leaves, in which those of the same color are joined in a cyclic list, looks like

    K → (left, right: K)
      → red(next: K[BLACK* RED])
      → black(next: K[RED* BLACK])

where RED abbreviates STEP (K: red) and BLACK abbreviates STEP (K: black). We shall abstain from showing a typical value of this type.

Finally, a binary tree in which all nodes are threaded cyclically in post-order looks like

    T → (left, right: T, post: T[POST])
      → (post: T[POST])

where POST abbreviates ↑right + ↑left ↓right ↓left* $ + ^ ↓left* $. A typical value is

[Figure: a typical value of type T; dashed post pointers thread the nodes in post-order]
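As an illustration of how the directives of the first example navigate a backbone, the following is a minimal Python sketch that evaluates ↓first ↓tail* $ ↑ on the list 11, 12, 13. The `Node` class, the tuple encoding of directives, and the bounded unrolling of the iteration are assumptions made for this sketch; they are not the evaluation algorithm of the paper, which is automaton based.

```python
class Node:
    """A backbone node: a record whose data fields point to child nodes."""
    def __init__(self, **fields):
        self.fields = fields      # data field name -> child Node
        self.parent = None        # set when the node is placed in a record
        self.value = None         # only used for Int leaves
        for child in fields.values():
            child.parent = self

def step(node, directive):
    """Apply one directive; return the node reached, or None if it fails."""
    kind, arg = directive
    if kind == "down":            # move down to the child in field arg
        return node.fields.get(arg)
    if kind == "up":              # move up to the parent
        return node.parent
    if kind == "leaf":            # the $ directive: succeed only at a leaf
        return node if not node.fields else None
    if kind == "root":            # the ^ directive: succeed only at the root
        return node if node.parent is None else None
    raise ValueError(kind)

def walk(node, route):
    """Follow a whole route; None means some directive failed."""
    for directive in route:
        node = step(node, directive)
        if node is None:
            return None
    return node

def last_destinations(h, max_len):
    """All destinations of  down-first down-tail* $ up,  unrolling * up to max_len."""
    found = []
    for k in range(max_len + 1):
        route = ([("down", "first")] + [("down", "tail")] * k
                 + [("leaf", None), ("up", None)])
        dest = walk(h, route)
        if dest is not None:
            found.append(dest)
    return found

def int_leaf(n):
    leaf = Node()
    leaf.value = n
    return leaf

def cons(n, tail):
    return Node(head=int_leaf(n), tail=tail)

# Backbone of H(first: (head: 11, tail: (head: 12, tail: (head: 13, tail: ()))))
h = Node(first=cons(11, cons(12, cons(13, Node()))))
dests = last_destinations(h, 10)
```

Only the unrolling with three tail steps succeeds, so `dests` holds exactly one node, the record whose head is 13. This matches the requirement that a routing expression defines a unique destination.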

At a first glance such specifications may seem daunting, but at least to the authors they quickly became familiar. The use of abbreviations, such as STEP and POST above, may improve legibility and promote reuse of routing expressions. Complicated pointer structures may give rise to complicated graph type specifications. However, it is fair to say that the complexity of the graph type specification correlates well with this inherent complexity, in the same way that a verbal or pictorial description would.

Not all families of graph shaped values can be specified by graph types. First of all, they must be deterministic, in the sense that all edges must be functions of some underlying spanning tree. This precludes such things as a pointer from the root to some node in the tree. But even all deterministic situations cannot be specified. Consider a generalized tableau structure on a grid, in which there must be an edge from a point to the one immediately below, if they are both present.

[Figure: a tableau structure on a grid]

A graph type cannot represent such graphs, since the variant at a given node is dependent on whether there is a downward pointing edge. Thus the variant is dependent on the rest of the graph, something we cannot specify in a context-free grammar.

5 Programming

So far, we have seen that many families of pointer structures can be captured as the values of graph types. We must also demonstrate that they can be used for programming in a manner similar to that for data types.

An obvious problem with having graph shaped values is that the recursive traversal may be problematic; how can we avoid cycles? However, for graph types we have the canonical spanning tree of the underlying data value. Thus, many of the simple techniques can be inherited in a straightforward manner. For example, the algorithm for comparing two graph values is exactly the same as for the underlying two data values; the routing fields are just ignored.

The syntax for constants is also the same as for the underlying data type. The values of the routing fields are then computed automatically. The example values of the previous section are specified as constants as follows:

    H(first: (head: 11, tail: (head: 12, tail: (head: 13, tail: ()))))
    C(next: (next: (next: ())))
    D(next: (next: (next: ())))
    R(left: (left: (), right: ()), right: ())
    J(left: (left: (), right: ()), right: ())
    T(left: (left: (), right: ()), right: ())

Note that the expressions for the C- and D-values are identical, as are those for the R-, J-, and T-values.

Routing fields can be read just like data fields; they also point to subtrees of the canonical spanning tree. It is, of course, not possible to assign directly to a routing field.

Copying (sub)values happens in two steps. First, the underlying spanning tree is copied; second, the values of the routing fields must be reevaluated. Consider for example the leaf-to-root-linked tree. If a subtree is copied, then the leaves must now point to the new root of that spanning tree.

If a data field in a graph value is assigned, then several routing fields in both the surrounding tree and the new graft may have to change. Consider for example the red-black leaf-linked trees. If a leaf is changed from red to black, then it must be removed from one cyclic list and inserted in another. A simple way of handling this is to reevaluate all routing fields, but that is undesirable since the surrounding tree may be large and the graft may be small. A similar problem exists for the swapping of subtrees. We must develop an algorithm for detecting the routing fields that are required to be updated.

In summary, many of the required algorithms are inherited from the underlying data structure. However, we must be able to evaluate all routing fields in only combined linear time, and for assignment we need to detect those routing fields that must be updated.

Evaluating Routing Fields

Backbones can clearly be constructed in linear time. Given a backbone, it is possible to evaluate all routing fields in combined linear time.

First, each routing expression in the graph type is translated into an equivalent nondeterministic automaton. This translation is linear. Next, a table is constructed that for each node α and for each automaton state q of each automaton A contains a pointer. Intuitively, if this pointer is not nil, it indicates a node β reachable from α by a sequence w of directives such that upon reading w, automaton A may end up in a final state at node β. This table is calculated in linear time by an algorithm described in the appendix. When the table has been constructed, the destination of a routing field at α is given as the pointer found in entry (α, q0) of the table, where q0 is an initial state of the automaton representing the routing expression.

Detecting Required Updates

Sometimes when a change occurs, it is sufficient to update routing fields for only a small part of the value. For example, this happens when swapping subtrees of values of type J, the type of leaf-linked binary trees. Consider the situation after the subtrees rooted at addresses α and β have been swapped:

[Figure: a leaf-linked tree after the subtrees rooted at α and β have been swapped]

Here only the next pointers at α', α'', β', and β'' need to be updated. If we assume that J is made doubly-linked by adding a field prev: J[...], it would often be less costly to locate the four nodes {α', α'', β', β''} after the change and reevaluate their next fields than evaluating all routing expressions in the backbone from scratch. In fact, with this approach we can guarantee that the time to locate fields in need of updating is proportional to the total length of the paths that lead to these fields, in this case of the paths from α to α', from α to α'', from β to β', and from β to β''.

To generate these paths, we consider each node incident on a backbone edge that changes (above, it would be α, β, and their parents). Each automaton state at such a node can be followed backwards, towards possible origins (routing fields whose routes go through the node), and forwards, towards a possible destination. Above, this involves finding four destinations and four origins. For example, when considering α, we obtain two origins, the next fields of α' and α'', and their corresponding destinations.

We shall shortly see how further optimizations are possible. Note, however, that for some graph types the number of paths to follow may be proportional to n. This happens for example for the root linked trees of type R described earlier when a new root is added to an existing tree. In this case there is no gain in using the techniques described in this section compared to the algorithm for updating all routing fields.

Monadic Logic and Well-Formedness

The monadic second-order logic on graph types is a logical formalism that allows several important properties about graph types to be expressed. In section A4 of the appendix, we define the logic formally and show that it is decidable.

Our logic permits quantification over values of graph types, addresses, and sets of addresses. In this logic we can formulate questions such as "What is the type-variant of a node α in a value x?" or "Is there a walk in a value x from node α to node β according to a routing expression R?" Thus the question of whether a graph type is well-formed can also be expressed in the logic, as is shown in section A4 of the appendix. Thus this question is decidable. Similarly, questions about types, such as comparing values, Val G1 ⊆ Val G2, where G1 and G2 are graph types, are decidable.

Although much can be expressed in the monadic second-order logic on graph types, there are simple operations that cannot. For example, one cannot represent the result of replacing a subtree with another subtree (although certain properties of the result may be expressible).

Access Optimization

In the example of updating routing fields in leaf-linked trees, we saw that only four fields needed to be updated. It is not hard to see that calculating the destination of each such routing field is not necessary. For example, the new value of the next field at α' is the old value of the next field at β'. Thus, when the four routing fields have been located, the updates can take place in constant time by properly permuting the values of known next pointers. Such use of the values of routing fields is called access optimization.

The formal reasoning behind access optimization can be formulated in monadic logic. For example the question "Is the value of the next field at α' in the new graph the same as the value of the next field at β' in the old graph?" can be expressed, and the answer yes can be computed.

In general, a strategy for access optimization is to compare values contained in nodes already located to the destination of paths that arise in the detection of required updates. This involves trying out different combinations of paths that are followed explicitly and testing whether other needed destinations or origins can be found in constant time. Thus one can formulate a minimization problem for finding the least number of paths that need to be followed in order to carry out an update, and this problem is decidable.

For doubly-linked lists of type D, such reasoning allows the automatic generation of optimal, constant-time code for concatenating lists, without the programmer having to specify any pointer operations.

6 Related Work

Decidability of logics of graphs has been studied extensively; see [4] for references to the classical results that the monadic second order logic is decidable on finite trees and for extensions to more general graphs. The similar graph rewriting formalisms, hyperedge-replacement grammars and context-free graph grammars of [4], describe much larger classes of graphs than our graph types. An important result of [4] is that any property expressed in second-order monadic logic on graphs is decidable on hyperedge-replacement grammars. We could have used this result to derive our decidability result; but the translation into context-free graph grammars appears to be more complex than our approach. Although mathematically interesting, context-free graph grammars tend to be hard to understand; this is likely the reason why, to our knowledge, they have not been used for describing types in programming languages.

Closer in spirit to our approach are the feature grammars and algebras; see [5] for references. These formalisms are built on the view that features (corresponding to our record fields) are partial functions that identify attributes. Not being based on tree structures, features allow the description of self-referential data structures. As opposed to our approach, the values designated are not guided by any expressions.

The programming languages in [1, 2] and [3] use similar ideas and permit circular data structures. A restriction of this work is that such circular references may only point to nodes labeled syntactically with a marker. Since the number of markers is finite, this language precludes the modeling of e.g. doubly-linked lists or leaf-linked trees, but allows root-linked trees.

The ADDS notation in [6] allows the description of the main abstract properties of pointer structures through concepts of dimensions and directions. The motivation is to make static analysis more feasible through (non-invasive) program annotations. With the ADDS notation one cannot specify the exact shape of values, and manipulations still rely on explicit pointer operations.

The techniques for evaluating routing fields are similar to algorithms for reevaluating attributed grammars [9], but to our knowledge the algorithms for updating a tree of a grammar whose attributes are nodes in the tree have not been described before.

Acknowledgments

Thanks to the anonymous referees for their helpful comments.

References

[1] H. Aït-Kaci and R. Nasr. Logic and inheritance. In Proc. 13th ACM Symp. on Princ. of Programming Languages, pages 219-228, 1986.

[2] H. Aït-Kaci and R. Nasr. Login: A logic programming language with built-in inheritance. Journal of Logic Programming, 3:185-215, 1986. Journal version of [1].

[3] H. Aït-Kaci and A. Podelski. Towards a meaning of life. In Jan Maluszyński and Martin Wirsing, editors, Proceedings of the 3rd International Symposium on Programming Language Implementation and Logic Programming (Passau, Germany), pages 255-274. Springer-Verlag, LNCS 528, August 1991.

[4] B. Courcelle. The monadic second-order logic of graphs I. Recognizable sets of finite graphs. Information and Computation, 85:12-75, 1990.

[5] J. Dörre and W.C. Rounds. On subsumption and semiunification in feature algebras. In Proc. IEEE Symp. on Logics in Computer Science, pages 300-310, 1990.

[6] L. Hendren, J. Hummel, and A. Nicolau. Abstractions for recursive pointer data structures: Improving the analysis and transformation of imperative programs. In Proc. SIGPLAN'92 Conference on Programming Language Design and Implementation, pages 249-260. ACM, 1992.

[7] C.A.R. Hoare. Recursive data structures. International Journal of Computer and Information Sciences, 4:2:105-132, 1975.

[8] Robin Milner, Mads Tofte, and Robert Harper. The Definition of Standard ML. MIT Press, 1990.

[9] T. Reps. Incremental evaluation for attribute grammars with unrestricted movement between tree modifications. Acta Informatica, 25, 1986.

[10] D.A. Turner. Miranda: A non-strict functional language with polymorphic types. In Proc. Conference on Functional Programming Languages and Computer Architecture, pages 1-16. Springer-Verlag (LNCS 201), 1985.

Appendix: Formal Definitions

This appendix contains the formal definitions of the concepts introduced. They may be used to elucidate and substantiate the contents of the preceding summary.

A1: Data Types

Associated with a data type D we have some notation. The main type is denoted Main D. By T_D we denote the set of types. By V_D we denote the set of all variants in D; by V_D T we denote the set of variants of type T. By F_D we denote the set of all data fields in D; by F_D(T: v) we denote the set of data fields of type T and variant v, i.e., for the type-variant declaration above, F_D(T: v) = {a1, ..., an}. By T_D(T: v)a we denote the type of the data field a in variant v of type T, i.e., for the type-variant above, T_D(T: v)ai = Ti.

An address α is an element of F_D*. The values of D is the set Val D of functions x : F_D* → T_D × V_D such that

• dom x is finite and prefix closed;

• x(ε) = (Main D: v), for some v; and

• for all α ∈ dom x, if x(α) = (T: v) then v ∈ V_D T and
  α.a ∈ dom x  ⇔  a ∈ F_D(T: v) ∧ T_D(T: v)a = T'
  where x(α.a) = (T': v') for some v'.

Intuitively, the addresses in dom x serve as pointer values.

A2: Graph Types and Routing Expressions

While F_G still denotes all fields, we use F_G^d to denote the data fields, and F_G^r to denote the routing fields. We use the notation R_G(T: v)a to denote the routing expression associated with the routing field a in variant v of type T.

The graph type has an underlying data type Data G which is obtained by removing all the routing fields. The routing expressions must all be defined on Data G, as described below.

Given a data type D, define the alphabet Δ that consists of directives (letters) ^; $; ↑; ↑a and ↓a, where a ∈ F_D; T and (T: v), where T ∈ T_D and v ∈ V_D T. Given x ∈ Val D we define the step relation ⇝x on dom x × Δ × dom x by the following transitions, where α ⇝[d] β means that (α, d, β) is in ⇝x:

    ε   ⇝[^]      ε
    α   ⇝[$]      α     if α is a leaf in x
    α.a ⇝[↑]      α
    α.a ⇝[↑a]     α
    α   ⇝[↓a]     α.a
    α   ⇝[T]      α     if x(α) = (T: v) for some v
    α   ⇝[(T:v)]  α     if x(α) = (T: v)

When α ⇝[d] β, we say that β is reached from α by directive d. Note that β such that α ⇝[d] β is uniquely defined, if it exists, by the values of α and d.

A route p = d1 ... dn is a word over Δ. A walk along p in x from α ∈ dom x to β ∈ dom x is the unique sequence, if it exists, α0, ..., αn such that α0 = α, αn = β, and α(i-1) ⇝[di] αi for all i, 1 ≤ i ≤ n. The walk is denoted α ⇝[p] β.

A routing expression R on D is a regular expression over Δ. We construct regular expressions using operators + (union), . (concatenation), and * (iteration). The regular language defined by R is denoted L(R).

Given x, R and an origin α ∈ dom x, a destination is a β ∈ dom x such that α ⇝[p] β for some route p ∈ L(R). The set of all destinations is denoted Dest_x(R, α). If this set is a singleton we say that R at α in x has the unique destination property.

Intuitively, the routing expressions specify where the pointers in the routing fields should lead to. A graph type is only well-formed when all such expressions always have the unique destination property and always lead to subtrees of the specified types.
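To make the definitions of A1 and A2 concrete, here is a small Python sketch of a value as a finite, prefix-closed map from addresses to type-variants, together with the step relation and walks. The encoding (addresses as tuples of field names, directives as tagged tuples, the names `prefix_closed`, `step`, and `walk`) is an assumption chosen for illustration only.

```python
# A value x: addresses (tuples of field names) -> (type, variant).
# This is the one-element list  L(head: 11, tail: ())  with explicit variants.
X = {
    (): ("L", "nonempty"),
    ("head",): ("Int", "11"),
    ("tail",): ("L", "empty"),
}

def prefix_closed(x):
    """dom x must contain every proper prefix of each of its addresses."""
    return all(a[:i] in x for a in x for i in range(len(a)))

def is_leaf(x, a):
    """An address is a leaf if no address in dom x extends it by one field."""
    return not any(len(b) == len(a) + 1 and b[:len(a)] == a for b in x)

def step(x, a, d):
    """Return the address reached from a by directive d, or None if none is."""
    kind = d[0]
    if kind == "^":                        # root test
        return a if a == () else None
    if kind == "$":                        # leaf test
        return a if is_leaf(x, a) else None
    if kind == "up":                       # ("up",) or ("up", field)
        if a == () or (len(d) == 2 and a[-1] != d[1]):
            return None
        return a[:-1]
    if kind == "down":                     # ("down", field)
        b = a + (d[1],)
        return b if b in x else None
    if kind == "type":                     # ("type", T)
        return a if x[a][0] == d[1] else None
    if kind == "variant":                  # ("variant", T, v)
        return a if x[a] == (d[1], d[2]) else None
    raise ValueError(d)

def walk(x, a, route):
    """Follow a route directive by directive; None if the walk does not exist."""
    for d in route:
        a = step(x, a, d)
        if a is None:
            return None
    return a
```

Because each directive has at most one successor, a route determines at most one walk from a given origin; the unique destination property additionally asks that, among all routes in L(R), exactly one destination is reached.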

The values of a well-formed graph type G form the set Val G of finite graphs. There is a graph for every value in the underlying data type. Given x ∈ Val Data G we construct a graph whose nodes are dom x, the set of addresses in x. The edges, which are labeled by field names, come in two flavors: data edges and routing edges. The data edges provide the canonical spanning tree (the backbone) and are defined as

    { α →a α.a | α.a ∈ dom x }.

The routing edges are defined as

    { α →a β | a ∈ F_G^r x(α), R_G x(α) a = R, Dest_x(R, α) = {β} }.

In this graph, addresses in F_G* (both data and routing fields) are defined.

A3: Evaluating Routing Fields

Here we give the details of the algorithm mentioned in Section 5. We are given a backbone x and a collection of nondeterministic finite-state automata representing all routing expressions in the graph grammar. For an automaton A with transition relation →A and a word w = d0 ... dn ∈ Δ*, we write q →A[w] q' to denote that there exist q0, ..., q(n+1) such that q0 = q, q(n+1) = q', and q0 →A[d0] q1 ... →A[dn] q(n+1).

Our goal is to build a table Tbl such that for each node α in x and for each automaton A and each state q of A, the value of Tbl(α, q) is a node β, if it exists, such that for some w ∈ Δ*, α ⇝[w] β and q →A[w] qF, where qF is a final state of A; if no such node exists then Tbl(α, q) = nil.

The algorithm below employs a queue Q to calculate Tbl:

1. Tbl(α, q) := nil, for all nodes α in x and all automata states q
2. make Q empty
3. for all (α, q), where q is a final state:
   (a) Tbl(α, q) := α
   (b) insert (α, q) in Q
4. while Q is non-empty:
   (a) delete an element (α, q) from Q
   (b) for all (β, q') such that Tbl(β, q') = nil and for some d, q' →A[d] q and β ⇝[d] α:
       i. Tbl(β, q') := Tbl(α, q)
       ii. insert (β, q') in Q

Note that each entry (α, q) is considered only at most once and that Step 4.(b) involves only the node α and its immediate neighbors, thus a number of nodes that depends on the grammar only. We conclude that the algorithm runs in linear time as a function of the size of x. With the well-formedness criterion it is not hard to see that the destination of a routing field at α is the node β if and only if there exists an initial state q of the corresponding automaton such that Tbl(α, q) = β.

A4: The Monadic Logic

The monadic second-order logic of graph types, denoted M2LGT, is used to express certain properties of graph types. We first introduce a simpler logic, monadic second-order logic of data types, denoted M2LDT.

Fix a data type D. We define the M2LDT on D as follows. There are two kinds of second-order variables, value variables and address set variables. A value variable x denotes a value of D. An address set variable M denotes a set of addresses of D. Such variables can be combined with ∪, ∩ and ∅ to form address set expressions. The set of addresses of x is denoted dom x, which is also a set expression. A first-order variable α, also called an address variable, denotes an address of D. That α is an address in M is expressed as the formula α ∈ M.

A value variable x of type D is introduced by an existential quantification ∃_D x or a universal quantification ∀_D x. Variables that denote addresses or sets of addresses are introduced by usual existential (∃) or universal (∀) quantification. The formulas of the logic are obtained by combining quantification, ∧ (and), ∨ (or), ¬ (negation) with the following basic formulas:

    is^(α)                α is the root
    is$_x(α)              x(α) is a leaf variant
    is_x(T: v)(α)         x(α) = (T: v)
    is_x T(α)             x(α) = (T: v) for some v
    is_x walk(α, β, R)    ∃p ∈ L(R): α ⇝[p] β
    α = β
    E1 = E2
    E1 ⊆ E2
    α ∈ E
    α = β.a
    α = β;a               a ∈ F_D x(β) and α = β.a

where E and the Ei's are address set expressions. The formulas have the obvious meanings, e.g. is_x(T: v)(α) is true iff the type-variant at address α in x is (T: v).

The formulaa = /3; a is true if a = ~.a field of the type variant at a in z.

and a is a
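As an illustration of how a basic formula such as $\mathrm{is}_x\,\mathrm{walk}$ can be evaluated on a concrete backbone, the queue-based table construction of A3 may be sketched in Python. The encoding is an assumption made for this sketch only: the backbone is a dictionary from nodes to their fields, directives are the hypothetical strings "up" and "down:<field>", and a single automaton is given as a list of transitions. The paper's directive alphabet is richer than this.

```python
from collections import deque

def build_table(children, transitions, finals):
    """Queue-based construction of Tbl for one automaton (sketch).

    children:    {node: {field: child}} describing the backbone tree
    transitions: iterable of (q_from, directive, q_to)
    finals:      set of final states
    """
    nodes = set(children)
    for fields in children.values():
        nodes.update(fields.values())

    # One-step walks beta =d=> alpha, indexed by their target alpha.
    steps_into = {}
    for parent, fields in children.items():
        for field, child in fields.items():
            steps_into.setdefault(child, []).append(("down:" + field, parent))
            steps_into.setdefault(parent, []).append(("up", child))

    # Automaton transitions q' -d-> q, indexed by (q, d).
    states = set(finals)
    trans_into = {}
    for q_from, d, q_to in transitions:
        states.update((q_from, q_to))
        trans_into.setdefault((q_to, d), []).append(q_from)

    tbl = {(a, q): None for a in nodes for q in states}   # step 1
    queue = deque()                                       # step 2
    for a in sorted(nodes):                               # step 3
        for q in finals:
            tbl[(a, q)] = a
            queue.append((a, q))
    while queue:                                          # step 4
        alpha, q = queue.popleft()
        for d, beta in steps_into.get(alpha, []):
            for q_prev in trans_into.get((q, d), []):
                if tbl[(beta, q_prev)] is None:
                    tbl[(beta, q_prev)] = tbl[(alpha, q)]
                    queue.append((beta, q_prev))
    return tbl

# A three-cell list n0 -> n1 -> n2 and the routing expression "up",
# accepted by the automaton q0 --up--> q1 with q1 final.
children = {"n0": {"next": "n1"}, "n1": {"next": "n2"}}
tbl = build_table(children, [("q0", "up", "q1")], {"q1"})
print(tbl[("n1", "q0")], tbl[("n2", "q0")], tbl[("n0", "q0")])
```

Each pair (node, state) enters the queue at most once, and handling it inspects only the parent and children of the node, which is the source of the linear running time. In the example the entry for ("n1", "q0") is the predecessor cell n0, while the first cell has no destination and keeps the value None (nil).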

Expressing well-formedness

The following formula in M2LDT expresses that a graph type $G$ is well-formed:

[...]

where $D = \mathrm{Data}\ G$ is the underlying data type; $\exists!$ is an abbreviation for "there exists a unique"; and AND is an abbreviation expressing the conjunction obtained by expanding over the corresponding indices.

Decidability of M2LDT

Theorem 1 M2LDT is decidable.

Proof M2LDT is decidable by an easy reduction to M2LkSFT, the monadic second-order logic of $k$ successors on finite trees. The latter logic has set variables, such as $X$, denoting subsets of $\{1, \ldots, k\}^*$ and first-order variables, such as $\alpha$, denoting elements of $\{1, \ldots, k\}^*$. In addition there is a successor function $\cdot j$ for each $j \in \{1, \ldots, k\}$, and connectives and quantifiers as above.

We will indicate how formulas of M2LDT involving a data type $D$ can be translated into M2LkSFT. We let $k$ be $|F_D|$, the number of different field names in $D$, and we rename field names into $1, \ldots, k$. A value variable $x$ in $D$ introduced by a quantified $\exists_D x : \phi$ is translated into the formula

$\exists X^d, X^T_1, \ldots, X^T_{n_1}, X^V_1, \ldots, X^V_{n_2} : g \wedge \phi'$,

where $X^d$ expresses dom $x$; $X^T_1, \ldots, X^T_{n_1}$ expresses the type at position $\alpha$ by the bit pattern $(\alpha \in X^T_1, \ldots, \alpha \in X^T_{n_1})$ (here $n_1 = \log |T_D|$); $X^V_1, \ldots, X^V_{n_2}$ expresses the variant at position $\alpha$ by the bit pattern $(\alpha \in X^V_1, \ldots, \alpha \in X^V_{n_2})$ (here $n_2 = \log |V_D|$); $\phi'$ is the translation of $\phi$; and $g$ is a formula expressing that $x$ is a value of $D$ according to the conditions on derivation trees given in Section 1. Address set variables are just translated into set variables and address variables into first-order variables.

Most of the basic formulas are now easy to express. For example, $\alpha = \beta\,;a$ is translated into $\alpha = \beta.a \wedge \alpha \in X^d$; this formula is equivalent to $\alpha = \beta.a \wedge a \in F_D\,x(\beta)$ since $x \in \mathrm{Val}\ D$.

The basic formula $\mathrm{is}_x\,\mathrm{walk}(\alpha, \beta, R)$ is more difficult. Here we encode the working of $A_R$, the automaton equivalent to $R$, on $x$ by a formula that guesses the subsets of states at each address that are accessible from a partial run (which is like a run except that the last state need not be final) starting at $\alpha$. This collection of subsets can be coded using $|A_R|$ set variables. We must then write a M2LkSFT formula expressing that all states in a subset have a predecessor under the transition relation for some directive (unless the state is initial and in the subset at $\alpha$). This condition alone is not sufficient. We must also write down a condition that ensures that the collection of subsets is minimal with respect to the previous condition; technically, we are calculating a least fixed-point in order to ensure that all states are reachable from initial states at $\alpha$. The details of this translation are omitted. □

Logic of graph types

The monadic second-order logic of graph types, M2LGT, has the same syntax as M2LDT.

Theorem 2 M2LGT is decidable.

Proof The translation into M2LkSFT differs only for the formula $\alpha = \beta\,;a$. If $a$ is a routing field of the variant at $\beta$, then the translation must express that $\mathrm{is}_x\,\mathrm{walk}(\beta, \alpha, R)$, where $R$ is the routing expression associated with $a$ at $x(\beta)$. We omit the details. □
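The bit-pattern encoding used in the proof of Theorem 1 can be made concrete with a small Python sketch. All names here (`encode`, `decode`, `bits_needed`) are hypothetical, addresses are modelled as tuples of successor indices, and the sketch assumes that the number of set variables is the logarithm rounded up, with at least one variable; the text itself only writes $\log$.

```python
from math import ceil, log2

def bits_needed(count):
    # Assumption: ceil(log2(count)) set variables, and at least one.
    return max(1, ceil(log2(count)))

def encode(value, types, variants):
    """Encode a value as the sets X^d, X^T_i and X^V_i.

    value maps each address (a tuple of successor indices) to a
    pair (type, variant); types and variants are lists of names.
    """
    x_d = set(value)  # X^d: the domain of the value
    x_t = [set() for _ in range(bits_needed(len(types)))]
    x_v = [set() for _ in range(bits_needed(len(variants)))]
    for addr, (t, v) in value.items():
        # Bit i of the type index is 1 iff addr belongs to X^T_i.
        for i, s in enumerate(x_t):
            if (types.index(t) >> i) & 1:
                s.add(addr)
        # Bit i of the variant index is 1 iff addr belongs to X^V_i.
        for i, s in enumerate(x_v):
            if (variants.index(v) >> i) & 1:
                s.add(addr)
    return x_d, x_t, x_v

def decode(addr, sets, names):
    # Read the bit pattern (addr in X_1, ..., addr in X_n) back into a name.
    index = sum(1 << i for i, s in enumerate(sets) if addr in s)
    return names[index]

# A two-cell list followed by the empty list, with field "tail" renamed to 1.
types = ["List"]
variants = ["cons", "nil"]
value = {(): ("List", "cons"), (1,): ("List", "cons"), (1, 1): ("List", "nil")}
x_d, x_t, x_v = encode(value, types, variants)
print(decode((1, 1), x_v, variants), decode((), x_v, variants))
```

The formula $g$ of the proof would additionally constrain these sets so that they describe a legal derivation tree; the sketch only shows how membership in the sets carries the type and variant at each address.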
