Abstract

This paper studies an abstract model of dependencies between software configuration items based on a theory of concurrent computation over a class of Petri nets. The primary goal is to illustrate the descriptive power of the model and lay theoretical groundwork for using it to design software configuration maintenance tools or model software configurations. As a start in this direction, the paper analyzes and addresses certain limitations in make description files using a form of abstract interpretation.

1 Introduction

A variety of formalisms have been created to aid phases of the software engineering life cycle. For instance, logical languages such as Z can be used to describe functional specifications, while structures like flow charts and Petri nets are useful in detailed design. Automated formal verification has been shown feasible in certain cases [...]

[...] Certain of these produced items are ones the project ultimately ships as the `product' of the effort. It is essential therefore that the implications of any changes in the source items be properly reflected in the items directly or indirectly produced from them. This can become an overwhelming task if the project environment does not provide automated support for it. A recognition of the ubiquity of this problem, and the insight that a tool could address it in a wide range of cases, led Stuart Feldman to develop the Unix tool, make [4]. In this limited application domain, the special-purpose make description files were easier to write and maintain than general purpose programs, so the tool quickly gained widespread use. An example of a make description file for a small configuration of C-programming files appears in Figure 1. The [...]

table : a.out indata
        a.out

a.out : main.o datanal.o lo.o /usr/cg208/lib/gen.a
        cc main.o datanal.o lo.o /usr/cg208/lib/gen.a
[Figure fragment: (main.c) [cc -c] (main.o)]

[Figure panels: (a) (b) (c) (d) (e) (f)]
Definition: (Markings.) Let $N = (B, E, S, T)$ be a production net. A subset $M \subseteq B \cup E$ is condition-closed if, for every event $e \in E$, $e^{\bullet} \cap M \neq \emptyset$ implies $e^{\bullet} \subseteq M$. A marking $M$ for $N$ is a subset of $B \cup E$ that is both left-closed (with respect to $\sqsubseteq_N$) and condition-closed.

An event is viewed as having one of three states relative to a marking.

Definition: (Event States.) Let $M$ be a marking on a p-net $N = (B, E, S, T)$ and let $e$ be an event. We say that $e$ is enabled by $M$ and write $M/e$ if ${}^{\bullet}e \subseteq M$ and $e \notin M$. We say that $e$ is initiated in $M$ and write $e/M$ if $e \in M$ and $e^{\bullet} \cap M = \emptyset$. We say that $e$ is terminated in $M$ and write $e \searrow M$ if $e^{\bullet} \subseteq M$.

Note that $e$ is terminated in $M$ if, and only if, $M \cap e^{\bullet} \neq \emptyset$. The intuition is that $M/e$ is a state in which the pre-conditions of $e$ are satisfied, but where the event $e$ has not yet begun. The state $e/M$ is one in which $e$ has begun, but has not yet finished. The state $e \searrow M$ is one in which $e$ has finished and now its post-conditions are within $M$. A pictorial representation of the different states appears in Figure 5. Under suitable assumptions, the movement from one state to another with respect to $e$ retains the property of being a marking.

[...] from that used for Petri nets. This difference is owing to the particular application for which p-nets are intended, which uses markings to model system build states. Consider the net in Figure 2 for example: note the way $u$ is shared by $e$ and $f$ and the way $v$ is shared by $g$ and $h$. As a build computes a result, the pre-condition remains intact through the remainder of the build; sharing in the sense of these examples does not introduce conflict in the computation. So, the usual Petri net semantics of placing markings on conditions and having them consumed by events to which they are pre-conditions is not a convenient way of thinking of system state for builds. Instead, one wants to view a build as having achieved consistency between sources and targets in a subset of the p-net that is left-closed relative to $\sqsubseteq_B$.

The pair of operational rules in Figure 6 provide a basic representation of the observable events of the concurrent computation. An event $e$ engenders two forms of observ- [...]

Initiation:
\[ \frac{M/e}{M \xrightarrow{\hat{e}} M \cup \{e\}} \]

Termination:
\[ \frac{e/M}{M \xrightarrow{e} M \cup e^{\bullet}} \]
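The event states and the two rules above can be made concrete with a minimal sketch. It assumes that a marking is a frozen set of names in which condition names and event names are disjoint strings; the class `Event` and the function names are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """An event with its pre-conditions and post-conditions (hypothetical type)."""
    name: str
    pre: frozenset
    post: frozenset


def enabled(marking, e):
    # M/e: every pre-condition is in M and e itself is not yet in M.
    return e.pre <= marking and e.name not in marking


def initiated(marking, e):
    # e/M: e is in M but none of its post-conditions are.
    return e.name in marking and not (e.post & marking)


def terminated(marking, e):
    # e terminated in M: all post-conditions of e are in M.
    return e.post <= marking


def initiate(marking, e):
    # Initiation rule: M -> M union {e}, allowed only when e is enabled.
    if not enabled(marking, e):
        raise ValueError("event is not enabled")
    return marking | {e.name}


def terminate(marking, e):
    # Termination rule: M -> M union post(e), allowed only when e is initiated.
    if not initiated(marking, e):
        raise ValueError("event is not initiated")
    return marking | e.post


# Assumed example: one compilation step from main.c to main.o.
cc = Event("cc -c", frozenset({"main.c"}), frozenset({"main.o"}))
m0 = frozenset({"main.c"})
m1 = initiate(m0, cc)
m2 = terminate(m1, cc)
```

Here `m0` enables the event, `m1` is the intermediate state in which it has begun, and `m2` contains the produced condition. Left-closure of the marking is not checked in this sketch.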
[Figure 5: the states of an event $e$ relative to a marking $M$.]
[...] the form of either the first or last of these possibilities, in which no event begins before all the events that began before it have terminated.

Let us write $M \to^{*} M'$ for the transitive, reflexive closure of the labeled relation. That is, $M \to^{*} M'$ provided $M$ is $M'$ or there is a marking $M''$ and a label $l$ such that $M \to^{*} M'' \xrightarrow{l} M'$. The number of steps in the relation $M \to^{*} M'$ is the minimum number of markings $M_0, \ldots, M_n$ such that $M_0 = M$ and $M_n = M'$ and there are labels $l_1, \ldots, l_{n-1}$ such that the relations $M_0 \xrightarrow{l_1} \cdots \xrightarrow{l_{n-1}} M_n$ all hold. Although we will quickly replace the simple semantics provided by the rules in Figure 6 by something more realistic (and interesting), it is worth noting briefly the following basic property:

Proposition 2 If $M$ is a marking of a net $N$ and $M \to^{*} M'$, then $M'$ is also a marking.

The proposition is proved by an induction on the number of steps in the relation $\to^{*}$. Each case follows immediately from Lemma 1.

A build over a p-net $N = (B, E, S, T)$ may now be viewed as follows. First, a selection $X \subseteq B$ of targets is made. Then an initial marking $M$ is chosen to consist of the minimal elements in $\downarrow X$. From this initial marking, the events in $\downarrow X$ are executed to produce a marking $M'$ which contains $X$. The events may evaluate as concurrently as their dependencies allow until all events in $\downarrow X$ are terminated in $M'$. However, this is potentially a very inefficient way to ensure that targets properly reflect changes. For example, it may be that no condition among the minimal elements of $\downarrow X$ has been modified since the last time the targets in $X$ were built, so no computation is required: the targets remain acceptable. The program make optimizes by examining the dates assigned to files by the operating system. Let us attempt to formalize this idea.

The dates associated with files by the operating system can be viewed as a function on conditions. An augmented form of production net includes the needed additional structure.

Definition: (DP-Nets.) A dated production net (dp-net) is a 5-tuple $N = (B, E, S, T, \delta)$ where $(B, E, S, T)$ is a p-net and $\delta$ is a function from $B$ into $\omega$ called the date.

Given a dp-net $N = (B, E, S, T, \delta)$, it is convenient to define a pair of functions that provide greatest and least dates on pre- and post-conditions of events. The pre-date function $\delta^{\mathrm{pre}} : E \to \omega$ is defined so that $\delta^{\mathrm{pre}}(e)$ is the greatest value in the set $\{\delta(b) \mid b \in {}^{\bullet}e\}$. The post-date function $\delta^{\mathrm{post}} : E \to \omega$ is defined so that $\delta^{\mathrm{post}}(e)$ is the least value in the set $\{\delta(b) \mid b \in e^{\bullet}\}$. With this, the concept of an `up-to-date' target is given as follows:

Definition: (Up-to-Date Markings.) A marking on the dp-net $N$ is a marking on the underlying p-net $(B, E, S, T)$. Such a marking $M$ is said to be up-to-date if every event $e$ whose post-conditions are contained in $M$ satisfies $\delta^{\mathrm{pre}}(e) < \delta^{\mathrm{post}}(e)$.

Given a partial function $f : X \to Y$ and a subset $U$ of the domain of $f$, we will denote by $f \mid U$ the restriction of $f$ to $U$. It usually does not matter whether $f \mid U$ is to be viewed as a partial function with $U$ as its domain or whether the domain of $f \mid U$ is $X$ and $f \mid U$ is undefined outside of $U$, but for the purposes of this paper it is taken to be the latter. It will often be useful for us to consider the restriction of $f$ to the complement of the set $U$. This will be denoted $f \mid_{-} U$.

Figure 7 displays a collection of operational rules for a parallel version of make. When an event $e$ is enabled, it does not need to be run if it is up-to-date. This is expressed by the process omission rule, which provides for a `$\varepsilon$-event': a $\varepsilon$-event is to be viewed as a no-op in which an event is incorporated into the marking representing the state of the build without being executed. That this can be done is based on an assumption that is essential to the correctness of make-controlled builds: if a target exists, its pre-requisites exist, and the target is up-to-date with respect to its pre-requisites, then the target was created from the pre-requisites, and it does not need to be rebuilt from its pre-requisites if its pre-requisites are themselves based on up-to-date productions. We will return to the issue of build invariants and correctness later.
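The date test behind the omission step can be sketched in a few lines. This is an illustration under stated assumptions, not make's implementation: dates are integers held in a dictionary that is defined on every condition involved, and when an event is run its outputs are stamped with one more than its pre-date, a choice made here only so that the event becomes up-to-date afterwards. All function names are hypothetical.

```python
def pre_date(dates, pre):
    # Greatest date among the pre-conditions of an event.
    return max(dates[b] for b in pre)


def post_date(dates, post):
    # Least date among the post-conditions of an event.
    return min(dates[b] for b in post)


def up_to_date(dates, pre, post):
    # The event is up-to-date when every input is older than every output.
    return pre_date(dates, pre) < post_date(dates, post)


def step(dates, pre, post):
    """Omit the event if it is up-to-date, otherwise 'run' it.

    Running is simulated by stamping each output with pre_date + 1
    (an assumption of this sketch, not a rule taken from the paper).
    """
    if up_to_date(dates, pre, post):
        return "omit", dates
    new_dates = dict(dates)
    stamp = pre_date(dates, pre) + 1
    for b in post:
        new_dates[b] = stamp
    return "run", new_dates


# Assumed example: main.c was edited (date 5) after main.o was built (date 3).
dates = {"main.c": 5, "main.o": 3}
action1, dates1 = step(dates, {"main.c"}, {"main.o"})
action2, dates2 = step(dates1, {"main.c"}, {"main.o"})
```

The first call has to run the event because the source is newer than the target; the second call finds the target newer and omits the work.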
Initiation:
\[ \frac{M/e \qquad \delta^{\mathrm{pre}}(e) \geq \delta^{\mathrm{post}}(e)}{M, \delta \xrightarrow{\hat{e}} M \cup \{e\}, \delta} \]

Termination:
\[ \frac{e/M \qquad \delta'^{\,\mathrm{pre}}(e) < \delta'^{\,\mathrm{post}}(e) \qquad \delta \mid_{-} e^{\bullet} = \delta' \mid_{-} e^{\bullet}}{M, \delta \xrightarrow{e} M \cup e^{\bullet}, \delta'} \]

Omission:
\[ \frac{M/e \qquad \delta^{\mathrm{pre}}(e) < \delta^{\mathrm{post}}(e)}{M, \delta \xrightarrow{\varepsilon} M \cup \{e\} \cup e^{\bullet}, \delta} \]

Figure 7: Parallel make

There are several ways in which the make optimization based on dates provides less than one would want in certain cases. Over the years, make extensions have attempted to address some of these problems; others are addressed only in IDEs. Let us consider three of these, each of which is based on common experience with system maintenance. The first concerns parsing tools, the second concerns SML programming language compilations, and the third concerns C header files.

Figure 8 illustrates a production net which arises in instances where one is using the parser-generator yacc.

[Figure 8: Identity of Old and New Inputs. Conditions (foo.y), (y.tab.c), (y.tab.h), (lex.c), (y.tab.o), (lex.o); events [yacc], [cc], [cc].]

The yacc tool takes an input file in a special format and produces from it a C source file, y.tab.c, and a C header file, y.tab.h. The header file describes the information about keywords declared in foo.y, which is all of the information generated from this file that is needed to create the lexer, lex.o. The input file foo.y also describes a possibly intricate collection of actions used to deter- [...] keyword declarations there. However, if only the actions [...]

[...]ming language, appears in Figure 9.

[Figure 9: Hash Keys to Avoid Cascading Recompilation. Files foo.sig.sml, foo.sml, bar.sig.sml, bar.sml.]

Unlike typical C programs, SML programs generally have deeply nested dependencies, often dozens of files long. In the figure, the SML signature FOO in the file foo.sig.sml, together with some basic environment, is compiled into a target environment. The implementing structure Foo of this signature is then compiled and incorporated into this environment. After this, a signature BAR, which is stored in a file bar.sig.sml and uses names from FOO and Foo, is compiled and incorporated. Finally, its implementing structure Bar, which is in a file bar.sml and uses names from FOO, Foo, and BAR, is compiled and incorporated. Now, if even a comment is changed in foo.sig.sml, then all of the steps used to generate this final target will need to be rerun if one uses only the standard make date optimization. When dealing with such deeply nested dependencies, it becomes worthwhile to retain information sufficient to recognize when it is probable that the cascading sequence of recompilations can be cut off. In the Compilation Manager (CM) IDE of Matthias Blume [1], which is part of the SML/NJ system, targets of compilations are assigned `fingerprints'. A fingerprint is a bit string computed from a file in such a way that files with the same fingerprint are very unlikely to be different.^2 This provides a pragmatic aid when development is under way; even the low probability of error due to the imperfection of the fingerprint assignment can be eliminated by deleting the target files to induce complete regeneration before final testing. Time saved avoiding recompilations quickly repays the overhead of calculating the fingerprints in typical SML programming projects.

A somewhat more subtle issue of dependence is illustrated in Figure 10. Header files allow separate com- [...]

[Figure 10 (fragment): (my.c) [cc -c] (my.o); (your.c) [cc -c] (your.o).]

2 [...] Rabin's CRC polynomials; see [2] for an exposition.
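The fingerprint cutoff described for SML compilations can be sketched as follows. This is not the CM implementation: SHA-256 from the Python standard library is swapped in for the fingerprinting scheme, and `compile_signature` is a hypothetical stand-in for a compilation step whose output ignores comments (written here with `#` comments for simplicity).

```python
import hashlib


def fingerprint(data: bytes) -> str:
    # Stand-in fingerprint: SHA-256 digest of the target's contents.
    return hashlib.sha256(data).hexdigest()


def compile_signature(source: str) -> bytes:
    """Hypothetical compile step: the 'target' is the source without comments."""
    kept = []
    for line in source.splitlines():
        code = line.split("#", 1)[0].rstrip()
        if code:
            kept.append(code)
    return "\n".join(kept).encode()


def downstream_rebuild_needed(old_fingerprint: str, new_target: bytes) -> bool:
    # Dependents are rebuilt only if the regenerated target's fingerprint differs.
    return fingerprint(new_target) != old_fingerprint


# Assumed example: the same declaration before and after a comment-only edit.
old_target = compile_signature("val x = 1  # first version")
comment_edit = compile_signature("val x = 1  # comment reworded")
real_edit = compile_signature("val x = 2  # comment reworded")
```

With these values the comment-only edit regenerates a byte-identical target, so the cascade stops there, while the change to the declaration produces a different fingerprint and dependents would be rebuilt.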
[...] file $C_i$ can be compiled with a suitable subset of the header files. In particular, no other C files are required. This has the advantage that one may compile $C_i$ even if the C implementations of header files needed for the compilation are not available, perhaps because they are under development. In Figure 10 compilation of my.c and your.c can be done using foo.h. If foo.c is modified then the files my.o and your.o do not become out-of-date as a result, although the linking of the three object files does become out-of-date. This provides a valuable form of support for separate compilation, allowing significant independence between programmers as well as an opportunity for the use of parallelism in builds. Suppose, however, that the programmer in charge of your.c asks that an additional function f be included in foo.c and its prototype placed in foo.h. This results in a change in foo.h, causing the compilation of my.c to become out-of-date. However, the program my.c may not make any use of f or depend on it in any way. An intuitive and inexpensive approach to recognizing this state of affairs automatically was introduced by Tichy [14] under the sobriquet `smart' recompilation. Tichy's benchmarks suggest that the analysis is well worth the time spent on it, as one might intuitively predict. His implementation was actually for Pascal with modules rather than C programs, so the example of Figure 10 should be taken with an appropriate grain of salt, but the basic idea is fairly language independent. The idea has inspired several subsequent studies, including one on `smarter' recompilation [12] for C programs and `smartest' recompilation for SML [13].

4 P-Net Models and Abstractions

To really represent the kinds of issues described in the previous section, it is essential to provide a model for a production net that describes the kinds of entities involved in a build over the net and the relations they are expected to satisfy. By way of illustration, consider the following very basic p-net:

[Figure: a p-net with conditions x, y, z and a single event e.]

This net has many models. For example, it might be the case that $x$ is interpreted as a C source file, $y$ as a header file, and $z$ as an object file. The event $e$ is interpreted as the relation between the input files and output files which holds when the interpretation of the output could be the outcome of a correct C-compilation of the interpretations of its inputs. In another model, $x$ is interpreted as an object file, $y$ as a file containing data, and $z$ as another data file. The desired relation is that $z$ (that is, the interpretation of $z$) should be the result of using $x$ to process the data in $y$. This view allows one to understand more precisely the role that the labels were meant to play in something like Figure 3: they suggest the intended model of the underlying p-net. Similarly, the concept of a dated p-net is one in which the intended model is expected to associate dates with conditions. One can get along with leaving the model as an informal concept up to a certain point, but a more precise interpretation requires more structure. In particular, the goal of this section is to explore `abstract interpretations' of dependencies between software configuration items. As in other cases, such as the well-known application of strictness analysis [3], an abstract interpretation is based on the use of a `non-standard' model which, to be useful, is simpler in certain regards than the `standard' model but retains key relationships to the standard model. Regarding the example above, the value of $x$ may be a C file, but its abstract interpretation may be its modification date. This is the abstract interpretation exploited by make.

To provide the key definitions, some mathematical machinery is required. An indexed family of sets is an indexing collection $I$ together with a function associating with each element $i \in I$ a set $S_i$. Such an indexed family will be written $S = (S_i \mid i \in I)$ and we say that $S$ is `indexed over $I$'. A section $s = (s_i \mid i \in I)$ of such an indexed family of sets $S$ is a function associating with each $i \in I$ an element $s_i \in S_i$. The product $\Pi (S_i \mid i \in I)$ is the set of all sections of $S$. A partial section of an indexed family of sets $(S_i \mid i \in I)$ is a section $s = (s_i \mid i \in I')$ of $(S_i \mid i \in I')$ where $I' \subseteq I$. In this case $I'$ is called the domain of existence of $s$ and we say that $s_i$ exists if $i \in I'$. The partial product $\tilde{\Pi} (S_i \mid i \in I)$ is the set of all partial sections of $S$. We will generally be concerned with partial sections. In examples, it will be convenient to write down some partial sections: if $i, j, k$ are elements of $I$ and $a, b, c$ lie in $S_i, S_j, S_k$ respectively, then $s = (i, j, k \mapsto a, b, c)$ is the partial section with $\{i, j, k\}$ as its domain of existence and with $s_i, s_j, s_k$ respectively equal to $a, b, c$.

A model of a p-net is a family of sets indexed over a subset of the conditions of the net and a family of relations indexed over a subset of the events of the net.

Definition: (Models.) A model $\mathcal{A} = (B', E', V, R)$ of a p-net $N = (B, E, S, T)$ is
- a family of sets $V = (V_x \mid x \in B')$ indexed by conditions $B' \subseteq B$, and
- a family of relations $R = (R_e \mid e \in E')$ indexed by events $E' \subseteq E$
such that, for each $e \in E'$, we have ${}^{\bullet}e^{\bullet} \subseteq B'$ and
\[ R_e \subseteq \tilde{\Pi} (V_x \mid x \in {}^{\bullet}e) \times \tilde{\Pi} (V_x \mid x \in e^{\bullet}). \]
$\mathcal{A}$ is a total model of $N$ if $B = B'$ and $E = E'$.
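A small sketch may help fix the vocabulary of partial sections and models. It assumes that a partial section is a Python dictionary whose keys form its domain of existence, and it gives a toy version of the second model above, in which z is the result of using x to process y. The table `PROGRAMS` and the choice of string functions as `programs' are hypothetical, and the relation below requires all three values to exist, which is a simplification of this sketch rather than a requirement of the definition.

```python
# Hypothetical interpretation of condition x: named programs that process text.
PROGRAMS = {"upper": str.upper, "lower": str.lower}


def restrict(section, conditions):
    # Restriction of a partial section to a set of conditions.
    return {i: v for i, v in section.items() if i in conditions}


def relation_e(pre_section, post_section):
    """Toy relation for event e: z is the output of program x on data y.

    The relation is taken to fail when a needed value does not exist
    (an assumption of this sketch).
    """
    if not {"x", "y"} <= pre_section.keys() or "z" not in post_section:
        return False
    program = PROGRAMS.get(pre_section["x"])
    if program is None:
        return False
    return post_section["z"] == program(pre_section["y"])


PRE_E = {"x", "y"}   # pre-conditions of e
POST_E = {"z"}       # post-conditions of e

# The partial section (x, y, z |-> "upper", "abc", "ABC") and two variants.
good = {"x": "upper", "y": "abc", "z": "ABC"}
stale = {"x": "upper", "y": "abc", "z": "abc"}
partial = {"x": "upper", "y": "abc"}

good_holds = relation_e(restrict(good, PRE_E), restrict(good, POST_E))
stale_holds = relation_e(restrict(stale, PRE_E), restrict(stale, POST_E))
partial_holds = relation_e(restrict(partial, PRE_E), restrict(partial, POST_E))
```

Only the first section is related by the toy relation; the second has an output that does not match its inputs, and the third has no value for z at all.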
When dealing with multiple models, components can be distinguished by superscripts: $\mathcal{A} = (B^{\mathcal{A}}, E^{\mathcal{A}}, V^{\mathcal{A}}, R^{\mathcal{A}})$. To clarify ideas, let us consider an example of a model. Consider the production net in Figure 11, which corresponds to the one in Figure 8.

[Figure 11: P-net for Example of a Model. Conditions u, v, x, y, z; events e, f.]

The `standard' model for the conditions $u, v, x, y, z$ associates sets as follows: $V_u$ is the set of yacc input files, $V_x$ is the set of C header files, $V_v, V_z$ are both the set of C files, and $V_y$ is the set of object files. The events $e, f$ are interpreted as follows: $R_e$ is the relation between yacc input files and output files, $R_f$ is the relation between C input files and the object files produced by compiling them.^3

Given a model, the association of conditions and events to elements of the model provides the concept of system state:

Definition: (States and Consistent Events.) Suppose $\mathcal{A} = (B', E', V, R)$ is a model of the p-net $N = (B, E, S, T)$. A state of $\mathcal{A}$ is a partial section of $(V_x \mid x \in B')$. Given a state $s$ of $\mathcal{A}$ we write $\mathcal{A}, s \models e$ just in case
\[ (s_i \mid i \in {}^{\bullet}e) \; R_e \; (s_i \mid i \in e^{\bullet}). \]
An event $e$ is said to be consistent in state $s$ if $\mathcal{A}, s \models e$. A left-closed subset $L$ of $B \cup E$ is consistent with respect to $s$ if, for each non-minimal condition $x$ in $L$, the event that produced $x$ is consistent. A marking on $N$ is said to be consistent if its left-closure relative to $\sqsubseteq_N$ is consistent.

To give some help on the notation here, we can think of $\mathcal{A}$ as being a multi-sorted model of first-order logic that interprets relation symbols $R_e$ corresponding to events $e \in E'$. Variables corresponding to conditions are interpreted in $\mathcal{A}$ by the state $s$. For example, for the net in Figure 2, $\mathcal{A}, s \models e$ is shorthand for something like $\mathcal{A}, s \models R_e(x, u, y, z)$. Note, however, that the state $s$ can be partial on the variables, and the variables have different types (sorts); the relation may still hold even if $s$ is undefined on one of the variables $x, u, y, z$. Unlike first order models, the relations in the p-net model do not provide any order for the variables in the way they appear in a relation expression like $R_e(x, u, y, z)$.

An important property of the relation $\mathcal{A}, s \models e$ is that it is dependent only on the values on ${}^{\bullet}e^{\bullet}$. The proof of the following can be obtained by simply unrolling the definition:

Lemma 3 Let $\mathcal{A}$ be a p-net model of $N$ and let $e$ be an event in $N$. For any pair of states $s, s'$, if $\mathcal{A}, s' \models e$ and $(s \mid {}^{\bullet}e^{\bullet}) = (s' \mid {}^{\bullet}e^{\bullet})$, then $\mathcal{A}, s \models e$.

Our theory of system builds will involve two kinds of mathematical entities. First, there are the relations or invariants which embody the correctness property of the build and its optimizations. Second, there are the servers which drive the computation. We begin with the definition of the kinds of invariants required. The goal is to describe a relation between a pair of models $\mathcal{A}$ and $\mathcal{B}$ for a given net $N$, wherein $\mathcal{B}$ can be viewed as an abstraction of $\mathcal{A}$. The motivating example is the make date abstraction: the model $\mathcal{A}$ is the `standard' model in which conditions are interpreted as things like C source files and object files, while the model $\mathcal{B}$ instead interprets these conditions as dates. The relations on $\mathcal{A}$ are things like `input source file x compiles to output object file y', while the relations on $\mathcal{B}$ are things like `the date of x is earlier than the date of y'. More precisely, what we need is a relation between states $s$ of $\mathcal{A}$ and states $t$ of $\mathcal{B}$ such that the relation holds when $t$ is to be viewed as a correct abstraction of $s$. Here is the precise formulation:

Definition: Suppose $\mathcal{A}$ and $\mathcal{B}$ are models of the p-net $N = (B, E, S, T)$. An abstraction $\Phi : \mathcal{A} \to \mathcal{B}$ is a relation $\Phi$ between $\mathcal{A}$-states and $\mathcal{B}$-states that satisfies the following rules for each $\mathcal{A}$-state $s$, $\mathcal{B}$-state $t$, and subset $U \subseteq B$:

[Production]
\[ \frac{\Phi(s, t) \qquad \mathcal{B}, t \models e}{\mathcal{A}, s \models e} \]

[Deletion]
\[ \frac{\Phi(s, t)}{\Phi(s \mid U, \; t \mid U)} \]

The first of these two rules is soundness with respect to production and the second is soundness with respect to deletion.

To understand the names and origins of the two rules, we must appreciate the invariants expected of a system for which an abstraction will be used. First of all, the aim of an abstraction is to signal when a production does not need to be performed. To be sound, it must be the

3 A technical quibble here is that each of the sets $V_i$ in this example is really the set of files. In particular, any sort of file could be input to yacc by a programmer having a bad day, so the view that there is a special set of yacc inputs is slightly misleading. The case is less misleading for produced files, which must be in the specific class of files that could be produced by programs like the C compiler.
case that if $\mathcal{B}$ says that a production step is not needed, then the corresponding $\mathcal{A}$ values have the desired relationship. Hence soundness `with respect to production' is the basic correctness criterion. To understand the second rule, consider the invariants that make is expected to satisfy. Source files may be modified (or possibly even deleted) while produced files may be deleted but not mod- [...] build would result. On the other hand, deleting the object file would not cause a problem, because the deleted file would be properly rebuilt from the up-to-date source file. It is common to delete produced files, for example, to save space. Source files may also be deleted, since it will sometimes be the case that an event does not require one or more of its inputs to be defined: perhaps it is reasonable just to think of deletion as an extreme form of modification! Thus soundness `with respect to deletion' is a natural requirement to impose on abstractions.

To carry out a system build it is essential to have a collection of servers that can produce the desired outputs from the available inputs. For instance, the description file in Figure 1 requires a C compiler to process C source files and an assembler to process assembly code. When one is dealing with abstractions, another server is needed to calculate the desired abstractions. In the case of the make date optimization, this task consists only of noting the date. However, some of the other examples discussed in the previous section require a more sophisticated collection of operations. For example, smart recompilation [14] requires a `history' attribute which is used to cache information required for assessing the effect of a change. In each of the ways a value in a state may change, a server is required to recalculate an abstraction that takes the change into account. Changes come in three forms: a source item is modified by the environment, an item is produced in the course of a build, or an item is deleted. In the last case the corresponding abstraction values are deleted (made undefined) and correctness is ensured by the rule for soundness with respect to deletion, so no special server is required. The former pair of cases apply to a set $U$ of source items, or to a set $e^{\bullet}$ of produced items. We need a name for this pair of cases:

Definition: Let $N = (B, E, S, T)$ be a p-net. A generation of conditions is a subset $U \subseteq B$ such that
1. each element of $U$ is a source, or
2. there is an event $e$ such that $U = e^{\bullet}$.

We may now define the key server concepts.

Definition: Suppose $\mathcal{A}$ and $\mathcal{B}$ are models of the p-net $N = (B, E, S, T)$ and $\Phi : \mathcal{A} \to \mathcal{B}$ is an abstraction.
- A build server for $\mathcal{A}$ is a function $\beta$ which takes as its arguments an event $e$ and $\mathcal{A}$-state $s$ and returns as its value an $\mathcal{A}$-state $s'$ such that $s \mid_{-} e^{\bullet} = s' \mid_{-} e^{\bullet}$ and [...]
- [...]

[Abstraction]
\[ \frac{\Phi(s, t) \qquad s \mid_{-} U = s' \mid_{-} U \qquad \downarrow U \text{ is consistent in } s'}{\Phi(s', \alpha(U, s', t))} \]

With these definitions it is possible to describe the rules for computation in Figure 12.

Initiation:
\[ \frac{M/e \qquad \mathcal{B}, t \not\models e}{M, s, t \xrightarrow{\hat{e}} M \cup \{e\}, s, t} \]

Termination:
\[ \frac{e/M \qquad s' = \beta(e, s)}{M, s, t \xrightarrow{e} M \cup e^{\bullet}, s', \alpha(e^{\bullet}, s', t)} \]

Omission:
\[ \frac{M/e \qquad \mathcal{B}, t \models e}{M, s, t \xrightarrow{\varepsilon} M \cup \{e\} \cup e^{\bullet}, s, t} \]

Figure 12: Computation Relative to Servers and [...]

These rules can be viewed as the generalization of the `date optimization' rules in Figure 7 to a class of similar optimizations determined by the choice of the abstraction $\Phi : \mathcal{A} \to \mathcal{B}$. The three rules play basically the same role as the ones in Figure 7. Initiation occurs when a build must be carried out because the abstraction $\mathcal{B}$ does not indicate it is unnecessary. When termination occurs, the build result and the abstraction are updated by $\beta$ and $\alpha$ respectively in the new state. Omission occurs when the abstraction $\mathcal{B}$ indicates that a rebuild is unnecessary.

The principal soundness result for the rules in Figure 12 is the following:

Theorem 4 Let $\Phi : \mathcal{A} \to \mathcal{B}$ be an abstraction between models of a p-net $N = (B, E, S, T)$. Suppose $s$ is a state of $\mathcal{A}$ and $t$ is a state of $\mathcal{B}$ such that $\Phi(s, t)$. Let $M$ be a consistent marking of $\mathcal{A}, s$. If $M, s, t \to^{*} M', s', t'$ with respect to a build server $\beta$ for $\mathcal{A}$ and an abstraction server $\alpha$ for $\Phi$, then
1. $M'$ is a consistent marking of $\mathcal{A}, s'$ and
2. $\Phi(s', t')$.

The theorem is proved by inducting on the length of the evaluation from $M, s, t$ to $M', s', t'$.

5 Applications of Abstractions

An application of the concept of an abstraction requires the demonstration of a model, an abstraction relation, and an abstraction server. Suppose we are given a model $\mathcal{A}$ for a p-net $N = (B, E, S, T)$. To define an abstraction for $\mathcal{A}$ we need:

Abstraction Model We must define a model $\mathcal{B}$ for $N$ which is to serve as the space of abstractions. This entails selecting the conditions $B^{\mathcal{B}} \subseteq B$ that are to be abstracted, and the events $E^{\mathcal{B}} \subseteq E$ for which the abstractions are to be tested. For each element $x \in B^{\mathcal{B}}$, a space $V_x^{\mathcal{B}}$ of abstract values is required, and for each $e \in E^{\mathcal{B}}$, a relation $R_e^{\mathcal{B}}$ between $\mathcal{B}$-states of the pre- and post-conditions of $e$ is required. It is fair game to use $V_x^{\mathcal{A}}$ to define $V_x^{\mathcal{B}}$, although it will be unusual to use $R_e^{\mathcal{A}}$ to define $R_e^{\mathcal{B}}$. The relationship between $R_e^{\mathcal{A}}$ and $R_e^{\mathcal{B}}$ is most likely to be expressed in the abstraction relation.

Abstraction Relation We must define the relation $\Phi$ between $\mathcal{A}$-states and $\mathcal{B}$-states. This relation needs to satisfy the two rules for abstraction relations, but it may also involve other properties that are to be assumed as invariants preserved by the abstraction server.

Abstraction Server It is necessary to define a server function $\alpha$ for the abstraction $\Phi$. If the abstraction is chosen unwisely, it may be difficult or impossible to find a feasibly computable server for it.

Having selected these three things, it still remains a question of the application itself whether the abstraction will be useful. The axioms for abstraction relations and servers ensure only that the abstraction optimization is sound.

Date Abstraction

Let $\mathcal{A}$ be a total model for a p-net $N = (B, E, S, T)$.

Model The model $\mathcal{B}$ takes $B^{\mathcal{B}} = B$ and $E^{\mathcal{B}} = E$. For each $x \in B$, we define $V_x^{\mathcal{B}} = \omega$. For each event $e \in E$, define $(t \mid {}^{\bullet}e) \; R_e^{\mathcal{B}} \; (t \mid e^{\bullet})$ to hold iff $t$ is defined on ${}^{\bullet}e^{\bullet}$ and
\[ \max\{t_x \mid x \in {}^{\bullet}e\} < \min\{t_y \mid y \in e^{\bullet}\} \qquad (1) \]

Relation The abstraction relation $\Phi$ is defined by stipulating that $\Phi(s, t)$ holds iff, for every event $e$ such that $t$ is defined on ${}^{\bullet}e^{\bullet}$ and Inequality 1 holds, the $\mathcal{A}$-state $s$ is also defined on ${}^{\bullet}e^{\bullet}$ and $\mathcal{A}, s \models e$.

Server Suppose $U$ is a generation of conditions and $s'$ is an $\mathcal{A}$-state and $t$ is a $\mathcal{B}$-state. The state $t' = \alpha(U, s', t)$ has the same values as $t$ outside of $U$. For each $x \in U$, if $s'_x$ is defined, then
\[ t'_x = 1 + \max\{t_y \mid y \in \downarrow\{x\} \cup \uparrow\{x\}\}. \qquad (2) \]
That is, the date on $x$ is `later' than anything related to it by $\sqsubseteq_B$. If $s'_x$ is undefined, then $t'_x$ is also taken to be undefined.

Our proof burdens are to show that $\Phi$ is an abstraction relation and that $\alpha$ is an abstraction server for $\Phi$. In effect, this means proving that the rules [Production], [Deletion], and [Abstraction] are satisfied by $\mathcal{B}$, $\Phi$, and $\alpha$. Let us do this fully for this example. We start with soundness with respect to production:
\[ \frac{\Phi(s, t) \qquad \mathcal{B}, t \models e}{\mathcal{A}, s \models e}. \]
This rule has essentially been defined to hold for this example. If $\mathcal{B}, t \models e$ then $t$ is defined on ${}^{\bullet}e^{\bullet}$ and Inequality 1 holds. By the definition of $\Phi(s, t)$ these conditions imply $\mathcal{A}, s \models e$. To see that it is sound with respect to deletion, suppose $U \subseteq B$; we must show that the following rule is satisfied:
\[ \frac{\Phi(s, t)}{\Phi(s \mid U, \; t \mid U)} \]
Suppose the hypothesis of the rule holds, let $s' = s \mid U$ and $t' = t \mid U$, and suppose $e \in E$. If $t'$ is defined on ${}^{\bullet}e^{\bullet}$ then it must be the case that ${}^{\bullet}e^{\bullet} \subseteq U$ so $t$ is also defined on ${}^{\bullet}e^{\bullet}$, where it has the same values as $t'$. If $s'$ is undefined on any of the elements of ${}^{\bullet}e^{\bullet}$, then so is $s$, hence also $t$ (because $\Phi(s, t)$), and consequently $t'$ too, contrary to assumption. Moreover, the values of $s'$ must be the same as those of $s$ on ${}^{\bullet}e^{\bullet}$. If $\max\{t'_x \mid x \in {}^{\bullet}e\} < \min\{t'_y \mid y \in e^{\bullet}\}$ then Inequality 1 holds too, so $\mathcal{A}, s' \models e$ follows from $\Phi(s, t)$ and the fact that $s'$ has the same values as $s$ on ${}^{\bullet}e^{\bullet}$.

To prove that $\alpha$ is an abstraction server for $\Phi$, we must show that the following rule is satisfied:
\[ \frac{\Phi(s, t) \qquad s \mid_{-} U = s' \mid_{-} U \qquad \downarrow U \text{ is consistent in } s'}{\Phi(s', t')} \]
where $t' = \alpha(U, s', t)$ and $U$ is a generation. Let $e$ be an event and suppose that $t'$ is defined on ${}^{\bullet}e^{\bullet}$ and
\[ \max\{t'_x \mid x \in {}^{\bullet}e\} < \min\{t'_y \mid y \in e^{\bullet}\}. \qquad (3) \]
We must show that $s'$ is defined on ${}^{\bullet}e^{\bullet}$ and $\mathcal{A}, s' \models e$. If ${}^{\bullet}e^{\bullet} \cap U = \emptyset$, then the values in question are the same as
those for $s, t$, so the desired conclusion follows from the fact that $\Phi(s, t)$ holds. Suppose therefore that ${}^{\bullet}e^{\bullet} \cap U \neq \emptyset$. Because $U$ is a generation it cannot contain elements of both ${}^{\bullet}e$ and $e^{\bullet}$. Suppose first that $U \cap {}^{\bullet}e \neq \emptyset$. Then there is a contradiction with Inequality 3 because the values of $t'$ to the right of $x$ must be the same as those of $t$ but the definition of $\alpha$ says that $t'_x$ is larger than any of these. Suppose second that $U \cap e^{\bullet} \neq \emptyset$. Then there is no problem because $\downarrow U$ was assumed to be consistent in $s'$ and $e \in \downarrow U$.

Equation 2 is, of course, different from the dates that would be assigned to modified files by consulting the system clock, so it is not exactly the same as the make abstraction server. This underscores the fact that a given abstraction may have many servers that could implement it. It is important for any choice of server to prove that it does indeed satisfy the expected invariant. For example, the server above doesn't leave one to wonder about the correctness of builds made after a reboot has caused or corrected an error in the time on the system clock.

A Customized Abstraction

Let us consider the optimization proposed for the p-net in Figure 8. In this example, it seems potentially worthwhile to retain the header y.tab.h for comparison to subsequent versions with later dates to avoid recompiling the lexer if no changes have occurred. To describe the abstraction, we refer to the names for p-net elements appearing in Figure 11. Let $\mathcal{A}$ be the standard model [...]

[...] $e$ and $t_y$ is a pair $(u, n)$ where $u = t_x$ and $n > t_z$. This means that $t$ satisfies the hypotheses of the second condition for the difference abstraction, so $s$ is defined on ${}^{\bullet}e^{\bullet}$ and $(x, z \mapsto u, s_z) \; R_e^{\mathcal{A}} \; (y \mapsto s_y)$. But $t$ also satisfies the first condition for the abstraction $\Phi$, so $s_x = t_x = u$, which means $\mathcal{A}, s \models e$ as desired. Soundness with respect to deletion is straightforward. The proof that $\alpha$ is an abstraction server is omitted; it is a hybrid of the proof above for the date abstraction and the argument below for difference abstractions.

Difference Abstraction

The difference abstraction caches the sources that were used to build a target. This information can be used to avoid subsequent rebuilds when sources have not changed. To describe the abstraction precisely, we need some more mathematical notation. Given a product $X \times Y$, let $\mathrm{fst} : X \times Y \to X$ be projection onto the first coordinate, and $\mathrm{snd} : X \times Y \to Y$ be projection onto the second. When working with expressions that may not exist, like $s_i$ where $s$ is a partial section, it is useful to write equations using Kleene equality: given expressions $P$ and $Q$, we write $P \simeq Q$ to mean that (1) $P$ exists if, and only if, $Q$ does and (2) if $P, Q$ exist then $P = Q$.

Model Define $B^{\mathcal{B}} = B$ and $E^{\mathcal{B}} = E$. Define
\[ V_x^{\mathcal{B}} = \begin{cases} V_x^{\mathcal{A}} & \text{if } x \text{ is a source} \\ V_x^{\mathcal{A}} \times \tilde{\Pi} (V_y^{\mathcal{A}} \mid y \in {}^{\bullet}e) & \text{if } x \text{ is produced by } e. \end{cases} \]
11
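The bookkeeping behind the difference abstraction can be illustrated with a minimal Python sketch. This is not the paper's formal construction: it assumes values are plain strings, represents an abstract state as a dictionary, and stores for each produced item a pair consisting of its value and the inputs that were used to build it. The names `current` and `up_to_date`, and the example items, are hypothetical.

```python
from typing import Dict, List, Optional, Tuple, Union

Value = str                                      # stand-in for file contents
Section = Dict[str, Value]                       # partial section: item -> value
Abstract = Union[Value, Tuple[Value, Section]]   # source: value; produced: (value, inputs used)
AbstractState = Dict[str, Abstract]


def current(t: AbstractState, x: str) -> Optional[Value]:
    """Current value recorded for x: t[x] for a source, the first
    component of t[x] for a produced item, None if x is undefined."""
    if x not in t:
        return None
    v = t[x]
    return v[0] if isinstance(v, tuple) else v


def up_to_date(t: AbstractState, pre: List[str], post: List[str]) -> bool:
    """An event with inputs `pre` and outputs `post` is up to date when
    some output cached exactly the current values of the inputs."""
    inputs = {x: current(t, x) for x in pre if x in t}
    return any(isinstance(t.get(y), tuple) and t[y][1] == inputs for y in post)


# Hypothetical example: main.o was built from version "v1" of main.c.
t: AbstractState = {"main.c": "v1", "main.o": ("obj1", {"main.c": "v1"})}
print(up_to_date(t, ["main.c"], ["main.o"]))   # True
t["main.c"] = "v2"                             # the source changes
print(up_to_date(t, ["main.c"], ["main.o"]))   # False
t["main.c"] = "v1"                             # reverting restores the cached inputs
print(up_to_date(t, ["main.c"], ["main.o"]))   # True
```

Unlike a date comparison, reverting a source to the cached contents makes the target current again without a rebuild, which is the saving the abstraction is meant to capture.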
To show that the abstraction is sound with respect to production, suppose $s$ and $t$ are related by the abstraction and $B, t \models e$ for some $s, t, e$. We must show that $A, s \models e$ also holds. The fact that $B, t \models e$ means that there is a post-condition $y \in e^\bullet$ such that $\mathrm{snd}(t_y)$ is the partial section $u = (c(t, x) \mid x \in {}^\bullet e)$. Now, the assumption that $s$ and $t$ are related tells us two things. First, $c(t, x) \simeq s_x$ for each $x$; this means that $u$ is $(s \mid {}^\bullet e)$. Second, since $t_y$ is defined, it has the form $(s_y, (s' \mid {}^\bullet e))$ where $A, s' \models e$. But, by Lemma 3, these facts imply that $A, s \models e$, as desired. That the abstraction is also sound with respect to deletion is straightforward, noting that the domains of existence of $s, t$ are the same if they are related, so the domains of existence of $(s \mid U)$ and $(t \mid U)$ will also be the same for any set of conditions $U$.

To see that the server is a server for the abstraction, let $U$ be a generation of conditions, let $s$ be an $A$-state, and let $t$ be a $B$-state. Suppose that $s$ and $t$ are related by the abstraction and $s'$ is an $A$-state such that $s - U = s' - U$ and $\downarrow U$ is consistent in $s'$. We must prove that the state $t'$ that the server assigns to $(U, s', t)$ is related to $s'$. First, if $x$ is not in $U$, then $s'_x, t'_x$ are the same as $s_x, t_x$; the desired properties hold because $s$ and $t$ are related. Suppose $x \in U$. If $x$ is a source, then $t'_x \simeq s'_x$ by definition and the condition of the abstraction for sources is therefore satisfied. If, on the other hand, $x$ is produced by $e$, then $t'_x \simeq (s'_x, (s' \mid {}^\bullet e))$. But $e \in \downarrow U$ and $\downarrow U$ is consistent in $s'$, thus, in particular, $A, s' \models e$.

Fingerprinting Abstraction

The difference abstraction is inefficient in some ways: the abstraction keeps the entirety of the old values used to produce the new ones, and the abstraction condition must check whether this value is equal to new values, possibly many times. To save space and time, it might be worthwhile to save a compressed version of the old value and compare this to compressed versions of the new values. We could choose to do the compression in such a way that the compressions of two values are the same if, and only if, the values themselves are the same. That is, we could choose an injective compression map. However, we are not generally interested in uncompressing the values in this case, only in keeping enough of a record of the values that an equality test can be carried out efficiently. This leads us naturally to the idea that if the `compression' is almost injective, then this will be good enough, because the probability of the `compressions' of two different values being the same is acceptably low. This is the idea behind fingerprinting, as discussed earlier in the context of the SML/NJ Compilation Manager and applied in numerous other contexts. To fit fingerprinting into the theoretical framework of this paper demands that we reconcile the fingerprinting concept of being correct almost always with the correctness criteria for abstractions, which stipulate correctness in all cases.

Perhaps the simplest way to achieve this reconciliation between correctness and almost-correctness is to focus the uncertainty about correctness in the relation between the actual model and an approximate model. Let $A$ be the intended model for a production net $N = (B, E, S, T)$ and suppose a build server for $A$ is given. For each $x \in B$, let us assume we are given a space $F_x$ of `fingerprints' and a fingerprinting function $f_x : V^A_x \to F_x$. We define a new model $\hat{A}$ as follows. The events and conditions of $\hat{A}$ are the same as those of $A$, that is, $B^{\hat{A}} = B^A$ and $E^{\hat{A}} = E^A$. For each $x \in B$, we define $V^{\hat{A}}_x = V^A_x \times F_x$, and for any event $e$ and $\hat{A}$-state $s$, we define $(s \mid {}^\bullet e)\; R^{\hat{A}}_e\; (s \mid e^\bullet)$ if, and only if, there is an $A$-state $s'$ such that

$$(s' \mid {}^\bullet e)\; R^A_e\; (s' \mid e^\bullet)$$

and, for each $x \in {}^\bullet e^\bullet$, $f_x(s'_x) \simeq \mathrm{snd}(s_x)$. That is, a relation $R_e$ holds in $\hat{A}$ iff the values of the pre- and post-conditions have fingerprints that could have been obtained from a related set of values in $A$. In particular, if $f_x$ is an injection, then $f_x(s'_x) \simeq \mathrm{snd}(s_x) \simeq f_x(\mathrm{fst}(s_x))$ so $s'_x = \mathrm{fst}(s_x)$. If $f_x$ is an injection for each $x \in B$, then $\hat{A}$ and $A$ are isomorphic. Thus the fidelity of $\hat{A}$ to $A$ is measured by how closely the fingerprinting function approximates being an injection. A server for $\hat{A}$ is also needed. For any $e, s, x$, the build server for $\hat{A}$ returns the pair $(v, f_x(v))$, where $v$ is the value at $x$ returned for $(e, s)$ by the build server for $A$. We are now prepared to describe fingerprinting as an abstraction of the approximate model $\hat{A}$.

Model. Define $B^B = B$ and $E^B = E$. Define

$$V^B_x = \begin{cases} F_x & \text{if $x$ is a source} \\ F_x \times \tilde{\Pi}\,(F_y \mid y \in {}^\bullet e) & \text{if $x$ is produced by $e$,} \end{cases}$$

$$c(t, x) \simeq \begin{cases} t_x & \text{if $x$ is a source} \\ \mathrm{fst}(t_x) & \text{if $x$ is produced.} \end{cases}$$

And, for each $B$-state $t$ and $e \in E$, define

$$(t \mid {}^\bullet e)\; R^B_e\; (t \mid e^\bullet)$$

if, and only if, there is some $y \in e^\bullet$ such that

$$(c(t, x) \mid x \in {}^\bullet e) = \mathrm{snd}(t_y).$$

Abstraction. For any $\hat{A}$-state $s$ and $B$-state $t$, the abstraction relation holds between $s$ and $t$ iff

for every source $x$, $t_x \simeq f_x(s_x)$, and

for every produced $x$, $t_x$ is defined iff $s_x$ is defined, and, if they are defined, then $t_x = (\mathrm{snd}(s_x), (f_y(s'_y) \mid y \in {}^\bullet e))$ for some $A$-state $s'$ such that $A, s' \models e$.

Server. For any generation $U \subseteq B$, $\hat{A}$-state $s'$ and $B$-state $t$, the server assigns to $(U, s', t)$ the $B$-state $t'$ defined as follows. If $x$ is a source in $U$, then $t'_x \simeq \mathrm{snd}(s'_x)$. If $x$ is a produced condition in $U$, define

$$t'_x \simeq (\mathrm{snd}(s'_x), (\mathrm{snd}(s'_y) \mid y \in {}^\bullet e)).$$

If $x$ is not in $U$, then $t'_x \simeq t_x$.
The proofs that this relation is an abstraction and that the map just defined is a server for it are very similar to the ones given for the difference abstraction above (modulo the tedium of some projections and fingerprintings).
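The fingerprinting abstraction can be illustrated in the same spirit with a short Python sketch. It is a sketch under stated assumptions rather than the construction used by any particular tool: SHA-256 from the standard `hashlib` module stands in for the fingerprinting functions, values are byte strings, and the names `fingerprint`, `record_source`, `record_build`, and `up_to_date` are hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple, Union

Fingerprint = str
# Source: its fingerprint; produced item: (its fingerprint, fingerprints of inputs used).
Abstract = Union[Fingerprint, Tuple[Fingerprint, Dict[str, Fingerprint]]]
AbstractState = Dict[str, Abstract]


def fingerprint(value: bytes) -> Fingerprint:
    """Assumed stand-in for a fingerprinting function: almost injective."""
    return hashlib.sha256(value).hexdigest()


def current(t: AbstractState, x: str) -> Fingerprint:
    """Fingerprint currently recorded for x (first component if produced)."""
    v = t[x]
    return v[0] if isinstance(v, tuple) else v


def record_source(t: AbstractState, x: str, value: bytes) -> None:
    """Server step for a changed source: keep only its fingerprint."""
    t[x] = fingerprint(value)


def record_build(t: AbstractState, values: Dict[str, bytes],
                 pre: List[str], post: List[str]) -> None:
    """Server step after a build: store output and input fingerprints."""
    used = {x: fingerprint(values[x]) for x in pre}
    for y in post:
        t[y] = (fingerprint(values[y]), dict(used))


def up_to_date(t: AbstractState, pre: List[str], post: List[str]) -> bool:
    """True if some output was built from inputs with the current fingerprints."""
    if any(x not in t for x in pre):
        return False
    inputs = {x: current(t, x) for x in pre}
    return any(isinstance(t.get(y), tuple) and t[y][1] == inputs for y in post)


# Hypothetical example with one source and one produced item.
values = {"main.c": b"int main(void) { return 0; }", "main.o": b"object code"}
t: AbstractState = {}
record_source(t, "main.c", values["main.c"])
record_build(t, values, ["main.c"], ["main.o"])
print(up_to_date(t, ["main.c"], ["main.o"]))   # True
record_source(t, "main.c", b"int main(void) { return 1; }")
print(up_to_date(t, ["main.c"], ["main.o"]))   # False
```

The abstract state holds only fixed-size digests, so the equality test no longer depends on the size of the cached sources; the price is the small probability that two different values share a fingerprint.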
6 Conclusions
The accomplishments of this paper are the introduction of production nets and their models, the formulation of abstractions and their associated correctness conditions, and the application of these concepts in a collection of noteworthy cases. Questions still remain about the integration of models, the way in which models should be described and implemented, and whether a sufficient range of problems that arise in real system configurations can be treated reasonably using the abstract framework described here. A more limited objective is applying the theory to more of the cases within its current realm; challenges include a rigorous treatment of abstractions like Tichy's smart recompilation and the treatment of systems with multi-pass builds (or apparent cycles) such as the type-setting program LaTeX. Let me close by commenting very briefly on the three larger questions.

From a mathematical perspective, the full version of this paper provides a satisfactory account of how various models and abstractions can be combined. To give a hint about the systems perspective of this mathematics, suppose we are given a pair of IDE's and a project that needs to manipulate items produced by them. Each of the IDE's controls the dependencies, abstractions, and build operations for a portion of the collection of items included in the project. Figure 13 illustrates the general idea. The IDE's supply the servers that are used in the overall build, which is controlled by computation over the underlying p-net according to the rules in Figure 12. Abstractions may be supplied by the IDE's (based on special `knowledge' they have about the semantics of the items in their domains) or by other means (as, for instance, make supplies the date abstraction or CM supplies the fingerprinting abstraction).

[Figure 13: Integrating Models. Panel labels: First IDE, General Operation, Second IDE.]

P-nets are, of course, a mathematical abstraction; a system that uses them must represent them in a data structure or allow the programmer (or a system) to describe them in a language. A tool like makedepend performs this latter function for C-programming items and make. Also nmake [5] provides explicit support for dependency-reporting. A more sophisticated and abstract language than that of make description files is provided by the Vesta programming language [7, 6] which is the description language for the Vesta configuration management system [8]. Vesta makes some decisions differently from the way p-nets are applied in this paper, but the approaches may be complementary in some ways. The full version of this paper describes implementations of some of the build computations using the language Pict [10, 9, 11], which is based on Milner's $\pi$-calculus and provides useful high-level constructs for describing the concurrent computations over p-nets. Comparison with other configuration management languages and systems would be helpful. In any case, a suitable data structure or language for p-net models and abstractions requires further exploration.

As for whether a sufficient range of problems that arise in real system configurations can be treated reasonably using p-net abstractions and models, there are several issues that must be treated seriously before any attempt at validation seems worthwhile. Key issues include the treatment of versions/variants and the incorporation of changes in dependencies (that is, where a change in a source item results in a change in the underlying p-net of dependencies). Dynamic determination of dependencies may also be worthy of consideration. Another interesting issue is the possibility of restricting one's view of a p-net of items by moving a baseline to hide or expose items to change control.

Acknowledgements

I would like to express my appreciation to the following people who influenced this work: Benli Pierce, Sandip Biswas, Luca Cardelli, Tony Hoare, Michael Jackson, Trevor Jim, Cliff Jones, Dave MacQueen, V. Mahesh, Andy Pitts, John Reppy, Glynn Winskel. The following agencies and institute provided partial support for this project: ARO (USA), EPSRC (UK), NIMS (UK), NSF (USA), ONR (USA).
References
[1] Matthias Blume. CM: A Compilation Manager for SML/NJ. User Manual.
[2] Andrei Broder. Some applications of Rabin's fingerprinting method. In R. M. Capocelli et al., editors, Sequences II: Methods in Communication, Security, and Computer Science. Springer-Verlag, 1991.
[3] G. L. Burn, C. Hankin, and S. Abramsky. Strictness analysis for higher-order functions. Science of Computer Programming, 7:249-278, 1986.
[4] Stuart I. Feldman. Make - a program for maintaining computer programs. Software: Practice and Experience, 9:255-265, 1979.
[5] Glenn Fowler. A case for make. Software: Practice and Experience, 20(S1):S1/35-S1/46, 1990.
[6] Christine B. Hanna and Roy Levin. The Vesta language for configuration management. Technical Report 107, Digital Systems Research Center, 1993.
[7] Butler W. Lampson and Eric E. Schmidt. Practical use of a polymorphic applicative language. In Proceedings of the Tenth Annual ACM Symposium on Principles of Programming Languages, 1983.
[8] Roy Levin and Paul R. McJones. The Vesta approach to precise configuration of large software systems. Technical Report 105, Digital Systems Research Center, 1993.
[9] Benjamin C. Pierce. Programming in the pi-calculus: An experiment in programming language design. Tutorial notes on the Pict language. Available electronically, 1995.
[10] Benjamin C. Pierce and David N. Turner. Concurrent objects in a process calculus. In Takayasu Ito and Akinori Yonezawa, editors, Theory and Practice of Parallel Programming, number 907 in Lecture Notes in Computer Science, pages 187-215. Springer-Verlag, 1995.
[11] Benjamin C. Pierce and David N. Turner. Pict: A programming language based on the pi-calculus. To appear, 1995.
[12] Robert W. Schwanke and Gail E. Kaiser. Smarter recompilation. ACM Transactions on Programming Languages and Systems, 10(4):627-632, 1988.
[13] Zhong Shao and Andrew W. Appel. Smartest recompilation. In Susan L. Graham, editor, Conference Record of the Twentieth Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 439-450. ACM, 1993.
[14] Walter F. Tichy. Smart recompilation. ACM Transactions on Programming Languages and Systems, 8(3):273-291, 1986.