Synthetic biology aims at producing novel biological systems to carry out some desired and
well-defined functions. An ultimate dream is to design these systems at a high level of
abstraction using engineering-based tools and programming languages, press a button, and
have the design translated to DNA sequences that can be synthesized and put to work in
living cells. We introduce such a programming language, which allows logical interactions
between potentially undetermined proteins and genes to be expressed in a modular manner.
Programs can be translated by a compiler into sequences of standard biological parts, a
process that relies on logic programming and prototype databases that contain known
biological parts and protein interactions. Programs can also be translated to reactions,
allowing simulations to be carried out. While current limitations on available data prevent
full use of the language in practical applications, the language can be used to develop formal
models of synthetic systems, which are otherwise often presented by informal notations. The
language can also serve as a concrete proposal on which future language designs can be
discussed, and can help to guide the emerging standard of biological parts which so far has
focused on biological, rather than logical, properties of parts.
Figure 1. Flow chart for the translation of GEC programs to genetic devices, where each device consists of a set of part sequences. The GEC code describes the desired properties of the parts that are needed to construct the device. The constraints are resolved automatically to produce a set of devices that exhibit these properties. One of the devices can then be implemented inside a living cell. Before a device is implemented, it can also be compiled to a set of chemical reactions and simulated, in order to check whether it exhibits the desired behaviour. An alternative device can be chosen, or the constraints of the original GEC model can be modified based on the insights gained from the simulation.

[...] to be extended if necessary and simulated. Stochastic and deterministic simulation can be carried out directly in the tool (relying on the Systems Biology Workbench (Bergmann & Sauro 2006)) or through third-party tools after a translation to Systems Biology Markup Language (Hucka et al. 2003; Bornstein et al. 2008).

To our knowledge, only one other formally defined programming language for genetic engineering of cells is currently available, namely the GenoCAD language (Cai et al. 2007), which directly comprises a set of biologically meaningful sequences of parts. Another language is currently under development at the Sauro Lab, University of Washington. While GenoCAD allows a valid sequence of parts to be assembled directly, the Sauro Lab language allows this to be done in a modular manner and additionally provides a rich set of features for describing general biological systems. GEC also employs a notion of modules, but allows programs to be composed at higher levels of abstraction by describing the desired logical properties of parts; the translation deduces the low-level sequences of relevant biological parts that satisfy these properties, and the sequences can then be checked for biological validity according to the rules of the GenoCAD language.

Section 2 describes our assumptions about the databases, and gives an informal overview of GEC through examples and two case studies drawn from the existing literature. Section 3 gives a formal presentation of GEC through an abstract syntax and a denotational semantics, thus providing the necessary details to reproduce the compiler demonstrated by the examples in the paper. Finally, §4 points to the main contributions of the paper and potential use of GEC on a small scale, and outlines the limitations of GEC to be addressed in future work before the language can be adopted for large-scale applications.

2. RESULTS

2.1. The databases
The parts and reaction databases presented in this paper have been chosen to contain the minimal structure and information necessary to convey the main ideas behind GEC. Technically, they are implemented in Prolog, but their content is informally shown in tables 1 and 2. The reaction table represents reactions in the general form enzymes ~ reactants →{r} products, optionally with some parts omitted as in the reaction luxI ~ →{1.0} m3OC6HSL, in which m3OC6HSL is synthesized by luxI. We stress that the mass action rate constants {r} are hypothetical, and that specific rates have been chosen for the sake of clarity. The last four lines of the reaction database represent transport reactions, where m3OC6HSL → [m3OC6HSL] is the transport of m3OC6HSL into some compartment.

Table 1. The parts database.

type  id         properties
pcr   c0051      codes(clR, 0.001)
pcr   c0040      codes(tetR, 0.001)
pcr   c0080      codes(araC, 0.001)
pcr   c0012      codes(lacI, 0.001)
pcr   c0061      codes(luxI, 0.001)
pcr   c0062      codes(luxR, 0.001)
pcr   c0079      codes(lasR, 0.001)
pcr   c0078      codes(lasI, 0.001)
pcr   cunknown3  codes(ccdB, 0.001)
pcr   cunknown4  codes(ccdA, 0.1)
pcr   cunknown5  codes(ccdA2, 10.0)
prom  r0051      neg(clR, 1.0, 0.5, 0.00005)
                 con(0.12)
prom  r0040      neg(tetR, 1.0, 0.5, 0.00005)
                 con(0.09)
prom  i0500      neg(araC, 1.0, 0.000001, 0.0001)
                 con(0.1)
prom  r0011      neg(lacI, 1.0, 0.5, 0.00005)
                 con(0.1)
prom  runknown2  pos(lasR-m3OC12HSL, 1.0, 0.8, 0.1)
                 pos(luxR-m3OC6HSL, 1.0, 0.8, 0.1)
                 con(0.000001)
rbs   b0034      rate(0.1)
ter   b0015
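As an illustration only, the content of the two databases can be mirrored as plain Python data, so that the kind of query the compiler relies on (find parts of a given type with a matching property) becomes a simple filter. The paper's databases are Prolog facts; the tuple layout and the names `PARTS`, `REACTIONS`, `find_parts` and `synthesized_by` are assumptions made for this sketch. Only a few rows of table 1 and the single reaction quoted in the text are included.

```python
# Hypothetical Python mirror of a few rows of the parts database (table 1).
# Each part is (type, id, properties); a property is (name, arguments).
PARTS = [
    ("pcr",  "c0051", [("codes", ("clR", 0.001))]),
    ("pcr",  "c0040", [("codes", ("tetR", 0.001))]),
    ("pcr",  "c0061", [("codes", ("luxI", 0.001))]),
    ("prom", "r0051", [("neg", ("clR", 1.0, 0.5, 0.00005)), ("con", (0.12,))]),
    ("prom", "r0040", [("neg", ("tetR", 1.0, 0.5, 0.00005)), ("con", (0.09,))]),
    ("rbs",  "b0034", [("rate", (0.1,))]),
    ("ter",  "b0015", []),
]

# The reaction quoted in the text, luxI ~ ->{1.0} m3OC6HSL,
# stored as (enzymes, reactants, rate, products).
REACTIONS = [
    (("luxI",), (), 1.0, ("m3OC6HSL",)),
]


def find_parts(part_type, prop_name=None, species=None):
    """Return ids of parts of the given type that have a matching property.

    With no property name, every part of that type matches. With a property
    name, the first argument of the property is compared against `species`
    when one is given.
    """
    hits = []
    for ptype, pid, props in PARTS:
        if ptype != part_type:
            continue
        if prop_name is None:
            hits.append(pid)
            continue
        for name, args in props:
            if name == prop_name and (species is None or args[0] == species):
                hits.append(pid)
                break
    return hits


def synthesized_by(enzyme):
    """Return species produced by the enzyme in reactions with no reactants."""
    return [p
            for enzymes, reactants, _rate, products in REACTIONS
            if enzyme in enzymes and not reactants
            for p in products]
```

For example, `find_parts("prom", "neg", "tetR")` selects the promoter repressed by tetR, and `synthesized_by("luxI")` recovers the species named in the reaction above. A Prolog implementation would obtain the same answers by unification rather than by explicit loops.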
Table 2. A minimal reaction database consisting of basic reactions, enzymatic reactions with enzymes preceding the ~ symbol and transport reactions with compartments represented by square brackets. (Mass action rate constants are enclosed in braces.)

Table 3. Parts in GEC with their corresponding graphical representation.

Table 5. Reactions in the GEC language with their corresponding graphical representation.

The module keyword is followed by the name of the module, a list of formal parameters and the body of the module enclosed in brackets {}. For the constitutive expression module, we arbitrarily fix a promoter. Modules can be invoked simply by naming them together with a list of actual parameters, as in the case of the tl module above. These gate modules allow subsequent examples to abstract away from the level of individual parts. The remaining examples in this paper will also use dual-output counterparts of some of these modules, for producing two proteins rather than one:
Figure 2. The expanded repressilator program and its corresponding graphical representation.

Figure 3. Stochastic simulation plots of two repressilator devices. (a) The device using clR (red solid lines), tetR (blue dashed lines) and araC (green dotted lines) is defective due to the low rate of unbinding between araC and its promoter, while (b) the device using clR (red solid lines), tetR (blue dashed lines) and lacI (green dotted lines) behaves as expected. [Both panels: y-axis populations (10^3), x-axis time (10^3).]

Figure 4. The expanded quantitative repressilator program (with quantitative constraints omitted) and the corresponding graphical representation. The quantitative constraints given in the main text restrict rates to a given range, placing further constraints on the parts that can be chosen from the database.

module gateNeg(i, o) {
  new RB. new RUB. new RTB.
  new RT. new R. new RD.
  prom<con(RT), neg(i, RB, RUB, RTB)>;
  rbs<rate(R)>;
  pcr<codes(o,RD)>; ter |
  0.9 < RB | RB < 1.1 |
  0.4 < RUB | RUB < 0.6 |
  0.05 < RT | RT < 0.15 |
  RTB < 0.01 |
  0.05 < R | R < 0.15
};

The first two lines use the new operator to ensure that variables are globally fresh. This means that variables of the same name, but under different scopes of the new operator, are considered semantically distinct. This is important in the repressilator example because the gateNeg module is instantiated three times, and we do not necessarily require that the binding rate RB is the same for all three instances. A sequence of part types with properties then follows, this time with rates given by variables, as shown in figure 4. Finally, constraints on these rate variables are composed using the constraint composition operator |. With this module replacing the non-quantitative gate module defined previously, compilation of the repressilator now results in the six devices without the promoter araC, rather than the 24 from before. One of these is the repressilator device contained in the MIT parts database under the identifier I5610:

[r0040, b0034, c0051, b0015, r0051, b0034,
c0012, b0015, r0011, b0034, c0040, b0015]
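The device counts quoted for the repressilator (24 without quantitative constraints, 6 with them) can be reproduced by brute-force enumeration. The sketch below is not the compiler's Prolog search. It assumes that the repressilator is a ring of three negative gates, written here as gateNeg(A,B); gateNeg(B,C); gateNeg(C,A), and that A, B and C must be distinct proteins. The promoter and coding-region rows are copied from table 1; the names `PROMOTERS`, `CODING`, `satisfies_gate_neg` and `repressilator_devices` are hypothetical.

```python
from itertools import permutations

# Promoter rows from table 1: id -> (repressor, RB, RUB, RTB, RT),
# i.e. the arguments of neg(i, RB, RUB, RTB) followed by con(RT).
PROMOTERS = {
    "r0051": ("clR",  1.0, 0.5,      0.00005, 0.12),
    "r0040": ("tetR", 1.0, 0.5,      0.00005, 0.09),
    "i0500": ("araC", 1.0, 0.000001, 0.0001,  0.1),
    "r0011": ("lacI", 1.0, 0.5,      0.00005, 0.1),
}
# Coding regions from table 1: protein -> part id.
CODING = {"clR": "c0051", "tetR": "c0040", "araC": "c0080", "lacI": "c0012"}
RBS, RBS_RATE, TER = "b0034", 0.1, "b0015"


def satisfies_gate_neg(rb, rub, rtb, rt, r):
    """Range constraints stated in the quantitative gateNeg module."""
    return (0.9 < rb < 1.1 and 0.4 < rub < 0.6 and 0.05 < rt < 0.15
            and rtb < 0.01 and 0.05 < r < 0.15)


def repressilator_devices(quantitative):
    """Enumerate part lists for the assumed ring A -| B -| C -| A."""
    usable = {}  # repressor -> promoter id that it represses
    for pid, (rep, rb, rub, rtb, rt) in PROMOTERS.items():
        if not quantitative or satisfies_gate_neg(rb, rub, rtb, rt, RBS_RATE):
            usable[rep] = pid
    devices = []
    for a, b, c in permutations(usable, 3):
        device = []
        # Each gate: promoter repressed by `inp`, then rbs, coding region
        # for `out`, and a terminator.
        for inp, out in ((a, b), (b, c), (c, a)):
            device += [usable[inp], RBS, CODING[out], TER]
        devices.append(device)
    return devices
```

Under these assumptions the unconstrained enumeration yields 4 × 3 × 2 = 24 devices. The araC-regulated promoter i0500 fails the range 0.4 < RUB < 0.6 because its unbinding rate is 0.000001, which leaves 3! = 6 devices, among them the I5610 part list shown above.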
Simulation of the associated reactions now yields the expected oscillations, as shown in figure 3b. Observe also how modularity allows localized parts of a program to be refined without rewriting the entire program.

2.4. Case study: the predator–prey system
Our second case study, an Escherichia coli predator–prey system (Balagadde et al. 2008) shown in figure 5, represents one of the largest synthetic systems implemented to date. It is based on two quorum-sensing systems, one enabling predator cells to induce expression of a death gene in the prey cells, and the other enabling prey cells to inhibit expression of a death gene in the predator. In the predator, Q1a is constitutively expressed and synthesizes H1, which diffuses to the prey where it dimerizes with the constitutively expressed Q1b. This dimer in turn induces expression of the death protein ccdB. Symmetrically, the prey constitutively expresses Q2a for synthesizing H2, which diffuses to the predator where it dimerizes with the constitutively expressed Q2b. Instead of inducing cell death, this dimer induces expression of an antidote A, which interferes with the constitutively expressed death protein.

Note how we have left the details of the quorum-sensing system and antidote unspecified by using variables (upper case names) for the species involved. Only the death protein is specified explicitly (using a lower case name). A GEC program implementing the logic of figure 5 can be written as follows:
Figure 5. The expanded predator–prey program and the corresponding graphical representation. The simulation-only reactions described in the main text are not represented.

[...] choices of proteins and hence of parts. Reversible reactions are an abbreviation for the composition of two reactions. For example, Q2b + H2 ↔ Q2b-H2 is an abbreviation for Q2b + H2 → Q2b-H2 | Q2b-H2 → Q2b + H2. The last two lines of the predator and prey modules specify reactions that are used for simulation only and do not impose any constraints, indicated by the star preceding the reaction arrows. The second to last line is a simple approach to modelling cell death, and we return to this when discussing simulations shortly. The last line consists of degradation reactions for H1 and H2; since these are not the result of gene expression, the associated degradation reactions are not deduced automatically by the compiler.

The transport module defines transport reactions in and out of two compartments, c1 and c2, representing, respectively, the predator and prey cell boundaries. The choice of compartment names is not important for compilation to parts, but it is important in simulations where a distinction must be made between the populations of the same species in different compartments.

The main body of the program invokes the three modules, while putting the predator and prey inside their respective compartments using the compartment operator. The modules are composed using the parallel composition operator. In contrast to the sequential composition operator which intuitively concatenates the part sequences of its operands, the parallel composition intuitively results in the union of the part sequences of its operands. This is useful when devices are implemented on different plasmids, or even in different cells as in this example. The expanded predator–prey program is shown in figure 5.

Compiling the program results in four devices, each consisting of two lists of parts that implement the predator and prey, respectively. One device is shown below.

[r0051, b0034, c0062, b0034, c0078, b0015,
runknown2, b0034, cunknown5, b0015,
r0051, b0034, cunknown3, b0015]
[runknown2, b0034, cunknown3, b0015,
r0051, b0034, c0061, b0034, c0079, b0015]

By inspection of the database, we establish that the compiler has found luxR/lasI/m3OC12HSL and lasR/luxI/m3OC6HSL for implementing the quorum-sensing components in the respective cells and ccdA2 for the antidote, and that it has found a unique promoter runknown2, which is used both for regulating expression of ccdA2 in the predator and for regulating expression of ccdB in the prey. The fact that the two instances of this one promoter are located in different compartments now plays a crucial role: without the compartment boundaries, undesired crosstalk would arise between lasR-m3OC12HSL and the promoter runknown2 in the predator, and between luxR-m3OC6HSL and the promoter runknown2 in the prey. Indeed, if we remove the compartments from the program, the compiler will detect this crosstalk and report that no devices could be found. This illustrates another important assumption about the semantics of the language: a part may be used only if its implicit properties do not contain species which are present in the same compartment as the part. By implicit properties, we mean the properties of a part that are not explicitly specified in the program. In our example, the part runknown2 in the predator has the implicit property that it is positively regulated by lasR-m3OC12HSL. Hence the part may not be used in a compartment in which lasR-m3OC12HSL is present. The use of compartments in our example ensures that this condition of crosstalk avoidance is met.

Figure 6. Stochastic simulation plots of two predator–prey devices. (a) The device using ccdA2 is defective due to the low rate of ccdA2-catalysed ccdB degradation, while (b) the device using ccdA behaves as expected (red solid lines, c1[ccdB]; blue dashed lines, c2[ccdB]). [Both panels: y-axis populations (10^2), x-axis time (10^3).]

The reactions associated with the above device are shown in the electronic supplementary material. Simulation results, in which the populations of the killer protein ccdB in the predator and prey are plotted, are shown in figure 6a. We observe that the killer protein in the predator remains expressed, hence blocking the synthesis of H1 through the simulation-only reaction, and preventing expression of killer protein in the prey. This constant level of killer protein in the predator is explained by the low rate at which the antidote protein ccdA2 used in this particular device degrades ccdB, and by the high degradation rate of ccdA2. The second of the four devices is identical to the device above, except that the more effective antidote ccdA is expressed using cunknown4 instead of cunknown5:

[r0051, b0034, c0062, b0034, c0078,
b0015, runknown2, b0034, cunknown4,
b0015, r0051, b0034, cunknown3, b0015]
[runknown2, b0034, cunknown3, b0015,
r0051, b0034, c0061, b0034, c0079, b0015]

The simulation results of the reactions associated with this device are shown in figure 6b. The two remaining devices are symmetric to the ones shown above in the sense that the same two quorum-sensing
systems are used, but they are swapped around such that the predator produces m3OC6HSL rather than m3OC12HSL, and vice versa for the prey.

We stress that the simulation results obtained for the predator–prey system do not reproduce the published results. One reason is that we are plotting the levels of killer proteins in a single predator and a single prey cell rather than cell populations, in order to simplify the simulations. Another reason is that the published results are based on a reduced ordinary differential equation model, and the parameters for the full model are not readily available. Our simplified model is nevertheless sufficient to illustrate the approach, and can be further refined to include additional details of the experimental set-up. We note a number of additional simplifying omissions in our model: expression of the quorum-sensing proteins in the prey, and of the killer protein in the predator, are IPTG induced in the original model; activated luxR (i.e. in complex with m3OC6HSL) dimerizes before acting as a transcription factor; and the antidote mechanism is more complicated than mere degradation (Afif et al. 2001).

3. METHODS

In this section, we give a formal definition of GEC in terms of its syntax and semantics. The definition is largely independent of the concrete choice of part types and properties, except for the translation to reactions, which relies on the part types and properties described in the previous section. Here, we focus on the translation to parts, which we consider to be the main novelty, and the translation to reactions is defined formally in the electronic supplementary material.

Table 6. The abstract syntax of GEC, in terms of programs P, constraints C and reactions R, with one alternative per line. (Programs consist of part sequences, modules, compartments and constraints, together with localized variables. The definition is recursive, in that a large program can be made up of smaller programs. Constraints consist of reactions and parameter ranges.)

P ::= u : t(Q^t)                   part u of type t with properties Q^t
      0                            empty program
      p(ũ) {P1}; P2                definition of module p with formals ũ
      p(Ã)                         invocation of module p with actuals Ã
      P | C                        constraint C associated to program P
      P1 ‖ P2                      parallel composition of P1 and P2
      P1 ; P2                      sequential composition of P1 and P2
      c[P]                         compartment c containing program P
      new x. P                     local variable x inside program P
C ::= R                            reaction
      T                            transport reaction
      K                            numerical constraint
      C1 | C2                      conjunction of C1 and C2
R ::= S ~ Σi mi·Si →r Σj mj·Sj     reactants Si, products Sj
T ::= S →r c[S]                    transport of S into compartment c
      c[S] →r S                    transport of S out of compartment c
K ::= E1 > E2                      expression E1 greater than E2
E ::= r                            real number or variable
      E1 ⊗ E2                      arithmetic operation ⊗ on E1 and E2
A ::= r                            real number or variable
      S                            species
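To make the program constructs of table 6 more concrete, the following sketch encodes a small fragment of the grammar as Python data classes and computes part sequences in the informal sense of §2.4, where sequential composition concatenates the part sequences of its operands and parallel composition takes their union. Everything here is an assumption made for illustration: the class names, the omission of properties, modules, constraints and `new`, the use of a list rather than a set for the union, and the restriction of `;` to operands that each yield a single sequence.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Part:
    """u : t(Q^t), with the properties Q^t left out of this sketch."""
    name: str
    type: str


@dataclass(frozen=True)
class Seq:
    """P1 ; P2"""
    left: "Program"
    right: "Program"


@dataclass(frozen=True)
class Par:
    """P1 || P2"""
    left: "Program"
    right: "Program"


@dataclass(frozen=True)
class Comp:
    """c[P]"""
    compartment: str
    body: "Program"


Program = Union[Part, Seq, Par, Comp]


def part_sequences(p: Program) -> List[List[str]]:
    """Return the part sequences described by program p."""
    if isinstance(p, Part):
        return [[p.name]]
    if isinstance(p, Comp):
        # Compartments matter for constraints and simulation, not for the
        # order of parts within a sequence.
        return part_sequences(p.body)
    if isinstance(p, Par):
        # Union of the operands' sequences (kept as a list here).
        return part_sequences(p.left) + part_sequences(p.right)
    if isinstance(p, Seq):
        left, right = part_sequences(p.left), part_sequences(p.right)
        if len(left) != 1 or len(right) != 1:
            raise NotImplementedError(
                "this sketch only sequences operands with a single part list")
        return [left[0] + right[0]]
    raise TypeError(f"not a program: {p!r}")
```

A program with two promoter and ribosome binding site pairs composed in parallel, such as `Par(Seq(Part("X1", "prom"), Part("X2", "rbs")), Seq(Part("Y1", "prom"), Part("Y2", "rbs")))`, yields the two sequences `[X1, X2]` and `[Y1, Y2]`.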
3.1. The syntax of GEC

3.1.1. The abstract syntax of GEC. We assume fixed sets of primitive species names Ns, part identifiers Np and variables X. We assume that x ∈ X represents variables, n ∈ Np represents part identifiers and u ∈ U represents species, parts and variables, where U ≜ Ns ∪ Np ∪ X. A type system would be needed to ensure the proper, context-dependent use of u, but this is a standard technical issue that we omit here. The set of complex species is given by the set S of multisets over Ns ∪ X and is ranged over by S; multisets (i.e. sets that may contain multiple copies of each element) are used in order to allow for homomultimers.

A fixed set T of part types t is also assumed together with a set Qt of possible part properties for each type t ∈ T. We define Q ≜ ∪t∈T Qt and let Q^t ⊆ Qt range over finite sets of properties of type t. In the case studies, properties are terms over S ∪ R where R is the set of real numbers, but the specific structure of properties is not important from a language perspective; all we require is that functions FV : Q → X and FS : Q → S are given for obtaining the variables and species of properties, respectively, and we assume these functions to be extended to other program constructs in the standard way.

Finally, c ranges over a given set of compartments, p ranges over a given set of program (module) identifiers, m ranges over the natural numbers N, r ranges over R ∪ X and ⊗ ranges over an appropriate set of arithmetic operators. The abstract syntax of GEC is then given by the grammar in table 6. The tilde symbol (˜) in the grammar is used to denote lists, and the sum (Σ) formally represents multisets.

3.1.2. Derived forms. Some language constructs are not represented explicitly in the grammar of table 6, but can easily be defined in terms of the basic abstract syntax. Reversible reactions are defined as the parallel composition of two reactions, each representing one of the directions. We use the underscore (_) as a wild card to mean that any name can be used in its place. This wild card can be encoded using variables and the new operator; for example, we define _ : t(Q^t) ≜ new x. x : t(Q^t). In the specific case of basic part programs, we will often omit the wild card, i.e. t(Q^t) ≜ _ : t(Q^t). We also allow constraints to be composed to the left of programs and define C | P ≜ P | C.

3.1.3. The concrete syntax of GEC. The examples of GEC programs given in this paper are written in a concrete syntax understood by the implemented parser. The main difference compared to the abstract syntax is
that variables are represented by upper case identifiers and names are represented by lower case identifiers. Complex species are represented by lists separated by the (-) symbol in the concrete syntax, and the fact that complex species are multisets in the abstract syntax reflects that this operator is commutative (i.e. ordering is not significant). Similar considerations apply to the sum operator in reactions. We also assume some standard precedence rules, in particular that (;) binds tighter than (||), and allow the use of parentheses to override these standard rules if necessary.

3.2. The semantics of GEC

We first illustrate the semantics of GEC informally through a small example, which exhibits characteristics from the predator–prey case study, and then turn to the formal presentation.

3.2.1. A small example. The translation uses the notion of context-sensitive substitutions (θ, ρ, σ, τ) to represent solutions to the constraints of a program, where θ is a mapping from variable names to primitive species names, part names or numbers; ρ is a set of variables that must remain distinct after the mapping is applied; σ and τ are, respectively, the species names that have been used in the current context and the species names that are excluded for use. This information is necessary in order to ensure piecewise injectivity over compartment boundaries, together with crosstalk avoidance, as mentioned in the case studies. The translation also uses the notion of device templates. These are sets containing lists over part names and variables, and a context-sensitive substitution can be applied to a device template in order to obtain a final concrete device. Consider the following example:

(X1:prom<pos(H1-Q1b)>; X2:rbs) ||
(Y1:prom<pos(Q2b-H2)>; Y2:rbs)

The translation first processes the two sequential compositions in isolation, and then the parallel composition.

(i) Observe that the database only lists a single promoter part that is positively regulated by a dimer, namely runknown2. The first sequential composition gives rise to the device template {[X1,X2]} and two context-sensitive substitutions, one for each possible choice of transcription factors listed in the database for this promoter part. These are (θ1, ρ1, σ1, τ1) and (θ1′, ρ1′, σ1′, τ1′) where
θ1 = {(X1 ↦ runknown2), (X2 ↦ b0034), (H1 ↦ m3OC12HSL), (Q1b ↦ lasR)}.
ρ1 = {H1, Q1b}.
σ1 = {m3OC12HSL-lasR}.
τ1 = {m3OC6HSL-luxR}.
θ1′ = {(X1 ↦ runknown2), (X2 ↦ b0034), (H1 ↦ m3OC6HSL), (Q1b ↦ luxR)}.
ρ1′ = {H1, Q1b}.
σ1′ = {m3OC6HSL-luxR}.
τ1′ = {m3OC12HSL-lasR}.
Note that τ1 = {m3OC6HSL-luxR}, since the complex m3OC6HSL-luxR is in the properties of runknown2, but has not been mentioned explicitly in the program under the corresponding substitution θ1. Therefore, this complex should not be used anywhere in the same compartment as runknown2, in order to prevent unwanted interference between parts. Similar ideas apply to τ1′.

(ii) The second sequential composition produces equivalent results, namely the device template {[Y1,Y2]} and two context-sensitive substitutions, one for each possible choice of transcription factors in the database for the promoter part. These are (θ2, ρ2, σ2, τ2) and (θ2′, ρ2′, σ2′, τ2′) where
θ2 = {(Y1 ↦ runknown2), (Y2 ↦ b0034), (H2 ↦ m3OC12HSL), (Q2b ↦ lasR)}.
ρ2 = {H2, Q2b}.
σ2 = {m3OC12HSL-lasR}.
τ2 = {m3OC6HSL-luxR}.
θ2′ = {(Y1 ↦ runknown2), (Y2 ↦ b0034), (H2 ↦ m3OC6HSL), (Q2b ↦ luxR)}.
ρ2′ = {H2, Q2b}.
σ2′ = {m3OC6HSL-luxR}.
τ2′ = {m3OC12HSL-lasR}.

(iii) The parallel composition can now be evaluated by taking the union of device templates from the two components, resulting in {[X1,X2], [Y1,Y2]}. However, none of the context-sensitive substitutions are compatible. We can combine neither θ1 and θ2, nor θ1′ and θ2′, because the union of these is not injective on the corresponding domains determined by ρ1 ∪ ρ2 and ρ1′ ∪ ρ2′, respectively. And we can combine neither θ1 and θ2′, nor θ1′ and θ2, because the corresponding used species and exclusive species overlap. This can be overcome by placing each parallel component inside a compartment as in the predator–prey case study. The compartment operator simply disregards the injective domain variables, the used names and the exclusive names of its operands, after which all the four combinations of substitutions mentioned above would be valid. In this example, all four resulting substitutions map the part variables to the same part names, and hence only the single device {[runknown2,b0034], [runknown2,b0034]} results.

3.2.2. Formal definition. Transport reactions contain explicit compartment names that are important for simulation, but only the logical property that transport is possible is captured in the parts database. We therefore define the operator (·)↓ on transport reactions to ignore compartment names, where (S → c[S′])↓ ≜ S → S′ and (c[S] → S′)↓ ≜ S → S′. The meaning of a program is then given relative to global databases Kb and Kr of parts and reactions, respectively. For the formal treatment, we assume these to be given by two finite sets of ground terms:
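The compatibility argument in step (iii) of the small example in §3.2.1 can be replayed mechanically. The sketch below is an informal reading of that prose and not the formal definition: the `Subst` class, the `compatible` and `in_compartment` functions, and the representation of complexes as plain strings are assumptions made for illustration. Two substitutions are treated as compatible when their mappings agree, the merged mapping is injective on ρ1 ∪ ρ2, and neither side uses a species that the other excludes.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Subst:
    """A context-sensitive substitution (theta, rho, sigma, tau)."""
    theta: dict        # variable -> species name, part name or number
    rho: frozenset     # variables that must map to distinct values
    sigma: frozenset   # species used in the current context
    tau: frozenset     # species excluded from use


def compatible(a, b):
    """Check whether two substitutions can be combined under '||'."""
    # The two mappings must agree wherever they share a variable.
    for v in a.theta.keys() & b.theta.keys():
        if a.theta[v] != b.theta[v]:
            return False
    merged = {**a.theta, **b.theta}
    # The merged mapping must be injective on rho_a ∪ rho_b.
    images = [merged[v] for v in a.rho | b.rho]
    if len(images) != len(set(images)):
        return False
    # Crosstalk avoidance: used species must not meet excluded species.
    return not (a.sigma & b.tau) and not (b.sigma & a.tau)


def in_compartment(s):
    """The compartment operator drops rho, sigma and tau, keeping theta."""
    return Subst(s.theta, frozenset(), frozenset(), frozenset())


LAS, LUX = "m3OC12HSL-lasR", "m3OC6HSL-luxR"
s1 = Subst({"X1": "runknown2", "X2": "b0034", "H1": "m3OC12HSL", "Q1b": "lasR"},
           frozenset({"H1", "Q1b"}), frozenset({LAS}), frozenset({LUX}))
s1p = Subst({"X1": "runknown2", "X2": "b0034", "H1": "m3OC6HSL", "Q1b": "luxR"},
            frozenset({"H1", "Q1b"}), frozenset({LUX}), frozenset({LAS}))
s2 = Subst({"Y1": "runknown2", "Y2": "b0034", "H2": "m3OC12HSL", "Q2b": "lasR"},
           frozenset({"H2", "Q2b"}), frozenset({LAS}), frozenset({LUX}))
s2p = Subst({"Y1": "runknown2", "Y2": "b0034", "H2": "m3OC6HSL", "Q2b": "luxR"},
            frozenset({"H2", "Q2b"}), frozenset({LUX}), frozenset({LAS}))

pairs = list(product([s1, s1p], [s2, s2p]))
```

With these definitions none of the four pairs is compatible as written, matching the text: `(s1, s2)` and `(s1p, s2p)` fail the injectivity test, while `(s1, s2p)` and `(s1p, s2)` fail the used/excluded test. Wrapping each operand with `in_compartment` makes all four pairs compatible.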
[...] modification sites that is common when modelling signalling pathways. This in turn could tie in with the use of protein domain part types, which are already present to some extent in the Registry of Standard Biological Parts. An important and non-trivial future challenge is therefore to address this question of representation.

For any given representation, however, logic programming can be used to design further levels of deductive reasoning. A simple example would be the notion of indirect regulation, such as a promoter that is indirectly positively regulated by s, which would have the property ipos(s), if either (i) it has the property pos(s) or (ii) it has the property neg(s′) and there is a reaction s + s′ → s-s′. A deductive approach to the analysis of genes, implemented using theorem proving in the tool BioDeductiva, is discussed in Shrager et al. (2007). This approach is likely to also work with biochemical reactions, in which case linking the compiler to this tool may be of interest. However, the challenge of formalizing the data representation and deductive layers in this framework remains.

With additional deductive power also comes increased computational complexity. Our current Prolog-based approach is unlikely to scale with increasing numbers of variables in programs, partly because the search for solutions is exhaustive and unguided, and also because the order in which constraints are tested is arbitrarily chosen as the order in which they occur in the GEC programs. Future work should seek to ameliorate these problems through the use of dedicated constraint logic-programming techniques, which we anticipate can be added as extensions to our current approach by using the facilities of the ECLiPSe environment.

More work is also needed to increase the likelihood that devices generated by the compiler do in fact work in practice, for example by taking into account compatibility between parts and compatibility with target cells. Quantitative constraints are another important means to this end, and we have indicated how mass action rate constants can be added to parts. While such a representation is simple, it does not account for the notion of fluxes, measured for example in polymerases per second or ribosomes per second and used for instance in Marchisio & Stelling (2008). But imposing meaningful constraints on fluxes, which are expressed as first-order derivatives of real functions, is not an easy task and remains an important future challenge. The related issue of accommodating cooperatively regulated promoters, such as those described in Buchler et al. (2003), Hermsen et al. (2006) and Cox et al. (2007), also remains to be addressed.

Another issue that needs to be considered is the potential impact of circuits on the host physiology, such as the metabolic burden that can be caused by circuit activation. This is an important issue in the design of synthetic gene circuits, as discussed for example in Andrianantoandro et al. (2006) and Marguet et al. (2007). One way to account for this impact is to include an explicit model of host physiology when generating the reactions of a given GEC program for simulation. For instance, the degradation machinery of a host cell could be modelled as a finite species, and protein degradation could be modelled as an interaction with this species, rather than as a simple delay. As the number of proteins increases, the degradation machinery would then become overloaded and the effects would be observed in the simulations. More refined models of host physiology could also be included by refining the corresponding part properties accordingly.

On a practical note, much can be done to improve the prototype implementation of the compiler, including for example better support for error reporting. Indeed, it is possible to write syntactically well-formed programs that are not semantically meaningful, e.g. by using a complex species where a part identifier is expected. Such errors should be handled through a type system, but this is a standard idea which has therefore not been the focus of the present work. Neither have the efforts of the GenoCAD tool been duplicated, so the compiler will not refuse programs that do not adhere to the structure set forth by the GenoCAD language. We do, however, show in the electronic supplementary material how the GenoCAD language can be integrated into our framework in a compositional manner. Finally, practical applications will require extensions of the current minimal databases, which may lead to additional part types or properties.

The authors would like to thank Gordon Plotkin for useful discussions.

REFERENCES

Afif, H., Allali, N., Couturier, M. & Van Melderen, L. 2001 The ratio between ccda and ccdb modulates the transcriptional repression of the ccd poison-antidote system. Mol. Microbiol. 41, 73–82. (doi:10.1046/j.1365-2958.2001.02492.x)
Andrianantoandro, E., Basu, S., Karig, D. & Weiss, R. 2006 Synthetic biology: new engineering rules for an emerging discipline. Mol. Syst. Biol. 2, 2006.0028. (doi:10.1038/msb4100073)
Apt, K. R. & Wallace, M. G. 2007 Constraint logic programming using ECLiPSe. Cambridge, UK: Cambridge University Press.
Balagadde, F., Song, H., Ozaki, J., Collins, C., Barnet, M., Arnold, F., Quake, S. & You, L. 2008 A synthetic Escherichia coli predator–prey ecosystem. Mol. Syst. Biol. 4, 187. (doi:10.1038/msb.2008.24)
Bergmann, F. T. & Sauro, H. M. 2006 SBW – a modular framework for systems biology. In WSC '06: Proc. 38th Conf. on Winter Simulation, pp. 1637–1645.
Blossey, R., Cardelli, L. & Phillips, A. 2008 Compositionality, stochasticity and cooperativity in dynamic models of gene regulation. HFSP J. 2, 17–28. (doi:10.2976/1.2804749)
Bornstein, B. J., Keating, S. M., Jouraku, A. & Hucka, M. 2008 LibSBML: an API library for SBML. Bioinformatics 24, 880–881. (doi:10.1093/bioinformatics/btn051)
Buchler, N. E., Gerland, U. & Hwa, T. 2003 On schemes of combinatorial transcription logic. Proc. Natl Acad. Sci. USA 100, 5136–5141. (doi:10.1073/pnas.0930314100)
Cai, Y., Hartnett, B., Gustafsson, C. & Peccoud, J. 2007 A syntactic model to design and verify synthetic genetic constructs derived from standard biological parts. Bioinformatics 23, 2760–2767. (doi:10.1093/bioinformatics/btm446)
Chabrier-Rivier, N., Fages, F. & Soliman, S. 2004 The biochemical abstract machine BIOCHAM. In Proc. CMSB, vol. 3082 (eds V. Danos & V. Schächter). Lecture Notes in Computer Science, pp. 172–191. Berlin, Germany: Springer.
Ciocchetta, F. & Hillston, J. 2008 Bio-pepa: an extension of the process algebra pepa for biochemical networks. Electron. Notes Theor. Comput. Sci. 194, 103–117. (doi:10.1016/j.entcs.2007.12.008)
Cox III, R. S., Surette, M. G. & Elowitz, M. B. 2007 Programming gene expression with combinatorial promoters. Mol. Syst. Biol. 3, 145. (doi:10.1038/msb4100187)
Danos, V., Feret, J., Fontana, W., Harmer, R. & Krivine, J. 2007 Rule-based modelling of cellular signalling. In CONCUR, vol. 4703 (eds L. Caires & V. T. Vasconcelos). Lecture Notes in Computer Science, pp. 17–41. Berlin, Germany: Springer. Tutorial paper.
Elowitz, M. B. & Leibler, S. 2000 A synthetic oscillatory network of transcriptional regulators. Nature 403, 335–338. (doi:10.1038/35002125)
Endy, D. 2005 Foundations for engineering biology. Nature 438, 449–453. (doi:10.1038/nature04342)
Fisher, J. & Henzinger, T. 2007 Executable cell biology. Nat. Biotechnol. 25, 1239–1249. (doi:10.1038/nbt1356)
Garfinkel, D. 1968 A machine-independent language for the simulation of complex chemical and biochemical systems. Comput. Biomed. Res. 2, 31–44. (doi:10.1016/0010-4809(68)90006-2)
Hermsen, R., Tans, S. & Wolde, P. R. 2006 Transcriptional regulation by competing transcription factor modules. PLoS Comput. Biol. 2, 1552–1560. (doi:10.1371/journal.pcbi.0020164)
Hucka, M. et al. 2003 The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics 19, 524–531. (doi:10.1093/bioinformatics/btg015)
Mallavarapu, A., Thomson, M., Ullian, B. & Gunawardena, J. 2009 Programming with models: modularity and abstraction provide powerful capabilities for systems biology. J. R. Soc. Interface 6, 257–270. (doi:10.1098/rsif.2008.0205)
Marchisio, M. A. & Stelling, J. 2008 Computational design of synthetic gene circuits with composable parts. Bioinformatics 24, 1903–1910. (doi:10.1093/bioinformatics/btn330)
Marguet, P., Balagadde, F., Tan, C. & You, L. 2007 Biology by design: reduction and synthesis of cellular components and behaviour. J. R. Soc. Interface 4, 607–623. (doi:10.1098/rsif.2006.0206)
Pedersen, M. & Phillips, A. 2009 GEC tool. See http://research.microsoft.com/gec.
Pedersen, M. & Plotkin, G. 2008 A language for biochemical systems. In Proc. CMSB (eds M. Heiner & A. M. Uhrmacher). Lecture Notes in Computer Science. Berlin, Germany: Springer.
Regev, A., Silverman, W. & Shapiro, E. 2001 Representation and simulation of biochemical processes using the pi-calculus process algebra. In Pacific Symp. on Biocomputing, pp. 459–470.
Sauro, H. 2000 Jarnac: a system for interactive metabolic analysis. In Animating the cellular map: Proc. 9th Int. Meeting on BioThermoKinetics (eds J.-H. Hofmeyr, J. Rohwer & J. Snoep), pp. 221–228. Stellenbosch, Republic of South Africa: Stellenbosch University Press.
Shrager, J., Waldinger, R., Stickel, M. & Massar, J. 2007 Deductive biocomputing. PLoS ONE 2, e339. (doi:10.1371/journal.pone.0000339)