
Review articles

A biological cosmos of parallel universes: does protein structural plasticity facilitate evolution?

Sebastian Meier¹* and Suat Özbek²

¹ Institute of Molecular Biology and Physiology, August Krogh Building, University of Copenhagen, Universitetsparken 13, DK-2100 Copenhagen, Denmark.
² Institute of Zoology, Department of Molecular Evolution and Genomics, Im Neuenheimer Feld 230, 69120 Heidelberg, Germany.
*Correspondence to: Sebastian Meier, Carlsberg Laboratory, Gamle Carlsberg Vej 10, DK-2500 Valby, Denmark. E-mail: smeier@crc.dk
DOI 10.1002/bies.20661
Published online in Wiley InterScience (www.interscience.wiley.com).

Summary
While Darwin pictured organismal evolution as “descent with modification” more than 150 years ago, a detailed reconstruction of the basic evolutionary transitions at the molecular level is only emerging now. In particular, the evolution of today’s protein structures and their concurrent functions has remained largely mysterious, as the destruction of these structures by mutation seems far easier than their construction. While the accumulation of genomic and structural data has indicated that proteins are related via common ancestors, naturally occurring protein structures are often considered to be evolutionarily robust, thus leaving open the question of how protein structures can be remodelled while selective pressure forces them to function. New information on the proteome, however, increasingly explains the nature of local and global conformational diversity in protein evolution, which allows the acquisition of novel functions via molecular transition forms containing ancestral and novel structures in dynamic equilibrium. Such structural plasticity may permit the evolution of new protein folds and help account for both the origins of new biological functions and the nature of molecular defects. BioEssays 29:1095–1104, 2007. © 2007 Wiley Periodicals, Inc.

Introduction
Proteins have an outstanding functional capacity due to their ability to fold into distinct structures.(1) The precise sequence features that result in functional protein structures are therefore of seminal interest for the development of a protein folding theory. The earliest structural studies on proteins had previously indicated that protein structures are far more conserved than their sequences.(2) In fact, similar structures can be assumed by proteins without significant sequence homology.(3) While the evolution of novel functions in a conserved structural scaffold is well established,(4,5) the way in which new structural frameworks evolve via simple mutations has remained much more elusive. The retrospective character of evolutionary studies inevitably requires speculation, especially as historic transition forms at the molecular level are absent due to the absence of molecular fossils. In fact, the disclosure of missing links is among the central achievements of the Darwinian theory of evolution but reconstructing them at the organismal or molecular levels remains a major challenge.(6)

The vast number of possible sequences (a 100 amino acid protein can occur in 20^100 sequences, each of which has 100 × 19 neighbours differing by one amino acid in “sequence space”) makes a comprehensive mapping of genotype–phenotype relationships at the protein level impossible and suggests that lessons can best be learned from evolutionary sequence design as performed by nature. Until recently, protein sequence–structure mapping could be considered “largely Terra incognita”,(7) in contrast to insightful RNA-folding models.(8) This latter work has supported the classic notion that often unrelated sequences fold to the same structure and are connected in “a continuous network which can be traversed by single mutational steps without passing through non-functional intermediates”.(9) Models of structural innovation based upon mutational walks on such phenotypically neutral networks have been proposed. These models, however, crave experimental validation, especially in proteins.

In principle, structural plasticity could allow smooth evolutionary transformations of structures and concomitant functional innovation via intermediate states containing a structural ensemble.(10,11) Here, we review an increasing body of largely incidental observations from a variety of biological disciplines, which strongly indicate that a wealth of protein folds is transmutable via plastic intermediates.

Relationship between natural proteins
Structures evolve slowly compared to the underlying sequences and only a few thousand kinds of protein folds account for today’s domain diversity. Clearly, the earliest protein structures must have evolved de novo;(12) this process has been retraced experimentally by in vitro evolution of a known natural fold.(13) The total set of evolutionary relationships between today’s protein domains by divergence from a few such primordial protein structures however, has remained


much more controversial. Sequence similarity essentially proves evolutionary relationships, whereas neither functional nor structural similarity alone is taken as a valid indication of evolutionary relationships:(14,15) the limited number of protein folds makes fold convergence possible while functional similarity can be the result of comparable selection pressures or the limited chemical repertoire needed by cells and provided by protein functional groups.(14) As a result, major innovations of novel functional scaffolds by sequence divergence are particularly hard to detect. The accumulation of structural data over recent years suggests that the variety of today’s protein domain folds results from evolutionary dynamics rather than from de novo evolution and thus properties of fold space alone. The combination of structural identity and functional similarity indicates evolutionary relationships even when there is low sequence similarity(16,17) and allows the construction of putative evolutionary links. Hard evidence for the evolutionary relationships of folds has come from the network topology depicted in graphs of protein structural similarity.(18–21) Networks of structural similarity between protein domains show that the number of connections across the network follows a power (Zipf) law, indicative of a sequential evolutionary growth by the addition of domains.(20,22) Phenomenological models of duplication and divergence reproduce the structural relationships of proteins in these networks by the divergent expansion of the protein universe in what Shakhnovich has termed the “biological Big Bang”.(20) Notably, Shakhnovich’s model produces a relatively large number of structural innovations as compared to independent de novo evolution of folds.(20) The detailed molecular mechanisms of how evolutionary transitions in protein structural space are achieved, however, remain unclear, as the destruction of protein folds and functions is substantially easier than their construction.(23)

Functional and conformational plasticity of proteins
Proteins can evolve new functions, for instance antibiotic resistance, in a matter of months by a small number of mutations.(24) As a result, closely related proteins can have different functions, while distantly related proteins can perform the same task. Such biological functionality has been linked to local structural fluctuations and chemical adaptations in a stable structural background,(25) while global structural changes in proteins are rarely observed. The gradual evolution of new functions has been associated with gradual changes of populations in diverse protein ensembles.(26) Multispecificity both in enzymes and antibodies,(27) for example, is permitted by a pre-equilibrium between multiple conformers of nearly identical energy. The conformational selection from a heterogeneous ensemble has recently been demonstrated in antibody maturation by a combined spectroscopic/molecular dynamics approach, thus showing how cooperatively acting mutations rigidify the variable regions in an antibody.(26) Conformational diversity in the immune system, a “microcosm of protein evolution”,(27) shows how unlimited functional diversity can be provided by a limited number of sequences.(27) Enzymes presumably evolve novel functions by mutational ‘tinkering’ with a specialized protein sequence, thus frequently enabling a specialized enzyme to adopt a novel function via a ‘generalist’ intermediate sequence conducting diverse functions at surprisingly high efficiency.(24,28)

Plasticity is the phenotypic diversity that generates various selectable ground states in one genotype and thus may facilitate major evolutionary changes. It has been implicated in the course of evolutionary alterations in developmental processes, in particular.(29,30) Proteins may also show plastic behavior by sampling new functions, while retaining their original functions with nearly uncompromised efficiency.(10,24) A fairly large sampling of conformational space by plastic biomolecular structures would facilitate the molecular evolution of novelty and the attainment of global fitness maxima (Fig. 1). Such plasticity will increase phenotypic variation and thus contribute to the capacity of biomolecules to undergo change in form and function. In this way an evolving organism could be “extensively remodeled while it is running”.(29) Divergent specialization of plastic biomolecular structures could occur by minor genetic changes, for example by single mutations following gene duplication (Fig. 1).(31)

In these scenarios, plasticity has a buffering effect that allows the biomolecular structure to develop novel specialized phenotypes without abandoning the original tasks. On the other hand, structural diversity can be costly, as each conformer occupies only a fraction of the total population. If the fitness function is a composite over diverse conformations, fractions of beneficial structures will increase by selection. Thus, selection would act to reduce plasticity after an innovation, potentially explaining the rarity of natural transition forms.(32) The existence of plastic transition regions between biomolecular structures has been demonstrated for RNA(8) but has remained controversial for protein sequence space. There is no doubt, however, that relatively simple changes in the genotype suffice to evolve new protein topologies. For example, recent in vitro evolution by the Tawfik group has demonstrated the gradual transformation between circular permutations of DNA methyltransferase via gene rearrangements.(33) Most importantly, in this instance, evolution proceeds to naturally occurring topologies via functional intermediates.

Divergence of structures
The mapping of sequences to structures is highly redundant both for RNA and proteins in the sense that many sequences can fold to the same structure. Models of sequence–structure topology have been especially well explored for RNA with powerful secondary structure prediction algorithms.(8,34,35) Random mutational walks by phenotypically neutral point


mutations span large regions of sequence space without altering the RNA secondary structure, thus leading to extensive neutral networks in sequence space. RNA studies show that finding particular novel structures by mutation and selection is straightforward due to the proximity in sequence space of neutral networks belonging to different RNA secondary structures. In vitro selection experiments have in principle validated the notion of molecular transitions by evolutionary pathways for RNA.(7) Accordingly, two different naturally occurring ribozymes have been linked by a series of point mutations, where each of the intermediate sequences retains activity. An evolutionary bridge sequence between both ribozymes exhibits both folds and functionalities in a single RNA sequence.(36) The degeneracy of structures in such bridge states offers attractive features to allow the continuous evolution of novel folds and functions via plastic intermediates, which assume globally different structures in a single sequence.

Figure 1. Protein structure plasticity increases phenotypic variation and allows the exploration of novel functions, which can be selected from to become the novel wild type. The different colours symbolize different functional phenotypes. Efficient exploration of phenotypic space via functional intermediates (top right) aids the discovery of global adaptive maxima. Redundancy by gene duplication adds further robustness to the system by freeing restraints on one of the duplicates. Darwinian evolution suggests that novel, better solutions are only found when intermediates are not too disadvantageous. A static picture of proteins would suggest a vast prevalence of loss of function upon non-neutral mutations in an evolved sequence. Note that the dimensionality of adaptive landscapes is much larger than three, as genotypes and phenotypes differ in a wealth of factors.

If this notion holds for real proteins, extensive walks by single mutational steps on neutral networks can explore sequence space by amino acid exchanges that leave the foldability intact (Fig. 2). While the overall fraction of neutral or non-neutral mutations in proteins remains a matter of debate, potential sequence changes in proteins draw from nineteen alternatives rather than three in RNA, implying a wealth of phenotypically neutral mutations along the chain. The ability of many different sequences to fold into the same structure on a neutral network implies mutational robustness, but also leads to evolvability: the large covering of sequence space by folds implies small sequence distances relative to the number of amino acids both between folds and between a random amino acid sequence and a particular target fold. As a result, evolution of the earliest structures from unfolded sequences and subsequent divergence by fold change is possible by a limited number of mutations, where the sequence–structure map extends across sequence space.

As we still lack an analytical folding theory for the prediction of sequence foldability, the analysis of evolutionary sequence selection to derive folded proteins from random ensembles is particularly important. The structural starting ensemble of a random sequence is very sensitive to minor external perturbations and sequence changes.(37) Evolution acts to select a native structure conferring the highest fitness, and mutational changes to the sequence optimize consistent folding to that structure.(38,39) In this way, thermodynamically and kinetically foldable sequences evolve, where native-like contacts are on average more stable than non-native interactions.(40) The entropic loss upon folding is thus driven by specific enthalpic contacts and the folding reaction is sculpted by evolution to become cooperatively stabilized. Both thermodynamically and


kinetically populated states can evolve in this way into unique ground state structures.(41,42) For a detailed recent review on folding theory and evolutionary sequence selection, see Shakhnovich.(43)

Ascertaining the closest approach in sequence for different folds is central to the question whether mutational pathways can in principle prompt divergence between biomolecular structures “without passing through non-functional intermediates”. Various simulations on protein lattice models raised the possibility that neutral networks of different protein folds approach to distances of a few amino acid exchanges or overlap in sequence space.(7,44) Relative to RNAs the overlap seems largely reduced due to the larger chemical diversity of the 20 amino acids as compared to the four RNA nucleotides.(36) The paucity of molecular intermediate forms and the mutational robustness of functional folds over evolutionary time(42) indicate that protein structural overlap in sequence space is rare. The very existence of such overlap regions with tertiary structural plasticity in single sequences is at odds with the common picture of proteins, but incidental discoveries have accumulated that demonstrate such plasticity rather than unique ground state structures (Table 1). Global structural changes in proteins have been primarily linked to protein misfolding diseases. In contrast, a functional interconversion of naturally occurring cysteine-rich tertiary structures was achieved via a bridge state sequence folding into both domain structures by reconstructing evolutionary pathways from sequence data.(45) Cysteine-rich domains (CRDs) of Hydra nematocyst wall proteins have a conserved motif CXXXCXXXCXXXCXXXCC and fold to two different structures with three different disulfide bridges and a globally different topology. Single mutations to a natural CRD sequence suffice to induce a bridge state with some 20% of the original fold and a major fraction converted to the other naturally occurring structure (Fig. 2). An additional mutation can virtually complete the conversion between the two folds. The different structures occur in the N- and C-terminal domains of minicollagens as cross-linking elements of the ultrastable nematocyst wall. The mutations introduced in vitro reflect a natural design principle to introduce structural polarity into minicollagens.(45) Naturally occurring protein folds thus can overlap in sequence space and continuous mutational paths both for RNAs and for proteins can allow adaptive walks and structural innovation via functional intermediates due to the non-unique ground states in evolutionary bridges (Fig. 3, Table 1). Saddle regions between unique folds are formed by sequences that cannot saturate interactions in either of the two folds. The crossing of such bridge states by walks in the hypothetical free energy landscapes depends on the functionality of these sequences, that is their foldability relative to physiological requirements.(46) Furthermore, plasticity in these saddle regions reduces the effective concentration of each conformation and is thus costly. As a result, fitness effects and the rarity of transition sequences lead to a disparity between time scales for sequence and structure evolution.(43)

Figure 2. Top: particular protein folds can be formed by largely different sequences, which are not clustered in sequence space. Sequences forming one fold are organized in a net-like arrangement in so-called ‘neutral networks’, where single neutral mutations connect the sequences forming the same fold. Top: Rare evolutionary bridges between different folds allow continuous evolutionary transitions. Bottom: In such bridges, the notion that protein sequences fold to unique states is not valid, as shown for cysteine-rich domains from Hydra.(45) Mutated residues are shown in magenta.

Experimental limitations
The exploration of shortest mutational paths for structural innovation in sequence space has largely remained sporadic. While chance observations of protein tertiary structure plasticity in transition sequences between domain structures accumulate, it is still unclear whether energetically nearly degenerate bridge states between folds are a common feature of protein structures and how many natural sequences are within one or a few point mutations of defined non-native structures. To estimate whether a limited number of amino acid changes will suffice to change a protein structure, the number


Table 1. Examples of protein structure plasticity

Protein | Degenerate forms and structural changes | Trigger
Arc repressor(11,78) | helix-sheet switch in a bridge state | degenerate structures in parallel folding reactions
Rop(51) | repacking in different helical bundle dimers due to topological frustration | degenerate structures in parallel folding reactions
IGF-1(53) | degenerate structures with different disulfide patterns | degenerate structures in parallel folding reactions
huPrP(64,65) | equilibrium between α/β structure in the terminal domain of the soluble form | degenerate structures in parallel folding reactions
cysteine-rich domains(45) | bridge state between different natural folds upon single mutation | degenerate structures in parallel folding reactions
Mad2(52) | two distinct structures N1-Mad2 and N2-Mad2 with a potential role in signalling | degenerate structures in parallel folding reactions
DNA-methyltransferase(33) | new protein topologies (circular permutation) through directed evolution | multistep gene rearrangements
designed switches(56,72,73,86) | e.g. α ↔ β rich;(87) parallel coiled coil ↔ antiparallel coiled coil;(88) coiled coil ↔ amyloid;(89) homeodomain ↔ zinc finger;(86) coiled coil ↔ zinc finger(72,90) | pH, ion binding, redox state, temperature
toxins(57,58) | refolding between prepore- and pore-conformations | pH
hemagglutinin(59) | secondary and tertiary structure refolding | pH
protein GB1(91) | monomer ↔ intertwined tetramer | 5 conservative mutations
Janus(50) | protein GB1 fold ↔ Rop fold | 20% sequence change
Amyloidogenic proteins(92,93) | secondary and tertiary structure switch, manifested by aggregation | spontaneous, crowding, impaired chaperoning, mutation, seeding
KH domains(17) | different topologies | terminal extensions
chameleon sequences(94,95) | short sequences with different conformations in different tertiary backgrounds, includes helix-sheet transitions | tertiary background
Lysozyme(96,97) | large fluctuations with less native-like contacts upon destruction of long-range interactions | W62G
CD2(98) | monomer/dimer, strand exchange | kinetic trapping
Functional prions(60,66,67,68,69) | soluble/fibril: self-propagating secondary and tertiary structure switch implicated in memory formation, structural functions and evolvability | protein-conformational chain reaction
>30% of the proteome(99) | intrinsically unfolded or molten globule proteins fold upon binding: switch unfolded ↔ folded | free enthalpy of binding

Figure 3. Folding free energy in dependence of protein sequence. The relative stability of protein structure varies massively with sequence. Functional sequences have to fold to reasonably stable structures in biologically relevant time. Prototypical sequences of highest stability are no prerequisite for protein function and less stable protein sequences occur more frequently. The marginal stability of native protein folds ensures that novel configurations are evolvable upon few mutations via intermediate bridge sequences.

of amino acids determining a fold would need to be known. The relative importance of each residue in the folding of a protein sequence has however not been established yet. Some examples like Arc(11,78) and CRDs(45) show nonetheless that even fairly localized changes can suffice to change the native state. The effect of exchanging one or two amino acids will be even more pronounced for residues involved in cooperative interactions, as in the formation of a hydrophobic core.(37) Computational and kinetic analysis has indicated that very few residues determine native protein topology in the transition state ensemble.(47,48) The identification of these residues by sequence comparison is, however, complicated by the fact that sequence selection will not only preserve the folding core but include additional factors like functional selection.(43) On the other hand, it has long been considered a challenge to design sequences with more than 50% identity folding into different structures (the so-called Paracelsus challenge).(49) Such conformational transitions between unrelated protein folds depending upon only a few mutations had been predicted in simulations based on potentials derived from structural databases.(44) The structures of the mainly β-sheet protein


GB1 and the four α-helix bundle protein Rop (repressor of primer) were used to solve the challenge experimentally. In fact, the structures can be interconverted between their fully folded, stable states by changing less than 20% of the sequence.(50)

Only degenerate structures with substantial populations of the different states and sufficient kinetic stability will allow a characterization or even isolation of the different structures. In other words, experimental description will be greatly aided if a limited number of energetically nearly degenerate structures exists (Fig. 4) with sufficient free energy barriers between the folds. Such a situation is known for Rop itself on the quaternary structure level, as Rop assumes two different dimers of four-helix bundle in a double-funnelled folding landscape similar to that of the CRD bridge state shown in Fig. 4.(51) Likewise, the spindle checkpoint protein Mad2 adopts two distinct folds at equilibrium, which may be critical for its in vivo functionality in signalling.(52) Disulfide-linked proteins abound in the examples of nearly degenerate tertiary structures (Table 1). Alternative structures of similar energy have thus been described for Hydra cysteine-rich domains,(45) the insulin-like growth factor-1,(53) and superoxide dismutase.(54) Conformational dynamics between different disulfide linkages depend on the presence of a redox catalyst, if disulfide bond reshuffling occurs via reduced intermediate states. As a result, different shapes can be trapped by the absence of a redox catalyst or by low pH during purification. While X-ray crystallography inherently is a purification procedure, thus precluding the detection of structural diversity, biomolecular NMR spectroscopy requires a significant population of the diverse states due to the relatively low sensitivity of the technique, in order to achieve an atomic scale description of structures. An ensemble of structures in dynamic equilibrium may well give NMR spectra reminiscent of unfolded proteins(23) and the number of conformers in equilibrium may be smaller than anticipated.(55) Most importantly, the detailed refolding equilibrium in soluble prion protein forms has eluded atomic characterization to date. The observation of large-scale refolding in silico is still impeded by the restricted time accessible to all atom molecular dynamics simulations of proteins. In consequence, exploring the true diversity of protein conformations on all structural levels will require further methodological development.

Protein conformational switching
The structural variability of proteins is applied as a molecular switch in different cellular contexts. Minor changes in the solution state, like changing redox conditions, pH or ion concentrations suffice to introduce structural changes in natural protein conformational switches.(56) Such natural switching mostly occurs in the form of rearrangements within a stable fold. Prominent exceptions are, however, the refolding of various membrane-puncturing toxins(57,58) and of hemagglutinin upon lowering of the pH,(59) as well as the structural changes associated with refolding and fibril formation in prions.

The detection of misfolding in amyloid diseases is facilitated by the self-perpetuation and trapping through protein-conformational chain reactions(60) in the formation of morphologically detectable plaques. Single mutations(61,62) suffice to induce pathogenesis in humans at a post-reproductive age, where selection acts less strongly.(60,63) Structural plasticity between structure rich in α-helix or β-sheet in the N-terminal domain (residues 127–164) of the cellular prion protein indicates an underlying role of structural plasticity in the refolding to the pathogenic scrapie form.(64,65) The detailed structural equilibria have remained controversial as mentioned above. While the occurrence of prions in mammals is linked to disease and death, examples for the biological functionality of evolutionarily conserved prions have accumulated in lower organisms. The seeding behaviour in prions notably allows a heritable protein-based information flow.(60) Among others, the switching to fibrils has been implicated in memory formation,(66) structural functions and providing surfaces(67,68) and in an increased phenotypic variation by reducing transcriptional fidelity in adverse environments.(69) Thus, proteins seem to employ global structural changes both in soluble and in aggregating forms to vastly increase their functional repertoire. The view that proteins are a direct and simple incarnation of the information encoded in the DNA in ‘one gene–one protein–one function’ relationships can therefore be regarded as outdated. Rather, the variation of protein structures produces novel, and often beneficial, phenotypes, thus placing proteins at centre stage in responsive developmental processes.(70)

The number of designed tertiary structure switches has grown vastly over the last few years due to the relevance of pathological misfolding and due to improved design methods (for recent reviews see Ambroggio et al.(56) and Wright et al.(71)). In some cases, the generation of hybrids of the target fold sequences suffices to create switches. A particularly promising design approach relies on an algorithm that optimizes the amino acid sequence by using Monte Carlo search in sequence space.(72) Protein switches between tertiary structures are sequences in bridge regions of the sequence–structure map, which offer the advantage that their refolding is triggered by an environmental stimulus. As a result, these triggered structural switches are easier to characterize than mutational switches in evolutionary bridge states. Conformational switches have been designed between unfolded and folded states, between different secondary structures and between different tertiary structures. The switching of folds can be triggered by various changes to solution states and results in refolding, e.g. between antiparallel and parallel coiled coils, coiled coils and zinc fingers as well as zinc fingers and homeodomain folds (Table 1).(73) This demonstrates that a variety of natural protein folds overlap in sequence space,


Figure 4. Parallel reactions in the folding of bridge state proteins and of their point mutants. Even the modulation of single local
interactions suffices to evolve predominant populations. A detailed structural characterization of bridge state structures benefits from
sufficiently high populations (DDG < KBT) and sufficiently high energetic barriers between competing folds, thus explaining the dearth of
evolutionary transition forms in protein structures.

notably including upstream transcription factor domains. bridge state populations in cysteine-rich domains and Arc
Hence both mutation and environmental stimuli can induce repressor, where the competing folding reactions to novel
phenotypic variation on the protein structure level. Refinement structures predominate after the abolishing of a hydrogen
of existing computational methods for switch design could bond donor or the shifts of methyl groups. Removing a
extend sequence–structure mapping and identify transition hydrogen bond in CRD by a mutation K21P reduces the
regions between folds.(72) population of fold I from >95% to 20%.(45) In the case of
secondary structure switching by the Arc repressor, an N11I
Protein evolution and the buffering of variation mutation results in >95% of the molecules adopting the native
The theory of protein folding has fostered a statistical view of antiparallel b-sheet form, whereas changing only the position
protein conformational transitions in the folding reaction. of the methyl group in a further mutation I11L results in a
Competing structures are not uncommon in the folding majority of the molecules in artificial helical form.(78) In addition
pathways of natural proteins. Folding intermediates formed to mutational studies, de novo designed proteins offer an
early in the protein-folding pathway can have well-character- opportunity to distinguish inherent from evolved protein-like
ized non-native structures.(74–76) Protein sequences often properties in polypeptides. A recent study by the Baker lab
populate non-native structures or kinetically trapped misfolded shows that the designed folded polypeptide Top7 is not
states, which can evolve to form new ground state structures protein-like in lacking a cooperative folding transition to a
upon few mutations.(37,41,77) Proteins are evolved to saturate unique native structure. Rather, a multiphasic folding kinetics is
favourable interactions while reducing the prevalence of observed in this case and a non-native folded structure
conflicting interactions. This concept has been termed ‘‘the coexists in thermodynamic equilibrium with the designed
consistency principle’’ or ‘‘the principle of minimal frustration’’. target structure.(79) Thus evolutionary sequence selection
Conflicting interactions are not eliminated in this way, but are most likely accounts for the cooperative foldability of natural
normally weak enough to allow escape and folding to the native proteins to unique structures, as previously proposed by a
state. These predictions are validated in the quantification of physically realistic model.(80)
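As a rough illustration of the populations quoted above, a two-state Boltzmann estimate converts them into free-energy differences. The sketch below is not taken from the cited studies: it assumes a simple equilibrium between fold I and a single competing fold, treats the reported ">95%" as exactly 95%, and assumes a temperature of 298 K. The function names are hypothetical.

```python
import math

# Molar gas constant in kJ/(mol*K); the temperature is an assumption.
R_KJ_PER_MOL_K = 8.314462618e-3
TEMPERATURE_K = 298.0


def delta_g_in_rt(p_fold_one: float) -> float:
    """Free energy of the competing fold relative to fold I, in units of RT.

    Assumes a two-state equilibrium, so that
    p_fold_one / (1 - p_fold_one) = exp(delta_G / RT).
    Positive values mean fold I is the more stable state.
    """
    if not 0.0 < p_fold_one < 1.0:
        raise ValueError("population must lie strictly between 0 and 1")
    return math.log(p_fold_one / (1.0 - p_fold_one))


def population_from_delta_g(delta_g_rt: float) -> float:
    """Inverse relation: population of fold I for a given delta_G (in RT)."""
    return 1.0 / (1.0 + math.exp(-delta_g_rt))


# Populations of fold I before and after K21P, as quoted in the text
# (0.95 stands in for the reported ">95%").
dg_before = delta_g_in_rt(0.95)
dg_after = delta_g_in_rt(0.20)
ddg_rt = dg_after - dg_before
ddg_kj = ddg_rt * R_KJ_PER_MOL_K * TEMPERATURE_K

print(f"delta_G before mutation: {dg_before:+.2f} RT")
print(f"delta_G after mutation:  {dg_after:+.2f} RT")
print(f"shift caused by mutation: {ddg_rt:+.2f} RT ({ddg_kj:+.1f} kJ/mol)")
print(f"fold I population at a gap of 1 RT: {population_from_delta_g(1.0):.2f}")
```

Under these assumptions, the shift from 95% to 20% fold I corresponds to a change of about 4.3 RT (roughly 11 kJ/mol at 298 K), which is of the order of a single hydrogen bond. A free-energy gap of one RT still leaves about 27% of the molecules in the minor fold, which is why small free-energy differences make both states observable.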


Robustness of native protein folds is conferred not only by sequence selection but is aided by chaperones and by the threshold stability of protein folds, which initially allow the accumulation of detrimental mutations. The stabilization of proteins against misfolding and unfolding introduces capacitance with respect to phenotypic variation by buffering the emergence of non-native structures.(81,82) This effectively extends the neutral networks of protein folds and further decreases the distance between different protein shapes in sequence space. While all genes and gene networks that preserve the status quo can act in this way,(83) chaperones are especially attractive candidates for evolutionary capacitance due to their sensitivity to environmental stress, indicative of changing fitness landscapes. The high activity of chaperones in cancer cells likewise seems to allow a large genetic variation in a stressful cellular environment, thus driving oncogenesis.(84,85)

Conclusion
Natural entities inevitably raise questions about the range of their possible states and their origins, transmutations and potentials for future development. During evolution, change towards fitter states requires that the intermediate forms are not substantially less functional. The folding of proteins to unique structures and their structural stability over evolutionary time has been a long standing paradigm attributed to the chemical diversity of amino acids, intrinsic biophysical properties of protein structures and the sequence-fold map: chains encoding different structures allegedly have to differ by a large fraction of their sequence. This picture has been changing despite experimental shortcomings in the characterization of diverse conformational ensembles and proteins appear increasingly as both robust and evolvable molecules. Plasticity at the secondary, tertiary and quaternary structure level illustrates an unanticipated adaptability of proteins (Table 1). Currently, it remains a matter of personal judgement whether the majority of natural protein folds are considered isolated islands in sequence space or nets in close sequence proximity. Due to the marginal stability of protein folds additional structures in the conformational ensemble can appear as a consequence of only few mutations. Where mapped in close enough detail, sequence space thus appears to allow structural innovation of domain folds by a fairly limited number of amino acid exchanges along pathways, where the novel structure is present from the beginning of the transition. This allows transformations of protein structures at low fitness costs resulting from the reduced time of the protein spent in its most active conformation.(32) Accordingly, the earliest folds may have evolved gradually from functional but intrinsically unfolded proteins.(27) Further instances of fold overlap in sequence space, however, hold great promise for additional surprises at the crossroads of the biological and biophysical disciplines. The emerging picture is that the proteome harbours a wealth of shapes and functions, which provides the basis for a mutation and selection process with a surprisingly integrative dynamics. Thus, the most-basic genetic events can provide distinctive selective advantages in the Darwinian evolution of complex features with surprisingly high frequency.

Acknowledgments
We wish to thank A. Wilkins, P.R. Jensen and nameless reviewers for critical reading of the manuscript and useful comments.

References
1. Deeds EJ, Shakhnovich EI. 2007. A structure-centric view of protein evolution, design, and adaptation. Adv Enzymol Relat Areas Mol Biol 75:133–191 xi-xii.
2. Chothia C, Lesk AM. 1986. The relation between the divergence of sequence and structure in proteins. EMBO J 5:823–826.
3. Orengo CA, Michie AD, Jones S, Jones DT, Swindells MB, et al. 1997. CATH–a hierarchic classification of protein domain structures. Structure 5:1093–1108.
4. Anantharaman V, Aravind L, Koonin EV. 2003. Emergence of diverse biochemical activities in evolutionarily conserved structural scaffolds of proteins. Curr Opin Chem Biol 7:12–20.
5. Zuckerkandl E, Pauling L. 1965. Molecules as documents of evolutionary history. J Theor Biol 8:357–366.
6. Vollmer G. 1995. Philosophy of Biology (Biophilosophie). Stuttgart: Reclam.
7. Babajide A, Farber R, Hofacker IL, Inman J, Lapedes AS, et al. 2001. Exploring protein sequence space using knowledge-based potentials. J Theor Biol 212:35–46.
8. Fontana W, Schuster P. 1998. Continuity in evolution: on the nature of transitions. Science 280:1451–1455.
9. Smith JM. 1970. Natural selection and the concept of a protein space. Nature 225:563–564.
10. James LC, Tawfik DS. 2003. Conformational diversity and protein evolution–a 60-year-old hypothesis revisited. Trends Biochem Sci 28:361–368.
11. Cordes MH, Burton RE, Walsh NP, McKnight CJ, Sauer RT. 2000. An evolutionary bridge to a new protein fold. Nat Struct Biol 7:1129–1132.
12. Shakhnovich EI, Gutin AM. 1993. Engineering of stable and fast-folding sequences of model proteins. Proc Natl Acad Sci USA 90:7195–7199.
13. Keefe AD, Szostak JW. 2001. Functional proteins from a random-sequence library. Nature 410:715–718.
14. Murzin AG. 1998. How far divergent evolution goes in proteins. Curr Opin Struct Biol 8:380–387.
15. Theobald DL, Wuttke DS. 2005. Divergent evolution within protein superfolds inferred from profile-based phylogenetics. J Mol Biol 354:722–737.
16. Grishin NV. 2001. Fold change in evolution of protein structures. J Struct Biol 134:167–185.
17. Grishin NV. 2001. KH domain: one motif, two folds. Nucleic Acids Res 29:638–643.
18. Aravind L, Mazumder R, Vasudevan S, Koonin EV. 2002. Trends in protein evolution inferred from sequence and structure analysis. Curr Opin Struct Biol 12:392–399.
19. Koonin EV, Wolf YI, Karev GP. 2002. The structure of the protein universe and genome evolution. Nature 420:218–223.
20. Dokholyan NV, Shakhnovich B, Shakhnovich EI. 2002. Expanding protein universe and its origin from the biological Big Bang. Proc Natl Acad Sci USA 99:14132–14136.
21. Tiana G, Shakhnovich BE, Dokholyan NV, Shakhnovich EI. 2004. Imprint of evolution on protein structures. Proc Natl Acad Sci USA 101:2846–2851.
22. Wolf YI, Karev G, Koonin EV. 2002. Scale-free networks in biology: new insights into the fundamentals of evolution? Bioessays 24:105–109.


23. Blanco FJ, Angrand I, Serrano L. 1999. Exploring the conformational properties of the sequence space between two proteins with different folds: an experimental study. J Mol Biol 285:741–753.
24. Aharoni A, Gaidukov L, Khersonsky O, Mc QGS, Roodveldt C, et al. 2005. The ‘evolvability’ of promiscuous protein functions. Nat Genet 37:73–76.
25. Rasmussen BF, Stock AM, Ringe D, Petsko GA. 1992. Crystalline ribonuclease A loses function below the dynamical transition at 220 K. Nature 357:423–424.
26. Zimmermann J, Oakman EL, Thorpe IF, Shi X, Abbyad P, et al. 2006. Antibody evolution constrains conformational heterogeneity by tailoring protein dynamics. Proc Natl Acad Sci USA 103:13722–13727.
27. James LC, Roversi P, Tawfik DS. 2003. Antibody multispecificity mediated by conformational diversity. Science 299:1362–1367.
28. Khersonsky O, Roodveldt C, Tawfik DS. 2006. Enzyme promiscuity: evolutionary and mechanistic aspects. Curr Opin Chem Biol 10:498–508.
29. West-Eberhard M. 1989. Phenotypic plasticity and the origins of diversity. Annu Rev Ecol Syst 20:249–278.
30. Nijhout HF. 1990. Metaphors and the role of genes in development. Bioessays 12:441–446.
31. Joyce GF. 1997. Evolutionary chemistry: getting there from here. Science 276:1658–1659.
32. Ancel LW, Fontana W. 2000. Plasticity, evolvability, and modularity in RNA. J Exp Zool 288:242–283.
33. Peisajovich SG, Rockah L, Tawfik DS. 2006. Evolution of new protein topologies through multistep gene rearrangements. Nat Genet 38:168–174.
34. Huynen MA. 1996. Exploring phenotype space through neutral evolution. J Mol Evol 43:165–169.
35. Fontana W. 2002. Modelling ‘evo-devo’ with RNA. Bioessays 24:1164–1177.
36. Schultes EA, Bartel DP. 2000. One sequence, two ribozymes: Implications for the emergence of new ribozyme folds. Science 289:448–452.
37. Plotkin SS, Onuchic JN. 2002. Understanding protein folding with energy landscape theory. Part I: Basic concepts. Q Rev Biophys 35:111–167.
38. Shakhnovich EI, Gutin AM. 1990. Implications of thermodynamics of protein folding for evolution of primary sequences. Nature 346:773–775.
39. Wolynes PG. 2005. Recent successes of the energy landscape theory of protein folding and function. Q Rev Biophys 38:405–410.
40. Dinner AR, Sali A, Smith LJ, Dobson CM, Karplus M. 2000. Understanding protein folding via free-energy surfaces from theory and experiment. Trends Biochem Sci 25:331–339.
41. Govindarajan S, Goldstein RA. 1998. On the thermodynamic hypothesis of protein folding. Proc Natl Acad Sci USA 95:5545–5549.
42. Xia Y, Levitt M. 2004. Simulating protein evolution in sequence and structure space. Curr Opin Struct Biol 14:202–207.
43. Shakhnovich E. 2006. Protein folding thermodynamics and dynamics: where physics, chemistry, and biology meet. Chem Rev 106:1559–1588.
44. Babajide A, Hofacker IL, Sippl MJ, Stadler PF. 1997. Neutral networks in protein space: a computational study based on knowledge-based potentials of mean force. Fold Des 2:261–269.
45. Meier S, Jensen PR, David CN, Chapman J, Holstein TW, et al. 2007. Continuous molecular evolution of protein-domain structures by single amino acid changes. Curr Biol 17:173–178.
46. Nelson ED, Onuchic JN. 1998. Proposed mechanism for stability of proteins to evolutionary mutations. Proc Natl Acad Sci USA 95:10682–10686.
47. Abkevich VI, Gutin AM, Shakhnovich EI. 1994. Specific nucleus as the transition state for protein folding: evidence from the lattice model. Biochemistry 33:10026–10036.
48. Shakhnovich E, Abkevich V, Ptitsyn O. 1996. Conserved residues and the mechanism of protein folding. Nature 379:96–98.
49. Scott KA, Daggett V. 2007. Folding mechanisms of proteins with high sequence identity but different folds. Biochemistry 46:1545–1556.
50. Dalal S, Regan L. 2000. Understanding the sequence determinants of conformational switching using protein design. Protein Sci 9:1651–1659.
51. Levy Y, Cho SS, Shen T, Onuchic JN, Wolynes PG. 2005. Symmetry and frustration in protein energy landscapes: a near degeneracy resolves the Rop dimer-folding mystery. Proc Natl Acad Sci USA 102:2373–2378.
52. Luo XL, Tang ZY, Xia GH, Wassmann K, Matsumoto T, et al. 2004. The Mad2 spindle checkpoint protein has two distinct natively folded states. Nat Struct Mol Biol 11:338–345.
53. Huang QL, Zhao J, Tang YH, Shao SQ, Xu GJ, et al. 2007. The sequence determinant causing different folding behaviors of insulin and insulin-like growth factor-1. Biochemistry 46:218–224.
54. Petersen SV, Oury TD, Valnickova Z, Thogersen IB, Hojrup P, et al. 2003. The dual nature of human extracellular superoxide dismutase: one sequence and two structures. Proc Natl Acad Sci USA 100:13875–13880.
55. Choy WY, Forman-Kay JD. 2001. Calculation of ensembles of structures representing the unfolded state of an SH3 domain. J Mol Biol 308:1011–1032.
56. Ambroggio XI, Kuhlman B. 2006. Design of protein conformational switches. Curr Opin Struct Biol 16:525–530.
57. Tilley SJ, Orlova EV, Gilbert RJ, Andrew PW, Saibil HR. 2005. Structural basis of pore formation by the bacterial toxin pneumolysin. Cell 121:247–256.
58. Petosa C, Collier RJ, Klimpel KR, Leppla SH, Liddington RC. 1997. Crystal structure of the anthrax toxin protective antigen. Nature 385:833–838.
59. Bullough PA, Hughson FM, Skehel JJ, Wiley DC. 1994. Structure of influenza haemagglutinin at the pH of membrane fusion. Nature 371:37–43.
60. Shorter J, Lindquist S. 2005. Prions as adaptive conduits of memory and inheritance. Nat Rev Genet 6:435–450.
61. Prusiner SB. 1993. Genetic and infectious prion diseases. Arch Neurol 50:1129–1153.
62. Murrell J, Farlow M, Ghetti B, Benson MD. 1991. A mutation in the amyloid precursor protein associated with hereditary Alzheimer’s disease. Science 254:97–99.
63. Lansbury PT Jr. 1999. Evolution of amyloid: what normal protein folding may tell us about fibrillogenesis and disease. Proc Natl Acad Sci USA 96:3342–3344.
64. Derreumaux P. 2001. Evidence that the 127–164 region of prion proteins has two equi-energetic conformations with beta or alpha features. Biophys J 81:1657–1665.
65. Torrent J, Alvarez-Martinez MT, Liautard JP, Balny C, Lange R. 2005. The role of the 132–160 region in prion protein conformational transitions. Protein Sci 14:956–967.
66. Si K, Lindquist S, Kandel ER. 2003. A neuronal isoform of the Aplysia CPEB has prion-like properties. Cell 115:879–891.
67. Chapman MR, Robinson LS, Pinkner JS, Roth R, Heuser J, et al. 2002. Role of Escherichia coli curli operons in directing amyloid fiber formation. Science 295:851–855.
68. Claessen D, Rink R, de Jong W, Siebring J, de Vreugd P, et al. 2003. A novel class of secreted hydrophobic proteins is involved in aerial hyphae formation in Streptomyces coelicolor by forming amyloid-like fibrils. Genes Dev 17:1714–1726.
69. Glover JR, Kowal AS, Schirmer EC, Patino MM, Liu JJ, et al. 1997. Self-seeded fibers formed by Sup35, the protein determinant of [PSI+], a heritable prion-like factor of S. cerevisiae. Cell 89:811–819.
70. Dover G. 2000. Results may not fit well with current theories. Nature 408:17.
71. Wright CM, Heins RA, Ostermeier M. 2007. As easy as flipping a switch? Curr Opin Chem Biol 11:342–346.
72. Ambroggio XI, Kuhlman B. 2006. Computational design of a single amino acid sequence that can switch between two distinct protein folds. J Am Chem Soc 128:1154–1161.
73. Hori Y, Sugiura Y. 2002. Conversion of antennapedia homeodomain to zinc finger-like domain: Zn(II)-induced change in protein conformation and DNA binding. J Am Chem Soc 124:9362–9363.
74. Gruebele M. 2002. An intermediate seeks instant gratification. Nat Struct Biol 9:154–155.
75. Capaldi AP, Kleanthous C, Radford SE. 2002. Im7 folding mechanism: misfolding on a path to the native state. Nat Struct Biol 9:209–216.
76. Troullier A, Reinstadler D, Dupont Y, Naumann D, Forge V. 2000. Transient non-native secondary structures during the refolding of alpha-lactalbumin detected by infrared spectroscopy. Nat Struct Biol 7:78–86.


77. Hwang W, Zhang S, Kamm RD, Karplus M. 2004. Kinetic control of dimer structure formation in amyloid fibrillogenesis. Proc Natl Acad Sci USA 101:12916–12921.
78. Anderson TA, Cordes MH, Sauer RT. 2005. Sequence determinants of a conformational switch in a protein structure. Proc Natl Acad Sci USA 102:18344–18349.
79. Watters AL, Deka P, Corrent C, Callender D, Varani G, et al. 2007. The highly cooperative folding of small naturally occurring proteins is likely the result of natural selection. Cell 128:613–624.
80. Shakhnovich EI. 1994. Proteins with selected sequences fold into unique native conformation. Phys Rev Lett 72:3907–3910.
81. Rutherford SL, Lindquist S. 1998. Hsp90 as a capacitor for morphological evolution. Nature 396:336–342.
82. Queitsch C, Sangster TA, Lindquist S. 2002. Hsp90 as a capacitor of phenotypic variation. Nature 417:618–624.
83. Bergman A, Siegal ML. 2003. Evolutionary capacitance as a general feature of complex gene networks. Nature 424:549–552.
84. Whitesell L, Lindquist SL. 2005. HSP90 and the chaperoning of cancer. Nat Rev Cancer 5:761–772.
85. Takayama S, Reed JC, Homma S. 2003. Heat-shock proteins as regulators of apoptosis. Oncogene 22:9041–9047.
86. Hori Y, Sugiura Y. 2004. Effects of Zn(II) binding and apoprotein structural stability on the conformation change of designed antennafinger proteins. Biochemistry 43:3068–3074.
87. Mihara H, Takahashi Y. 1997. Engineering peptides and proteins that undergo alpha-to-beta transitions. Curr Opin Struct Biol 7:501–508.
88. Pandya MJ, Cerasoli E, Joseph A, Stoneman RG, Waite E, et al. 2004. Sequence and structural duality: designing peptides to adopt two stable conformations. J Am Chem Soc 126:17016–17024.
89. Kammerer RA, Kostrewa D, Zurdo J, Detken A, Garcia-Echeverria C, et al. 2004. Exploring amyloid formation by a de novo design. Proc Natl Acad Sci USA 101:4435–4440.
90. Cerasoli E, Sharpe BK, Woolfson DN. 2005. ZiCo: a peptide designed to switch folded state upon binding zinc. J Am Chem Soc 127:15008–15009.
91. Kirsten Frank M, Dyda F, Dobrodumov A, Gronenborn AM. 2002. Core mutations switch monomeric protein GB1 into an intertwined tetramer. Nat Struct Biol 9:877–885.
92. Booth DR, Sunde M, Bellotti V, Robinson CV, Hutchinson WL, et al. 1997. Instability, unfolding and aggregation of human lysozyme variants underlying amyloid fibrillogenesis. Nature 385:787–793.
93. Fandrich M, Fletcher MA, Dobson CM. 2001. Amyloid fibrils from muscle myoglobin. Nature 410:165–166.
94. Tan S, Richmond TJ. 1998. Crystal structure of the yeast MAT alpha 2/MCM1/DNA ternary complex. Nature 391:660–666.
95. de Chiara C, Menon RP, Adinolfi S, de Boer J, Ktistaki E, et al. 2005. The AXH domain adopts alternative folds: The solution structure of HBP1 AXH. Structure 13:743–753.
96. Klein-Seetharaman J, Oikawa M, Grimshaw SB, Wirmer J, Duchardt E, et al. 2002. Long-range interactions within a nonnative protein. Science 295:1719–1722.
97. Zhou R, Eleftheriou M, Royyuru AK, Berne BJ. 2007. Destruction of long-range interactions by a single mutation in lysozyme. Proc Natl Acad Sci USA 104:5824–5829.
98. Murray AJ, Lewis SJ, Barclay AN, Brady RL. 1995. One sequence, two folds: a metastable structure of CD2. Proc Natl Acad Sci USA 92:7337–7341.
99. Dyson HJ, Wright PE. 2005. Intrinsically unstructured proteins and their functions. Nat Rev Mol Cell Biol 6:197–208.
