
Protein Refolding Versus Aggregation: Computer Simulations on an Intermediate-resolution Protein Model
Anne Voegler Smith and Carol K. Hall*
Department of Chemical
Engineering, North Carolina
State University, Raleigh
NC 27695-7905, USA
Computer simulations are performed on a system of eight model peptide chains to study how the competition between protein refolding and aggregation affects the optimal conditions for refolding of four-helix bundles. The discontinuous molecular dynamics algorithm is utilized along with an intermediate-resolution protein model that we developed for this work. Physically, the model is much more detailed than any model used to date for simulations of protein aggregation. Each model residue consists of a detailed, three-bead backbone and a simplified, single-bead side-chain. Excluded volume, hydrogen bond, and hydrophobic interactions are modeled with discontinuous (i.e. hard-sphere and square-well) potentials. Simulations efficiently sample conformational space, and complete folding trajectories from random initial configurations to two four-helix bundles are possible within two days on a single-processor workstation. Folding of the bundles follows two main pathways, one through a trimeric intermediate and the other through an intermediate with two dimers. The proportion of trajectories that follow each route is significantly different for the eight-peptide system in this work than in a previously studied four-peptide system, which yields one four-helix bundle, suggesting, as our previous simulations have, that protein folding properties are strongly influenced by the presence of other proteins. Folding of the bundles is optimal within a fixed temperature range, with the high-temperature boundary a function of the complexity of the protein (or oligomer) to be folded and the low-temperature boundary a function of the complexity of the protein's environment. Above the optimal temperature range for folding, the model chains tend to unfold; below the optimal range, the model chains tend to aggregate. As has been seen previously, aggregates have substantial levels of native secondary structure, suggesting that aggregates are composed largely of partially folded intermediates, not denatured chains.
© 2001 Academic Press
Keywords: discontinuous molecular dynamics; protein folding; protein misfolding; aggregation; four-helix bundle
*Corresponding author
E-mail address of the corresponding author: hall@turbo.che.ncsu.edu
Abbreviations used: DMD, discontinuous molecular dynamics.
doi:10.1006/jmbi.2001.4845 available online at http://www.idealibrary.com on J. Mol. Biol. (2001) 312, 187-202
0022-2836/01/010187-16 $35.00/0 © 2001 Academic Press
Introduction
The protein aggregation problem is just as complicated and interesting as the protein folding problem. In multi-protein systems, competition exists between the formation of correct intra-protein interactions during folding and incorrect inter-protein interactions during aggregation [1]. The outcome of this competition is dependent on properties of both the protein itself and the protein's environment. Proteins that aggregate in vivo can have profound pathological implications, as in the aberrant aggregation of β-amyloid proteins to form plaques in Alzheimer's disease [2] and the aggregation of the PrP^Sc variant of the prion protein in various neurodegenerative diseases [3]. Protein aggregation is also a serious obstacle to protein-based drug production. In host cell systems genetically engineered to overproduce heterologous proteins of pharmaceutical importance, the desired proteins often aggregate into inclusion bodies which must be dissolved and then treated with specific renaturation conditions to recover active protein [4-6]. Unfortunately, the fundamental mechanisms underlying aggregation are not well understood, and optimal refolding conditions must be individually determined for each protein of interest.
The goal of our work is to use computer simulations to study how the competition between protein folding and aggregation affects the optimal refolding conditions for proteins. We investigate refolding as a function of temperature in an eight-peptide system designed to form two tetrameric α-helical bundles based on DeGrado and co-workers' de novo designed α family of proteins [7-11]. We compare the results for this eight-peptide system with our previous study on a four-chain system designed to form one tetrameric α-helical bundle and a one-chain system designed to form an α-helix [12]. With these simulations, we are able to probe the balance between protein folding and aggregation in multi-protein systems and to offer a physically based explanation for the optimum in refolding yield as a function of environmental conditions that is observed in this work as well as in previous simulations [12,13] and in refolding experiments [1]. We also study how the conformational properties of individual peptides differ in this eight-peptide system in comparison to the one- and four-peptide systems studied previously, and we characterize the overall conformational properties of multi-peptide aggregates.
Computer simulations of isolated proteins with a variety of protein models have provided a wealth of information on protein stability and on the kinetics and thermodynamics of protein folding. Complex, high-resolution models, such as all-atom models which typically incorporate every atom of the protein (with the exception of some hydrogen atoms), are common [14-16] and have allowed valuable insights into a variety of interesting phenomena including the process of protein unfolding [17-19], the conformational properties of the denatured state ensemble [20,21], and the nature of highly specific protein-protein interactions [22]. Idealized, low-resolution models, such as on- and off-lattice homo- and heteropolymer chain models in which each amino acid residue is represented by a single sphere with identical (homo) or varied (hetero) interaction parameters [23-25], have been used to study conformational transitions during folding [26-28], the structures of molten globule intermediates [29,30], and the conformational variability in ensembles of low energy, native-like structures [31-37].
Intermediate-resolution protein models represent a powerful compromise between the extremely simplified (low-resolution) homo- and heteropolymer chain models and the complex all-atom (high-resolution) models currently favored by the protein folding community. The two-bead model [38-42], with one backbone bead and one side-chain bead per residue, and the three-bead model [43], with one backbone bead and one or two side-chain beads per residue, allow independent backbone and side-chain interactions in the system. Four-bead models [44,45], with three backbone beads and one side-chain bead, have been developed to more accurately represent protein backbone structure. The success with these models suggests that intermediate-resolution models, which inherently access longer times than their high-resolution counterparts, may offer reasonable estimates of the folding process and of three-dimensional folded structures.
Low- and intermediate-resolution protein models of the α family of proteins have been used previously for computer simulation studies of protein folding [41,42,46]. Guo and Thirumalai studied the thermodynamics and kinetics of the α_4 protein using Langevin dynamics simulations on a heteropolymer chain model and found that four-helix bundle folding occurs via a variety of pathways, some of which are complicated and involve long-lived intermediates. They suggest that a regular pattern of hydrophobic and hydrophilic residues is crucial for four-helix bundle folding and that all de novo protein design efforts should carefully arrange the hydrophobic and hydrophilic residues so as to destabilize non-native folds [46]. Using a simplified protein representation with one backbone bead and one side-chain bead per residue, Sikorski et al. performed lattice Monte Carlo simulations on α_1B, α_2, and α_4 peptides [41]. They too found that the sequence designed by DeGrado and co-workers provided enough information to successfully yield the native structure. Multiple folding pathways were observed with equilibrium intermediate structures possessing substantial native character.
While computer simulations are widely used to study the dynamics of isolated proteins during the folding process, simulations of protein aggregation are rare and, to our knowledge, have been performed exclusively on low-resolution protein models. Patro and Przybycien presented some of the earliest aggregation simulations in which a hexagon with a mix of hydrophobic and polar sides was used as the model protein and protein aggregation was studied by monitoring the association of the hexagons with two-dimensional Monte Carlo simulations [47,48]. We recently studied the competition between protein refolding and aggregation using a 40-chain heteropolymer system and two-dimensional lattice Monte Carlo simulations and found that refolding yield is optimal at intermediate values of denaturant concentration and that aggregation arises from the association of partially folded intermediates, not from completely denatured chains [13]. These simulations were performed with a "two-letter" heteropolymer model termed an HP model because each bead on the chain is one of two possible types: hydrophobic (H) or polar (P). Istrail et al. performed simulations on a pair of HP chains using two-dimensional lattice simulations to study the innate ability of a pair of proteins to aggregate [49]. Broglia et al. performed simulations on a pair of 20-letter model protein chains using a three-dimensional Monte Carlo method and found that aggregates consist of partially folded intermediates, not denatured chains [50]. Recently, exact enumeration studies of HP chains have been reported by Giugliarelli et al. [51] and by Harrison et al. [52]. Giugliarelli et al. reported a study of two-dimensional HP chains that probes the influence of the inter-residue interaction strength on the formation of either soluble, non-interacting native structures, aggregates composed of chains with native structures, or aggregates composed of chains with non-native structures. Harrison et al. studied pairs of two- and three-dimensional HP chains to examine the thermodynamics of the conformational change associated with aggregation of model prion proteins. Given the current power of computers and supercomputers, simulations of protein aggregation require simplified protein models. However, our efforts are aimed at simulating aggregation with a model that possesses significantly more physical detail than the protein models used previously to study aggregation.
Here, we present our results from computer simulations on refolding and aggregation of two tetrameric α-helical bundle proteins using an intermediate-resolution protein model. The protein model used here offers a significant improvement in detail over previous low-resolution protein models used in simulations of aggregating systems. In previous papers, we described the development of an off-lattice, intermediate-resolution protein model that incorporates substantial physical protein detail yet is simple enough to permit simulations of multiple proteins over long times [53,54]. In our most recent work, we presented results for simulations on the folding of the isolated 16-residue model peptide that serves as the building block for the tetrameric bundle and for simulations on the assembly of four of these 16-residue model peptides into the native tetrameric bundle [12]. Here, we simulate eight-chain systems to study tetrameric α-helical bundle formation in the presence of competing aggregation events. We study folding to the native state as a function of temperature, and we characterize the predominant conformations of model peptides that are involved in aggregated structures. Despite the simplicity of our model, we obtain α-helical bundle structures with realistic physical properties in relatively short (on the order of hours) simulations on single-processor workstations.
Highlights of our simulation results are the following. The simplicity of the protein model developed for this work, along with the power of the discontinuous molecular dynamics algorithm, enables observation of complete folding trajectories to the native state of two tetrameric α-helical bundles via simulations as short as two days on a 500 MHz single-processor workstation. There is an optimal temperature range for bundle assembly in our simulations, and the range is defined by the tendency of the peptides to aggregate at low temperatures and to unfold at high temperatures. We observe the same two main pathways of bundle assembly as seen previously in the four-peptide system (monomer-to-dimer-to-trimer-to-tetramer, and monomer-to-dimer-to-tetramer); however, the proportion of folding trajectories that follow each route is different for the four- and eight-peptide systems, suggesting that the number of chains in the system has a significant impact on likely folding trajectories. The boundaries marking the optimal temperature range for folding in the eight-peptide system are different from those in the one- and four-peptide systems studied previously. The high-temperature boundary of the optimal temperature range for folding appears to be a function of the complexity of the protein being simulated, and the low-temperature boundary appears to be a function of the complexity of the protein's environment, with more complex environments (such as a larger number of chains) contributing to aggregation. Aggregates tend to consist of chains with substantial native secondary structure and are held together by a significant number of non-native hydrophobic contacts, suggesting that partially folded chains, not denatured chains, are the main component of amorphous aggregates.
The next section describes the model developed for this work, including the physical representation and the potential energy function, and the DMD simulation technique. A further section presents the results and discussion for the simulations of the folding and aggregation of two tetrameric α-helical bundles. The final section provides a brief conclusion.
Models and Methods
The physical protein representation and the model are described in detail elsewhere [54]. We provide a brief description here.
Physical chain representation
The protein model has a fairly realistic backbone structure and a very simplified side-chain structure. Each amino acid residue is modeled with four beads as depicted by the four broken circles in Figure 1. An N united atom represents the amino acid's amide nitrogen and hydrogen, a Cα united atom represents the alpha-carbon and its hydrogen, and a C united atom represents the carbonyl carbon and oxygen. The fourth bead in the model, R, represents the side-chain group. This physical structure, a three-bead backbone and one-bead side-chain, has been used successfully to study the folding of isolated proteins elsewhere [44,45] in concert with different search algorithms and potential energy functions. Model glycine residues do not have R beads, and the model cannot currently be used for proline residues because of proline's unusual structure. In our model, the inter-residue bond is assumed to be in the trans configuration, all backbone bond lengths and bond angles are fixed at their ideal values, and the distance between consecutive Cα united atoms is fixed in accordance with empirical observations. The side-chains in the model may vary in size and distance from Cα, depending on the particular amino acid residues being modeled, and are held in positions relative to the backbone so that all residues are L-isomers. The values of the bond lengths and angles and the method used to maintain these values and chirality are given below. Solvent molecules are not explicitly included in the model. The effect of solvent is factored into the energy function as a potential of mean force.
Forces and interactions
Beads in the protein model are subject to four different types of forces: repulsion due to excluded volume effects, attraction between bonded beads and pseudobonded beads (as defined below), attraction between pairs of backbone beads during hydrogen bond formation, and attraction between pairs of side-chain beads during hydrophobic interactions. Each of these forces is represented by a discontinuous potential, either a hard-sphere potential:
$$u_{ij}(r) = \begin{cases} \infty, & r \le \sigma \\ 0, & r > \sigma \end{cases} \tag{1}$$
where r is the distance between beads i and j and σ is the bead diameter, or a square-well potential:
$$u_{ij}(r) = \begin{cases} \infty, & r \le \sigma \\ -\varepsilon, & \sigma < r \le \lambda\sigma \\ 0, & r > \lambda\sigma \end{cases} \tag{2}$$
where λσ is the well diameter and ε is the well depth. Conceptually, a hard sphere refers to an impenetrable, solid sphere, and a square well refers to an attractive shell, extending from σ out to λσ, that envelops that sphere. Deeper wells correspond to stronger attractive interactions between square-well beads, and shallower wells correspond to weaker attractive interactions. The well depth parameter is coupled to the temperature so that a single parameter, the reduced well depth ε* = ε/k_B T, characterizes the protein environment. High values of ε*, for example, can be considered to characterize a low temperature or poor solvent environment. Reduced temperature, T*, is the inverse of reduced well depth. In this work, T* is defined by the strength of the hydrogen bonding potential and is therefore equal to k_B T/ε_HB, where ε_HB is the depth of the square well on an N or a C bead. The strength of the hydrophobic potential (the depth of the square well on hydrophobic side-chains, ε_HP) may be varied independently of the strength of the hydrogen bonding potential (ε_HB). The reduced time in our simulations is defined to be t* = t(ε_HB/mσ²)^(1/2), where t is the time and m is the average molecular mass of a bead in our model.
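
As a minimal sketch of equations (1) and (2) and of the reduced units defined above, the two potentials can be written as plain functions. The function names and argument order are illustrative assumptions, not the authors' code; the square-well value inside the well is taken as -ε, as is standard for an attractive well of depth ε.

```python
import math


def hard_sphere(r, sigma):
    """Equation (1): infinite for r <= sigma, zero otherwise."""
    return math.inf if r <= sigma else 0.0


def square_well(r, sigma, lam, eps):
    """Equation (2): hard core, then a well of depth eps out to lam * sigma."""
    if r <= sigma:
        return math.inf
    if r <= lam * sigma:
        return -eps
    return 0.0


def reduced_temperature(kT, eps_hb):
    """T* = k_B T / eps_HB (hypothetical helper name)."""
    return kT / eps_hb


def reduced_time(t, eps_hb, m, sigma):
    """t* = t * (eps_HB / (m * sigma^2))^(1/2)."""
    return t * math.sqrt(eps_hb / (m * sigma ** 2))
```

For example, with the hydrophobic side-chain values from Table 1 (σ = 4.408 Å, λ = 1.5), `square_well(5.0, 4.408, 1.5, 1.0)` falls inside the well, while a separation of 7.0 Å lies beyond the 6.612 Å well diameter and contributes nothing.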
Excluded volume
Pairs of beads collide and repel when the distance between them becomes so small that their surfaces touch (when r_ij = σ). Diameters for each of the three types of backbone beads are chosen to be reasonable estimates for the sizes of the atoms they represent as described previously [54] and are shown in Table 1. For interactions between pairs of neighbor beads (three or fewer bonds apart along the chain), we allow the beads to overlap by up to 25%. The amount of overlap is chosen to dictate the range of motion around N-Cα and Cα-C bonds, and our previous work demonstrates that the model exhibits realistic φ-ψ conformational freedom for both non-glycine and glycine residues [54].
Bonds
Covalent bonds are maintained between neighboring beads along the chain backbone and between the Cα and R united atoms. Bonded beads move freely between separation distances of (1 - δ)l and (1 + δ)l, where δ is the bond tolerance and l is the ideal bond length between the bonded beads. The choice of δ defines the acceptable range of fluctuation in the bond length. Here, δ is chosen to be 0.02. In effect, bonds in the simulation fluctuate within 2% of their assigned lengths by experiencing a hard-sphere repulsion at (1 - δ)l and an infinitely strong square-well attraction at (1 + δ)l.
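
The bond rule above can be illustrated with a short sketch. The helper names and the string labels are assumptions made for this example only; the numbers (δ = 0.02 and the N-Cα bond length of 1.460 Å) come from the text and Table 1.

```python
def bond_limits(ideal_length, delta=0.02):
    """Return the inner and outer separations between which a bond moves freely."""
    return (1.0 - delta) * ideal_length, (1.0 + delta) * ideal_length


def bond_event(r, ideal_length, delta=0.02):
    """Classify a bonded pair at separation r.

    'repel'   : hard-sphere repulsion at the inner limit (1 - delta) * l
    'attract' : infinitely deep square-well wall at the outer limit (1 + delta) * l
    None      : the pair is moving freely between the two limits
    """
    inner, outer = bond_limits(ideal_length, delta)
    if r <= inner:
        return "repel"
    if r >= outer:
        return "attract"
    return None
```

For the N-Cα bond this gives a free range of roughly 1.431 Å to 1.489 Å.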
Figure 1. An amino acid residue. The side-chain group is denoted R and represents one of 20 different chemical groups. Broken circles depict atom groups, each of which is represented by a sphere in the model. φ is rotation around the bond between nitrogen and α-carbon atoms. ψ is rotation around the bond between α-carbon and carbonyl carbon atoms.
All covalent bond lengths in the model are assigned ideal values [55] and are given in Table 1. Ideal backbone bond angles, Cα-Cα distances, and residue L-isomerization are achieved by using a set of pseudobonds as described previously [54]. For example, a pseudobond is placed between N and C of each residue to force each N-Cα-C angle to be near its ideal value. Like the covalent bonds described above, the pseudobonds fluctuate within 2% of their assigned lengths. Pseudobonds are also included between neighboring pairs of Cα beads to maintain their distances near the experimentally determined constant value. The concerted action of pseudobonds and covalent bonds restricts the interpeptide group (Cα_i-C_i-N_{i+1}-Cα_{i+1}) to the trans configuration. Finally, pseudobonds are placed between the side-chain and backbone N and C united atoms of each residue to hold the side-chain beads fixed relative to the backbone so that all model residues are L-isomers. The assigned bond angles and corresponding pseudobond lengths are shown in Table 1.
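
One way to see how a pseudobond length corresponds to a bond angle is the law of cosines applied to the two covalent bonds that flank the angle. This is an assumption about how the Table 1 values relate to one another, offered as a consistency check rather than as the authors' stated procedure; the function name is hypothetical.

```python
import math


def pseudobond_length(l1, l2, angle_deg):
    """Distance between the two outer beads of a bonded triple.

    l1, l2    : covalent bond lengths on either side of the central bead (Angstrom)
    angle_deg : bond angle at the central bead (degrees)
    """
    theta = math.radians(angle_deg)
    return math.sqrt(l1 * l1 + l2 * l2 - 2.0 * l1 * l2 * math.cos(theta))


# N-Calpha (1.460), Calpha-C (1.510), N-Calpha-C angle of 111.0 degrees
n_c = pseudobond_length(1.460, 1.510, 111.0)

# Calpha-C (1.510), C-N (1.330), Calpha-C-N angle of 116.0 degrees
ca_n_next = pseudobond_length(1.510, 1.330, 116.0)
```

Under this assumption both results agree with the tabulated N_i-C_i and Cα_i-N_{i+1} pseudobond lengths (2.45 Å and 2.41 Å) to within 0.01 Å.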
Hydrogen bonds
Backbone hydrogen bonds form in real proteins between amide hydrogen atoms and carbonyl oxygen atoms. Although hydrogen and oxygen atoms are not explicitly represented in our model, we maintain enough realistic backbone character that the locations of the hydrogen and oxygen atoms can be calculated at any given time. Therefore, hydrogen bonds occur as an attraction between virtual hydrogen and oxygen atoms, atoms that are not explicitly included in the model but instead lie within the N and C united atom spheres. Four criteria must be met to enable hydrogen bond formation: N and C are separated by 4.2 Å (the N and C well diameter), the nitrogen-hydrogen-oxygen and hydrogen-oxygen-carbon angles are between 120° and 180° (so that the nitrogen-hydrogen and carbon-oxygen vectors point toward each other), neither N nor C is already involved in a hydrogen bond with a different partner, and N and C are separated by at least three intervening residues along the protein chain. In our simulations, the criteria above lead to hydrogen bonds with H···O distances of 1.8-3.1 Å and N-H···O angles of 115-180°, reasonable estimates when compared with the H···O distances and N-H···O angles observed in an analysis of the structures in a protein database [56]. The details of the hydrogen bond interaction can be found in our previous work [54].
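
The four hydrogen-bond criteria listed above can be collected into a single predicate. This is a sketch under stated assumptions: the `HBondCandidate` record and its field names are invented for illustration, the two angles are assumed to have been computed already from the virtual hydrogen and oxygen positions, and the distance test is written as "within the 4.2 Å well diameter".

```python
from dataclasses import dataclass

WELL_DIAMETER = 4.2                   # Angstrom, N and C well diameter (Table 1)
MIN_ANGLE, MAX_ANGLE = 120.0, 180.0   # degrees
MIN_INTERVENING = 3                   # residues between the two partners


@dataclass
class HBondCandidate:
    n_c_distance: float   # separation of the N and C beads (Angstrom)
    nho_angle: float      # nitrogen-hydrogen-oxygen angle (degrees)
    hoc_angle: float      # hydrogen-oxygen-carbon angle (degrees)
    n_residue: int        # residue index of the N bead
    c_residue: int        # residue index of the C bead
    n_bonded: bool        # N already hydrogen bonded to another partner
    c_bonded: bool        # C already hydrogen bonded to another partner


def can_form_hbond(c):
    """Return True when all four criteria from the text are satisfied."""
    within_well = c.n_c_distance <= WELL_DIAMETER
    angles_ok = (MIN_ANGLE <= c.nho_angle <= MAX_ANGLE
                 and MIN_ANGLE <= c.hoc_angle <= MAX_ANGLE)
    both_free = not c.n_bonded and not c.c_bonded
    intervening = abs(c.n_residue - c.c_residue) - 1
    return within_well and angles_ok and both_free and intervening >= MIN_INTERVENING
```

Note that residues i and i+4 have exactly three intervening residues, so the sequence-separation rule admits the i, i+4 pairing typical of an α-helix while excluding closer pairs.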
Hydrogen bonding between virtual hydrogen and oxygen atoms in computer simulations of isolated model proteins has been implemented previously by Klimov et al. [57]. Their model differs from the model used here in that they represent each residue by only one sphere, utilize continuous potentials and Brownian dynamics, and permit hydrogen bonds to form only between predetermined native hydrogen bond partners.
Hydrophobic interactions
In our simulations, hydrophobic side-chains (H) are modeled as square-well beads and polar side-chains (P) are modeled as hard-sphere beads. Therefore, hydrophobic interactions occur as a square-well attraction between pairs of hydrophobic side-chains. Two criteria must be met to enable a hydrophobic interaction: the hydrophobic side-chains must be separated by an appropriate distance, and they must be separated by at least three intervening residues along the protein chain. The first requirement, that the side-chains are separated by an appropriate distance, is met by assigning suitable square-well diameters for H side-chains. The well diameter for H is chosen to be 1.5σ_H, where σ_H is the diameter of a side-chain bead. The second requirement, that the side-chain beads are separated by at least three intervening residues, is implemented to avoid counting a hydrophobic interaction that is merely the result of proximity along the chain.
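
The two hydrophobic-contact criteria can be sketched in the same way. The function signature is hypothetical, and the treatment of inter-chain pairs (no sequence-separation test, since the two beads do not share a chain) is an assumption of this sketch rather than something the text states.

```python
SIGMA_H = 4.408        # Angstrom, side-chain bead diameter (Table 1)
WELL_FACTOR = 1.5      # well diameter is 1.5 * sigma_H = 6.612 Angstrom
MIN_INTERVENING = 3    # residues between partners on the same chain


def hydrophobic_contact(r, res_i, res_j, same_chain):
    """Return True if two H side-chain beads at separation r count as a contact."""
    in_well = r <= WELL_FACTOR * SIGMA_H
    if not same_chain:
        # Assumption: the sequence-separation rule only matters within a chain.
        return in_well
    intervening = abs(res_i - res_j) - 1
    return in_well and intervening >= MIN_INTERVENING
```

With this rule an H pair at positions i and i+4 on one chain can interact, while an i, i+3 pair cannot, even when the beads sit inside each other's wells.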
Table 1. Simulation parameters: bead diameters, well diameters, bond lengths, pseudobond lengths, and corresponding bond angles
Bead diameters, σ (Å)
  N                        3.300
  Cα                       3.700
  C                        4.000
  R                        4.408
Well diameters, λσ (Å)
  N                        4.200
  C                        4.200
  H                        6.612
Bond lengths, l (Å)
  N_i-Cα_i                 1.460
  Cα_i-C_i                 1.510
  C_i-N_{i+1}              1.330
  Cα_i-R_i                 1.531
Pseudobond lengths, l (Å)
  N_i-C_i                  2.45
  Cα_i-N_{i+1}             2.41
  C_i-Cα_{i+1}             2.45
  Cα_i-Cα_{i+1}            3.80
  N_i-R_i                  2.44
  C_i-R_i                  2.49
Bond angles (degrees)
  N_i-Cα_i-C_i             111.0
  Cα_i-C_i-N_{i+1}         116.0
  C_i-N_{i+1}-Cα_{i+1}     122.0
  R_i-Cα_i-N_i             109.6
  R_i-Cα_i-C_i             110.1
Discontinuous molecular dynamics
Simulations are performed using the discontinuous molecular dynamics (DMD) simulation algorithm [58-60], a fast alternative to continuous-potential molecular dynamics techniques. DMD simulations are conducted as follows. Each bead of the model protein chain is assigned an initial position and an initial velocity. The initial positions are random, although they may not violate any of the assigned bond lengths and angles listed in Table 1. The initial velocities are chosen at random from a Maxwell-Boltzmann distribution at a very high temperature (T* = 0.5). At the start of each simulation, the system is annealed from T* = 0.5 to the desired run temperature to minimize kinetic trapping in local free energy minima. The annealing is complete within approximately five reduced time units, a small fraction of overall folding time which is typically 1200 or more reduced time units. When a DMD simulation begins, each bead moves with its individual velocity. The simulation proceeds according to the following schedule: identify the first event, move forward in time until that event occurs, calculate new velocities for the pair of beads involved in the event and calculate any changes in system energy resulting from hydrogen bond events or hydrophobic interactions, find the second event, and so on.
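
The "identify the first event" step of the schedule above can be illustrated for the simplest event type, a hard-sphere collision. The sketch below uses the textbook ballistic collision-time formula and a brute-force search over all pairs with a single shared diameter; it is not the authors' implementation, which handles several event types and uses optimizations such as neighbor lists. All function names are illustrative.

```python
import itertools
import math


def collision_time(r_ij, v_ij, sigma):
    """Time until two ballistic spheres of contact distance sigma touch.

    r_ij, v_ij : relative position and velocity (bead j minus bead i)
    Returns math.inf if the pair never collides.
    """
    b = sum(r * v for r, v in zip(r_ij, v_ij))
    if b >= 0.0:
        return math.inf            # beads are not approaching
    v2 = sum(v * v for v in v_ij)
    r2 = sum(r * r for r in r_ij)
    disc = b * b - v2 * (r2 - sigma * sigma)
    if disc < 0.0:
        return math.inf            # trajectories miss each other
    return (-b - math.sqrt(disc)) / v2


def next_event(positions, velocities, sigma):
    """Return (time, (i, j)) for the earliest hard-sphere collision."""
    best = (math.inf, None)
    for i, j in itertools.combinations(range(len(positions)), 2):
        r_ij = [a - b for a, b in zip(positions[j], positions[i])]
        v_ij = [a - b for a, b in zip(velocities[j], velocities[i])]
        t = collision_time(r_ij, v_ij, sigma)
        if t < best[0]:
            best = (t, (i, j))
    return best
```

After the earliest event is found, every bead is advanced ballistically by that time, the velocities of the colliding pair are updated, and the search is repeated.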
Types of events include excluded volume events, bond events, and square-well hydrogen bond and hydrophobic interaction events. An excluded volume event occurs when the surfaces of two hard-sphere beads collide and repel each other. Bond (or pseudobond) events include a hard-sphere repulsion event which occurs when the bond length is (1 - δ)l and an infinite square-well attraction event which occurs when the bond length is (1 + δ)l. Square-well events include capture, bounce, and dissociation events which occur when the square wells of N and C or the square wells of two H beads touch. Capture events occur when an attraction is felt between two beads, such as the attraction between an N and a C during the formation of a hydrogen bond. In the simulation, the attraction results in an increase in kinetic energy (beads N and C move faster toward each other) and a decrease in potential energy (in accordance with the depth of the N and C square wells). In essence, the capture event causes the beads to become partners. Dissociation events dissolve partnerships and are the opposite of capture events; the beads move away from each other and lose velocity (lowering the kinetic energy of the system) while the system gains potential energy. Bounce events occur between partnered beads that lack enough kinetic energy to dissociate. Both energy and momentum are conserved during all types of events. The event-to-event nature of DMD offers significant computational advantages over standard, continuous-potential molecular dynamics techniques which must proceed through time by taking very small steps [61]. For details on DMD simulations with square-well potentials, see papers by Alder & Wainwright [58] and Smith et al. [61].
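
The capture, dissociation, and bounce outcomes described above follow from conserving energy and momentum along the line joining the two bead centers. The sketch below is a standard square-well update written under that assumption; the function name, argument layout, and outcome labels are illustrative and are not taken from the authors' code.

```python
import math


def square_well_event(r_ij, v_i, v_j, m_i, m_j, delta_u):
    """Update velocities for a pair sitting on a square-well boundary.

    r_ij    : position of bead j minus position of bead i
    delta_u : potential energy change if the boundary is crossed
              (-eps for capture, +eps for dissociation)
    Returns (new_v_i, new_v_j, outcome).
    """
    dist = math.sqrt(sum(x * x for x in r_ij))
    n = [x / dist for x in r_ij]                 # unit vector from i to j
    mu = m_i * m_j / (m_i + m_j)                 # reduced mass
    v_n = sum((b - a) * c for a, b, c in zip(v_i, v_j, n))
    radicand = v_n * v_n - 2.0 * delta_u / mu
    if radicand > 0.0:
        # Enough normal kinetic energy to cross: speed changes, direction kept.
        new_v_n = math.copysign(math.sqrt(radicand), v_n)
        outcome = "capture" if delta_u < 0.0 else "dissociation"
    else:
        # Not enough kinetic energy to climb out of the well: reflect.
        new_v_n = -v_n
        outcome = "bounce"
    dv = new_v_n - v_n
    new_v_i = [a - (mu / m_i) * dv * c for a, c in zip(v_i, n)]
    new_v_j = [b + (mu / m_j) * dv * c for b, c in zip(v_j, n)]
    return new_v_i, new_v_j, outcome
```

In a capture the kinetic energy rises by exactly the well depth, in a dissociation it falls by the same amount, and in a bounce it is unchanged; total momentum is unchanged in all three cases.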
Simulations are performed in the canonical ensemble, which means that the number of particles, volume, and temperature are held constant. Constant number of particles and volume are achieved by creating a virtual three-dimensional box for the simulation and allowing the model protein chains to move within that box. Periodic boundary conditions are used to eliminate artifacts due to simulation box wall effects. With this method, the primary simulation box is replicated infinitely in all dimensions, and chains are allowed to move freely between the boxes. Since each box is an exact replica of all others, when a chain appears to be leaving the primary box, its image simultaneously enters the primary box from the opposite face. The dimensions of the box are chosen to ensure that a chain cannot interact with more than one image of any other chain. For this study, we use a cubic box with sides 100 Å in length. Constant temperature is achieved by implementing the Andersen thermostat method [62] as was used previously [28,54]. With this procedure, all beads in the simulation are subject to random, infrequent collisions with ghost particles. The post-event velocity of a bead colliding with a ghost particle is chosen randomly from a Maxwell-Boltzmann distribution at the simulation temperature. We have implemented several optimization techniques in this work, including neighbor lists and false positioning, which have been described elsewhere [61].
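
Two ingredients of this paragraph lend themselves to a short sketch: wrapping coordinates in the periodic cubic box, and drawing a post-collision velocity for the Andersen thermostat. The helper names are hypothetical, and the use of the minimum-image convention for separations is an assumption consistent with the statement that a chain interacts with at most one image of any other chain.

```python
import math
import random

BOX = 100.0  # Angstrom, side of the cubic simulation box


def wrap(x, box=BOX):
    """Map a coordinate back into the primary box [0, box)."""
    return x % box


def minimum_image(d, box=BOX):
    """Shortest signed separation along one axis under periodic boundaries."""
    return d - box * round(d / box)


def ghost_collision_velocity(mass, kT, rng):
    """Draw a velocity from the Maxwell-Boltzmann distribution.

    Each Cartesian component is Gaussian with zero mean and variance kT / mass.
    """
    s = math.sqrt(kT / mass)
    return [rng.gauss(0.0, s) for _ in range(3)]


# Example: a bead that drifts past the box face re-enters from the opposite side.
x_new = wrap(105.0)
v_new = ghost_collision_velocity(1.0, 0.1, random.Random(0))
```

A bead at x = 105 Å is mapped to x = 5 Å, and two beads whose raw separation along an axis is 95 Å are treated as 5 Å apart through the neighboring image.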
Simulations are performed on alpha workstations and range in length from two to four billion events, the length chosen in each case based on the progress of the system. We ran all simulations until the structures of the chains and system properties such as internal energy were constant for at least the final one billion events of the run. Average system properties for each run were calculated based on the system properties in the final 650 million events of each run. We also present raw data for system properties versus time over the course of individual simulations.
Model peptides
We perform DMD simulations on eight 16-residue model peptide chains, each designed to form an amphipathic α-helix. The peptides have the following sequence of hydrophobic (H) and polar (P) residues: PPHPPHHPPHPPHHPP. This sequence is derived from a sequence designed by Ho & DeGrado [7], GELEELLKKLKELLKG, where the polar residues glycine (G), glutamic acid (E), and lysine (K) and the hydrophobic residue leucine (L) are arranged to generate the simplest helical subunit of a four-helix bundle protein. For purposes of computer simulations, Guo & Thirumalai [46] reduced the Ho & DeGrado sequence to the sequence of Hs and Ps shown above.
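
The reduction from the amino acid sequence to the H/P string can be reproduced with a four-entry lookup. The table below is inferred by aligning the two sequences quoted above position by position (G, E, and K map to P; L maps to H); the dictionary and function names are illustrative.

```python
# Residue classes inferred from aligning GELEELLKKLKELLKG with PPHPPHHPPHPPHHPP.
HP_CLASS = {"G": "P", "E": "P", "K": "P", "L": "H"}


def to_hp(sequence):
    """Translate a one-letter amino acid sequence into its H/P pattern."""
    return "".join(HP_CLASS[aa] for aa in sequence)


hp_sequence = to_hp("GELEELLKKLKELLKG")
```

The lookup covers only the four residue types that occur in this peptide; any other letter raises a `KeyError` rather than being silently classified.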
Results and Discussion
In this section, we present results from 56 independent simulations on systems of eight 16-residue chains. The lowest energy (native) state for this system is two tetrameric α-helical bundles. When native, each tetrameric α-helical bundle has 48 α-helical hydrogen bonds, 12 intra-chain hydrophobic interactions, and 40 inter-chain hydrophobic interactions.
Figure 2 shows three pictures of the native state for the model 16-residue chain that serves as the helical subunit for the tetrameric α-helical bundle. Figure 3 shows the native four-helix bundle structure with each chain shown in a different color. The structure has the "twist" characteristic of tetrameric α-helical bundle proteins [7,63-65], with helices lying at angles offset approximately 20° from the bundle axis. Bundles with all possible combinations of parallel and antiparallel helices are observed during our simulations and are isoenergetic in our model. The native state in our eight-peptide system is two α-helical bundles, each of which is like the one shown in Figure 3.
We describe progression of the system to the
native state via a single order parameter, Q, called
the nativeness parameter. Q is dened as the sum
of Q
HB
, which is a function of the number of native
a-helical hydrogen bonds that form, and Q
HP
,
which is a function of the number of chains that
align due to hydrophobic contacts. Q
HB
and Q
HP
are calculated as follows:
Q
HB
=
1
4
noX of a-helical hydrogen bonds formed
48

(3)
and
Q
HP
=
1
4
noX of aligned pairs of chains
6

(4)
where a pair of chains is "aligned" if the chains are in one of the
following two arrangements. In the first arrangement, two chains are
aligned if they lie in an antiparallel direction such that they have at
least one inter-chain hydrophobic contact and the distances between the
N- and C-terminal hydrophobic side-chains on opposite chains are both
less than 7 Å. Alternatively, two chains are aligned if they lie in a
parallel direction such that they have at least one inter-chain
hydrophobic contact and the distances between the two N-terminal
hydrophobic side-chains and between the two C-terminal hydrophobic
side-chains are both less than 7 Å. With these definitions, the native
structure has six aligned pairs of chains, since chains directly next to
each other in the native bundle are separated by approximately 4.7 Å and
chains diagonally across from each other in the native bundle are
separated by approximately 6.6 Å. Defining alignments in bundle proteins
in this way has been done elsewhere.[66] The normalization constants of
1/4 in each equation were chosen so that a Q value of 1/2 corresponds to
the native state for one tetramer, and equal weight is given for correct
α-helical hydrogen bonds formed and correct hydrophobic arrangements. If
both tetramers fold to the native state, Q = 1. To determine Q for the
simulation, we consider, from the eight chains in the system, all
possible combinations of two sets of four chains. We calculate Q for
each set of four chains in a given combination using equations (3) and
(4) and sum the Q for the two sets of four chains. The maximum average Q
obtained from all possible combinations of two sets of four chains is
used as the value of Q for that simulation.
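The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names, the dictionary of α-helical hydrogen bonds per chain, and the list of aligned pairs are hypothetical inputs that an analysis script would have to supply, and the sketch scores a single configuration by taking the largest summed Q over the 35 distinct ways of splitting eight chains into two sets of four.

```python
from itertools import combinations

NATIVE_HB_PER_BUNDLE = 48     # alpha-helical hydrogen bonds in one native tetramer
NATIVE_PAIRS_PER_BUNDLE = 6   # aligned pairs of chains in one native tetramer


def q_of_set(chains, hb_per_chain, aligned):
    """Q_HB + Q_HP for one set of four chains, following equations (3) and (4)."""
    n_hb = sum(hb_per_chain[c] for c in chains)
    n_pairs = sum(1 for pair in combinations(sorted(chains), 2) if pair in aligned)
    q_hb = 0.25 * n_hb / NATIVE_HB_PER_BUNDLE
    q_hp = 0.25 * n_pairs / NATIVE_PAIRS_PER_BUNDLE
    return q_hb + q_hp


def nativeness(hb_per_chain, aligned_pairs):
    """Largest summed Q over all splits of eight chains into two sets of four."""
    chains = sorted(hb_per_chain)
    if len(chains) != 8:
        raise ValueError("expected exactly eight chains")
    aligned = {tuple(sorted(p)) for p in aligned_pairs}
    first, others = chains[0], chains[1:]
    best = 0.0
    # Keeping the first chain in set A visits each of the 35 splits exactly once.
    for rest in combinations(others, 3):
        set_a = (first,) + rest
        set_b = tuple(c for c in others if c not in rest)
        q = q_of_set(set_a, hb_per_chain, aligned) + q_of_set(set_b, hb_per_chain, aligned)
        best = max(best, q)
    return best


# Hypothetical fully native configuration: chains 0-3 and 4-7 form the two bundles,
# each chain carrying 12 alpha-helical hydrogen bonds (48 per bundle).
native_hb = {c: 12 for c in range(8)}
native_pairs = list(combinations(range(4), 2)) + list(combinations(range(4, 8), 2))
print(nativeness(native_hb, native_pairs))
```

Because every chain belongs to one of the two sets in any split, the hydrogen-bond term is the same for all 35 splits; only the alignment term depends on how the chains are grouped.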
Folding of two tetrameric α-helical bundles (as measured by Q) is a
strong function of temperature and is very similar to the functional
relationship seen previously for the folding of one tetrameric α-helical
bundle in a four-chain system. Variation in Q as a function of reduced
temperature is shown in Figure 4 for both the eight-chain system (filled
circles) and the four-chain system studied previously[12] (open
squares). Each system displays a maximum in Q over a range of
temperatures which corresponds to an optimal temperature range for
folding. In 22 of the 29 eight-chain simulations performed within the
reduced temperature range where Q is a maximum (0.088-0.105), the
eight-chain system successfully assembles into two tetrameric α-helical
bundles and remains within small fluctuations of the native state. At
high reduced temperatures (above 0.105) where both hydrogen bonds and
hydrophobicity are weak, none of the 14 simulations performed yield
stable native bundles. At low reduced temperatures (below 0.088) where
both hydrogen bonds and hydrophobicity are very strong, all 20 of the
simulations performed result in aggregated structures.
Figure 2. The 16-residue peptide in an α-helix. Hydrophobic residues are
dark gray, polar residues are light gray, and the N-terminal residue is
black. The structure on the left is a RasMol[77] cartoon rendering of an
ideal α-helix (φ = −70°, ψ = −40°) with the HP sequence used in this
work. The middle and right structures are drawn with the AVS software
package (Advanced Visual Systems, Inc.) and are the native conformation
of the model chain simulated here shown with full- and half-size beads,
respectively.
Figure 3. Native structure of the tetrameric α-helical bundle shown with
half-size beads (left) and full-size beads (right). Chain backbones are
pastel colors, hydrophobic side-chains are bright colors, polar
side-chains are light gray, and N-terminal beads are black.
One of the simplifications in our model is that the strengths of
interactions, $\varepsilon_{HB}$ and $\varepsilon_{HP}$, are independent
of temperature, which is not the case for real proteins. Using
low-resolution protein models with temperature-dependent hydrophobic
interactions, Chan and Dill find that an optimum in folding rate as a
function of temperature is primarily due to the temperature dependence
of the hydrophobic term, which allows heat denaturation at high
temperatures and cold denaturation at low temperatures.[67] In our
results shown in Figure 4, the optimum in folding yield cannot be
ascribed strictly to heat and cold denaturation. The high temperature
behavior is similar to heat denaturation in that the yield of native
bundles is low because the effective interactions,
$\varepsilon_{HB}^{*} = 1/T^{*}$ and $\varepsilon_{HP}^{*} = 1/(6T^{*})$,
are too weak to persist long enough for complete folding of helices and
assembly of bundles. The low temperature behavior is due to aggregated
structures being stabilized by long-lived non-native interactions, not
to cold denaturation. In fact, in low-temperature simulations starting
from the native state, the native state persists rather than degrading
as would be expected during cold denaturation (data not shown). It has
been shown previously that low-resolution model proteins tend to become
kinetically trapped more often than real proteins,[68] and a similar
phenomenon may exist for intermediate-resolution models such as the one
used here. Given that our model includes temperature-independent forces
and may be more susceptible to kinetic traps than real proteins, our
conclusion about the existence of an optimal temperature range for
folding of real proteins is somewhat speculative. For that reason, we
focus on qualitative differences between the one-, four-, and
eight-chain systems rather than on the details of a single system.
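To make the temperature dependence of the effective interactions concrete, the following sketch tabulates the two relations quoted above, 1/T* for the hydrogen bond term and 1/(6T*) for the hydrophobic term, at a few reduced temperatures taken from this section. The function name and the choice of sample temperatures are assumptions made for illustration only.

```python
def effective_strengths(t_star):
    """Return (eps_HB*, eps_HP*) = (1/T*, 1/(6 T*)) at reduced temperature T*."""
    if t_star <= 0:
        raise ValueError("reduced temperature must be positive")
    return 1.0 / t_star, 1.0 / (6.0 * t_star)


# Sample points: an aggregating run (0.076), the eight-chain optimal range
# boundaries (0.088 and 0.105), and the one-chain upper boundary (0.115).
for t_star in (0.076, 0.088, 0.105, 0.115):
    eps_hb, eps_hp = effective_strengths(t_star)
    print(f"T* = {t_star:.3f}  eps_HB* = {eps_hb:6.2f}  eps_HP* = {eps_hp:5.2f}")
```

Both effective strengths scale together as 1/T*, so their ratio stays fixed at six across the whole temperature range; lowering T* strengthens native and non-native contacts alike.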
Although folding in the one-, four-, and eight-chain systems shows
similar trends with reduced temperature, the optimal temperature range
for folding shrinks as the number of chains in the system increases. The
high-temperature boundary on the optimal temperature folding range is
0.115 for the one-chain system and 0.105 for both the four- and
eight-chain systems. The low temperature boundary on the optimal
temperature folding range is 0.065, 0.080, and 0.088 for the one-,
four-, and eight-chain systems, respectively. The position of the upper
boundary on the optimal temperature range for folding appears to be
dictated by the complexity of the protein being folded. The native
structure of model proteins in both the four- and the eight-chain
systems is a tetrameric α-helical bundle, and the upper edge of the
optimal temperature range for folding is constant at T* = 0.105. In
contrast, the upper edge of the optimal temperature range for folding of
isolated 16-residue peptides that serve as the building blocks for the
tetrameric bundle was shown previously to be considerably higher
(T* = 0.115, data not shown).[12] The position of the lower boundary on
the optimal temperature range for folding appears to be a function of
the complexity of the protein environment. At low temperatures,
misfolding and aggregation out-compete folding, and aberrant misfolding
and aggregation become a more significant problem as the number of
chains in the system increases. In the eight-chain system, significantly
more non-native contacts and unproductive aggregation interactions are
possible than in the four- or one-chain systems because of the increased
number of chains; this added complexity shifts the lower boundary to a
higher temperature.
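The narrowing of the optimal range can be checked directly from the boundary values quoted in this paragraph. The sketch below is illustrative only; the dictionary layout, function names, and regime labels are assumptions, while the boundary values are those reported in the text.

```python
# Optimal reduced-temperature range (low, high) by number of chains, from the text.
OPTIMAL_RANGE = {1: (0.065, 0.115), 4: (0.080, 0.105), 8: (0.088, 0.105)}


def window_width(n_chains):
    """Width of the optimal reduced-temperature range for folding."""
    low, high = OPTIMAL_RANGE[n_chains]
    return high - low


def regime(n_chains, t_star):
    """Place a reduced temperature relative to the optimal range."""
    low, high = OPTIMAL_RANGE[n_chains]
    if t_star < low:
        return "below optimal range"
    if t_star > high:
        return "above optimal range"
    return "within optimal range"


for n_chains in sorted(OPTIMAL_RANGE):
    print(n_chains, "chain(s): width =", round(window_width(n_chains), 3))
```

With these values the width falls from 0.050 for one chain to 0.025 for four chains and 0.017 for eight chains, and all of the narrowing between the four- and eight-chain systems comes from the lower boundary.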
Our hypothesis that the optimal temperature range for protein folding is
bounded on the high side by the complexity of the protein and on the low
side by the complexity of the solution offers an interesting perspective
on the experiments that are performed to refold proteins in vitro. Our
simulation results suggest that, for a given protein, only the low
temperature boundary on the optimal temperature range for folding can be
manipulated experimentally because the high temperature boundary is
fixed by the protein being studied. Reduced temperature in our
simulations is a parameter that can equivalently be considered to be
other environmental properties, such as solvent quality or denaturant
concentration. High reduced temperature corresponds to good solvent or
high denaturant concentration; low reduced temperature corresponds to
poor solvent or low denaturant concentration. Experimental protein
refolding is known to be optimal under particular conditions of
temperature, solvent, and denaturant concentration,[1] and we suggest
that the boundaries on the optimal ranges observed experimentally are
defined at one extreme by the protein being studied and at the other
extreme by the chosen protein environment.
Figure 4. Nativeness parameter, Q, versus reduced temperature for the
four- (open squares) and eight-chain (filled circles) simulations.
Folding to tetrameric α-helical bundles follows many different
trajectories in the eight-chain system. However, two main pathways can
be defined that summarize the possible folding pathways for a given set
of four chains to a tetrameric bundle:
four monomers → dimer + two monomers → trimer + monomer → tetramer
four monomers → two dimers → tetramer
In our previous simulations on four-chain systems, the vast majority
(18/22) were shown to fold via the pathway with a trimeric intermediate.
In contrast, in the eight-chain simulations presented here, slightly
fewer than half of the folding trajectories (20/44) follow this path.
The rest of the folding trajectories in the eight-chain simulations
(24/44) occur without a trimeric intermediate and fold via the second
path shown above. The proportion of folding trajectories that follow
each route is significantly different for the four- and eight-peptide
systems. This result suggests that protein folding properties, such as
dominant folding trajectories, are strongly influenced by the presence
of other proteins. We obtained a similar result in our previous
multiprotein simulations on low-resolution lattice models.[13] These
observations suggest that simulations of isolated proteins, as is
standard in the computational protein folding field, may not accurately
reflect dominant protein folding trajectories in vivo or in concentrated
solutions.
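One way to examine the difference in pathway proportions is a two-sided Fisher exact test on the 2x2 table of counts. The text does not state which statistical test, if any, underlies the word "significantly", so the sketch below is an assumption about how such a check could be done, not a reproduction of the authors' analysis. It also assumes that the four four-chain trajectories not counted in 18/22 followed the two-dimer route.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Rows: four-chain and eight-chain systems.
# Columns: trimeric-intermediate pathway, two-dimer pathway.
p_value = fisher_exact_two_sided(18, 4, 20, 24)
print(f"two-sided Fisher exact p = {p_value:.4f}")
```

The 44 eight-chain trajectories correspond to two bundles in each of the 22 successful simulations, so the two bundles within one run are not strictly independent samples; the test should be read with that caveat.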
Interprotein interactions have been shown experimentally to affect the
kinetics of protein folding. Using theoretical and experimental
approaches, Oliveberg demonstrated that true two-state kinetics for
monomeric proteins may be masked by transient aggregation at high
(~5 μM) protein concentrations and appear as multistate kinetics.[69]
The effective peptide concentrations in our systems are comparatively
high (~50 μM or 60 mg/l in the four-peptide system and ~100 μM or
120 mg/l in the eight-peptide system), well into the concentration
regime where transient aggregates are suspected to exist experimentally.
However, our model native structure is oligomeric and therefore
aggregation itself is a required part of the folding and assembly
process. Further simulation studies at a range of effective peptide
concentrations and, preferably, on a monomeric model peptide will be
required to determine to what extent the kinetic intermediates observed
during our simulations stem from concentration differences and whether
these intermediates can be compared to those predicted by theoretical
and experimental studies.
Snapshots of an eight-chain simulation at T* = 0.10 (within the optimal
temperature range for folding) in which one tetramer folds via a pathway
with a trimeric intermediate and the other folds via a pathway without a
trimeric intermediate are shown in Figure 5. The first snapshot (t* = 0)
shows the random initial configuration of the system. The individual
chains are shown with different colors (red, green, blue, yellow,
orange, purple, magenta, and turquoise). At t* = 114.5, the chains have
paired into four separate dimers via hydrophobic interactions
(blue-yellow, red-green, purple-turquoise, and magenta-orange). (In this
and in all subsequent snapshots, the monomeric chains or multi-chain
complexes are shown side-by-side, rather than in their true relative
positions in space.) The chains vary in the number of α-helical hydrogen
bonds formed from 0 to 12. By t* = 142.6, three of the four dimers have
formed an aggregate, while the fourth dimer (magenta-orange) remains
free. The six-chain aggregate quickly separates into a trimer
(green-turquoise-purple), a dimer (blue-yellow), and a monomer (red), as
shown in the snapshots at t* = 194.5. As the simulation proceeds, the
green-turquoise-purple trimer remains isolated and the chains in this
trimer slowly maneuver into a native-like alignment that offers maximal
hydrophobic contacts. In the meantime, the other five chains undergo
significant changes. The blue-yellow dimer breaks apart, and then the
red and yellow chains form a dimer. Later, the red-yellow and
magenta-orange dimers come together to form a tetramer. By t* = 308.2,
the blue chain, in a β-sheet structure, has joined the tetramer but has
few connections with it and breaks away quickly. By t* = 822.5, the
alignments within the red-orange-yellow-magenta tetramer have begun to
resemble the native state, and the blue chain has lost its β-structure
and begins to establish α-helical hydrogen bonds. At t* = 1210, the
fully helical blue chain associates with the green-turquoise-purple
trimer. Over the next 35 time units, the blue chain aligns with the
other chains in its tetramer, and the native structure is observed in
both tetramers at t* = 1260.
Figures 6 and 7 offer a more detailed description of the folding
trajectory shown in Figure 5 in which the red-orange-yellow-magenta set
of chains and the green-blue-purple-turquoise set of chains assemble
into independent tetrameric α-helical bundles. The top four panels in
Figures 6 (red-orange-yellow-magenta bundle formation) and 7
(green-blue-purple-turquoise bundle formation) show the number of
α-helical hydrogen bonds formed versus reduced time for each of the four
chains in the set and the middle six panels show the number of
inter-chain hydrophobic contacts formed versus reduced time for each
pair of chains in the set. Figure 6 shows that the
red-orange-yellow-magenta tetramer forms via association of two dimeric
intermediates. Early in the simulation (t* = 0-250), the
red-orange-yellow-magenta set of chains (Figure 6) form two dimers, as
can be seen by the large number of hydrophobic contacts during this time
in the second (red and yellow) and fifth (orange and magenta)
hydrophobic contact panels. Despite these inter-chain hydrophobic
interactions, each chain successfully adopts a long, α-helical structure
during the time period from t* = 0 to t* = 250. At approximately
t* = 580, the red-yellow and orange-magenta dimers associate, as can be
seen by the prevalence of hydrophobic contacts between all pairs in
Figure 6. Over approximately the next 680 time units, the red, yellow,
orange, and magenta chains reorient to adopt the native α-helical bundle
structure as shown in Figure 5 at t* = 1260. Figure 7 shows that the
green-blue-purple-turquoise tetramer forms via a trimeric intermediate.
The purple and turquoise chains are the first to form long-lasting
hydrophobic contacts (at approximately t* = 75). By t* = 150, the green
chain has joined the purple-turquoise dimer and each of the three chains
in the resulting trimer has substantial α-helical character. The blue
chain, however, makes very few hydrophobic contacts with the green,
purple, and turquoise chains for the first 1200 time units of the
simulation. The blue chain also spends the first 850 time units of the
simulation in non-helical structures, as can be seen in Figure 5 at
reduced times through 822.5. At t* = 1210, the blue chain finally
associates with the green-purple-turquoise trimer; and at t* = 1260, the
native tetrameric α-helical bundle is achieved.
Figure 5. Snapshots of the conformations of each chain or complex of
chains in an eight-chain simulation that results in formation of two
tetrameric α-helical bundles.
Figure 6. Number of α-helical hydrogen bonds formed and number of
inter-chain hydrophobic contacts formed for the
red-orange-yellow-magenta tetramer during the folding trajectory of the
two tetrameric α-helical bundles depicted in Figure 5.
Figure 7. Number of α-helical hydrogen bonds formed and number of
inter-chain hydrophobic contacts formed for the
green-blue-purple-turquoise tetramer during the folding trajectory of
the two tetrameric α-helical bundles depicted in Figure 5.
Non-native hydrogen bonds are common in our simulations, and multiple
non-native hydrogen bonds can stabilize β-structures, such as the
β-sheet exhibited by the blue chain at t* = 308.2 in Figure 5.
Structures with non-native hydrogen bonds are observed in all of our
eight-chain simulations, and β-structures (either β-turn, β-hairpin, or
β-sheets) are observed in 73.0 % of the simulations. Table 2 shows the
number of simulations in which non-native and β hydrogen bonds (the
hydrogen bonds responsible for a β-structure) are observed for
simulations that result in either folding to the native state or
trapping in a misfolded or aggregated structure for the one- and
four-chain systems studied previously and for the eight-chain system
studied in this work. (Simulations resulting in misfolded/aggregated
structures include all runs within the optimal temperature range for
folding that do not yield the native state and all runs performed below
the optimal temperature range for folding.) Non-native hydrogen bonds
form in nearly all simulations regardless of the number of chains in the
system and of the simulation outcome. Of the 54 single-chain
simulations, only four follow trajectories that do not experience
non-native hydrogen bonds. In all 37 four-chain simulations and in all
49 eight-chain simulations, non-native hydrogen bonds are observed. In
all three systems, β-structure is more common in simulations that result
in misfolds or aggregates than in simulations that result in folding to
the native state. In the one-chain system, all trajectories ending in
misfolded structures involved β hydrogen bonds, compared with only
43.5 % of trajectories that ended in the native state. In the
multi-chain systems, slightly more misfolding and aggregation
trajectories than folding trajectories (80.0 % compared to 72.7 % for
the four-chain system and 92.6 % compared to 77.3 % for the eight-chain
system) involved β hydrogen bonds. This trend suggests that the presence
of β-structures hinders folding, as is expected for our particular
system in which α-helical structures are required for folding.
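The percentages quoted in this paragraph follow directly from the simulation counts reported in Table 2. A minimal sketch of that arithmetic is given below; the dictionary layout and names are illustrative assumptions, while the counts are those listed in the table.

```python
# (number of chains, outcome) -> (simulations, with non-native H-bonds, with beta H-bonds)
TABLE_2 = {
    (1, "native"): (46, 42, 20),
    (1, "misfolded"): (8, 8, 8),
    (4, "native"): (22, 22, 16),
    (4, "misfolded"): (15, 15, 12),
    (8, "native"): (22, 22, 17),
    (8, "misfolded/aggregated"): (27, 27, 25),
}


def percent(count, total):
    """Percentage of simulations in a category, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


for (n_chains, outcome), (n_sims, n_nonnative, n_beta) in TABLE_2.items():
    print(f"{n_chains}-chain, {outcome}: "
          f"non-native {percent(n_nonnative, n_sims)} %, "
          f"beta {percent(n_beta, n_sims)} %")
```

Summing the simulation counts per system also recovers the totals used in the text: 54 single-chain, 37 four-chain, and 49 eight-chain runs within and below the optimal temperature range.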
Extended, β-strand and β-sheet structures have been shown to be
prevalent in the ordered, fibrillar aggregates common to amyloid
diseases.[2,70,71] Though we observe β-structures in our simulations,
the amorphous aggregates that form contain little or no β-sheet content.
In our system, β-structures represent relatively deep energetic traps
that retard progression toward native, α-helical structures.
Table 2. Occurrence of non-native and β(a) hydrogen bonds during one-,
four-, and eight-chain simulations within and below the optimal
temperature range for folding
No.     Simulation       No.          No. simulations with   No. simulations with
chains  outcome          simulations  non-native hydrogen    β hydrogen bonds
                                      bonds observed         observed
1       Native           46           42 (91.3 %)(b)         20 (43.5 %)
1       Misfolded        8            8 (100 %)              8 (100 %)
4       Native           22           22 (100 %)             16 (72.7 %)
4       Misfolded        15           15 (100 %)             12 (80.0 %)
8       Native           22           22 (100 %)             17 (77.3 %)
8       Misfold/aggreg   27           27 (100 %)             25 (92.6 %)
(a) We define β hydrogen bonds as those bonds in stretches of three or
more consecutive hydrogen bonds that contribute to a β-turn, β-hairpin,
or β-sheet structure.
(b) Percentages in parentheses are relative to the total number of
simulations in the category.
Figure 8. An eight-chain aggregate in a simulation at T* = 0.076.
Table 3. Native and non-native characteristics of aggregated structures
                                            Four-chain         Eight-chain
                                            simulations        simulations
Average number of native hydrogen bonds     34.6 (72.1 %)(a)   67.5 (70.3 %)
Average ratio of non-native to native       0.18               0.20
  hydrogen bonds
Average number of aligned pairs of chains   2.76 (46.0 %)      4.18 (34.8 %)
Average ratio of inter-chain non-native     2.39               4.12
  to native hydrophobic interactions
(a) Percentages in parentheses are relative to the number in the native
structure.
The structures of the individual chains involved in aggregates can offer
clues as to the points along the folding trajectories that are most
susceptible to detrimental aggregation events. In our simulations, we
observe that aggregated structures, such as the one shown in Figure 8
for a simulation at T* = 0.076, often possess substantial native
character. Table 3 describes the amount of native character found in
misfolded and aggregated structures for the four-chain system studied
previously and the eight-chain system studied in this work. In both the
four- and eight-chain systems, the misfolds and aggregates have a large
number of α-helical (native) hydrogen bonds, representing more than 70 %
of the α-helical hydrogen bonds that would be present in the native
state. The ratio of non-native to native hydrogen bonds is very low in
both systems, which indicates that aggregated structures are not solely
the product of excessive non-native hydrogen bonds. In fact, the ratio
of non-native to native hydrogen bonds is nearly identical in the four-
and eight-chain systems (0.18 compared with 0.20), suggesting that
hydrogen bonding is unaffected by the number of chains in the system. On
average, 2.76 pairs of chains are aligned in the aggregates in the
four-chain system (nearly one-half of the native six alignments), which
is a considerably higher fraction than the 4.18 pairs of chains that are
aligned in the aggregates in the eight-chain system (one-third of the
native 12 alignments). Unlike hydrogen bonding, alignment of chains
appears to be affected by the system size. The most striking difference
between the four- and eight-chain systems is that, while aggregate
structures in the four-chain systems have only 2.39 times as many
non-native hydrophobic interactions as native hydrophobic interactions,
the aggregate structures in the eight-chain systems have 4.12 times as
many non-native hydrophobic interactions as native hydrophobic
interactions. Hydrophobic interactions contribute to aggregate stability
in both the four- and eight-chain systems. The larger the system, the
larger the role non-native hydrophobic interactions play in stabilizing
aggregated structures.
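The native fractions discussed here depend on the native totals for each system: 48 α-helical hydrogen bonds and six aligned pairs per bundle, with one bundle in the four-chain system and two in the eight-chain system. The sketch below reproduces that normalization; the function and variable names are illustrative assumptions, and the averages are those reported in Table 3.

```python
HB_PER_BUNDLE = 48      # native alpha-helical hydrogen bonds per tetramer
PAIRS_PER_BUNDLE = 6    # native aligned pairs of chains per tetramer


def native_fraction(average, per_bundle, n_chains):
    """Percentage of the native total, rounded to one decimal place."""
    n_bundles = n_chains // 4
    return round(100.0 * average / (per_bundle * n_bundles), 1)


# Averages for aggregated structures as reported in Table 3.
AGGREGATE_AVERAGES = {4: {"hb": 34.6, "pairs": 2.76}, 8: {"hb": 67.5, "pairs": 4.18}}

for n_chains, averages in AGGREGATE_AVERAGES.items():
    hb_pct = native_fraction(averages["hb"], HB_PER_BUNDLE, n_chains)
    pair_pct = native_fraction(averages["pairs"], PAIRS_PER_BUNDLE, n_chains)
    print(f"{n_chains}-chain aggregates: {hb_pct} % of native hydrogen bonds, "
          f"{pair_pct} % of native alignments")
```

The hydrogen-bond fraction is nearly the same in both systems, whereas the alignment fraction drops from about 46 % to about 35 %, which is the contrast the paragraph draws.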
As with the one- and four-chain simulations performed previously,
eight-chain simulations that result in correct assembly to the native
state in this work are very efficient. Folding transitions require as
few as two days on a single-processor 500 MHz workstation.
Conclusions
In eight-chain simulations, where each chain is designed to form an
identical amphipathic α-helix, the model peptides successfully assemble
into two tetrameric α-helical bundles when simulations are performed at
intermediate values of reduced temperature. Despite the simplifications
in our model, which include neglecting details of side-chain structure
and implementing only steric, hydrogen bonding, and hydrophobic forces,
the structural characteristics of the resulting bundle are consistent
with experimental characterization of Ho & DeGrado's original de novo
designed amphipathic α-helix sequence[7-11] and with previous simulation
results for this system.[12,41,46] This agreement is encouraging and
suggests that these and further simulations with this model may provide
reasonable estimates of real peptide behavior. However, it is possible
that the simplifications in the model affect the simulation results. For
example, we monitor hydrophobic interactions between side-chains, but
other side-chain interactions, such as hydrogen bonding and salt links,
are likely to impact intermediate structures and aggregate stability. It
is also possible that incorporating temperature-dependent hydrophobic
interactions and hydrogen bonds will affect the system behavior. Further
simulation studies are necessary to fully assess the robustness of the
results.
Folding of the α-helical bundle follows many different trajectories.
However, two main pathways can be defined, one through a trimeric
intermediate and the other involving the association of two dimers.
Interestingly, the proportion of folding pathways that follow each route
is significantly different for the eight-peptide system than for the
previously studied four-peptide system. While the eight-chain
simulations folded roughly equally via the two pathways (20/44 through
a trimer and 24/44 through two dimers), the four-chain simulations
heavily favored the pathway with a trimeric intermediate (18/22). The
different folding tendencies of the two systems suggest that protein
folding properties, such as dominant pathways, are strongly influenced
by the presence of other proteins; and simulations of isolated proteins,
as is standard practice in the computational folding field, should be
analyzed with this caveat in mind.
The optimal temperature range for folding is different for each of the
systems we have studied. From comparisons between one-, four-, and
eight-peptide systems, it appears that the high-temperature boundary of
the optimal temperature range is a function of the complexity of the
protein (or oligomer) to be folded, while the low-temperature boundary
is a function of the complexity of the protein's environment and the
competition between protein folding and aggregation. Therefore, when
experimental refolding of a particular protein is difficult, efforts to
expand the optimal temperature range should focus on pushing the
low-temperature boundary lower since the high-temperature boundary may
be fixed.
In simulations on eight-peptide systems below their optimal temperature
ranges for folding, aggregation out-competes folding, as we saw
previously in simulations on four-peptide systems. In general,
aggregates in both the eight- and four-peptide systems have substantial
levels of native secondary structure and appear to be stabilized by a
significant number of non-native hydrophobic contacts. This observation
is in agreement with previous experimental[72-76] and simulation[13,50]
studies that suggest aggregates are composed largely of partially folded
intermediates, as opposed to completely denatured chains. All aggregates
observed in our simulations are amorphous, analogous to experimentally
observed inclusion body aggregates, with each peptide chain in each
amorphous aggregate adopting a unique partially folded or random coil
configuration. We do not observe fibrillar aggregates with long-range
order like those formed by β-amyloid proteins in Alzheimer's disease[2]
and by prion proteins in Creutzfeldt-Jakob disease.[52]
In the eight-peptide system, a wide array of structures are actively
sampled, including non-native compact structures and β-sheet
conformations. However, the power of the DMD simulation algorithm, along
with the simplicity of our intermediate-resolution protein model,
enables observation of complete folding trajectories to two tetrameric
α-helical bundles within two days on a 500 MHz single-processor
workstation.
Acknowledgments
This work was supported by the GAANN Biotechnol-
ogy Fellowship program of the U.S. Department of Edu-
cation. Funding was also provided by the National
Institutes of Health under grant number GM-56766 and
the National Science Foundation under grant number
CTS-9704044.
References
1. Jaenicke, R. & Seckler, R. (1997). Protein mis-
assembly in vitro. Advan. Protein Chem. 50, 1-59.
2. Selkoe, D. J. (1999). Translating cell biology into
therapeutic advances in Alzheimer's disease. Nature,
399, A23-A31.
3. Cohen, F. E. & Prusiner, S. B. (1998). Pathologic con-
formations of prion proteins. Annu. Rev. Biochem. 67,
793-819.
4. Manning, M., Patel, K. & Borchardt, R. (1989).
Stability of protein pharmaceuticals. Pharm. Res. 6,
903-918.
5. Costantino, H. R., Langer, R. & Klibanov, A. M.
(1995). Aggregation of lyophilized pharmaceutical
protein, recombinant human albumin: effect of
moisture and stabilization by excipients. Biotechnol.
13, 493-496.
6. King, J. & Betts, S. (1999). A green light for protein
folding. Nature Biotech. 17, 637-638.
7. Ho, S. P. & DeGrado, W. F. (1987). Design of a
4-helix bundle protein: synthesis of peptides which
self-associate into a helical protein. J. Am. Chem. Soc.
109, 6751-6758.
8. Regan, L. & DeGrado, W. F. (1988). Characterization
of a helical protein designed from first principles.
Science, 241, 976-978.
9. Hill, C. P., Anderson, D. H., Wesson, L., DeGrado,
W. F. & Eisenberg, D. (1990). Crystal structure of α1:
implications for protein design. Science, 249, 543-546.
10. Betz, S. F., Bryson, J. W. & DeGrado, W. F. (1995).
Native-like and structurally characterized designed
α-helical bundles. Curr. Opin. Struc. Biol. 5, 457-463.
11. Raleigh, D. P., Betz, S. F. & DeGrado, W. F. (1995).
A de novo designed protein mimics the native state
of natural proteins. J. Am. Chem. Soc. 117, 7558-7559.
12. Smith, A. V. & Hall, C. K. (2001). Assembly of a
tetrameric α-helical bundle: computer simulations on
an intermediate-resolution protein model. Proteins:
Struct. Funct. Genet. 44, 376-391.
13. Gupta, P., Hall, C. K. & Voegler, A. C. (1998). Effect
of denaturant and protein concentrations upon pro-
tein refolding and aggregation: a simple lattice
model. Protein Sci. 7, 2642-2652.
14. Weiner, P. K. & Kollman, P. A. (1981). AMBER -
assisted model-building with energy refinement - a
general program for modeling molecules and their
interactions. J. Comput. Chem. 2, 287-303.
15. Brooks, B. R., Bruccoleri, R. E., Olafson, B. D.,
States, D. J., Swaminathan, S. & Karplus, M. (1983).
CHARMM - a program for macromolecular energy,
minimization, and dynamics calculations. J. Comput.
Chem. 4, 187-217.
16. Levitt, M., Hirschberg, M., Sharon, R. & Daggett, V.
(1995). Potential energy function and parameters for
simulations of the molecular dynamics of proteins
and nucleic acids in solution. Comput. Phys. Com-
mun. 91, 215-231.
17. Caflisch, A. & Karplus, M. (1995). Acid and thermal
denaturation of barnase investigated by molecular
dynamics simulations. J. Mol. Biol. 252, 672-708.
18. Lazaridis, T. & Karplus, M. (1998). ``New view'' of
protein folding reconciled with the old through mul-
tiple unfolding simulations. Science, 278, 1928-1931.
19. Kazmirski, S. L. & Daggett, V. (1999). Analysis
methods for comparison of multiple molecular
dynamics trajectories: applications to protein unfold-
ing pathways and denatured ensembles. J. Mol. Biol.
290, 283-304.
20. Kazmirski, S. L. & Daggett, V. (1998). Simulations of
the structural and dynamical properties of
denatured proteins: the ``molten coil'' state of bovine
pancreatic trypsin inhibitor. J. Mol. Biol. 277, 487-
506.
21. Wong, K. B., Clarke, J., Bond, C. J., Neira, J. L.,
Freund, S. M. V., Fersht, A. R. & Daggett, V. (2000).
Towards a complete description of the structural
and dynamic properties of the denatured state of
barnase and the role of residual structure in folding.
J. Mol. Biol. 296, 1257-1282.
22. Elcock, A. H., Gabdoulline, R. R., Wade, R. C. &
McCammon, J. A. (1999). Computer simulation of
protein-protein association kinetics: acetylcholin-
esterase-fasciculin. J. Mol. Biol. 291, 149-162.
23. Lau, K. & Dill, K. (1989). A lattice statistical mech-
anics model of the conformational and sequence
spaces of proteins. Macromolecules, 22, 3986-3997.
24. Shakhnovich, E. & Gutin, A. (1993). Engineering of
stable and fast-folding sequences of model proteins.
Proc. Natl Acad. Sci. USA, 90, 7195-7199.
25. Socci, N. & Onuchic, J. (1994). Folding kinetics of
protein-like heteropolymers. J. Chem. Phys. 101,
1519-1528.
26. Doniach, S., Garel, T. & Orland, H. (1996). Phase
diagram of a semiflexible polymer chain in a θ
solvent: application to protein folding. J. Chem. Phys.
105, 1601-1607.
27. Zhou, Y., Hall, C. K. & Karplus, M. (1996). First-
order disorder-to-order transition in an isolated
homopolymer model. Phys. Rev. Letters, 77, 2822-
2825.
28. Zhou, Y., Karplus, M., Wichert, J. M. & Hall, C. K.
(1997). Equilibrium thermodynamics of homopoly-
mers and clusters: molecular dynamics and Monte
Carlo simulations of systems with square-well inter-
actions. J. Chem. Phys. 107, 10691-10708.
29. Hu, W. (1998). Structural transformation in the collapse
transition of the single flexible homopolymer
model. J. Chem. Phys. 109, 3686-3690.
30. Wu, C. & Wang, X. (1998). Globule-to-coil transition
of a single homopolymer chain in solution. Phys.
Rev. Letters, 80, 4092-4094.
31. Iori, G., Marinari, E., Parisi, G. & Struglia, M. V.
(1992). Statistical mechanics of heteropolymer fold-
ing. Physica A, 185, 98-103.
32. Camacho, C. J. & Thirumalai, D. (1993). Kinetics and
thermodynamics of folding in model proteins. Proc.
Natl Acad. Sci. USA, 90, 6369-6372.
33. Bratko, D., Chakraborty, A. K. & Shakhnovich, E. I.
(1997). The structure of a random heteropolymer in
a disordered medium: ensemble growth simulation.
J. Chem. Phys. 106, 1264-1278.
34. Irbäck, A., Peterson, C., Potthast, F. & Sommelius,
O. (1997). Local interactions and protein folding: a
three-dimensional off-lattice approach. J. Chem. Phys.
107, 273-282.
35. Zhdanov, V. P. & Kasemo, B. (1997). Monte Carlo
simulation of protein folding with orientation-
dependent monomer-monomer interactions. Proteins:
Struct. Funct. Genet. 29, 508-516.
36. Nymeyer, H., Garcia, A. E. & Onuchic, J. N. (1998).
Folding funnels and frustration in off-lattice minimalist
protein landscapes. Proc. Natl Acad. Sci. USA,
95, 5921-5928.
37. Dinner, A. R. & Karplus, M. (1999). The thermodynamics
and kinetics of protein folding: a lattice
model analysis of multiple pathways with intermediates.
J. Phys. Chem. B, 103, 7976-7994.
38. Kolinski, A. & Skolnick, J. (1992). Discretized model
of proteins. I. Monte Carlo study of cooperativity in
homopolypeptides. J. Chem. Phys. 97, 9412-9426.
39. Kolinski, A. & Skolnick, J. (1994). Monte Carlo
simulations of protein folding. I. Lattice model and
interaction scheme. Proteins: Struct. Funct. Genet. 18,
338-352.
40. Kolinski, A. & Skolnick, J. (1994). Monte Carlo simulations
of protein folding. II. Application to Protein
A, ROP, and crambin. Proteins: Struct. Funct. Genet.
18, 353-366.
41. Sikorski, A., Kolinski, A. & Skolnick, J. (1998).
Computer simulations of de novo designed helical
proteins. Biophys. J. 75, 92-105.
42. Sikorski, A., Kolinski, A. & Skolnick, J. (2000). Computer
simulations of the properties of the α2, α2C,
and α2D de novo designed helical proteins. Proteins:
Struct. Funct. Genet. 38, 17-28.
43. Wallqvist, A. & Ullner, M. (1994). A simplified
amino acid potential for use in structure predictions
of proteins. Proteins: Struct. Funct. Genet. 18, 267-280.
44. Sun, S. (1993). Reduced representation model of pro-
tein structure prediction: statistical potential and
genetic algorithms. Protein Sci. 2, 762-785.
45. Takada, S., Luthey-Schulten, Z. & Wolynes, P. G.
(1999). Folding dynamics with nonadditive forces: a
simulation study of a designed helical protein and a
random heteropolymer. J. Chem. Phys. 110, 11616-
11629.
46. Guo, Z. & Thirumalai, D. (1996). Kinetics and
thermodynamics of folding of a de novo designed
four-helix bundle protein. J. Mol. Biol. 263, 323-343.
47. Patro, S. Y. & Przybycien, T. M. (1994). Simulations
of kinetically irreversible protein aggregate struc-
ture. Biophys. J. 66, 1274-1289.
48. Patro, S. Y. & Przybycien, T. M. (1996). Simulations
of reversible protein aggregate and crystal structure.
Biophys. J. 70, 2888-2902.
49. Istrail, S., Schwartz, R. & King, J. (1999). Lattice
simulations of aggregation funnels for protein fold-
ing. J. Comput. Biol. 6, 143-162.
50. Broglia, R. A., Tiana, G., Pasquali, S., Roman, H. E.
& Vigezzi, E. (1998). Folding and aggregation of
designed proteins. Proc. Natl Acad. Sci. USA, 95,
12930-12933.
51. Giugliarelli, G., Micheletti, C., Banavar, J. R. &
Maritan, A. (2000). Compactness, aggregation, and
prionlike behavior of protein: a lattice model study.
J. Chem. Phys. 113, 5072-5077.
52. Harrison, P. M., Chan, H. S., Prusiner, S. B. &
Cohen, F. E. (1999). Thermodynamics of model
prions and its implications for the problem of prion
protein folding. J. Mol. Biol. 286, 593-606.
53. Smith, A. V. & Hall, C. K. (2000). Bridging the gap
between homopolymer and protein models: a dis-
continuous molecular dynamics study. J. Chem.
Phys. 113, 9331-9342.
54. Smith, A. V. & Hall, C. K. (2001). α-Helix formation:
discontinuous molecular dynamics on an intermedi-
ate-resolution protein model. Proteins: Struct. Funct.
Genet. 44, 344-360.
55. Voet, D. & Voet, J. G. (1990). Biochemistry, John
Wiley & Sons, New York, NY.
56. Baker, E. N. & Hubbard, R. E. (1984). Hydrogen
bonding in globular proteins. Prog. Biophys. Mol.
Biol. 44, 97-179.
57. Klimov, D. K., Betancourt, M. R. & Thirumalai, D.
(1998). Virtual atom representation of hydrogen
bonds in minimal off-lattice models of α helices:
effect on stability, cooperativity and kinetics. Folding
Des. 3, 481-496.
58. Alder, B. J. & Wainwright, T. E. (1959). Studies in
molecular dynamics I. General method. J. Chem.
Phys. 31, 459-466.
59. Rapaport, D. C. (1978). Molecular dynamics simu-
lation of polymer chains with excluded volume.
J. Phys. A: Math. Gen. 11, L213-L217.
60. Bellemans, A., Orban, J. & Belle, D. V. (1980).
Molecular dynamics of rigid and non-rigid necklaces
of hard discs. Mol. Phys. 39, 781-782.
61. Smith, S. W., Hall, C. K. & Freeman, B. D. (1997).
Molecular dynamics for polymeric fluids using discontinuous
potentials. J. Comp. Phys. 134, 16-30.
62. Andersen, H. C. (1980). Molecular dynamics simu-
lations at constant temperature and/or pressure.
J. Chem. Phys. 72, 2384-2393.
63. Crick, F. H. C. (1953). The packing of α-helices:
simple coiled-coils. Acta. Crystallog. 6, 689-697.
64. Kamtekar, S. & Hecht, M. H. (1995). The four-helix
bundle: what determines a fold? FASEB J. 9, 1013-
1022.
65. Zhong, Q., Jiang, Q., Moore, P. B., Newns, D. M. &
Klein, M. L. (1998). Molecular dynamics simulation
of a synthetic ion channel. Biophys. J. 74, 3-10.
66. Rojnuckarin, A., Kim, S. & Subramaniam, S. (1998).
Brownian dynamics simulations of protein folding:
access to milliseconds time scale and beyond. Proc.
Natl Acad. Sci. USA, 95, 4288-4292.
67. Chan, H. S. & Dill, K. A. (1998). Protein folding in
the landscape perspective: chevron plots and non-Arrhenius
kinetics. Proteins: Struct. Funct. Genet. 30,
2-33.
68. Chan, H. S. (1998). Matching speed and locality.
Nature, 392, 761-763.
69. Oliveberg, M. (1998). Alternative explanations for
multistate kinetics in protein folding: transient
aggregation and changing transition-state ensembles.
Acc. Chem. Res. 31, 765-772.
70. Kirschner, D. A., Abraham, C. & Selkoe, D. J. (1986).
X-ray diffraction from intraneuronal paired helical
filaments and extraneuronal amyloid fibers in
Alzheimer disease indicates cross-β conformation.
Proc. Natl Acad. Sci. USA, 83, 503-507.
71. Kelly, J. W. (1998). The alternative conformations of
amyloidogenic proteins and their multi-step assem-
bly pathways. Curr. Opin. Struct. Biol. 8, 101-106.
72. Oberg, K., Chrunyk, B. A., Wetzel, R. & Fink, A. L.
(1994). Nativelike secondary structure in interleukin-1-beta
inclusion bodies by attenuated total reflectance
FT-IR. Biochemistry, 33, 2628-2634.
73. Speed, M. A., Wang, D. I. & King, J. (1995). Multi-
meric intermediates in the pathway to the aggre-
gated inclusion body state for P22 tailspike
polypeptide chains. Protein Sci. 4, 900-908.
74. Wetzel, R. (1996). For protein misassembly, it's the
"I" decade. Cell, 86, 699-702.
75. King, J., Haase-Pettingell, C., Robinson, A. S., Speed,
M. & Mitraki, A. (1996). Thermolabile folding inter-
mediates: inclusion body precursors and chaperonin
substrates. FASEB J. 10, 57-66.
76. Fink, A. L. (1998). Protein aggregation: folding
aggregates, inclusion bodies and amyloid. Folding
Des. 3, R9-R23.
77. Sayle, R. & Milner-White, E. J. (1995). RasMol:
biomolecular graphics for all. Trends Biochem. Sci. 20,
374-376.
Edited by F. Cohen
(Received 9 November 2000; received in revised form 14 May 2001; accepted 14 May 2001)