Protein Folding Problem
[Figure 1 image not reproduced. Panels a, b, and c; recoverable labels: "Native fold", "Partially folded structures", and an axis labeled "(radius)".]
Figure 1
(a) Binary code. Experiments show that a primarily binary hydrophobic-polar code is sufficient to fold helix-bundle proteins (112). Reprinted from Reference 112 with permission from AAAS. (b) Compactness stabilizes secondary structure in proteins, from lattice models. (c) Experiments supporting panel b, showing that compactness correlates with secondary structure content in nonnative states of many different proteins (218). Reprinted from Reference 218 with permission.
[Figure 2 image not reproduced. Panel a: designed molecule and experimentally determined structure. Panel b: peptide and peptoid backbones. Panel c: three-helix bundle; FRET efficiency (0.0 to 1.0) versus [Alcohol] (0% to 20%) for methanol, ethanol, and propanol.]
Figure 2
(a) A novel protein fold, called Top7, designed by Kuhlman et al. (129). Designed molecule (blue) and the experimental structure determined subsequently (red). From Reference 129; reprinted with permission from AAAS. (b) Three-helix bundle foldamers have been made using nonbiological backbones (peptoids, i.e., N-substituted glycines). (c) Their denaturation by alcohols indicates they have hydrophobic cores characteristic of a folded molecule (134).
Studies of lattice models (25, 29, 51) and tube models (11, 12, 159) have shown that secondary structures in proteins are substantially stabilized by the chain compactness, an indirect consequence of the hydrophobic force to collapse (Figure 1). Like airport security lines, helical and sheet configurations are the only regular ways to pack a linear chain (of people or monomers) into a tight space.
Designing New Proteins
and Nonbiological Foldamers
Although our knowledge of the forces of folding remains incomplete, this has not hindered the emergence of successful practical protein design. Novel proteins are now designed as variants of existing proteins (43, 94, 99, 145, 173, 243), from broadened alphabets of nonnatural amino acids (226), or de novo (129) (Figure 2). Moreover, folding codes are used to design new polymeric materials called foldamers (76, 86, 120). Folded helix bundles have now been designed using nonbiological backbones (134). Foldamers are finding applications in biomedicine as antimicrobials (179, 185), lung surfactant replacements (235), cytomegalovirus inhibitors (62), and siRNA delivery agents (217). Hence, questions of deep principle are no longer bottlenecks to designing foldable polymers for practical applications and new materials.

Dill et al., Annu. Rev. Biophys. 2008. 37:289-316. Downloaded from www.annualreviews.org by Indian Institute of Technology - Madras on 06/19/14. For personal use only.
ANRV343-BB37-14 ARI 9 April 2008 11:34
COMPUTATIONAL PROTEIN
STRUCTURE PREDICTION IS
INCREASINGLY SUCCESSFUL
A major goal of computational biology has been to predict a protein's three-dimensional native structure from its amino acid sequence. This could help to (a) accelerate drug discovery by replacing slow, expensive structural biology experiments with faster, cheaper computer simulations, and (b) annotate protein function from genome sequences (9). With the rapid growth of experimentally determined structures available in the Protein Data Bank (PDB), protein structure prediction has become as much a problem of inference and machine learning as it is of protein physics.
Among the earliest uses of protein databases to infer protein structures were secondary structure prediction algorithms (33, 34, 190). In the mid-1980s, several groups began using the methods of computational physics (atomic force fields plus Monte Carlo sampling) to compute the structures of Met-enkephalin, a five-residue peptide (95, 141). The early 1990s saw significant strides in using databases and homology detection algorithms to assemble structures from homologous sequences (192) and to recognize folds by threading unknown sequences onto three-dimensional structures from a database (111). A key advance was the exploitation of evolutionary relationships among sequences through the development of robust sequence alignment methods (32, 64, 224).
CASP: A Community-Wide
Experiment
In 1994, John Moult invented CASP (Critical Assessment of Techniques for Protein Structure Prediction) (165), a biennial, community-wide blind test to predict the unknown structures of proteins. Organizers identify proteins that are likely to be solved soon or whose structures have not yet been released, and predictors have roughly 3 to 5 weeks to predict their native structures. CASP has grown from 35 predictor groups and 24 target sequences in CASP1 in 1994 to over 200 groups and more than 100 targets in CASP7 in 2006.
Over the seven CASPs, two trends are clear (164, 219). First, although much remains to be done, there has been substantial improvement in protein structure prediction. Web servers and software packages often predict the native structure of small, single-domain proteins to within about 2 to 6 Å of their experimental structures (8, 17, 242). In addition, fast-homology methods are computing approximate folds for whole genomes (182, 214). Figure 3 shows a quantitative assessment of performance at the first five CASP meetings. The most significant gains have occurred in the alignments of targets to homologs, the detection of evolutionarily distant homologs, and the generation of reasonable models for difficult targets that do
[Figure 3 image not reproduced. Curves for CASP1 through CASP5. y-axis: prediction success (%), 0 to 100. x-axis: target difficulty, from easy to difficult.]
Figure 3
Progress in protein structure prediction in CASP1-5 (219). The y-axis contains the GDT_TS score, the percentage of model residues that can be superimposed on the true native structure, averaged over four resolutions from 1 to 8 Å (100% is perfect). The x-axis is the ranked target difficulty, measured by sequence and structural similarities to proteins in the PDB at the time of the respective CASP. This shows that protein structure prediction on easy targets is quite good and is improving for targets of intermediate difficulty. Reprinted from Reference 219 with permission.
not have templates (new folds). Since CASP5, predictions have also benefited from the use of metaservers, which solicit and establish consensus among predictions from multiple algorithms. Second, while most methods rely on both physics and bioinformatics, the most successful methods currently draw heavily on knowledge contained in native structural databases. Bioinformatics methods have benefited from the growth in size of the PDB (9, 219).
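The GDT_TS score described in the Figure 3 caption can be illustrated with a minimal sketch. It assumes that the model has already been superimposed on the native structure and that the four resolutions are 1, 2, 4, and 8 Å; the function name gdt_ts and the input list are hypothetical. Assessment software typically searches over many superpositions for each cutoff, which this sketch does not attempt.

```python
def gdt_ts(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Approximate GDT_TS from per-residue model-to-native distances (angstroms).

    Assumption: the distances come from one fixed superposition; the cutoffs
    are an assumed choice of the "four resolutions from 1 to 8 angstroms".
    """
    if not distances:
        raise ValueError("need at least one residue distance")
    n = len(distances)
    # Fraction of residues within each cutoff, then the average over cutoffs.
    fractions = [sum(1 for d in distances if d <= c) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


# Hypothetical five-residue example: one residue within each successive cutoff.
example_distances = [0.5, 1.5, 3.0, 6.0, 12.0]
score = gdt_ts(example_distances)
print(f"GDT_TS = {score:.1f}")
```

A perfect model gives 100, and a model with every residue more than 8 Å away gives 0.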
The following challenges remain (8, 164, 219): (a) to refine homology models beyond those of the best template structures; (b) to reduce errors to routinely better than 3 Å, particularly for proteins that are large, have significant β content, are new folds, or have low homology; (c) to handle large multidomain or domain-swapped proteins; (d) to address membrane proteins; and (e) to predict protein-protein interactions. Structural genomics is likely to help here (87, 222). In any case, the current successes in computer-based predictions of native protein structures are far beyond what was expected thirty years ago, when structure prediction seemed impossible.
ARE THERE MECHANISMS
OF PROTEIN FOLDING?
In 1968, Cyrus Levinthal first noted the puzzle that even though they have vast conformational spaces, proteins can search and converge quickly to native states, sometimes in microseconds. How do proteins find their native states so quickly? It was postulated that if we understood the physical mechanism of protein folding, it could lead to fast computer algorithms to predict native structures from their amino acid sequences. In its description of the 125 most important unsolved problems in science, Science magazine framed the problem this way: "Can we predict how proteins will fold? Out of a near infinitude of possible ways to fold, a protein picks one in just tens of microseconds. The same task takes 30 years of computer time" (1).
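The scale of the search problem behind Levinthal's puzzle can be made concrete with a back-of-the-envelope sketch. The numbers are illustrative assumptions, not values from the text: three conformational states per residue and one conformation sampled per picosecond. The function name exhaustive_search_years is hypothetical.

```python
def exhaustive_search_years(n_residues, states_per_residue=3, rate_per_second=1e12):
    """Years needed to visit every conformation once, under the stated assumptions."""
    conformations = states_per_residue ** n_residues
    seconds = conformations / rate_per_second
    return seconds / (365.25 * 24 * 3600)


# A 100-residue chain under these assumed values.
years = exhaustive_search_years(100)
print(f"about {years:.1e} years for an exhaustive search")
```

Whatever the exact assumed values, the result is many orders of magnitude longer than observed folding times, which is the point of the puzzle.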
The following questions of principle have driven the field: How can all the denatured molecules in a beaker find the same native structure, starting from different conformations? What conformations are not searched? Is folding hierarchical (10, 119)? Which comes first: secondary or tertiary structure (80, 239)? Does the protein collapse to compact structures before structure formation, or before the rate-limiting step (RLS), or are they concurrent (7, 89, 101, 195, 205, 213)? Are there folding nuclei (58, 152)?
Several models have emerged. In the diffusion-collision model, microdomain structures form first and then diffuse and collide to form larger structures (115, 116). The nucleation-condensation mechanism (70) proposes that a diffuse transition state ensemble (TSE) with some secondary structure nucleates tertiary contacts. Some proteins, such as helical bundles, appear to follow a hierarchical diffusion-collision model (155, 169) in which secondary structure forms and assembles in a hierarchical order. In hierarchic condensation (139), the chain searches for compact, contiguous structured units, which are then assembled into the folded state. Alternatively, proteins may fold via the stepwise assembly of structural subunits called foldons (22, 126), or they may search for topomers, which are largely unfolded states that have native-like topologies (45, 150). These models are not mutually exclusive.
There Have Been Big
Advances in Experimental
and Theoretical Methods
The search for folding mechanisms has driven major advances in experimental protein science. These include fast laser temperature-jump methods (22); mutational methods that give quantities called Φ-values (71, 84, 106) [now also used for ion-channel kinetics and other rate processes (42)] or Ψ-values (204), which can identify those residues most important for folding speed; hydrogen
exchange methods that give monomer-level information about folding events (125, 149); and the extensive exploration of protein model systems, including cytochrome c, CI2, barnase, apomyoglobin, the src, α-spectrin, and fyn SH3 domains, proteins L and G, WW domains, trpzip, and trp cage (154). In addition, peptide model experimental test systems provide insights into the fast early-folding events (14, 109, 124). Furthermore, single-molecule methods are beginning to explore the conformational heterogeneity of folding (23, 133, 166, 194).
There have been corresponding advances in theory and computation. Computer-based molecular minimization methods were first applied to protein structures in the 1960s (18, 79, 171), followed by molecular dynamics (140, 155), improved force fields (40) (reviewed in Reference 114), weighted sampling and multi-temperature methods (130, 210), highly parallelized codes for supercomputers (2, 57), and distributed grid computing methods such as Folding@home (198, 241). Models having less atomic detail also emerged to address more global, less detailed questions about protein conformational spaces: (a) The Gō model (82), which was intended to see if a computer could find the native structure if native guiding constraints were imposed, is now widely used to study folding kinetics of proteins having known native structures (37, 38, 113, 197, 225). (b) More physical models, typically based on polymer-like lattices, are used to study the static and dynamic properties of conformational spaces (19, 50). (c) Master-equation approaches can explore dynamics in heterogeneous systems (26, 36, 77, 138, 161, 176, 231, 232). Below, we describe some of what has been learned from these studies.
The PSB Plot: Folding Speed
Correlates with the Localness of
Contacts in the Native Structure
One of the few universal features of protein folding kinetics was first observed by Plaxco,
[Figure 4 image not reproduced. y-axis: ln k_f. x-axis: RCO (%). Correlation coefficient: R = 0.75.]
Figure 4
Folding rate versus relative contact order (a measure of localness of contacts in the native structure) for the 48 two-state proteins given in Reference 91, showing that proteins with the most local contacts fold faster than proteins with more nonlocal contacts.
Simons, and Baker (PSB), namely, that the folding speed of a protein is correlated with a topological property of its native structure (88, 184). As shown in Figure 4, Plaxco et al. found that the folding rates of two-state proteins (now known to vary by more than 8 orders of magnitude) correlate with the average degree to which native contacts are local within the chain sequence: Fast-folders usually have mostly local structure, such as helices and tight turns. Slow-folders usually have more nonlocal structure, such as β-sheets (184), although there are exceptions (237). Folding rates have subsequently been found to correlate well with other native topological parameters such as the protein's effective chain length (chain length minus the number of amino acids in helices) (107), secondary structure length (104), sequence-distant contacts per residue (90), the fraction of contacts that are sequence distant (163), the total contact distance (245), and intrinsic propensities, for example, of α-helices (131). There are now also methods that predict the folding rate from the sequence (91, 186). It follows that a protein typically forms smaller loops and turns faster than it forms larger loops and turns, consistent with the so-called zipping and assembly (ZA) mechanism, described below, which postulates that search speed is
governed by the effective loop sizes [the effective contact order (ECO) (53, 73)] that the chain must search at any step.
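The relative contact order plotted in Figure 4 can be sketched as follows, using the usual definition: the average sequence separation of native contacts divided by the chain length, expressed here as a percentage. The function name, the contact lists, and the two toy topologies are hypothetical illustrations; deciding which residue pairs count as contacts (for example, by a distance cutoff) is left to the caller.

```python
def relative_contact_order(contacts, chain_length):
    """Relative contact order (%) from a list of (i, j) residue-index contacts."""
    if chain_length <= 0:
        raise ValueError("chain_length must be positive")
    if not contacts:
        raise ValueError("need at least one contact")
    total_separation = sum(abs(i - j) for i, j in contacts)
    return 100.0 * total_separation / (len(contacts) * chain_length)


# Toy 20-residue chains (hypothetical): helix-like i, i+4 contacts versus
# hairpin-like contacts that pair residues from opposite ends of the chain.
helix_like = [(i, i + 4) for i in range(1, 17)]
hairpin_like = [(i, 21 - i) for i in range(1, 9)]

print(relative_contact_order(helix_like, 20))    # local contacts, low RCO
print(relative_contact_order(hairpin_like, 20))  # nonlocal contacts, high RCO
```

In this toy comparison the helix-like chain has the lower RCO, which matches the trend that mostly local structures tend to fold faster.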
Proteins Fold on Funnel-Shaped
Energy Landscapes
Why has folding been regarded as so challenging? The issue is the astronomical number of conformations a protein must search to find its native state. Models arose in the 1980s to study the nature of the conformational space (19, 47), i.e., the shape of the energy landscape, which is the mathematical function F(φ, ψ, X) that describes the intramolecular-plus-solvation free energy of a given protein as a function of the microscopic degrees of freedom. A central goal has been to quantify the statistical mechanical partition function, a key component of which is the density of states (DOS), i.e., the number of conformations at each energy level. In simple cases, the logarithm of the DOS is the conformational entropy. Such entropies have not been determinable through all-atom modeling, because that would require astronomical amounts of computational sampling (although replica-exchange methods can now do this for very small peptides). Hence, understanding a protein's DOS and its entropies has required simplified models, such as mean-field polymer and lattice treatments (51), spin-glass theories (19, 47), or exact enumerations in minimalist models (132).
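As an illustration of exact enumeration in a minimalist model, the sketch below counts every self-avoiding conformation of a short chain on a two-dimensional square lattice and bins the conformations by energy. The energy function is an assumption chosen for simplicity (an HP-type model with -1 per non-bonded contact between H monomers), and the function name and example sequence are hypothetical; this is not the specific model used in the cited studies.

```python
from collections import Counter
import math

MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def density_of_states(sequence):
    """Count 2D-lattice conformations of an HP chain at each energy level."""
    n = len(sequence)
    dos = Counter()

    def energy(path):
        index_at = {site: k for k, site in enumerate(path)}
        e = 0
        for k, (x, y) in enumerate(path):
            if sequence[k] != "H":
                continue
            # Look only right and up so that each neighbor pair is counted once.
            for dx, dy in ((1, 0), (0, 1)):
                j = index_at.get((x + dx, y + dy))
                if j is not None and sequence[j] == "H" and abs(j - k) > 1:
                    e -= 1
        return e

    def grow(path, occupied):
        if len(path) == n:
            dos[energy(path)] += 1
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            site = (x + dx, y + dy)
            if site not in occupied:
                occupied.add(site)
                path.append(site)
                grow(path, occupied)
                path.pop()
                occupied.remove(site)

    # Fix the first bond along +x to remove trivial rotated copies.
    grow([(0, 0), (1, 0)], {(0, 0), (1, 0)})
    return dict(dos)


dos = density_of_states("HPHPPHHPHH")  # hypothetical 10-monomer sequence
for e in sorted(dos):
    # ln g(E) is the conformational entropy of each level, in units of k.
    print(e, dos[e], round(math.log(dos[e]), 2))
```

Even for short chains, the counts typically show many conformations at high energy and few at low energy. The number of conformations grows exponentially with chain length, which is why exact enumeration is limited to short chains.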
A key conclusion is that proteins have funnel-shaped energy landscapes, i.e., many high-energy states and few low-energy states (19, 49, 50, 52, 138). What do we learn from the funnel idea? Funnels have both qualitative and quantitative uses. First, cartoonizations of funnels (Figure 5) provide a useful shorthand language for communicating the statistical mechanical properties and folding kinetics of proteins. Figure 5 illustrates fast folding (simple funnel), kinetic trapping (moats or wells), and slow random searching (golf course). These pictures show a key distinction between protein folding and simple classical chemical reactions. A simple chemical reaction proceeds from its reactant A to its product B through a pathway, i.e., a sequence of individual structures. A protein cannot do this because its reactant, the denatured state, is not a single microscopic structure. Folding is a transition from disorder to order, not from one structure to another. Simple one-dimensional reaction path diagrams do not capture this tremendous reduction in conformational degeneracy.
A funnel describes a protein's conformational heterogeneity. Conformational heterogeneity has been found in the few experiments that have been designed to look for it
[Figure 5 image not reproduced. Four landscape cartoons, panels a to d, each with the native state marked N.]
Figure 5
One type of energy landscape cartoon: the free energy F(φ, ψ, X) of the bond degrees of freedom. These pictures give a sort of simplified schematic diagram, useful for illustrating a protein's partition function and density of states. (a) A smooth energy landscape for a fast folder, (b) a rugged energy landscape with kinetic traps, (c) a golf course energy landscape in which folding is dominated by diffusional conformational search, and (d) a moat landscape, where folding must pass through an obligatory intermediate.
(16, 143, 157, 206, 209, 244). For example, using time-resolved FRET with four different intramolecular distances, Sridevi et al. (206) found in Barstar (a) that the chain entropy increases as structures become less stable, (b) that there are multiple folding routes, and (c) that different routes dominate under different folding conditions. Moreover, changing the denaturant can change the dominant pathway, implying heterogeneous kinetics (143). Figure 6 shows the funnel landscape that has been determined by extensive mutational analysis of the seven ankyrin sequence repeats of the Notch ankyrin repeat domain (16, 157, 209).
A funnel describes a protein's chain entropy. The funnel idea first arose to explain denaturation, the balance between the chain entropy and the forces of folding (48). Proteins denature at high temperatures because there are many states of high energy and fewer states of low energy, that is, the landscape is funneled. For cold unfolding, the shape of the funnel changes with temperature because of free-energy changes of the aqueous solvent. When you can accurately compute a
[Figure 6 image not reproduced. Vertical axis: free energy G (kcal/mol), with ticks at -6, 0, and 6.]
Figure 6
The experimentally determined energy landscape of the seven ankyrin repeats of the Notch receptor (16, 157, 209). The energy landscape is constructed by measuring the stabilities of folded fragments for a series of overlapping modular repeats. Each horizontal tier presents the partially folded fragments with the same number of repeats. Reprinted from Reference 157 with permission.
protein's DOS, you can predict the protein's free energy of folding and its denaturation and cooperativity properties. Figure 7 shows an example in which the DOS (set onto its side to illustrate the funnel) was found by extensive lattice enumeration for F13W*, a three-helix bundle, with predictions compared to experiments (146).
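The step from a density of states to a denaturation curve can be sketched with Boltzmann weights. The three-level DOS below is a made-up toy with one native conformation and many high-energy conformations, in arbitrary energy units with k = 1; it is not the F13W* data, and the function name is hypothetical.

```python
import math


def fraction_native(dos, temperature, k_b=1.0):
    """Boltzmann population of the lowest energy level of a density of states.

    dos maps energy -> number of conformations; energies are measured relative
    to the lowest level to keep the exponentials well behaved.
    """
    e_min = min(dos)
    weights = {
        e: g * math.exp(-(e - e_min) / (k_b * temperature))
        for e, g in dos.items()
    }
    partition_function = sum(weights.values())
    return weights[e_min] / partition_function


# Hypothetical funnel-like DOS: few low-energy states, many high-energy states.
toy_dos = {-10: 1, -5: 1_000, 0: 1_000_000}

for t in (0.1, 0.5, 1.0, 2.0, 5.0):
    print(t, round(fraction_native(toy_dos, t), 4))
```

Because the high-energy levels are far more numerous, the native population drops as the temperature rises, which is the entropy-driven thermal denaturation described above.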
[Figure 7 image not reproduced. Left panel: energy (kcal/mol) versus ln (conformation count), with the native state and transition states marked. Right panels: fraction native versus GuHCl (M) and versus temperature (°C).]
Figure 7
(Left) The density of states (DOS) cartoonized as an energy landscape for the three-helix bundle protein F13W*: DOS (x-axis) versus the energy (y-axis). (Right) Denaturation predictions versus experiments (146). The peak free energy (here, where the DOS is minimum), typically taken to be the transition state, is energetically very close to native.
Funnels provide a microscopic framework for folding kinetics. Folding kinetics is traditionally described by simple mass-action models, such as D → I → N (on-path intermediate I between the denatured state D and native state N) or X → D → N (off-path intermediate X), where the symbol I or X represents macrostates that are invoked for the purpose of curve-fitting experimental kinetics data. In contrast, funnel models or master-equation models aim to explain the kinetics in terms of underlying physical forces. They aim to predict the microstate composition of those macrostates, for example. The states in master-equation models differ from those in mass-action models insofar as the former are more numerous, are more microscopic, are defined by structural or energetic criteria, and are arranged kinetically to reflect the underlying funnel-like organization of the dynamical flows.
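A master equation tracks the population p_i(t) of every state through dp_i/dt = Σ_j (k_ji p_j - k_ij p_i). The sketch below integrates this equation with a simple forward-Euler step for a three-state D, I, N scheme. The rate constants, the time step, and the function name are arbitrary illustrative assumptions; published master-equation models use many more states, with rates derived from structural or energetic criteria.

```python
def integrate_master_equation(rates, p0, dt, n_steps):
    """Forward-Euler integration of dp_i/dt = sum_j (k_ji p_j - k_ij p_i).

    rates maps (i, j) -> rate constant for the transition i -> j.
    Returns the list of population dictionaries at every time step.
    """
    states = list(p0)
    p = dict(p0)
    history = [dict(p)]
    for _ in range(n_steps):
        change = {s: 0.0 for s in states}
        for (i, j), k in rates.items():
            flow = k * p[i] * dt  # probability moved from i to j in this step
            change[i] -= flow
            change[j] += flow
        for s in states:
            p[s] += change[s]
        history.append(dict(p))
    return history


# Hypothetical rate constants (arbitrary units) for D <-> I <-> N.
rates = {("D", "I"): 1.0, ("I", "D"): 0.5, ("I", "N"): 2.0, ("N", "I"): 0.01}
history = integrate_master_equation(
    rates, {"D": 1.0, "I": 0.0, "N": 0.0}, dt=0.001, n_steps=20000
)
final = history[-1]
print({s: round(v, 4) for s, v in final.items()})
```

Plotting the populations stored in history against time shows how each state rises and falls, which is the kind of information used to decide whether steps occur in series or in parallel.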
For example, Figure 8b shows a master-equation model for the folding of SH3, illustrating the apparently paradoxical result that folding can be serial and parallel at the same time. The protein has multiple routes available. However, one of the dominant series pathways is U → B → BD → BDE → BCDE → N. BD is the TSE because it is the dynamically least populated state. B precedes BD diagrammatically in series in this pathway. Yet, the probability bucket labeled B does not first fill up and then empty into BD; rather, the levels in both buckets, B and BD, rise and fall together and hence are dynamically in parallel. Such series and parallel steps are also seen in computer simulations of CI2 (189), for example. Sometimes a chain contact A forms before
[Figure 8 image not reproduced. Panel a: populations P_i(t) versus time for a reactant R, transition state TS, and product P. Panel b: SH3 contact map with clusters A to E (legend: strand, diverging turn DT, 3-10 helix, loop); a pathway diagram of states labeled by their contact clusters and relative free energies; and populations P_i(t) versus time for B, BD, BDE, BCDE, and ABCDE.]
Figure 8
(a) A simple single-pathway system. R, reactant; TS, transition state; P, product. (b) Pathway diagram of SH3 folding from a master-equation model. The native protein has five contact clusters: A = RT loop; B = β2β3; C = β3β4; D = RT loop-β4; and E = β1β5. Combined letters, such as BD, mean that multiple contact clusters have formed. Funneling occurs toward the right, because the symbols on the left indicate large ensembles, whereas the symbols on the right are smaller ensembles. The numbers indicate free energies relative to the denatured state. The arrows between the states are colored to indicate transition times between states. The slowest steps are in red; the fastest steps in green. BD is the transition state ensemble because it is the highest free energy along the dominant route. While B and BD would seem to be obligatorily in series, the time evolutions of these states show that they actually rise and fall in parallel (232).
another contact B in nearly all the simulation trajectories (series-like). But another contact C may form before B in some trajectories and after B in others (parallel-like). Some folding is sequential, as in Fyn SH3 (123), cytochrome (66), T4 lysozyme (24), and Im7 (74), and some folding is parallel, as in cytochrome c (83) and HEW lysozyme (151).
Or, consider the traditional idea that a reaction's RLS coincides with the point of the highest free energy, ΔG‡. ΔG‡ is a thermodynamic quantity (the maximum free energy along the fastest route), which usually does correspond to some specific ensemble of structures. Below, we describe why these matters of principle are important.
How do we convert folding experiments into insights about molecular behavior? To interpret data, we must use models. Φ-value experiments aim to identify RLSs in folding. But how we understand the molecular events causing a given Φ-value depends on whether we interpret it by funnels or pathways. A Φ-value measures how a folding rate changes when a protein is mutated (42, 67, 71, 105, 144, 153, 154, 176, 178, 193) (see Reference 231 and references 1-24 therein). Φ equals the change in the logarithm of the folding rate caused by the mutation, divided by the change in the logarithm of the folding equilibrium constant. If we then seek a structural interpretation of Φ, we need a model. Using the Brønsted-Hammond pathway model of chemical reactions, Φ is often assumed to indicate the position of the TSE along the folding reaction coordinate: Φ = 0 means the mutation site is denatured in the TSE; Φ = 1 means the mutation site is native in the TSE. In this pathway view, Φ can never lie outside the range from 0 to 1; in the funnel view, Φ is not physically restricted to this range. For example, Φ < 0 or Φ > 1 has been predicted for mutations that stabilize a helix but that destabilize the bundle's tertiary structure (231). Unfortunately, experiments are not yet definitive. While some Φ-values are indeed observed to be negative or greater than 1 (44, 85, 176, 193), those values might be experimental artifacts (193). Other challenges in interpreting Φ have also been noted (65, 188). To resolve the ambiguities in interpreting Φ, we need to deepen our understanding beyond the single-reaction-coordinate idea.
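The definition of Φ given above translates directly into a short calculation. The sketch below is a minimal illustration; the function name and all rate and equilibrium constants are made-up values, chosen to produce the limiting cases Φ = 1 and Φ = 0 and a negative value of the kind discussed above.

```python
import math


def phi_value(kf_wt, kf_mut, keq_wt, keq_mut):
    """Phi = (change in ln folding rate) / (change in ln folding equilibrium constant)."""
    d_ln_kf = math.log(kf_mut) - math.log(kf_wt)
    d_ln_keq = math.log(keq_mut) - math.log(keq_wt)
    if d_ln_keq == 0:
        raise ValueError("mutation does not change stability; phi is undefined")
    return d_ln_kf / d_ln_keq


# Hypothetical wild-type values: folding rate 100 per second, equilibrium constant 1000.
print(phi_value(100.0, 10.0, 1000.0, 100.0))   # rate falls as much as stability: about 1
print(phi_value(100.0, 100.0, 1000.0, 100.0))  # rate unchanged: 0 (may print -0.0)
print(phi_value(100.0, 200.0, 1000.0, 500.0))  # faster folding, lower stability: about -1
```

The guard against a zero denominator reflects a practical issue: when a mutation barely changes stability, the ratio becomes very sensitive to measurement error.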
How do we convert computer simulations into insights about molecular behavior? Similarly, insights about folding events are often sought from computer modeling. It is much easier to calculate structural or energetic quantities than kinetic quantities. For example, some modeling efforts compute Φ-values by assuming some particular structure for the TSE (78, 233) or some particular reaction coordinate, such as the RMSD to native structure, radius of gyration, number of hydrogen bonds, or number of native contacts (196). Alternatively, a quantity called pfold (56), which defines a separatrix (a sort of continental divide between folded and unfolded states), is sometimes computed. Although pfold predicts the RLSs well for simple landscapes (147), it can give less insight into protein landscapes having multiple barriers or other complexity (30). To go beyond classical assumptions, there has been an extensive and growing effort to use master-equation approaches (13, 26, 31, 36, 60, 61, 77, 110, 161, 172, 176, 201, 202, 208, 211, 212, 231, 232) to explore underlying assumptions about reaction coordinates, pathways, transition states, and RLSs.
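For a discrete-state model, pfold can be computed as a committor: the probability that a trajectory started in a given state reaches the folded state before the unfolded state. The sketch below solves the defining equations q_i = Σ_j T_ij q_j by repeated substitution, with q fixed at 0 in the unfolded state and 1 in the folded state. The five-state chain and its hopping probabilities are a hypothetical toy, and the function name is an assumption.

```python
def committor(transitions, unfolded, folded, tol=1e-12, max_iter=100_000):
    """pfold for every state of a discrete Markov chain.

    transitions maps state -> {next_state: probability}; each row sums to 1.
    """
    q = {s: 0.0 for s in transitions}
    q[unfolded] = 0.0
    q[folded] = 1.0
    for _ in range(max_iter):
        largest_change = 0.0
        for s, row in transitions.items():
            if s == unfolded or s == folded:
                continue  # boundary values stay fixed
            updated = sum(p * q[t] for t, p in row.items())
            largest_change = max(largest_change, abs(updated - q[s]))
            q[s] = updated
        if largest_change < tol:
            break
    return q


# Toy landscape: unbiased hops along a line of states, 0 = unfolded, 4 = folded.
chain = {
    0: {0: 1.0},
    1: {0: 0.5, 2: 0.5},
    2: {1: 0.5, 3: 0.5},
    3: {2: 0.5, 4: 0.5},
    4: {4: 1.0},
}
q = committor(chain, unfolded=0, folded=4)
print({s: round(v, 3) for s, v in q.items()})
```

States with pfold near 0.5 lie on the separatrix. On a landscape with several barriers, a single such dividing surface summarizes less of the kinetics, which is the limitation noted above.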
Funnel models can explain some noncanonical behaviors in ultrafast folding. More than a dozen proteins fold in microseconds (128). Some fold in hundreds of nanoseconds (127, 237). Is there a state of protein folding that is so fast that there is no free-energy barrier at all (156)? This has sometimes been called downhill folding (93, 122, 128, 187). There is currently an intensive search for downhill folders, and much controversy about whether or not such folding has yet been observed, mainly in BBL, a 40-residue helical protein. That controversy hinges on questions of experimental analysis (68, 69, 75, 103, 168, 170, 191): establishing proper baselines and ionization states to find denaturation temperatures and to determine whether the equilibrium is two-state, for example, which would imply a barrier between D and N.
Remarkably, all known ultrafast-folders have anti-Arrhenius thermal kinetics. That is, heating those proteins at high temperatures slows down folding, the opposite of what is expected from traditional activation barriers. Here too, any molecular interpretation requires a model, and the common expectation is based on the classical Arrhenius/Eyring pathway model. Is the Arrhenius model sufficient for funnels involving many fast processes? Ultrafast folding kinetics has recently been explored in various models (59, 77, 122, 148, 167). One funnel model (77) explains that increasing the temperature leads to slower folding because of thermal unfolding of the denatured chain, which enlarges the conformational space that must be searched for the chain to find a downhill route to the native state. It predicts that the ultimate speed limit to protein folding, at temperatures at which all other barriers disappear, is the conformational search through the denatured basin. Near the speed limit of protein folding, the heterogeneity and searching that are intrinsic to funnels can be an important component of the folding physics. That model also explains that helical proteins fold faster than β-sheets, on average, because helices have more parallel microscopic folding routes (because a helix can nucleate at many different points along the chain).
The Zipping and Assembly
Hypothesis for the Folding Routes
Protein folding is a stochastic process: One protein molecule in a beaker follows a different microscopic trajectory than another molecule because of thermal fluctuations. Hence, protein folding is often studied using Monte Carlo or molecular dynamics sampling. However, computations seeking the native state using purely physical models are prohibitively expensive, because this is a challenging needle-in-a-haystack global optimization problem (96, 132, 216). Since the beginnings of experimental folding kinetics, there has been the view that the Levinthal paradox (of how a protein searches its conformational space so quickly) might be explained by a folding mechanism, i.e., by some higher-level description (beyond the statement that it is stochastic) that clarifies how the protein decides which structures to form and avoids searching vast stretches of the conformational space in the first place.
Zipping and assembly (ZA) is a hypothesis for a general folding mechanism. On fast timescales, small fragments of the chain can search their conformations more completely than larger fragments can (53, 73). There are certain problems of global optimization, including the ZA mechanism of protein folding and the Cocke-Kasami-Younger method for parsing sentences (54, 102), in which the globally optimal solution (native structure, in this case) can almost always be found, although this is not guaranteed, by a divide-and-conquer strategy, a fast process of cobbling together smaller locally optimal decisions. Accordingly, in the earliest time steps after folding is initiated (picoseconds to nanoseconds), each of the different peptide fragments of the chain searches for small
local metastable structures, such as helical
turns, β-turns, or small loops. Each peptide
segment searches its own conformations at
the same time that other segments are searching.
Not stable on their own, a few of those
local structures are sufficiently metastable
to survive to the next longer timescale,
where they grow (or zip) into increasingly
larger and more stable structures. On still
longer timescales, pairs or groups of these
substructures can assemble into structures
that are still larger and more native-like, and
metastability gives way to stability (97, 102,
103, 199, 215, 223, 228–230, 232).
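The divide-and-conquer idea can be made concrete with a toy dynamic
program in the spirit of Cocke-Kasami-Younger parsing. The sketch below
is not the ZA model of the cited references. It assumes a hypothetical
two-letter (H/P) chain, scores one unit per nested H-H contact, and
requires contacting residues to be more than `min_loop` positions apart;
all of these choices are made only for illustration.

```python
def best_nested_contacts(seq, min_loop=3):
    """Maximum number of nested H-H contacts, built from short spans upward."""
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the best score for the fragment seq[i..j] (inclusive).
    best = [[0] * n for _ in range(n)]
    for span in range(1, n):              # short fragments first, then longer
        for i in range(n - span):
            j = i + span
            score = 0
            # "Assembly": join two adjacent fragments that are already solved.
            for k in range(i, j):
                score = max(score, best[i][k] + best[k + 1][j])
            # "Zipping": add one contact around an already solved interior.
            if span > min_loop and seq[i] == "H" and seq[j] == "H":
                inner = best[i + 1][j - 1] if span >= 2 else 0
                score = max(score, inner + 1)
            best[i][j] = score
    return best[0][n - 1]


if __name__ == "__main__":
    print(best_nested_contacts("HHPPPHH"))
```

Each table entry reuses the solutions of shorter fragments, so the full
chain is solved in polynomial time without enumerating whole-chain
conformations.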
The ZA mechanism shares much in
common with other mechanisms, such as
diffusion-collision, hierarchical, and foldon
models. The last two mechanisms, however,
are descriptors of experiments. They do not
prescribe how to compute a protein's folding
route from its amino acid sequence. In
contrast, the ZA mechanism is such a
microscopic recipe, starting from the amino acid
sequence and specifying a time series of
ensembles of conformations the chain searches
at each stage of folding. ZA is a funnel process:
There are many parallel microscopic routes at
the beginning, and fewer and more sequential
routes at the end. The ZA mechanism
provides a plausible answer to Levinthal's
paradox of what vast stretches of conformational
space the protein never bothers to search.
For any compact native polymer structure,
there are always routes to the native state that
take only small-conformational-entropy-loss
steps. ZA otherwise explores very little of
conformational space. These few routes
constitute the dominant folding processes in the
ZA mechanism. One test of this mechanism is
the prediction of the change of folding routes
(229), measured by the change in Φ-value
distributions (143), upon circular permutation
of the chain. Proteins can be circularly
permuted if the chain termini are adjacent to each
other in the wild-type native structure. In such
cases, the ends are covalently linked and the
chain is broken elsewhere. This alters the
native topology (contact map) dramatically, and
sometimes the folding routes, but does not
appear to substantially change the native
structure (20, 142, 143, 160, 221).
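The effect of circular permutation on the contact map can be sketched
with simple index bookkeeping. The example below is a hypothetical
illustration: the 10-residue chain, the three contacts, and the cut
position are invented, and zero-based residue indices are assumed.

```python
def permuted_index_map(n_res, cut):
    """Map old residue index -> new index after joining the termini
    and opening the chain just before old residue `cut`."""
    return {old: (old - cut) % n_res for old in range(n_res)}


def remap_contacts(contacts, n_res, cut):
    """Express the same spatial contacts in the permuted numbering."""
    index_map = permuted_index_map(n_res, cut)
    return sorted(tuple(sorted((index_map[i], index_map[j])))
                  for i, j in contacts)


def mean_sequence_separation(contacts):
    """Average |i - j| over all contacts."""
    return sum(abs(i - j) for i, j in contacts) / len(contacts)


if __name__ == "__main__":
    wild_type = [(0, 9), (1, 8), (4, 5)]   # termini in contact, plus a turn
    permuted = remap_contacts(wild_type, n_res=10, cut=5)
    print(permuted)
    print(mean_sequence_separation(wild_type),
          mean_sequence_separation(permuted))
```

The three-dimensional contacts are unchanged, but their sequence
separations differ: the former local turn contact (4, 5) now joins the
new chain ends, while the former terminal contact (0, 9) becomes local.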
PHYSICS-BASED MODELING
OF FOLDING AND
STRUCTURE PREDICTION
Computer simulations of purely physics-
based models are becoming useful for struc-
ture prediction and for studying folding
routes. Here the metric of success is not purely
performance in native structure prediction; it
is to gain a deeper understanding of the forces
and dynamics that govern protein properties.
When purely physical methods are successful,
they will allow us to go beyond bioinformatics
to (a) predict conformational changes,
such as induced fit, important for
computational drug discovery; (b) understand protein
mechanisms of action, motions, folding
processes, enzymatic catalysis, and other
situations that require more than just the static
native structure; (c) understand how proteins
respond to solvents, pH, salts, denaturants,
and other factors; and (d) design synthetic
proteins having noncanonical amino acids
or foldameric polymers with nonbiological
backbones.
A key issue has been whether semiempirical
atomic physical force fields are good
enough to fold up a protein in a computer.
Physics-based methods are currently limited
by large computational requirements owing
to the formidable conformational search
problem and, to a lesser extent, by weaknesses
in force fields. Nevertheless, there have been
notable successes in the past decade enabled
by the development of large supercomputer
resources and distributed computing systems.
The first milestone was a supercomputer
simulation by Duan and Kollman in 1998 of
the folding of the 36-residue villin headpiece
in explicit solvent, for nearly a microsecond
of computed time, reaching a collapsed
state 4.5 Å from the NMR structure (57).
Another milestone was the development by
Pande and colleagues of Folding@home, a
distributed grid computing system running
on the screensavers of volunteer computers
worldwide. Pande and colleagues (241)
have studied the folding kinetics of villin.
High-resolution structures of villin have
recently been reached by Pande and colleagues
(110) and Duan and colleagues (136, 137). In
addition, three groups have folded the
20-residue Trp-cage peptide to 1 Å: Simmerling
et al. (200), the IBM Blue Gene group
of Pitera and Swope (183), and Duan and
colleagues (35). Recently, Lei & Duan (135)
folded the albumin-binding domain, a
47-residue, three-helix bundle, to 2.0 Å.
Physics-based approaches are also folding small
helices and β-hairpin peptides of up to 20
residues that have stable secondary structures
(63, 81, 108, 240, 246; M.S. Shell, R. Ritterson
& K.A. Dill, unpublished data). Physical
potential models have also been sampled using
non-Boltzmann stochastic and deterministic
optimization strategies (121, 174, 207, 220).
[Figure 9 graphic. Protein G sequence: MTYKLILNGKTLKGETTTEAVDAATAEKVFKQYANDNGVDGEWTYDDATKTFTVTE.
Fragment labels (residue ranges): 8–15, 6–17, 28–39, 43–54, 41–56, 28–56, 1–56, 1–20, 30–37, 45–52. Panels a and b.]
Figure 9
The folding routes found in the ZAM conformational search process for
protein G, from the work described in Reference 177. The chain is first
parsed into many short, overlapping fragments. After sampling by replica
exchange molecular dynamics, stable hydrophobic contacts are identified
and restrained. Fragments are then either (a) grown or zipped through
iterations of adding new residues, sampling, and contact detection, or
(b) assembled together pairwise using rigid body alignment followed by
further sampling until a completed structure is reached.
Here are some of the key conclusions.
First, a powerful way to sample conformations
and obtain proper Boltzmann averages
is replica exchange molecular dynamics
(REMD) (210). Second, although force fields
are good, they need improvements in backbone
torsional energies to address the balance
between helical and extended conformations
(81, 108, 240), and in implicit solvation,
which dramatically reduces the expense relative
to explicit water simulations but which
frequently overstabilizes ion-pairing interactions,
in turn destabilizing native structures
(63, 246).
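The exchange step that lets REMD preserve Boltzmann statistics can be
sketched as follows. This is a minimal illustration of the standard
Metropolis swap criterion, min(1, exp[(β_i − β_j)(E_i − E_j)]), not code
from any of the cited studies. The energy units (kcal/mol), the function
names, and the geometric temperature ladder are assumptions made for the
example.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def exchange_probability(energy_i, energy_j, temp_i, temp_j):
    """Probability of accepting a swap between replicas i and j."""
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    # Equivalent to min(1, exp(delta)); the branch avoids float overflow.
    return 1.0 if delta >= 0.0 else math.exp(delta)


def geometric_ladder(t_min, t_max, n_replicas):
    """Temperatures spaced by a constant ratio (one common, assumed choice)."""
    ratio = (t_max / t_min) ** (1.0 / (n_replicas - 1))
    return [t_min * ratio ** k for k in range(n_replicas)]


if __name__ == "__main__":
    print(geometric_ladder(270.0, 600.0, 4))
    # The cold replica already holds the lower energy: the swap is unlikely.
    print(exchange_probability(-100.0, -90.0, 300.0, 400.0))
    # The cold replica holds the higher energy: the swap is always accepted.
    print(exchange_probability(-90.0, -100.0, 300.0, 400.0))
```

Swaps that move a low-energy conformation to a colder replica are always
accepted, while the reverse move is accepted with Boltzmann-weighted
probability, which is what keeps each temperature's ensemble canonical.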
Can modern force fields with Boltzmann
sampling predict larger native structures?
Recent work indicates that, when combined with
a conformational search technique based on
the ZA folding mechanism, purely physics-based
methods can arrive at structures close
to the native state for chains up to 100
monomers (177; M.S. Shell, S.B. Ozkan, V.A.
Voelz, G.H.A. Wu & K.A. Dill, unpublished
data). The approach, called ZAM (zipping
and assembly method), uses replica exchange
and the AMBER96 force field and works by
(a) breaking the full protein chain into small
fragments (initially 8-mers), which are
simulated separately using REMD; (b) then
growing or zipping the fragments having
metastable structures by adding a few new
residues or assembling two such fragments
together, with further REMD and iterations;
and (c) locking in place any stable
residue-residue contacts with a harmonic
spring, enforcing emerging putative physical
folding routes, without the need to sample
huge numbers of degrees of freedom at a
time.
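Steps (a) and (c) above can be sketched in a few lines. This is a
hedged illustration rather than the ZAM implementation: the 8-residue
fragment size comes from the text, but the stride of 4, the rest length,
the force constant, and all function names are assumptions. The sequence
is the protein G sequence shown with Figure 9.

```python
PROTEIN_G = "MTYKLILNGKTLKGETTTEAVDAATAEKVFKQYANDNGVDGEWTYDDATKTFTVTE"


def overlapping_fragments(seq, size=8, stride=4):
    """Step (a): cut the chain into overlapping fragments.

    Returns (start_index, fragment) pairs; `stride` is an assumed value.
    """
    if len(seq) <= size:
        return [(0, seq)]
    starts = list(range(0, len(seq) - size + 1, stride))
    if starts[-1] != len(seq) - size:
        starts.append(len(seq) - size)    # make sure the C-terminus is covered
    return [(s, seq[s:s + size]) for s in starts]


def harmonic_restraint(distance, rest_length=6.0, force_constant=1.0):
    """Step (c): spring energy 0.5 * k * (d - d0)^2 for a locked contact.

    `rest_length` (angstroms) and `force_constant` are assumed values.
    """
    return 0.5 * force_constant * (distance - rest_length) ** 2


if __name__ == "__main__":
    for start, fragment in overlapping_fragments(PROTEIN_G):
        print(start, fragment)
    print(harmonic_restraint(8.0, rest_length=6.0, force_constant=2.0))
```

In a full workflow each fragment would be sampled separately (step a),
and contacts judged stable would contribute a restraint term like the one
above in later growth and assembly rounds (steps b and c).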
ZAM was tested through the folding
of eight of nine small proteins from the
PDB to within 2.5 Å, using a 70-processor
cluster over 6 months (177), giving good
agreement with the Φ-values known for
four of them. Figure 9 shows the ZAM
folding process for one of these proteins, and
Figure 10 shows the predicted versus
experimental structures for all nine. In a more
stringent test, ZAM was applied in CASP7 to the
folding of six small proteins from 76 to 112
residues (M.S. Shell, S.B. Ozkan, V.A. Voelz,
G.H.A. Wu & K.A. Dill, unpublished data).
Of the four proteins attempted in CASP7 that
were not domain-swapped, ZAM predicted
roughly correct tertiary structures, segments
of more than 40 residues with an average
RMSD of 5.9 Å, and secondary structures
with 73% accuracy. From these studies
it has been concluded that ZA routes can
identify limited-sampling routes to the native
state from unfolded states, directed by
all-atom force fields, and that AMBER96 plus
a generalized Born implicit solvent model is a
reasonable scoring function. Fragments that
adopt incorrect secondary structures early in
the simulations are frequently corrected in
later-stage folding because the emerging
tertiary structure of the protein often will not
tolerate them.
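The RMSD figures quoted above compare predicted and experimental atom
positions. The sketch below shows the basic calculation under a stated
simplification: it removes only the translational offset by centering
both coordinate sets, and it omits the optimal rotation (for example by
the Kabsch algorithm) that a full superposition would apply. The value
it returns is therefore an upper bound on the superposed RMSD. Function
names and the toy coordinates are assumptions.

```python
import math


def centered(coords):
    """Shift a list of (x, y, z) points so that its centroid is the origin."""
    n = len(coords)
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    return [(x - cx, y - cy, z - cz) for x, y, z in coords]


def rmsd_after_centering(coords_a, coords_b):
    """Root-mean-square deviation between matched atoms (no rotation fit)."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal length")
    a = centered(coords_a)
    b = centered(coords_b)
    squared = sum((pa[k] - pb[k]) ** 2
                  for pa, pb in zip(a, b) for k in range(3))
    return math.sqrt(squared / len(a))


if __name__ == "__main__":
    model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 3.0)]
    print(rmsd_after_centering(model, reference))
```

For real structures the coordinates would be the matched Cα atoms of the
predicted and PDB models, in angstroms.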
SUMMARY
The protein folding problem has seen
enormous advances over the last fifty years.
New experimental techniques have arisen,
including hydrogen exchange, Φ-value methods
that probe mutational effects on folding
rates, single-molecule methods that can
explore heterogeneity of folding and energy
landscapes, and fast temperature-jump
methods. New theoretical and computational
approaches have emerged, including
methods of bioinformatics, multiple-sequence
alignments, structure-prediction
Web servers, physics-based force fields of
good accuracy, physical models of energy
landscapes, fast methods of conformational
sampling and searching, master-equation
methods to explore the physical mechanisms
of folding, parallel and distributed
grid-based computing, and the CASP
community-wide event for protein structure
prediction.
Protein folding no longer appears to be
the insurmountable grand challenge that it
once appeared to be. Current knowledge of
folding codes is sufficient to guide the
successful designs of new proteins and foldameric
materials. For the once seemingly intractable
Levinthal puzzle, there is now a viable
hypothesis: A protein can fold quickly and
solve its big global optimization puzzle
by piecewise solutions of smaller component
puzzles. Other matters of principle
are now yielding to theory and physics-based
modeling. And current computer algorithms
are now predicting native structures of
small proteins remarkably accurately, promising
growing value in drug discovery and
proteomics.
[Figure 10 panels: protein A; albumin-binding domain protein; α3D; protein G; α-spectrin SH3; src-SH3;
YJQ8 WW domain (res 7–31); FBP28 WW domain (res 6–31); 35-mer unit of ubiquitin]
Figure 10
Ribbon diagrams of the predicted protein structures using the ZAM search
algorithm (purple) versus experimental PDB structures (orange). The backbone
Cα-RMSDs with respect to PDB structures are protein A (1.9 Å), albumin-binding
domain protein (2.4 Å), alpha-3D [2.85 Å (excluding loop residues)
or 4.6 Å], FBP28 WW domain (2.2 Å), YJQ8 WW domain (2.0 Å), 1–35
residue fragment of ubiquitin (2.0 Å), protein G (1.6 Å), and α-spectrin SH3
(2.2 Å). ZAM fails to find the src-SH3 structure: Shown is a conformation
that is 6 Å from the experimental structure. The problem in this case appears
to be overstabilization of nonnative ion pairs in the GB/SA implicit solvation
model.
SUMMARY POINTS
1. The protein folding code is mainly embodied in side chain solvation interactions.
Novel protein folds and nonbiological foldamers are now being successfully designed
and are moving toward practical applications.
2. Thanks to CASP, the growing PDB, and fast-homology and sequence alignment
methods, computer methods now can often predict correct native structures of small
proteins.
3. The protein folding problem has both driven, and benefited from, big advances in
experimental and theoretical/computational methods.
4. Proteins fold on funnel-shaped energy landscapes, which describe the conformational
heterogeneity among the nonnative states. This heterogeneity is key to the entropy
that opposes folding and thus to folding equilibria. This heterogeneity is also impor-
tant for understanding folding kinetics at the level of the individual chain processes.
5. A protein can fold quickly to its native structure by ZA, making independent local
decisions first and then combining those substructures. In this way, a protein can avoid
searching most of its conformational space. ZA appears to be a useful search method
for computational modeling.
DISCLOSURE STATEMENT
The authors are not aware of any biases that might be perceived as affecting the objectivity of
this review.
ACKNOWLEDGMENTS
For very helpful comments and insights, both on this review and through ongoing discussions
over the years, we are deeply grateful to D. Wayne Bolen, Hue Sun Chan, John Chodera, Yong
Duan, Walter Englander, Frank Noé, José Onuchic, Vijay Pande, Jed Pitera, Kevin Plaxco,
Adrian Roitberg, George Rose, Tobin Sosnick, Bill Swope, Dave Thirumalai, Vince Voelz,
Peter G. Wolynes, and Huan-Xiang Zhou. We owe particular thanks and appreciation to Buzz
Baldwin, to whom this volume of the Annual Review of Biophysics is dedicated, not only for his
interest and engagement with us on matters of protein folding over the many years, but also
for his pioneering and founding leadership of the whole field. We appreciate the support from
NIH grant GM 34993, the Air Force, and the Sandler Foundation.
LITERATURE CITED
1. 2005. So much more to know. . . . Science 309:78–102
2. Allen F, Coteus P, Crumley P, Curioni A, Denneau M, et al. 2001. Blue gene: a vision for
protein science using a petaflop supercomputer. IBM Syst. J. 40:310–27
3. Anfinsen CB. 1973. Principles that govern the folding of protein chains. Science 181:223–30
4. Anfinsen CB, Scheraga HA. 1975. Experimental and theoretical aspects of protein folding.
Adv. Protein Chem. 29:205–300
5. Auton M, Holthauzen LM, Bolen DW. 2007. Anatomy of energetic changes accompanying
urea-induced protein denaturation. Proc. Natl. Acad. Sci. USA 104:15317–22
6. Avbelj F, Baldwin RL. 2002. Role of backbone solvation in determining thermodynamic
β propensities of the amino acids. Proc. Natl. Acad. Sci. USA 99:1309–13
7. Bachmann A, Kiefhaber T. 2001. Apparent two-state tendamistat folding is a sequential
process along a defined route. J. Mol. Biol. 306:375–86
8. Baker D. 2006. Prediction and design of macromolecular structures and interactions.
Philos. Trans. R. Soc. B Biol. Sci. 361:459–63
9. Baker D, Sali A. 2001. Protein structure prediction and structural genomics. Science
294:93–96
10. Baldwin RL, Rose GD. 1999. Is protein folding hierarchic? I. Local structure and peptide
folding. Trends Biochem. Sci. 24:26–33
11. Banavar JR, Maritan A. 2007. Physics of proteins. Annu. Rev. Biophys. Biomol. Struct.
36:261–80
12. Banavar JR, Maritan A, Micheletti C, Trovato A. 2002. Geometry and physics of proteins.
Proteins 47:315–22
13. Best RB, Hummer G. 2005. Reaction coordinates and rates from transition paths. Proc.
Natl. Acad. Sci. USA 102:6732–37
14. Bieri O, Wirz J, Hellrung B, Schutkowski M, Drewello M, Kiefhaber T. 1999. The speed
limit for protein folding measured by triplet-triplet energy transfer. Proc. Natl. Acad. Sci.
USA 96:9597–601
15. Bolhuis PG. 2005. Kinetic pathways of β-hairpin (un)folding in explicit solvent. Biophys.
J. 88:50–61
16. Bradley CM, Barrick D. 2006. The Notch ankyrin domain folds via a discrete, centralized
pathway. Structure 14:1303–12
17. Bradley P, Misura KMS, Baker D. 2005. Toward high-resolution de novo structure
prediction for small proteins. Science 309:1868–71
18. Brant DA, Flory PJ. 1965. The role of dipole interactions in determining polypeptide
configurations. J. Am. Chem. Soc. 87:663–64
19. Bryngelson JD, Wolynes PG. 1987. Spin glasses and the statistical mechanics of protein
folding. Proc. Natl. Acad. Sci. USA 84:7524–28
20. Bulaj G, Koehn RE, Goldenberg DP. 2004. Alteration of the disulfide-coupled folding
pathway of BPTI by circular permutation. Protein Sci. 13:1182–96
21. Byrne MP, Manuel RL, Lowe LG, Stites WE. 1995. Energetic contribution of side
chain hydrogen bonding to the stability of staphylococcal nuclease. Biochemistry 34:13949–60
22. Callender RH, Dyer RB, Gilmanshin R, Woodruff WH. 1998. Fast events in protein
folding: the time evolution of primary processes. Annu. Rev. Phys. Chem. 49:173–202
23. Cecconi C, Shank EA, Bustamante C, Marqusee S. 2005. Direct observation of the
three-state folding of a single protein molecule. Science 309:2057–60
24. Cellitti J, Bernstein R, Marqusee S. 2007. Exploring subdomain cooperativity in T4
lysozyme. II. Uncovering the C-terminal subdomain as a hidden intermediate in the
kinetic folding pathway. Protein Sci. 16:852–62
25. Chan HS, Dill KA. 1990. Origins of structure in globular proteins. Proc. Natl. Acad. Sci.
USA 87:6388–92
26. Chan HS, Dill KA. 1994. Transition states and folding dynamics of proteins and
heteropolymers. J. Chem. Phys. 100:9238–57
27. Chekmarev SF, Krivov SV, Karplus M. 2005. Folding time distributions as an approach
to protein folding kinetics. J. Phys. Chem. B 109:5312–30
28. Chen J, Stites WE. 2001. Packing is a key selection factor in the evolution of protein
hydrophobic cores. Biochemistry 40:15280–89
29. Chikenji G, Fujitsuka Y, Takada S. 2006. Shaping up the protein folding funnel by local
interaction: lesson from a structure prediction study. Proc. Natl. Acad. Sci. USA 103:3141–46
30. Cho SS, Levy Y, Wolynes PG. 2006. P versus Q: structural reaction coordinates capture
protein folding on smooth landscapes. Proc. Natl. Acad. Sci. USA 103:586–91
31. Chodera JD, Singhal N, Pande VS, Dill KA, Swope WC. 2007. Automatic discovery of
metastable states for the construction of Markov models of macromolecular
conformational dynamics. J. Chem. Phys. 126:155101
32. Chothia C, Lesk AM. 1986. The relation between the divergence of sequence and structure
in proteins. EMBO J. 5:823–26
33. Chou PY, Fasman GD. 1974. Prediction of protein conformation. Biochemistry 13:222–45
34. Chou PY, Fasman GD. 1978. Empirical predictions of protein conformation. Annu. Rev.
Biochem. 47:251–76
35. Chowdhury S, Lee MC, Xiong G, Duan Y. 2003. Ab initio folding simulation of the
Trp-cage mini-protein approaches NMR resolution. J. Mol. Biol. 327:711–17
36. Cieplak M, Henkel M, Karbowski J, Banavar JR. 1998. Master equation approach to
protein folding and kinetic traps. Phys. Rev. Lett. 80:3654–57
37. Cieplak M, Xuan Hoang T. 2000. Scaling of folding properties in Go models of proteins.
J. Biol. Phys. 26:273–94
38. Clementi C, Nymeyer H, Onuchic JN. 2000. Topological and energetic factors: What
determines the structural details of the transition state ensemble and en-route
intermediates for protein folding? An investigation for small globular proteins. J. Mol. Biol.
298:937–53
39. Cordes MHJ, Davidson AR, Sauer RT. 1996. Sequence space, folding and protein design.
Curr. Opin. Struct. Biol. 6:3–10
40. Cornell WD, Cieplak P, Bayly CI, Gould IR, Merz KM, et al. 1995. A second generation
force field for the simulation of proteins, nucleic acids, and organic molecules. J. Am.
Chem. Soc. 117:5179–97
41. Creamer TP, Rose GD. 1994. α-Helix-forming propensities in peptides and proteins.
Proteins 19:85–97
42. Cymes GD, Grosman C, Auerbach A. 2002. Structure of the transition state of gating
in the acetylcholine receptor channel pore: a Φ-value analysis. Biochemistry 41:5548–55
43. Dahiyat BI, Mayo SL. 1997. De novo protein design: fully automated sequence selection.
Science 278:82–87
44. de Los Rios MA, Daneshi M, Plaxco KW. 2005. Experimental investigation of the
frequency and substitution dependence of negative Φ-values in two-state proteins.
Biochemistry 44:12160–67
45. Debe DA, Carlson MJ, Goddard WA. 1999. The topomer-sampling model of protein
folding. Proc. Natl. Acad. Sci. USA 96:2596–601
46. Deechongkit S, Dawson PE, Kelly JW. 2004. Toward assessing the position-dependent
contributions of backbone hydrogen bonding to β-sheet folding thermodynamics
employing amide-to-ester perturbations. J. Am. Chem. Soc. 126:16762–71
47. Dill KA. 1985. Theory for the folding and stability of globular proteins. Biochemistry
24:1501–9
48. Dill KA. 1990. Dominant forces in protein folding. Biochemistry 29:7133–55
49. Dill KA. 1999. Polymer principles and protein folding. Protein Sci. 8:1166–80
50. Dill KA, Alonso DOV, Hutchinson K. 1989. Thermal stabilities of globular proteins.
Biochemistry 28:5439–49
51. Dill KA, Bromberg S, Yue KZ, Fiebig KM, Yee DP, et al. 1995. Principles of protein
folding: a perspective from simple exact models. Protein Sci. 4:561–602
52. Dill KA, Chan HS. 1997. From Levinthal to pathways to funnels. Nat. Struct. Biol. 4:10–19
53. Dill KA, Fiebig KM, Chan HS. 1993. Cooperativity in protein-folding kinetics. Proc. Natl.
Acad. Sci. USA 90:1942–46
54. Dill KA, Lucas A, Hockenmaier J, Huang L, Chiang D, Joshi AK. 2007. Computational
linguistics: a new tool for exploring biopolymer structures and statistical mechanics.
Polymer 48:4289–300
55. Drozdov AN, Grossfield A, Pappu RV. 2004. Role of solvent in determining
conformational preferences of alanine dipeptide in water. J. Am. Chem. Soc. 126:2574–81
56. Du R, Pande VS, Grosberg AY, Tanaka T, Shakhnovich ES. 1998. On the transition
coordinate for protein folding. J. Chem. Phys. 108:334–50
57. Duan Y, Kollman PA. 1998. Pathways to a protein folding intermediate observed in a
1-microsecond simulation in aqueous solution. Science 282:740–44
58. Dyson HJ, Wright PE, Scheraga HA. 2006. The role of hydrophobic interactions in
initiation and propagation of protein folding. Proc. Natl. Acad. Sci. USA 103:13057–61
59. Ellison PA, Cavagnero S. 2006. Role of unfolded state heterogeneity and en-route
ruggedness in protein folding kinetics. Protein Sci. 15:564–82
60. Elmer SP, Park S, Pande VS. 2005. Foldamer dynamics expressed via Markov state
models. I. Explicit solvent molecular-dynamics simulations in acetonitrile, chloroform,
methanol, and water. J. Chem. Phys. 123:114902
61. Elmer SP, Park S, Pande VS. 2005. Foldamer dynamics expressed via Markov state models.
II. State space decomposition. J. Chem. Phys. 123:114903
62. English EP, Chumanov RS, Gellman SH, Compton T. 2006. Rational development of
β-peptide inhibitors of human cytomegalovirus entry. J. Biol. Chem. 281:2661–67
63. Felts AK, Harano Y, Gallicchio E, Levy RM. 2004. Free energy surfaces of β-hairpin and
α-helical peptides generated by replica exchange molecular dynamics with the AGBNP
implicit solvent model. Proteins 56:310–21
64. Feng DF, Doolittle RF. 1987. Progressive sequence alignment as a prerequisite to correct
phylogenetic trees. J. Mol. Evol. 25:351–60
65. Feng H, Vu ND, Zhou Z, Bai Y. 2004. Structural examination of Φ-value analysis in
protein folding. Biochemistry 43:14325–31
66. Feng H, Zhou Z, Bai Y. 2005. A protein folding pathway with multiple folding
intermediates at atomic resolution. Proc. Natl. Acad. Sci. USA 102:5026–31
67. Ferguson N, Sharpe TD, Johnson CM, Fersht AR. 2006. The transition state for folding
of a peripheral subunit-binding domain contains robust and ionic-strength dependent
characteristics. J. Mol. Biol. 356:1237–47
68. Ferguson N, Sharpe TD, Johnson CM, Schartau PJ, Fersht AR. 2007. Structural biology:
analysis of downhill protein folding. Nature 445:14–15
69. Ferguson N, Sharpe TD, Schartau PJ, Sato S, Allen MD, et al. 2005. Ultra-fast
barrier-limited folding in the peripheral subunit-binding domain family. J. Mol. Biol. 353:427–46
70. Fersht AR. 1997. Nucleation mechanisms in protein folding. Curr. Opin. Struct. Biol. 7:3–9
71. Fersht AR, Sato S. 2004. Φ-value analysis and the nature of protein-folding transition
states. Proc. Natl. Acad. Sci. USA 101:7976–81
72. Fersht AR, Shi JP, Knill-Jones J, Lowe DM, Wilkinson AJ, et al. 1985. Hydrogen bonding
and biological specificity analysed by protein engineering. Nature 314:235–38
73. Fiebig KM, Dill KA. 1993. Protein core assembly processes. J. Chem. Phys. 98:3475–87
74. Friel CT, Beddard GS, Radford SE. 2004. Switching two-state to three-state kinetics
in the helical protein Im9 via the optimisation of stabilising non-native interactions by
design. J. Mol. Biol. 342:261–73
75. Garcia-Mira MM, Sadqi M, Fischer N, Sanchez-Ruiz JM, Munoz V. 2002. Experimental
identification of downhill protein folding. Science 298:2191–95
76. Gellman SH. 1998. Foldamers: a manifesto. Acc. Chem. Res. 31:173–80
77. Ghosh K, Ozkan SB, Dill K. 2007. The ultimate speed limit to protein folding is
conformational searching. J. Am. Chem. Soc. 129:11920–27
78. Gianni S, Guydosh NR, Khan F, Caldas TD, Mayor U, et al. 2003. Unifying features in
protein-folding mechanisms. Proc. Natl. Acad. Sci. USA 100:13286–91
79. Gibson KD, Scheraga HA. 1967. Minimization of polypeptide energy I. Preliminary
structures of bovine pancreatic ribonuclease s-peptide. Proc. Natl. Acad. Sci. USA 58:420–27
80. Gilmanshin R, Williams S, Callender RH, Woodruff WH, Dyer RB. 1997. Fast events in
protein folding: relaxation dynamics of secondary and tertiary structure in native
apomyoglobin. Proc. Natl. Acad. Sci. USA 94:3709–13
81. Gnanakaran S, Garcia AE. 2003. Validation of an all-atom protein force field: from
dipeptides to larger peptides. J. Phys. Chem. B 107:12555–57
82. Go N, Taketomi H. 1978. Respective roles of short- and long-range interactions in protein
folding. Proc. Natl. Acad. Sci. USA 75:559–63
83. Goldbeck RA, Thomas YG, Chen E, Esquerra RM, Kliger DS. 1999. Multiple pathways on
a protein-folding energy landscape: kinetic evidence. Proc. Natl. Acad. Sci. USA 96:2782–87
84. Goldenberg DP. 1988. Genetic studies of protein stability and mechanisms of folding.
Annu. Rev. Biophys. Biomol. Struct. 17:481–507
85. Goldenberg DP. 1999. Finding the right fold. Nat. Struct. Biol. 6:987–90
86. Goodman CM, Choi S, Shandler S, DeGrado WF. 2007. Foldamers as versatile
frameworks for the design and evolution of function. Nat. Chem. Biol. 3:252–62
87. Grabowski M, Joachimiak A, Otwinowski Z, Wladek M. 2007. Structural genomics:
keeping up with expanding knowledge of the protein universe. Nucleic Acids Seq. Topol. 17:347–53
88. Grantcharova V, Alm EJ, Baker D, Horwich AL. 2001. Mechanisms of protein folding.
Curr. Opin. Struct. Biol. 11:70–82
89. Grater F, Grubmuller H. 2007. Fluctuations of primary ubiquitin folding intermediates
in a force clamp. J. Struct. Biol. 157:557–69
90. Gromiha MM, Selvaraj S. 2001. Comparison between long-range interactions and contact
order in determining the folding rate of two-state proteins: application of long-range order
to folding rate prediction. J. Mol. Biol. 310:27–32
91. Gromiha MM, Thangakani AM, Selvaraj S. 2006. FOLD-RATE: prediction of protein
folding rates from amino acid sequence. Nucleic Acids Res. 34:W70–74
92. Haber E, Anfinsen CB. 1962. Side-chain interactions governing the pairing of half-cystine
residues in ribonuclease. J. Biol. Chem. 237:1839–44
93. Hagen SJ. 2007. Probe-dependent and nonexponential relaxation kinetics: unreliable
signatures of downhill protein folding. Proteins 68:205–17
94. Handel T, Degrado WF. 1990. Denovo design of a Zn
2+
-binding protein. J. Am. Chem.
Soc. 112:671011
95. Hansmann UHE, Okamoto Y. 1993. Prediction of peptide conformation by multicanoni-
cal algorithm: new approach to the multiple-minima problem. J. Comput. Chem. 14:1333
38
96. Hart WE, Istrail S. 1997. Robust proofs of NP-hardness for protein folding: general
lattices and energy potentials. J. Comput. Biol. 4:122
97. Haspel N, Tsai CJ, WolfsonH, Nussinov R. 2003. Reducing the computational complexity
of protein folding via fragment folding and assembly. Protein Sci. 12:117787
98. Hecht MH, Das A, Go A, Bradley LH, Wei Y. 2004. De novo proteins from designed
combinatorial libraries. Protein Sci. 13:171123
99. Hecht MH, RichardsonJS, RichardsonDC, OgdenRC. 1990. De novodesign, expression,
and characterization of felix: a four-helix bundle protein of native-like sequence. Science
249:88491
100. Ho BK, Dill KA. 2006. Folding very short peptides using molecular dynamics. PLoS
Comput. Biol. 2:e27
101. Hoang L, Bedard S, Krishna MMG, Lin Y, Englander SW. 2002. Cytochrome c folding
pathway: kinetic native-state hydrogen exchange. Proc. Natl. Acad. Sci. USA 99:1217378
102. Hockenmaier J, Joshi AK, Dill KA. 2006. Routes are trees: the parsing perspective on
protein folding. Proteins 66:115
103. Huang F, Sato S, Sharpe TD, Ying L, Fersht AR. 2007. Distinguishing between cooper-
ative and unimodal downhill protein folding. Proc. Natl. Acad. Sci. USA 104:12327
104. Huang JT, Cheng JP, Chen H. 2007. Secondary structure length as a determinant of
folding rate of proteins with two- and three-state kinetics. Proteins 67:1217
105. Hubner IA, Shimada J, Shakhnovich EI. 2003. values and the folding transition state of
protein G: utilization and interpretation of experimental data through simulation. Abstr.
Pap. Am. Chem. Soc. 226:U450
106. Itzhaki LS, Otzen DE, Fersht AR. 1995. The structure of the transition state for folding
of chymotrypsin inhibitor 2 analysed by protein engineering methods: evidence for a
nucleation-condensation mechanism for protein folding. J. Mol. Biol. 254:26088
107. Ivankov DN, Finkelstein AV. 2004. Prediction of protein folding rates from the amino
acid sequence-predicted secondary structure. Proc. Natl. Acad. Sci. USA 101:894244
108. Jang S, Kim E, Pak Y. 2007. Direct folding simulation of -helices and -hairpins based
on a single all-atom force eld with an implicit solvation model. Proteins 66:5360
109. Jas GS, Eaton WA, Hofrichter J. 2001. Effect of viscosity on the kinetics of α-helix and
β-hairpin formation. J. Phys. Chem. B 105:261–72
110. Jayachandran G, Vishal V, Pande VS. 2006. Using massively parallel simulation and
Markovian models to study protein folding: examining the dynamics of the villin
headpiece. J. Chem. Phys. 124:164902
111. Jones DT, Taylor WR, Thornton JM. 1992. A new approach to protein fold recognition.
Nature 358:86–89
112. Kamtekar S, Schiffer JM, Xiong H, Babik JM, Hecht MH. 1993. Protein design by binary
patterning of polar and nonpolar amino acids. Science 262:1680–85
113. Karanicolas J, Brooks CL. 2002. The origins of asymmetry in the folding transition states
of protein L and protein G. Protein Sci. 11:2351–61
114. Karplus M, Kuriyan J. 2005. Molecular dynamics and protein function. Proc. Natl. Acad.
Sci. USA 102:6679–85
Annu. Rev. Biophys. 2008.37:289-316. Downloaded from www.annualreviews.org by Indian Institute of Technology - Madras on 06/19/14. For personal use only.
115. Karplus M, Weaver DL. 1979. Diffusion-collision model for protein folding. Biopolymers
18:1421–37
116. Karplus M, Weaver DL. 1994. Protein folding dynamics: the diffusion-collision model
and experimental data. Protein Sci. 3:650–68
117. Kendrew JC. 1961. The three-dimensional structure of a protein molecule. Sci. Am.
205:96–110
118. Kim DE, Gu H, Baker D. 1998. The sequences of small proteins are not extensively
optimized for rapid folding by natural selection. Proc. Natl. Acad. Sci. USA 95:4982–86
119. Kim PS, Baldwin RL. 1982. Specific intermediates in the folding reactions of small proteins
and the mechanism of protein folding. Annu. Rev. Biochem. 51:459–89
120. Kirshenbaum K, Zuckermann RN, Dill KA. 1999. Designing polymers that mimic
biomolecules. Curr. Opin. Struct. Biol. 9:530–35
121. Klepeis JL, Floudas CA. 2003. ASTRO-FOLD: a combinatorial and global optimization
framework for ab initio prediction of three-dimensional structures of proteins from the
amino acid sequence. Biophys. J. 85:2119–46
122. Knott M, Chan HS. 2006. Criteria for downhill protein folding: calorimetry, chevron plot,
kinetic relaxation, and single-molecule radius of gyration in chain models with subdued
degrees of cooperativity. Proteins 65:373–91
123. Korzhnev DM, Salvatella X, Vendruscolo M, Di Nardo AA, Davidson AR, et al. 2004.
Low-populated folding intermediates of Fyn SH3 characterized by relaxation dispersion
NMR. Nature 430:586–90
124. Krieger F, Fierz B, Bieri O, Drewello M, Kiefhaber T. 2003. Dynamics of unfolded
polypeptide chains as model for the earliest steps in protein folding. J. Mol. Biol. 332:265–74
125. Krishna MM, Hoang L, Lin Y, Englander SW. 2004. Hydrogen exchange methods to
study protein folding. Methods 34:51–64
126. Krishna MMG, Maity H, Rumbley JN, Lin Y, Englander SW. 2006. Order of steps in
the cytochrome c folding pathway: evidence for a sequential stabilization mechanism. J.
Mol. Biol. 359:1411–20
127. Kubelka J, Chiu TK, Davies DR, Eaton WA, Hofrichter J. 2006. Sub-microsecond protein
folding. J. Mol. Biol. 359:546–53
128. Kubelka J, Hofrichter J, Eaton WA. 2004. The protein folding speed limit. Curr. Opin.
Struct. Biol. 14:76–88
129. Kuhlman B, Dantas G, Ireton GC, Varani G, Stoddard BL, Baker D. 2003. Design of a
novel globular protein fold with atomic-level accuracy. Science 302:1364–68
130. Kumar S, Bouzida D, Swendsen RH, Kollman PA, Rosenberg JM. 1992. The weighted
histogram analysis method for free-energy calculations on biomolecules. 1. The method.
J. Comput. Chem. 13:1011–21
131. Kuznetsov IB, Rackovsky S. 2004. Class-specific correlations between protein folding rate,
structure-derived, and sequence-derived descriptors. Proteins 54:333–41
132. Lau KF, Dill KA. 1989. A lattice statistical mechanics model of the conformational and
sequence spaces of proteins. Macromolecules 22:3986–97
133. Laurence TA, Kong X, Jäger M, Weiss S. 2005. Probing structural heterogeneities and
fluctuations of nucleic acids and denatured proteins. Proc. Natl. Acad. Sci. USA 102:17348–53
134. Lee BC, Zuckermann RN, Dill KA. 2005. Folding a nonbiological polymer into a compact
multihelical structure. J. Am. Chem. Soc. 127:10999–1009
135. Lei H, Duan Y. 2007. Ab initio folding of albumin binding domain from all-atom
molecular dynamics simulation. J. Phys. Chem. B 111:5458–63
136. Lei H, Duan Y. 2007. Two-stage folding of Hp-35 from ab initio simulations. J. Mol. Biol.
370:196–206
137. Lei H, Wu C, Liu H, Duan Y. 2007. Folding free-energy landscape of villin headpiece
subdomain from molecular dynamics simulations. Proc. Natl. Acad. Sci. USA 104:4925–30
138. Leopold PE, Montal M, Onuchic JN. 1992. Protein folding funnels: a kinetic approach
to the sequence-structure relationship. Proc. Natl. Acad. Sci. USA 89:8721–25
139. Lesk AM, Rose GD. 1981. Folding units in globular proteins. Proc. Natl. Acad. Sci. USA
78:4304–8
140. Levitt M. 1983. Molecular dynamics of native protein. I. Computer simulation of
trajectories. J. Mol. Biol. 168:595–617
141. Li Z, Scheraga HA. 1987. Monte Carlo-minimization approach to the multiple-minima
problem in protein folding. Proc. Natl. Acad. Sci. USA 84:6611–15
142. Lindberg MO, Haglund E, Hubner IA, Shakhnovich EI, Oliveberg M. 2006. Identification
of the minimal protein-folding nucleus through loop-entropy perturbations. Proc. Natl.
Acad. Sci. USA 103:4083–88
143. Lindberg MO, Oliveberg M. 2007. Malleability of protein folding pathways: a simple
reason for complex behaviour. Curr. Opin. Struct. Biol. 17:21–29
144. Lindorff-Larsen K, Paci E, Serrano L, Dobson CM, Vendruscolo M. 2003. Calculation of
mutational free energy changes in transition states for protein folding. Biophys. J. 85:1207–14
145. Looger LL, Dwyer MA, Smith JJ, Hellinga HW. 2003. Computational design of receptor
and sensor proteins with novel functions. Nature 423:185–90
146. Lucas A, Huang L, Joshi A, Dill KA. 2007. Statistical mechanics of helix bundles using a
dynamic programming approach. J. Am. Chem. Soc. 129:4272–81
147. Lucent D, Vishal V, Pande VS. 2007. Protein folding under confinement: a role for solvent.
Proc. Natl. Acad. Sci. USA 104:10430–34
148. Ma H, Gruebele M. 2006. Low barrier kinetics: dependence on observables and free
energy surface. J. Comput. Chem. 27:125–34
149. Maity H, Maity M, Krishna MM, Mayne L, Englander SW. 2005. Protein folding: the
stepwise assembly of foldon units. Proc. Natl. Acad. Sci. USA 102:4741–46
150. Makarov DE, Keller CA, Plaxco KW, Metiu H. 2002. How the folding rate constant of
simple, single-domain proteins depends on the number of native contacts. Proc. Natl. Acad.
Sci. USA 99:3535–39
151. Matagne A, Radford SE, Dobson CM. 1997. Fast and slow tracks in lysozyme folding:
insight into the role of domains in the folding process. J. Mol. Biol. 267:1068–74
152. Matheson RR Jr, Scheraga HA. 1978. A method for predicting nucleation sites for protein
folding based on hydrophobic contacts. Macromolecules 11:819–29
153. Matouschek A, Kellis JT Jr, Serrano L, Fersht AR. 1989. Mapping the transition state and
pathway of protein folding by protein engineering. Nature 340:122–26
154. Maxwell KL, Wildes D, Zarrine-Afsar A, De Los Rios MA, Brown AG, et al. 2005. Protein
folding: defining a standard set of experimental conditions and a preliminary kinetic data
set of two-state proteins. Protein Sci. 14:602–16
155. McCammon JA, Gelin BR, Karplus M. 1977. Dynamics of folded proteins. Nature
267:585–90
156. Meisner WK, Sosnick TR. 2004. Barrier-limited, microsecond folding of a stable protein
measured with hydrogen exchange: implications for downhill folding. Proc. Natl. Acad. Sci.
USA 101:15639–44
157. Mello CC, Barrick D. 2004. An experimentally determined protein folding energy
landscape. Proc. Natl. Acad. Sci. USA 101:14102–7
158. Mezei M. 1998. Chameleon sequences in the PDB. Protein Eng. 11:411–14
159. Micheletti C, Banavar JR, Maritan A, Seno F. 1999. Protein structures and optimal folding
from a geometrical variational principle. Phys. Rev. Lett. 82:3372–75
160. Miller EJ, Fischer KF, Marqusee S. 2002. Experimental evaluation of topological
parameters determining protein-folding rates. Proc. Natl. Acad. Sci. USA 99:10359–63
161. Miller R, Danko CA, Fasolka MJ, Balazs AC, Chan HS, Dill KA. 1992. Folding kinetics
of proteins and copolymers. J. Chem. Phys. 96:768–80
162. Minor DL, Kim PS. 1996. Context-dependent secondary structure formation of a
designed protein sequence. Nature 380:730–34
163. Mirny L, Shakhnovich E. 2001. Protein folding theory: from lattice to all-atom models.
Annu. Rev. Biophys. Biomol. Struct. 30:361–96
164. Moult J. 2005. A decade of CASP: progress, bottlenecks and prognosis in protein
structure prediction. Curr. Opin. Struct. Biol. 15:285–89
165. Moult J, Pedersen JT, Judson R, Fidelis K. 1995. A large-scale experiment to assess
protein structure prediction methods. Proteins 23:ii–iv
166. Mukhopadhyay S, Krishnan R, Lemke EA, Lindquist S, Deniz AA. 2007. A natively
unfolded yeast prion monomer adopts an ensemble of collapsed and rapidly fluctuating
structures. Proc. Natl. Acad. Sci. USA 104:2649–54
167. Munoz V. 2002. Thermodynamics and kinetics of downhill protein folding investigated
with a simple statistical mechanical model. Int. J. Quant. Chem. 90:1522–28
168. Munoz V, Sanchez-Ruiz JM. 2004. Exploring protein-folding ensembles: a variable-
barrier model for the analysis of equilibrium unfolding experiments. Proc. Natl. Acad.
Sci. USA 101:17646–51
169. Myers JK, Oas TG. 2001. Preorganized secondary structure as an important determinant
of fast protein folding. Nat. Struct. Biol. 8:552–58
170. Naganathan AN, Perez-Jimenez R, Sanchez-Ruiz JM, Munoz V. 2005. Robustness of
downhill folding: guidelines for the analysis of equilibrium folding experiments on small
proteins. Biochemistry 44:7435–49
171. Nemethy G, Scheraga HA. 1965. Theoretical determination of sterically allowed
conformations of a polypeptide chain by a computer method. Biopolymers 3:155–84
172. Noé F, Horenko I, Schütte C, Smith JC. 2007. Hierarchical analysis of
conformational dynamics in biomolecules: transition networks of metastable states. J. Chem. Phys.
126:155102
173. O'Neil KT, Hoess RH, DeGrado WF. 1990. Design of DNA-binding peptides based on
the leucine zipper motif. Science 249:774–78
174. Oldziej S, Czaplewski C, Liwo A, Chinchio M, Nanias M, et al. 2005. Physics-based
protein-structure prediction using a hierarchical protocol based on the UNRES force
field: assessment in two blind tests. Proc. Natl. Acad. Sci. USA 102:7547–52
175. Onuchic JN, Wolynes PG. 1993. Energy landscapes, glass transitions, and
chemical-reaction dynamics in biomolecular or solvent environment. J. Chem. Phys. 98:2218–24
176. Ozkan SB, Bahar I, Dill KA. 2001. Transition states and the meaning of Φ-values in protein
folding kinetics. Nat. Struct. Biol. 8:765–69
177. Ozkan SB, Wu GHA, Chodera JD, Dill KA. 2007. Protein folding by zipping and
assembly. Proc. Natl. Acad. Sci. USA 104:11987–92
178. Paci E, Vendruscolo M, Dobson CM, Karplus M. 2002. Determination of a transition
state at atomic resolution from protein engineering data. J. Mol. Biol. 324:151–63
179. Patch JA, Barron AE. 2003. Helical peptoid mimics of magainin-2 amide. J. Am. Chem.
Soc. 125:12092–93
180. Pauling L, Corey RB. 1951. Atomic coordinates and structure factors for two helical
configurations of polypeptide chains. Proc. Natl. Acad. Sci. USA 37:235–40
181. Pauling L, Corey RB, Branson HR. 1951. The structure of proteins: two hydrogen-bonded
helical configurations of the polypeptide chain. Proc. Natl. Acad. Sci. USA 37:205–11
182. Pieper U, Eswar N, Braberg H, Madhusudhan MS, Davis FP, et al. 2004. MODBASE,
a database of annotated comparative protein structure models, and associated resources.
Nucleic Acids Res. 32:D217–22
183. Pitera JW, Swope W. 2003. Understanding folding and design: replica-exchange
simulations of Trp-cage miniproteins. Proc. Natl. Acad. Sci. USA 100:7587–92
184. Plaxco KW, Simons KT, Baker D. 1998. Contact order, transition state placement and
the refolding rates of single domain proteins. J. Mol. Biol. 277:985–94
185. Porter EA, Wang X, Lee HS, Weisblum B, Gellman SH. 2000. Non-haemolytic
β-amino-acid oligomers. Nature 404:565
186. Punta M, Rost B. 2005. Protein folding rates estimated from contact predictions. J. Mol.
Biol. 348:507–12
187. Qiu L, Hagen SJ. 2004. A limiting speed for protein folding at low solvent viscosity. J.
Am. Chem. Soc. 126:3398–99
188. Raleigh DP, Plaxco KW. 2005. The protein folding transition state: What are Φ-values
really telling us? Protein Pept. Lett. 12:117–22
189. Reich L, Weikl TR. 2006. Substructural cooperativity and parallel versus sequential events
during protein unfolding. Proteins 63:1052–58
190. Rost B, Eyrich VA. 2001. EVA: large-scale analysis of secondary structure prediction.
Proteins 45:192–99
191. Sadqi M, Fushman D, Munoz V. 2006. Atom-by-atom analysis of global downhill protein
folding. Nature 442:317–21
192. Sali A, Blundell TL. 1993. Comparative protein modelling by satisfaction of spatial
restraints. J. Mol. Biol. 234:779–815
193. Sanchez IE, Kiefhaber T. 2003. Origin of unusual Φ-values in protein folding: evidence
against specific nucleation sites. J. Mol. Biol. 334:1077–85
194. Schuler B, Lipman EA, Eaton WA. 2002. Probing the free-energy surface for protein
folding with single-molecule fluorescence spectroscopy. Nature 419:743–47
195. Segel DJ, Bachmann A, Hofrichter J, Hodgson KO, Doniach S, Kiefhaber T. 1999.
Characterization of transient intermediates in lysozyme folding with time-resolved
small-angle X-ray scattering. J. Mol. Biol. 288:489–99
196. Shea JE, Brooks CL III. 2001. From folding theories to folding proteins: a review and
assessment of simulation studies of protein folding and unfolding. Annu. Rev. Phys. Chem.
52:499–535
197. Shea JE, Onuchic JN, Brooks CL. 1999. Exploring the origins of topological frustration:
design of a minimally frustrated model of fragment B of protein A. Proc. Natl. Acad. Sci.
USA 96:12512–17
198. Shirts M, Pande VS. 2000. Computing: Screen savers of the world unite! Science 290:1903–4
199. Shmygelska A. 2005. Search for folding nuclei in native protein structures. Bioinformatics
21:394–402
200. Simmerling C, Strockbine B, Roitberg AE. 2002. All-atom structure prediction and
folding simulations of a stable protein. J. Am. Chem. Soc. 124:11258–59
201. Singhal N, Snow CD, Pande VS. 2004. Using path sampling to build better Markovian
state models: predicting the folding rate and mechanism of a tryptophan zipper hairpin.
J. Chem. Phys. 121:415–25
202. Snow CD, Rhee YM, Pande VS. 2006. Kinetic definition of protein folding transition
state ensembles and reaction coordinates. Biophys. J. 91:14–24
203. Sohl JL, Jaswal SS, Agard DA. 1998. Unfolded conformations of α-lytic protease are more
stable than its native state. Nature 395:817–19
204. Sosnick TR, Dothager RS, Krantz BA. 2004. Differences in the folding transition state of
ubiquitin indicated by Φ and ψ analyses. Proc. Natl. Acad. Sci. USA 101:17377–82
205. Sridevi K, Juneja J, Bhuyan AK, Krishnamoorthy G, Udgaonkar JB. 2000. The slow folding
reaction of barstar: The core tryptophan region attains tight packing before substantial
secondary and tertiary structure formation and final compaction of the polypeptide chain.
J. Mol. Biol. 302:479–95
206. Sridevi K, Lakshmikanth GS, Krishnamoorthy G, Udgaonkar JB. 2004. Increasing
stability reduces conformational heterogeneity in a protein folding intermediate ensemble.
J. Mol. Biol. 337:699–711
207. Srinivasan R, Rose GD. 2002. Ab initio prediction of protein structure using LINUS.
Proteins 47:489–95
208. Sriraman S, Kevrekidis IG, Hummer G. 2005. Coarse master equation from Bayesian
analysis of replica molecular dynamics simulations. J. Phys. Chem. B 109:6479–84
209. Street TO, Bradley CM, Barrick D. 2007. Predicting coupling limits from an
experimentally determined energy landscape. Proc. Natl. Acad. Sci. USA 104:4907–12
210. Sugita Y, Okamoto Y. 1999. Replica-exchange molecular dynamics method for protein
folding. Chem. Phys. Lett. 314:141–51
211. Swope WC, Pitera JW, Suits F. 2004. Describing protein folding kinetics by molecular
dynamics simulations. 1. Theory. J. Phys. Chem. B 108:6571–81
212. Swope WC, Pitera JW, Suits F, Pitman M, Eleftheriou M, et al. 2004. Describing protein
folding kinetics by molecular dynamics simulations. 2. Example applications to alanine
dipeptide and a β-hairpin peptide. J. Phys. Chem. B 108:6582–94
213. Travaglini-Allocatelli C, Cutruzzolà F, Bigotti MG, Staniforth RA, Brunori M. 1999.
Folding mechanism of Pseudomonas aeruginosa cytochrome c. J. Mol. Biol. 289:1459–67
214. Tress M, Ezkurdia I, Graña O, López G, Valencia A. 2005. Assessment of predictions
submitted for the CASP6 comparative modelling category. Proteins 61:27–45
215. Tsai CJ, Maizel JV Jr, Nussinov R. 2000. Anatomy of protein structures: visualizing how a
one-dimensional protein chain folds into a three-dimensional shape. Proc. Natl. Acad. Sci.
USA 97:12038–43
216. Unger R, Moult J. 1993. Finding the lowest free energy conformation of a protein is an
NP-hard problem: proof and implications. Bull. Math. Biol. 55:1183–98
217. Utku Y, Dehan E, Ouerfelli O, Piano F, Zuckermann RN, et al. 2006. A peptidomimetic
siRNA transfection reagent for highly effective gene silencing. Mol. Biosyst. 2:312–17
218. Uversky VN, Fink AL. 2002. The chicken-egg scenario of protein folding revisited. FEBS
Lett. 515:79–83
219. Venclovas C, Zemla A, Fidelis K, Moult J. 2003. Assessment of progress over the CASP
experiments. Proteins 53:585–95
220. Verma A, Schug A, Lee KH, Wenzel W. 2006. Basin hopping simulations for all-atom
protein folding. J. Chem. Phys. 124:044515
221. Viguera AR, Serrano L, Wilmanns M. 1996. Different folding transition states may result
in the same native structure. Nat. Struct. Biol. 3:874–80
222. Vitkup D, Melamud E, Moult J, Sander C. 2001. Completeness in structural genomics.
Nat. Struct. Biol. 8:559–66
223. Voelz VA, Dill KA. 2006. Exploring zipping and assembly as a protein folding principle.
Proteins 66:877–88
224. Wallace IM, Blackshields G, Higgins DG. 2005. Multiple sequence alignments. Curr.
Opin. Struct. Biol. 15:261–66
225. Wallin S, Chan HS. 2006. Conformational entropic barriers in topology-dependent
protein folding: perspectives from a simple native-centric polymer model. J. Phys. Condens.
Matter 18:S307–28
226. Wang L, Xie J, Schultz PG. 2006. Expanding the genetic code. Annu. Rev. Biophys. Biomol.
Struct. 35:225–49
227. Wang Z, Mottonen J, Goldsmith EJ. 1996. Kinetically controlled folding of the serpin
plasminogen activator inhibitor 1. Biochemistry 35:16443–48
228. Weikl TR. 2005. Loop-closure events during protein folding: rationalizing the shape of
Φ-value distributions. Proteins 60:701–11
229. Weikl TR, Dill KA. 2003. Folding kinetics of two-state proteins: effect of circularization,
permutation, and crosslinks. J. Mol. Biol. 332:953–63
230. Weikl TR, Dill KA. 2003. Folding rates and low-entropy-loss routes of two-state proteins.
J. Mol. Biol. 329:585–98
231. Weikl TR, Dill KA. 2007. Transition-states in protein folding kinetics: the structural
interpretation of Φ values. J. Mol. Biol. 365:1578–86
232. Weikl TR, Palassini M, Dill KA. 2004. Cooperativity in two-state protein folding kinetics.
Protein Sci. 13:822–29
233. White GWN, Gianni S, Grossmann JG, Jemth P, Fersht AR, Daggett V. 2005. Simulation
and experiment conspire to reveal cryptic intermediates and a slide from the
nucleation-condensation to framework mechanism of folding. J. Mol. Biol. 350:757–75
234. Wolfenden R. 2007. Experimental measures of amino acid hydrophobicity and the
topology of transmembrane and globular proteins. J. Gen. Physiol. 129:357–62
235. Wu CW, Seurynck SL, Lee KY, Barron AE. 2003. Helical peptoid mimics of lung
surfactant protein C. Chem. Biol. 10:1057–63
236. Wurth C, Kim W, Hecht MH. 2006. Combinatorial approaches to probe the sequence
determinants of protein aggregation and amyloidogenicity. Protein Pept. Lett. 13:279–86
237. Xu Y, Purkayastha P, Gai F. 2006. Nanosecond folding dynamics of a three-stranded
β-sheet. J. Am. Chem. Soc. 128:15836–42
238. Yang JS, Chen WW, Skolnick J, Shakhnovich EI. 2006. All-atom ab initio folding of a
diverse set of proteins. Structure 15:53–63
239. Yeh SR, Rousseau DL. 2000. Hierarchical folding of cytochrome c. Nat. Struct. Biol. 7:443–45
240. Yoda T, Sugita Y, Okamoto Y. 2004. Secondary structure preferences of force fields
for proteins evaluated by generalized-ensemble simulations. Chem. Phys. 307:269–83
241. Zagrovic B, Snow CD, Shirts MR, Pande VS. 2002. Simulation of folding of a small
α-helical protein in atomistic detail using worldwide-distributed computing. J. Mol. Biol.
323:927–37
242. Zhang Y, Arakaki AK, Skolnick J. 2005. TASSER: an automated method for the
prediction of protein tertiary structures in CASP6. Proteins 61:91–98
243. Zhao H, Giver L, Shao Z, Affholter JA, Arnold FH. 1998. Molecular evolution by
staggered extension process (StEP) in vitro recombination. Nat. Biotechnol. 16:258–61
244. Zhong S, Rousseau DL, Yeh SR. 2004. Modulation of the folding energy landscape of
cytochrome c with salt. J. Am. Chem. Soc. 126:13934–35
245. Zhou HY, Zhou YQ. 2002. Folding rate prediction using total contact distance. Biophys.
J. 82:458–63
246. Zhou R. 2003. Free energy landscape of protein folding in water: explicit vs. implicit
solvent. Proteins 53:148–61