2001, Proteins: Structure, Function, and Genetics
Protein structure can be viewed as a compact linear array of closed loops of nearly standard size, 25-30 amino acid residues (Berezovsky et al., FEBS Letters 2000; 466: 283-286), irrespective of the details of secondary structure. The end-to-end contacts in the loops are likely to be hydrophobic, which is a testable hypothesis. This notion can be verified by direct comparison of the loop maps with Kyte and Doolittle hydropathicity plots. The analysis reveals that most of the loop ends are indeed hydrophobic. The same conclusion is reached on the basis of positional autocorrelation analysis of the protein sequences of 23 fully sequenced bacterial genomes. The hydrophobic residues valine, alanine, glycine, leucine, and isoleucine appear preferentially at a distance of 25-30 residues from one another. These observations open a new perspective on protein structure and folding: a consecutive looping of the polypeptide chain, with the loops ending primarily at hydrophobic nuclei. Proteins 2001;45:346-350.
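The positional autocorrelation mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names, the observed/expected normalization, and the toy sequence are assumptions made for illustration. Only the residue set (V, A, G, L, I) is taken from the abstract.

```python
# Residues named in the abstract as appearing preferentially 25-30 positions apart.
HYDROPHOBIC = frozenset("VAGLI")


def pair_frequency(seq, lag, alphabet=HYDROPHOBIC):
    """Fraction of positions i where seq[i] and seq[i + lag] are both in alphabet."""
    n = len(seq) - lag
    if n <= 0:
        return 0.0
    hits = sum(1 for i in range(n)
               if seq[i] in alphabet and seq[i + lag] in alphabet)
    return hits / n


def autocorrelation_profile(seq, max_lag=40, alphabet=HYDROPHOBIC):
    """Observed/expected pair frequency for each lag.

    The expected value assumes independent positions (an illustrative
    null model, not necessarily the one used in the paper).
    """
    p = sum(1 for r in seq if r in alphabet) / len(seq)
    expected = p * p
    return {lag: (pair_frequency(seq, lag, alphabet) / expected if expected else 0.0)
            for lag in range(1, max_lag + 1)}


# Toy sequence with a hydrophobic residue every 25 positions (hypothetical data).
toy = ("V" + "S" * 24) * 4
profile = autocorrelation_profile(toy)
print(max(profile, key=profile.get))
```

A ratio above 1 at a given lag means hydrophobic pairs at that separation occur more often than independence would predict; on real genomes the profile would be accumulated over many sequences.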
Protein …, 2006
BMC Structural Biology, 2007
Background: Hydrophobic Cluster Analysis (HCA) is an efficient way to compare highly divergent sequences through the implicit secondary structure information directly derived from hydrophobic clusters. However, its efficiency and application are currently limited by the need for user expertise. To help the analysis of HCA plots, we report here the structural preferences of hydrophobic cluster species that are frequently encountered in globular domains of proteins. These species are characterized only by their hydrophobic/non-hydrophobic dichotomy. This analysis has been extended to loop-forming clusters, using an appropriate loop alphabet. Results: The structural behavior of hydrophobic cluster species that are typical of protein globular domains was investigated within banks of experimental structures, considered at different levels of sequence redundancy. The 294 most frequent hydrophobic cluster species were analyzed with regard to their association with the different secondary structures (frequencies of association with secondary structures and secondary structure propensities). Hydrophobic cluster species are predominantly associated with regular secondary structures, and a large part (60%) reveals preferences for α-helices or β-strands. Moreover, the analysis of the hydrophobic cluster amino acid composition generally allows for finer prediction of the regular secondary structure associated with the considered cluster within a cluster species. We also investigated the behavior of loop-forming clusters, using a "PGDNS" alphabet. These loop clusters do not overlap with hydrophobic clusters and are highly associated with coils. Finally, the structural information contained in the hydrophobic structural words, as deduced from experimental structures, was compared to the PSI-PRED predictions, revealing that β-strands and especially α-helices are generally over-predicted within the limits of typical β and α hydrophobic clusters.
Conclusion: The dictionary of hydrophobic clusters described here can help the HCA user to interpret and compare the HCA plots of globular protein sequences, and provides an original fundamental insight into the structural bricks of protein folds. Moreover, the novel loop cluster analysis brings additional information for secondary structure prediction on the whole sequence through a generalized cluster analysis (GCA), and not only on regular secondary structures. Such information lays the foundations for developing a new and original tool for secondary structure prediction.
Protein Science, 2001
Patterns of hydrophobic and hydrophilic residues play a major role in protein folding and function. Long, predominantly hydrophobic strings of 20-22 amino acids each are associated with transmembrane helices and have been used to identify such sequences. Much less attention has been paid to hydrophobic sequences within globular proteins. In prior work on computer simulations of the competition between on-pathway folding and off-pathway aggregate formation, we found that long sequences of consecutive hydrophobic residues promoted aggregation within the model, even controlling for overall hydrophobic content. We report here on an analysis of the frequencies of different lengths of contiguous blocks of hydrophobic residues in a database of amino acid sequences of proteins of known structure. Sequences of three or more consecutive hydrophobic residues are found to be significantly less common in actual globular proteins than would be predicted if residues were selected independently. The result may reflect selection against long blocks of hydrophobic residues within globular proteins relative to what would be expected if residue hydrophobicities were independent of those of nearby residues in the sequence.
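The comparison described above (observed block lengths versus what independent residue selection would give) can be sketched in Python as follows. The hydrophobic alphabet, the function names, and the example sequence are hypothetical choices for illustration; the abstract does not state which residues the authors treated as hydrophobic.

```python
from collections import Counter
from itertools import groupby


def run_length_counts(seq, alphabet):
    """Count maximal runs of consecutive hydrophobic residues, keyed by run length."""
    counts = Counter()
    for is_hydrophobic, group in groupby(seq, key=lambda r: r in alphabet):
        if is_hydrophobic:
            counts[len(list(group))] += 1
    return counts


def expected_runs(n, p, k):
    """Expected number of maximal runs of exactly length k in a sequence of
    length n when each residue is hydrophobic independently with probability p."""
    if k > n:
        return 0.0
    if k == n:
        return p ** n
    # Two end placements need one flanking non-hydrophobic residue,
    # the n - k - 1 interior placements need two.
    return 2 * p ** k * (1 - p) + (n - k - 1) * p ** k * (1 - p) ** 2


# Assumed alphabet (hypothetical; borrowed from a common HCA convention).
ALPHABET = frozenset("VILFMYW")
seq = "AVILKKFWDL"
observed = run_length_counts(seq, ALPHABET)
p = sum(r in ALPHABET for r in seq) / len(seq)
for k in sorted(observed):
    print(k, observed[k], round(expected_runs(len(seq), p, k), 3))
```

On a real database the observed and expected counts would be summed over all sequences before comparing them for each block length.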
2000
Though electrostatic, ionic, van der Waals, Lennard-Jones, hydrogen-bonding, and other forces play an important role in the energy function minimized at a protein's native state, it is widely believed that the hydrophobic force is the dominant term in protein folding. In this paper, we attempt to quantify the extent to which the hydrophobic force determines the positions of the backbone carbon atoms in PDB data, by applying Monte Carlo and genetic algorithms to determine the predicted conformation with minimum energy, where only the hydrophobic force is considered (i.e., Dill's HP model, and refinements using Woese's polar requirement). This is done by computing the root mean square deviation between the normalized distance matrix D = (d_{i,j}), where d_{i,j} is the normalized Euclidean distance between residues r_i and r_j, for PDB data and that obtained from the output of our algorithms. Our program was run on the database of ancient conserved regions drawn from GenBank 101, generously supplied by W. Gilbert's lab [8, 7], as well as on medium-sized proteins (E. coli RecA, 2reb; erythrocruorin, 1eca; and actinidin, 2act). The root mean square deviation (RMSD) between distance matrices derived from the PDB data and from our program output is quite small and, by comparison with the RMSD between PDB data and random coils, allows a quantification of the hydrophobic force contribution. The final version of this paper will appear in the proceedings of PSB'2000 at the URL
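The distance-matrix comparison described above can be sketched as follows. This is a minimal illustration, not the authors' program: the abstract does not say how distances are normalized, so dividing by the largest pairwise distance is an assumption, and the function names and coordinates are hypothetical.

```python
import math


def normalized_distance_matrix(coords):
    """Pairwise Euclidean distances, scaled so the largest entry is 1.

    Scaling by the maximum distance is an assumed normalization.
    """
    n = len(coords)
    d = [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]
    d_max = max(max(row) for row in d)
    if d_max == 0:
        return d
    return [[x / d_max for x in row] for row in d]


def matrix_rmsd(a, b):
    """Root mean square deviation between two equally sized square matrices."""
    n = len(a)
    total = sum((a[i][j] - b[i][j]) ** 2 for i in range(n) for j in range(n))
    return math.sqrt(total / (n * n))


# Hypothetical backbone traces: a straight chain, a rescaled copy, and a bent chain.
straight = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
rescaled = [(0, 0, 0), (3, 0, 0), (6, 0, 0)]
bent = [(0, 0, 0), (1, 0, 0), (1, 1, 0)]

d_straight = normalized_distance_matrix(straight)
print(matrix_rmsd(d_straight, normalized_distance_matrix(rescaled)))
print(matrix_rmsd(d_straight, normalized_distance_matrix(bent)))
```

Because distance matrices are invariant under rotation and translation, and the normalization removes overall scale, no structural superposition is needed before comparing a lattice model with PDB coordinates.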
Journal of Proteome Research, 2004
The presence of partially folded intermediates along the folding funnel of proteins has been suggested to be a signature of potentially aggregating systems. Many studies have concluded that metastable, highly flexible intermediates are the basic elements of the aggregation process. In a previous paper, we demonstrated how the choice between aggregation and folding behavior was influenced by the patterning of hydrophobicity along the sequence, as quantified by recurrence quantification analysis (RQA) of the Miyazawa-Jernigan coded primary structures. In the present paper, we try to unify the "partially folded intermediate" and "hydrophobicity/charge" models of protein aggregation. We verify the ability of an empirical relation, developed for rationalizing the effect of different mutations on the aggregation propensity of acylphosphatase and based on the combination of hydrophobicity RQA and charge descriptors, to discriminate in a statistically significant way between two protein populations: (a) proteins that fold by a process passing through partially folded intermediates and (b) proteins that do not present partially folded intermediates.
The Journal of Chemical Physics, 2002
A Brownian dynamics simulation study of the folding of a model thermostable chicken villin headpiece subdomain, a 36-residue protein (HP-36), is carried out using the hydropathy scale of amino acids. The diverse interactions among the amino acid residues are categorized into three classes by introducing a simplified hydrophobic scale. The simulations incorporate all six different inter- and intra-amino acid interactions. The model protein reproduces some of the qualitative features of complex protein folding, including the funnel-like energy landscape. Although there are several states near the minimum of the folding funnel, we could identify a stable native configuration. In addition, the study reveals a correlation between the contact order, topology, and the stability.
Journal of Biological Chemistry, 1991
Helix formation in folding proteins is stabilized by binding of recurrent hydrophobic side chains in one longitudinal quadrant against the locally most hydrophobic region of the protein.
The International Journal of Physics, 2013
The idea that the hydrophobic effect is the major driving force for processes such as protein folding and protein-protein association has prevailed in the biochemical literature for over half a century. It has recently become clear that the evidence in favor of the hydrophobic paradigm has totally dissipated. The dominance of the hydrophobic effect has been reduced into nothing but a myth. On the other hand, the new paradigm based on a host of hydrophilic effects has emerged. This new paradigm offers simple and straightforward answers to the long sought problems of protein folding and protein-protein association.
Journal of The Royal Society Interface, 2013
The closed-loop (loop-n-lock) hypothesis of protein folding suggests that loops of about 25 residues, closed through interactions between the loop ends (locks), play an important role in protein structure. Coarse-grain elastic network simulations, and examination of loop lengths in a diverse set of proteins, each support a bias towards loops of close to 25 residues in length between residues of high stability. Previous studies have established a correlation between total contact distance (TCD), a metric of sequence distances between contacting residues (cf. contact order), and the log-folding rate of a protein. In a set of 43 proteins, we identify an improved correlation (r² = 0.76) when the metric is restricted to residues contacting the locks, compared to the equivalent result when all residues are considered (r² = 0.65). This provides qualified support for the hypothesis, albeit with an increased emphasis upon the importance of a much larger set of residues surrounding the locks. Evidence of a similar-sized protein core/extended nucleus (with significant overlap) was obtained from TCD calculations in which residues were successively eliminated according to their hydrophobicity and connectivity, and from molecular dynamics simulations. Our results suggest that while folding is determined by a subset of residues that can be predicted by application of the closed-loop hypothesis, the original hypothesis is too simplistic; efficient protein folding is dependent on a considerably larger subset of residues than those involved in lock formation.
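The two metrics named above can be sketched in Python. The abstract gives no formulas, so the normalizations below are assumptions based on commonly cited definitions: relative contact order as the mean sequence separation of contacts divided by chain length, and TCD as the summed sequence separation divided by the square of the chain length. The lock-restricted variant and all names are hypothetical.

```python
def sequence_separation_sum(contacts):
    """Sum of |i - j| over a list of contacting residue index pairs."""
    return sum(abs(i - j) for i, j in contacts)


def relative_contact_order(contacts, n_residues):
    """Assumed definition: summed separation / (chain length * number of contacts)."""
    if not contacts:
        return 0.0
    return sequence_separation_sum(contacts) / (n_residues * len(contacts))


def total_contact_distance(contacts, n_residues, locks=None):
    """Assumed definition: summed separation / chain length squared.

    If locks is given, only contacts involving a lock residue are counted
    (an illustrative reading of "restricted to residues contacting the locks").
    """
    if locks is not None:
        contacts = [(i, j) for i, j in contacts if i in locks or j in locks]
    return sequence_separation_sum(contacts) / n_residues ** 2


# Hypothetical 5-residue chain with two contacts and one lock residue.
contacts = [(0, 4), (1, 3)]
print(relative_contact_order(contacts, 5))
print(total_contact_distance(contacts, 5))
print(total_contact_distance(contacts, 5, locks={0}))
```

In a study like the one described, one such value per protein would then be correlated against the logarithm of the experimental folding rate.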
FEBS Letters, 2000
By screening the crystal protein structure database for close Cα-Cα contacts, a size distribution of the closed loops is generated. The distribution reveals a maximum at 27 ± 5 residues, the same for eukaryotic and prokaryotic proteins. This is apparently a consequence of the polymer statistical properties of the protein chain trajectory. That is, closure into loops depends on the flexibility (persistence length) of the chain. The observed preferential loop size is consistent with the theoretical optimal loop closure size. Mapping the detected unit-size loops onto the sequences of major typical folds reveals an almost regular, compact, consecutive arrangement of the loops. Thus, a novel basic element of protein architecture is discovered: structurally diverse closed loops of a particular size.
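The screening step described above can be illustrated with a short Python sketch. The distance cutoff, the minimum sequence separation, the function names, and the toy coordinates are all assumptions for illustration; the abstract does not give the thresholds the authors used.

```python
import math
from collections import Counter


def find_closed_loops(ca_coords, max_dist=7.0, min_sep=10):
    """Return (i, j, length) for residue pairs whose C-alpha atoms are closer
    than max_dist (in angstroms) and at least min_sep residues apart in sequence.

    Both thresholds are hypothetical parameters, not values from the paper.
    """
    loops = []
    n = len(ca_coords)
    for i in range(n):
        for j in range(i + min_sep, n):
            if math.dist(ca_coords[i], ca_coords[j]) < max_dist:
                loops.append((i, j, j - i))
    return loops


def loop_size_distribution(loops):
    """Histogram of loop lengths (sequence separation of the closing contact)."""
    return Counter(length for _, _, length in loops)


# Toy trace: 28 residues on a straight line, then one residue folded back
# next to the start of the chain (hypothetical coordinates).
trace = [(3.8 * i, 0.0, 0.0) for i in range(28)] + [(0.0, 5.0, 0.0)]
loops = find_closed_loops(trace)
print(loops)
print(loop_size_distribution(loops))
```

Applied to a structure database, the histogram accumulated over all chains is the kind of size distribution the abstract refers to.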
Current Opinion in Structural Biology, 2001
Abbreviations: ACBP, acyl coenzyme A-binding protein; AcP, acylphosphatase; ADA2h, activation domain of procarboxypeptidase A2; apoMb, apomyoglobin; CD2.d1, domain 1 of immunoglobulin CD2; CoC, conservatism of conservatism; CRABP, cellular retinoic acid-binding protein; Csp, cold shock protein; fn, fibronectin-like domain; GuHCl, guanidinium hydrochloride; IFABP, intestinal fatty acid-binding protein; Ig, immunoglobulin; iLBP, intracellular lipid-binding protein; ILBP, ileal lipid-binding protein; Lb, leghemoglobin; PDB, Protein Data Bank; SH, Src homology; TN, tenascin.
Recent work by Ptitsyn and Shakhnovich has exploited the availability of large numbers of sequences to extract folding information from evolutionarily related proteins. Ptitsyn and Ting [7•] looked for conserved residues among globins. This approach led to the identification of two conserved clusters, one clearly involved in interactions with the heme and the other, a group of large nonpolar residues on helices A, G and H. The authors suggested that interactions among these residues could be critical to initiation of folding in the globin family, pointing out that indeed the A, G and H helices form early in apomyoglobin (apoMb) folding [8]. Interestingly, this prediction has been called into question in a recent experimental study comparing the folding of leghemoglobin (Lb) with that of myoglobin [9•], as discussed below. In order to identify amino acids conserved for folding and eliminate those conserved for function, Mirny and Shakhnovich [10•] developed the 'conservatism of conservatism' (CoC) approach, in which they compare members of superfamilies, that is, proteins that share folds, but are highly divergent in sequence and function. Not surprisingly, sites
Proceedings of the National Academy of Sciences
Direct structural information obtained for many proteins supports the following conclusions. The amino acid sequences of proteins can stabilize not only the final native state but also a small set of discrete partially folded native-like intermediates. Intermediates are formed in steps that use as units the cooperative secondary structural elements of the native protein. Earlier intermediates guide the addition of subsequent units in a process of sequential stabilization mediated by native-like tertiary interactions. The resulting stepwise self-assembly process automatically constructs a folding pathway, whether linear or branched. These conclusions are drawn mainly from hydrogen exchange-based methods, which can depict the structure of infinitesimally populated folding intermediates at equilibrium and kinetic intermediates with subsecond lifetimes. Other kinetic studies show that the polypeptide chain enters the folding pathway after an initial free-energy-uphill conformational sea...
2006
A lattice model is used to study mutations and compacting effects on protein folding rates and folding temperature. In the context of protein evolution, we address the question regarding the best scenario for a polypeptide chain to fold: either a fast nonspecific collapse followed by a slow rearrangement to form the native structure or a specific collapse from the unfolded state with the simultaneous formation of the native state. This question is investigated for optimized sequences, whose native state has no frustrated contacts between monomers, and also for mutated sequences, whose native state has some degree of frustration. It is found that the best scenario for folding may depend on the amount of frustration of the native structure. The implication of this result on protein evolution is discussed.
FEBS Letters, 2003
A judicious examination of an exhaustive PDB sample of soluble globular proteins of moderate size (N ≤ 102) reveals a commensurable relationship between hydrophobic surface burial and the number of backbone hydrogen bonds. An analysis of 50 000 conformations along the longest all-atom MD trajectory allows us to infer that not only is the hydrophobic collapse concurrent with the formation of backbone amide-carbonyl hydrogen bonds, but the two are also dynamically coupled processes. In statistical terms, hydrophobic clustering of the side chains is inevitably conducive to backbone burial, and the latter process becomes thermodynamically too costly and kinetically unfeasible without amide-carbonyl hydrogen-bond formation. Furthermore, the desolvation of most hydrogen bonds is exhaustive along the pathway, implying that such bonds guide the collapse process.
Proteins-structure Function and Bioinformatics, 2003
Patterns of hydrophobic and hydrophilic residues (binary patterns) play an important role in protein architecture and can be roughly categorized into two classes regarding their preferential participation in α-helices or β-strands. However, a single binary pattern can be embedded into different longer patterns carrying opposite structural information and thus cannot be as informative as expected. Here, we consider conditional binary patterns, or hydrophobic clusters, whose existence is conditioned by the presence of a minimum number of nonhydrophobic residues, called the connectivity distance, that separate two hydrophobic amino acids assumed to belong to two distinct patterns. Conditional binary patterns are distinct from simple ones in that they are not intertwined, i.e., they cannot include or be included in other conditional patterns, and therefore carry much more differentiated information, in particular being dramatically better correlated with regular secondary structures (especially β ones). The distribution of these nonintertwined binary patterns in natural proteins was assessed relative to randomness, evidencing the structural bricks that are favored and disfavored by evolutionary selection. Several connectivity distances as well as several hydrophobic alphabets were tested, evidencing the clear superiority of a connectivity distance of 4, which mimics the minimum current length of loops in globular domains, and of the VILFMYW alphabet, selected from structural data (secondary structure propensity and Voronoi tessellation), in highlighting fundamental properties of protein folds. Proteins 2003;51:236–244. © 2003 Wiley-Liss, Inc.
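The segmentation rule described above can be sketched in Python using the two values the abstract reports as best, the VILFMYW alphabet and a connectivity distance of 4. The function name, the output format, and the example sequence are hypothetical, and the sketch ignores any additional cluster-breaking rules (for example for proline) that a full HCA implementation may apply.

```python
def hydrophobic_clusters(seq, alphabet="VILFMYW", connectivity=4):
    """Split a sequence into hydrophobic clusters.

    Two hydrophobic residues fall into distinct clusters when at least
    `connectivity` nonhydrophobic residues separate them. Returns a list of
    (start_index, binary_pattern), with '1' marking hydrophobic positions.
    """
    hydrophobic = frozenset(alphabet)
    clusters = []
    start = last = None

    def close(s, e):
        # Encode the closed cluster seq[s..e] as a binary pattern.
        pattern = "".join("1" if r in hydrophobic else "0" for r in seq[s:e + 1])
        clusters.append((s, pattern))

    for i, residue in enumerate(seq):
        if residue not in hydrophobic:
            continue
        if start is None:
            start = i
        elif i - last - 1 >= connectivity:
            # Gap is long enough: the previous cluster ends at `last`.
            close(start, last)
            start = i
        last = i
    if start is not None:
        close(start, last)
    return clusters


print(hydrophobic_clusters("VKKLSSSSSIW"))
```

Each cluster begins and ends with a hydrophobic residue, so no cluster can contain or be contained in another, which is the non-intertwined property the abstract emphasizes.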
Proceedings of the National Academy of Sciences, 1993
This paper investigates quantitatively the characteristics of the local folding code. The overlapping four-residue fragments which make up the amino acid sequences of 114 proteins are divided into classes on the basis of the physical properties of their constituent amino acids. The distribution of structural types associated with each class of sequence fragment is determined and compared with an ensemble of random structural distributions of the same size selected from the actual protein structures. A criterion is proposed, based on the relative entropies of the two types of distribution, and on a hypothesis as to the characters of...
Proceedings of the National Academy of Sciences, 1971
A mechanism is proposed for the folding of protein chains. On the basis of short-range interactions, certain amino acid sequences have a high propensity to be, say, α-helical. However, these short helical (or other ordered) regions can be stabilized only by long-range interactions arising from the proximity of two such ordered regions. These regions are brought near each other by the directing influence of certain other amino acid sequences that have a high probability of forming β-bends or variants thereof, also on the basis of short-range interactions. An analysis is made of the tendency of various amino acids to occur in β-bends, and it is possible to predict the regions of a chain in which a β-bend will occur with a high degree of reliability.
There is continued interest in predicting the structure of proteins, either at the simplest level of identifying their fold class or persevering all the way to an atomic-resolution structure. Protein folding methods have become very sophisticated, and many successes have been recorded with claims to have solved the native structure of the protein. But for any given protein, there may be more than one solution. Many proteins can exist in one or other of two (or more) different forms, and some populate multiple metastable states. Here, the two-state case is considered and the key structural changes that take place when the protein switches from one state to the other are identified. Analysis of these results shows that hydrogen bonding patterns and hydrophobic contacts vary considerably between different conformers. Contrary to what has often been assumed previously, these two types of interaction operate essentially independently of one another. Core packing is critical for proper protein structure and function, and it is shown that there are considerable changes in internal cavity volumes in many cases. The way in which these switches are made is fold dependent. Considerations such as these need to be taken into account in protein structure prediction.
Physica A: Statistical Mechanics and its Applications, 2018
We assume that the protein folding process follows two autonomous steps: the conformational search for the native state, mainly ruled by the hydrophobic effect, and the final adjustment stage, which eventually gives stability to the native state. Our main tool of investigation is a 3D lattice model provided with a ten-letter alphabet, the stereochemical model. This model was conceived for Monte Carlo (MC) simulations with the kinetic behavior of protein-like chains in solution in mind. In order to characterize the characteristic folding time (τ) by two distinct sampling methods, we first present two sets of 10³ MC simulations for a fast protein-like sequence. For these sets of folding times, τ and τ_q were obtained with the standard Metropolis algorithm (MA) and a modified algorithm (M_qA), respectively. The results for τ_q reveal two things: i) the hydrophobic chain-solvent interactions plus a set of inter-residue steric constraints are enough to emulate the first stage of the process: in each of the 10³ MC simulations performed, the native state is always found, without exception; ii) the ratio τ_q/τ ≅ 1/3 suggests that the effect of local thermal fluctuations, encompassed by the Tsallis weight, gives the chain an innate efficiency in escaping from energetic and steric traps. A physical insight is provided. Our second result was obtained through a set of 600 independent MC simulations performed with the M_qA method applied to a set of 200 representative targets (native structures). The results show how structural patterns modulate τ_q, which covers four orders of magnitude on the temporal scale. The third and last result was obtained from a special kind of simulation for those same 200 targets, in which we simulated their stability. We obtained a strong correlation (R = 0.85) between the hydrophobic component of protein stability and the folding rate: the faster a protein finds the native state, the larger the hydrophobic component of its stability.
This final result suggests that the hydrophobic interactions could not be a general stabilizing factor for proteins.
The Journal of Chemical Physics, 2014
The dynamics and energetics of formation of loops in the 46-residue N-terminal fragment of the B-domain of staphylococcal protein A have been studied. Numerical simulations have been performed using coarse-grained molecular dynamics with the united-residue (UNRES) force field. The results have been analyzed in terms of a kink (heteroclinic standing wave solution) of a generalized discrete nonlinear Schrödinger (DNLS) equation. In the case of proteins, the DNLS equation arises from a Cα-trace-based energy function. Three individual kink profiles were identified in the experimental three-α-helix structure of protein A, in the range of the Glu16-Asn29, Leu20-Asn29, and Gln33-Asn44 residues, respectively; these correspond to two loops in the native structure. UNRES simulations were started from the full right-handed α-helix to obtain a clear picture of kink formation, which would otherwise be blurred by helix formation. All three kinks emerged during coarse-grained simulations. It was found that the formation of each is accompanied by a local free energy increase; this is expressed as the change of UNRES energy, which has the physical sense of the potential of mean force of a polypeptide chain. The increase is about 7 kcal/mol. This value can thus be considered as the free energy barrier to kink formation in full α-helical segments of polypeptide chains. During the simulations, the kinks emerge, disappear, propagate, and annihilate each other many times. It was found that the formation of a kink is initiated by an abrupt change in the orientation of a pair of consecutive side chains in the loop region. This resembles the formation of a Bloch wall along a spin chain, where the Cα backbone corresponds to the chain, and the amino acid side chains are interpreted as the spin variables. This observation suggests that nearest-neighbor side chain-side chain interactions are responsible for initiation of loop formation.
It was also found that the individual kinks are reflected as clear peaks in the principal modes of the analyzed trajectory of protein A, the shapes of which resemble the directional derivatives of the kinks along the chain. These observations suggest that the kinks of the DNLS equation determine the functionally important motions of proteins. © 2014 AIP Publishing LLC. [http://dx.