1976, The Modern Language Review
Nuclear magnetic resonance (NMR) spectroscopy allows scientists to study protein structure, dynamics and interactions in solution. A necessary first step for such applications is determining the resonance assignment, mapping spectral data to atoms and residues in the primary sequence. Automated resonance assignment algorithms rely on information regarding connectivity (e.g., through-bond atomic interactions) and amino acid type, typically using the former to determine strings of connected residues and the latter to map those strings to positions in the primary sequence. Significant ambiguity exists in both connectivity and amino acid type information. This paper focuses on the information content available in connectivity alone and develops a novel random-graph theoretic framework and algorithm for connectivity-driven NMR sequential assignment. Our random graph model captures the structure of chemical shift degeneracy, a key source of connectivity ambiguity. We then give a simple and natural randomized algorithm for finding optimal assignments as sets of connected fragments in NMR graphs. The algorithm naturally and efficiently reuses substrings while exploring connectivity choices; it overcomes local ambiguity by enforcing global consistency of all choices. By analyzing our algorithm under our random graph model, we show that it can provably tolerate relatively large ambiguity while still giving expected optimal performance in polynomial time. We present results from practical applications of the algorithm to experimental datasets from a variety of proteins and experimental setups. We demonstrate that our approach is able to overcome significant noise and local ambiguity in identifying significant fragments of sequential assignments.
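The idea of connectivity ambiguity caused by chemical shift degeneracy can be illustrated with a toy graph. The following is a minimal sketch, not the randomized algorithm described in the abstract: it assumes each spin system carries only two hypothetical values (its own CA shift and the CA shift of the preceding residue), links two spin systems when those values agree within a tolerance, and then finds the longest consistent fragment by exhaustive depth-first search. The shift values, the names `s1` to `s4`, and the 0.2 ppm tolerance are invented for illustration.

```python
def build_connectivity_graph(spin_systems, tol=0.2):
    """Return edges[a] = spin systems that may directly follow a in the sequence.

    spin_systems maps a name to (own_ca, preceding_ca), both in ppm.
    b may follow a when b's preceding-residue shift matches a's own shift.
    """
    edges = {name: [] for name in spin_systems}
    for a, (own_a, _) in spin_systems.items():
        for b, (_, prev_b) in spin_systems.items():
            if a != b and abs(own_a - prev_b) <= tol:
                edges[a].append(b)
    return edges


def longest_fragment(edges):
    """Exhaustively search for the longest path that visits no spin system twice."""
    best = []

    def extend(path):
        nonlocal best
        if len(path) > len(best):
            best = list(path)
        for nxt in edges[path[-1]]:
            if nxt not in path:
                path.append(nxt)
                extend(path)
                path.pop()

    for start in edges:
        extend([start])
    return best


# Hypothetical data: s1 and s4 have nearly identical CA shifts (degeneracy),
# so s2 could locally follow either of them.
spin_systems = {
    "s1": (56.0, 61.5),
    "s2": (58.3, 56.1),
    "s3": (52.4, 58.2),
    "s4": (56.1, 52.5),
}
edges = build_connectivity_graph(spin_systems)
fragment = longest_fragment(edges)
```

In this toy case the local choice between `s1` and `s4` as the predecessor of `s2` is ambiguous, but only one choice yields a fragment that uses all four spin systems, which is the sense in which global consistency can resolve local ambiguity. The exhaustive search is exponential in the worst case and is used here only because the example is tiny.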
Journal of Biomolecular NMR, 2010
High-throughput functional protein NMR studies, like protein interactions or dynamics, require an automated approach for the assignment of the protein backbone. With the availability of a growing number of protein 3D structures, a new class of automated approaches, called structure-based assignment, has been developed quite recently. Structure-based approaches primarily use NMR input data that are not based on J-coupling and for which connections between residues are not limited by through-bond magnetization transfer efficiency. We present here a robust structure-based assignment approach using mainly HN-HN NOE networks, as well as 1H-15N residual dipolar couplings and chemical shifts. The NOEnet complete search algorithm is robust against assignment errors, even for sparse input data. Instead of a unique and partly erroneous assignment solution, NOEnet gives an optimal assignment ensemble with an accuracy equal to or near 100%. We show that even low-precision assignment ensembles give enough information for functional studies, like modeling of protein complexes. Finally, the combination of NOEnet with a low number of ambiguous J-coupling sequential connectivities yields a high-precision assignment ensemble. NOEnet will be available under: http://www.icsn.cnrs-gif.fr/download/nmr.
Statistical Applications in Genetics and Molecular Biology, 2000
Nuclear Magnetic Resonance (NMR) spectroscopy is a key experimental technique used to study protein structure, dynamics, and interactions. NMR methods face the bottleneck of spectral analysis, in particular determining the resonance assignment, which helps define the mapping between atoms in the protein and peaks in the spectra. A substantial amount of noise in spectral data, along with ambiguities in interpretation, makes this analysis a daunting task, and there exists no generally accepted measure of uncertainty associated with the resulting solutions. This paper develops a model-based inference approach that addresses the problem of characterizing uncertainty in backbone resonance assignment. We argue that NMR spectra are subject to random variation, and ignoring this stochasticity can lead to false optimism and erroneous conclusions. We propose a Bayesian statistical model that accounts for various sources of uncertainty and provides an automatable framework for inference. While assignment has previously been viewed as a deterministic optimization problem, we demonstrate the importance of considering all solutions consistent with the data, and develop an algorithm to search this space within our statistical framework. Our approach is able to characterize the uncertainty associated with backbone resonance assignment in several ways: 1) it quantifies uncertainty in the individually assigned resonances in terms of their posterior standard deviations; 2) it assesses the information content in the data with a posterior distribution of plausible assignments; and 3) it provides a measure of the overall plausibility of assignments. We demonstrate the value of our approach in a study of experimental data from two proteins, Human Ubiquitin and Cold-shock protein A from E. coli. In addition, we provide simulations showing the impact of experimental conditions on uncertainty in the assignments.
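To make the notion of a posterior distribution over assignments concrete, here is a minimal sketch under strong simplifying assumptions that are not taken from the paper: one shift per peak, independent Gaussian errors with a shared hypothetical standard deviation `sigma`, a uniform prior over one-to-one assignments, and brute-force enumeration instead of a search algorithm. The shift values are invented.

```python
import math
from itertools import permutations


def assignment_posterior(observed, predicted, sigma=1.0):
    """Posterior probability of every one-to-one peak-to-residue assignment.

    A permutation perm assigns observed peak i to residue perm[i].
    Assumes a uniform prior and independent Gaussian errors of width sigma.
    """
    weights = {}
    for perm in permutations(range(len(predicted))):
        log_lik = sum(
            -0.5 * ((observed[i] - predicted[r]) / sigma) ** 2
            for i, r in enumerate(perm)
        )
        weights[perm] = math.exp(log_lik)
    total = sum(weights.values())
    return {perm: w / total for perm, w in weights.items()}


# Hypothetical 15N shifts (ppm): the first two residues are nearly degenerate.
observed = [120.1, 120.4, 110.0]
predicted = [120.0, 120.5, 110.2]
posterior = assignment_posterior(observed, predicted)
best = max(posterior, key=posterior.get)
```

With these numbers the best-scoring assignment is the identity, but the assignment that swaps the first two peaks keeps a large share of the posterior mass. Reporting only the single optimum would hide that uncertainty, which is the kind of false optimism the abstract warns about.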
Bioinformatics, 2009
A prerequisite for any protein study by NMR is the assignment of the resonances from the 15N-1H HSQC spectrum to their corresponding atoms of the protein backbone. Usually, this assignment is obtained by analyzing triple resonance NMR experiments. An alternative assignment strategy exploits the information given by an already available 3D structure of the same or a homologous protein. Up to now, the algorithms that have been developed around the structure-based assignment strategy share the important drawback that they cannot guarantee an assignment accuracy near 100%. Results: We propose here a new program, called NOEnet, implementing an efficient complete search algorithm that ensures the correctness of the assignment results. NOEnet exploits the network character of unambiguous NOE constraints to perform an exhaustive search of all possible matches of the NOE network onto the structural one. NOEnet has been successfully tested on EIN, a large protein of 28 kDa, using only NOE data. The complete search of NOEnet finds all possible assignments compatible with the experimental data, which together define an assignment ensemble. We show that multiple assignment possibilities of large NOE networks are restricted to a small spatial assignment range (SAR), so that assignment ensembles, obtained from accessible experimental data, are precise enough to be used for functional protein studies, like protein-ligand interaction or protein dynamics studies. We believe that NOEnet can become a major tool for the structure-based backbone resonance assignment strategy in NMR.
Journal of Computational Biology, 2011
In NMR resonance assignment, an indispensable step in NMR protein studies, manually processed peaks from both N-labeled and C-labeled spectra are typically used as inputs. However, the use of homologous structures can allow one to use only N-labeled NMR data and avoid the added expense of using C-labeled data. We propose a novel integer programming framework for structure-based backbone resonance assignment using N-labeled data. The core consists of a pair of integer programming models: one for spin system forming and amino acid typing, and the other for backbone resonance assignment. The goal is to perform the assignment directly from spectra without any manual intervention via automatically picked peaks, which are much noisier than manually picked peaks, so methods must be error-tolerant. In the case of semiautomated/manually processed peak data, we compare our system with the contact replacement (CR) method of Xiong, Pandurangan, and Bailey-Kellogg, which is the most error-tolerant method for structure-based resonance assignment. Our system, on average, reduces the error rate of the CR method fivefold on their data set. In addition, by using an iterative algorithm, our system has the added capability of using the NOESY data to correct assignment errors due to errors in predicting the amino acid and secondary structure type of each spin system. On a publicly available data set for human ubiquitin, where the typing accuracy is 83%, we achieve 91% accuracy, compared to the 59% accuracy obtained without correcting for such errors. In the case of automatically picked peaks, using assignment information from yeast ubiquitin, we achieve a fully automatic assignment with 97% accuracy. To our knowledge, this is the first system that can achieve fully automatic structure-based assignment directly from spectra. This has implications in NMR protein mutant studies, where the assignment step is repeated for each mutant.
Journal of Bioinformatics and Computational Biology, 2011
Error-tolerant backbone resonance assignment is the cornerstone of the NMR structure determination process. Although a variety of assignment approaches have been developed, none works sufficiently well on noisy fully automatically picked peaks to enable the subsequent automatic structure determination steps. We have designed an integer linear programming (ILP) based assignment system (IPASS) that has enabled fully automatic protein structure determination for four test proteins. IPASS employs probabilistic spin system typing based on chemical shifts and secondary structure predictions. Furthermore, IPASS extracts connectivity information from the inter-residue information and the (automatically picked) 15N-edited NOESY peaks, which are then used to fix reliable fragments. When applied to automatically picked peaks for real proteins, IPASS achieves an average precision and recall of 82% and 63%, respectively. In contrast, the next best method, MARS, achieves an average precision and recall of 77% and 36%, respectively. The assignments generated by IPASS are then fed into our protein structure calculation system, FALCON-NMR, to determine the 3D structures without human intervention. The final models have backbone RMSDs of 1.25Å, 0.88Å, 1.49Å, and 0.67Å to the reference native structures for proteins TM1112, CASKIN, VRAR, and HACS1, respectively. The web server is publicly available at http://monod.uwaterloo.ca/nmr/ipass.
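For readers unfamiliar with the two metrics quoted above, the sketch below shows one common way to compute them for an assignment. It assumes, without confirmation from the abstract, that precision is the fraction of assigned spin systems that are correct and recall is the fraction of reference assignments that were recovered. The function name and the data are hypothetical.

```python
def precision_recall(predicted, reference):
    """Compare a predicted assignment to a reference assignment.

    Both arguments map a spin-system identifier to a residue number.
    Spin systems left unassigned are simply absent from predicted.
    """
    correct = sum(
        1 for spin, residue in predicted.items()
        if reference.get(spin) == residue
    )
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical example: five assignable spin systems, four assigned, three correct.
reference = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}
predicted = {"a": 1, "b": 2, "c": 4, "d": 4}
precision, recall = precision_recall(predicted, reference)
```

Under these definitions a method that assigns only what it is confident about can reach high precision at the cost of low recall, which is why the two numbers are reported together.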
Proceedings / IEEE Computer Society Bioinformatics Conference. IEEE Computer Society Bioinformatics Conference, 2002
NMR resonance assignment is one of the key steps in solving an NMR protein structure. The assignment process links resonance peaks to individual residues of the target protein sequence, providing the prerequisite for establishing intra- and inter-residue spatial relationships between atoms. The assignment process is tedious and time-consuming and can take many weeks. Though a number of computer programs exist to assist the assignment process, many NMR labs still do the assignments manually to ensure quality. This paper presents a new computational method based on our recent work towards automating the assignment process, particularly the process of backbone resonance peak assignment. We formulate the assignment problem as a constrained weighted bipartite matching problem. While the problem, in the most general situation, is NP-hard, we present an efficient solution based on a branch-and-bound algorithm with effective bounding techniques and a greedy filtering algor...
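The matching formulation can be sketched in a few lines. The following is a simplified illustration, not the paper's method: it minimizes total cost rather than maximizing weight, models constraints only as forbidden pairs (marked `None`), and uses a basic lower bound (the sum of each remaining row's cheapest allowed entry) in place of the bounding and filtering techniques the abstract mentions. The cost matrix is invented.

```python
def best_matching(cost):
    """Minimum-cost one-to-one matching of rows (spin systems) to columns (residues).

    cost[i][j] is the cost of matching row i to column j, or None if forbidden.
    Returns (match, total) where match[i] is the column chosen for row i,
    or (None, inf) if no complete matching exists.
    """
    n = len(cost)
    inf = float("inf")
    # Optimistic bound: cheapest allowed entry per row, ignoring column conflicts.
    row_min = [min((c for c in row if c is not None), default=inf) for row in cost]
    suffix = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        suffix[i] = suffix[i + 1] + row_min[i]

    best = {"cost": inf, "match": None}

    def search(row, used, partial, so_far):
        # Prune when even the optimistic completion cannot beat the incumbent.
        if so_far + suffix[row] >= best["cost"]:
            return
        if row == n:
            best["cost"], best["match"] = so_far, list(partial)
            return
        allowed = [c for c in range(n) if c not in used and cost[row][c] is not None]
        for col in sorted(allowed, key=lambda c: cost[row][c]):
            used.add(col)
            partial.append(col)
            search(row + 1, used, partial, so_far + cost[row][col])
            partial.pop()
            used.remove(col)

    search(0, set(), [], 0.0)
    return best["match"], best["cost"]


# Hypothetical 3x3 costs; None marks pairs ruled out by a constraint.
cost = [
    [1.0, 4.0, None],
    [2.0, 1.5, 3.0],
    [None, 2.0, 0.5],
]
match, total = best_matching(cost)
```

Trying the cheapest column first tends to find a good incumbent early, which makes the bound prune more of the search tree. For unconstrained problems a polynomial-time method such as the Hungarian algorithm would be the usual choice; branch and bound becomes relevant once extra constraints make the problem hard.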
Journal of Biomolecular NMR, 2019
Various methods for understanding the structural and dynamic properties of proteins rely on the analysis of their NMR chemical shifts. These methods require the initial assignment of NMR signals to particular atoms in the sequence of the protein, a step that can be very time-consuming. The probabilistic interaction network of evidence (PINE) algorithm for automated assignment of backbone and side chain chemical shifts utilizes a Bayesian probabilistic network model that analyzes sequence data and peak lists from multiple NMR experiments. PINE, which is one of the most popular and reliable automated chemical shift assignment algorithms, has been available to the protein NMR community for longer than a decade. We announce here a new web server version of PINE, called Integrative PINE (I-PINE), which supports more types of NMR experiments than PINE (including three-dimensional nuclear Overhauser enhancement and four-dimensional J-coupling experiments) along with more comprehensive visualization of chemical shift based analysis of protein structure and dynamics. The I-PINE server is freely accessible at http://i-pine.nmrfam.wisc.edu. Help pages and tutorial including browser capability are available at: http://i-pine.nmrfam.wisc.edu/instruction.html. Sample data that can be used for testing the web server are available at: http://i-pine.nmrfam.wisc.edu/examples.html.
Proceedings / ... International Conference on Intelligent Systems for Molecular Biology ; ISMB. International Conference on Intelligent Systems for Molecular Biology, 1993
AUTOASSIGN is a prototype expert system designed to aid in the determination of protein structure from nuclear magnetic resonance (NMR) measurements. In this paper we focus on one of the key steps of this process, the assignment of the observed NMR signals to specific atomic nuclei in the protein; i.e. the determination of sequence-specific resonance assignments. Recently developed triple-resonance (1H, 15N, and 13C) NMR experiments [Montelione et al., 1992] have provided an important breakthrough in this field, as the resulting data are more amenable to automated analysis than data sets generated using conventional strategies [Wuethrich, 1986]. The "assignment problem" can be stated as a constraint satisfaction problem (CSP) with some added complexities. There is very little internal structure to the problem, making it difficult to apply subgoaling and problem decomposition. Moreover, the data used to generate the constraints are incomplete, non-unique, and noisy, and con...
2022
The comprehensive assignment of individual resonances of the nuclear magnetic resonance spectrum of a protein to specific atoms remains a labor-intensive and often debilitating task, especially for proteins larger than 30 kDa. Recently, there have been tremendous advances in our empirical knowledge of the relationship between the structural context of a nuclear spin and its observed resonance frequency. Indeed, the expansion in the database of determined high-resolution protein structures and recent advances in structure prediction provide an enormous resource in this respect. Robust automation of the resonance assignment process nevertheless often remains a bottleneck in the exploitation of solution NMR spectroscopy for the study of protein structure-dynamics-function relationships. Here we present a new approach for the assignment of backbone triple resonance spectra of proteins. A Bayesian statistical analysis of predicted and observed chemical shifts is used to provide a pseudo-energy potential to drive the search for the optimal set of resonance assignments. This approach has been implemented in the C++ program Bayllagio and tested against protein systems ranging in size to over 450 amino acids. Bayllagio makes almost no errors, accommodates incomplete information, is sufficiently fast to allow for real-time evaluation of data acquisition, and greatly outperforms currently employed deterministic algorithms.
Journal of Biomolecular NMR, 1995
A novel procedure is presented for the automatic identification of secondary structures in proteins from their corresponding NOE data. The method uses graph theory to identify prescribed NOE connectivity patterns characteristic of the regular secondary structures. Resonance assignment is achieved by connecting these patterns of secondary structure together, thereby matching the connected spin systems to specific segments of the protein sequence. The method, known as SERENDIPITY, comprises a set of routines developed in a modular fashion, where each program has one or several well-defined tasks. NOE templates for several secondary structure motifs have been developed and the method has been successfully applied to data obtained from NOESY-type spectra. The present report describes the application of the SERENDIPITY protocol to a 3D NOESY-HMQC spectrum of the 15N-labelled lac repressor headpiece protein. The application demonstrates that, under favourable conditions, fully automated identification of secondary structures and semi-automated assignment are feasible.
Journal of Computational Biology, 2006
Journal of Molecular Biology, 1997
PLoS computational …, 2009
Journal of Biomolecular NMR, 2013
European Conference on Computational Biology, 2005
Journal of Biomolecular NMR, 2010