2007, Nature Biotechnology
…
The Brookhaven Protein Data Bank (PDB) is facing critical issues due to its outdated organizational structure and the inadequacies of the mmCIF data standard. This paper discusses the need for comprehensive reengineering of both PDB and mmCIF to address problems related to data redundancy, integrity, and query accuracy that undermine their utility in the structural biology community. A proposed optimization for data sharing and preservation within PDB aims to enhance its functionality and alignment with current scientific standards.
Data Science Journal
Research data is acquired, interpreted, published, reused, and sometimes eventually discarded. Understanding this life cycle better will help the development of appropriate infrastructural services, ones which make it easier for researchers to preserve, share, and find data. Structural biology is a discipline within the life sciences, one that investigates the molecular basis of life by discovering and interpreting the shapes and motions of macromolecules. Structural biology has a strong tradition of data sharing, expressed by the founding of the Protein Data Bank (PDB) in 1971. The culture of structural biology is therefore already in line with the perspective that data from publicly funded research projects are public data. This review is based on the data life cycle as defined by the UK Data Archive. It identifies six stages: creating data, processing data, analysing data, preserving data, giving access to data, and re-using data. For clarity, 'preserving data' and 'giving access to data' are discussed together. A final stage to the life cycle, 'discarding data', is also discussed. The review concludes with recommendations for future improvements to the IT infrastructure for structural biology.
Data Science Journal
The Protein Data Bank archive (PDB) was established in 1971 as the first open-access digital data resource for biology and medicine. Today, the PDB contains more than 160,000 atomic-level, experimentally determined 3D biomolecular structures. PDB data are freely and publicly available for download, without restrictions. Each entry contains summary information about the structure and experiment, atomic coordinates, and in most cases, a citation to a corresponding scientific publication. Individually and in bulk, PDB structures can be downloaded and/or analyzed and visualized online using tools at RCSB.org. Because access is unrestricted, it is challenging to understand and monitor how the data are reused. Citations of the scientific publications describing PDB structures provide one way of understanding which structures are being used, and in which research areas. Our analysis highlights frequently cited structures and identifies milestone structures that have demonstrated impact across scientific fields.
Nature structural biology, 2000
The PDB has created systems for the processing, exchange, query, and distribution of data that will enable many aspects of high throughput structural genomics.
Nucleic Acids Research, 2003
The Protein Data Bank (PDB; http://www.pdb.org/) continues to be actively involved in various aspects of the informatics of structural genomics projects: developing and maintaining the Target Registration Database (TargetDB), organizing data dictionaries that will define the specification for the exchange and deposition of data with the structural genomics centers, and creating software tools to capture data from standard structure determination applications.
Structural Genomics and Drug Discovery, 2014
Modern high-throughput structural biology laboratories produce vast amounts of raw experimental data. The traditional method of data reduction is very simple: results are summarized in peer-reviewed publications, which are hopefully published in high-impact journals. By their nature, publications include only the most important results derived from experiments that may have been performed over the course of many years. The main content of the published paper is a concise compilation of these data, an interpretation of the experimental results, and a comparison of these results with those obtained by other scientists.
Trends in Biochemical Sciences, 1997
There are currently over 6000 three-dimensional structures of biological macromolecules, primarily proteins and nucleic acids, in the Brookhaven Protein Data Bank (PDB). This number is doubling every two years and hence will be over 12 000 by the end of the millennium. This steadily increasing volume of data requires some quick and simple means of access and organization. Of interest are not only the data stored in each PDB file, i.e. the names, sequences and formulae of the molecule(s), the authors who solved the structure, literature references, experimental details and of
Nucleic Acids Research, 2004
The Protein Data Bank (PDB) is the central worldwide repository for three-dimensional (3D) structure data of biological macromolecules. The Research Collaboratory for Structural Bioinformatics (RCSB) has completely redesigned its resource for the distribution and query of 3D structure data. The reengineered site is currently in public beta test at http://pdbbeta.rcsb.org. The new site expands the functionality of the existing site by providing structure data in greater detail and uniformity, improved query and enhanced analysis tools. A new key feature is the integration and searchability of data from over 20 other sources covering genomic, proteomic and disease relationships. The current capabilities of the re-engineered site, which will become the RCSB production site at http://www.pdb.org in late 2005, are described.
Trends in Biotechnology, 2002
European Journal of Biochemistry, 1977
Nucleic Acids Research, 2002
Acta Crystallographica Section D-biological Crystallography, 2002
Nucleic Acids Research, 2004
Nucleic Acids Research, 2011