Juan Antonio Vizcaino

Papers by Juan Antonio Vizcaino

The Proteomics Identifications database (PRIDE), its associated tools and the ProteomeXchange consortium

Exploring the potential of public proteomics data

Proteomics, Jan 9, 2015

In a global effort for scientific transparency, it has become feasible and good practice to share experimental data supporting novel findings. Consequently, the amount of publicly available mass spectrometry-based proteomics data has grown substantially in recent years. With some notable exceptions, this extensive material has however largely been left untouched. The time has now come for the proteomics community to utilize this potential gold mine for new discoveries, and uncover its untapped potential. In this review, we provide a brief history of the sharing of proteomics data, showing ways in which publicly available proteomics data are already being (re-)used, and outline potential future opportunities based on four different usage types: use, reuse, reprocess and repurpose. We thus aim to assist the proteomics community in stepping up to the challenge, and to make the most of the rapidly increasing amount of public proteomics data.

Delicate Metabolic Control and Coordinated Stress Response Critically Determine Antifungal Tolerance of Candida albicans Biofilm Persisters

Antimicrobial agents and chemotherapy, Jan 20, 2015

Candida infection has emerged as a critical healthcare burden worldwide, owing to the formation of robust biofilms against common antifungals. Recent evidence shows that multidrug-tolerant persisters critically account for biofilm recalcitrance, whereas their underlying biological mechanisms are poorly understood. Here, we first investigated the phenotypic characteristics of Candida biofilm persisters under consecutive harsh treatments of amphotericin B. The prolonged treatments effectively killed the majority of cells in biofilms derived from representative strains of Candida albicans, Candida glabrata and Candida tropicalis, but failed to eradicate a small fraction of persisters. Next, we explored the tolerance mechanisms of the persisters by investigating the proteomic profiles of C. albicans biofilm persister fractions by liquid chromatography-tandem mass spectrometry. The C. albicans biofilm persisters displayed a specific proteomic signature with an array of 205 differenti...

A Quest for Missing Proteins: update 2015 on Chromosome-Centric Human Proteome Project

Journal of Proteome Research, 2015

This paper summarizes the recent activities of the Chromosome-Centric Human Proteome Project (C-HPP) consortium, which develops new technologies to identify yet-to-be annotated proteins (termed "missing proteins") in biological samples that lack sufficient experimental evidence at the protein level for confident protein identification. The C-HPP also aims to identify new protein forms that may be caused by genetic variability, post-translational modifications, and alternative splicing. Proteogenomic data integration forms the basis of the C-HPP's activities; therefore, we have summarized some of the key approaches and their roles in the project. We present new analytical technologies that improve the chemical space and lower detection limits coupled with bioinformatics tools and some publicly

ms-data-core-api: An open-source, metadata-oriented library for computational proteomics

Bioinformatics (Oxford, England), Jan 24, 2015

The ms-data-core-api is a free, open-source library for developing computational proteomics tools and pipelines. The Application Program Interface, written in Java, enables rapid tool creation by providing a robust, pluggable programming interface and common data model. The data model is based on controlled vocabularies/ontologies and captures the whole range of data types included in common proteomics experimental workflows, going from spectra to identifications to quantitative results. The library contains readers for three of the most used Proteomics Standards Initiative standard file formats: mzML, mzIdentML, and mzTab. In addition to mzML, it also supports other common mass spectra formats: dta, ms2, mgf, pkl, apl (text-based), mzXML and mzData (XML-based). Also, it can be used to read PRIDE XML, the original format used by the PRIDE database, one of the world-leading proteomics resources. Finally, we present a set of algorithms and tools whose implementation illustrates the si...

Development of data representation standards by the human proteome organization proteomics standards initiative

Journal of the American Medical Informatics Association : JAMIA, Jan 28, 2015

To describe the goals of the Proteomics Standards Initiative (PSI) of the Human Proteome Organization, the methods that the PSI has employed to create data standards, the resulting output of the PSI, lessons learned from the PSI's evolution, and future directions and synergies for the group. The PSI has 5 categories of deliverables that have guided the group. These are minimum information guidelines, data formats, controlled vocabularies, resources and software tools, and dissemination activities. These deliverables are produced via the leadership and working group organization of the initiative, driven by frequent workshops and ongoing communication within the working groups. Official standards are subjected to a rigorous document process that includes several levels of peer review prior to release. We have produced and published minimum information guidelines describing what information should be provided when making data public, either via public repositories or other means. ...

The mzTab Data Exchange Format: communicating MS-based proteomics and metabolomics experimental results to a wider audience

The HUPO Proteomics Standards Initiative has developed several standardized data formats to facilitate data sharing in mass spectrometry (MS)-based proteomics. These allow researchers to report their complete results in a unified way. However, at present, there is no format to describe the final qualitative and quantitative results for proteomics and metabolomics experiments in a simple tabular format. Many downstream analysis use cases are only concerned with the final results of an experiment and require an easily accessible format, compatible with tools such as Microsoft Excel or R.

PRIDE and "Database on Demand" as valuable tools for computational proteomics

Data Mining in Proteomics: From Standards to Applications, 2011

The Proteomics Identifications Database (PRIDE, http://www.ebi.ac.uk/pride) provides users with the ability to explore and compare mass spectrometry-based proteomics experiments that reveal details of the protein expression found in a broad range of taxonomic groups, tissues, and disease states. A PRIDE experiment typically includes identifications of proteins, peptides, and protein modifications. Additionally, many of the submitted experiments also include the mass spectra that provide the evidence for these identifications. Finally, one of the strongest advantages of PRIDE in comparison with other proteomics repositories is the amount of metadata it contains, a key point to put the above-mentioned data in biological and/or technical context. Several informatics tools have been developed in support of the PRIDE database. The most recent one is called "Database on Demand" (DoD), which allows custom sequence databases to be built in order to optimize the results from search engines. We describe the use of DoD in this chapter. Additionally, in order to show the potential of PRIDE as a source for data mining, we also explore complex queries using federated BioMart queries to integrate PRIDE data with other resources, such as Ensembl, Reactome, or UniProt.

PRIDE: Data Submission and Analysis

Current Protocols in Protein Science, 2001

The Proteomics Identifications database (PRIDE, http://www.ebi.ac.uk/pride) is one of the main repositories designed to store, disseminate, and analyze mass spectrometry-based proteomics datasets. In this unit, an overview of the PRIDE system is given, including its key satellite tools: the Ontology Lookup Service (OLS), the Protein Identifier Cross-Referencing Service (PICR), and Database on Demand (DoD). Also described in detail are procedures for submitting data to PRIDE, and accessing data stored in PRIDE using the BioMart interface. Finally, to demonstrate the potential of PRIDE as a source for data mining, an example protocol is provided to showcase the powerful cross-domain query capabilities available through a combination of BioMarts.

Analysis of the Protein Domain and Domain Architecture Content in Fungi and Its Application in the Search of New Antifungal Targets

PLoS Computational Biology, 2014

Over the past several years fungal infections have shown an increasing incidence in the susceptible population, and caused high mortality rates. In parallel, multi-resistant fungi are emerging in human infections. Therefore, the identification of new potential antifungal targets is a priority. The first task of this study was to analyse the protein domain and domain architecture content of the 137 fungal proteomes (corresponding to 111 species) available in UniProtKB (UniProt KnowledgeBase) by January 2013. The resulting list of core and exclusive domain and domain architectures is provided in this paper. It delineates the different levels of fungal taxonomic classification: phylum, subphylum, order, genus and species. The analysis highlighted Aspergillus as the most diverse genus in terms of exclusive domain content. In addition, we also investigated which domains could be considered promiscuous in the different organisms. As an application of this analysis, we explored three different ways to detect potential targets for antifungal drugs. First, we compared the domain and domain architecture content of the human and fungal proteomes, and identified those domains and domain architectures only present in fungi. Secondly, we looked for information regarding fungal pathways in public repositories, where proteins containing promiscuous domains could be involved. Three pathways were identified as a result: lovastatin biosynthesis, xylan degradation and biosynthesis of siroheme. Finally, we classified a subset of the studied fungi in five groups depending on their occurrence in clinical samples. We then looked for exclusive domains in the groups that were more relevant clinically and determined which of them had the potential to bind small molecules.
Overall, this study provides a comprehensive analysis of the available fungal proteomes and shows three approaches that can be used as a first step in the detection of new antifungal targets. Citation: Barrera A, Alastruey-Izquierdo A, Martín MJ, Cuesta I, Vizcaíno JA (2014) Analysis of the Protein Domain and Domain Architecture Content in Fungi and

qcML: an exchange format for quality control metrics from mass spectrometry experiments

Molecular & Cellular Proteomics, 2014

Quality control is increasingly recognized as a crucial aspect of mass spectrometry based proteomics. Several recent papers discuss relevant parameters for quality control and present applications to extract these from the instrumental raw data. What has been missing, however, is a standard data exchange format for reporting these performance metrics. We therefore developed the qcML format, an XML-based standard that follows the design principles of the related mzML, mzIdentML, mzQuantML, and TraML standards from the HUPO-PSI (Proteomics Standards Initiative). In addition to the XML format, we also provide tools for the calculation of a wide range of quality metrics as well as a database format and interconversion tools, so that existing LIMS systems can easily add relational storage of the quality control data to their existing schema. We here describe the qcML specification, along with possible use cases and an illustrative example of the subsequent analysis possibilities. All information about qcML is available at http://code.google.com/p/qcml.

Controlled vocabularies and ontologies in proteomics: Overview, principles and practice

by Gerhard Mayer, Juan Antonio Vizcaino, and Christian Stephan

Biochimica et biophysica acta, 2013

This paper focuses on the use of controlled vocabularies (CVs) and ontologies, especially in the area of proteomics, primarily related to the work of the Proteomics Standards Initiative (PSI). It describes the relevant proteomics standard formats and the ontologies used within them. Software and tools for working with these ontology files are also discussed. The article also examines the "mapping files" used to ensure that correct controlled vocabulary terms are placed within PSI standards and that the MIAPE (Minimum Information about a Proteomics Experiment) requirements are fulfilled.

ProteomeXchange provides globally coordinated proteomics data submission and dissemination

by Gerhard Mayer, Lennart Martens, Ioannis Xenarios, Salvador Martínez-Bartolomé, Juan Antonio Vizcaino, and Alex Campos

The HUPO proteomics standards initiative- mass spectrometry controlled vocabulary

by Gerhard Mayer, Christian Stephan, and Juan Antonio Vizcaino

Database : the journal of biological databases and curation, 2013

Controlled vocabularies (CVs), i.e. a collection of predefined terms describing a modeling domain, used for the semantic annotation of data, and ontologies are used in structured data formats and databases to avoid inconsistencies in annotation, to have a unique (and preferably short) accession number and to give researchers and computer algorithms the possibility for more expressive semantic annotation of data. The Human Proteome Organization (HUPO)-Proteomics Standards Initiative (PSI) makes extensive use of ontologies/CVs in their data formats. The PSI-Mass Spectrometry (MS) CV contains all the terms used in the PSI MS-related data standards. The CV contains a logical hierarchical structure to ensure ease of maintenance and the development of software that makes use of complex semantics. The CV contains terms required for a complete description of an MS analysis pipeline used in proteomics, including sample labeling, digestion enzymes, instrumentation parts and parameters, software used for identification and quantification of peptides/proteins and the parameters and scores used to determine their significance. Owing to the range of topics covered by the CV, collaborative development across several PSI working groups, including proteomics research groups, instrument manufacturers and software vendors, was necessary. In this article, we describe the overall structure of the CV, the process by which it has been developed and is maintained and the dependencies on other ontologies.

Gene expression analysis of the biocontrol fungus Trichoderma harzianum in the presence of tomato plants, chitin, or glucose using a high-density oligonucleotide microarray

by Juan Antonio Vizcaino and Enrique Monte

BMC Microbiology, 2009

Background: It has recently been shown that the Trichoderma fungal species used for biocontrol of plant diseases are capable of interacting with plant roots directly, behaving as symbiotic microorganisms. With a view to providing further information at transcriptomic level about the early response of Trichoderma to a host plant, we developed a high-density oligonucleotide (HDO) microarray encompassing 14,081 Expressed Sequence Tag (EST)-based transcripts from eight Trichoderma spp. and 9,121 genome-derived transcripts of T. reesei, and we have used this microarray to examine the gene expression of T. harzianum either alone or in the presence of tomato plants, chitin, or glucose.

Partial silencing of a hydroxy-methylglutaryl-CoA reductase-encoding gene in Trichoderma harzianum CECT 2413 results in a lower level of resistance to lovastatin and lower antifungal activity

by Juan Antonio Vizcaino and Enrique Monte

Fungal Genetics and Biology, 2007

In the present article, we describe the cloning and characterization of the Trichoderma harzianum hmgR gene encoding a hydroxymethylglutaryl CoA reductase (HMGR), a key enzyme in the biosynthesis of terpene compounds. In T. harzianum, partial silencing of the hmgR gene gave rise to transformants with a higher level of sensitivity to lovastatin, a competitive inhibitor of the HMGR enzyme. In addition, these hmgR-silenced transformants produced lower levels of ergosterol than the wild-type strain in a minimal medium containing lovastatin. The silenced transformants showed a decrease in hmgR gene expression (up to 8.4-fold, after 72 h of incubation), together with an increase in the expression of erg7 (up to 15.8-fold, after 72 h of incubation), a gene involved in the biosynthesis of triterpenes. Finally, hmgR-silenced transformants showed a reduction in their antifungal activity against the plant-pathogen fungi Rhizoctonia solani and Fusarium oxysporum.

Screening of antimicrobial activities in Trichoderma isolates representing three Trichoderma sections

by Juan Antonio Vizcaino and Enrique Monte

Mycological Research, 2005

Methanol extracts from 24 Trichoderma isolates, selected as biocontrol agents and representing different species and genotypes from three of the four taxonomic sections of this genus (T. sect. Trichoderma, T. sect. Pachybasium and T. sect. Longibrachiatum), were screened for antibacterial, anti-yeast and antifungal activities against a panel of seven bacteria, seven yeasts and six filamentous fungi previously used in similar studies. Two different growth media were tested (potato dextrose broth and CYS80), and all isolates included in the antimicrobial tests showed at least one inhibitory activity against one of the target microorganisms in one of the two culture media. No statistically significant differences were detected in the number of active strains between the two culture media, but the highest number of inhibitory strains against bacteria and fungi was found in strains from Trichoderma sect. Pachybasium, whereas strains from T. sect. Longibrachiatum showed the highest anti-yeast values. In all cases, a correlation was found between the strains that were active against yeasts and fungi. However, some degree of variability was detected for strains within the same taxonomic section. In general terms, strains from T. asperellum (mainly in CYS80 medium) and T. longibrachiatum gave the best non-enzymatic antimicrobial profiles.

Generation, annotation, and analysis of ESTs from four different Trichoderma strains grown under conditions related to biocontrol

by Juan Antonio Vizcaino and Enrique Monte

Applied Microbiology and Biotechnology, 2007

ProteomeXchange provides globally coordinated proteomics data submission and dissemination

by Salvador Martínez-Bartolomé, José A Dianes Santos, Gerhard Mayer, and Juan Antonio Vizcaino

Nature Biotechnology, 2014

There is a growing trend towards public dissemination of proteomics data, which is facilitating the assessment, reuse, comparative analyses and extraction of new findings from published data [1, 2]. This process has been mainly driven by journal publication guidelines and funding agencies. However, there is a need for better integration of public repositories and coordinated sharing of all the pieces of information needed to represent a full mass spectrometry (MS)-based proteomics experiment. Your July 2009 editorial "Credit where credit is overdue" [3] exposed the situation in the proteomics field, where full data disclosure is still not common practise. Olsen and Mann [4] identified different levels of information in the typical experiment, starting from raw data and going through peptide identification and quantification, protein identifications and ratios and the resulting biological conclusions. All of these levels should be captured and properly annotated in public databases, using the existing MS proteomics repositories for the MS data (raw data, identification and quantification results) and metadata, whereas the resulting biological information should be integrated in protein knowledgebases, such as UniProt [5]. A recent editorial in Nature Methods [6] again highlighted the need for a stable repository for raw MS proteomics data. In this Correspondence, we report on the first implementation of the ProteomeXchange consortium, an integrated framework for submission and dissemination of MS-based proteomics data.
