Pan-cancer analysis of neoepitopes

2018, Scientific Reports

Somatic variations are frequent and important drivers in cancers. Amino acid substitutions can yield neoantigens that are detected by the immune system and can lead to an immune response and tumor rejection. Although neoantigen load and occurrence have been widely studied, a detailed pan-cancer analysis of the occurrence and characterization of neoepitopes is missing. We investigated the proteome-wide amino acid substitutions in 8-, 9-, 10-, and 11-mer peptides in 30 cancer types with the NetMHC 4.0 software. Of the predicted 8-, 9-, 10-, and 11-mer peptides, 11,316,078 (0.24%) were highly likely neoepitope candidates, and they were derived from 95.44% of human proteins. Binding affinity to MHC molecules is just one of the many epitope features. The most likely epitopes are those detected by several MHCs and across several peptide lengths. Among the high-binding neoantigens, 9-mer peptides are the most common. Of all variants, 0.17% yield more than 100 neoepitopes and are considered the best candidates for any application. Amino acid distributions indicate that variants at all positions in neoepitopes of any length are, on average, more hydrophobic than the wild-type residues. We characterized properties of neoepitopes in 30 cancer types and estimated the likely numbers of tumor-derived epitopes that could induce an immune response. The excess of hydrophobic residues relative to the wild-type sequences, seen at all positions in neoepitopes of all lengths, implies that hydropathy is an important property of neoepitopes. The neoepitope characteristics can be employed for various applications, including targeted cancer vaccine development for precision medicine.

The task of the immune system is to detect and destroy foreign molecules and organisms. This is achieved by the numerous mechanisms and processes that form the innate and adaptive arms of the immune system.
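The peptide sets analyzed here, 8- to 11-mers that contain an amino acid substitution, can be illustrated with a minimal sketch. It assumes a single substitution at a known 0-based position in a protein sequence; the function name `variant_windows` and its parameters are hypothetical and are not taken from the study or from NetMHC. The sketch only enumerates wild-type and variant peptide pairs; predicting MHC binding for them is a separate step that is not shown.

```python
from typing import Iterator, Tuple


def variant_windows(
    protein: str,
    pos: int,
    alt: str,
    lengths: Tuple[int, ...] = (8, 9, 10, 11),
) -> Iterator[Tuple[str, str]]:
    """Yield (wild_type, variant) peptide pairs that cover position `pos`.

    `pos` is the 0-based index of the substituted residue and `alt` is the
    single-letter code of the replacing amino acid (illustrative inputs).
    """
    if not 0 <= pos < len(protein):
        raise ValueError("pos is outside the protein sequence")
    if len(alt) != 1:
        raise ValueError("alt must be a single residue")

    # Apply the substitution once, then slice both sequences identically.
    mutated = protein[:pos] + alt + protein[pos + 1:]

    for k in lengths:
        # A k-mer covers `pos` when its start lies in [pos - k + 1, pos],
        # clipped so that the window stays inside the protein.
        first = max(0, pos - k + 1)
        last = min(pos, len(protein) - k)
        for start in range(first, last + 1):
            yield protein[start:start + k], mutated[start:start + k]
```

For a substitution far from either terminus, a peptide of length k has k windows covering the variant position, so one substitution gives 8 + 9 + 10 + 11 = 38 candidate peptides in this sketch.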
Three complementary adaptive systems have evolved to recognize foreign materials. First, antibodies recognize and neutralize non-self molecules. Second, the major histocompatibility complexes (MHCs) I and II bind short fragments of foreign peptides and present them to T cells. Third, T cell receptors are produced by a recombination process similar to that of antibodies. The binding sites of these molecules are highly variable due to genetic recombination processes. Therefore, to prevent autoimmune diseases, it is essential that the immune system does not react against natural human molecules. Safeguards against self-reactivity and induced tolerance prevent this from happening. These mechanisms are still poorly understood. Recently, antigen-specific regulatory T cells were shown to be responsible for autoimmunity protection [1].
Variations accumulate during a lifetime. It has been estimated that in fibroblasts, B cells, and T cells, the mutation rate is 2-10 variations per diploid genome per cell division [2]. This means that normal cells can have from hundreds to several thousands of variations in comparison to the original genome of the individual [3]. In cancers, the variation rate can be much higher; for example, lung cancer cells typically contain over a million variants [4]. It is thus highly likely that cancer tissues include numerous immunogenic proteins, because substitutions in the DNA, the most abundant changes in cancers, can lead to amino acid substitutions (AASs) in proteins. Such immunogenic epitopes are called neoantigens. To use neoantigens for therapeutic purposes, numerous research projects aim at detecting cancer variant peptides for diagnosis and treatment, including vaccination. Although next-generation sequencing methods are efficient for sequencing and detecting variants in tumors, the translation to neoantigens is not straightforward. Neoantigen-based treatment would facilitate personalized medicine for cancer patients.
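The step from a DNA substitution to an amino acid substitution can be sketched with the standard genetic code. The example below is only an illustration: the function name `classify_substitution` and the category labels are hypothetical, and it assumes a single-base change within one codon given on the coding strand. A real variant-annotation pipeline also has to handle transcripts, strand, and splicing.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon: str, index: int, new_base: str):
    """Return (reference_aa, variant_aa, effect) for a one-base change.

    `codon` is a coding-strand DNA codon, `index` is the 0-based position
    of the changed base within it, and "*" denotes a stop codon.
    """
    codon = codon.upper()
    new_base = new_base.upper()
    if codon not in CODON_TABLE:
        raise ValueError("codon must be three DNA bases")
    if index not in (0, 1, 2) or new_base not in BASES:
        raise ValueError("invalid substitution")

    variant_codon = codon[:index] + new_base + codon[index + 1:]
    ref_aa = CODON_TABLE[codon]
    var_aa = CODON_TABLE[variant_codon]

    if ref_aa == var_aa:
        effect = "synonymous"   # no change in the protein
    elif var_aa == "*":
        effect = "nonsense"     # premature stop codon
    elif ref_aa == "*":
        effect = "stop_lost"    # stop codon replaced by an amino acid
    else:
        effect = "missense"     # an amino acid substitution (AAS)
    return ref_aa, var_aa, effect
```

In this sketch only the missense outcome corresponds to an AAS of the kind discussed here; for instance, changing the middle base of `GGT` to `T` replaces glycine with valine.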
In addition to the possibilities for treatment, neoantigens could be used for diagnosis, especially in the case of relapse. Numerous methods have been developed to predict the antigenicity of peptides, especially those binding to MHC type I molecules [5]. The performance of these tools varies [6,7] depending on the size and composition of the benchmark dataset used [8]. Despite intensive research, the number of experimentally defined epitopes is still relatively small [7,9], which affects the performance of the predictors. By combining the epitope predictions with experimental validation assays, the performance can be improved. NetMHC [10,11] is a predictor for epitopes and