2008
Microarray-based transcription profiling is now a consolidated methodology with widespread use in areas such as pharmacogenomics, diagnostics and drug target identification. Large-scale microarray studies are also becoming crucial to a new way of conceiving experimental biology. A main issue in microarray transcription profiling is data analysis and mining. When microarrays became a methodology of general use, considerable effort was made to produce algorithms and methods for the identification of differentially expressed genes. More recently, the focus has switched to algorithms and database development for microarray data mining. Furthermore, the evolution of microarray technology is allowing researchers to grasp the regulative nature of transcription, integrating basic expression analysis with mRNA characteristics, i.e. exon-based arrays, and with DNA characteristics, i.e. comparative genomic hybridization, single nucleotide polymorphism, tiling and promoter structure. In this article, we review approaches used to detect differentially expressed genes and to link differential expression to specific biological functions.
Briefings in functional genomics & proteomics, 2007
Puerto Rico health sciences journal, 2009
DNA microarray is a technology that simultaneously provides quantitative measurements of the expression of thousands of genes. DNA microarrays have been used to assess gene expression between groups of cells of different organs or different populations. In order to understand the role and function of the genes, one needs complete information about their mRNA transcripts and proteins. Unfortunately, exploring protein function is very difficult because of the complicated and unique three-dimensional structure of proteins. To overcome this difficulty, one may concentrate on the mRNA molecules produced by gene expression. In this paper, we describe some of the methods for preprocessing gene expression data and for pairwise comparison in genomic experiments. Previous studies assessing the efficiency of different methods for pairwise comparisons have found little agreement in the lists of significant genes. Finally, we describe the procedures to control false discovery rates, sample size ...
International Journal of Biotech Trends and Technology
Genes contain the blueprint of a living organism. Malfunctions in cellular life are indicated by proteins, which are responsible for the behavior of genes. A fixed set of genes determines the behavior and functioning of cells, guiding them on what to do and when. Gaining insight into biological activity therefore requires the analysis of gene expression. Advanced technologies such as microarrays play an important role in gene analysis because they capture the expression of thousands of genes under different conditions simultaneously. Out of thousands of genes, very few behave differently; these are called Differentially Expressing Genes (DEGs). Identifying these most significant genes is a crucial task in molecular biology and a major area of research for bioinformaticians, because DEGs are the major source of information for disease prediction. They help in planning therapeutic strategies for a disease through the Gene Regulatory Network (GRN) constructed from them. A GRN is a graphical representation with genes as nodes and the regulatory interactions among them as edges. A GRN helps to reveal the changes among genes involved in causing genetic diseases and to analyze their response to different stress conditions through microarray expression data. In this paper we discuss many methods proposed by researchers for identifying differentially expressing genes based on changes in their expression patterns.
BMC Bioinformatics, 2007
Background: This paper presents a unified framework for finding differentially expressed genes (DEGs) from microarray data. The proposed framework has three interrelated modules: (i) gene ranking, (ii) significance analysis of genes and (iii) validation. The first module uses two gene selection algorithms, namely, (a) two-way clustering and (b) combined adaptive ranking, to rank the genes. The second module converts the gene ranks into p-values using an R-test and fuses the two sets of p-values using Fisher's omnibus criterion. The DEGs are selected using FDR analysis. The third module performs a three-fold validation of the obtained DEGs. The robustness of the proposed unified framework in gene selection is first illustrated using false discovery rate analysis. In addition, the clustering-based validation of the DEGs is performed by employing an adaptive subspace-based clustering algorithm on the training and the test datasets. Finally, a projection-based visualization is performed to validate the DEGs obtained using the unified framework. Results: The performance of the unified framework is compared with well-known ranking algorithms such as t-statistics, Significance Analysis of Microarrays (SAM), Adaptive Ranking, Combined Adaptive Ranking and Two-way Clustering. The performance curves obtained using 50 simulated microarray datasets, each following two different distributions, indicate the superiority of the unified framework over the other reported algorithms. Further analyses on 3 real cancer datasets and 3 Parkinson's datasets show a similar improvement in performance. First, a three-fold validation process is provided for the two-sample cancer datasets. In addition, the analysis on 3 sets of Parkinson's data is performed to demonstrate the scalability of the proposed method to multi-sample microarray datasets.
Conclusion: This paper presents a unified framework for the robust selection of genes from the two-sample as well as multi-sample microarray experiments. Two different ranking methods used in module 1 bring diversity in the selection of genes. The conversion of ranks to p-values, the fusion of p-values and FDR analysis aid in the identification of significant genes which cannot be judged based on gene ranking alone. The 3 fold validation, namely, robustness in selection of genes using FDR analysis, clustering, and visualization demonstrate the relevance of the DEGs. Empirical analyses on 50 artificial datasets and 6 real microarray datasets illustrate the efficacy of the proposed approach. The analyses on 3 cancer datasets demonstrate the utility of the proposed approach on microarray datasets with two classes of samples. The scalability of the proposed unified approach to multi-sample (more than two sample classes) microarray datasets is addressed using three sets of Parkinson's Data. Empirical analyses show that the unified framework outperformed other gene selection methods in selecting differentially expressed genes from microarray data.
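The p-value fusion step mentioned in this abstract can be illustrated with a minimal sketch of Fisher's method for combining independent p-values. This is not the authors' implementation: the function name `fisher_combined_pvalue` is hypothetical, and the sketch assumes the p-values being combined are independent. It relies on the fact that the statistic -2 Σ ln p follows a chi-square distribution with 2k degrees of freedom, whose tail probability has a closed form for even degrees of freedom.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine k independent p-values with Fisher's method (hypothetical helper).

    The statistic X = -2 * sum(ln p_i) is chi-square with 2k degrees of
    freedom under the null; for even degrees of freedom the upper tail is
    exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!.
    """
    if not pvalues or any(p <= 0.0 or p > 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # this is X / 2
    term = 1.0   # (X/2)^i / i! for i = 0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


# Example: fuse the p-values that two ranking methods assign to one gene.
combined = fisher_combined_pvalue([0.01, 0.02])
```

In a pipeline like the one described, the combined p-values for all genes would then be passed to an FDR procedure to select the DEGs.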
2011
With the development of DNA microarray technology, scientists can now measure the expression levels of thousands of genes (features or genomic biomarkers) simultaneously in one single experiment. Robust and accurate gene selection methods are required to ...
Physiological Genomics, 2006
DNA microarray represents a powerful tool in biomedical discoveries. Harnessing the potential of this technology depends on the development and appropriate use of data mining and statistical tools. Significant current advances have made microarray data mining more versatile. Researchers are no longer limited to default choices that generate suboptimal results. Conflicting results in repeated experiments can be resolved through attention to the statistical details. In the current dynamic environment, there are many choices and potential pitfalls for researchers who intend to incorporate microarrays as a research tool. This review is intended to provide a simple framework to understand the choices and identify the pitfalls. Specifically, this review article discusses the choice of microarray platform, preprocessing raw data, differential expression and validation, clustering, annotation and functional characterization of genes, and pathway construction in light of emergent concepts an...
Advances in Data …, 2010
Several different algorithms have been published for the identification of differentially expressed genes in DNA microarray experiments. Such algorithms produce ordered lists of genes. To compare the performance of these algorithms, established measurements from Information Retrieval are proposed. A benchmark data set with known properties is generated and published. This benchmark data is used to compare the performance of different algorithms with a new algorithm, called PUL. Surprisingly, a clear ordering in the performance of the algorithms was observed. PUL outperformed other algorithms by a factor of two. PUL was applied successfully in different practical applications. For these experiments, the importance of the genes identified by PUL was independently verified.
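As an illustration of how Information Retrieval measures apply to an ordered gene list, the sketch below computes precision and recall over the top k genes of a ranking against a set of genes known to be differentially expressed. The abstract does not say which measures were used, so precision and recall at k are an assumption here, and the function name and gene identifiers are hypothetical.

```python
def precision_recall_at_k(ranked_genes, true_degs, k):
    """Precision and recall of the top-k genes of a ranked list.

    ranked_genes: gene identifiers ordered from most to least significant.
    true_degs: set of genes known to be differentially expressed
               (for example, from a benchmark with known properties).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_genes[:k]
    hits = sum(1 for gene in top_k if gene in true_degs)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(true_degs) if true_degs else 0.0
    return precision, recall


# Hypothetical ranking produced by some algorithm, with a known truth set.
ranking = ["g1", "g2", "g3", "g4"]
truth = {"g1", "g3", "g9"}
precision, recall = precision_recall_at_k(ranking, truth, k=2)
```

Sweeping k over the length of the list yields a precision-recall curve that can be compared across algorithms.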
Molecular Microbiology, 2003
Here, we review briefly the sources of experimental and biological variance that affect the interpretation of high-dimensional DNA microarray experiments. We discuss methods using a regularized t -test based on a Bayesian statistical framework that allow the identification of differentially regulated genes with a higher level of confidence than a simple t -test when only a few experimental replicates are available. We also describe a computational method for calculating the global false-positive and false-negative levels inherent in a DNA microarray data set. This method provides a probability of differential expression for each gene based on experiment-wide false-positive and -negative levels driven by experimental error and biological variance.
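The idea behind a regularized t-test can be sketched as follows: with few replicates, the per-gene sample variance is unstable, so it is shrunk toward a prior variance before the t-statistic is formed. The code below is a simplified illustration, not the exact Bayesian formulation reviewed in the paper. The weighted-average form of the variance, the parameter names `prior_var` and `prior_df`, and the function names are all assumptions; in practice the prior variance is often estimated from genes with similar expression levels.

```python
import math
import statistics


def regularized_variance(values, prior_var, prior_df):
    """Shrink the sample variance toward prior_var.

    prior_df acts as a number of pseudo-observations: 0 gives the ordinary
    sample variance, large values give prior_var. (Simplified form, assumed
    for illustration.)
    """
    n = len(values)
    sample_var = statistics.variance(values)
    return (prior_df * prior_var + (n - 1) * sample_var) / (prior_df + n - 1)


def regularized_t(group_a, group_b, prior_var, prior_df):
    """Welch-style t-statistic computed with regularized variances."""
    var_a = regularized_variance(group_a, prior_var, prior_df)
    var_b = regularized_variance(group_b, prior_var, prior_df)
    std_err = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / std_err


# Three replicates per condition for one gene (made-up log-expression values).
t_plain = regularized_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0], prior_var=1.0, prior_df=0)
t_shrunk = regularized_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0], prior_var=4.0, prior_df=2)
```

With `prior_df=0` the statistic reduces to the ordinary unequal-variance t-statistic; a larger prior variance pulls the statistic toward zero, which guards against genes that look significant only because their few replicates happen to agree closely.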
DNA microarray is a powerful technology that can simultaneously determine the levels of thousands of transcripts (generated, for example, from genes/miRNAs) across different experimental conditions or tissue samples. The goal of differential expression analysis is to identify the transcripts whose expression changes significantly across different types of samples or experimental conditions. A number of statistical testing methods are available for this purpose. In this article, we provide a comprehensive survey of different parametric and nonparametric testing methodologies for identifying differential expression from microarray datasets. The performance of the different testing methods has been compared on real-life miRNA and mRNA expression data sets. To validate the resulting differentially expressed miRNAs, the outcomes of each test are checked against the information available for miRNA in the standard miRNA database PhenomiR 2.0. Subsequently, we have prepared simulated datasets of different sample sizes (from 10 to 100 per group/population), and the power of each test has been calculated individually. The comparative simulation study may support robust and comprehensive judgements about the performance of each test on the basis of the assumed data distribution. Finally, a list of advantages and limitations of the different statistical tests is provided, along with indications of some areas where further studies are required.
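To make the parametric versus nonparametric distinction concrete, here is a minimal sketch of one nonparametric approach: an exact two-sided permutation test on the difference in group means, which makes no normality assumption. The survey does not necessarily include this specific test; it is chosen here only as a simple example, and the function name is hypothetical. Enumerating every split is feasible only for small groups.

```python
from itertools import combinations


def exact_permutation_pvalue(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in means.

    Enumerates every way of splitting the pooled values into groups of the
    original sizes and counts the splits whose absolute mean difference is
    at least as large as the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)

    def abs_mean_diff(sum_a):
        return abs(sum_a / n_a - (total - sum_a) / n_b)

    observed = abs_mean_diff(sum(group_a))
    extreme = 0
    n_splits = 0
    for indices in combinations(range(len(pooled)), n_a):
        n_splits += 1
        # Small tolerance so ties with the observed split are counted.
        if abs_mean_diff(sum(pooled[i] for i in indices)) >= observed - 1e-12:
            extreme += 1
    return extreme / n_splits


# Made-up expression values for one transcript in two conditions.
p_value = exact_permutation_pvalue([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

With three samples per group there are only 20 possible splits, so the smallest attainable two-sided p-value is 0.1, which illustrates why sample size bounds the power of such tests.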
Bioinformation, 2011
Identification of genes differentially expressed across multiple conditions has become an important statistical problem in analyzing large-scale microarray data. Many statistical methods have been developed to address this challenging problem. An extensive comparison among these statistical methods is therefore extremely important for experimental scientists choosing a valid method for their data analysis. In this study, we conducted simulation studies to compare six statistical methods: the Bonferroni (B-) procedure, the Benjamini and Hochberg (BH-) procedure, the Local false discovery rate (Localfdr) method, the Optimal Discovery Procedure (ODP), the Ranking Analysis of F-statistics (RAF), and the Significance Analysis of Microarrays (SAM) in identifying differentially expressed genes. We demonstrated that the strength of treatment effect, the sample size, proportion of differentially expressed genes and variance of gene expression will significantly affect the performance of ...
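Two of the procedures named in this abstract, Bonferroni and Benjamini-Hochberg, are simple enough to sketch directly. The code below computes adjusted p-values for each; a gene is called significant when its adjusted value falls below the chosen threshold. The function names are hypothetical, and this is an illustrative sketch rather than the code used in the study.

```python
def bonferroni(pvalues):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    The i-th smallest p-value is scaled by m / i, then a running minimum is
    taken from the largest rank downward so adjusted values stay monotone.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values for four genes.
raw = [0.01, 0.04, 0.03, 0.005]
bonf = bonferroni(raw)
bh = benjamini_hochberg(raw)
significant_bh = [i for i, q in enumerate(bh) if q <= 0.05]
```

On these four values, Bonferroni at 0.05 retains only two genes while Benjamini-Hochberg retains all four, reflecting the difference between controlling the family-wise error rate and controlling the false discovery rate.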