Gene set analyses for interpreting microarray experiments on prokaryotic organisms

Nathan L. Tintle, Aaron A. Best, Matthew DeJongh, Dirk Van Bruggen, Fred Heffron, Steffen Porwollik, Ronald C. Taylor

    Research output: Contribution to journal › Article

    11 Citations (Scopus)

    Abstract

    Background: Despite the widespread use of DNA microarrays, questions remain about how best to interpret the wealth of gene-by-gene transcriptional levels that they measure. Recently, methods have been proposed that use biologically defined sets of genes in interpretation, instead of examining results gene by gene. Despite a serious limitation, a method based on Fisher's exact test remains one of the few plausible options for gene set analysis when an experiment has few replicates, as is typically the case for prokaryotes.

    Results: We extend five methods of gene set analysis, originally designed for experiments with multiple replicates, to experiments with few replicates. We then use simulated and real data to compare these methods with each other and with the Fisher's exact test (FET) method. In simulation, we find that a method named MAXMEAN-NR maintains the nominal rate of false positive findings (type I error rate) while offering good statistical power and robustness to a variety of gene set distributions for set sizes of at least 10. Other methods (ABSSUM-NR or SUM-NR) are shown to be powerful for set sizes of less than 10. Analysis of three sets of experimental data shows similar results. Furthermore, the MAXMEAN-NR method is shown to detect biologically relevant sets as significant when other methods (including FET) cannot. We also find that the popular GSEA-NR method performs poorly when compared to MAXMEAN-NR.

    Conclusion: MAXMEAN-NR is a method of gene set analysis for experiments with few replicates, as is common for prokaryotes. Results of simulation and real data analysis suggest that the MAXMEAN-NR method offers increased robustness and biological relevance of findings compared to FET and other methods, while maintaining the nominal type I error rate.
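The abstract names two ideas that a short sketch may help make concrete: the Fisher's exact test (FET) approach to gene set analysis, and a maxmean-style set statistic. The Python below is only an illustration under stated assumptions, not the authors' implementation. The abstract does not define the "-NR" variants, so the code uses the commonly cited maxmean statistic (the larger of the mean positive part and the mean negative part of the per-gene scores) and, as an assumption, a null distribution built by drawing random gene sets of the same size. All function names and parameters are hypothetical.

```python
import math
import random


def fisher_enrichment_p(n_total, n_sig, set_size, set_sig):
    """One-sided FET p-value for a gene set.

    n_total:  genes on the array
    n_sig:    genes called "significant" by some cutoff (the cutoff is an assumption)
    set_size: genes in the set
    set_sig:  significant genes in the set
    Returns P(X >= set_sig) under the hypergeometric null.
    """
    upper = min(n_sig, set_size)
    denom = math.comb(n_total, set_size)
    tail = sum(
        math.comb(n_sig, k) * math.comb(n_total - n_sig, set_size - k)
        for k in range(set_sig, upper + 1)
    )
    return tail / denom


def maxmean(scores):
    """Maxmean statistic: max of mean positive part and mean negative part."""
    n = len(scores)
    pos = sum(max(s, 0.0) for s in scores) / n
    neg = sum(max(-s, 0.0) for s in scores) / n
    return max(pos, neg)


def gene_permutation_p(all_scores, set_indices, n_perm=2000, seed=0):
    """Empirical p-value for a set's maxmean score.

    Assumption: the null is approximated by random gene sets of the same
    size drawn from all_scores; the paper's actual procedure may differ.
    """
    rng = random.Random(seed)
    observed = maxmean([all_scores[i] for i in set_indices])
    k = len(set_indices)
    hits = 0
    for _ in range(n_perm):
        if maxmean(rng.sample(all_scores, k)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

The contrast in the sketch is that FET needs a per-gene significance cutoff before any set can be tested, whereas a score-based statistic such as maxmean works directly on continuous per-gene scores (for example, log ratios).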

    Original language: English (US)
    Article number: 469
    Journal: BMC Bioinformatics
    Volume: 9
    DOI: https://doi.org/10.1186/1471-2105-9-469
    State: Published - Nov 5 2008

    Fingerprint

    Microarray Analysis
    Microarrays
    Microarray
    Genes
    Gene
    Experiment
    Fisher's Exact Test
    Experiments
    Type I Error Rate
    Categorical or nominal
    Robustness
    Statistical Power
    DNA Microarray
    DNA
    Oligonucleotide Array Sequence Analysis
    False Positive
    Data analysis
    Simulation
    Experimental Data

    ASJC Scopus subject areas

    • Biochemistry
    • Molecular Biology
    • Computer Science Applications
    • Structural Biology
    • Applied Mathematics

    Cite this

    Tintle, N. L., Best, A. A., DeJongh, M., Van Bruggen, D., Heffron, F., Porwollik, S., & Taylor, R. C. (2008). Gene set analyses for interpreting microarray experiments on prokaryotic organisms. BMC Bioinformatics, 9, [469]. https://doi.org/10.1186/1471-2105-9-469

    @article{c91519e019c5488aa882a87051a1fc45,
    title = "Gene set analyses for interpreting microarray experiments on prokaryotic organisms",
    author = "Tintle, Nathan L. and Best, Aaron A. and DeJongh, Matthew and {Van Bruggen}, Dirk and Heffron, Fred and Porwollik, Steffen and Taylor, Ronald C.",
    year = "2008",
    month = "11",
    day = "5",
    doi = "10.1186/1471-2105-9-469",
    language = "English (US)",
    volume = "9",
    journal = "BMC Bioinformatics",
    issn = "1471-2105",
    publisher = "BioMed Central",

    }
