Consensus framework for exploring microarray data using multiple clustering methods

Ted Laderas, Shannon McWeeney

Research output: Contribution to journal (Article)

7 Citations (Scopus)

Abstract

The large variety of clustering algorithms and their variants can be daunting to researchers wishing to explore patterns within their microarray datasets. Furthermore, each clustering method has distinct biases in finding patterns within the data, and clusterings may not be reproducible across different algorithms. A consensus approach utilizing multiple algorithms can show where the various methods agree and expose robust patterns within the data. In this paper, we present a software package - Consense, written for R/Bioconductor - that utilizes such an approach to explore microarray datasets. Consense produces clustering results for each of the clustering methods and produces a report of metrics comparing the individual clusterings. A feature of Consense is identification of genes that cluster consistently with an index gene across methods. Utilizing simulated microarray data, sensitivity of the metrics to the biases of the different clustering algorithms is explored. The framework is easily extensible, allowing this tool to be used with other functional genomic data types, as well as other high-throughput OMICS data types generated from metabolomic and proteomic experiments. It also provides a flexible environment to benchmark new clustering algorithms. Consense is currently available as an installable R/Bioconductor package (http://www.ohsucancer.com/isrdev/consense/).
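As a rough illustration of the consensus idea in the abstract, the sketch below compares clusterings produced by several methods and lists the genes that share a cluster with an index gene under every method. It is not Consense's API (the package itself is written for R/Bioconductor): the function names, the toy labels, and the choice of the Rand index as the agreement metric are all assumptions made for this example, since the abstract does not name the metrics it reports.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of gene pairs that two clusterings treat the same way
    (both place the pair together, or both place it apart).
    The Rand index is an assumed stand-in for the paper's metrics."""
    genes = sorted(labels_a)
    agree = 0
    total = 0
    for g, h in combinations(genes, 2):
        same_a = labels_a[g] == labels_a[h]
        same_b = labels_b[g] == labels_b[h]
        agree += same_a == same_b
        total += 1
    return agree / total


def consistent_partners(clusterings, index_gene):
    """Genes that share a cluster with index_gene in every clustering."""
    partners = None
    for labels in clusterings.values():
        same = {
            g for g, c in labels.items()
            if c == labels[index_gene] and g != index_gene
        }
        partners = same if partners is None else partners & same
    return partners or set()


# Hypothetical cluster labels for four genes under three methods.
clusterings = {
    "kmeans": {"g1": 0, "g2": 0, "g3": 1, "g4": 1},
    "hierarchical": {"g1": "a", "g2": "a", "g3": "a", "g4": "b"},
    "som": {"g1": 1, "g2": 1, "g3": 2, "g4": 2},
}

# Pairwise agreement between methods.
for m1, m2 in combinations(clusterings, 2):
    print(m1, m2, rand_index(clusterings[m1], clusterings[m2]))

# Genes that cluster with g1 regardless of method.
print(consistent_partners(clusterings, "g1"))
```

Cluster labels only need to be comparable within a single method, so each algorithm can use its own naming scheme; the comparison works on gene pairs rather than on the labels themselves.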

Original language: English (US)
Pages (from-to): 116-128
Number of pages: 13
Journal: OMICS: A Journal of Integrative Biology
Volume: 11
Issue number: 1
DOI: 10.1089/omi.2006.0008
State: Published - 2007

Fingerprint

Microarrays
Clustering algorithms
Cluster Analysis
Genes
Software packages
Throughput
Benchmarking
Metabolomics
Multigene Family
Experiments
Software
Research Personnel

ASJC Scopus subject areas

  • Biotechnology
  • Genetics

Cite this

Consensus framework for exploring microarray data using multiple clustering methods. / Laderas, Ted; McWeeney, Shannon.

In: OMICS: A Journal of Integrative Biology, Vol. 11, No. 1, 2007, p. 116-128.


@article{191417fc1259494fb7e95a4ef945465f,
title = "Consensus framework for exploring microarray data using multiple clustering methods",
abstract = "The large variety of clustering algorithms and their variants can be daunting to researchers wishing to explore patterns within their microarray datasets. Furthermore, each clustering method has distinct biases in finding patterns within the data, and clusterings may not be reproducible across different algorithms. A consensus approach utilizing multiple algorithms can show where the various methods agree and expose robust patterns within the data. In this paper, we present a software package - Consense, written for R/Bioconductor - that utilizes such an approach to explore microarray datasets. Consense produces clustering results for each of the clustering methods and produces a report of metrics comparing the individual clusterings. A feature of Consense is identification of genes that cluster consistently with an index gene across methods. Utilizing simulated microarray data, sensitivity of the metrics to the biases of the different clustering algorithms is explored. The framework is easily extensible, allowing this tool to be used with other functional genomic data types, as well as other high-throughput OMICS data types generated from metabolomic and proteomic experiments. It also provides a flexible environment to benchmark new clustering algorithms. Consense is currently available as an installable R/Bioconductor package (http://www.ohsucancer.com/isrdev/consense/).",
author = "Ted Laderas and Shannon McWeeney",
year = "2007",
doi = "10.1089/omi.2006.0008",
language = "English (US)",
volume = "11",
pages = "116--128",
journal = "OMICS: A Journal of Integrative Biology",
issn = "1536-2310",
publisher = "Mary Ann Liebert Inc.",
number = "1",
}

TY - JOUR

T1 - Consensus framework for exploring microarray data using multiple clustering methods

AU - Laderas, Ted

AU - McWeeney, Shannon

PY - 2007

Y1 - 2007

N2 - The large variety of clustering algorithms and their variants can be daunting to researchers wishing to explore patterns within their microarray datasets. Furthermore, each clustering method has distinct biases in finding patterns within the data, and clusterings may not be reproducible across different algorithms. A consensus approach utilizing multiple algorithms can show where the various methods agree and expose robust patterns within the data. In this paper, we present a software package - Consense, written for R/Bioconductor - that utilizes such an approach to explore microarray datasets. Consense produces clustering results for each of the clustering methods and produces a report of metrics comparing the individual clusterings. A feature of Consense is identification of genes that cluster consistently with an index gene across methods. Utilizing simulated microarray data, sensitivity of the metrics to the biases of the different clustering algorithms is explored. The framework is easily extensible, allowing this tool to be used with other functional genomic data types, as well as other high-throughput OMICS data types generated from metabolomic and proteomic experiments. It also provides a flexible environment to benchmark new clustering algorithms. Consense is currently available as an installable R/Bioconductor package (http://www.ohsucancer.com/isrdev/consense/).

AB - The large variety of clustering algorithms and their variants can be daunting to researchers wishing to explore patterns within their microarray datasets. Furthermore, each clustering method has distinct biases in finding patterns within the data, and clusterings may not be reproducible across different algorithms. A consensus approach utilizing multiple algorithms can show where the various methods agree and expose robust patterns within the data. In this paper, we present a software package - Consense, written for R/Bioconductor - that utilizes such an approach to explore microarray datasets. Consense produces clustering results for each of the clustering methods and produces a report of metrics comparing the individual clusterings. A feature of Consense is identification of genes that cluster consistently with an index gene across methods. Utilizing simulated microarray data, sensitivity of the metrics to the biases of the different clustering algorithms is explored. The framework is easily extensible, allowing this tool to be used with other functional genomic data types, as well as other high-throughput OMICS data types generated from metabolomic and proteomic experiments. It also provides a flexible environment to benchmark new clustering algorithms. Consense is currently available as an installable R/Bioconductor package (http://www.ohsucancer.com/isrdev/consense/).

UR - http://www.scopus.com/inward/record.url?scp=34247250323&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=34247250323&partnerID=8YFLogxK

U2 - 10.1089/omi.2006.0008

DO - 10.1089/omi.2006.0008

M3 - Article

C2 - 17411399

AN - SCOPUS:34247250323

VL - 11

SP - 116

EP - 128

JO - OMICS: A Journal of Integrative Biology

JF - OMICS: A Journal of Integrative Biology

SN - 1536-2310

IS - 1

ER -