Prediction of epigenetically regulated genes in breast cancer cell lines

Leandro A. Loss, Anguraj Sadanandam, Steffen Durinck, Shivani Nautiyal, Diane Flaucher, Victoria E H Carlton, Martin Moorhead, Yontao Lu, Joe Gray, Malek Faham, Paul Spellman, Bahram Parvin

Research output: Contribution to journal › Article

28 Citations (Scopus)

Abstract

Background: Methylation of CpG islands within DNA promoter regions is one mechanism that leads to aberrant gene expression in cancer. In particular, the abnormal methylation of CpG islands may silence associated genes. Therefore, using high-throughput microarrays to measure CpG island methylation will lead to a better understanding of tumor pathobiology and progression, while revealing potentially new biomarkers. We have examined a recently developed high-throughput technology for measuring genome-wide methylation patterns called mTACL. Here, we propose a computational pipeline for integrating gene expression and CpG island methylation profiles to identify epigenetically regulated genes for a panel of 45 breast cancer cell lines, which is widely used in the Integrative Cancer Biology Program (ICBP). The pipeline (i) reduces the dimensionality of the methylation data, (ii) associates the reduced methylation data with gene expression data, and (iii) ranks methylation-expression associations according to their epigenetic regulation. Dimensionality reduction is performed in two steps: (i) methylation sites are grouped across the genome to identify regions of interest, and (ii) methylation profiles are clustered within each region. Associations between the clustered methylation and the gene expression data sets generate candidate matches within a fixed neighborhood around each gene. Finally, the methylation-expression associations are ranked through a logistic regression, and their significance is quantified through permutation analysis.

Results: Our two-step dimensionality reduction compressed the original data by approximately 90%, reducing 137,688 methylation sites to 14,505 clusters. Methylation-expression associations produced 18,312 correspondences, which were used to further analyze epigenetic regulation. Logistic regression was used to identify 58 genes from these correspondences that showed a statistically significant negative correlation between methylation profiles and gene expression in the panel of breast cancer cell lines. Subnetwork enrichment of these genes has identified 35 common regulators with 6 or more predicted markers. In addition to identifying epigenetically regulated genes, we show evidence of differentially expressed methylation patterns between the basal and luminal subtypes.

Conclusions: Our results indicate that the proposed computational protocol is a viable platform for identifying epigenetically regulated genes. Our protocol has generated a list of predictors including COL1A2, TOP2A, TFF1, and VAV3, genes whose key roles in epigenetic regulation are documented in the literature. Subnetwork enrichment of these predicted markers further suggests that epigenetic regulation of individual genes occurs in a coordinated fashion and through common regulators.
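
The three pipeline stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the gap and window sizes, and the toy data are all hypothetical, and the logistic-regression ranking is replaced here by a plain Pearson correlation so the example stays within the Python standard library. It shows (a) grouping methylation sites into regions by genomic gap, (b) keeping regions inside a fixed neighborhood of a gene, and (c) a one-sided permutation test for a negative methylation-expression correlation.

```python
import random
from statistics import mean


def group_sites(positions, max_gap):
    """Group genomic positions into regions; a new region starts whenever
    the gap to the previous site exceeds max_gap (hypothetical threshold)."""
    regions = []
    for pos in sorted(positions):
        if regions and pos - regions[-1][-1] <= max_gap:
            regions[-1].append(pos)
        else:
            regions.append([pos])
    return regions


def regions_near_gene(regions, gene_start, window):
    """Keep regions with at least one site within `window` bp of the gene start."""
    return [r for r in regions if any(abs(p - gene_start) <= window for p in r)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def permutation_p_value(methylation, expression, n_perm=1000, seed=0):
    """One-sided p-value: how often a shuffled expression vector is at least
    as negatively correlated with methylation as the observed one."""
    rng = random.Random(seed)
    observed = pearson(methylation, expression)
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pearson(methylation, shuffled) <= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy data (invented for illustration only).
positions = [100, 150, 180, 5000, 5050, 20000]
regions = group_sites(positions, max_gap=200)
nearby = regions_near_gene(regions, gene_start=5200, window=500)

methylation = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]   # one value per cell line
expression = [1.0, 2.0, 2.5, 6.0, 7.5, 9.0]
p = permutation_p_value(methylation, expression)

# Compression reported in the abstract: 137,688 sites reduced to 14,505 clusters.
reduction = 1 - 14505 / 137688

print(regions, nearby, p, round(reduction, 4))
```

The last line checks the reported compression figure: 14,505 clusters out of 137,688 sites is a reduction of roughly 89.5%, consistent with the "90%" stated in the abstract.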

Original language: English (US)
Article number: 305
Journal: BMC Bioinformatics
Volume: 11
DOIs: https://doi.org/10.1186/1471-2105-11-305
State: Published - Jun 4 2010
Externally published: Yes

Fingerprint

Methylation
Breast Cancer
Genes
Cells
Breast Neoplasms
Gene
Cell Line
Line
Prediction
Cell
CpG Islands
Gene expression
Gene Expression
Epigenomics
Dimensionality Reduction
Logistic Regression
Gene Expression Data
Regulator
High Throughput
Cancer

ASJC Scopus subject areas

  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Structural Biology
  • Applied Mathematics

Cite this

Loss, L. A., Sadanandam, A., Durinck, S., Nautiyal, S., Flaucher, D., Carlton, V. E. H., ... Parvin, B. (2010). Prediction of epigenetically regulated genes in breast cancer cell lines. BMC Bioinformatics, 11, [305]. https://doi.org/10.1186/1471-2105-11-305

Prediction of epigenetically regulated genes in breast cancer cell lines. / Loss, Leandro A.; Sadanandam, Anguraj; Durinck, Steffen; Nautiyal, Shivani; Flaucher, Diane; Carlton, Victoria E H; Moorhead, Martin; Lu, Yontao; Gray, Joe; Faham, Malek; Spellman, Paul; Parvin, Bahram.

In: BMC Bioinformatics, Vol. 11, 305, 04.06.2010.

Loss, LA, Sadanandam, A, Durinck, S, Nautiyal, S, Flaucher, D, Carlton, VEH, Moorhead, M, Lu, Y, Gray, J, Faham, M, Spellman, P & Parvin, B 2010, 'Prediction of epigenetically regulated genes in breast cancer cell lines', BMC Bioinformatics, vol. 11, 305. https://doi.org/10.1186/1471-2105-11-305
Loss LA, Sadanandam A, Durinck S, Nautiyal S, Flaucher D, Carlton VEH et al. Prediction of epigenetically regulated genes in breast cancer cell lines. BMC Bioinformatics. 2010 Jun 4;11:305. https://doi.org/10.1186/1471-2105-11-305
Loss, Leandro A. ; Sadanandam, Anguraj ; Durinck, Steffen ; Nautiyal, Shivani ; Flaucher, Diane ; Carlton, Victoria E H ; Moorhead, Martin ; Lu, Yontao ; Gray, Joe ; Faham, Malek ; Spellman, Paul ; Parvin, Bahram. / Prediction of epigenetically regulated genes in breast cancer cell lines. In: BMC Bioinformatics. 2010 ; Vol. 11.
@article{33b8cc766f9b4e12bf7b22160f9a3401,
title = "Prediction of epigenetically regulated genes in breast cancer cell lines",
author = "Loss, {Leandro A.} and Anguraj Sadanandam and Steffen Durinck and Shivani Nautiyal and Diane Flaucher and Carlton, {Victoria E H} and Martin Moorhead and Yontao Lu and Joe Gray and Malek Faham and Paul Spellman and Bahram Parvin",
year = "2010",
month = "6",
day = "4",
doi = "10.1186/1471-2105-11-305",
language = "English (US)",
volume = "11",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",

}

TY - JOUR

T1 - Prediction of epigenetically regulated genes in breast cancer cell lines

AU - Loss, Leandro A.

AU - Sadanandam, Anguraj

AU - Durinck, Steffen

AU - Nautiyal, Shivani

AU - Flaucher, Diane

AU - Carlton, Victoria E H

AU - Moorhead, Martin

AU - Lu, Yontao

AU - Gray, Joe

AU - Faham, Malek

AU - Spellman, Paul

AU - Parvin, Bahram

PY - 2010/6/4

Y1 - 2010/6/4

UR - http://www.scopus.com/inward/record.url?scp=77953007415&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77953007415&partnerID=8YFLogxK

U2 - 10.1186/1471-2105-11-305

DO - 10.1186/1471-2105-11-305

M3 - Article

C2 - 20525369

AN - SCOPUS:77953007415

VL - 11

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

M1 - 305

ER -