Abstract
Cytosines at cytosine-guanine (CG) dinucleotides are the near-exclusive target of DNA methyltransferases in mammalian genomes. Spontaneous deamination of methylcytosine to thymine makes methylated cytosines unusually susceptible to mutation and consequent depletion. The loci where CG dinucleotides remain relatively enriched, presumably due to their unmethylated status during the germ cell cycle, have been referred to as CpG islands. Currently, CpG islands are solely defined by base compositional criteria, allowing annotation of any sequenced genome. Using a novel bioinformatic approach, we show that CG clusters can be identified as an inherent property of genomic sequence without imposing a base compositional a priori assumption. We also show that the CG clusters co-localize in the human genome with hypomethylated loci and annotated transcription start sites to a greater extent than annotations produced by prior CpG island definitions. Moreover, this new approach allows CG clusters to be identified in a species-specific manner, revealing a degree of orthologous conservation that is not revealed by current base compositional approaches. Finally, our approach is able to identify methylating genomes (such as Takifugu rubripes) that lack CG clustering entirely, in which it is inappropriate to annotate CpG islands or CG clusters.
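For context on the "base compositional criteria" mentioned above, the following is a minimal sketch of the conventional kind of test that the abstract contrasts its approach with. It is not the authors' method. The thresholds (length of at least 200 bp, G+C fraction of at least 0.5, observed/expected CG ratio of at least 0.6) are the commonly cited values and are an assumption here, since the abstract does not state them. The function names are hypothetical.

```python
def cpg_stats(seq):
    """Return (gc_fraction, obs_exp_cg) for a DNA sequence string."""
    s = seq.upper()
    n = len(s)
    if n == 0:
        return 0.0, 0.0
    c = s.count("C")
    g = s.count("G")
    cg = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n
    # Observed/expected CG ratio: observed CG count divided by the count
    # expected if C and G occurred independently, (C * G) / N.
    obs_exp_cg = (cg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp_cg


def meets_composition_criteria(seq, min_len=200, min_gc=0.5, min_oe=0.6):
    """Apply the assumed conventional thresholds to one candidate window."""
    gc_fraction, obs_exp_cg = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction >= min_gc and obs_exp_cg >= min_oe


if __name__ == "__main__":
    window = "CG" * 100  # toy 200 bp window, maximally CG-dense
    print(cpg_stats(window))
    print(meets_composition_criteria(window))
```

Because the thresholds are fixed in advance, a test of this kind applies the same cutoffs to every genome, which is the a priori assumption the abstract says its approach avoids.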
Field | Value |
---|---|
Original language | English (US) |
Pages (from-to) | 6798-6807 |
Number of pages | 10 |
Journal | Nucleic acids research |
Volume | 35 |
Issue number | 20 |
DOIs | 10.1093/nar/gkm489 |
State | Published - Nov 1 2007 |
Externally published | Yes |
ASJC Scopus subject areas
- Genetics
Cite this
CG dinucleotide clustering is a species-specific property of the genome. / Glass, Jacob L.; Thompson, Reid; Khulan, Batbayar; Figueroa, Maria E.; Olivier, Emmanuel N.; Oakley, Erin J.; Van Zant, Gary; Bouhassira, Eric E.; Melnick, Ari; Golden, Aaron; Fazzari, Melissa J.; Greally, John M.
In: Nucleic acids research, Vol. 35, No. 20, 01.11.2007, p. 6798-6807.
Research output: Contribution to journal › Article
TY - JOUR
T1 - CG dinucleotide clustering is a species-specific property of the genome
AU - Glass, Jacob L.
AU - Thompson, Reid
AU - Khulan, Batbayar
AU - Figueroa, Maria E.
AU - Olivier, Emmanuel N.
AU - Oakley, Erin J.
AU - Van Zant, Gary
AU - Bouhassira, Eric E.
AU - Melnick, Ari
AU - Golden, Aaron
AU - Fazzari, Melissa J.
AU - Greally, John M.
PY - 2007/11/1
Y1 - 2007/11/1
UR - http://www.scopus.com/inward/record.url?scp=36749017978&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=36749017978&partnerID=8YFLogxK
U2 - 10.1093/nar/gkm489
DO - 10.1093/nar/gkm489
M3 - Article
C2 - 17932072
AN - SCOPUS:36749017978
VL - 35
SP - 6798
EP - 6807
JO - Nucleic Acids Research
JF - Nucleic Acids Research
SN - 0305-1048
IS - 20
ER -