Motivation: Representations of the genome can be generated by selecting a subpopulation of restriction fragments using ligation-mediated PCR. Such representations form the basis for a number of high-throughput assays, including the HELP assay to study cytosine methylation. We find that HELP data analysis is complicated not only by PCR amplification heterogeneity but also by a complex and variable distribution of cytosine methylation. To address this, we created an analytical pipeline and a novel normalization approach that improve concordance between microarray-derived data and single-locus validation results, demonstrating the value of the analytical approach. Restriction fragment size is a major influence on PCR amplification, so the pipeline uses a quantile normalization approach that reduces the influence of fragment length on signal intensity. Here we describe all of the components of the pipeline, which can also be applied to data derived from other assays based on genomic representations.
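The idea of reducing the influence of fragment length through quantile normalization can be illustrated with a minimal sketch. It is not necessarily the pipeline's own implementation: the function name `quantile_normalize_by_size`, the equal-count binning by fragment size, the default of 10 bins, and the use of the pooled intensities as the reference distribution are all assumptions made for illustration.

```python
import numpy as np

def quantile_normalize_by_size(log_intensity, fragment_size, n_bins=10):
    """Hypothetical helper: make the intensity distribution of every
    fragment-size bin match the pooled distribution of all fragments."""
    log_intensity = np.asarray(log_intensity, dtype=float)
    fragment_size = np.asarray(fragment_size, dtype=float)

    # Group fragments into bins of roughly equal count, ordered by size
    # (assumed binning scheme).
    order = np.argsort(fragment_size, kind="stable")
    bins = np.array_split(order, n_bins)

    # Reference distribution: all intensities pooled and sorted, with each
    # value placed at the midpoint of its quantile interval.
    reference = np.sort(log_intensity)
    ref_pos = (np.arange(reference.size) + 0.5) / reference.size

    normalized = np.empty_like(log_intensity)
    for idx in bins:
        if idx.size == 0:
            continue
        values = log_intensity[idx]
        # Rank of each value within its own bin (0 = smallest).
        ranks = np.argsort(np.argsort(values, kind="stable"), kind="stable")
        quantiles = (ranks + 0.5) / idx.size
        # Replace each value by the reference value at the same quantile.
        normalized[idx] = np.interp(quantiles, ref_pos, reference)
    return normalized

# Simulated example: intensity rises with fragment size before normalization.
rng = np.random.default_rng(0)
size = rng.uniform(200, 2000, 1000)
intensity = rng.normal(0.0, 1.0, 1000) + 0.002 * size
normalized = quantile_normalize_by_size(intensity, size)
corr_before = np.corrcoef(size, intensity)[0, 1]
corr_after = np.corrcoef(size, normalized)[0, 1]
```

Within each bin the rank order of fragments is preserved, so only the size-dependent shift and scale are removed. In the simulated data, the correlation between fragment size and intensity should be much weaker after normalization than before.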
ASJC Scopus subject areas
- Statistics and Probability
- Molecular Biology
- Computer Science Applications
- Computational Theory and Mathematics
- Computational Mathematics