Session: 663 Computational biology and bioinformatics I
(663.8) Pyllelic, a Software Suite for Examining Allelic DNA CpG Methylation Patterns in Genomic Datasets
Monday, April 4, 2022
12:30 PM – 1:45 PM
Location: Exhibit/Poster Hall A-B - Pennsylvania Convention Center
Poster Board Number: A266
Andrew Bonham (Metropolitan State University of Denver), Dylan Poch (Metropolitan State University of Denver), Teisha Rowland (University of Colorado Boulder)
Professor, Metropolitan State University of Denver, Denver, Colorado
While genome-wide demethylation of DNA CpG sites is a well-established early hallmark of cancer, only recently has the presence of allelic methylation patterns associated with pathological gene expression been reported. Such allelic methylation patterns have often been obscured by the statistical averaging procedures applied to bisulfite conversion sequencing (Bis-Seq) CpG methylation genomic datasets. Here, we present a Python software suite, Pyllelic, that dissects Bis-Seq genomic datasets at the individual read level, allowing nuanced analyses of the methylation patterns of genes, including allelic patterns associated with gene expression. We used this approach to examine the role of allelic methylation in the telomerase reverse transcriptase (TERT) gene, which is pathologically expressed in the majority of human cancers (approximately 80-90%) and has known allelic expression patterns. Surprisingly, active TERT gene expression in human cancers has been characterized as having a hypermethylated CpG island in the promoter. This is the opposite of the general association between promoter methylation and gene expression, which includes reports of promoter hypomethylation correlating with overexpression of oncogenes. However, when we used Pyllelic to analyze >600 cancer cell lines from 23 different tissue types, we found that in the proximal region of the TERT promoter, allele-specific hypomethylation correlated with TERT expression. This analysis was performed in cancer cell lines with known activating, monoallelic promoter mutations as well as in wild-type TERT cancer cell lines. The Pyllelic suite allows both aggregate and individual read-level access to the underlying methylated reads of genomic datasets, with a data science-focused implementation to facilitate the path from hypothesis to publishable figures and statistical analyses.
Pyllelic thus should serve as an important tool to help identify allelic DNA methylation patterns and associations for a wide variety of genomic regions, with particular relevance for pathological activation in cancers, as well as other tissue or disease methylation analysis.
Heatmap of allelic methylation differences in the promoter region of the telomerase reverse transcriptase gene, generated by Pyllelic.
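The abstract's point that statistical averaging can obscure allelic methylation can be illustrated with a minimal sketch in plain Python. This is not Pyllelic's API or algorithm. The toy reads, the function names (`per_site_average`, `per_read_fraction`, `looks_allelic`) and the thresholds are all hypothetical, chosen only to show why a per-site average near 50% can hide two distinct populations of reads.

```python
from statistics import mean

# Hypothetical toy data, not real Bis-Seq output: each read is a tuple of
# CpG calls across the same four sites (1 = methylated, 0 = unmethylated).
reads = [
    (1, 1, 1, 1),
    (1, 1, 1, 1),
    (1, 1, 1, 0),
    (0, 0, 0, 0),
    (0, 0, 0, 0),
    (0, 1, 0, 0),
]


def per_site_average(reads):
    """Aggregate view: fraction of reads methylated at each CpG site."""
    return [mean(site) for site in zip(*reads)]


def per_read_fraction(reads):
    """Read-level view: fraction of CpG sites methylated within each read."""
    return [mean(read) for read in reads]


def looks_allelic(fractions, low=0.25, high=0.75, min_share=0.25):
    """Crude bimodality check with assumed, illustrative thresholds.

    Returns True when a sizeable share of reads is mostly unmethylated
    AND a sizeable share is mostly methylated.
    """
    n = len(fractions)
    low_share = sum(f <= low for f in fractions) / n
    high_share = sum(f >= high for f in fractions) / n
    return low_share >= min_share and high_share >= min_share


site_means = per_site_average(reads)
read_fractions = per_read_fraction(reads)

print("per-site averages:  ", site_means)
print("per-read fractions: ", read_fractions)
print("two read populations:", looks_allelic(read_fractions))
```

In this toy example every per-site average sits near one half, which on its own would suggest intermediate methylation, while the per-read fractions separate into a mostly methylated group and a mostly unmethylated group. A real analysis would need proper statistical testing across many reads and samples; the fixed thresholds here are only for illustration.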