Transcription factors (TFs) are sequence-specific DNA-binding proteins that regulate gene expression and biological processes such as cell differentiation. For example, the cardiac TF NKX2-5 has been shown to be a master regulator of the transcriptional network required for heart development. Mutations in NKX2-5 have been shown to cause congenital heart diseases (CHDs), the most common birth defects, which result in structural abnormalities of the heart. However, ~95% of disease-associated mutations occur within the non-coding genome, in regions such as promoters and enhancers. The molecular mechanisms of these genetic variants and their roles in many diseases remain largely unexplored. We hypothesize that non-coding variants within NKX2-5 binding sites will affect DNA recognition and alter TF-DNA binding. Starting from ~84 million single nucleotide polymorphisms (SNPs) reported in the 1000 Genomes Project, I identified 8,475 that are predicted to affect NKX2-5 binding. After filtering these variants, I found 901 SNPs localized in cardiac enhancers and 30 disease-associated SNPs from the GWAS catalog predicted to affect NKX2-5 binding sites. Binding scores for the identified variants were calculated using position weight matrix (PWM) DNA-specificity models. The variants rs7350789 and rs7719885 were predicted to have the greatest impact on NKX2-5 binding affinity, with ΔPWM scores of 258 and -212, respectively. These variants were prioritized for in vitro validation by electrophoretic mobility shift assay (EMSA), using purified full-length NKX2-5 to evaluate TF-DNA complex formation. In the EMSA, both variants increased TF-DNA binding, even though rs7719885 had a negative ΔPWM score, which predicts a decrease in binding affinity. Although the direction of the change for rs7719885 differed from the prediction, two of the variants predicted to impact TF-DNA binding were validated in vitro.
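As a rough illustration of how a ΔPWM score of the kind described above could be computed, the following Python sketch scores a reference and an alternate allele against a PWM and takes the difference (alternate minus reference, so a negative value predicts weaker binding). The matrix `TOY_PWM`, the function names, and the example sequences are hypothetical and chosen only for illustration; they are not the NKX2-5 model or the scoring scale used in this work, and the sketch scans only the forward strand.

```python
# Toy PWM: one dict of per-base weights for each motif position.
# These weights are made up for illustration; they are NOT the NKX2-5 model.
TOY_PWM = [
    {"A": 2, "C": -1, "G": -1, "T": -2},
    {"A": 2, "C": -2, "G": -1, "T": -1},
    {"A": -1, "C": -2, "G": 2, "T": -1},
    {"A": -2, "C": -1, "G": -1, "T": 2},
]


def pwm_score(site, pwm):
    """Sum the per-position weights for a site exactly as wide as the PWM."""
    if len(site) != len(pwm):
        raise ValueError("site length must equal PWM width")
    return sum(column[base] for column, base in zip(pwm, site.upper()))


def best_window_score(seq, pwm):
    """Slide the PWM along the forward strand and return the best score."""
    width = len(pwm)
    if len(seq) < width:
        raise ValueError("sequence is shorter than the PWM")
    return max(
        pwm_score(seq[i:i + width], pwm)
        for i in range(len(seq) - width + 1)
    )


def delta_pwm(ref_seq, alt_seq, pwm):
    """Alternate-allele score minus reference-allele score.

    Negative values predict weaker binding; positive values predict stronger.
    """
    return best_window_score(alt_seq, pwm) - best_window_score(ref_seq, pwm)


if __name__ == "__main__":
    ref = "CCAAGTCC"  # hypothetical reference sequence around a SNP
    alt = "CCAATTCC"  # same sequence carrying a G>T substitution
    print(delta_pwm(ref, alt, TOY_PWM))
```

A fuller treatment would also scan the reverse complement and use log-odds weights derived from experimentally determined binding data; both are omitted here to keep the sketch short.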
This research was supported by NSF Bridge to Doctorate Fellowship HRD-19006130, NSF BioXFEL Fellowship STC-1231306, and NIH Grant SC1GM127231.
Summary of non-coding variant identification and in vitro validation.