Comparing genomes across Drosophila species can clarify how genomes have changed over time. The genus has diversified over millions of years, which makes it possible to study and identify the functions of genes that are orthologous to human genes. Bioinformatic tools and genome annotation can in turn help elucidate the etiology of numerous diseases. The Genomics Education Partnership (GEP) is a nationwide community that gives undergraduate students experience in annotating eukaryotic genomes. Using bioinformatics tools such as the UCSC Genome Browser, NCBI BLAST, and FlyBase, students participate in the F-element and Pathways projects. The F-element project involves annotating protein-coding genes located on chromosome four, commonly known as the Muller F element. In some Drosophila species, including Drosophila ananassae, the F element has expanded over time. Comparative annotation that analyzes changes in the organization of this genomic region may lead to a better understanding of the complexity of these genomes. The Pathways project involves annotating genes involved in the insulin signaling pathway. These genes play a critical role in regulating gene expression and metabolism in many eukaryotes. This study describes the annotation of a genomic region in Drosophila ananassae.
A 25,000 base pair region of chromosome 3L in D. ananassae (Muller D element) was annotated for comparison to the F element. In this region, known as Contig 21, five protein-coding genes were identified: CG10222, CG10713, CG32121, CG33263, and flr. The predicted proteins aligned well with their Drosophila melanogaster orthologs (70% to 90% amino acid identity); however, the flr gene showed some unusual characteristics. The flr gene encodes the Drosophila Actin Interacting Protein 1, which interacts with the gene product of tsr (cofilin) to promote F-actin disassembly. In Drosophila melanogaster, flr encodes two isoforms, whereas in Drosophila ananassae the annotation predicted that only the flr-PA isoform is expressed and/or functional. A deletion or insertion of a single nucleotide within exon 1 of flr in D. ananassae causes a frameshift that changes the original stop codon of isoform B, which is predicted to result in a non-functional protein. Because flr plays an important role in the development of the organism, molecular analysis was used to test this annotation model. RNA was extracted from Drosophila ananassae, and reverse transcription PCR (RT-PCR) was used to amplify transcripts from this region of the genome. Dideoxy sequencing was performed to determine which isoforms of the gene are expressed. The sequence data were compared to the Drosophila ananassae and Drosophila melanogaster genomes using BLAST and the UCSC Genome Browser. These data supported the expression of only one isoform of the flr gene in D. ananassae. They also suggest that one of the flr isoforms is highly conserved in this species. Additional analysis of newly sequenced genomes such as this one will improve understanding of how genomes have changed over time.
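The frameshift described above can be illustrated with a minimal Python sketch. It assumes the standard genetic code and uses an invented 23-nucleotide toy sequence, not the actual flr sequence from either species. The names `translate`, `delete_base`, `WILD_TYPE`, and `MUTANT` are hypothetical and chosen only for this illustration. The sketch shows how deleting a single nucleotide shifts the reading frame, changes the downstream amino acids, and moves the stop codon.

```python
from itertools import product

# Standard genetic code, built in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate from position 0 until the first in-frame stop codon.

    Returns (protein, stop_index), where stop_index is the 0-based
    nucleotide position of the stop codon, or None if none is reached.
    """
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            return "".join(protein), i
        protein.append(aa)
    return "".join(protein), None


def delete_base(seq, index):
    """Return seq with the single nucleotide at `index` removed."""
    return seq[:index] + seq[index + 1:]


# Invented toy coding sequence (assumption: NOT the real flr sequence).
WILD_TYPE = "ATGGCTGAAACCTGGTAAGTGAT"
# Remove one nucleotide early in the sequence to shift the reading frame.
MUTANT = delete_base(WILD_TYPE, 4)

wt_protein, wt_stop = translate(WILD_TYPE)
mut_protein, mut_stop = translate(MUTANT)

print("wild type:", wt_protein, "stop at", wt_stop)
print("mutant:   ", mut_protein, "stop at", mut_stop)
```

In this toy case the wild-type frame yields MAETW with a stop at position 15, while the shifted frame yields MVKPGK and reads past the original stop to a new one at position 18. In a real annotation the same comparison would be made against the aligned genomic and cDNA sequences rather than a hand-built string.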
GEP is supported by NIH grant #R25GM130517 and NSF grant #1915544 and is hosted by The University of Alabama and Washington University in St. Louis.