Introduction: Breast cancer (BrCA) is one of the most common cancers and a major cause of mortality in women worldwide. Among its subtypes, triple-negative breast cancer (TNBC) is the most aggressive, owing to the absence of three major receptors: estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2). Treatment options for TNBC include chemotherapy, surgery, and radiation; however, these are associated with severe adverse effects and acquired resistance. We hypothesized that identifying new molecular targets in TNBC and characterizing their molecular biology will enhance therapeutic outcomes and improve patient survival. This study investigates TNBC biology by analyzing the transcriptional regulatory networks (TRNs) inherent in BrCA samples.
Methods: We began with data from The Cancer Genome Atlas (TCGA) and divided the dataset into two groups: triple-positive breast cancer (TPBC) and TNBC. Using Bioconductor packages in R, such as siggenes, we identified genes that were differentially expressed (DEGs) between these groups. Similarly, we used the protein expression data to identify proteins that were differentially expressed (DEPs) between these phenotypes. Pathways over-represented among the DEGs and DEPs were identified using another Bioconductor package, ReactomePA. To assess their impact on protein-protein interactions (PPIs) and biological pathways, the DEGs were mapped to the STRING database, a curated database of PPIs. Additionally, the DEGs were superposed on transcriptional regulatory networks (TRNs) generated by two different inference algorithms.
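As an illustration of the pathway over-representation step, the sketch below computes a one-sided hypergeometric tail probability for a single pathway. It is a minimal Python example under stated assumptions, not the ReactomePA workflow used in the study: the function name, its parameters, and the toy counts are all hypothetical.

```python
from math import comb


def overrepresentation_p(universe, pathway, hits, overlap):
    """P(X >= overlap) for a hypergeometric draw.

    universe: number of genes tested (assumed background size)
    pathway:  number of background genes annotated to the pathway
    hits:     number of differentially expressed genes
    overlap:  number of differentially expressed genes in the pathway
    """
    total = comb(universe, hits)
    upper = min(pathway, hits)
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(
        comb(pathway, i) * comb(universe - pathway, hits - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers only: 10 background genes, 4 in the pathway, 3 DEGs, 2 shared.
p_value = overrepresentation_p(universe=10, pathway=4, hits=3, overlap=2)
print(round(p_value, 4))
```

In practice, one such p-value is computed per pathway and the resulting set is corrected for multiple testing before pathways are reported as over-represented.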
Results: Using a false discovery rate of 5%, we identified approximately 20,000 genes and 128 proteins that were differentially expressed between the two phenotypes. Over-represented pathways among the DEGs include “mitotic signaling” and “G1/S transition”. Over-represented pathways among the DEPs include “PI3K/Akt signaling”, “platelet aggregation”, and “diseases of signal transduction”. Some known pro-oncogenic markers, such as FOXM1, NOTCH1, CDK1, PARP1, and NDRG1, were found to be linked by physical interaction (PPI) with other markers. The Virtual Inference of Protein-activity by Enriched Regulon analysis (VIPER) algorithm was used to identify master regulators driving the gene expression differences between the two phenotypes. These include NFE2L3, HMGA1, TP53, FOXC1, and SOX9 (all up-regulated in TNBC), and FOXA1, AR, XBP1, and ESR1 (all down-regulated in TNBC).
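The 5% false discovery rate threshold can be illustrated with a short Python sketch. It assumes the Benjamini-Hochberg procedure, swapped in here for simplicity; siggenes estimates the FDR differently, so this is not the method used in the study. The feature names and p-values are made up for the example.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never increase as the rank decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_at_fdr(names, p_values, fdr=0.05):
    """Keep the features whose adjusted p-value is at or below the FDR level."""
    adjusted = benjamini_hochberg(p_values)
    return [name for name, q in zip(names, adjusted) if q <= fdr]


# Hypothetical features and p-values, for illustration only.
names = ["feature_a", "feature_b", "feature_c", "feature_d", "feature_e"]
p_values = [0.001, 0.008, 0.039, 0.041, 0.9]
print(select_at_fdr(names, p_values))
```

With these toy inputs, only the two smallest p-values survive the 5% threshold, even though four of the five raw p-values are below 0.05.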
Conclusion: These results illustrate sharp transcriptomic and proteomic differences between TNBC and TPBC. HMGA1 has been reported to be overexpressed in various cancers, and it regulates the expression of many other genes. It works with various molecular partners that play a role in the development of breast cancer. FOXA1, another master regulator that we identified, is a tumor suppressor and is down-regulated in the TNBC group compared with the TPBC group. The absence of ER, PR, and HER2 expression in TNBC shapes gene and protein expression and activities in consequential ways, as underscored by these pathways, interactions, and master regulators.
Support or Funding Information
This study was carried out with support from MCPHS University.