Data Science and AI
Victor SC Wong, Ph.D.
Chief Scientific Officer
Core Life Analytics, Noord-Brabant, Netherlands
There is significant interest in adopting image-based phenotypic profiling for target and drug discovery. This approach yields rich biological data that can reveal novel and critical insights into disease phenotypes and into mechanisms of drug action and toxicity. The Broad Institute has developed the Cell Painting assay as a standardized profiling method. More recently, the Joint Undertaking in Morphological Profiling (JUMP) Cell Painting consortium has been established to generate a large public reference Cell Painting dataset. This high-content imaging dataset will be generated from 140,000 different conditions with small molecules, CRISPRs and ORFs. The goal is to provide a ‘ground truth’ for the study of phenotypic relationships between chemical and genetic perturbations that target the same genes in cells. Public access to such a large dataset will provide enormous opportunities for drug discovery, but scientists will also face the challenge of finding critical insights in such a large volume of numeric data. We used the publicly available preliminary dataset, in which A549 and U2OS cell lines were treated with chemical and genetic perturbations across replicates. We demonstrated that our web-based data analytics platform, StratoMineR™, can detect and generate distinct plate maps, select relevant features, and perform dimensionality reduction and unbiased hit picking. The platform supports comparisons of phenotypic outcomes between the two cell lines, between time points, and between conditions. We compared datasets from compound and CRISPR experiments. Interestingly, this comparison detected differential phenotypic outcomes between chemical and genetic approaches for identical gene targets in both cell lines. We further identified more than 100 compounds and CRISPR guides that gave significant phenotypic distance scores from the negative control.
Clustering analysis of hits revealed that: 1) a substantial number of treatments gave diverse phenotypes regardless of their common gene targets; 2) there are only a few phenotypic similarities between compound and CRISPR treatments that share common gene targets (e.g., in A549 cells, only cycloheximide and TG-02 clustered with the CRISPR guides that target the Rpl3 and Cdk9 genes, respectively, and in U2OS cells, SB-505124 clustered with the CRISPR treatment that targets the Tgfbr1 gene); and 3) there is significant clustering of unrelated gene targets that gave similar phenotypic outcomes. Our analysis provides valuable information for biologists who wish to develop their own Cell Painting platforms, so that they can take advantage of the full JUMP-CP dataset when it becomes available.
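
The abstract does not describe how phenotypic distance scores or hit thresholds are computed, so the following is only a minimal sketch of one common approach, not the StratoMineR implementation. It assumes each well is summarized by a vector of numeric features, normalizes every feature against the negative-control wells, scores each well by its Euclidean distance from the control centroid, and calls a treatment a hit when its score exceeds the control mean by k standard deviations. The function names, the well identifiers, and the default cutoff k = 3 are all hypothetical.

```python
import math
import statistics


def control_normalize(profiles, control_ids):
    """Z-score every feature against the negative-control wells (assumed scheme)."""
    n_features = len(next(iter(profiles.values())))
    means, sds = [], []
    for j in range(n_features):
        values = [profiles[well][j] for well in control_ids]
        means.append(statistics.mean(values))
        sd = statistics.pstdev(values)
        sds.append(sd if sd > 0 else 1.0)  # guard against constant features
    return {
        well: [(x - m) / s for x, m, s in zip(vector, means, sds)]
        for well, vector in profiles.items()
    }


def distance_scores(profiles, control_ids):
    """Euclidean distance of each well from the negative-control centroid."""
    normalized = control_normalize(profiles, control_ids)
    # After normalization the control centroid sits at the origin,
    # so the distance is simply the length of each normalized vector.
    return {
        well: math.sqrt(sum(v * v for v in vector))
        for well, vector in normalized.items()
    }


def pick_hits(scores, control_ids, k=3.0):
    """Return treatments scoring above control mean + k * control SD (hypothetical rule)."""
    control_scores = [scores[well] for well in control_ids]
    threshold = statistics.mean(control_scores) + k * statistics.pstdev(control_scores)
    return sorted(
        well for well, score in scores.items()
        if well not in control_ids and score > threshold
    )


# Toy data: four negative-control wells and two hypothetical treatments,
# each described by two made-up features.
profiles = {
    "ctrl_1": [1.0, 2.0],
    "ctrl_2": [1.2, 1.8],
    "ctrl_3": [0.8, 2.2],
    "ctrl_4": [1.0, 2.0],
    "cmpd_A": [3.0, 0.5],
    "cmpd_B": [1.05, 2.0],
}
controls = ["ctrl_1", "ctrl_2", "ctrl_3", "ctrl_4"]

scores = distance_scores(profiles, controls)
hits = pick_hits(scores, controls)
```

In practice the distance would usually be computed after feature selection and dimensionality reduction, and the resulting hit profiles would then be passed to a clustering step to compare compound and CRISPR treatments that share a gene target.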