Motivation: In recent years, vast advances in biomedical technologies and comprehensive sequencing have revealed the genomic landscape of common forms of human cancer in unprecedented detail. The broad heterogeneity of the disease calls for rapid development of personalized therapies. Translating the readily available genomic data into useful knowledge that can be applied in the clinic remains a challenge. Computational methods are needed to aid these efforts by robustly analyzing genome-scale data from distinct experimental platforms for prioritization of targets and treatments. Results: We propose a novel, biologically motivated, Bayesian multitask approach, which explicitly models gene-centric dependencies across multiple and distinct genomic platforms. We introduce a gene-wise prior and present a fully Bayesian formulation of a group factor analysis model. In supervised prediction applications, our multitask approach leverages similarities in response profiles of groups of drugs that are more likely to be related to true biological signal, which leads to more robust performance and improved generalization ability. We evaluate the performance of our method on molecularly characterized collections of cell lines profiled against two compound panels, namely the Cancer Cell Line Encyclopedia and the Cancer Therapeutics Response Portal. We demonstrate that accounting for the gene-centric dependencies enables leveraging information from multi-omic input data and improves prediction and feature selection performance. We further demonstrate the applicability of our method in an unsupervised dimensionality reduction application by inferring genes essential to tumorigenesis in the pancreatic ductal adenocarcinoma and lung adenocarcinoma patient cohorts from The Cancer Genome Atlas.
ASJC Scopus subject areas
- Statistics and Probability
- Molecular Biology
- Computer Science Applications
- Computational Theory and Mathematics
- Computational Mathematics