CanDrA: Cancer-specific driver missense mutation annotation with optimized features

Yong Mao, Han Chen, Han Liang, Funda Meric-Bernstam, Gordon Mills, Ken Chen

Research output: Contribution to journal (Article)

44 Citations (Scopus)

Abstract

Driver mutations are somatic mutations that provide a growth advantage to tumor cells, while passenger mutations are those not functionally related to oncogenesis. Distinguishing drivers from passengers is challenging because drivers occur much less frequently than passengers, tend to have low prevalence, and have functions that are multifactorial and not intuitively obvious. Missense mutations are excellent candidates as drivers, as they occur more frequently and are potentially easier to identify than other types of mutations. Although several methods have been developed for predicting the functional impact of missense mutations, only a few have been specifically designed for identifying driver mutations. As more mutations are discovered, more accurate predictive models can be developed using machine learning approaches that systematically characterize the commonality and peculiarity of missense mutations in the context of specific cancer types. Here, we present a cancer driver annotation (CanDrA) tool that predicts missense driver mutations based on a set of 95 structural and evolutionary features computed by over 10 functional prediction algorithms such as CHASM, SIFT, and MutationAssessor. Through feature optimization and supervised training, CanDrA outperforms existing tools in analyzing the glioblastoma multiforme and ovarian carcinoma data sets in The Cancer Genome Atlas and the Cancer Cell Line Encyclopedia project.
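
As an illustration of the general workflow the abstract describes (score candidate features, keep a subset, then train a supervised classifier), the following Python sketch runs on entirely synthetic data. It is not CanDrA's implementation: the scoring rule (a standardized mean difference), the nearest-centroid classifier, all function names, and the toy data are assumptions chosen for brevity, whereas the actual tool works from 95 features computed by external prediction algorithms.

```python
import random
import statistics


def make_toy_data(n_per_class=40, n_features=6, informative=(0, 1), seed=0):
    """Build a synthetic mutation-by-feature matrix (hypothetical, not real data)."""
    rng = random.Random(seed)
    rows, labels = [], []
    for label in (0, 1):  # toy labels: 0 = passenger, 1 = driver
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            if label == 1:
                for j in informative:
                    row[j] += 2.0  # only informative features differ between classes
            rows.append(row)
            labels.append(label)
    return rows, labels


def feature_scores(rows, labels):
    """Score each feature by |difference of class means| / pooled standard deviation."""
    scores = []
    for j in range(len(rows[0])):
        pos = [r[j] for r, y in zip(rows, labels) if y == 1]
        neg = [r[j] for r, y in zip(rows, labels) if y == 0]
        pooled = ((statistics.pvariance(pos) + statistics.pvariance(neg)) / 2) ** 0.5
        scores.append(abs(statistics.mean(pos) - statistics.mean(neg)) / (pooled or 1.0))
    return scores


def select_top_k(scores, k):
    """Return the indices of the k highest-scoring features, in ascending order."""
    ranked = sorted(range(len(scores)), key=lambda j: -scores[j])
    return sorted(ranked[:k])


def fit_centroids(rows, labels, selected):
    """Supervised step: compute one mean vector per class over the selected features."""
    centroids = {}
    for label in set(labels):
        members = [r for r, y in zip(rows, labels) if y == label]
        centroids[label] = [statistics.mean(r[j] for r in members) for j in selected]
    return centroids


def predict(row, centroids, selected):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def dist2(centroid):
        return sum((row[j] - c) ** 2 for j, c in zip(selected, centroid))
    return min(centroids, key=lambda label: dist2(centroids[label]))


# Train on one synthetic sample and evaluate on an independent one.
train_X, train_y = make_toy_data(seed=0)
test_X, test_y = make_toy_data(seed=1)

scores = feature_scores(train_X, train_y)
selected = select_top_k(scores, k=2)
centroids = fit_centroids(train_X, train_y, selected)

predictions = [predict(row, centroids, selected) for row in test_X]
accuracy = sum(p == y for p, y in zip(predictions, test_y)) / len(test_y)

print("selected feature indices:", selected)
print("held-out accuracy:", round(accuracy, 2))
```

Selecting features on the training sample only, and measuring accuracy on a separately generated sample, keeps the selection step from leaking information into the evaluation. The same separation applies to any feature-optimization procedure, whatever scoring rule or classifier is used.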

Original language: English (US)
Article number: e77945
Journal: PLoS One
Volume: 8
Issue number: 10
DOI: 10.1371/journal.pone.0077945
State: Published - Oct 30 2013
Externally published: Yes

Fingerprint

Missense Mutation
Cells
Mutation
Neoplasms
Learning systems
Tumors
Genes
Somatic mutation
Encyclopedias
Artificial intelligence
Atlases
Carcinogenesis
Carcinoma
Glioblastoma
Cell lines
Genome

ASJC Scopus subject areas

  • Medicine(all)
  • Biochemistry, Genetics and Molecular Biology(all)
  • Agricultural and Biological Sciences(all)

Cite this

CanDrA: Cancer-specific driver missense mutation annotation with optimized features. / Mao, Yong; Chen, Han; Liang, Han; Meric-Bernstam, Funda; Mills, Gordon; Chen, Ken.

In: PLoS One, Vol. 8, No. 10, e77945, 30.10.2013.

@article{8f1372ec08344be6bfe43d646f2094da,
title = "CanDrA: Cancer-specific driver missense mutation annotation with optimized features",
author = "Yong Mao and Han Chen and Han Liang and Funda Meric-Bernstam and Gordon Mills and Ken Chen",
year = "2013",
month = "10",
day = "30",
doi = "10.1371/journal.pone.0077945",
language = "English (US)",
volume = "8",
journal = "PLoS One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "10",
}

TY - JOUR

T1 - CanDrA

T2 - Cancer-specific driver missense mutation annotation with optimized features

AU - Mao, Yong

AU - Chen, Han

AU - Liang, Han

AU - Meric-Bernstam, Funda

AU - Mills, Gordon

AU - Chen, Ken

PY - 2013/10/30

Y1 - 2013/10/30

UR - http://www.scopus.com/inward/record.url?scp=84894456192&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84894456192&partnerID=8YFLogxK

U2 - 10.1371/journal.pone.0077945

DO - 10.1371/journal.pone.0077945

M3 - Article

C2 - 24205039

AN - SCOPUS:84894456192

VL - 8

JO - PLoS One

JF - PLoS One

SN - 1932-6203

IS - 10

M1 - e77945

ER -