Combining accurate tumor genome simulation with crowdsourcing to benchmark somatic structural variant detection

ICGC-TCGA DREAM Somatic Mutation Calling Challenge Participants

Research output: Contribution to journal (Article)

3 Citations (Scopus)

Abstract

BACKGROUND: The phenotypes of cancer cells are driven in part by somatic structural variants. Structural variants can initiate tumors, enhance their aggressiveness, and provide unique therapeutic opportunities. Whole-genome sequencing of tumors can allow exhaustive identification of the specific structural variants present in an individual cancer, facilitating both clinical diagnostics and the discovery of novel mutagenic mechanisms. A plethora of somatic structural variant detection algorithms have been created to enable these discoveries; however, there are no systematic benchmarks of them. Rigorous performance evaluation of somatic structural variant detection methods has been challenged by the lack of gold standards, extensive resource requirements, and difficulties arising from the need to share personal genomic information.

RESULTS: To facilitate structural variant detection algorithm evaluations, we create a robust simulation framework for somatic structural variants by extending the BAMSurgeon algorithm. We then organize and enable a crowdsourced benchmarking within the ICGC-TCGA DREAM Somatic Mutation Calling Challenge (SMC-DNA). We report here the results of structural variant benchmarking on three different tumors, comprising 204 submissions from 15 teams. In addition to ranking methods, we identify characteristic error profiles of individual algorithms and general trends across them. Surprisingly, we find that ensembles of analysis pipelines do not always outperform the best individual method, indicating a need for new ways to aggregate somatic structural variant detection approaches.

CONCLUSIONS: The synthetic tumors and somatic structural variant detection leaderboards remain available as a community benchmarking resource, and BAMSurgeon is available at https://github.com/adamewing/bamsurgeon.

Original language: English (US)
Number of pages: 1
Journal: Genome Biology
Volume: 19
Issue number: 1
DOI: 10.1186/s13059-018-1539-5
State: Published - Nov 6 2018

Keywords

  • Benchmarking
  • Cancer genomics
  • Crowdsourcing
  • Simulation
  • Somatic mutations
  • Structural variants
  • Whole-genome sequencing

ASJC Scopus subject areas

  • Ecology, Evolution, Behavior and Systematics
  • Genetics
  • Cell Biology

Cite this

Combining accurate tumor genome simulation with crowdsourcing to benchmark somatic structural variant detection. / ICGC-TCGA DREAM Somatic Mutation Calling Challenge Participants.

In: Genome Biology, Vol. 19, No. 1, 06.11.2018.

ICGC-TCGA DREAM Somatic Mutation Calling Challenge Participants. / Combining accurate tumor genome simulation with crowdsourcing to benchmark somatic structural variant detection. In: Genome Biology. 2018 ; Vol. 19, No. 1.
@article{55ac9af02ced4559aa2cfea35872cadb,
title = "Combining accurate tumor genome simulation with crowdsourcing to benchmark somatic structural variant detection",
keywords = "Benchmarking, Cancer genomics, Crowdsourcing, Simulation, Somatic mutations, Structural variants, Whole-genome sequencing",
author = "{ICGC-TCGA DREAM Somatic Mutation Calling Challenge Participants} and Lee, {Anna Y.} and Ewing, {Adam D.} and Kyle Ellrott and Yin Hu and Houlahan, {Kathleen E.} and Bare, {J. Christopher} and Espiritu, {Shadrielle Melijah G.} and Vincent Huang and Kristen Dang and Zechen Chong and Cristian Caloian and Yamaguchi, {Takafumi N.} and Kellen, {Michael R.} and Ken Chen and Norman, {Thea C.} and Friend, {Stephen H.} and Justin Guinney and Gustavo Stolovitzky and David Haussler and Margolin, {Adam A.} and Boutros, {Paul C.}",
year = "2018",
month = "11",
day = "6",
doi = "10.1186/s13059-018-1539-5",
language = "English (US)",
volume = "19",
journal = "Genome Biology",
issn = "1474-7596",
publisher = "BioMed Central",
number = "1",

}

TY - JOUR

T1 - Combining accurate tumor genome simulation with crowdsourcing to benchmark somatic structural variant detection

AU - ICGC-TCGA DREAM Somatic Mutation Calling Challenge Participants

AU - Lee, Anna Y.

AU - Ewing, Adam D.

AU - Ellrott, Kyle

AU - Hu, Yin

AU - Houlahan, Kathleen E.

AU - Bare, J. Christopher

AU - Espiritu, Shadrielle Melijah G.

AU - Huang, Vincent

AU - Dang, Kristen

AU - Chong, Zechen

AU - Caloian, Cristian

AU - Yamaguchi, Takafumi N.

AU - Kellen, Michael R.

AU - Chen, Ken

AU - Norman, Thea C.

AU - Friend, Stephen H.

AU - Guinney, Justin

AU - Stolovitzky, Gustavo

AU - Haussler, David

AU - Margolin, Adam A.

AU - Boutros, Paul C.

PY - 2018/11/6

Y1 - 2018/11/6

KW - Benchmarking

KW - Cancer genomics

KW - Crowdsourcing

KW - Simulation

KW - Somatic mutations

KW - Structural variants

KW - Whole-genome sequencing

UR - http://www.scopus.com/inward/record.url?scp=85056286059&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85056286059&partnerID=8YFLogxK

U2 - 10.1186/s13059-018-1539-5

DO - 10.1186/s13059-018-1539-5

M3 - Article

C2 - 30400818

AN - SCOPUS:85056286059

VL - 19

JO - Genome Biology

JF - Genome Biology

SN - 1474-7596

IS - 1

ER -