A comprehensive assessment of somatic mutation detection in cancer using whole-genome sequencing

Tyler S. Alioto; Ivo Buchhalter; Sophia Derdak; Barbara Hutter; Matthew D. Eldridge; Eivind Hovig; Lawrence E. Heisler; Timothy A. Beck; Jared T. Simpson; Laurie Tonon; Anne Sophie Sertier; Ann Marie Patch; Natalie Jäger; Philip Ginsbach; Ruben Drews; Nagarajan Paramasivam; Rolf Kabbe; Sasithorn Chotewutmontri; Nicolle Diessl; Christopher Previti; Sabine Schmidt; Benedikt Brors; Lars Feuerbach; Michael Heinold; Susanne Gröbner; Andrey Korshunov; Patrick S. Tarpey; Adam P. Butler; Jonathan Hinton; David Jones; Andrew Menzies; Keiran Raine; Rebecca Shepherd; Lucy Stebbings; Jon W. Teague; Paolo Ribeca; Francesc Castro Giner; Sergi Beltran; Emanuele Raineri; Marc Dabad; Simon C. Heath; Marta Gut; Robert E. Denroche; Nicholas J. Harding; Takafumi N. Yamaguchi; Akihiro Fujimoto; Hidewaki Nakagawa; Víctor Quesada; Rafael Valdés-Mas; Sigve Nakken; Daniel Vodák; Lawrence Bower; Andrew G. Lynch; Charlotte L. Anderson; Nicola Waddell; John V. Pearson; Sean M. Grimmond; Myron Peto; Paul Spellman; Minghui He; Cyriac Kandoth; Semin Lee; John Zhang; Louis Létourneau; Singer Ma; Sahil Seth; David Torrents; Liu Xi; David A. Wheeler; Carlos López-Otín; Elías Campo; Peter J. Campbell; Paul C. Boutros; Xose S. Puente; Daniela S. Gerhard; Stefan M. Pfister; John D. McPherson; Thomas J. Hudson; Matthias Schlesner; Peter Lichter; Roland Eils; David T.W. Jones; Ivo G. Gut

doi:10.1038/ncomms10001

Molecular and Medical Genetics

Research output: Contribution to journal › Article › peer-review

216 Scopus citations

Abstract

As whole-genome sequencing for cancer genome analysis becomes a clinical tool, a full understanding of the variables affecting sequencing analysis output is required. Here, using tumour-normal sample pairs from two different types of cancer, chronic lymphocytic leukaemia and medulloblastoma, we conduct a benchmarking exercise within the context of the International Cancer Genome Consortium. We compare sequencing methods, analysis pipelines and validation methods. We show that using PCR-free methods and increasing sequencing depth to ∼100× is beneficial, as long as the tumour:control coverage ratio remains balanced. We observe widely varying mutation call rates and low concordance among analysis pipelines, reflecting the artefact-prone nature of the raw data and the lack of standards for dealing with the artefacts. However, we show that, using the benchmark mutation set we have created, many issues are in fact easy to remedy, and that remedying them has an immediate positive impact on mutation detection accuracy.
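The abstract refers to concordance among analysis pipelines and to accuracy measured against a benchmark mutation set. As a rough illustration only, the Python sketch below shows one common way such quantities can be computed: precision, recall and F1 of a call set against a benchmark, and a pairwise Jaccard index between call sets. The tuple representation of a mutation, the function names, the pipeline names and the toy data are all assumptions made for this example; they are not the metrics code, data or pipeline names used in the study.

```python
from itertools import combinations

# Assumption: a somatic single-base mutation is represented as
# (chromosome, position, reference base, alternate base).
# This layout is illustrative and not taken from the paper.


def evaluate_calls(calls, benchmark):
    """Return (precision, recall, F1) of a call set against a benchmark set."""
    calls, benchmark = set(calls), set(benchmark)
    true_positives = len(calls & benchmark)
    precision = true_positives / len(calls) if calls else 0.0
    recall = true_positives / len(benchmark) if benchmark else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


def pairwise_concordance(call_sets):
    """Return the Jaccard index for every pair of named call sets."""
    result = {}
    for (name_a, a), (name_b, b) in combinations(sorted(call_sets.items()), 2):
        a, b = set(a), set(b)
        union = a | b
        # Two empty call sets are treated as fully concordant.
        result[(name_a, name_b)] = len(a & b) / len(union) if union else 1.0
    return result


# Hypothetical toy data: four benchmark mutations and two pipelines.
benchmark = {
    ("1", 100, "A", "T"),
    ("1", 250, "G", "C"),
    ("2", 40, "C", "T"),
    ("3", 900, "T", "G"),
}
call_sets = {
    "pipeline_a": {("1", 100, "A", "T"), ("1", 250, "G", "C"), ("5", 77, "A", "G")},
    "pipeline_b": {
        ("1", 100, "A", "T"),
        ("2", 40, "C", "T"),
        ("3", 900, "T", "G"),
        ("7", 12, "G", "A"),
    },
}

if __name__ == "__main__":
    for name, calls in sorted(call_sets.items()):
        precision, recall, f1 = evaluate_calls(calls, benchmark)
        print(f"{name}: precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
    for pair, jaccard in pairwise_concordance(call_sets).items():
        print(f"{pair[0]} vs {pair[1]}: Jaccard={jaccard:.2f}")
```

Set operations keep the sketch short, but they require every pipeline to report variants in an identical normalised representation; real call sets would usually need that normalisation step first.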

Original language	English (US)
Article number	10001
Journal	Nature Communications
Volume	6
DOIs	https://doi.org/10.1038/ncomms10001
State	Published - Dec 9 2015

ASJC Scopus subject areas

General Chemistry
General Biochemistry, Genetics and Molecular Biology
General Physics and Astronomy

Cite this

Alioto, T. S., Buchhalter, I., Derdak, S., Hutter, B., Eldridge, M. D., Hovig, E., Heisler, L. E., Beck, T. A., Simpson, J. T., Tonon, L., Sertier, A. S., Patch, A. M., Jäger, N., Ginsbach, P., Drews, R., Paramasivam, N., Kabbe, R., Chotewutmontri, S., Diessl, N., ... Gut, I. G. (2015). A comprehensive assessment of somatic mutation detection in cancer using whole-genome sequencing. Nature Communications, 6, Article 10001. https://doi.org/10.1038/ncomms10001

Alioto, TS, Buchhalter, I, Derdak, S, Hutter, B, Eldridge, MD, Hovig, E, Heisler, LE, Beck, TA, Simpson, JT, Tonon, L, Sertier, AS, Patch, AM, Jäger, N, Ginsbach, P, Drews, R, Paramasivam, N, Kabbe, R, Chotewutmontri, S, Diessl, N, Previti, C, Schmidt, S, Brors, B, Feuerbach, L, Heinold, M, Gröbner, S, Korshunov, A, Tarpey, PS, Butler, AP, Hinton, J, Jones, D, Menzies, A, Raine, K, Shepherd, R, Stebbings, L, Teague, JW, Ribeca, P, Giner, FC, Beltran, S, Raineri, E, Dabad, M, Heath, SC, Gut, M, Denroche, RE, Harding, NJ, Yamaguchi, TN, Fujimoto, A, Nakagawa, H, Quesada, V, Valdés-Mas, R, Nakken, S, Vodák, D, Bower, L, Lynch, AG, Anderson, CL, Waddell, N, Pearson, JV, Grimmond, SM, Peto, M, Spellman, P, He, M, Kandoth, C, Lee, S, Zhang, J, Létourneau, L, Ma, S, Seth, S, Torrents, D, Xi, L, Wheeler, DA, López-Otín, C, Campo, E, Campbell, PJ, Boutros, PC, Puente, XS, Gerhard, DS, Pfister, SM, McPherson, JD, Hudson, TJ, Schlesner, M, Lichter, P, Eils, R, Jones, DTW & Gut, IG 2015, 'A comprehensive assessment of somatic mutation detection in cancer using whole-genome sequencing', Nature Communications, vol. 6, 10001. https://doi.org/10.1038/ncomms10001

@article{9134cd3453894350ba509b24a3b88c5b,

title = "A comprehensive assessment of somatic mutation detection in cancer using whole-genome sequencing",

author = "Alioto, {Tyler S.} and Ivo Buchhalter and Sophia Derdak and Barbara Hutter and Eldridge, {Matthew D.} and Eivind Hovig and Heisler, {Lawrence E.} and Beck, {Timothy A.} and Simpson, {Jared T.} and Laurie Tonon and Sertier, {Anne Sophie} and Patch, {Ann Marie} and Natalie J{\"a}ger and Philip Ginsbach and Ruben Drews and Nagarajan Paramasivam and Rolf Kabbe and Sasithorn Chotewutmontri and Nicolle Diessl and Christopher Previti and Sabine Schmidt and Benedikt Brors and Lars Feuerbach and Michael Heinold and Susanne Gr{\"o}bner and Andrey Korshunov and Tarpey, {Patrick S.} and Butler, {Adam P.} and Jonathan Hinton and David Jones and Andrew Menzies and Keiran Raine and Rebecca Shepherd and Lucy Stebbings and Teague, {Jon W.} and Paolo Ribeca and Giner, {Francesc Castro} and Sergi Beltran and Emanuele Raineri and Marc Dabad and Heath, {Simon C.} and Marta Gut and Denroche, {Robert E.} and Harding, {Nicholas J.} and Yamaguchi, {Takafumi N.} and Akihiro Fujimoto and Hidewaki Nakagawa and V{\'i}ctor Quesada and Rafael Vald{\'e}s-Mas and Sigve Nakken and Daniel Vod{\'a}k and Lawrence Bower and Lynch, {Andrew G.} and Anderson, {Charlotte L.} and Nicola Waddell and Pearson, {John V.} and Grimmond, {Sean M.} and Myron Peto and Paul Spellman and Minghui He and Cyriac Kandoth and Semin Lee and John Zhang and Louis L{\'e}tourneau and Singer Ma and Sahil Seth and David Torrents and Liu Xi and Wheeler, {David A.} and Carlos L{\'o}pez-Ot{\'i}n and El{\'i}as Campo and Campbell, {Peter J.} and Boutros, {Paul C.} and Puente, {Xose S.} and Gerhard, {Daniela S.} and Pfister, {Stefan M.} and McPherson, {John D.} and Hudson, {Thomas J.} and Matthias Schlesner and Peter Lichter and Roland Eils and Jones, {David T.W.} and Gut, {Ivo G.}",

note = "Funding Information: The International Cancer Genome Consortium (ICGC) is characterizing over 25,000 cancer cases from many forms of cancer [1]. Currently, there are 74 projects supported by different national and international funding agencies. As innovation and development of sequencing technologies have driven prices down and throughput up, projects have been transitioning from exome to whole-genome sequencing (WGS) of tumour and matched germline samples, supplemented by transcript and methylation analyses when possible, facilitating the discovery of new biology for many different forms of cancer [2–10]. However, as data from the different projects began to be collected and centralized (https://dcc.icgc.org/), it became apparent that there are marked differences in how teams generate WGS data and analyse it. On the basis of cost, capacity and analytical experience, it was initially determined that comprehensive identification of tumour-specific somatic mutations requires WGS with a minimum of 30× sequence coverage of each of the tumour and normal genomes [11], with paired reads on the order of 100–250 bp in length, depending on the platform. However, from project to project the sample preparation, coverage of tumour and normal samples and read lengths vary. Even more variability exists in the approaches to identify differences between tumour and normal genomes, evidenced by the many strategies developed to identify somatic single-base mutations (SSM) [12], somatic insertion/deletion mutations (SIM) and larger structural changes (rearrangements and chromosome segment copy number changes) [5].",

year = "2015",

month = dec,

day = "9",

doi = "10.1038/ncomms10001",

language = "English (US)",

volume = "6",

journal = "Nature Communications",

issn = "2041-1723",

publisher = "Nature Publishing Group",

}

TY - JOUR

T1 - A comprehensive assessment of somatic mutation detection in cancer using whole-genome sequencing

AU - Alioto, Tyler S.

AU - Buchhalter, Ivo

AU - Derdak, Sophia

AU - Hutter, Barbara

AU - Eldridge, Matthew D.

AU - Hovig, Eivind

AU - Heisler, Lawrence E.

AU - Beck, Timothy A.

AU - Simpson, Jared T.

AU - Tonon, Laurie

AU - Sertier, Anne Sophie

AU - Patch, Ann Marie

AU - Jäger, Natalie

AU - Ginsbach, Philip

AU - Drews, Ruben

AU - Paramasivam, Nagarajan

AU - Kabbe, Rolf

AU - Chotewutmontri, Sasithorn

AU - Diessl, Nicolle

AU - Previti, Christopher

AU - Schmidt, Sabine

AU - Brors, Benedikt

AU - Feuerbach, Lars

AU - Heinold, Michael

AU - Gröbner, Susanne

AU - Korshunov, Andrey

AU - Tarpey, Patrick S.

AU - Butler, Adam P.

AU - Hinton, Jonathan

AU - Jones, David

AU - Menzies, Andrew

AU - Raine, Keiran

AU - Shepherd, Rebecca

AU - Stebbings, Lucy

AU - Teague, Jon W.

AU - Ribeca, Paolo

AU - Giner, Francesc Castro

AU - Beltran, Sergi

AU - Raineri, Emanuele

AU - Dabad, Marc

AU - Heath, Simon C.

AU - Gut, Marta

AU - Denroche, Robert E.

AU - Harding, Nicholas J.

AU - Yamaguchi, Takafumi N.

AU - Fujimoto, Akihiro

AU - Nakagawa, Hidewaki

AU - Quesada, Víctor

AU - Valdés-Mas, Rafael

AU - Nakken, Sigve

AU - Vodák, Daniel

AU - Bower, Lawrence

AU - Lynch, Andrew G.

AU - Anderson, Charlotte L.

AU - Waddell, Nicola

AU - Pearson, John V.

AU - Grimmond, Sean M.

AU - Peto, Myron

AU - Spellman, Paul

AU - He, Minghui

AU - Kandoth, Cyriac

AU - Lee, Semin

AU - Zhang, John

AU - Létourneau, Louis

AU - Ma, Singer

AU - Seth, Sahil

AU - Torrents, David

AU - Xi, Liu

AU - Wheeler, David A.

AU - López-Otín, Carlos

AU - Campo, Elías

AU - Campbell, Peter J.

AU - Boutros, Paul C.

AU - Puente, Xose S.

AU - Gerhard, Daniela S.

AU - Pfister, Stefan M.

AU - McPherson, John D.

AU - Hudson, Thomas J.

AU - Schlesner, Matthias

AU - Lichter, Peter

AU - Eils, Roland

AU - Jones, David T.W.

AU - Gut, Ivo G.

PY - 2015/12/9

Y1 - 2015/12/9

UR - http://www.scopus.com/inward/record.url?scp=84949564442&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84949564442&partnerID=8YFLogxK

U2 - 10.1038/ncomms10001

DO - 10.1038/ncomms10001

M3 - Article

C2 - 26647970

AN - SCOPUS:84949564442

SN - 2041-1723

VL - 6

JO - Nature Communications

JF - Nature Communications

M1 - 10001

ER -
