Towards a comprehensive structural variation map of an individual human genome

Andy W. Pang; Jeffrey R. MacDonald; Dalila Pinto; John Wei; Muhammad A. Rafiq; Donald F. Conrad; Hansoo Park; Matthew E. Hurles; Charles Lee; J. Craig Venter; Ewen F. Kirkness; Samuel Levy; Lars Feuk; Stephen W. Scherer

doi:10.1186/gb-2010-11-5-r52

Research output: Contribution to journal › Editorial › peer-review

227 Scopus citations

Abstract

Background: Several genomes have now been sequenced, with millions of genetic variants annotated. While significant progress has been made in mapping single nucleotide polymorphisms (SNPs) and small (<10 bp) insertion/deletions (indels), the annotation of larger structural variants has been less comprehensive. It is still unclear to what extent a typical genome differs from the reference assembly, and the analysis of the genomes sequenced to date have shown varying results for copy number variation (CNV) and inversions.

Results: We have combined computational re-analysis of existing whole genome sequence data with novel microarray-based analysis, and detect 12,178 structural variants covering 40.6 Mb that were not reported in the initial sequencing of the first published personal genome. We estimate a total non-SNP variation content of 48.8 Mb in a single genome. Our results indicate that this genome differs from the consensus reference sequence by approximately 1.2% when considering indels/CNVs, 0.1% by SNPs and approximately 0.3% by inversions. The structural variants impact 4,867 genes, and >24% of structural variants would not be imputed by SNP-association.

Conclusions: Our results indicate that a large number of structural variants have been unreported in the individual genomes published to date. This significant extent and complexity of structural variants, as well as the growing recognition of their medical relevance, necessitate they be actively studied in health-related analyses of personal genomes. The new catalogue of structural variants generated for this genome provides a crucial resource for future comparison studies.

Original language	English (US)
Article number	R52
Journal	Genome Biology
Volume	11
Issue number	5
DOIs	https://doi.org/10.1186/gb-2010-11-5-r52
State	Published - May 19 2010
Externally published	Yes

ASJC Scopus subject areas

Ecology, Evolution, Behavior and Systematics
Genetics
Cell Biology

Access to Document

10.1186/gb-2010-11-5-r52

Cite this

@article{bab468ebacce4d3d82f0341fb21ee1ba,

title = "Towards a comprehensive structural variation map of an individual human genome",

abstract = "Background: Several genomes have now been sequenced, with millions of genetic variants annotated. While significant progress has been made in mapping single nucleotide polymorphisms (SNPs) and small (<10 bp) insertion/deletions (indels), the annotation of larger structural variants has been less comprehensive. It is still unclear to what extent a typical genome differs from the reference assembly, and the analysis of the genomes sequenced to date have shown varying results for copy number variation (CNV) and inversions. Results: We have combined computational re-analysis of existing whole genome sequence data with novel microarray-based analysis, and detect 12,178 structural variants covering 40.6 Mb that were not reported in the initial sequencing of the first published personal genome. We estimate a total non-SNP variation content of 48.8 Mb in a single genome. Our results indicate that this genome differs from the consensus reference sequence by approximately 1.2% when considering indels/CNVs, 0.1% by SNPs and approximately 0.3% by inversions. The structural variants impact 4,867 genes, and >24% of structural variants would not be imputed by SNP-association. Conclusions: Our results indicate that a large number of structural variants have been unreported in the individual genomes published to date. This significant extent and complexity of structural variants, as well as the growing recognition of their medical relevance, necessitate they be actively studied in health-related analyses of personal genomes. The new catalogue of structural variants generated for this genome provides a crucial resource for future comparison studies.",

author = "Pang, {Andy W.} and MacDonald, {Jeffrey R.} and Dalila Pinto and John Wei and Rafiq, {Muhammad A.} and Conrad, {Donald F.} and Hansoo Park and Hurles, {Matthew E.} and Charles Lee and Venter, {J. Craig} and Kirkness, {Ewen F.} and Samuel Levy and Lars Feuk and Scherer, {Stephen W.}",

note = "Funding Information: The work is supported by Genome Canada/Ontario Genomics Institute, the Canadian Institutes of Health Research (CIHR), the McLaughlin Centre for Molecular Medicine, the Canadian Institute for Advanced Research, and the Hospital for Sick Children (SickKids) Foundation. AWP holds the Natural Sciences and Engineering Research Council of Canada (NSERC) Alexander Graham Bell Canada Graduate Scholarship. DP is supported by fellowships from the Royal Netherlands Academy of Arts and Sciences (TMF/DA/5801) and the Netherlands Organization for Scientific Research (Rubicon, 825.06.031). LF is supported by the G{\"o}ran Gustafsson Foundation and the Swedish Foundation for Strategic Research. SWS holds the GlaxoSmithKline-CIHR Pathfinder Chair in Genetics and Genomics at the University of Toronto and Hospital for Sick Children.",

year = "2010",

month = may,

day = "19",

doi = "10.1186/gb-2010-11-5-r52",

language = "English (US)",

volume = "11",

journal = "Genome Biology",

issn = "1474-7596",

publisher = "BioMed Central",

number = "5",

}

TY - JOUR

T1 - Towards a comprehensive structural variation map of an individual human genome

AU - Pang, Andy W.

AU - MacDonald, Jeffrey R.

AU - Pinto, Dalila

AU - Wei, John

AU - Rafiq, Muhammad A.

AU - Conrad, Donald F.

AU - Park, Hansoo

AU - Hurles, Matthew E.

AU - Lee, Charles

AU - Venter, J. Craig

AU - Kirkness, Ewen F.

AU - Levy, Samuel

AU - Feuk, Lars

AU - Scherer, Stephen W.

N1 - Funding Information: The work is supported by Genome Canada/Ontario Genomics Institute, the Canadian Institutes of Health Research (CIHR), the McLaughlin Centre for Molecular Medicine, the Canadian Institute for Advanced Research, and the Hospital for Sick Children (SickKids) Foundation. AWP holds the Natural Sciences and Engineering Research Council of Canada (NSERC) Alexander Graham Bell Canada Graduate Scholarship. DP is supported by fellowships from the Royal Netherlands Academy of Arts and Sciences (TMF/DA/5801) and the Netherlands Organization for Scientific Research (Rubicon, 825.06.031). LF is supported by the Göran Gustafsson Foundation and the Swedish Foundation for Strategic Research. SWS holds the GlaxoSmithKline-CIHR Pathfinder Chair in Genetics and Genomics at the University of Toronto and Hospital for Sick Children.

PY - 2010/5/19

Y1 - 2010/5/19

N2 - Background: Several genomes have now been sequenced, with millions of genetic variants annotated. While significant progress has been made in mapping single nucleotide polymorphisms (SNPs) and small (<10 bp) insertion/deletions (indels), the annotation of larger structural variants has been less comprehensive. It is still unclear to what extent a typical genome differs from the reference assembly, and the analysis of the genomes sequenced to date have shown varying results for copy number variation (CNV) and inversions. Results: We have combined computational re-analysis of existing whole genome sequence data with novel microarray-based analysis, and detect 12,178 structural variants covering 40.6 Mb that were not reported in the initial sequencing of the first published personal genome. We estimate a total non-SNP variation content of 48.8 Mb in a single genome. Our results indicate that this genome differs from the consensus reference sequence by approximately 1.2% when considering indels/CNVs, 0.1% by SNPs and approximately 0.3% by inversions. The structural variants impact 4,867 genes, and >24% of structural variants would not be imputed by SNP-association. Conclusions: Our results indicate that a large number of structural variants have been unreported in the individual genomes published to date. This significant extent and complexity of structural variants, as well as the growing recognition of their medical relevance, necessitate they be actively studied in health-related analyses of personal genomes. The new catalogue of structural variants generated for this genome provides a crucial resource for future comparison studies.

AB - Background: Several genomes have now been sequenced, with millions of genetic variants annotated. While significant progress has been made in mapping single nucleotide polymorphisms (SNPs) and small (<10 bp) insertion/deletions (indels), the annotation of larger structural variants has been less comprehensive. It is still unclear to what extent a typical genome differs from the reference assembly, and the analysis of the genomes sequenced to date have shown varying results for copy number variation (CNV) and inversions. Results: We have combined computational re-analysis of existing whole genome sequence data with novel microarray-based analysis, and detect 12,178 structural variants covering 40.6 Mb that were not reported in the initial sequencing of the first published personal genome. We estimate a total non-SNP variation content of 48.8 Mb in a single genome. Our results indicate that this genome differs from the consensus reference sequence by approximately 1.2% when considering indels/CNVs, 0.1% by SNPs and approximately 0.3% by inversions. The structural variants impact 4,867 genes, and >24% of structural variants would not be imputed by SNP-association. Conclusions: Our results indicate that a large number of structural variants have been unreported in the individual genomes published to date. This significant extent and complexity of structural variants, as well as the growing recognition of their medical relevance, necessitate they be actively studied in health-related analyses of personal genomes. The new catalogue of structural variants generated for this genome provides a crucial resource for future comparison studies.

UR - http://www.scopus.com/inward/record.url?scp=77952296952&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77952296952&partnerID=8YFLogxK

U2 - 10.1186/gb-2010-11-5-r52

DO - 10.1186/gb-2010-11-5-r52

M3 - Editorial

C2 - 20482838

AN - SCOPUS:77952296952

SN - 1474-7596

VL - 11

JO - Genome Biology

JF - Genome Biology

IS - 5

M1 - R52

ER -
