Prevalence estimation for monogenic autosomal recessive diseases using population-based genetic data

Steven J. Schrodi, Andrea De Barber, Max He, Zhan Ye, Peggy Peissig, Jeffrey J. Van Wormer, Robert Haws, Murray H. Brilliant, Robert D. Steiner

Research output: Contribution to journal › Article

5 Citations (Scopus)

Abstract

Genetic methods can complement epidemiological surveys and clinical registries in determining prevalence of monogenic autosomal recessive diseases. Several large population-based genetic databases, such as the NHLBI GO Exome Sequencing Project, are now publicly available. By assuming Hardy–Weinberg equilibrium, the frequency of individuals homozygous in the general population for a particular pathogenic allele can be directly calculated from a sample of chromosomes where some harbor the pathogenic allele. Further assuming that the penetrance of the pathogenic allele(s) is known, the prevalence of recessive phenotypes can be determined. Such work can inform public health efforts for rare recessive diseases. A Bayesian estimation procedure has yet to be applied to the problem of estimating disease prevalence from large population-based genetic data. A Bayesian framework is developed to derive the posterior probability density of the prevalence of monogenic, autosomal recessive phenotypes. Explicit equations are presented for the credible intervals of these disease prevalence estimates. A primary impediment to performing accurate disease prevalence calculations is the determination of truly pathogenic alleles. This issue is discussed, but in many instances remains a significant barrier to investigations solely reliant on statistical interrogation; functional studies can provide important information for solidifying evidence of variant pathogenicity. We also discuss several challenges to these efforts, including the population structure in the sample of chromosomes, the treatment of allelic heterogeneity, and reduced penetrance of pathogenic variants. To illustrate the application of these methods, we utilized recently published genetic data collected on a large sample from the Schmiedeleut Hutterites. We estimate prevalence and calculate 95% credible intervals for 13 autosomal recessive diseases using these data. In addition, the Bayesian estimation procedure is applied to data from a central European study of hereditary fructose intolerance. The methods described herein show a viable path to robustly estimating both the expected prevalence of autosomal recessive phenotypes and corresponding credible intervals using population-based genetic databases that have recently become available. As these genetic databases increase in number and size with the advent of cost-effective next-generation sequencing, we anticipate that these methods and approaches may be helpful in recessive disease prevalence calculations, potentially impacting public health management, health economic analyses, and treatment of rare diseases.
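
The reasoning in the abstract (allele counts, Hardy–Weinberg equilibrium, a known penetrance, and a Bayesian posterior with a credible interval) can be illustrated with a minimal sketch. This is not the authors' derivation: the paper presents explicit equations, whereas the sketch below assumes a Beta prior on the pathogenic allele frequency (uniform by default) and approximates the posterior of prevalence by Monte Carlo sampling. The function name `prevalence_posterior`, its parameters, and the example counts are all hypothetical.

```python
import random


def prevalence_posterior(k, n, penetrance=1.0, prior_a=1.0, prior_b=1.0,
                         draws=100_000, level=0.95, seed=0):
    """Monte Carlo summary of the posterior of recessive disease prevalence.

    k: pathogenic alleles observed, n: chromosomes sampled.
    Assumes a Beta(prior_a, prior_b) prior on the allele frequency q
    (an illustrative choice, not taken from the paper).
    """
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    rng = random.Random(seed)
    # Binomial sampling of alleles with a Beta prior gives a Beta posterior.
    a, b = prior_a + k, prior_b + n - k
    # Under Hardy-Weinberg equilibrium the homozygote frequency is q**2;
    # prevalence scales that by the (assumed known) penetrance.
    samples = sorted(penetrance * rng.betavariate(a, b) ** 2
                     for _ in range(draws))
    tail = (1.0 - level) / 2.0
    lo = samples[int(tail * draws)]
    hi = samples[int((1.0 - tail) * draws) - 1]
    mean = sum(samples) / draws
    return mean, lo, hi


# Hypothetical data: 10 pathogenic alleles among 1000 chromosomes.
mean, lo, hi = prevalence_posterior(k=10, n=1000)
print(f"posterior mean prevalence: {mean:.2e}")
print(f"95% credible interval: ({lo:.2e}, {hi:.2e})")
```

Because prevalence is a monotone function of the allele frequency, an equal-tailed interval for prevalence could also be obtained by transforming Beta quantiles directly; sampling is used here only to keep the sketch within the standard library.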

Original language: English (US)
Pages (from-to): 659-669
Number of pages: 11
Journal: Human Genetics
Volume: 134
Issue number: 6
DOI: https://doi.org/10.1007/s00439-015-1551-8
State: Published - Jun 1 2015

Fingerprint

Population Genetics
Genetic Databases
Alleles
Penetrance
Rare Diseases
Phenotype
Fructose Intolerance
Public Health
Chromosomes
Exome
National Heart, Lung, and Blood Institute (U.S.)
Population
Virulence
Registries
Economics
Costs and Cost Analysis
Health

ASJC Scopus subject areas

  • Genetics (clinical)
  • Genetics
  • Medicine (all)

Cite this

Schrodi, S. J., De Barber, A., He, M., Ye, Z., Peissig, P., Van Wormer, J. J., ... Steiner, R. D. (2015). Prevalence estimation for monogenic autosomal recessive diseases using population-based genetic data. Human Genetics, 134(6), 659-669. https://doi.org/10.1007/s00439-015-1551-8

@article{7e178734dd5c4bd682ff048696f5d844,
title = "Prevalence estimation for monogenic autosomal recessive diseases using population-based genetic data",
author = "Schrodi, {Steven J.} and {De Barber}, Andrea and Max He and Zhan Ye and Peggy Peissig and {Van Wormer}, {Jeffrey J.} and Robert Haws and Brilliant, {Murray H.} and Steiner, {Robert D.}",
year = "2015",
month = "6",
day = "1",
doi = "10.1007/s00439-015-1551-8",
language = "English (US)",
volume = "134",
pages = "659--669",
journal = "Human Genetics",
issn = "0340-6717",
publisher = "Springer Verlag",
number = "6",

}

TY - JOUR

T1 - Prevalence estimation for monogenic autosomal recessive diseases using population-based genetic data

AU - Schrodi, Steven J.

AU - De Barber, Andrea

AU - He, Max

AU - Ye, Zhan

AU - Peissig, Peggy

AU - Van Wormer, Jeffrey J.

AU - Haws, Robert

AU - Brilliant, Murray H.

AU - Steiner, Robert D.

PY - 2015/6/1

Y1 - 2015/6/1

UR - http://www.scopus.com/inward/record.url?scp=84936746793&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84936746793&partnerID=8YFLogxK

U2 - 10.1007/s00439-015-1551-8

DO - 10.1007/s00439-015-1551-8

M3 - Article

C2 - 25893794

AN - SCOPUS:84936746793

VL - 134

SP - 659

EP - 669

JO - Human Genetics

JF - Human Genetics

SN - 0340-6717

IS - 6

ER -