A Beta-Mixture Model for Assessing Genetic Population Structure

Rongwei (Rochelle) Fu, Dipak K. Dey, Kent E. Holsinger

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

An important fraction of recently generated molecular data is dominant markers. They contain substantial information about genetic variation but dominance makes it impossible to apply standard techniques to calculate measures of genetic differentiation, such as F-statistics. In this article, we propose a new Bayesian beta-mixture model that more accurately describes the genetic structure from dominant markers and estimates multiple F_STs from the sample. The model also has important application for codominant markers and single-nucleotide polymorphism (SNP) data. The number of F_ST is assumed unknown beforehand and follows a random distribution. The reversible jump algorithm is used to estimate the unknown number of multiple F_STs. We evaluate the performance of three split proposals and the overall performance of the proposed model based on simulated dominant marker data. The model could reliably identify and estimate a spectrum of degrees of genetic differentiation present in multiple loci. The estimates of F_STs also incorporate uncertainty about the magnitude of within-population inbreeding coefficient. We illustrate the method with two examples, one using dominant marker data from a rare orchid and the other using codominant marker data from human populations.
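
As a rough illustration of the kind of data the abstract describes, and not the authors' model or code, the sketch below simulates dominant-marker counts when different loci have different F_ST values. It assumes a common textbook parameterization that is not stated in this record: population allele frequencies at a locus are drawn from a beta distribution with mean pi and variance F_ST * pi * (1 - pi), and the recessive (band-absent) phenotype has probability p^2 + f * p * (1 - p), where f is the within-population inbreeding coefficient. The function name and all parameters are hypothetical.

```python
import random


def simulate_dominant_counts(fst_by_locus, mean_freq, n_pops, n_ind,
                             f_is=0.0, seed=0):
    """Simulate recessive-phenotype counts per locus and population.

    Assumptions (illustrative only, not taken from the paper):
      * fst_by_locus: one F_ST value in (0, 1) per locus
      * mean_freq: mean frequency of the recessive allele, in (0, 1)
      * f_is: within-population inbreeding coefficient, in [0, 1]
    Returns a list of rows, one per locus, each with n_pops counts.
    """
    rng = random.Random(seed)
    counts = []
    for theta in fst_by_locus:
        # Beta(a, b) with a + b = (1 - theta) / theta has mean mean_freq
        # and variance theta * mean_freq * (1 - mean_freq).
        scale = (1.0 - theta) / theta
        a = mean_freq * scale
        b = (1.0 - mean_freq) * scale
        row = []
        for _ in range(n_pops):
            p = rng.betavariate(a, b)  # recessive allele frequency
            # With a dominant marker only the recessive homozygote is
            # distinguishable; its frequency depends on both p and f_is.
            prob_recessive = p * p + f_is * p * (1.0 - p)
            row.append(sum(rng.random() < prob_recessive
                           for _ in range(n_ind)))
        counts.append(row)
    return counts


if __name__ == "__main__":
    # Two loci with weak and strong differentiation, four populations.
    for row in simulate_dominant_counts([0.02, 0.4], mean_freq=0.5,
                                        n_pops=4, n_ind=50, f_is=0.1):
        print(row)
```

Under these assumptions the observed count depends on p and f jointly, which is one way to see why allele frequencies, and hence F-statistics, cannot be read directly from dominant-marker phenotypes.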

Original language: English (US)
Pages (from-to): 1073-1082
Number of pages: 10
Journal: Biometrics
Volume: 67
Issue number: 3
DOI: 10.1111/j.1541-0420.2010.01506.x
State: Published - Sep 2011

Fingerprint

Population Structure
Genetic Structures
Mixture Model
Inbreeding
genetic variation
Population
Uncertainty
Single Nucleotide Polymorphism
Estimate
Reversible Jump
inbreeding coefficient
F-statistics
Nucleotides
Unknown
Polymorphism
human population

Keywords

  • Allele frequency
  • Bayesian modeling
  • Beta mixture
  • F_ST
  • Inbreeding coefficient
  • Reversible jump algorithm

ASJC Scopus subject areas

  • Applied Mathematics
  • Statistics and Probability
  • Agricultural and Biological Sciences(all)
  • Biochemistry, Genetics and Molecular Biology(all)
  • Immunology and Microbiology(all)
  • Medicine(all)

Cite this

A Beta-Mixture Model for Assessing Genetic Population Structure. / Fu, Rongwei (Rochelle); Dey, Dipak K.; Holsinger, Kent E.

In: Biometrics, Vol. 67, No. 3, 09.2011, p. 1073-1082.

Fu, Rongwei (Rochelle) ; Dey, Dipak K. ; Holsinger, Kent E. / A Beta-Mixture Model for Assessing Genetic Population Structure. In: Biometrics. 2011 ; Vol. 67, No. 3. pp. 1073-1082.
@article{8e69eff0e6ce4343a5de9e13710e0d2e,
title = "A Beta-Mixture Model for Assessing Genetic Population Structure",
abstract = "An important fraction of recently generated molecular data is dominant markers. They contain substantial information about genetic variation but dominance makes it impossible to apply standard techniques to calculate measures of genetic differentiation, such as F-statistics. In this article, we propose a new Bayesian beta-mixture model that more accurately describes the genetic structure from dominant markers and estimates multiple $F_{ST}$s from the sample. The model also has important application for codominant markers and single-nucleotide polymorphism (SNP) data. The number of $F_{ST}$ is assumed unknown beforehand and follows a random distribution. The reversible jump algorithm is used to estimate the unknown number of multiple $F_{ST}$s. We evaluate the performance of three split proposals and the overall performance of the proposed model based on simulated dominant marker data. The model could reliably identify and estimate a spectrum of degrees of genetic differentiation present in multiple loci. The estimates of $F_{ST}$s also incorporate uncertainty about the magnitude of within-population inbreeding coefficient. We illustrate the method with two examples, one using dominant marker data from a rare orchid and the other using codominant marker data from human populations.",
keywords = "Allele frequency, Bayesian modeling, Beta mixture, $F_{ST}$, Inbreeding coefficient, Reversible jump algorithm",
author = "Fu, {Rongwei (Rochelle)} and Dey, {Dipak K.} and Holsinger, {Kent E.}",
year = "2011",
month = "9",
doi = "10.1111/j.1541-0420.2010.01506.x",
language = "English (US)",
volume = "67",
pages = "1073--1082",
journal = "Biometrics",
issn = "0006-341X",
publisher = "Wiley-Blackwell",
number = "3",

}

TY - JOUR

T1 - A Beta-Mixture Model for Assessing Genetic Population Structure

AU - Fu, Rongwei (Rochelle)

AU - Dey, Dipak K.

AU - Holsinger, Kent E.

PY - 2011/9

Y1 - 2011/9

N2 - An important fraction of recently generated molecular data is dominant markers. They contain substantial information about genetic variation but dominance makes it impossible to apply standard techniques to calculate measures of genetic differentiation, such as F-statistics. In this article, we propose a new Bayesian beta-mixture model that more accurately describes the genetic structure from dominant markers and estimates multiple F_STs from the sample. The model also has important application for codominant markers and single-nucleotide polymorphism (SNP) data. The number of F_ST is assumed unknown beforehand and follows a random distribution. The reversible jump algorithm is used to estimate the unknown number of multiple F_STs. We evaluate the performance of three split proposals and the overall performance of the proposed model based on simulated dominant marker data. The model could reliably identify and estimate a spectrum of degrees of genetic differentiation present in multiple loci. The estimates of F_STs also incorporate uncertainty about the magnitude of within-population inbreeding coefficient. We illustrate the method with two examples, one using dominant marker data from a rare orchid and the other using codominant marker data from human populations.

AB - An important fraction of recently generated molecular data is dominant markers. They contain substantial information about genetic variation but dominance makes it impossible to apply standard techniques to calculate measures of genetic differentiation, such as F-statistics. In this article, we propose a new Bayesian beta-mixture model that more accurately describes the genetic structure from dominant markers and estimates multiple F_STs from the sample. The model also has important application for codominant markers and single-nucleotide polymorphism (SNP) data. The number of F_ST is assumed unknown beforehand and follows a random distribution. The reversible jump algorithm is used to estimate the unknown number of multiple F_STs. We evaluate the performance of three split proposals and the overall performance of the proposed model based on simulated dominant marker data. The model could reliably identify and estimate a spectrum of degrees of genetic differentiation present in multiple loci. The estimates of F_STs also incorporate uncertainty about the magnitude of within-population inbreeding coefficient. We illustrate the method with two examples, one using dominant marker data from a rare orchid and the other using codominant marker data from human populations.

KW - Allele frequency

KW - Bayesian modeling

KW - Beta mixture

KW - F_ST

KW - Inbreeding coefficient

KW - Reversible jump algorithm

UR - http://www.scopus.com/inward/record.url?scp=80052810183&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=80052810183&partnerID=8YFLogxK

U2 - 10.1111/j.1541-0420.2010.01506.x

DO - 10.1111/j.1541-0420.2010.01506.x

M3 - Article

C2 - 21114661

AN - SCOPUS:80052810183

VL - 67

SP - 1073

EP - 1082

JO - Biometrics

JF - Biometrics

SN - 0006-341X

IS - 3

ER -