Utility of the 5-Minute Apgar Score as a Research Endpoint

Marit L. Bovbjerg, Mekhala V. Dissanayake, Melissa Cheyney, Jennifer Brown, Jonathan Snowden

Research output: Contribution to journal › Article

Abstract

Although Apgar scores are commonly used as proxy outcomes, little evidence exists in support of the most common cutpoints (<7, <4). We used 2 data sets to explore this issue: one contained planned community births from across the United States (n = 52,877; 2012-2016), and the other contained hospital births from California (n = 428,877; 2010). We treated 5-minute Apgars as clinical "tests," compared against 18 known outcomes; we calculated sensitivity, specificity, positive and negative predictive values, and the area under the receiver operating characteristic curve for each. We used 3 different criteria to determine optimal cutpoints. Results were very consistent across data sets, outcomes, and all subgroups: The cutpoint that maximizes the trade-off between sensitivity and specificity is universally <9. However, extremely low positive predictive values for all outcomes at <9 indicate more misclassification than is acceptable for research. The areas under the receiver operating characteristic curves (which treat Apgars as quasicontinuous) were generally indicative of adequate discrimination between infants destined to experience poor outcomes and those not; comparing median Apgars between groups might be an analytical alternative to dichotomizing. Nonetheless, because Apgar scores are not clearly on any causal pathway of interest, we discourage researchers from using them unless the motivation for doing so is clear.
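The "Apgar as a clinical test" framing in the abstract can be illustrated with a minimal sketch. It assumes a positive test means a 5-minute Apgar score below the cutpoint, and it uses Youden's J (sensitivity + specificity - 1) as one common way to balance sensitivity against specificity; the abstract does not name the three criteria the authors used, so that choice is an assumption. The function names and the toy data are hypothetical and are not taken from the study.

```python
from typing import Dict, Sequence, Tuple


def diagnostics(scores: Sequence[int], outcomes: Sequence[bool],
                cutpoint: int) -> Dict[str, float]:
    """Treat `score < cutpoint` as a positive test and compare it with the outcome."""
    tp = fp = fn = tn = 0
    for score, had_outcome in zip(scores, outcomes):
        positive = score < cutpoint
        if positive and had_outcome:
            tp += 1
        elif positive:
            fp += 1
        elif had_outcome:
            fn += 1
        else:
            tn += 1

    def ratio(num: int, den: int) -> float:
        # Undefined when the denominator is empty (e.g. no positive tests).
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def best_cutpoint_youden(scores: Sequence[int],
                         outcomes: Sequence[bool]) -> Tuple[int, float]:
    """Return the cutpoint (1..10) with the largest Youden's J (assumed criterion)."""
    best_cut, best_j = 1, float("-inf")
    for cut in range(1, 11):
        d = diagnostics(scores, outcomes, cut)
        j = d["sensitivity"] + d["specificity"] - 1
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j


def auc(scores: Sequence[int], outcomes: Sequence[bool]) -> float:
    """Area under the ROC curve, treating the score as quasicontinuous.

    Computed as the probability that a randomly chosen infant with the
    outcome has a lower score than one without it; ties count one half.
    """
    cases = [s for s, o in zip(scores, outcomes) if o]
    controls = [s for s, o in zip(scores, outcomes) if not o]
    wins = 0.0
    for a in cases:
        for b in controls:
            if a < b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Invented toy data: 4 infants with a poor outcome, 8 without.
toy_scores = [3, 6, 8, 9, 8, 9, 9, 9, 10, 10, 10, 10]
toy_outcomes = [True] * 4 + [False] * 8

if __name__ == "__main__":
    for cut in (4, 7, 9):
        print(cut, diagnostics(toy_scores, toy_outcomes, cut))
    print("Youden-optimal cutpoint:", best_cutpoint_youden(toy_scores, toy_outcomes))
    print("AUC:", auc(toy_scores, toy_outcomes))
```

The toy data have a far higher outcome prevalence than real birth cohorts. With rare outcomes, false positives outnumber true positives even at high specificity, which is consistent with the very low positive predictive values the abstract reports.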

Original language: English (US)
Pages (from-to): 1695-1704
Number of pages: 10
Journal: American Journal of Epidemiology
Volume: 188
Issue number: 9
DOI: https://doi.org/10.1093/aje/kwz132
State: Published - Sep 1 2019

Fingerprint

Apgar Score
ROC Curve
Parturition
Sensitivity and Specificity
Proxy
Research
Research Personnel
Datasets

Keywords

  • Apgar score
  • infant health
  • ROC curve

ASJC Scopus subject areas

  • Epidemiology

Cite this

Utility of the 5-Minute Apgar Score as a Research Endpoint. / Bovbjerg, Marit L.; Dissanayake, Mekhala V.; Cheyney, Melissa; Brown, Jennifer; Snowden, Jonathan.

In: American Journal of Epidemiology, Vol. 188, No. 9, 01.09.2019, p. 1695-1704.

Bovbjerg, ML, Dissanayake, MV, Cheyney, M, Brown, J & Snowden, J 2019, 'Utility of the 5-Minute Apgar Score as a Research Endpoint', American Journal of Epidemiology, vol. 188, no. 9, pp. 1695-1704. https://doi.org/10.1093/aje/kwz132
@article{1414324d58ff457d96f4a12236226096,
title = "Utility of the 5-Minute Apgar Score as a Research Endpoint",
keywords = "Apgar score, infant health, ROC curve",
author = "Bovbjerg, {Marit L.} and Dissanayake, {Mekhala V.} and Melissa Cheyney and Jennifer Brown and Jonathan Snowden",
year = "2019",
month = "9",
day = "1",
doi = "10.1093/aje/kwz132",
language = "English (US)",
volume = "188",
pages = "1695--1704",
journal = "American Journal of Epidemiology",
issn = "0002-9262",
publisher = "Oxford University Press",
number = "9",
}

TY - JOUR

T1 - Utility of the 5-Minute Apgar Score as a Research Endpoint

AU - Bovbjerg, Marit L.

AU - Dissanayake, Mekhala V.

AU - Cheyney, Melissa

AU - Brown, Jennifer

AU - Snowden, Jonathan

PY - 2019/9/1

Y1 - 2019/9/1

N2 - Although Apgar scores are commonly used as proxy outcomes, little evidence exists in support of the most common cutpoints (<7, <4). We used 2 data sets to explore this issue: one contained planned community births from across the United States (n = 52,877; 2012-2016), and the other contained hospital births from California (n = 428,877; 2010). We treated 5-minute Apgars as clinical "tests," compared against 18 known outcomes; we calculated sensitivity, specificity, positive and negative predictive values, and the area under the receiver operating characteristic curve for each. We used 3 different criteria to determine optimal cutpoints. Results were very consistent across data sets, outcomes, and all subgroups: The cutpoint that maximizes the trade-off between sensitivity and specificity is universally <9. However, extremely low positive predictive values for all outcomes at <9 indicate more misclassification than is acceptable for research. The areas under the receiver operating characteristic curves (which treat Apgars as quasicontinuous) were generally indicative of adequate discrimination between infants destined to experience poor outcomes and those not; comparing median Apgars between groups might be an analytical alternative to dichotomizing. Nonetheless, because Apgar scores are not clearly on any causal pathway of interest, we discourage researchers from using them unless the motivation for doing so is clear.

KW - Apgar score

KW - infant health

KW - ROC curve

UR - http://www.scopus.com/inward/record.url?scp=85072059150&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85072059150&partnerID=8YFLogxK

U2 - 10.1093/aje/kwz132

DO - 10.1093/aje/kwz132

M3 - Article

C2 - 31145428

AN - SCOPUS:85072059150

VL - 188

SP - 1695

EP - 1704

JO - American Journal of Epidemiology

JF - American Journal of Epidemiology

SN - 0002-9262

IS - 9

ER -