Quantitative imaging biomarkers: A review of statistical methods for computer algorithm comparisons

Nancy A. Obuchowski, Anthony P. Reeves, Erich P. Huang, Xiao Feng Wang, Andrew J. Buckler, Hyun J. Kim, Huiman X. Barnhart, Edward F. Jackson, Maryellen L. Giger, Gene Pennello, Alicia Y. Toledano, Jayashree Kalpathy-Cramer, Tatiyana V. Apanasovich, Paul E. Kinahan, Kyle J. Myers, Dmitry B. Goldgof, Daniel P. Barboriak, Robert J. Gillies, Lawrence H. Schwartz, Daniel C. Sullivan

Research output: Contribution to journal › Article

54 Citations (Scopus)

Abstract

Quantitative biomarkers from medical images are becoming important tools for clinical diagnosis, staging, monitoring, treatment planning, and development of new therapies. While there is a rich history of the development of quantitative imaging biomarker (QIB) techniques, little attention has been paid to the validation and comparison of the computer algorithms that implement the QIB measurements. In this paper we provide a framework for QIB algorithm comparisons. We first review and compare various study designs, including designs with the true value (e.g. phantoms, digital reference images, and zero-change studies), designs with a reference standard (e.g. studies testing equivalence with a reference standard), and designs without a reference standard (e.g. agreement studies and studies of algorithm precision). The statistical methods for comparing QIB algorithms are then presented for various study types using both aggregate and disaggregate approaches. We propose a series of steps for establishing the performance of a QIB algorithm, identify limitations in the current statistical literature, and suggest future directions for research.
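As a rough illustration of two of the quantities named in the abstract and keywords (bias in a design with known true values, and repeatability in a test-retest design), the following minimal sketch may help. It is not taken from the paper: the function names and the toy numbers are hypothetical, and the repeatability coefficient uses the common convention RC = 1.96 × √2 × wSD, which is assumed here rather than quoted from the article.

```python
import math
from statistics import mean


def bias(measurements, truths):
    """Mean signed difference between an algorithm's measurements and
    the known true values (e.g. from a phantom study)."""
    return mean(m - t for m, t in zip(measurements, truths))


def within_subject_sd(replicates):
    """Pooled within-subject standard deviation (wSD).

    `replicates` is a list of lists: repeated measurements of the same
    case under identical conditions, at least two per case.
    """
    sum_sq = 0.0
    dof = 0
    for reps in replicates:
        case_mean = mean(reps)
        sum_sq += sum((x - case_mean) ** 2 for x in reps)
        dof += len(reps) - 1
    return math.sqrt(sum_sq / dof)


def repeatability_coefficient(replicates):
    """RC = 1.96 * sqrt(2) * wSD (assumed convention, see lead-in)."""
    return 1.96 * math.sqrt(2) * within_subject_sd(replicates)


# Hypothetical toy data, invented for illustration only.
true_volumes = [10.0, 20.0, 30.0]
algorithm_a = [11.0, 21.0, 31.0]
test_retest = [[10.0, 12.0], [20.0, 22.0], [30.0, 32.0]]

print("bias:", bias(algorithm_a, true_volumes))
print("wSD:", within_subject_sd(test_retest))
print("RC:", repeatability_coefficient(test_retest))
```

Comparing two algorithms on the same cases would then amount to computing these summaries for each one and, in a real study, attaching confidence intervals, which this sketch omits.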

Original language: English (US)
Pages (from-to): 68-106
Number of pages: 39
Journal: Statistical Methods in Medical Research
Volume: 24
Issue number: 1
DOI: https://doi.org/10.1177/0962280214537390
State: Published - Feb 27 2015
Externally published: Yes

Keywords

  • agreement
  • bias
  • image metrics
  • imaging biomarkers
  • precision
  • quantitative imaging
  • repeatability
  • reproducibility

ASJC Scopus subject areas

  • Epidemiology
  • Health Information Management
  • Statistics and Probability

Cite this

Obuchowski, N. A., Reeves, A. P., Huang, E. P., Wang, X. F., Buckler, A. J., Kim, H. J., ... Sullivan, D. C. (2015). Quantitative imaging biomarkers: A review of statistical methods for computer algorithm comparisons. Statistical Methods in Medical Research, 24(1), 68-106. https://doi.org/10.1177/0962280214537390

@article{93099c18c1d64dee9f8bf6f1e2380b2d,
title = "Quantitative imaging biomarkers: A review of statistical methods for computer algorithm comparisons",
abstract = "Quantitative biomarkers from medical images are becoming important tools for clinical diagnosis, staging, monitoring, treatment planning, and development of new therapies. While there is a rich history of the development of quantitative imaging biomarker (QIB) techniques, little attention has been paid to the validation and comparison of the computer algorithms that implement the QIB measurements. In this paper we provide a framework for QIB algorithm comparisons. We first review and compare various study designs, including designs with the true value (e.g. phantoms, digital reference images, and zero-change studies), designs with a reference standard (e.g. studies testing equivalence with a reference standard), and designs without a reference standard (e.g. agreement studies and studies of algorithm precision). The statistical methods for comparing QIB algorithms are then presented for various study types using both aggregate and disaggregate approaches. We propose a series of steps for establishing the performance of a QIB algorithm, identify limitations in the current statistical literature, and suggest future directions for research.",
keywords = "agreement, bias, image metrics, imaging biomarkers, precision, quantitative imaging, repeatability, reproducibility",
author = "Obuchowski, {Nancy A.} and Reeves, {Anthony P.} and Huang, {Erich P.} and Wang, {Xiao Feng} and Buckler, {Andrew J.} and Kim, {Hyun J.} and Barnhart, {Huiman X.} and Jackson, {Edward F.} and Giger, {Maryellen L.} and Gene Pennello and Toledano, {Alicia Y.} and Jayashree Kalpathy-Cramer and Apanasovich, {Tatiyana V.} and Kinahan, {Paul E.} and Myers, {Kyle J.} and Goldgof, {Dmitry B.} and Barboriak, {Daniel P.} and Gillies, {Robert J.} and Schwartz, {Lawrence H.} and Sullivan, {Daniel C.}",
year = "2015",
month = "2",
day = "27",
doi = "10.1177/0962280214537390",
language = "English (US)",
volume = "24",
pages = "68--106",
journal = "Statistical Methods in Medical Research",
issn = "0962-2802",
publisher = "SAGE Publications Ltd",
number = "1",
}

TY  - JOUR
T1  - Quantitative imaging biomarkers
T2  - A review of statistical methods for computer algorithm comparisons
AU  - Obuchowski, Nancy A.
AU  - Reeves, Anthony P.
AU  - Huang, Erich P.
AU  - Wang, Xiao Feng
AU  - Buckler, Andrew J.
AU  - Kim, Hyun J.
AU  - Barnhart, Huiman X.
AU  - Jackson, Edward F.
AU  - Giger, Maryellen L.
AU  - Pennello, Gene
AU  - Toledano, Alicia Y.
AU  - Kalpathy-Cramer, Jayashree
AU  - Apanasovich, Tatiyana V.
AU  - Kinahan, Paul E.
AU  - Myers, Kyle J.
AU  - Goldgof, Dmitry B.
AU  - Barboriak, Daniel P.
AU  - Gillies, Robert J.
AU  - Schwartz, Lawrence H.
AU  - Sullivan, Daniel C.
PY  - 2015/2/27
Y1  - 2015/2/27
AB  - Quantitative biomarkers from medical images are becoming important tools for clinical diagnosis, staging, monitoring, treatment planning, and development of new therapies. While there is a rich history of the development of quantitative imaging biomarker (QIB) techniques, little attention has been paid to the validation and comparison of the computer algorithms that implement the QIB measurements. In this paper we provide a framework for QIB algorithm comparisons. We first review and compare various study designs, including designs with the true value (e.g. phantoms, digital reference images, and zero-change studies), designs with a reference standard (e.g. studies testing equivalence with a reference standard), and designs without a reference standard (e.g. agreement studies and studies of algorithm precision). The statistical methods for comparing QIB algorithms are then presented for various study types using both aggregate and disaggregate approaches. We propose a series of steps for establishing the performance of a QIB algorithm, identify limitations in the current statistical literature, and suggest future directions for research.
KW  - agreement
KW  - bias
KW  - image metrics
KW  - imaging biomarkers
KW  - precision
KW  - quantitative imaging
KW  - repeatability
KW  - reproducibility
UR  - http://www.scopus.com/inward/record.url?scp=84925671502&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84925671502&partnerID=8YFLogxK
U2  - 10.1177/0962280214537390
DO  - 10.1177/0962280214537390
M3  - Article
C2  - 24919829
AN  - SCOPUS:84925671502
VL  - 24
SP  - 68
EP  - 106
JO  - Statistical Methods in Medical Research
JF  - Statistical Methods in Medical Research
SN  - 0962-2802
IS  - 1
ER  - 