Quantitative imaging biomarkers: A review of statistical methods for computer algorithm comparisons

Nancy A. Obuchowski; Anthony P. Reeves; Erich P. Huang; Xiao Feng Wang; Andrew J. Buckler; Hyun J. Kim; Huiman X. Barnhart; Edward F. Jackson; Maryellen L. Giger; Gene Pennello; Alicia Y. Toledano; Jayashree Kalpathy-Cramer; Tatiyana V. Apanasovich; Paul E. Kinahan; Kyle J. Myers; Dmitry B. Goldgof; Daniel P. Barboriak; Robert J. Gillies; Lawrence H. Schwartz; Daniel C. Sullivan

doi:10.1177/0962280214537390

Research output: Contribution to journal › Review article › peer-review

126 Scopus citations

Abstract

Quantitative biomarkers from medical images are becoming important tools for clinical diagnosis, staging, monitoring, treatment planning, and development of new therapies. While there is a rich history of the development of quantitative imaging biomarker (QIB) techniques, little attention has been paid to the validation and comparison of the computer algorithms that implement the QIB measurements. In this paper we provide a framework for QIB algorithm comparisons. We first review and compare various study designs, including designs with the true value (e.g. phantoms, digital reference images, and zero-change studies), designs with a reference standard (e.g. studies testing equivalence with a reference standard), and designs without a reference standard (e.g. agreement studies and studies of algorithm precision). The statistical methods for comparing QIB algorithms are then presented for various study types using both aggregate and disaggregate approaches. We propose a series of steps for establishing the performance of a QIB algorithm, identify limitations in the current statistical literature, and suggest future directions for research.

Original language	English (US)
Pages (from-to)	68-106
Number of pages	39
Journal	Statistical methods in medical research
Volume	24
Issue number	1
DOIs	https://doi.org/10.1177/0962280214537390
State	Published - Feb 27 2015
Externally published	Yes

Keywords

agreement
bias
image metrics
imaging biomarkers
precision
quantitative imaging
repeatability
reproducibility

ASJC Scopus subject areas

Epidemiology
Statistics and Probability
Health Information Management

Cite this

Obuchowski, N. A., Reeves, A. P., Huang, E. P., Wang, X. F., Buckler, A. J., Kim, H. J., Barnhart, H. X., Jackson, E. F., Giger, M. L., Pennello, G., Toledano, A. Y., Kalpathy-Cramer, J., Apanasovich, T. V., Kinahan, P. E., Myers, K. J., Goldgof, D. B., Barboriak, D. P., Gillies, R. J., Schwartz, L. H., & Sullivan, D. C. (2015). Quantitative imaging biomarkers: A review of statistical methods for computer algorithm comparisons. Statistical methods in medical research, 24(1), 68-106. https://doi.org/10.1177/0962280214537390

Obuchowski, NA, Reeves, AP, Huang, EP, Wang, XF, Buckler, AJ, Kim, HJ, Barnhart, HX, Jackson, EF, Giger, ML, Pennello, G, Toledano, AY, Kalpathy-Cramer, J, Apanasovich, TV, Kinahan, PE, Myers, KJ, Goldgof, DB, Barboriak, DP, Gillies, RJ, Schwartz, LH & Sullivan, DC 2015, 'Quantitative imaging biomarkers: A review of statistical methods for computer algorithm comparisons', Statistical methods in medical research, vol. 24, no. 1, pp. 68-106. https://doi.org/10.1177/0962280214537390

@article{93099c18c1d64dee9f8bf6f1e2380b2d,

title = "Quantitative imaging biomarkers: A review of statistical methods for computer algorithm comparisons",

abstract = "Quantitative biomarkers from medical images are becoming important tools for clinical diagnosis, staging, monitoring, treatment planning, and development of new therapies. While there is a rich history of the development of quantitative imaging biomarker (QIB) techniques, little attention has been paid to the validation and comparison of the computer algorithms that implement the QIB measurements. In this paper we provide a framework for QIB algorithm comparisons. We first review and compare various study designs, including designs with the true value (e.g. phantoms, digital reference images, and zero-change studies), designs with a reference standard (e.g. studies testing equivalence with a reference standard), and designs without a reference standard (e.g. agreement studies and studies of algorithm precision). The statistical methods for comparing QIB algorithms are then presented for various study types using both aggregate and disaggregate approaches. We propose a series of steps for establishing the performance of a QIB algorithm, identify limitations in the current statistical literature, and suggest future directions for research.",

keywords = "agreement, bias, image metrics, imaging biomarkers, precision, quantitative imaging, repeatability, reproducibility",

author = "Obuchowski, {Nancy A.} and Reeves, {Anthony P.} and Huang, {Erich P.} and Wang, {Xiao Feng} and Buckler, {Andrew J.} and Kim, {Hyun J.} and Barnhart, {Huiman X.} and Jackson, {Edward F.} and Giger, {Maryellen L.} and Pennello, Gene and Toledano, {Alicia Y.} and Kalpathy-Cramer, Jayashree and Apanasovich, {Tatiyana V.} and Kinahan, {Paul E.} and Myers, {Kyle J.} and Goldgof, {Dmitry B.} and Barboriak, {Daniel P.} and Gillies, {Robert J.} and Schwartz, {Lawrence H.} and Sullivan, {Daniel C.}",

note = "Publisher Copyright: {\textcopyright} The Author(s) 2014. Reprints and permissions: sagepub.co.uk/journalsPermissions.nav.",

year = "2015",

month = feb,

day = "27",

doi = "10.1177/0962280214537390",

language = "English (US)",

volume = "24",

pages = "68--106",

journal = "Statistical methods in medical research",

issn = "0962-2802",

publisher = "SAGE Publications Ltd",

number = "1",

}

TY - JOUR

T1 - Quantitative imaging biomarkers

T2 - A review of statistical methods for computer algorithm comparisons

AU - Obuchowski, Nancy A.

AU - Reeves, Anthony P.

AU - Huang, Erich P.

AU - Wang, Xiao Feng

AU - Buckler, Andrew J.

AU - Kim, Hyun J.

AU - Barnhart, Huiman X.

AU - Jackson, Edward F.

AU - Giger, Maryellen L.

AU - Pennello, Gene

AU - Toledano, Alicia Y.

AU - Kalpathy-Cramer, Jayashree

AU - Apanasovich, Tatiyana V.

AU - Kinahan, Paul E.

AU - Myers, Kyle J.

AU - Goldgof, Dmitry B.

AU - Barboriak, Daniel P.

AU - Gillies, Robert J.

AU - Schwartz, Lawrence H.

AU - Sullivan, Daniel C.

N1 - Publisher Copyright: © The Author(s) 2014. Reprints and permissions: sagepub.co.uk/journalsPermissions.nav.

PY - 2015/2/27

Y1 - 2015/2/27

AB - Quantitative biomarkers from medical images are becoming important tools for clinical diagnosis, staging, monitoring, treatment planning, and development of new therapies. While there is a rich history of the development of quantitative imaging biomarker (QIB) techniques, little attention has been paid to the validation and comparison of the computer algorithms that implement the QIB measurements. In this paper we provide a framework for QIB algorithm comparisons. We first review and compare various study designs, including designs with the true value (e.g. phantoms, digital reference images, and zero-change studies), designs with a reference standard (e.g. studies testing equivalence with a reference standard), and designs without a reference standard (e.g. agreement studies and studies of algorithm precision). The statistical methods for comparing QIB algorithms are then presented for various study types using both aggregate and disaggregate approaches. We propose a series of steps for establishing the performance of a QIB algorithm, identify limitations in the current statistical literature, and suggest future directions for research.

KW - agreement

KW - bias

KW - image metrics

KW - imaging biomarkers

KW - precision

KW - quantitative imaging

KW - repeatability

KW - reproducibility

UR - http://www.scopus.com/inward/record.url?scp=84925671502&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84925671502&partnerID=8YFLogxK

U2 - 10.1177/0962280214537390

DO - 10.1177/0962280214537390

M3 - Review article

C2 - 24919829

AN - SCOPUS:84925671502

SN - 0962-2802

VL - 24

SP - 68

EP - 106

JO - Statistical methods in medical research

JF - Statistical methods in medical research

IS - 1

ER -
