Statistical issues in the comparison of quantitative imaging biomarker algorithms using pulmonary nodule volume as an example

Nancy A. Obuchowski, Huiman X. Barnhart, Andrew J. Buckler, Gene Pennello, Xiao Feng Wang, Jayashree Kalpathy-Cramer, Hyun J. Kim, Anthony P. Reeves

Research output: Contribution to journal › Article

21 Citations (Scopus)

Abstract

Quantitative imaging biomarkers are being used increasingly in medicine to diagnose and monitor patients' disease. The computer algorithms that measure quantitative imaging biomarkers have different technical performance characteristics. In this paper we illustrate the appropriate statistical methods for assessing and comparing the bias, precision, and agreement of computer algorithms. We use data from three studies of pulmonary nodules. The first study is a small phantom study used to illustrate metrics for assessing repeatability. The second study is a large phantom study allowing assessment of four algorithms' bias and reproducibility for measuring tumor volume and the change in tumor volume. The third study is a small clinical study of patients whose tumors were measured on two occasions. This study allows a direct assessment of six algorithms' performance for measuring tumor change. With these three examples we compare and contrast study designs and performance metrics, and we illustrate the advantages and limitations of various common statistical methods for quantitative imaging biomarker studies.
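The abstract names bias, repeatability, and agreement as the quantities being compared. The following is a minimal sketch of how such metrics are commonly computed; it is not taken from the paper. The function names, the pooled within-subject standard deviation estimator, and the toy nodule volumes (hypothetical values in mm^3) are all assumptions made for illustration. The repeatability coefficient uses the conventional form RC = 1.96 * sqrt(2) * wSD, and the limits of agreement use the usual Bland-Altman form, mean difference ± 1.96 * SD of the differences.

```python
import math
from statistics import mean, stdev


def bias(measured, truth):
    """Mean signed error of measurements against known true values (e.g. a phantom)."""
    return mean(m - t for m, t in zip(measured, truth))


def within_subject_sd(replicates):
    """Pooled within-subject SD from repeated measurements of each nodule."""
    ss = 0.0  # squared deviations from each nodule's own mean
    df = 0    # degrees of freedom pooled across nodules
    for reps in replicates:
        m = mean(reps)
        ss += sum((x - m) ** 2 for x in reps)
        df += len(reps) - 1
    return math.sqrt(ss / df)


def repeatability_coefficient(replicates):
    """RC = 1.96 * sqrt(2) * wSD; two repeat measurements are expected to
    differ by less than RC about 95% of the time."""
    return 1.96 * math.sqrt(2) * within_subject_sd(replicates)


def limits_of_agreement(a, b):
    """Bland-Altman 95% limits of agreement between two algorithms."""
    diffs = [x - y for x, y in zip(a, b)]
    centre = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return centre - half_width, centre + half_width


# Hypothetical toy volumes in mm^3, invented purely for illustration.
truth = [100, 200, 300, 400]
algo_a = [102, 205, 298, 407]
algo_b = [100, 201, 300, 401]
repeats = [[100, 104], [200, 196], [300, 302]]

print("bias of A:", bias(algo_a, truth))
print("wSD:", within_subject_sd(repeats))
print("RC:", repeatability_coefficient(repeats))
print("LoA (A vs B):", limits_of_agreement(algo_a, algo_b))
```

Bias needs a known truth, which is why it is natural to estimate from phantom data, whereas the repeatability coefficient and limits of agreement need only repeated or paired measurements.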

Original language: English (US)
Pages (from-to): 107-140
Number of pages: 34
Journal: Statistical Methods in Medical Research
Volume: 24
Issue number: 1
DOI: https://doi.org/10.1177/0962280214537392
State: Published - Feb 27 2015
Externally published: Yes

Keywords

  • agreement
  • bias
  • coverage probability
  • intraclass correlation coefficient
  • limits of agreement
  • repeatability
  • reproducibility

ASJC Scopus subject areas

  • Epidemiology
  • Health Information Management
  • Statistics and Probability

Cite this

Obuchowski, N. A., Barnhart, H. X., Buckler, A. J., Pennello, G., Wang, X. F., Kalpathy-Cramer, J., ... Reeves, A. P. (2015). Statistical issues in the comparison of quantitative imaging biomarker algorithms using pulmonary nodule volume as an example. Statistical Methods in Medical Research, 24(1), 107-140. https://doi.org/10.1177/0962280214537392

@article{98d5843b53e040fd842d9da673c2b049,
title = "Statistical issues in the comparison of quantitative imaging biomarker algorithms using pulmonary nodule volume as an example",
keywords = "agreement, bias, coverage probability, intraclass correlation coefficient, limits of agreement, repeatability, reproducibility",
author = "Obuchowski, {Nancy A.} and Barnhart, {Huiman X.} and Buckler, {Andrew J.} and Gene Pennello and Wang, {Xiao Feng} and Jayashree Kalpathy-Cramer and Kim, {Hyun J.} and Reeves, {Anthony P.}",
year = "2015",
month = "2",
day = "27",
doi = "10.1177/0962280214537392",
language = "English (US)",
volume = "24",
pages = "107--140",
journal = "Statistical Methods in Medical Research",
issn = "0962-2802",
publisher = "SAGE Publications Ltd",
number = "1",

}
