TY - JOUR
T1 - A Comparison of Lung Nodule Segmentation Algorithms
T2 - Methods and Results from a Multi-institutional Study
AU - Kalpathy-Cramer, Jayashree
AU - Zhao, Binsheng
AU - Goldgof, Dmitry
AU - Gu, Yuhua
AU - Wang, Xingwei
AU - Yang, Hao
AU - Tan, Yongqiang
AU - Gillies, Robert
AU - Napel, Sandy
N1 - Publisher Copyright: © 2016, Society for Imaging Informatics in Medicine.
PY - 2016/8/1
Y1 - 2016/8/1
AB - Tumor volume estimation and accurate, reproducible border segmentation in medical images are important in the diagnosis, staging, and assessment of response to cancer therapy. The goal of this study was to demonstrate the feasibility of a multi-institutional effort to assess the repeatability and reproducibility of nodule borders and the volume estimate bias of computerized segmentation algorithms in CT images of lung cancer, and to provide results from such a study. The dataset used for this evaluation consisted of 52 tumors in 41 CT volumes (40 patient datasets and 1 dataset containing scans of 12 phantom nodules of known volume) from five collections available in The Cancer Imaging Archive. Three academic institutions developing lung nodule segmentation algorithms submitted results for three repeat runs for each of the nodules. We compared the performance of lung nodule segmentation algorithms by assessing several measurements of spatial overlap and volume measurement. Nodule sizes varied from 29 μl to 66 ml and demonstrated a diversity of shapes. Agreement in spatial overlap of segmentations was significantly higher for multiple runs of the same algorithm than between segmentations generated by different algorithms (p < 0.05) and was significantly higher on the phantom dataset than on the other datasets (p < 0.05). Algorithms differed significantly in the bias of the measured volumes of the phantom nodules (p < 0.05), underscoring the need for assessing performance on clinical data in addition to phantoms. Algorithms that most accurately estimated nodule volumes were not the most repeatable, emphasizing the need to evaluate both their accuracy and precision. There were considerable differences between algorithms, especially in a subset of heterogeneous nodules, underscoring the recommendation that the same software be used at all time points in longitudinal studies.
KW - Computed tomography
KW - Infrastructure
KW - Lung cancer
KW - Quantitative imaging
KW - Segmentation
UR - http://www.scopus.com/inward/record.url?scp=84957603850&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84957603850&partnerID=8YFLogxK
U2 - 10.1007/s10278-016-9859-z
DO - 10.1007/s10278-016-9859-z
M3 - Article
C2 - 26847203
AN - SCOPUS:84957603850
SN - 0897-1889
VL - 29
SP - 476
EP - 487
JO - Journal of Digital Imaging
JF - Journal of Digital Imaging
IS - 4
ER -