Automated diagnosis of plus disease in retinopathy of prematurity using deep convolutional neural networks

James M. Brown; J. Peter Campbell; Andrew Beers; Ken Chang; Susan Ostmo; R. V. Paul Chan; Jennifer Dy; Deniz Erdogmus; Stratis Ioannidis; Jayashree Kalpathy-Cramer; Michael F. Chiang

doi:10.1001/jamaophthalmol.2018.1934

Ophthalmology

Research output: Contribution to journal › Article › peer-review

424 Scopus citations

Abstract

IMPORTANCE Retinopathy of prematurity (ROP) is a leading cause of childhood blindness worldwide. The decision to treat is primarily based on the presence of plus disease, defined as dilation and tortuosity of retinal vessels. However, clinical diagnosis of plus disease is highly subjective and variable.

OBJECTIVE To implement and validate an algorithm based on deep learning to automatically diagnose plus disease from retinal photographs.

DESIGN, SETTING, AND PARTICIPANTS A deep convolutional neural network was trained using a data set of 5511 retinal photographs. Each image was previously assigned a reference standard diagnosis (RSD) based on consensus of image grading by 3 experts and clinical diagnosis by 1 expert (ie, normal, pre-plus disease, or plus disease). The algorithm was evaluated by 5-fold cross-validation and tested on an independent set of 100 images. Images were collected from 8 academic institutions participating in the Imaging and Informatics in ROP (i-ROP) cohort study. The deep learning algorithm was tested against 8 ROP experts, each of whom had more than 10 years of clinical experience and more than 5 peer-reviewed publications about ROP. Data were collected from July 2011 to December 2016. Data were analyzed from December 2016 to September 2017.

EXPOSURES A deep learning algorithm trained on retinal photographs.

MAIN OUTCOMES AND MEASURES Receiver operating characteristic analysis was performed to evaluate performance of the algorithm against the RSD. Quadratic-weighted κ coefficients were calculated for ternary classification (ie, normal, pre-plus disease, and plus disease) to measure agreement with the RSD and 8 independent experts.

RESULTS Of the 5511 included retinal photographs, 4535 (82.3%) were graded as normal, 805 (14.6%) as pre-plus disease, and 172 (3.1%) as plus disease, based on the RSD. Mean (SD) area under the receiver operating characteristic curve statistics were 0.94 (0.01) for the diagnosis of normal (vs pre-plus disease or plus disease) and 0.98 (0.01) for the diagnosis of plus disease (vs normal or pre-plus disease). For diagnosis of plus disease in an independent test set of 100 retinal images, the algorithm achieved a sensitivity of 93% with 94% specificity. For detection of pre-plus disease or worse, the sensitivity and specificity were 100% and 94%, respectively. On the same test set, the algorithm achieved a quadratic-weighted κ coefficient of 0.92 compared with the RSD, outperforming 6 of 8 ROP experts.

CONCLUSIONS AND RELEVANCE This fully automated algorithm diagnosed plus disease in ROP with comparable or better accuracy than human experts. This has potential applications in disease detection, monitoring, and prognosis in infants at risk of ROP.
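The agreement and accuracy measures named in the abstract can be illustrated with a minimal sketch. It assumes an ordinal encoding of the three grades (0 = normal, 1 = pre-plus disease, 2 = plus disease); the function names and the toy label lists are hypothetical and are not the study's data or the authors' implementation.

```python
from collections import Counter

# Assumed ordinal encoding (illustrative only): 0 = normal, 1 = pre-plus, 2 = plus.


def quadratic_weighted_kappa(rater_a, rater_b, n_classes=3):
    """Cohen's kappa with quadratic disagreement weights for ordinal labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))  # joint counts of (a, b) pairs
    hist_a = Counter(rater_a)                  # marginal counts for rater A
    hist_b = Counter(rater_b)                  # marginal counts for rater B
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            # Quadratic weight: 0 on the diagonal, 1 for the largest disagreement.
            w = (i - j) ** 2 / (n_classes - 1) ** 2
            weighted_observed += w * observed[(i, j)] / n
            weighted_expected += w * hist_a[i] * hist_b[j] / (n * n)
    if weighted_expected == 0.0:
        # Both raters used a single identical class throughout.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


def sensitivity_specificity(reference, predicted, threshold):
    """Binarize ordinal labels (positive when label >= threshold) and score them."""
    tp = fn = tn = fp = 0
    for ref, pred in zip(reference, predicted):
        ref_pos = ref >= threshold
        pred_pos = pred >= threshold
        if ref_pos and pred_pos:
            tp += 1
        elif ref_pos:
            fn += 1
        elif pred_pos:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Toy labels invented for illustration; they are not drawn from the study.
reference = [0, 0, 0, 0, 1, 1, 2, 2]
predicted = [0, 0, 0, 1, 1, 2, 2, 2]

kappa = quadratic_weighted_kappa(reference, predicted)
plus_sens, plus_spec = sensitivity_specificity(reference, predicted, threshold=2)
preplus_sens, preplus_spec = sensitivity_specificity(reference, predicted, threshold=1)
print(f"kappa={kappa:.3f}")
print(f"plus: sensitivity={plus_sens:.2f}, specificity={plus_spec:.2f}")
print(f"pre-plus or worse: sensitivity={preplus_sens:.2f}, specificity={preplus_spec:.2f}")
```

A threshold of 2 corresponds to "plus disease vs normal or pre-plus disease", and a threshold of 1 to "pre-plus disease or worse". The quadratic weights penalize a normal/plus confusion four times as heavily as a confusion between adjacent grades, which is why this statistic suits an ordinal three-level scale.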

Original language	English (US)
Pages (from-to)	803-810
Number of pages	8
Journal	JAMA Ophthalmology
Volume	136
Issue number	7
DOIs	https://doi.org/10.1001/jamaophthalmol.2018.1934
State	Published - Jul 2018

ASJC Scopus subject areas

Ophthalmology

Cite this

@article{d9dd77473de74cf7a8907f9d8d354d07,

title = "Automated diagnosis of plus disease in retinopathy of prematurity using deep convolutional neural networks",

author = "Brown, {James M.} and Campbell, {J. Peter} and Andrew Beers and Ken Chang and Susan Ostmo and Chan, {R. V. Paul} and Jennifer Dy and Deniz Erdogmus and Stratis Ioannidis and Jayashree Kalpathy-Cramer and Chiang, {Michael F.}",

note = "Publisher Copyright: {\textcopyright} 2018 American Medical Association.",

year = "2018",

month = jul,

doi = "10.1001/jamaophthalmol.2018.1934",

language = "English (US)",

volume = "136",

pages = "803--810",

journal = "JAMA Ophthalmology",

issn = "2168-6165",

publisher = "American Medical Association",

number = "7",

}

TY - JOUR

T1 - Automated diagnosis of plus disease in retinopathy of prematurity using deep convolutional neural networks

AU - Brown, James M.

AU - Campbell, J. Peter

AU - Beers, Andrew

AU - Chang, Ken

AU - Ostmo, Susan

AU - Chan, R. V. Paul

AU - Dy, Jennifer

AU - Erdogmus, Deniz

AU - Ioannidis, Stratis

AU - Kalpathy-Cramer, Jayashree

AU - Chiang, Michael F.

PY - 2018/7

Y1 - 2018/7

UR - http://www.scopus.com/inward/record.url?scp=85049693038&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85049693038&partnerID=8YFLogxK

U2 - 10.1001/jamaophthalmol.2018.1934

DO - 10.1001/jamaophthalmol.2018.1934

M3 - Article

C2 - 29801159

AN - SCOPUS:85049693038

SN - 2168-6165

VL - 136

SP - 803

EP - 810

JO - JAMA Ophthalmology

JF - JAMA Ophthalmology

IS - 7

ER -
