Pathologists' diagnosis of invasive melanoma and melanocytic proliferations: Observer accuracy and reproducibility study

Joann G. Elmore, Raymond L. Barnhill, David E. Elder, Gary M. Longton, Margaret S. Pepe, Lisa M. Reisch, Patricia (Patty) Carney, Linda J. Titus, Heidi Nelson, Tracy Onega, Anna N.A. Tosteson, Martin A. Weinstock, Stevan R. Knezevich, Michael W. Piepkorn

Research output: Contribution to journal › Article

70 Citations (Scopus)

Abstract

Objective: To quantify the accuracy and reproducibility of pathologists' diagnoses of melanocytic skin lesions.

Design: Observer accuracy and reproducibility study.

Setting: 10 US states.

Participants: Skin biopsy cases (n=240), grouped into sets of 36 or 48. Pathologists from 10 US states were randomized to independently interpret the same set on two occasions (phases 1 and 2), at least eight months apart.

Main outcome measures: Pathologists' interpretations were condensed into five classes: I (eg, nevus or mild atypia); II (eg, moderate atypia); III (eg, severe atypia or melanoma in situ); IV (eg, pathologic stage T1a (pT1a) early invasive melanoma); and V (eg, ≥pT1b invasive melanoma). Reproducibility was assessed by intraobserver and interobserver concordance rates, and accuracy by concordance with three reference diagnoses.

Results: In phase 1, 187 pathologists completed 8976 independent case interpretations resulting in an average of 10 (SD 4) different diagnostic terms applied to each case. Among pathologists interpreting the same cases in both phases, when pathologists diagnosed a case as class I or class V during phase 1, they gave the same diagnosis in phase 2 for the majority of cases (class I 76.7%; class V 82.6%). However, the intraobserver reproducibility was lower for cases interpreted as class II (35.2%), class III (59.5%), and class IV (63.2%). Average interobserver concordance rates were lower, but with similar trends. Accuracy using a consensus diagnosis of experienced pathologists as reference varied by class: I, 92% (95% confidence interval 90% to 94%); II, 25% (22% to 28%); III, 40% (37% to 44%); IV, 43% (39% to 46%); and V, 72% (69% to 75%). It is estimated that at a population level, 82.8% (81.0% to 84.5%) of melanocytic skin biopsy diagnoses would have their diagnosis verified if reviewed by a consensus reference panel of experienced pathologists, with 8.0% (6.2% to 9.9%) of cases overinterpreted by the initial pathologist and 9.2% (8.8% to 9.6%) underinterpreted.

Conclusion: Diagnoses spanning moderately dysplastic nevi to early stage invasive melanoma were neither reproducible nor accurate in this large study of pathologists in the USA. Efforts to improve clinical practice should include using a standardized classification system, acknowledging uncertainty in pathology reports, and developing tools such as molecular markers to support pathologists' visual assessments.
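
The concordance measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names and the toy data are hypothetical, classes I to V are encoded as integers 1 to 5, and the shares are unweighted, whereas the population-level estimates in the abstract would require weighting that is not described here.

```python
from collections import Counter


def concordance_by_class(pairs):
    """Fraction of repeat readings that match, grouped by the first reading.

    pairs: iterable of (first_class, second_class), classes encoded 1..5.
    For intraobserver concordance, each pair is one pathologist's phase 1
    and phase 2 class for the same case (an assumed data layout).
    """
    totals, agreed = Counter(), Counter()
    for first, second in pairs:
        totals[first] += 1
        if first == second:
            agreed[first] += 1
    return {c: agreed[c] / totals[c] for c in sorted(totals)}


def interpretation_shares(pairs):
    """Unweighted shares of agreement, over- and underinterpretation.

    pairs: iterable of (reference_class, pathologist_class). A reading in a
    higher class than the reference counts as overinterpretation, a lower
    one as underinterpretation.
    """
    counts = Counter()
    for reference, observed in pairs:
        if observed == reference:
            counts["agree"] += 1
        elif observed > reference:
            counts["over"] += 1
        else:
            counts["under"] += 1
    n = sum(counts.values())
    if n == 0:
        raise ValueError("at least one interpretation is required")
    return {key: counts[key] / n for key in ("agree", "over", "under")}


# Hypothetical toy data, not taken from the study.
intra = [(1, 1), (1, 1), (1, 2), (1, 1), (2, 1), (2, 2), (2, 3), (5, 5)]
vs_reference = [(1, 1), (1, 2), (3, 3), (3, 2), (5, 5), (4, 5), (2, 2), (2, 2)]

print(concordance_by_class(intra))
print(interpretation_shares(vs_reference))
```

The same `concordance_by_class` function could be applied to interobserver data by pairing two different pathologists' readings of one case, again under the assumed data layout.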

Original language: English (US)
Article number: j2813
Journal: BMJ (Online)
Volume: 357
DOI: https://doi.org/10.1136/bmj.j2813
State: Published - Jun 28 2017

Fingerprint

Melanoma
Skin
Pathologists
Dysplastic Nevus Syndrome
Biopsy
Nevus
Uncertainty
Outcome Assessment (Health Care)
Confidence Intervals
Pathology

ASJC Scopus subject areas

  • Medicine(all)

Cite this

Elmore, J. G., Barnhill, R. L., Elder, D. E., Longton, G. M., Pepe, M. S., Reisch, L. M., ... Piepkorn, M. W. (2017). Pathologists' diagnosis of invasive melanoma and melanocytic proliferations: Observer accuracy and reproducibility study. BMJ (Online), 357, [j2813]. https://doi.org/10.1136/bmj.j2813

@article{d72b460e46ff43e9a8c90e5f8151d9b3,
title = "Pathologists' diagnosis of invasive melanoma and melanocytic proliferations: Observer accuracy and reproducibility study",
author = "Elmore, {Joann G.} and Barnhill, {Raymond L.} and Elder, {David E.} and Longton, {Gary M.} and Pepe, {Margaret S.} and Reisch, {Lisa M.} and Carney, {Patricia (Patty)} and Titus, {Linda J.} and Nelson, Heidi and Onega, Tracy and Tosteson, {Anna N.A.} and Weinstock, {Martin A.} and Knezevich, {Stevan R.} and Piepkorn, {Michael W.}",
year = "2017",
month = "6",
day = "28",
doi = "10.1136/bmj.j2813",
language = "English (US)",
volume = "357",
journal = "BMJ (Online)",
issn = "0267-0623",
publisher = "BMJ Publishing Group",

}

TY - JOUR

T1 - Pathologists' diagnosis of invasive melanoma and melanocytic proliferations

T2 - Observer accuracy and reproducibility study

AU - Elmore, Joann G.

AU - Barnhill, Raymond L.

AU - Elder, David E.

AU - Longton, Gary M.

AU - Pepe, Margaret S.

AU - Reisch, Lisa M.

AU - Carney, Patricia (Patty)

AU - Titus, Linda J.

AU - Nelson, Heidi

AU - Onega, Tracy

AU - Tosteson, Anna N.A.

AU - Weinstock, Martin A.

AU - Knezevich, Stevan R.

AU - Piepkorn, Michael W.

PY - 2017/6/28

Y1 - 2017/6/28

UR - http://www.scopus.com/inward/record.url?scp=85021663116&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85021663116&partnerID=8YFLogxK

U2 - 10.1136/bmj.j2813

DO - 10.1136/bmj.j2813

M3 - Article

C2 - 28659278

AN - SCOPUS:85021663116

VL - 357

JO - BMJ (Online)

JF - BMJ (Online)

SN - 0267-0623

M1 - j2813

ER -