Multi-institutional Assessment of Pathologist Scoring HER2 Immunohistochemistry

Charles J. Robbins; Aileen I. Fernandez; Gang Han; Serena Wong; Malini Harigopal; Mirna Podoll; Kamaljeet Singh; Amy Ly; M. Gabriela Kuba; Hannah Wen; Mary Ann Sanders; Jane Brock; Shi Wei; Oluwole Fadare; Krisztina Hanley; Julie Jorns; Olivia L. Snir; Esther Yoon; Kim Rabe; T. Rinda Soong; Emily S. Reisenbichler; David L. Rimm

doi:10.1016/j.modpat.2022.100032

Research output: Contribution to journal › Article › peer-review

18 Scopus citations

Abstract

The HercepTest was approved more than 20 years ago as the companion diagnostic test for trastuzumab in human epidermal growth factor receptor 2 (HER2) or ERBB2 gene–amplified/overexpressing breast cancers. Subsequent HER2 immunohistochemistry (IHC) assays followed, including the now most common Ventana 4B5 assay. Although this IHC assay has become the clinical standard, its reliability, reproducibility, and accuracy have largely been approved and accepted on the basis of concordance among small numbers of pathologists, without validation in a real-world setting. In this study, we evaluated the concordance and interrater reliability of scoring HER2 IHC in 170 breast cancer biopsies by 18 breast cancer–specialized pathologists from 15 institutions. We used the Observers Needed to Evaluate Subjective Tests (ONEST) method to determine the plateau of concordance and the minimum number of pathologists needed to estimate interrater agreement values for large numbers of raters, as seen in the real-world setting. We report substantial discordance within the intermediate categories (<1% agreement for 1+ and 3.6% agreement for 2+) in the 4-category HER2 IHC scoring system. The discordance within the IHC 0 cases is also substantial, with an overall percent agreement (OPA) of only 25% and poor interrater reliability metrics (0.49 Fleiss’ kappa, 0.55 intraclass correlation coefficient). This discordance can be partially reduced by using a 3-category system (28.8% vs 46.5% OPA for 4-category and 3-category scoring systems, respectively). ONEST plots suggest that the OPA for the task of determining a HER2 IHC score 0 from not 0 plateaus statistically around 59.4% at 10 raters. Conversely, for the task of scoring HER2 IHC as 3+ or not 3+, pathologists’ concordance was much higher, with an OPA that plateaus at 87.1% with 6 raters. This suggests that legacy HER2 IHC remains valuable for finding the patients in whom the ERBB2 gene is amplified but is unacceptably discordant in assigning HER2-low or HER2-negative status for the emerging HER2-low therapies.
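
The agreement metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the toy ratings, and the choice of 100 random rater orderings are assumptions made for illustration. OPA is taken here as the fraction of cases scored identically by every rater, Fleiss' kappa follows its standard definition for a complete design, and the curve function only approximates the idea behind the Observers Needed to Evaluate Subjective Tests plots (OPA as raters are added one at a time).

```python
import random
from collections import Counter


def overall_percent_agreement(ratings):
    """Fraction of cases on which every rater gave the same score.

    `ratings` is a list of cases; each case is a list of scores, one per rater.
    """
    unanimous = sum(1 for case in ratings if len(set(case)) == 1)
    return unanimous / len(ratings)


def fleiss_kappa(ratings):
    """Fleiss' kappa, assuming every rater scored every case."""
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    category_totals = Counter()
    observed = 0.0
    for case in ratings:
        counts = Counter(case)
        category_totals.update(counts)
        # Share of rater pairs that agree on this case.
        agreeing_pairs = sum(c * (c - 1) for c in counts.values())
        observed += agreeing_pairs / (n_raters * (n_raters - 1))
    observed /= n_cases
    # Agreement expected by chance from the pooled category proportions.
    total = n_cases * n_raters
    chance = sum((c / total) ** 2 for c in category_totals.values())
    if chance == 1.0:
        return float("nan")  # only one category used: kappa is undefined
    return (observed - chance) / (1.0 - chance)


def dichotomize(ratings, target):
    """Collapse scores to a binary task, e.g. '3+' versus 'not 3+'."""
    return [[score == target for score in case] for case in ratings]


def opa_by_rater_count(ratings, n_permutations=100, seed=0):
    """Mean OPA among the first k raters (k = 2..n) over random rater orderings."""
    rng = random.Random(seed)
    n_raters = len(ratings[0])
    sums = [0.0] * (n_raters - 1)
    for _ in range(n_permutations):
        order = rng.sample(range(n_raters), n_raters)
        for k in range(2, n_raters + 1):
            subset = [[case[r] for r in order[:k]] for case in ratings]
            sums[k - 2] += overall_percent_agreement(subset)
    return [s / n_permutations for s in sums]


# Hypothetical toy data: 6 cases scored by 4 raters (not from the study).
toy_ratings = [
    ["0", "0", "0", "0"],
    ["0", "1+", "0", "1+"],
    ["1+", "2+", "1+", "1+"],
    ["3+", "3+", "3+", "3+"],
    ["2+", "3+", "3+", "3+"],
    ["3+", "3+", "3+", "3+"],
]

if __name__ == "__main__":
    print("4-category OPA:", overall_percent_agreement(toy_ratings))
    print("Fleiss' kappa:", fleiss_kappa(toy_ratings))
    print("3+ vs not 3+ OPA:", overall_percent_agreement(dichotomize(toy_ratings, "3+")))
    print("OPA by rater count:", opa_by_rater_count(toy_ratings))
```

Because unanimity among a larger group implies unanimity among any subgroup, the averaged curve can only fall or stay flat as raters are added. That is why a plateau gives a rough indication of how many raters are needed before the agreement estimate stops changing.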

Original language	English (US)
Article number	100032
Journal	Modern Pathology
Volume	36
Issue number	1
DOIs	https://doi.org/10.1016/j.modpat.2022.100032
State	Published - Jan 2023
Externally published	Yes

Keywords

HER2
immunohistochemistry
pathology
predictive markers
prognostic markers

ASJC Scopus subject areas

General Medicine

Cite this

Robbins, C. J., Fernandez, A. I., Han, G., Wong, S., Harigopal, M., Podoll, M., Singh, K., Ly, A., Kuba, M. G., Wen, H., Sanders, M. A., Brock, J., Wei, S., Fadare, O., Hanley, K., Jorns, J., Snir, O. L., Yoon, E., Rabe, K., ... Rimm, D. L. (2023). Multi-institutional Assessment of Pathologist Scoring HER2 Immunohistochemistry. Modern Pathology, 36(1), Article 100032. https://doi.org/10.1016/j.modpat.2022.100032

Robbins, CJ, Fernandez, AI, Han, G, Wong, S, Harigopal, M, Podoll, M, Singh, K, Ly, A, Kuba, MG, Wen, H, Sanders, MA, Brock, J, Wei, S, Fadare, O, Hanley, K, Jorns, J, Snir, OL, Yoon, E, Rabe, K, Soong, TR, Reisenbichler, ES & Rimm, DL 2023, 'Multi-institutional Assessment of Pathologist Scoring HER2 Immunohistochemistry', Modern Pathology, vol. 36, no. 1, 100032. https://doi.org/10.1016/j.modpat.2022.100032

@article{b91a7f7e9aef4ef39566a4b6c8ac9e8e,

title = "Multi-institutional Assessment of Pathologist Scoring HER2 Immunohistochemistry",

keywords = "HER2, immunohistochemistry, pathology, predictive markers, prognostic markers",

author = "Robbins, {Charles J.} and Fernandez, {Aileen I.} and Gang Han and Serena Wong and Malini Harigopal and Mirna Podoll and Kamaljeet Singh and Amy Ly and Kuba, {M. Gabriela} and Hannah Wen and Sanders, {Mary Ann} and Jane Brock and Shi Wei and Oluwole Fadare and Krisztina Hanley and Julie Jorns and Snir, {Olivia L.} and Esther Yoon and Kim Rabe and Soong, {T. Rinda} and Reisenbichler, {Emily S.} and Rimm, {David L.}",

note = "Publisher Copyright: {\textcopyright} 2022 United States \& Canadian Academy of Pathology",

year = "2023",

month = jan,

doi = "10.1016/j.modpat.2022.100032",

language = "English (US)",

volume = "36",

journal = "Modern Pathology",

issn = "0893-3952",

publisher = "Nature Publishing Group",

number = "1",

}

TY - JOUR

T1 - Multi-institutional Assessment of Pathologist Scoring HER2 Immunohistochemistry

AU - Robbins, Charles J.

AU - Fernandez, Aileen I.

AU - Han, Gang

AU - Wong, Serena

AU - Harigopal, Malini

AU - Podoll, Mirna

AU - Singh, Kamaljeet

AU - Ly, Amy

AU - Kuba, M. Gabriela

AU - Wen, Hannah

AU - Sanders, Mary Ann

AU - Brock, Jane

AU - Wei, Shi

AU - Fadare, Oluwole

AU - Hanley, Krisztina

AU - Jorns, Julie

AU - Snir, Olivia L.

AU - Yoon, Esther

AU - Rabe, Kim

AU - Soong, T. Rinda

AU - Reisenbichler, Emily S.

AU - Rimm, David L.

PY - 2023/1

Y1 - 2023/1

KW - HER2

KW - immunohistochemistry

KW - pathology

KW - predictive markers

KW - prognostic markers

UR - http://www.scopus.com/inward/record.url?scp=85148112613&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85148112613&partnerID=8YFLogxK

U2 - 10.1016/j.modpat.2022.100032

DO - 10.1016/j.modpat.2022.100032

M3 - Article

C2 - 36788069

AN - SCOPUS:85148112613

SN - 0893-3952

VL - 36

JO - Modern Pathology

JF - Modern Pathology

IS - 1

M1 - 100032

ER -
