Multi-institutional Assessment of Pathologist Scoring HER2 Immunohistochemistry

Charles J. Robbins, Aileen I. Fernandez, Gang Han, Serena Wong, Malini Harigopal, Mirna Podoll, Kamaljeet Singh, Amy Ly, M. Gabriela Kuba, Hannah Wen, Mary Ann Sanders, Jane Brock, Shi Wei, Oluwole Fadare, Krisztina Hanley, Julie Jorns, Olivia L. Snir, Esther Yoon, Kim Rabe, T. Rinda Soong, Emily S. Reisenbichler, David L. Rimm

Research output: Contribution to journal › Article › peer-review


The HercepTest was approved more than 20 years ago as the companion diagnostic test for trastuzumab in human epidermal growth factor receptor 2 (HER2) or ERBB2 gene-amplified/overexpressing breast cancers. Subsequent HER2 immunohistochemistry (IHC) assays followed, including the now most common Ventana 4B5 assay. Although this IHC assay has become the clinical standard, its reliability, reproducibility, and accuracy have largely been approved and accepted on the basis of concordance among small numbers of pathologists without validation in a real-world setting. In this study, we evaluated the concordance and interrater reliability of scoring HER2 IHC in 170 breast cancer biopsies by 18 breast cancer-specialized pathologists from 15 institutions. We used the Observers Needed to Evaluate Subjective Tests method to determine the plateau of concordance and the minimum number of pathologists needed to estimate interrater agreement values for large numbers of raters, as seen in the real-world setting. We report substantial discordance within the intermediate categories (<1% agreement for 1+ and 3.6% agreement for 2+) in the 4-category HER2 IHC scoring system. The discordance within the IHC 0 cases is also substantial, with an overall percent agreement (OPA) of only 25% and poor interrater reliability metrics (0.49 Fleiss' kappa, 0.55 intraclass correlation coefficient). This discordance can be partially reduced by using a 3-category system (28.8% vs 46.5% OPA for 4-category and 3-category scoring systems, respectively). Observers Needed to Evaluate Subjective Tests plots suggest that the OPA for the task of determining a HER2 IHC score 0 from not 0 plateaus statistically around 59.4% at 10 raters. Conversely, for the task of scoring HER2 IHC as 3+ or not 3+, pathologists' concordance was much higher, with an OPA that plateaus at 87.1% with 6 raters.
This suggests that legacy HER2 IHC remains valuable for finding the patients in whom the ERBB2 gene is amplified but unacceptably discordant in assigning HER2-low or HER2-negative status for the emerging HER2-low therapies.
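
The agreement metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the toy ratings, the function names, and the permutation-based curve are assumptions made for illustration, following the general idea of plotting overall percent agreement against the number of raters taken in random order.

```python
import random
from collections import Counter

def overall_percent_agreement(ratings):
    """Fraction of cases on which every rater assigned the same score."""
    return sum(len(set(case)) == 1 for case in ratings) / len(ratings)

def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a cases-by-raters table with equal raters per case."""
    n_cases, n_raters = len(ratings), len(ratings[0])
    totals = Counter()
    p_bar = 0.0
    for case in ratings:
        counts = Counter(case)
        totals.update(counts)
        # Proportion of agreeing rater pairs for this case.
        pairs = sum(c * c for c in counts.values()) - n_raters
        p_bar += pairs / (n_raters * (n_raters - 1))
    p_bar /= n_cases
    # Agreement expected by chance from the pooled category frequencies.
    p_e = sum((totals[c] / (n_cases * n_raters)) ** 2 for c in categories)
    return (p_bar - p_e) / (1 - p_e)

def dichotomize(ratings, target):
    """Collapse scores to 'target' vs 'other' (e.g. 0 vs not 0)."""
    return [[s if s == target else "other" for s in case] for case in ratings]

def agreement_curves(ratings, n_permutations=100, seed=0):
    """OPA among the first k raters (k = 2..R) for random rater orderings."""
    rng = random.Random(seed)
    n_raters = len(ratings[0])
    curves = []
    for _ in range(n_permutations):
        order = list(range(n_raters))
        rng.shuffle(order)
        curves.append([
            overall_percent_agreement([[case[i] for i in order[:k]] for case in ratings])
            for k in range(2, n_raters + 1)
        ])
    return curves

# Hypothetical toy data: 4 cases scored by 4 raters (not study data).
toy = [
    ["0", "0", "0", "0"],
    ["0", "1+", "0", "1+"],
    ["3+", "3+", "3+", "3+"],
    ["1+", "2+", "2+", "1+"],
]
opa_4cat = overall_percent_agreement(toy)                        # 0.5
kappa_4cat = fleiss_kappa(toy, ["0", "1+", "2+", "3+"])          # 37/69, about 0.536
opa_0_vs_not0 = overall_percent_agreement(dichotomize(toy, "0"))     # 0.75
opa_3_vs_not3 = overall_percent_agreement(dichotomize(toy, "3+"))    # 1.0
curves = agreement_curves(toy, n_permutations=20)
```

In this toy table the disagreements sit in the 0, 1+, and 2+ scores, so collapsing to 3+ versus not 3+ gives full agreement while 0 versus not 0 does not, which mirrors the direction of the reported result. Each curve can only stay flat or fall as raters are added, and the point where it levels off is what the abstract refers to as the plateau.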

Original language: English (US)
Pages (from-to): 100032
Number of pages: 1
Journal: Modern Pathology: an official journal of the United States and Canadian Academy of Pathology, Inc
Issue number: 1
State: Published - Jan 1 2023
Externally published: Yes


Keywords

  • HER2
  • immunohistochemistry
  • pathology
  • predictive markers
  • prognostic markers

ASJC Scopus subject areas

  • Pathology and Forensic Medicine

