Evaluation of Clinical Text Segmentation to Facilitate Cohort Retrieval

Tracy Edinger, Dina Demner-Fushman, Aaron Cohen, Steven Bedrick, William (Bill) Hersh

Research output: Contribution to journal, Article

2 Citations (Scopus)

Abstract

Objective: Secondary use of electronic health record (EHR) data is enabled by accurate and complete retrieval of the relevant patient cohort, which requires searching both structured and unstructured data. Clinical text poses difficulties to searching, although chart notes incorporate structure that may facilitate accurate retrieval.

Methods: We developed rules identifying clinical document sections, which can be indexed in search engines that allow faceted searches, such as Lucene or Essie, an NLM search engine. We developed 22 clinical cohorts and two queries for each cohort, one utilizing section headings and the other searching the whole document. We manually evaluated a subset of retrieved documents to compare query performance.

Results: Querying by section had lower recall than whole-document queries (0.83 vs 0.95), higher precision (0.73 vs 0.54), and higher F1 (0.78 vs 0.69).

Conclusion: This evaluation suggests that searching specific sections may improve precision under certain conditions and often with loss of recall.
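The reported F1 values can be checked against the reported precision and recall. The sketch below assumes F1 is the harmonic mean of the aggregate precision and recall figures; the abstract does not say whether the authors instead averaged per-cohort F1 scores, so this is a consistency check only, not a reconstruction of their method. The function name `f1_score` is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
section_f1 = f1_score(precision=0.73, recall=0.83)
whole_doc_f1 = f1_score(precision=0.54, recall=0.95)

print(f"section queries:        F1 = {section_f1:.2f}")
print(f"whole-document queries: F1 = {whole_doc_f1:.2f}")
```

Under that assumption, both results round to the F1 values stated in the abstract (0.78 and 0.69).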

Original language: English (US)
Pages (from-to): 660-669
Number of pages: 10
Journal: AMIA ... Annual Symposium proceedings. AMIA Symposium
Volume: 2017
State: Published - Jan 1 2017

Fingerprint

Search Engine
Electronic Health Records

ASJC Scopus subject areas

  • Medicine(all)

Cite this

Evaluation of Clinical Text Segmentation to Facilitate Cohort Retrieval. / Edinger, Tracy; Demner-Fushman, Dina; Cohen, Aaron; Bedrick, Steven; Hersh, William (Bill).

In: AMIA ... Annual Symposium proceedings. AMIA Symposium, Vol. 2017, 01.01.2017, p. 660-669.

@article{7b60920f1d64486382227ce9ea0a884d,
title = "Evaluation of Clinical Text Segmentation to Facilitate Cohort Retrieval",
abstract = "Objective: Secondary use of electronic health record (EHR) data is enabled by accurate and complete retrieval of the relevant patient cohort, which requires searching both structured and unstructured data. Clinical text poses difficulties to searching, although chart notes incorporate structure that may facilitate accurate retrieval. Methods: We developed rules identifying clinical document sections, which can be indexed in search engines that allow faceted searches, such as Lucene or Essie, an NLM search engine. We developed 22 clinical cohorts and two queries for each cohort, one utilizing section headings and the other searching the whole document. We manually evaluated a subset of retrieved documents to compare query performance. Results: Querying by section had lower recall than whole-document queries (0.83 vs 0.95), higher precision (0.73 vs 0.54), and higher F1 (0.78 vs 0.69). Conclusion: This evaluation suggests that searching specific sections may improve precision under certain conditions and often with loss of recall.",
author = "Tracy Edinger and Dina Demner-Fushman and Aaron Cohen and Steven Bedrick and Hersh, {William (Bill)}",
year = "2017",
month = "1",
day = "1",
language = "English (US)",
volume = "2017",
pages = "660--669",
journal = "AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium",
issn = "1559-4076",
publisher = "American Medical Informatics Association",

}

TY - JOUR

T1 - Evaluation of Clinical Text Segmentation to Facilitate Cohort Retrieval

AU - Edinger, Tracy

AU - Demner-Fushman, Dina

AU - Cohen, Aaron

AU - Bedrick, Steven

AU - Hersh, William (Bill)

PY - 2017/1/1

Y1 - 2017/1/1

N2 - Objective: Secondary use of electronic health record (EHR) data is enabled by accurate and complete retrieval of the relevant patient cohort, which requires searching both structured and unstructured data. Clinical text poses difficulties to searching, although chart notes incorporate structure that may facilitate accurate retrieval. Methods: We developed rules identifying clinical document sections, which can be indexed in search engines that allow faceted searches, such as Lucene or Essie, an NLM search engine. We developed 22 clinical cohorts and two queries for each cohort, one utilizing section headings and the other searching the whole document. We manually evaluated a subset of retrieved documents to compare query performance. Results: Querying by section had lower recall than whole-document queries (0.83 vs 0.95), higher precision (0.73 vs 0.54), and higher F1 (0.78 vs 0.69). Conclusion: This evaluation suggests that searching specific sections may improve precision under certain conditions and often with loss of recall.

UR - http://www.scopus.com/inward/record.url?scp=85058764771&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85058764771&partnerID=8YFLogxK

M3 - Article

C2 - 29854131

AN - SCOPUS:85058764771

VL - 2017

SP - 660

EP - 669

JO - AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium

JF - AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium

SN - 1559-4076

ER -