Barriers to retrieving patient information from electronic health record data: failure analysis from the TREC Medical Records Track.

Tracy Edinger; Aaron M. Cohen; Steven Bedrick; Kyle Ambert; William Hersh

Medical Informatics and Clinical Epidemiology

Research output: Contribution to journal › Article › peer-review

42 Scopus citations

Abstract

Secondary use of electronic health record (EHR) data relies on the ability to retrieve accurate and complete information about desired patient populations. The Text Retrieval Conference (TREC) 2011 Medical Records Track was a challenge evaluation allowing comparison of systems and algorithms to retrieve patients eligible for clinical studies from a corpus of de-identified medical records, grouped by patient visit. Participants retrieved cohorts of patients relevant to 35 different clinical topics, and visits were judged for relevance to each topic. This study identified the most common barriers to identifying specific clinic populations in the test collection. Using the runs from track participants and judged visits, we analyzed the five non-relevant visits most often retrieved and the five relevant visits most often overlooked. Categories were developed iteratively to group the reasons for incorrect retrieval for each of the 35 topics. Reasons fell into nine categories for non-relevant visits and five categories for relevant visits. Non-relevant visits were most often retrieved because they contained a non-relevant reference to the topic terms. Relevant visits were most often infrequently retrieved because they used a synonym for a topic term. This failure analysis provides insight into areas for future improvement in EHR-based retrieval with techniques such as more widespread and complete use of standardized terminology in retrieval and data entry systems.
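The visit-selection step described in the abstract (the five non-relevant visits most often retrieved and the five relevant visits most often overlooked, per topic) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name `failure_candidates`, the input shapes, and the tie-breaking by visit ID are all assumptions, and the abstract does not say how ties or unjudged visits were handled.

```python
from collections import Counter


def failure_candidates(runs, judgments, k=5):
    """Pick candidate visits for failure analysis on ONE topic.

    runs: list of sets, each holding the visit IDs one run retrieved
          (hypothetical input shape).
    judgments: dict mapping visit ID -> True if judged relevant,
               False if judged non-relevant.
    Returns (non-relevant visits retrieved by the most runs,
             relevant visits retrieved by the fewest runs), k of each.
    """
    # Count how many runs retrieved each judged visit; unjudged visits are ignored.
    counts = Counter()
    for retrieved in runs:
        counts.update(v for v in retrieved if v in judgments)

    non_relevant = [v for v, rel in judgments.items() if not rel]
    relevant = [v for v, rel in judgments.items() if rel]

    # Ties are broken by visit ID (an assumption made only for determinism).
    most_retrieved_nonrel = sorted(non_relevant, key=lambda v: (-counts[v], v))[:k]
    most_overlooked_rel = sorted(relevant, key=lambda v: (counts[v], v))[:k]
    return most_retrieved_nonrel, most_overlooked_rel


# Toy data: three runs and five judged visits for a single topic.
runs = [{"v1", "v2", "v3"}, {"v1", "v3"}, {"v1", "v4"}]
judgments = {"v1": False, "v2": True, "v3": False, "v4": True, "v5": True}
nonrel, overlooked = failure_candidates(runs, judgments, k=2)
```

The visits returned by such a step would then be reviewed manually to assign a reason for each failure, which is the categorization the abstract reports.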

Original language	English (US)
Pages (from-to)	180-188
Number of pages	9
Journal	Unknown Journal
Volume	2012
State	Published - 2012

ASJC Scopus subject areas

General Medicine

Cite this

@article{1f7317428bb346e4aece697e30c65e2c,

title = "Barriers to retrieving patient information from electronic health record data: failure analysis from the TREC Medical Records Track.",

abstract = "Secondary use of electronic health record (EHR) data relies on the ability to retrieve accurate and complete information about desired patient populations. The Text Retrieval Conference (TREC) 2011 Medical Records Track was a challenge evaluation allowing comparison of systems and algorithms to retrieve patients eligible for clinical studies from a corpus of de-identified medical records, grouped by patient visit. Participants retrieved cohorts of patients relevant to 35 different clinical topics, and visits were judged for relevance to each topic. This study identified the most common barriers to identifying specific clinic populations in the test collection. Using the runs from track participants and judged visits, we analyzed the five non-relevant visits most often retrieved and the five relevant visits most often overlooked. Categories were developed iteratively to group the reasons for incorrect retrieval for each of the 35 topics. Reasons fell into nine categories for non-relevant visits and five categories for relevant visits. Non-relevant visits were most often retrieved because they contained a non-relevant reference to the topic terms. Relevant visits were most often infrequently retrieved because they used a synonym for a topic term. This failure analysis provides insight into areas for future improvement in EHR-based retrieval with techniques such as more widespread and complete use of standardized terminology in retrieval and data entry systems.",

author = "Tracy Edinger and Cohen, {Aaron M.} and Steven Bedrick and Kyle Ambert and William Hersh",

year = "2012",

language = "English (US)",

volume = "2012",

pages = "180--188",

journal = "Unknown Journal",

issn = "0973-3698",

publisher = "Elsevier (Singapore) Pte Ltd",

}

TY - JOUR

T1 - Barriers to retrieving patient information from electronic health record data

T2 - failure analysis from the TREC Medical Records Track.

AU - Edinger, Tracy

AU - Cohen, Aaron M.

AU - Bedrick, Steven

AU - Ambert, Kyle

AU - Hersh, William

PY - 2012

Y1 - 2012

N2 - Secondary use of electronic health record (EHR) data relies on the ability to retrieve accurate and complete information about desired patient populations. The Text Retrieval Conference (TREC) 2011 Medical Records Track was a challenge evaluation allowing comparison of systems and algorithms to retrieve patients eligible for clinical studies from a corpus of de-identified medical records, grouped by patient visit. Participants retrieved cohorts of patients relevant to 35 different clinical topics, and visits were judged for relevance to each topic. This study identified the most common barriers to identifying specific clinic populations in the test collection. Using the runs from track participants and judged visits, we analyzed the five non-relevant visits most often retrieved and the five relevant visits most often overlooked. Categories were developed iteratively to group the reasons for incorrect retrieval for each of the 35 topics. Reasons fell into nine categories for non-relevant visits and five categories for relevant visits. Non-relevant visits were most often retrieved because they contained a non-relevant reference to the topic terms. Relevant visits were most often infrequently retrieved because they used a synonym for a topic term. This failure analysis provides insight into areas for future improvement in EHR-based retrieval with techniques such as more widespread and complete use of standardized terminology in retrieval and data entry systems.

AB - Secondary use of electronic health record (EHR) data relies on the ability to retrieve accurate and complete information about desired patient populations. The Text Retrieval Conference (TREC) 2011 Medical Records Track was a challenge evaluation allowing comparison of systems and algorithms to retrieve patients eligible for clinical studies from a corpus of de-identified medical records, grouped by patient visit. Participants retrieved cohorts of patients relevant to 35 different clinical topics, and visits were judged for relevance to each topic. This study identified the most common barriers to identifying specific clinic populations in the test collection. Using the runs from track participants and judged visits, we analyzed the five non-relevant visits most often retrieved and the five relevant visits most often overlooked. Categories were developed iteratively to group the reasons for incorrect retrieval for each of the 35 topics. Reasons fell into nine categories for non-relevant visits and five categories for relevant visits. Non-relevant visits were most often retrieved because they contained a non-relevant reference to the topic terms. Relevant visits were most often infrequently retrieved because they used a synonym for a topic term. This failure analysis provides insight into areas for future improvement in EHR-based retrieval with techniques such as more widespread and complete use of standardized terminology in retrieval and data entry systems.

UR - http://www.scopus.com/inward/record.url?scp=84880805384&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84880805384&partnerID=8YFLogxK

M3 - Article

C2 - 23304287

AN - SCOPUS:84880805384

SN - 0973-3698

VL - 2012

SP - 180

EP - 188

JO - Unknown Journal

JF - Unknown Journal

ER -
