Sick patients have more data: the non-random completeness of electronic health records.

Nicole G. Weiskopf, Alex Rusanov, Chunhua Weng

Research output: Contribution to journal › Article

37 Scopus citations


As interest in the reuse of electronic health record (EHR) data for research purposes grows, so too does awareness of the significant data quality problems in these non-traditional datasets. Until now, however, little attention has been paid to whether poor data quality merely introduces noise into EHR-derived datasets or whether it can also create spurious signals and bias. In this study we use EHR data to demonstrate a statistically significant relationship between EHR completeness and patient health status. This indicates that records with more data are likely to be more representative of sick patients than of healthy ones, and therefore may not reflect the broader population found within the EHR.

Original language: English (US)
Pages (from-to): 1472-1477
Number of pages: 6
Journal: AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium
State: Published - 2013


ASJC Scopus subject areas

  • Medicine (all)
