TY - JOUR
T1 - Hidden in plain sight
T2 - Bias towards sick patients when sampling patients with sufficient electronic health record data for research
AU - Rusanov, Alexander
AU - Weiskopf, Nicole G.
AU - Wang, Shuang
AU - Weng, Chunhua
N1 - Funding Information:
This work was supported by grants R01LM009886, R01LM010815, and 5T15LM007079 from the National Library of Medicine, grant UL1 TR000040 from the National Center for Advancing Translational Sciences (NCATS), and grant 5T32GM008464 from the National Institutes of Health.
PY - 2014/6/11
Y1 - 2014/6/11
N2 - Background: To demonstrate that subject selection based on sufficient laboratory results and medication orders in electronic health records can be biased towards sick patients. Methods: Using electronic health record data from 10,000 patients who received anesthetic services at a major metropolitan tertiary care academic medical center, an affiliated hospital for women and children, and an affiliated urban primary care hospital, the correlation between patient health status, as indicated by the American Society of Anesthesiologists Physical Status Classification (ASA Class), and counts of days with laboratory results or medication orders was assessed with a Negative Binomial Regression model. Results: Higher ASA Class was associated with more points of data: compared to ASA Class 1 patients, ASA Class 4 patients had 5.05 times the number of days with laboratory results and 6.85 times the number of days with medication orders, controlling for age, sex, emergency status, admission type, primary diagnosis, and procedure. Conclusions: Imposing data sufficiency requirements for subject selection allows researchers to minimize missing data when reusing electronic health records for research, but introduces a bias towards the selection of sicker patients. We demonstrated the relationship between patient health and quantity of data, which may result in a systematic bias towards the selection of sicker patients for research studies and limit the external validity of research conducted using electronic health record data. Additionally, we discovered other variables (i.e., admission status, age, emergency classification, procedure, and diagnosis) that independently affect data sufficiency.
UR - http://www.scopus.com/inward/record.url?scp=84903315871&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84903315871&partnerID=8YFLogxK
U2 - 10.1186/1472-6947-14-51
DO - 10.1186/1472-6947-14-51
M3 - Article
C2 - 24916006
AN - SCOPUS:84903315871
SN - 1472-6947
VL - 14
JO - BMC Medical Informatics and Decision Making
JF - BMC Medical Informatics and Decision Making
IS - 1
M1 - 51
ER -