A narrative review on the validity of electronic health record-based research in epidemiology

Table 1 A summary of the challenges faced by epidemiologists when conducting electronic health record-based research, their manifestations in terms of threats to validity, and potential solutions

Challenge	Sub-challenge	Example	Threat(s) to validity	Potential solution(s)
#1 Representativeness	--	Catchment of a federally qualified health center versus academic medical center	Selection bias and generalizability	Comparison to external data; Inverse probability weighting for selection bias
#2 Data availability and interpretation	2.1 Billing versus Clinical versus Epidemiological Needs	Presence or absence of diagnostic codes	Information bias and confounding	Validation study; quantitative bias analysis
	2.2 Consistency in Data and Interpretation	Variations in reported laboratory results	Information bias and confounding	Validation study; quantitative bias analysis
	2.3 Unstructured Data: Clinical Notes and Reports	Operationalizing phenotypes from the encounter note	Information bias and confounding	Natural language processing
#3 Missing measurements	--	Socioeconomic status not captured	Information or selection bias; confounding	Imputation; surrogate measures; validation study
#4 Missing visits	--	Lack of longitudinal view of patient	Information or selection bias	Imputation; surrogate measures; validation study
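
As an illustration of the inverse probability weighting listed as a potential solution for challenge #1, the following is a minimal sketch, not a method taken from the review. It assumes that the probability of a person appearing in the EHR sample is known for each stratum (for example, estimated by comparison to external data); the function name `ipw_prevalence`, the strata labels, and all numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def ipw_prevalence(records, selection_prob):
    """Estimate outcome prevalence with inverse probability of selection weights.

    records: iterable of (stratum, outcome) pairs, outcome coded 0 or 1.
    selection_prob: dict mapping stratum -> assumed probability that a member
        of the target population in that stratum appears in the EHR sample.
    """
    weighted_cases = 0.0
    total_weight = 0.0
    for stratum, outcome in records:
        # Each sampled patient stands in for 1 / P(selected) people in the
        # target population.
        weight = 1.0 / selection_prob[stratum]
        weighted_cases += weight * outcome
        total_weight += weight
    return weighted_cases / total_weight


# Hypothetical data: stratum "A" is over-represented in the EHR sample.
# Stratum A: 500 sampled patients, 50 with the outcome, P(selected) = 0.5.
# Stratum B: 100 sampled patients, 30 with the outcome, P(selected) = 0.1.
records = (
    [("A", 1)] * 50 + [("A", 0)] * 450
    + [("B", 1)] * 30 + [("B", 0)] * 70
)
selection_prob = {"A": 0.5, "B": 0.1}  # assumed known, e.g. from external data

naive = sum(outcome for _, outcome in records) / len(records)
weighted = ipw_prevalence(records, selection_prob)
print(f"naive prevalence:    {naive:.3f}")
print(f"weighted prevalence: {weighted:.3f}")
```

In this hypothetical sample the unweighted prevalence (80/600) is pulled toward the over-represented stratum, while the weighted estimate (400/2000) reflects the assumed target population. In practice the selection probabilities are rarely known and would themselves need to be modeled, so the weights carry their own uncertainty.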

ISSN: 1471-2288