
Comparison of health information exchange data with self-report in measuring cancer screening



Efficient measurement of the receipt of cancer screening has been attempted with electronic health records (EHRs), but EHRs are commonly implemented within a single health care setting. By contrast, health information exchange (HIE) includes EHR data from multiple health care systems and settings, thereby providing a more population-based measurement approach. In this study, we set out to understand the value of statewide HIE data, compared with survey self-report (SR), for measuring population-based cancer screening.


A statewide survey was conducted among Indiana residents who had been seen at an ambulatory or inpatient clinical setting in the past year. The cancer screening tests measured were colonoscopy and the fecal immunochemical test (FIT) for colorectal cancer, human papilloma virus (HPV) and Pap tests for cervical cancer, and mammography for breast cancer. For each screening test, the self-reported receipt of screening (yes/no) and time since last screening were compared with the corresponding information in the patient's HIE record to evaluate the concordance between the two measures.


Gwet’s AC for HIE and self-report of screening receipt ranged from 0.24 to 0.73, indicating fair to substantial concordance. For the time since receipt of the last screening test, Gwet’s AC ranged from 0.21 to 0.90, indicating fair to almost perfect concordance. Compared with SR data, HIE data provided relatively more additional information about laboratory-based tests, namely the FIT (19% HIE alone vs. 4% SR alone) and HPV test (27% HIE alone vs. 12% SR alone), and less additional information about procedures: colonoscopy (8% HIE alone vs. 23% SR alone), Pap test (13% HIE alone vs. 19% SR alone), and mammography (9% HIE alone vs. 10% SR alone).


Studies that use a single data source should consider the type of cancer screening test when choosing the optimal data collection method. HIE and self-report each provided unique information in measuring cancer screening, and the most robust measurement approach involves collecting screening information from both HIE and patient self-report.



Background

Evidence-based screening tests are important for early detection of cancer, reducing the likelihood of a more advanced stage at diagnosis when cancer may be less treatable [1]. Information about the receipt of cancer screening tests is widely available from patient data stored in electronic health records (EHRs), a systematic collection of patient health information in digital format [2]. An alternative source of information about cancer screening is patient self-report, an arguably more cost-effective source [3]. EHRs may benefit clinical and epidemiologic research, patient care, and performance measurement. However, structured EHR data inevitably contain incomplete or inaccurate information. Furthermore, EHRs are typically deployed in a single health care practice, hospital, or system. Health information exchange (HIE), the electronic exchange of clinical and administrative information across a variety of health care organizations, provides a more population-based approach to the measurement of cancer screening because it contains EHR data from multiple health care settings or systems [4, 5].

To our knowledge, no prior statewide study of cancer screening has compared HIE data with patient self-report. Moreover, previous literature comparing self-report and EHR data showed varying levels of agreement depending on factors such as the type of clinical condition, the type of clinical procedure performed, and the data collection method [6,7,8,9,10]. Given the varying results of previous validation studies and the implications for population-based measurement in the use of HIE, we assessed the concordance between cancer screening self-reported by surveyed Indiana residents and the individuals’ corresponding information in the statewide Indiana Health Information Exchange (IHIE). We focused on the screening phase of the cancer care continuum for colorectal, cervical, and breast cancers. We hypothesized that the value of information gained from different data sources on cancer screening might vary according to the type of screening test.


Methods

From January 2018 through February 2018, Indiana residents who had been seen at least once in the previous year at Indiana University Health (IU Health) participated in a cross-sectional, mail-based survey known as the Hoosier Health Survey. A statewide integrated health system, IU Health operates 16 hospitals and 178 outpatient clinics in Indiana; it is the largest health system in the state, with 115,690 admissions in 2021 and a leading 31.3% market share in its primary service area (PSA) in central Indiana [11]. The purpose of this survey was to better understand the cancer control needs of the community served by the Indiana University Cancer Center [12]. The population catchment area of the IU Cancer Center, as defined for the National Cancer Institute, includes the entire state of Indiana. Following HIPAA authorization from respondents to access their electronic health records, cancer screening information obtained through the survey was matched with each participant’s longitudinal EHR data by referencing the patient’s first and last name, birthdate, and place of residence. The electronic information was obtained from the Indiana Network for Patient Care (INPC), the clinical data repository for IHIE, a community-wide HIE operating in central Indiana with the support of the Regenstrief Institute. The INPC consists of clinical observations from five major hospital systems, public health departments (both state and county), and Indiana Medicaid [13, 14]. The study was approved by the IUPUI Institutional Review Board.

Study cohort

From a list of 284,062 people seen at least once in the past 12 months in the statewide health system and living in one of 34 Indiana counties with higher cancer mortality rates, a random, stratified sample of 8,000 adults was selected. In stratifying the sample, rural geographic location and race were equally weighted. The initial goal was to sample 2,000 individuals from each of four strata (rural White, rural Black, urban White, urban Black); because of the small number of rural Black residents, the remaining 2,000 were taken from the rural White stratum, resulting in 4,000 individuals each from rural and urban areas. Twenty-one patients were excluded from the sample because their primary care providers declined to authorize their participation in the survey, resulting in 7,979 mailed surveys. Of all mailed surveys, a total of 970 adults aged 18–75 years completed the survey, a 12% response rate. Younger adults were included in the sample to collect data on cervical cancer screening behavior; the upper age limit was set at 75 years because guidelines do not routinely recommend cancer screening after age 75 [15]. Of these 970 respondents, 711 individuals (73.3%) provided HIPAA authorization, comprising our final study sample. The survey methodology has been described in more detail elsewhere [12].

Populations eligible for cancer screening

The participants in our study were assessed on their screening behavior for three types of cancer: colorectal, cervical, and breast. Screening guidelines from the U.S. Preventive Services Task Force (USPSTF) were used to determine the sample of survey respondents eligible for each screening test. For colorectal cancer, the eligible sample included men and women aged 50–75 years, recommended to receive a colonoscopy every ten years or a fecal immunochemical test (FIT)/stool test every year [16]. For cervical cancer, the eligible sample included women aged 21–29 years, recommended to receive a Pap test every three years, and women aged 30–65 years, recommended to receive a Pap test every three years or a human papilloma virus (HPV) test every five years [17]. For breast cancer, the eligible sample comprised women aged 50–75 years, recommended to receive a mammogram every two years [18].
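As an illustration, the age/sex rules above can be encoded directly. This sketch is ours, not the study's code; the function name and string labels are hypothetical, and it reflects only the guideline editions cited here:

```python
def eligible_screenings(age, sex):
    """Return the screening tests a person is eligible for under the
    USPSTF age/sex rules applied in this study (illustrative sketch;
    guideline editions change over time)."""
    tests = []
    if 50 <= age <= 75:  # colorectal: men and women aged 50-75
        tests += ["colonoscopy (every 10 years)", "FIT (every year)"]
    if sex == "F":
        if 21 <= age <= 65:  # cervical: Pap test for women aged 21-65
            tests.append("Pap test (every 3 years)")
        if 30 <= age <= 65:  # cervical: HPV test as an option from age 30
            tests.append("HPV test (every 5 years)")
        if 50 <= age <= 75:  # breast: women aged 50-75
            tests.append("mammogram (every 2 years)")
    return tests

# A 55-year-old woman is eligible for all five screening tests.
print(eligible_screenings(55, "F"))
```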

Survey-based cancer screening measures

When eligible for screening, respondents were asked whether they had received each of the relevant cancer screening tests, with binary responses of “Yes” or “No”. For colorectal cancer, patients were asked whether they had ever received a colonoscopy and whether they had one every ten years, as well as whether they had ever received a fecal immunochemical test (FIT)/stool test and whether they had one every year. For cervical cancer, patients were asked whether they had ever received a Pap test and whether they had one every three years, and whether they had ever received a human papilloma virus (HPV) test and whether they had one every five years. For breast cancer, patients were asked whether they had ever received a mammogram and whether they had one every two years.

Respondents were also asked the time since their last screening test with responses being “Within the past year (less than 12 months ago)”, “More than 1 year ago, but less than 2 years ago”, “More than 2 years ago, but less than 3 years ago”, “More than 3 years ago, but less than 5 years ago”, or “5 or more years ago” (ordinal in nature). See Appendix 1 for detailed survey questions used for this study and Appendix 2 for the entire survey. Information on receipt of the screening and time since last screening were measured to assess the degree of concordance between survey self-report and HIE data.

Statistical analysis

Descriptive statistics, adjusted for sampling weights, were computed for patient-reported sociodemographic characteristics, including age, gender, race, educational level, marital status, insurance status, income, home ownership, employment status, rurality based on RUCA codes, and self-reported health status.
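Sampling-weight adjustment amounts to letting each respondent count their weight rather than one. A minimal sketch of a weight-adjusted proportion (our own illustration; the study performed these adjustments with Stata's survey estimation):

```python
def weighted_proportion(values, weights):
    """Estimate a proportion with sampling weights: each respondent
    contributes their weight rather than a count of one."""
    total = sum(weights)
    hits = sum(w for v, w in zip(values, weights) if v)
    return hits / total

# Three screened respondents (True) with weight 2.0 each outweigh
# two unscreened respondents with weight 1.0 each: 6 / 8 = 0.75,
# versus an unweighted proportion of 3 / 5 = 0.60.
print(weighted_proportion([True, True, True, False, False],
                          [2.0, 2.0, 2.0, 1.0, 1.0]))
```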

For the questions on receipt of cancer screening and time since last screening, we first conducted bivariate analyses of participants’ survey responses and the corresponding information in the HIE data, using chi-square tests to check for significant differences between the two information sources. Second, we evaluated sensitivity, specificity (for receipt of cancer screening only), and concordance as validity measures between the two sources of screening information. For the questions on time since last screening, we considered only those participants whose HIE data and self-report both indicated receipt of screening. All analyses were adjusted for sampling weights.

To assess under-reporting, we estimated sensitivity (the proportion of patients who self-reported having a screening test among those with the test documented in their EMR). To assess over-reporting, we estimated specificity (the proportion of patients who self-reported not having a screening test among those without the test documented in their EMR). Finally, we used Gwet’s agreement coefficient (Gwet’s AC) to measure the concordance of screening information obtained from HIE data and survey self-report. Gwet’s AC is a chance-corrected measure of agreement, defined as the conditional probability that two randomly chosen observational measurements will agree, given that the agreement is not due to chance. The agreement coefficients were calculated using Gwet’s chance-corrected inter-rater agreement framework with ordinal weights, which extends existing agreement coefficients to multiple raters, multiple rating categories, any measurement level, and multiple ratings per subject.
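These definitions can be made concrete for a single binary screening test. The sketch below is our own unweighted Python illustration (the study's estimates were sampling-weight adjusted and computed in Stata), using the two-rater binary form of Gwet's AC1; the counts are rough, unweighted reconstructions from the colonoscopy percentages reported later, not the study's exact data:

```python
def validity_measures(a, b, c, d):
    """Validity measures from a 2x2 cross-tab of self-report (rows)
    vs. HIE documentation (columns):
      a = SR yes / HIE yes    b = SR yes / HIE no
      c = SR no  / HIE yes    d = SR no  / HIE no
    """
    n = a + b + c + d
    sensitivity = a / (a + c)      # SR yes among HIE-documented
    specificity = d / (b + d)      # SR no among HIE-undocumented
    p_agree = (a + d) / n          # raw percent agreement
    # Gwet's AC1 (binary, two raters): chance agreement uses the
    # average propensity of a "yes" rating across both sources.
    pi = ((a + b) / n + (a + c) / n) / 2
    p_chance = 2 * pi * (1 - pi)
    ac1 = (p_agree - p_chance) / (1 - p_chance)
    return sensitivity, specificity, ac1

# Approximate colonoscopy counts: both = 305, SR alone = 116,
# HIE alone = 40, neither = 44 (n = 505).
sens, spec, ac1 = validity_measures(a=305, b=116, c=40, d=44)
print(round(sens, 2))  # 0.88, close to the reported 88% sensitivity
```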

As an additional analysis, we compared three different agreement measures: Gwet’s Agreement Coefficient, Fleiss Kappa, and the intraclass correlation coefficient (ICC) (see Appendix 3). We chose Gwet’s AC as the final measure of concordance for this study over the alternative measures (see footnote 1) because of various statistical concerns [19,20,21,22,23,24,25,26]. Gwet’s AC was interpreted according to Landis and Koch’s guidelines [27, 28].

The analyses were performed in Stata (Stata 16.1, StataCorp LLC, College Station, TX).


Results

Weighted descriptive statistics

Of the 711 patients surveyed, participants were most often aged 50–64 years (36%), female (63%), White (86%), partnered (60%), homeowners (69%), insured (96%), employed (47%), urban (89%), and reporting very good general health (37%) (Table 1).

Table 1 Weighted summary statistics of participants’ sociodemographic variables

Weighted bivariate analysis

With regard to the receipt of screening, bivariate analysis showed statistically significant differences between the two data sources (survey self-report (SR) and EHR data from IHIE) for all screening tests (p < 0.01) (Table 2, columns 2 and 3). Participants who reported receipt of cancer screening were also asked about the time since their last screening. Bivariate analysis showed statistically significant differences between the two information sources for colonoscopy, Pap test, and mammogram (Table 3, columns 1 and 2).

Table 2 Summary validity measures of information on receipt of screening in survey self-report and IHIE (Weighted)
Table 3 Agreement of information on time since last screening in survey self-report and IHIE (Weighted)

The proportion of patients for whom both the HIE and self-report data indicated receipt of screening showed the following pattern: colonoscopy (305/505 = 60%), FIT test (15/504 = 3%), HPV test (33/161 = 20%), Pap test (103/185 = 56%), and mammogram (190/255 = 74%). Comparing the proportion of patients whose HIE data indicated screening (but self-report did not) with the proportion of patients whose self-report indicated screening (but HIE did not), the following patterns emerged: colonoscopy (8% HIE alone vs. 23% SR alone), FIT test (19% HIE alone vs. 4% SR alone), HPV test (27% HIE alone vs. 12% SR alone), Pap test (13% HIE alone vs. 19% SR alone), mammography (9% HIE alone vs. 10% SR alone) (Table 2).
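These cell proportions can be tabulated directly from paired screening indicators. The sketch below is our own illustration with hypothetical data (`source_breakdown` is not from the study); it mirrors the FIT pattern, where HIE captures tests patients do not recall:

```python
from collections import Counter

def source_breakdown(pairs):
    """Given (self_report, hie) boolean pairs for one screening test,
    return the share of patients in each agreement/discordance cell."""
    n = len(pairs)
    cells = Counter(
        ("both" if sr and hie else
         "SR alone" if sr else
         "HIE alone" if hie else "neither")
        for sr, hie in pairs
    )
    return {cell: cells[cell] / n
            for cell in ("both", "HIE alone", "SR alone", "neither")}

# Hypothetical paired responses for a laboratory-based test: HIE
# documents many tests (19%) that the patient did not report.
pairs = ([(True, True)] * 3 + [(False, True)] * 19 +
         [(True, False)] * 4 + [(False, False)] * 74)
print(source_breakdown(pairs))  # HIE alone 0.19 vs. SR alone 0.04
```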

Weighted sensitivity, specificity and concordance (Receipt of cancer screening)

For receipt of cancer screening, patients’ self-reports showed high sensitivity against the corresponding information recorded in their EMRs for colonoscopy (sensitivity = 88%, 95% CI: 0.85–0.92), Pap test (sensitivity = 81%, 95% CI: 0.73–0.88), and mammogram (sensitivity = 89%, 95% CI: 0.84–0.93), indicating less under-reporting for these tests. However, for the FIT (sensitivity = 13%, 95% CI: 0.07–0.21) and HPV tests (sensitivity = 43%, 95% CI: 0.32–0.55), patients’ self-reports showed low sensitivity and high specificity, indicating more under-reporting than over-reporting (Table 2, columns 4 and 5). With regard to the level of concordance of information on receipt of cancer screening between HIE data and survey self-report, Gwet’s AC showed the highest level of concordance for mammogram (Gwet’s AC: 0.73, 95% CI: 0.65–0.81) and the lowest for the HPV test (Gwet’s AC: 0.24, 95% CI: 0.08–0.40).

To summarize, there was high sensitivity between the two information sources for colonoscopy, Pap test, and mammogram, which are all procedures, and low sensitivity for the FIT and HPV tests, both laboratory tests. Screening receipt information from HIE data and survey self-report showed overall concordance ranging from 0.24 to 0.73, indicating fair to substantial concordance [19] according to Gwet’s AC (Table 2, column 6).

Weighted concordance (Time since last cancer screening)

For time since last screening, Gwet’s AC showed the highest level of agreement for mammogram timing (Gwet’s AC: 0.90, 95% CI: 0.86–0.95) and the lowest for FIT test timing (Gwet’s AC: 0.21, 95% CI: −0.21 to 0.64; p > 0.10), indicating fair to almost perfect concordance [19] according to Gwet’s AC (Table 3, column 3).

Discussion
In our study we focused on the cancer screening phase and evaluated the concordance between HIE data and self-reported responses of surveyed Indiana residents seen in a statewide healthcare system for receipt of screenings and the time since receipt of the last screening test. For screening receipt, results indicated the highest level of agreement for mammogram and lowest level of agreement for HPV test. For screening timing, the highest level of agreement between the two data sources was for mammogram timing and the lowest level of agreement for FIT test timing. Additionally, HIE data provided relatively more information about FIT and HPV tests, which are both laboratory-based screening tests. Self-reported data provided more information about colonoscopy, Pap test, and mammography, all of which are medical screening procedures.

In the screening phase of the cancer care continuum, one of the earliest prior studies assessed the concordance between self-report and non-electronic medical record documentation among Kaiser Permanente Medical Care Program participants. Data were collected on the reason and timing for Pap tests, mammograms, clinical breast exams, fecal occult blood tests, digital rectal examinations, and sigmoidoscopies. The researchers found that self-reported responses and non-electronic medical record documentation generally agreed more for procedures involving a test report (mammogram, Pap test, fecal occult blood test, and sigmoidoscopy) than for those documented in a physician's note (clinical breast examination and digital rectal examination) [6]. The results of our study are similar to their findings, especially for mammogram timing and receipt of the FIT test, where we found the most agreement between HIE data and self-report. A more recent study, conducted among patients in 25 New Jersey primary care practices who participated in the SCOPE program (supporting colorectal cancer outcomes through participatory engagement), found agreement for cancer screening ranging from 61% for the Pap and PSA tests to 83% for colorectal endoscopy; self-reported screening rates were higher than the rates documented in the medical records [7]. Both of these studies compared patients' self-reports against non-electronic medical records. Paper-based documentation may not provide accurate information on screening histories if information is entered incompletely; non-electronic medical records also suffer from disorganization, lack of integration with other electronic systems, lack of backups, and security issues [29, 30]. Hence, our study has the advantage of using EHRs from an HIE rather than non-electronic health records, which is more consistent with current medical practice.
Moreover, EHRs arguably provide higher quality data with more accessible, accurate, complete and up-to-date patient records with built-in privacy and security features [31, 32].

In the treatment phase of the cancer care continuum, an academic hospital cancer registry study among breast cancer survivors from 2004–2009 evaluated concordance first between electronic query and manual review as methods to extract EHR data, and second between survivors' self-reports and the extracted EHR data on post-treatment mammography. Electronic query identified more post-treatment mammograms than manual review, with high concordance between the two methods (0.90). Fewer days since mammogram were associated with better concordance between self-report and EHR data. In conclusion, Tiro et al. encouraged the use of self-report as a screening tool among cancer survivors for surveillance care delivery [10]. The advantage of our approach over prior EHR-based studies is that those studies were performed within a single health care setting, whereas EHRs in a state-based HIE, as explored in this study, draw on clinical data from patient populations aggregated across multiple health care organizations [33]. In addition to offering a more complete, accurate, and holistic view of patient records, HIEs reduce duplication of information on procedures or tests, improving the usefulness of patient health records. HIE also improves the accessibility of medical data across multiple clinical settings, thereby improving the capacity to use population data for public health purposes and other quality improvement activities across consortia of health care organizations [34,35,36]. Hence, using the Indiana Network for Patient Care (INPC), which encompasses a community-wide HIE, enables the measurement of cancer screening behavior at a population level with greater efficiency and completeness.

Limitations
Some study limitations must be considered when interpreting these results. First, our survey response rate was relatively low, at 12%, despite using established methods for survey research. Because we expected a low response rate based on current survey experience [37,38,39], enough surveys were delivered, with follow-up postcard reminders and a second copy of the survey, to yield meaningful population-based estimates; this ensured a relatively large absolute number of completed surveys among the target population. Other data collection methods, such as in-person interviews, might have improved our participation rate but would not have had the same reach as the mailed survey. Nonetheless, we received completed surveys from every surveyed county [40], and respondents and non-respondents did not differ significantly across available sociodemographic characteristics [41]. Second, we selected only those residents who had been seen within a single health system in Indiana, although IU Health is Indiana's largest integrated health system, serving approximately 1 million individuals community-wide. Nonetheless, the results of our study should be interpreted as deriving from a sample of a statewide health system with at least some access to health care, as opposed to a population-based state sample. Overall, participants' access to health care increased the likelihood of receiving cancer screening. Finally, some HIEs face greater challenges to data sharing because of state-level variation in patient consent policies for sharing health data [42]. Specifically, opt-in policies that require providers to obtain each patient's consent to share information with HIE programs increase administrative costs and make HIE more burdensome. Thus, not all researchers will have uniform access to HIE in their communities.

Conclusions
Different data sources yielded different information about the receipt of cancer screening, depending on the type of screening test. The HIE data, for example, provided relatively more information about the FIT and HPV tests, both laboratory tests, than about colonoscopy, Pap tests, or mammograms, all procedures. To choose the ideal data collection method, studies that use a single data source should consider the type of cancer screening test. Both HIE and self-report provided unique information about cancer screening, and the most robust measurement approach involves collecting both HIE and self-reported screening information. When the data sources disagree, a practical approach may be to consider most positive measures of cancer screening tests as true positives, in order to overcome the risks of false negatives posed by HIE (missing data) and self-report (recall bias). Relying on one data source over another is likely to introduce biases in prediction; for example, algorithms based on different sources of cancer screening data (HIE vs. self-report) will very likely yield different predictions about cancer mortality, and these predictions will vary across race/ethnicity groups. Moreover, the optimal data source may vary depending on the outcome of interest, whether clinical decision making, performance measurement, or population surveillance. Future research opportunities include examining the concordance between self-report and EHR data over time, concentrating on vulnerable populations. Data should also be considered from different EHR systems, such as single-vendor EHRs and patient-controlled EHRs known as personal health records, to draw comparisons with corresponding patient self-report regarding cancer screening.

Availability of data and materials

The datasets generated and/or analyzed during the current study are not publicly available because public posting of the database was not approved by the IUPUI IRB. However, the datasets are available from the corresponding author on reasonable request.

Notes
  1. Fleiss Kappa (κ), an extension of Cohen/Conger’s Kappa, ranges like a correlation coefficient from −1 to +1, with 0 representing the amount of agreement expected by chance and 1 indicating perfect agreement; it is an alternative method to measure the level of concordance between raters. We chose Gwet’s AC over Fleiss Kappa because Kappa measures suffer from several statistical issues. Kappa assumes independence between the raters and hence frequently attributes agreement to chance when that is not entirely correct; Gwet’s AC does not depend on this assumption. Additionally, Kappa values suffer from the “Kappa paradox”: they change considerably with a change in prevalence, dropping well below the percentage agreement when prevalence is very high or very low (the prevalence problem). The degree to which observers disagree also affects Kappa values (the bias problem). Gwet’s AC lessens these Kappa limitations [19, 21]; hence we consider Gwet’s AC to provide more stable inter-rater reliability coefficients in our study, following several recent studies that have also preferred Gwet’s AC over the Kappa statistic as a more stable inter-rater reliability coefficient [22,23,24]. Moreover, Fleiss Kappa is designed for situations with more than two raters, whereas our comparison involves only two sources (self-report and HIE data). The ICC has limitations as well: it depends strongly on the population variance, so ICC values tend to be higher in more heterogeneous populations than in more homogeneous ones, even at similar levels of agreement.
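Kappa's sensitivity to prevalence can be demonstrated numerically. In this hypothetical two-rater binary example (our own illustration, not study data), raw agreement is 90% in both tables, but the kappa-style chance correction collapses as prevalence becomes extreme while Gwet's AC1 stays high:

```python
def kappa_and_ac1(a, b, c, d):
    """Cohen's kappa and Gwet's AC1 for a 2x2 agreement table
    (a = both yes, b and c = discordant cells, d = both no)."""
    n = a + b + c + d
    p_agree = (a + d) / n
    p1_yes, p2_yes = (a + b) / n, (a + c) / n
    # Kappa: chance agreement from the product of marginal rates.
    pe_kappa = p1_yes * p2_yes + (1 - p1_yes) * (1 - p2_yes)
    kappa = (p_agree - pe_kappa) / (1 - pe_kappa)
    # AC1: chance agreement from the average "yes" propensity.
    pi = (p1_yes + p2_yes) / 2
    pe_ac1 = 2 * pi * (1 - pi)
    ac1 = (p_agree - pe_ac1) / (1 - pe_ac1)
    return kappa, ac1

# Balanced prevalence: kappa and AC1 agree.
print(kappa_and_ac1(45, 5, 5, 45))   # kappa = 0.80, AC1 = 0.80
# High prevalence, same 90% raw agreement: kappa drops, AC1 does not.
print(kappa_and_ac1(85, 5, 5, 5))    # kappa ~ 0.44, AC1 ~ 0.88
```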



Abbreviations

EHR: Electronic Health Record

FIT: Fecal Immunochemical Test

Gwet’s AC: Gwet’s Agreement Coefficient

HIE: Health Information Exchange

HIPAA: Health Insurance Portability and Accountability Act

HPV: Human Papilloma Virus

IHIE: Indiana Health Information Exchange

INPC: Indiana Network for Patient Care

IU: Indiana University

IUPUI: Indiana University Purdue University

SCOPE: Supporting Colorectal Cancer Outcomes Through Participatory Engagement

USPSTF: United States Preventive Services Task Force

References
  1. Centers for Disease Control and Prevention. Available at:

  2. Gunter TD, Terry NP. The emergence of national electronic health record architectures in the United States and Australia: models, costs, and questions. J Med Internet Res. 2005;7(1):e3.

    Article  PubMed  PubMed Central  Google Scholar 

  3. Weiskopf N G, Cohen A M, Hannan J, Jarmon T, Dorr D A. Towards augmenting structured EHR data: a comparison of manual chart review and patient self-report. In AMIA Annual Symposium Proceedings. Am Med Info Assoc. 2019;2019:903.

  4. Dixon BE. What is health information exchange? In Health information exchange. Academic Press. 2016; 3–20.

  5. Dixon BE, Haggstrom DA, Weiner M. Implications for informatics given expanding access to care for Veterans and other populations. J Am Med Inform Assoc. 2015;22(4):917–20.

    Article  PubMed  Google Scholar 

  6. Gordon NP, Hiatt RA, Lampert DI. Concordance of self-reported data and medical record audit for six cancer screening procedures. JNCI: J Natl Cancer Inst. 1993;85(7):566–70.

    Article  CAS  PubMed  Google Scholar 

  7. Ferrante JM, Ohman-Strickland P, Hahn KA, Hudson SV, Shaw EK, Crosson JC, Crabtree BF. Self-report versus medical records for assessing cancer-preventive services delivery. Cancer Epidemiol Prev Biomarkers. 2008;17(11):2987–94.

    Article  Google Scholar 

  8. Gupta V, Gu K, Chen Z, Lu W, Shu XO, Zheng Y. Concordance of self-reported and medical chart information on cancer diagnosis and treatment. BMC Med Res Methodol. 2011;11(1):1–7.

    Article  Google Scholar 

  9. Ho PJ, Tan CS, Shawon SR, Eriksson M, Lim LY, Miao H, Li J. Comparison of self-reported and register-based hospital medical data on comorbidities in women. Sci Rep. 2019;9(1):1–9.

    Article  Google Scholar 

  10. Tiro JA, Sanders JM, Shay LA, Murphy CC, Hamann HA, Bartholomew LK, Vernon SW. Validation of self-reported post-treatment mammography surveillance among breast cancer survivors by electronic medical record extraction method. Breast cancer Res Treat. 2015;151(2):427–34.

    Article  PubMed  PubMed Central  Google Scholar 

  11. Fitch Rates Indiana University Health's Series 2021A Bonds 'AA'; Outlook Positive. Available at:,(PSA)%20in%20central%20Indiana.

  12. Haggstrom DA, Lee JL, Dickinson SL, Kianersi S, Roberts JL, Teal E, Rawl SM. Rural and urban differences in the adoption of new health information and medical technologies. J Rural Health. 2019;35(2):144–54.

    Article  PubMed  Google Scholar 

  13. McDonald CJ, Overhage JM, Barnes M, Schadow G, Blevins L, Dexter PR, INPC Management Committee. The Indiana network for patient care: a working local health information infrastructure. Health Aff. 2005;24(5):1214–20.

    Article  Google Scholar 

  14. Vreeman D J, Stark M, Tomashefski GL, Phillips D R, Dexter P R (2008). Embracing change in a health information exchange. In AMIA Annual Symposium Proceedings. American Medical Informatics Association. 2008:768.

  15. Salzman B, Beldowski K, de La Paz A. Cancer screening in older patients. Am Fam Physician. 2016;93(8):659–67.

    PubMed  Google Scholar 

  16. U.S. Preventive Services Task Force. Available at:

  17. U.S. Preventive Services Task Force. Available at:

  18. U.S. Preventive Services Task Force. Available at:

  19. McHugh ML. Interrater reliability: the kappa statistic. Biochemia medica. 2012;22(3):276–82.

    Article  PubMed  PubMed Central  Google Scholar 

  20. Cicchetti DV, Feinstein AR. High agreement but low kappa: II Resolving the paradoxes. J Clin Epidemiol. 1990;43(6):551–8.

    Article  CAS  PubMed  Google Scholar 

  21. Dettori JR, Norvell DC. Kappa and beyond: is there agreement? Global Spine J. 2020;10(4):499–501.

    Article  PubMed Central  Google Scholar 

  22. Wongpakaran N, Wongpakaran T, Wedding D, Gwet KL. A comparison of Cohen’s Kappa and Gwet’s AC1 when calculating inter-rater reliability coefficients: a study conducted with personality disorder samples. BMC Med Res Methodol. 2013;13(1):1–7.

    Article  Google Scholar 

  23. Jimenez AM, Zepeda SJ. A Comparison of Gwet’s AC1 and kappa when calculating inter-rater reliability coefficients in a teacher evaluation context. J Educ Human Res. 2020;38(2):290–300.

    Google Scholar 

  24. Cibulka MT, Strube MJ. The Conundrum of Kappa and why some Musculoskeletal Tests Appear Unreliable despite High Agreement: A Comparison of Cohen Kappa and Gwet’s AC to Assess Observer Agreement when Using Nominal and Ordinal Data. Phys Ther. 2021;101(9):pzab150.

    Article  PubMed  Google Scholar 

  25. Fleiss JL. Measuring nominal scale agreement among many raters. Psychol Bull. 1971;76(5):378.

    Article  Google Scholar 

  26. Müller R, Büttner P. A critical discussion of intraclass correlation coefficients. Stat Med. 1994;13(23–24):2465–76.

    Article  Google Scholar 

  27. Gwet KL. Handbook of Inter-Rater Reliability, 4th Edition: The Definitive Guide to Measuring The Extent of Agreement Among Raters. Advanced Analytics, LLC; 2014.

  28. Klein D. Implementing a General Framework for Assessing Interrater Agreement in Stata. Stata J. 2018;18(4):871–901.

    Article  Google Scholar 

  29. Hedges L. The Pros and Cons of Paper Medical Records (According to Doctors Who Use Them). Retrieved from Software Advice.

  30. Advantages & Disadvantages of Paper Medical Records. Retrieved from Truenorth.

  31. Editorial Team. The Pros and Cons of Electronic Medical Records (EMRs). Retrieved from Virtru.

  32. Gallagher Healthcare. Retrieved from Gallagher Healthcare.

  33. Menachemi N, Rahurkar S, Harle CA, Vest JR. The benefits of health information exchange: an updated systematic review. J Am Med Inform Assoc. 2018;25(9):1259–65.

  34. Department of Health Care Finance.



  37. Czajka JL, Beyler A. Background paper: declining response rates in federal surveys: trends and implications. Mathematica Policy Research; 2016. p. 1–86.

  38. Boyle J, Berman L, Dayton J, Iachan R, Jans M, ZuWallack R. Physical measures and biomarker collection in health surveys: Propensity to participate. Res Social Adm Pharm. 2021;17(5):921–9.

  39. Lallukka T, Pietiläinen O, Jäppinen S, Laaksonen M, Lahti J, Rahkonen O. Factors associated with health survey response among young employees: a register-based study using online, mailed and telephone interview data collection methods. BMC Public Health. 2020;20(1):1–13.

  40. Rawl SM, Dickinson S, Lee JL, et al. Racial and Socioeconomic Disparities in Cancer-Related Knowledge, Beliefs, and Behaviors in Indiana. Cancer Epidemiol Biomarkers Prev. 2019;28(3):462–70.

  41. Lee JL, Rawl SM, Dickinson S, Teal E, Baker LB, Lyu C, Haggstrom DA. Communication about health information technology use between patients and providers. J Gen Intern Med. 2020;35(9):2614–20.

  42. Mello MM, Adler‐Milstein J, Ding KL, Savage L. Legal barriers to the growth of health information exchange—boulders or pebbles? Milbank Q. 2018;96(1):110–43.

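Several of the references above (refs 19–28) turn on the contrast between Cohen's kappa and Gwet's AC1 as chance-corrected agreement statistics, which is also the measure reported in this study. As a quick illustration only (not the authors' analysis code; per ref 28 the published results were likely computed in Stata), the following Python sketch computes both statistics for two binary yes/no ratings and reproduces the "high agreement but low kappa" paradox discussed in refs 20 and 22. The function name `agreement_stats` and the toy data are assumptions introduced here for illustration.

```python
def agreement_stats(a, b):
    """Cohen's kappa and Gwet's AC1 for two paired ratings (e.g., 0/1 yes-no).

    Both statistics share the form (p_o - p_e) / (1 - p_e); they differ only
    in how chance agreement p_e is estimated.
    """
    n = len(a)
    assert n == len(b) and n > 0
    # Observed proportion of agreement
    po = sum(x == y for x, y in zip(a, b)) / n
    cats = sorted(set(a) | set(b))
    # Marginal proportion of each category for each rater
    pa = {k: a.count(k) / n for k in cats}
    pb = {k: b.count(k) / n for k in cats}
    # Cohen: chance agreement from the product of the raters' marginals
    pe_kappa = sum(pa[k] * pb[k] for k in cats)
    # Gwet AC1: chance agreement from the mean marginals pi_k,
    # p_e = (1/(q-1)) * sum_k pi_k * (1 - pi_k)
    q = max(len(cats), 2)
    pe_ac1 = sum(((pa[k] + pb[k]) / 2) * (1 - (pa[k] + pb[k]) / 2)
                 for k in cats) / (q - 1)
    kappa = (po - pe_kappa) / (1 - pe_kappa)
    ac1 = (po - pe_ac1) / (1 - pe_ac1)
    return kappa, ac1

# Hypothetical skewed-prevalence example: raters agree on 90 of 100 cases,
# yet kappa is near zero while AC1 remains high (refs 20, 22).
hie = [1] * 90 + [0] * 5 + [1] * 5
sr  = [1] * 90 + [1] * 5 + [0] * 5
k, g = agreement_stats(hie, sr)
# k ≈ -0.05 (paradoxically low), g ≈ 0.89 (high, matching raw agreement)
```

With 90% raw agreement but 95% prevalence of "yes" for both raters, Cohen's expected agreement is inflated to 0.905, driving kappa slightly negative, while Gwet's chance estimate stays low (0.095), so AC1 tracks the observed agreement. This is the behavior that motivates the paper's use of Gwet's AC for screening tests with skewed receipt rates.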

Acknowledgements

Not applicable.


Funding

This work was aided by a National Cancer Institute supplement to the Indiana University Cancer Center grant (P30 CA082709-17S6), Indiana University Melvin and Bren Simon Cancer Center, Indianapolis, Indiana.

Author information

Authors and Affiliations



Contributions

O.B., D.H., and S.R. conceptualized and designed the study. O.B., D.H., S.R., and S.D. developed the methodology. S.R. and D.H. helped acquire the data. O.B. and D.H. analyzed and interpreted the data (e.g., statistical analysis, biostatistics, computational analysis). O.B., D.H., S.R., and S.D. wrote, reviewed, and/or revised the manuscript. O.B. and D.H. supervised the overall study. All authors read and approved the final manuscript.

Corresponding author

Correspondence to David A. Haggstrom.

Ethics declarations

Ethics approval and consent to participate

This study involved human subjects. The survey was conducted in accordance with relevant guidelines and regulations. Informed consent was obtained from all subjects and/or their legal guardian(s). The Indiana University Purdue University Indianapolis (IUPUI) Institutional Review Board (IRB) approved the study.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1:

Appendix 1. Survey questions used for this research.

Additional file 2.

Additional file 3:

Appendix 3. Comparison of different agreement measures.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Bhattacharyya, O., Rawl, S.M., Dickinson, S.L. et al. Comparison of health information exchange data with self-report in measuring cancer screening. BMC Med Res Methodol 23, 172 (2023).
