Bias of health estimates obtained from chronic disease and risk factor surveillance systems using telephone population surveys in Australia: results from a representative face-to-face survey in Australia from 2010 to 2013

Background: Emerging communication technologies have had an impact on population-based telephone surveys worldwide. Our objective was to examine the potential biases of health estimates in South Australia, a state of Australia, obtained via current landline telephone survey methodologies and to report on the impact of mobile-only households on household surveys.
Methods: Data from an annual multi-stage, systematic, clustered area, face-to-face population survey, the Health Omnibus Survey (approximately 3000 interviews annually), included questions about telephone ownership to assess the population that was non-contactable by current telephone sampling methods (2006 to 2013). Univariable analyses (2010 to 2013) and trend analyses were conducted for sociodemographic and health indicator variables in relation to telephone status. The relative coverage bias (RCB) of two hypothetical telephone samples was assessed by examining the prevalence estimates of health status and health risk behaviours (2010 to 2013): directory-listed numbers, consisting mainly of landline telephone numbers and a small proportion of mobile telephone numbers; and a random digit dialling (RDD) sample of landline telephone numbers, which excludes mobile-only households.
Results: Telephone (landline and mobile) coverage in South Australia is very high (97 %). Mobile telephone ownership increased slightly (7.4 %), rising from 89.7 % in 2006 to 96.3 % in 2013; mobile-only households increased by 431 % over the eight-year period, from 5.2 % in 2006 to 27.6 % in 2013. Only half of the households have either a mobile or landline number listed in the telephone directory. There were small differences in the prevalence estimates for current asthma, arthritis, diabetes and obesity between the hypothetical telephone samples and the overall sample. However, the prevalence estimate for diabetes was slightly underestimated (RCB value of −0.077) in 2013. Mixed RCB results were found for having a mental health condition for both telephone samples. Current smoking prevalence was lower for both hypothetical telephone samples in absolute differences and RCB values: −0.136 to −0.191 for RDD landline samples and −0.129 to −0.313 for directory-listed samples.
Conclusion: These findings suggest landline-based sampling frames used in Australia, when appropriately weighted, produce reliable representative estimates for some health indicators but not for all. Researchers need to be aware of their limitations and potentially biased estimates.
Electronic supplementary material: The online version of this article (doi:10.1186/s12874-016-0145-z) contains supplementary material, which is available to authorized users.

Keywords: Bias, Telephone sampling methodology, Sampling frame, Public health surveillance, Health surveys, Chronic conditions, Risk factors

Background
Many established population-based, continuous chronic disease and behavioural risk factor surveillance systems worldwide utilise Computer Assisted Telephone Interviewing (CATI) [1-9]. Since the 1990s, CATI surveys have been seen as an ideal tool since they are effective, relatively inexpensive, flexible and timely [6,8-12]. However, over the past 15 years vast changes have occurred in the telecommunication industry (mobile telephone and internet) and society's acceptance of, and engagement with, these new technologies [13,14]. The new communication technologies have had an impact on population-based telephone surveys, specifically, the diminishing coverage of traditional sampling frames and declining response rates [11,15] resulting in increased costs [16,17] and potential bias in survey estimates [18,19].
In the early 1990s, 95-97 % of Australian households had a landline telephone connected [20] and response rates of around 70-80 % were the norm [20-24]. For population health surveys in Australia, two sampling methodologies were used: directory-listed telephone numbers, referred to as Electronic White Pages (EWP), and random digit dialling (RDD) of landline telephone numbers [3,20,22]; both methods can target geographical areas (state, suburbs or postcodes), which has contributed to the utility and efficiency of telephone surveys [25,26]. EWP consists mainly of listed landline telephone numbers with name and address details for a household or business, so the sampling frame can be easily stratified by state, suburb or postcode. EWP includes mobile and Voice over Internet Protocol (VoIP) telephone numbers but only as a small proportion of the total sample. One drawback of EWP is that it does not include unlisted (silent) telephone numbers; that is, households which have opted, at a cost, to exclude their landline telephone number from the EWP. RDD methods have been developed to include silent landline telephone numbers based on the prefixes of the landline telephone numbers. Some of these methods use the EWP, known as list-assisted RDD (LA-RDD), to make the sampling frame more efficient by removing blocks of numbers that have a high chance of not being connected or are assigned to large businesses [3,27]. These RDD methods do not include mobile or VoIP telephone numbers. Since the turn of this century, there has been a trend of households moving away from traditional landline telephones with the emergence of mobile-only households [11,13,15,28]. This is due to the increasing portability, flexibility, affordability and broadening internet capability of mobile telephones, including smartphones, and of other telecommunications such as VoIP [11,15,26,29-32].
As a result of the increasing use of mobile telephones, conducting telephone surveys has become increasingly problematic in Australia and other countries [15,33]. This is because of the difficulty in obtaining a representative sampling frame of mobile telephone numbers, since they are rarely listed (7.3 % of mobile telephone owners in South Australia are listed [26]). Unlike landline telephone numbers, Australian mobile numbers do not carry any indication of geographical location, so the common methods used to generate a geographically targeted RDD sample of landline telephone numbers are not applicable to mobile telephone numbers [34,35]. In 2011-12, approximately 20 % of households in Australia were mobile-only [14,29]; 34 % of USA households were mobile-only in 2012 [30], and countries in Europe reported 50-70 % [32]. More notably, studies have found that mobile-only households are demographically different to traditional landline households: they are generally younger people, unrelated, never married, and socioeconomically disadvantaged [26,30]. These issues suggest that excluding mobile-only households may produce biased estimates from chronic disease and behavioural risk factor surveillance systems.
This study presents the most up-to-date estimates available on the current status and possible sample biases of the current telephone survey methodology in South Australia, a state of Australia. Data from an annual representative face-to-face (non-telephone) population survey that included questions about telephone ownership were used to assess the population that was non-contactable by current telephone sampling methods. This included both household landline and mobile telephone ownership and listings in the telephone directory. This study will 1) explore trends of landline and mobile telephone ownership between 2006 and 2013; 2) describe the socio-demographic characteristics of respondents living in mobile-only households between 2010 and 2013; and 3) investigate the coverage bias of the two telephone samples (directory-listed numbers (EWP), consisting mainly of landline telephone numbers and a small proportion of mobile and VoIP telephone numbers; and an RDD sample of landline telephone numbers which excludes mobile-only households) by examining the prevalence estimates of health status and health risk behaviours between 2010 and 2013. This is one of the few studies to assess the potential bias of health estimates due to coverage bias from telephone sampling frames in terms of health indicators and socio-demographics, using a unique data source with telecommunication information on people who would be excluded from the hypothetical telephone samples [26,30]. This study uses relatively current data, which is important since telecommunications technologies have rapidly changed and evolved over the last 10 years, with increased uptake and saturation of mobile telephones and associated changes in the way people communicate [36]. Methodological studies therefore need to continually assess sample coverage and potential bias in health-related estimates [26].

Survey design and sample selection
The Health Omnibus Survey (HOS) [37,38] is a multistage, systematic, clustered area sample of South Australian households where face-to-face interviews are conducted annually. The HOS sample includes households randomly selected from Australian Bureau of Statistics (ABS) collector districts (CDs) (2006 to 2012) and Statistical Areas Level 1 (SA1) (2013), from the metropolitan Adelaide area and country towns with a population of 1,000 people or more. Within each CD or SA1, a random starting point was selected and from this point 10 households were selected in a given direction with a fixed skip interval. Hotels, motels, hospitals, hostels and other institutions were excluded from the sample. An approach letter and a brochure introducing the survey were sent to the selected household and the person aged 15 years or over with the most recent birthday was chosen for interview. The interviews were conducted in people's homes by trained interviewers. Up to six call-back visits were made to chosen households to interview the selected person. There was no replacement for non-respondents and no incentive of any kind was offered. Approximately 3000 people participate annually, achieving a median response rate of 59.3 % (range: 52 to 60 %). To provide population estimates, the data are weighted by five-year age groups, sex, and area (metropolitan Adelaide and rural/remote South Australia) to the most recent Census or Estimated Resident Population for South Australia, and by the probability of selection within the household (based on household size).
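The within-cluster selection step described above (random start, then 10 households at a fixed skip interval) can be illustrated with a minimal Python sketch. The function name, the cluster size, the skip interval of 4 and the wrap-around at the end of the cluster are all illustrative assumptions; the source does not state how the field protocol handled these details.

```python
import random


def systematic_selection(n_dwellings, start, skip, n_select=10):
    """Pick `n_select` dwelling indices from `start` at a fixed `skip` interval.

    Assumption (not from the source): dwellings are indexed 0..n_dwellings-1
    along the walking route and the route wraps around at the end.
    """
    if n_select * skip > n_dwellings:
        raise ValueError("cluster too small for this skip interval")
    return [(start + i * skip) % n_dwellings for i in range(n_select)]


rng = random.Random(0)            # fixed seed so the sketch is reproducible
cluster_size = 200                # hypothetical number of dwellings in one CD/SA1
start = rng.randrange(cluster_size)
selected = systematic_selection(cluster_size, start, skip=4)
```

Because the skip interval is fixed, only the starting point is random; every dwelling in the cluster still has the same chance of selection as long as the start is drawn uniformly.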

Household telecommunications ownership
Questions regarding telecommunications services in the household, specifically landline telephone and mobile connections, were included in the 2006 to 2013 HOS. A household was defined as mobile-only if the respondent had a mobile telephone and there was no working landline connection to the household. Landline connections did not include the use of a VoIP connection or Skype for telephone calls. In addition, questions were asked about whether landline and mobile telephone numbers were currently listed in the Australian White Pages. From these questions, household landline and mobile telecommunication status was determined by classifying the respondents as living in mobile-only households; landline-only households; landline and mobile telephone households; or households with no landline or mobile.
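The four-way classification described above can be written as a small Python function. This is only a sketch: the boolean inputs and the function name are hypothetical, and the actual HOS questionnaire items and coding are not reproduced in the source.

```python
def telephone_status(has_landline: bool, has_mobile: bool) -> str:
    """Classify a respondent's household by telecommunication status.

    `has_landline` means a working landline connection (VoIP or Skype do
    not count, following the definition in the text).
    """
    if has_landline and has_mobile:
        return "landline and mobile"
    if has_landline:
        return "landline-only"
    if has_mobile:
        return "mobile-only"
    return "no landline or mobile"


# Hypothetical respondents: (has_landline, has_mobile)
respondents = [(True, True), (False, True), (True, False), (False, False)]
statuses = [telephone_status(landline, mobile) for landline, mobile in respondents]
```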

Socio-demographics
Demographic variables included age, sex, area of residence, country of birth, household size, household structure, educational attainment, marital status, gross annual household income, employment status, dwelling ownership or renting status (2013 only) and area-level socioeconomic status. The Socio-Economic Indexes for Areas (SEIFA) Index of Relative Socio-Economic Disadvantage (IRSD) is a composite score of relative disadvantage developed by the ABS [39] for particular geographical areas, such as postcodes. It is based on selected 2011 Census socio-demographic variables. The SEIFA IRSD scores were grouped into quintiles for analysis where the highest quintile comprised postcodes with the highest SEIFA IRSD scores (most advantaged areas).

Comorbid conditions and health behaviours
Chronic conditions included self-reported medically confirmed diabetes (2010, 2011 and 2013 only), current asthma (2010 and 2011 only), arthritis and a current mental health condition. Self-reported health risk factor data included smoking status and obesity as determined by body mass index (BMI) which was derived from self-reported weight and height and recoded into four categories (underweight, normal weight, overweight and obese) [40].
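As an illustration of the BMI recoding, the sketch below derives BMI from self-reported weight and height and assigns one of the four categories. The cut-points used (18.5, 25 and 30 kg/m²) are the standard World Health Organization adult thresholds; that these are the exact cut-points of reference [40] is an assumption, and the function name is hypothetical.

```python
def bmi_category(weight_kg: float, height_m: float) -> str:
    """Return the BMI category from weight (kg) and height (m).

    Cut-points are the standard WHO adult thresholds (assumed here).
    """
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    bmi = weight_kg / height_m ** 2
    if bmi < 18.5:
        return "underweight"
    if bmi < 25:
        return "normal weight"
    if bmi < 30:
        return "overweight"
    return "obese"
```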

Statistical analyses
Data analysis was conducted using Stata Version 12.0. All estimates and analyses were conducted using svy commands in Stata to incorporate the sampling design. Univariable analyses using chi-square tests compared the proportion of mobile-only households across sociodemographic variables for 2010, 2011, 2012 and 2013. Households that had no telecommunications, that refused, or where the status could not be determined were excluded from the analyses (n = 39). The univariable analyses were limited to data from 2010 onwards, since data for earlier years have been published previously [26]. Additional univariable analyses using chi-square tests were undertaken to describe the proportion of households with a landline telephone connected; the proportion of households with mobile telephones; and the proportion of households with a directory-listed telephone number (EWP). These results can be found in Additional file 1.
To explore the possibility of coverage bias of telephone surveys, two hypothetical telephone sampling frames (subsamples) were created from HOS: 1) RDD landline, that is, households that had a landline connection (mobile-only households excluded); and 2) directory-listed numbers, that is, households with either a landline or mobile telephone number listed in the White Pages. Prevalence estimates of health conditions and behavioural risk factors were presented for the overall population and for the two hypothetical telephone samples. The hypothetical telephone samples were subsamples of the total sample (the landline RDD sample is 72-78 % of the total sample and the directory-listed sample is 50-60 % of the total sample), which means that these subsamples would have a different demographic profile to each other and to the overall sample. Therefore the data for the hypothetical telephone samples were re-weighted to produce health estimates that are reflective of the South Australian population. Re-weighting incorporated the original relative sample weights and adjusted by age, sex and area of residence to the most recent Census or Estimated Resident Population for South Australia.
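The re-weighting step can be sketched as a simple post-stratification in Python: within each age by sex by area cell, the original weights of the retained subsample are scaled so that they sum to the population benchmark for that cell. This is a minimal sketch under stated assumptions; the record layout, the cell labels and the benchmark totals below are hypothetical, and the authors' exact Stata procedure is not given in the source.

```python
from collections import defaultdict


def poststratify(records, population_totals):
    """Scale each record's weight so cell totals match population benchmarks.

    records: list of dicts with a 'stratum' key (e.g. an age/sex/area cell)
             and a 'weight' key (the original sample weight).
    population_totals: dict mapping each stratum to its population count.
    """
    weighted_sums = defaultdict(float)
    for r in records:
        weighted_sums[r["stratum"]] += r["weight"]
    return [
        r["weight"] * population_totals[r["stratum"]] / weighted_sums[r["stratum"]]
        for r in records
    ]


# Hypothetical subsample (e.g. respondents in households with a landline)
subsample = [
    {"stratum": "A", "weight": 1.0},
    {"stratum": "A", "weight": 3.0},
    {"stratum": "B", "weight": 2.0},
]
benchmarks = {"A": 100, "B": 50}   # hypothetical population counts per cell
new_weights = poststratify(subsample, benchmarks)
```

Weighting of this kind corrects the demographic profile of the subsample on the weighting variables only; it cannot remove differences between covered and non-covered households within the same cell.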
To determine the amount of bias of the prevalence estimates derived from the two hypothetical sampling frames, the relative coverage bias (RCB) was calculated by the following formula [41]:

$$\mathrm{RCB} = \frac{N_{nc}}{N} \cdot \frac{p_{c} - p_{nc}}{P}$$

This formula incorporates the proportion of the population that is not included in the hypothetical samples ($N_{nc}/N$), that is, 1) mobile-only households, and 2) households that do not have either a mobile or landline telephone number listed in the telephone directory ($N_{nc}$ denotes the number in the sample that is not covered in the total sample, $N$). It also includes the difference between the prevalence estimate obtained from the hypothetical samples, $p_{c}$, and that from the sample not in the hypothetical samples, $p_{nc}$, divided by the prevalence estimate for the total population, $P$.

Table S1. Table 1 shows the proportion of respondents living in mobile-only households by socio-demographic variables across the four years. Generally, respondents living in mobile-only households were more likely to be male, younger, of Aboriginal or Torres Strait Islander descent, born in Asia or countries other than Australia, UK,

The prevalence estimates of various health conditions and behavioural risk factors for all households, for people who live in households with a landline connection (hypothetical landline RDD sample) and for people who live in a household with a directory-listed landline or mobile telephone number (hypothetical directory-listed sample) are shown in Table 2. The RCB for the prevalence estimates derived from the two hypothetical samples are also in Table 2. There were small absolute differences in the prevalence estimates for current asthma, arthritis and obesity between the hypothetical telephone samples and the overall sample. The prevalence estimates for diabetes by the two hypothetical samples did not differ in 2010 and 2011; however, the prevalence estimate was slightly underestimated (RCB value of −0.077) in 2013 for the directory-listed sample.
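For readers who want to reproduce RCB values of the kind reported in Table 2, the formula from the Statistical analyses can be computed as in the following minimal Python sketch. The function name and all input numbers are hypothetical and are not taken from the survey; they only show how the sign of the result reads.

```python
def relative_coverage_bias(n_not_covered, n_total, p_covered, p_not_covered, p_total):
    """RCB = (N_nc / N) * (p_c - p_nc) / P.

    A negative value means the covered (telephone) sample underestimates
    the prevalence relative to the total population.
    """
    if n_total <= 0 or p_total <= 0:
        raise ValueError("n_total and p_total must be positive")
    return (n_not_covered / n_total) * (p_covered - p_not_covered) / p_total


# Hypothetical inputs: 25 % of the sample is not covered by the frame, and
# prevalence is 15 % among covered and 25 % among non-covered respondents,
# giving an overall prevalence of 17.5 %.
rcb = relative_coverage_bias(
    n_not_covered=250, n_total=1000,
    p_covered=0.15, p_not_covered=0.25, p_total=0.175,
)
```

With these made-up inputs the RCB is negative, which is the pattern the text describes for indicators that are more common in the excluded households.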
Even though the prevalence estimates for arthritis were similar for both hypothetical samples, the prevalence estimate for arthritis in 2010 was underestimated for the directory-listed sample (RCB value of −0.083) compared to the overall sample (prevalence of 20.7 vs. 21.4 %). The prevalence of having a mental health condition showed mixed results for both hypothetical samples and over time: it was underestimated for both samples, with estimates from the directory-listed sample having larger RCB values (ranging from −0.102 to −0.242), with the exception of 2011, when mental health conditions were instead overestimated (RCB value of 0.056). Current smoking prevalence was lower for both hypothetical telephone samples with absolute differences ranging from 2.9 to 3.4 percentage points for RDD landline samples and 3.

Our results show that mobile-only respondents are different across a range of socio-demographic indicators, which is similar to international studies [13,15,30]. Using hypothetical sampling frames (RDD landline and EWP directory listing) that were weighted to the age and sex structure of the South Australian population produced contradictory results for health prevalence estimates when compared to all households in the face-to-face survey. Prevalence estimates of diabetes, current asthma, arthritis and obesity had very minor differences and biases, but the prevalence estimates for mental health condition and current smoking indicate biases using either the RDD landline or the EWP directory listing sampling frame. Even though our results show that mobile-only respondents are demographically different across a range of socio-demographic indicators, appropriately weighted data can produce reliable prevalence estimates for some health indicators, but not for others.
These findings suggest landline-based sampling frames used in Australia are potentially biased for some health indicators, such as current smoking and having a mental health condition, particularly where conditions or risk factors are more common amongst those living in mobile-only households. Researchers using either RDD or directory-listing landline sampling frames need to be aware of their limitations and of the potentially biased estimates that arise because of the groups that are excluded from the sampling frames. This study is important because it quantifies the potential biases from the various landline-based telephone sampling frames used in Australia and the groups that are potentially excluded. Even though the data are limited to South Australia, the conclusions may be generalizable to the Australian population. This study is unique since the same questions have been asked annually for eight years and, using the face-to-face methodology in which all types of households are included (mobile-only, landline-only or both), it had the ability to examine, over time, the prevalence estimates of various health indicators by telephone status. Very few studies like this are known to exist nationally [14] or internationally [15,30], and even fewer assess the effect on health indicators [30].

Results
The trends and demographic differences found in this study are similar to national and international studies [11,14,15,30,42,43] and support findings from our previous research [26]. Our estimate of mobile-only households in 2012 (23.9 %) was higher than the estimate reported by the Australian Communications and Media Authority (19 %) [14]; the proportion of households with a landline telephone in 2010 was 82.5 %, which was slightly higher than the 80.3 % estimate from the 2010-11 Australian Health Survey (AHS); and our estimate of 68.7 % of landline telephone numbers listed in the telephone directory was slightly lower than the 70.1 % from the AHS 2010-11 survey [44]. Between 2006 and 2008 the proportion of mobile-only households remained low; since 2009, however, it has steadily increased, following international patterns [30]. Similarly for landline ownership, the proportion was over 80 % up to 2011 but has since steadily decreased, to 71.9 % in 2013. These changes are mainly due to the increasing popularity of mobile technology, which offers greater flexibility and affordability. People are using landlines less frequently because they are able to have a single device with multiple communication and media services, which is less expensive than having a landline connection [13].
In our previous study [26], nearly 10 % of the population in 2008 lived in mobile-only households, and we showed that with appropriate weighting, the sampling methodology used for telephone surveys produced reliable health estimates, with the exception of smoking prevalence in South Australia being underestimated. In contrast, with more recent data and up-to-date analyses, this study has estimated that close to 30 % of the Australian population now live in mobile-only households, and these analyses have demonstrated the impact of the vast changes in telecommunications over the eight-year study period on the coverage of the sampling frames. Excluding a distinct subpopulation from the landline sampling frames, namely mobile-only households, resulted in under- or over-estimation in some health estimates, although with appropriate weighting most health estimates (except smoking and mental health) were very similar to the overall population. Even though the differences in health estimates (absolute differences and RCB values) between the overall population and the two hypothetical landline sample groups showed no clear pattern over time, the results do highlight that for specific health indicators, such as current smoking and mental health, the bias was consistently in the direction of under-estimation for both the RDD and directory-listed landline hypothetical samples. The other conditions (diabetes, current asthma, arthritis and obesity) had small absolute differences in health estimates and RCB values that were inconsistent in pattern but relatively low over time, which may suggest that the differences could be due to the random nature of the sample or other sampling errors. Our findings for current smokers, asthma and obesity are similar to USA studies [30] using similar methodology, and are consistent with studies using dual-frame telephone surveys for mental health [45], current smoking [30,46,47], asthma [47], and obesity [30].
This suggests that an alternative sampling, surveying or statistical methodological approach may need to be considered, so that the excluded groups of the population are included and the coverage biases in landline-based sampling frames are removed.
Many studies have explored various methods to include the mobile-only group in chronic disease and risk factor surveillance systems [12,48]. The favoured method is an overlapping dual-frame design which involves two independent samples: a sample of mobile telephones and a landline-based sample [34,35,46,49]. These studies showed an improvement in representativeness, in particular for men, the younger and middle age groups, and people who were never married. However, obtaining a sample of mobile telephone numbers does have drawbacks, including low response rates and costs two to four times those of landline-based samples [34]. More importantly, the mobile sample that is currently available and used in Australia consists of randomly generated mobile telephone numbers with no geographical marker. From a South Australian perspective, only 8 % of all mobile telephone numbers in Australia were estimated to be owned by South Australians [34,35,46,49], which is close to the state's share of the Australian population (7.4 %). This means a much larger initial sample is required for screening, and with the additional problem of low response rates, including mobile numbers using these methods in a chronic disease and behavioural risk factor surveillance system in South Australia would be costly. Even though 98 % of South Australians have a mobile telephone and it is perceived that people can be reached anytime, it does not mean that they are willing or able to use it to complete a survey. Mobile telephone calls can arrive at unpredictable moments when it is not suitable for the owner to respond, such as when driving (safety issue), travelling overseas (which can incur a large cost to the researcher or participant), or during a meeting or in a restaurant (privacy issue); all have an impact on response rates [43].
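The screening burden implied by the 8 % figure above can be made concrete with a back-of-the-envelope calculation in Python. This is only a rough sketch: the function name is hypothetical, the 25 % response rate and the target of 1000 interviews are made-up inputs, and the calculation ignores disconnected numbers and other sources of loss.

```python
import math


def numbers_to_screen(target_interviews, in_scope_share, response_rate):
    """Rough count of randomly generated numbers needed for a target sample.

    in_scope_share: share of generated numbers belonging to the target area
                    (about 0.08 for South Australia, per the text).
    response_rate:  share of in-scope contacts who complete an interview
                    (hypothetical input).
    """
    if not (0 < in_scope_share <= 1 and 0 < response_rate <= 1):
        raise ValueError("shares must be in the interval (0, 1]")
    return math.ceil(target_interviews / (in_scope_share * response_rate))


# Hypothetical planning scenario for a state-level mobile sample
needed = numbers_to_screen(target_interviews=1000, in_scope_share=0.08, response_rate=0.25)
```

Under these assumed inputs the required number of generated numbers is in the tens of thousands, which illustrates why the text describes state-level mobile sampling as costly.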
Mixed-mode methods have also been suggested as a way to complement the traditional landline telephone survey by combining face-to-face, mail, and internet surveys [50]. These alternative modes introduce other methodological issues and the design of each mode needs to be taken into consideration. Features of the questionnaire design for CATI surveys, for example complicated skip patterns or data range checks, need to be carefully considered in other modes such as mail surveys [51]. Face-to-face, mail and internet surveys can have the option of longer worded questions, explanations, and visual or prompt cards, which are not recommended or possible with CATI surveys. Therefore, the wording of the questions in telephone surveys needs to be clear, concise and short [52]. Operational differences can have an impact on how the questions are answered. Telephone surveys are mainly interviewer-administered whereas mail or internet surveys are self-administered, which can lead to different responses [50,51]. In telephone surveys, the interviewer has control over who is the selected respondent within the household, whereas in mail or internet surveys the household members themselves determine who responds [12]. The level of privacy varies by survey mode: it is high with mail or internet surveys and moderate with telephone surveys (others listening in, or answering sensitive questions) [53]. Mail surveys require a longer data collection period compared to the allocated time period for telephone surveys. In an attempt to include respondents from mobile-only households, one study examined the possibility of using two modes, telephone and mail, with a single database that consisted of residential addresses. However, it found that the groups that were under-represented in telephone surveys were also under-represented in the mail surveys [48].
Another consideration for surveillance systems that use the telephone to collect data is the challenge of how to incorporate alternative modes while still maintaining the timeliness, flexibility, low non-response and low cost of the system [12]. Other methodological studies have used statistical approaches, in particular alternative weighting strategies such as raked weights, which incorporate a wider range of sociodemographic variables; these can improve the health estimates and bring them more in line with face-to-face surveys [54-56].
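Raking (iterative proportional fitting) adjusts weights to several sets of marginal population totals in turn until all margins are matched. The following minimal Python sketch shows the idea; it is a generic illustration and not the procedure of the cited studies, and the record layout, variable names and target totals are hypothetical. It assumes every category in the targets appears in the sample and that all sets of targets sum to the same population total.

```python
def rake(records, margins, max_iter=100, tol=1e-8):
    """Iteratively scale weights so weighted totals match each margin.

    records: list of dicts holding categorical variables and a 'weight'.
    margins: {variable: {category: target_total}}.
    """
    weights = [r["weight"] for r in records]
    for _ in range(max_iter):
        max_change = 0.0
        for var, targets in margins.items():
            # Current weighted total of each category of this variable
            totals = {}
            for r, w in zip(records, weights):
                totals[r[var]] = totals.get(r[var], 0.0) + w
            # Scale every record so the category totals hit the targets
            for i, r in enumerate(records):
                factor = targets[r[var]] / totals[r[var]]
                weights[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        if max_change < tol:
            break
    return weights


# Hypothetical sample of four respondents and hypothetical population margins
sample = [
    {"sex": "M", "age": "young", "weight": 1.0},
    {"sex": "M", "age": "old", "weight": 1.0},
    {"sex": "F", "age": "young", "weight": 1.0},
    {"sex": "F", "age": "old", "weight": 1.0},
]
targets = {"sex": {"M": 60, "F": 40}, "age": {"young": 30, "old": 70}}
raked = rake(sample, targets)
```

Unlike cell-based post-stratification, raking only needs the marginal totals for each variable, which is what makes it practical to add further sociodemographic variables.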
The study design used in this research is robust due to the large representative state-wide samples used and is unique in that the data were collected over eight years using the same or similar questions, and by one organisation, thus minimising interviewer biases. These data are also very recent, and this is one of the few face-to-face studies conducted in Australia and worldwide that included questions on landline and mobile telephone status as well as questions on health status and behavioural risk factors [30], so that the biases in health estimates can be assessed. However, the results could be biased due to the only moderate response rates (median = 59.3 %), which follow the trends observed interstate and overseas. This study only analysed a few health-related variables, and additional questions such as health service usage, quality of life or alcohol consumption would have provided a more comprehensive description of telephone sampling biases.

Conclusion
Telephone surveys have become a standard and accepted method of collecting health information in Australia and are widely used to monitor chronic disease and behavioural risk factors. Such surveillance systems provide evidence to inform interventions and service planning with the aim of reducing the impact of chronic diseases and their associated costs to the health system. Analyses like those presented here are important to assess whether the health estimates obtained are biased due to sampling methodology. This study has shown that the proportion of mobile-only households is increasing and does not appear to have reached a plateau. This corresponds with the decrease in landline telephone coverage. Even with appropriately weighted data, estimates from landline-based sampling frames in Australia are potentially biased for some health indicators. This implies that the landline sampling frames that are currently used in most Australian chronic disease and risk factor surveillance systems (RDD landline or directory-listed telephone numbers) are not sufficient on their own because of the exclusion of mobile-only households. Other methodologies that are timely, cost-effective and efficient need to be considered for small states like South Australia.

Availability of data and materials
The Health Omnibus Survey (HOS) is a user-pay survey in which various organisations pay for their questions to be included in the surveys. Because of this, the authors of this study do not own all of the HOS data and permission had to be sought from each owner, therefore data are not publicly available.

Ethical statement
Ethical approvals were obtained from the Research Ethics Committees of The University of Adelaide and the South Australian Department of Health. Participation in the study is voluntary. Verbal informed consent was obtained from participants at the start of the interview.