Development and validation of a self-administered questionnaire to estimate the distance and mode of children’s travel to school in urban India

Background Although some 300 million Indian children travel to school every day, little is known about how they get there. This information is important for transport planners and public health authorities. This paper presents the development of a self-administered questionnaire and examines its reliability and validity in estimating distance and mode of travel to school in a low resource urban setting. Methods We developed a questionnaire on children’s travel to school. We assessed test re-test reliability by repeating the questionnaire one week later (n = 61). The questionnaire was improved and re-tested (n = 68). We examined the convergent validity of distance estimates by comparing estimates based on the nearest landmark to children’s homes with a ‘gold standard’ based on one-to-one interviews with children using detailed maps (n = 50). Results Most questions showed fair to almost perfect agreement. Questions on usual mode of travel (κ 0.73- 0.84) and road injury (κ 0.61- 0.72) were found to be more reliable than those on parental permissions (κ 0.18- 0.30), perception of safety (κ 0.00- 0.54), and physical activity (κ -0.01- 0.07). The distance estimated by the nearest landmark method was not significantly different than the in-depth method for walking , 52 m [95 % CI -32 m to 135 m], 10 % of the mean difference, and for walking and cycling combined, 65 m [95 % CI -30 m to 159 m], 11 % of the mean difference. For children who used motorized transport (excluding private school bus), the nearest landmark method under-estimated distance by an average of 325 metres [95 % CI −664 m to 1314 m], 15 % of the mean difference. Conclusions A self-administered questionnaire was found to provide reliable information on the usual mode of travel to school, and road injury, in a small sample of children in Hyderabad, India. The ‘nearest landmark’ method can be applied in similar low-resource settings, for a reasonably accurate estimate of the distance from a child’s home to school. Electronic supplementary material The online version of this article (doi:10.1186/s12874-015-0086-y) contains supplementary material, which is available to authorized users.


Background
About 300 million children travel to school every day in India [1]. However, little is known about how they get there. Research from high-income countries shows that children are more likely to use motorised transport if the distance to school is greater [2,3]. Other factors associated with motor vehicle use are age [4][5][6], gender [2,7], parental concerns about safety [8,9], physical infrastructure, and weather conditions [10]. We do not have similar information in India that would help us better understand children's school travel. There is evidence to suggest that everyday travel by walking and cycling is associated with positive health benefits for children [11,12]. We need information on children's travel to school in India to understand the public health impacts of these journeys. Developing methods to measure children's travel to school for use in low resource settings is therefore important.
A range of methods have been used in high-income countries to measure distance from home to school: Geographical Information Systems (GIS) [10,13]; Geographical Positioning Systems (GPS) [14]; travel time [15]; or the 'straight-line' between school and home [4,16]. Distances have been calculated using the shortest route possible along the road network [17] or by asking children to draw their routes to school on image maps which were then digitalized and measured, using GIS [18]. In many low resource settings in India, postcodes and addresses often do not identify dwellings and cannot be used to reliably estimate distance to school. This paper presents the development and testing of a self-administered questionnaire on children's travel to school. This is part of a larger study that aims to estimate the distribution of children's mode of travel to school in Hyderabad (Telangana, India), a city with a population of almost 8 million [19]. A cross-sectional survey is planned to collect data from about 6,000 school children aged 11-14 years, which will be incorporated into a spreadsheet model of the public health impacts of school travel. Accurate estimates of distances and modes of travel by children in Hyderabad is an essential component of the study. The objective of this study was to develop a self-administered questionnaire and examine its reliability and validity in estimating distance and mode of travel to school.

Methods
We developed a questionnaire for use in children aged 11-14 years, as this is typically an age when children may be expected to travel independently [20]. In school terminology, it refers to children in grades 6-9.

Questionnaire development
We searched the literature to identify questions that could be applied in the context of a low resource setting like India (see Additional file 1) [8,21]. We originally identified about 25 items from previously published work on children's independent travel and adapted them for the Indian context [20]. We conducted a focus group with four public health experts to discuss the appropriateness of the questions. We included a question that asked children about the nearest landmark to their home and used this to estimate the distance from home to school. The final questionnaire (Additional file 3) had 21 multiple choice items: four on demographics, nine on mode of travel and travel during dry or wet weather, two items on parental permissions for independent travel, three on children's perceptions of safety, including road traffic injuries, and three items on physical activity after school. These questions were included because of our interest in children's commuting to school in Hyderabad, and its impacts on health.

Reliability studies
We assessed the comprehension of the questionnaire by focus group discussions among children aged 12-15 years, to assess the suitability of questions for the target age. We piloted the questionnaire in a private school (run by a Society/Trust, without government aid) [22] with 12 children of grade nine, noting all requests for clarifications. For assessing the reliability of the questionnaire, we distributed Telugu translated questionnaires to children in grade eight of a government school (n = 61) and conducted a re-test one week later. Telugu is the first language spoken by about 80 million people in India and is the local language in Hyderabad, where this study was conducted. We back-translated the questionnaire, to ensure the correct interpretation of the questions. We conducted a second reliability study in another government school (n = 68). We administered questionnaires using pencil-and-paper methods and read out each question, allowing plenty of time for marking the responses.

Validation of estimated distance
We assessed the validity of the distance estimates based on the 'nearest landmark to home' method, by comparing with a 'gold standard' measure, based on in-depth one-to-one interviews with 50 school children in grades 7, 8 and 9, using detailed maps of their neighbourhood and routes to school. The class teacher randomly selected children using each mode of transport. Fifty children, with 56 % (n = 28) females participated in the 'indepth interview' method. The distribution of school-type was government (30 %, n = 15); semi-private (26 %, n = 13) and private (44 %, n = 22).

Gold standard in-depth interview method
Google Earth [23] was installed on a laptop computer, with a 'place mark' on the map corresponding to the school. We visited one school of each type (i.e. Government, semiprivate and private). After a brief orientation, each child traced the route from his/her home to school, using a finger. Each route was recorded in Google Earth. We used the 'Play tour' viewing mode for children to see and confirm their routes to school, as well as the distance travelled.

Nearest landmark method
Using Google maps, [24] the 'nearest landmark' information of each of the 50 children was entered in the 'from' box and the school address in the 'to' box. The 'give directions' button gave a suggested route and corresponding distance. [Example screenshots of both methods are shown in the Additional file 2].

Statistical analysis
STATA 12 (Stata Corp, College Station, Texas) was used for statistical analysis. For the reliability studies, agreement was assessed for each question using the kappa statistic. Standard categories were used for interpreting agreement (i.e. κ >0.81 'almost perfect' agreement; κ 0.61-0.80 'substantial' agreement; κ 0.41-0.60 'moderate' agreement; κ 0.21-0.40 'fair' agreement; κ 0.01 -0.20 'slight' agreement; κ 0.00 'less than chance' agreement) [25]. The difference between the distances estimated by the two methods was plotted against the average of the two distances using a Tukey/Bland Altman plot [26]. Limits of agreement were calculated as the mean difference ±1.96 × SD, within which 95 % of the observed differences would be expected to lie. A paired sample t-test was used to assess whether the bias (mean difference) was statistically different from zero, where statistical significance was at the 5 % level.
Prior permissions were obtained from the Hyderabad District Education Office. The participating school principals gave verbal consent on behalf of the children, and parents/guardians were informed of the study. Ethics committee approved consent being taken only from the school principal. Ethical approvals were secured from the London School of Hygiene and Tropical Medicine, London, UK, and the Indian Institute of Public Health, Hyderabad, India.

Questionnaire development
The pilot confirmed that the questionnaire could be completed in 15-20 minutes. After the first reliability study, definitions were added for exercise, main roads, and feeling safe. Table 1 shows the results of the reliability studies. There were 61 children in the first reliability study and 68 children in the second. Fifteen children absent during the re-tests were removed from analysis. There was perfect agreement for age, sex and name. Almost all children (67 out of 68) wrote the same landmark in the test and re-test. The first reliability study showed 'substantial' or 'moderate' agreement in 69 % (11/16) questions; 'fair' agreement in 6 % (1/16) questions and 'slight' agreement in 25 % (4/16) questions. The second reliability study showed 'almost perfect' agreement in 11 % (2/17) questions, 'substantial or moderate' agreement in 41 % (7/17) questions, and 'fair' agreement in 23 % (4/17) questions. Questions on usual mode of travel to school showed 'substantial' to 'almost perfect' agreement. The question on road injury showed 'substantial' agreement in both the reliability studies. Questions on parental permissions for independent travel, perceptions of safety, and physical activity after school were shown to be less reliable.   Figure 1 shows the mean difference plot for walking. The dotted lines show the limits of agreement, and the solid line shows the bias (−52 m). Figure 2 shows the mean difference plots for different modes. The dotted lines show the limits of agreement.

Principal findings
The questionnaire on children's travel to school showed that the questions on usual mode of travel, and road injury were reliable. Distance to school measured by asking for the nearest landmark to a child's home was found to be a valid measure of distance when compared to a method based on in-depth interviews with children. This was true for different modes of travel to school in Hyderabad, but to a lesser extent with the school bus.

Strengths and weaknesses
Questionnaires were administered one week apart and some children's motivation and interest may have differed between occasions, altering the quality of their responses. There was a difference in the number of children who took the test and re-test, but it is not expected that the exclusion of the absentees would influence the results. Compared to those present, absentees had similar age (12.9 vs 13.1 years, p = 0.09), and sex (44 % vs 47 % boys, p = 0.55), and prevalence of walking (74 % vs 69 %, p = 0.99).
Due to limited resources, we could not use objective measures of distance such as GPS. Children's home address was not included because many urban areas in India including several localities in Hyderabad are growing rapidly. As a result, they do not have uniformly structured or geocoded searchable addresses on the web [27]. In the absence of searchable addresses, our questionnaire provides a cost-effective alternative. Reliability was assessed using written survey forms instead of 'hand-raising' protocols used in other studies [28].
Google Earth is increasingly being used in Public Health [29,30]. We used Google Earth and Google Maps as they are freely available and easy to use, and due to a lack of access to other GIS tools. It is suggested that Google Earth images should be checked for accuracy [31] because they may not reflect recent changes in landscape like new urban development and recent disasters [32]. The distance from home to nearest landmark was not accounted for in this analysis, and could therefore slightly alter the distance estimated.

Strengths and weaknesses in relation to other studies
The 'in-depth' method of recording children's journeys enabled good quality data to be collected, which was the strength of this study. Other studies have relied on parent's reports [18,33] but we did not involve parents because of concerns about high levels of illiteracy among low-income parents in India. The kappa score for the question on "mode of travel to school today" was lower than that obtained by another study that also used pen and paper (i.e. 0.79 vs 0.98) [25]. This was perhaps because it administered the questionnaire on the same day rather than one week apart. The difference in kappa in our survey could also be due to the difference in the travel behaviour on the day of the survey. Questions on the usual mode of travel and road injury were found to be more reliable than those on parental permissions, perception of safety, and physical activity, and this must be considered before using the questionnaire. The question on physical activity adapted from the WHO Global School Health Survey [34] was found to be especially challenging and many children asked for clarification. No evidence of bias was found in the distance estimate when walking and cycling were combined. The nearest landmark distance was slightly greater for walking, and when walking and cycling were combined, and for auto-rickshaw. Children probably take short-cut  routes which Google may not consider. This was not the case with the school bus, which undertakes long winding routes to collect children from their homes, and does not reflect the distance from home to school that would be travelled using other modes. For all types of motorized travel, the 'nearest landmark' distance was shorter than the 'in-depth interview' distance, with the exception of auto rickshaw, perhaps due to its ability to take short-cut routes, possibly leading to traffic violations [35].

Meaning of the study and future research
This study developed a questionnaire on mode of travel to school and a method to estimate the distance that children travel to school in Hyderabad, India. It may be used to determine whether these are journeys that could be made by walking or cycling. In the absence of searchable databases to pinpoint the home location, we used Google Earth and Google Maps to estimate distance. When we compared the 'nearest landmark' versus 'indepth' distance, they differed by 10 % for walking and cycling. We consider this margin of error to be within acceptable limits of accuracy. For other modes like the school bus, the mean difference is higher, but this is because the school bus does not use a direct route. Future studies can therefore use the nearest landmark method to estimate the true distance that a child would walk or cycle to school. It confirms that the nearest landmark method is feasible, in the absence of GPS equipment and software, especially in low resource urban settings. This method should be tested in rural areas, which have a different pattern of land-use. Further development of this approach, for example using factor analysis to refine the items, may also improve the questionnaire.