Translation method is validity evidence for construct equivalence: analysis of secondary data routinely collected during translations of the Health Literacy Questionnaire (HLQ)


 
 Cross-cultural research with patient-reported outcomes measures (PROMs) assumes that the PROM in the target language will measure the same construct in the same way as the PROM in the source language. Yet translation methods are rarely used to qualitatively maximise construct equivalence or to describe the intents of each item to support common understanding within translation teams. This study aimed to systematically investigate the utility of the Translation Integrity Procedure (TIP), in particular the use of item intent descriptions, to maximise construct equivalence during the translation process, and to demonstrate how documented data from the TIP contributes evidence to a validity argument for construct equivalence between translated and source language PROMs.
 
 
Analysis of secondary data was conducted on routinely collected data in TIP Management Grids of translations (n = 9) of the Health Literacy Questionnaire (HLQ) that took place between August 2014 and August 2015: Arabic, Czech, French (Canada), French (France), Hindi, Indonesian, Slovak, Somali and Spanish (Argentina). Two researchers first independently and deductively coded the data against nine common types of translation errors. A second round of coding included a newly identified 10th code. Coded data were compared for discrepancies and, when needed, checked with a third researcher for final code allocation.
 
 
 Across the nine translations, 259 changes were made to provisional forward translations and were coded into 10 types of errors. Most frequently coded errors were Complex word or phrase (n = 99), Semantic (n = 54) and Grammar (n = 27). Errors coded least frequently were Cultural errors (n = 7) and Printed errors (n = 5).
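As a quick arithmetic check on these counts, the following minimal sketch expresses each reported frequency as a share of the 259 changes. The variable and function names are illustrative only, and the percentages are derived here from the reported counts rather than taken from the source analysis.

```python
# Counts reported in the text; the other five codes are not itemised there.
TOTAL_CHANGES = 259
reported_counts = {
    "Complex word or phrase": 99,
    "Semantic": 54,
    "Grammar": 27,
    "Cultural": 7,
    "Printed": 5,
}


def share_of_total(count, total=TOTAL_CHANGES):
    """Return a count as a percentage of all coded changes, to one decimal."""
    return round(100 * count / total, 1)


shares = {code: share_of_total(n) for code, n in reported_counts.items()}

# Changes assigned to the five codes whose counts are not reported here.
unreported = TOTAL_CHANGES - sum(reported_counts.values())

for code, pct in shares.items():
    print(f"{code}: {reported_counts[code]} ({pct}%)")
print(f"Other five codes combined: {unreported}")
```

On these figures, Complex word or phrase errors account for a little over a third of all changes, and the five itemised codes together cover 192 of the 259 changes.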
 
 
To advance PROM validation practice, this study investigated a documented translation method that includes the careful specification of descriptions of item intents. Assumptions that translated PROMs have construct equivalence between linguistic contexts can be incorrect due to errors in translation. Of particular concern was the use of high-level, complex words by translators, which, if undetected, could cause flawed interpretation of data from people with low literacy. Item intent descriptions can support translations to maximise construct equivalence, and documented translation data can contribute evidence to justify score interpretation and use of translated PROMs in new linguistic contexts.



Introduction to the Translation Integrity Procedure (TIP)
This manual describes the Translation Integrity Procedure (TIP). The TIP guides researchers, program managers and translators to produce translations of psychometric questionnaires that are conceptually equivalent to the source language questionnaire, are appropriate for the target country and/or cultural group, use natural and acceptable language and phrasing, are easily read and understood by people with low literacy, and demonstrate a measurement performance that is equivalent to the source questionnaire. The TIP has been tested across many languages and over time (see Section 4). The procedure is outlined in detail in Section 2.

Translation licence
Please note that a translation licence may be required for the questionnaire you wish to translate. Always contact the authors of any questionnaire you plan to translate prior to the translation to determine copyright and intellectual property issues. Fees may be involved.

Background - construct equivalence between source and target languages
A core difference between the translation of technical documents (such as letters and reports) and psychometric questionnaires is the need to maximise measurement equivalence between the source and the target language questionnaires, as well as to accommodate linguistic and cultural adaptation. This section explains the importance of having a systematic and documented translation method to qualitatively maximise construct equivalence, and to contribute evidence to justify the score interpretations for use in the new linguistic context.
Psychometric questionnaires consist of one or more constructs (i.e., abstract concepts), each of which is represented by a psychometric scale. Each scale consists of items (questions or statements that require a response) that measure elements of its construct. Items work together within a scale to measure specific and independent constructs. Respondents usually score the items along an ordinal scale (e.g., response options that range from 'strongly agree' to 'strongly disagree').
To achieve accurate measurement across the full continuum of a person's status, a questionnaire scale will typically have 4 to 6 items that measure varying strengths of the elements of the construct. That is, scale items are designed to make it easier or more difficult for respondents to positively endorse those items. So translators must be mindful of maintaining the same item meanings and maintaining the same range of measurement relationships between items in a scale.
It is a considerable challenge to achieve equivalence of abstract constructs across different languages and cultures, especially when languages can be linguistically and conceptually very different. It is the task of a translation team to ensure that the translation method maximises the measurement equivalence of items (and thus of constructs) between the source and target languages. Intended users of a questionnaire must be assured that the items of the translated questionnaire will measure the same constructs in the same way as the source language questionnaire when used for the same purpose and under the same circumstances (same context). An analogy for capturing the same construct in two languages is that of measuring characteristics of water in Celsius and in Fahrenheit. A thermometer in Celsius has a scale that measures frozen water at 0 °C (an extreme endpoint) through gradations of temperatures to boiling at 100 °C (the other extreme endpoint). A score on a Fahrenheit scale for frozen (32 °F) and a score for boiling (212 °F) still measure the same characteristics of water even though the 'language' to name the condition is different.
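The temperature analogy can be made concrete with the standard conversion formula, F = C × 9/5 + 32. The short sketch below is an illustration of the analogy only, not part of the TIP; the function name is an assumption made for this example.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a Celsius reading to Fahrenheit: F = C * 9/5 + 32."""
    return celsius * 9 / 5 + 32


# The two endpoints named in the analogy.
freezing_f = celsius_to_fahrenheit(0)
boiling_f = celsius_to_fahrenheit(100)

# Readings keep the same rank order on either scale, which is the property
# a translated scale should share with its source scale.
readings_c = [100, 0, 40, 20]
readings_f = [celsius_to_fahrenheit(c) for c in readings_c]
same_order = sorted(range(4), key=lambda i: readings_c[i]) == sorted(
    range(4), key=lambda i: readings_f[i]
)
```

The numbers differ between the two scales, but the endpoints and the ordering of readings between them are preserved. This is the sense in which a translated scale should cover the same range of the construct as the source scale.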
An example to help understand the relationships between items within a scale is the measurement of a condition such as depression: the items would be designed to detect extreme depression (that is, suicidal, long-term intractable negative mood, profound adverse effect on the person's life) through to moderate and then to very mild depression (that is, occasional and transient feelings of sadness, but mostly positive mood). For each scale, there need to be items that detect small differences in the strength of a characteristic along any part of the scale, as well as improvements or decrements over time. The translation process must maintain not only the meaning of the source items but also the varying strengths of the characteristics of the constructs (i.e., the measurement relationships between items).
A translated questionnaire will require cognitive testing in the target language, and testing for psychometric comparison with the source language questionnaire prior to use in measurement studies in the target language (see Sections 3 and 4).

The translation team
The core team consists of the chairperson, two forward translators and a back translator. The lead forward translator translates the questionnaire to the target language. The second translator checks the translation of each item against the item intents. Any discrepancies in translation are discussed, using the item intents as the guide, until agreement of the provisional forward translation is reached.
The back translator is a native speaker of the source language with excellent knowledge of and fluency in the target language. It is essential that the back translator does not see the source items so that a blind back translation can be undertaken. The back translation is used by anyone in the translation team who does not know the target language. This back translation will be used by the chairperson to gain an understanding of discrepancies or errors that may be occurring in the forward translation.

The group cognitive interview
The goal of the group cognitive interview is for the translation team to examine translated items against corresponding item intent descriptions, which are included in the Item Intent and Translation Management Grid (the Grid). Discrepancies in meaning between the translated items and the source items are negotiated to maximise equivalence in meaning and measurement. That is, the items in the two language versions must, as closely as possible, measure the same constructs in the same way. There must be construct equivalence to achieve equivalent interpretations of data from a translated questionnaire, and for the data to be comparable across language and cultural settings. Analysis of documented data from a TIP translation method contributes evidence to an evaluation of the extent to which score interpretations are valid for the new linguistic context.

Figure: Translation Integrity Procedure (TIP)
1. Brief forward translators about the TIP and reference to item intents during translation and group cognitive interview.
2. Provisional forward translation checked against item intents and agreement reached by translators and translation team.
3. Back translator (blind to item intents) translates provisional forward translation to source language.
4. Chairperson provides feedback in the Management Grid. Translation team reviews feedback and translation prior to group cognitive interview.
5. Group cognitive interview. Minimum 4 people: Chairperson + lead translator + one other translator + native English speaker fluent in target language. Can also include the project manager + field workers / bilingual representatives. Discussion to examine each translated item against its item intent description.
6. Management Grid: lead translator records final translation, all changes to provisional translation, and reasons for changes.
7. Translated questionnaire undergoes qualitative and quantitative validity testing prior to use in studies.

The Translation Integrity Procedure (TIP)
A TIP translation package includes the following documents: Please note that the back translator must remain blind to the items and item intents, so the forward translation must not be sent in the Grid to the back translator.

Forward translation
There are two forward translators. Both are native speakers of the target language with very high levels of proficiency (fluency) in the source language. The forward translators must continually refer to the item intent descriptions during the translation. The lead translator first translates the items from the source language to the target language. This translation is then checked by the second forward translator. Both translators then confer, with reference to the item intent descriptions, to decide on an agreed provisional forward translation. They need to agree that the translation is as close to the conceptual meaning of the source questionnaire as can be accomplished, that it can be understood by people with low literacy skills, and that it uses natural language that is appropriate to the country or culture of the target language group. Furthermore, the number of words must be kept to a minimum to reduce the burden on people completing the questionnaire.
If there is persistent discrepancy between the forward translators about how an item or concept should be translated then they can offer alternatives for the back translation. In some circumstances, the alternative items can be later pilot tested with respondents to establish the wording that best conveys the meaning of the source item. When both the forward translators are satisfied with the translated document, the provisional forward translation can then be passed to the back translator. It is very important that the back translator is kept blind to the source items and the item intent descriptions.
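One way to reduce the risk of accidentally unblinding the back translator is to build their package from the Grid by keeping only the target-language text. The following is a minimal sketch under stated assumptions: the `GridRow` fields and the `back_translator_package` function are hypothetical simplifications, not the actual layout of the Management Grid.

```python
from dataclasses import dataclass, field


@dataclass
class GridRow:
    """One item in a hypothetical, simplified Management Grid."""
    item_id: str
    source_item: str
    item_intent: str
    provisional_forward: str
    # Alternative wordings offered when the forward translators disagree.
    alternatives: list = field(default_factory=list)


def back_translator_package(rows):
    """Keep only what the back translator may see: item id and target-language text."""
    return [
        {
            "item_id": row.item_id,
            "provisional_forward": row.provisional_forward,
            "alternatives": list(row.alternatives),
        }
        for row in rows
    ]


# Placeholder strings stand in for real questionnaire content.
grid = [
    GridRow("item-1", "<source item>", "<item intent>", "<target-language text>"),
    GridRow("item-2", "<source item>", "<item intent>", "<target-language text>",
            alternatives=["<alternative wording>"]),
]
package = back_translator_package(grid)
```

The package is built by listing the fields to include rather than removing the fields to hide, so any column later added to the Grid stays hidden from the back translator by default.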

Back translation
The back translator is a native speaker of the source language with high proficiency (fluency) in the target language. The back translator must be kept blind to the source items while translating the provisional forward translation back into the source language. The main purpose of the back translation is for use in the group cognitive interview when the chairperson and/or other team members do not speak the target language. The back translation will help alert the chairperson to discrepancies in meaning between the source language items and the provisional forward translation. These discrepancies can then be discussed with the forward translators to arrive at a final consensus translation.

Preparation for the group cognitive interview
Prior to the group cognitive interview, the study manager prepares the Grid with both
• Linguistic issues - the lexical, syntactic or semantic demands of the target language appear to cause discrepancies between the meaning of a source language item and the translated item
• Back translation issues - the back translation is incorrect or has not offered a suitable alternative, and causes concern about the meaning conveyed in the translated item
• Translation issues - the back translation accurately identifies errors in the provisional forward translation that need to be corrected
The Grid with the chairperson's feedback is returned to the client to be distributed to all translators. Each translator reviews the feedback separately and records their comments in the Grid. This is an opportunity for the translators to develop explanations for, or corrections to, their translations, including agreement or disagreement with the other translators and the chairperson. All areas for concern are clearly highlighted and made ready for the translation team to discuss during the group cognitive interview.
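The three kinds of feedback named above (linguistic, back translation and translation issues) could be used to organise the items flagged for discussion. The sketch below is illustrative only: the `FeedbackType` enum, the `group_for_discussion` function and the item identifiers are assumptions made for this example, not part of the TIP documents.

```python
from collections import defaultdict
from enum import Enum


class FeedbackType(Enum):
    """The three kinds of issue described in the text."""
    LINGUISTIC = "Linguistic issue"
    BACK_TRANSLATION = "Back translation issue"
    TRANSLATION = "Translation issue"


def group_for_discussion(feedback):
    """Group (item_id, FeedbackType) pairs by type, preserving item order."""
    grouped = defaultdict(list)
    for item_id, kind in feedback:
        grouped[kind].append(item_id)
    return dict(grouped)


# Hypothetical feedback entries; item identifiers are placeholders.
feedback = [
    ("item-1", FeedbackType.TRANSLATION),
    ("item-2", FeedbackType.LINGUISTIC),
    ("item-3", FeedbackType.TRANSLATION),
]
agenda = group_for_discussion(feedback)
```

Grouping in this way separates items where the forward translation probably needs correcting (translation issues) from items where the concern may lie in the back translation itself.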
During the meeting, it is important to not dwell on the back translation. It is only used to support negotiation of the forward translation. It will be disregarded after the meeting.

The group cognitive interview
The aim of this meeting is for the chairperson to determine the quality of the translation. This occurs through in-depth discussions with the translation team about the meanings of translated words or phrases to determine if the translated items convey the same or comparable meanings as the source language items. Depending on the quality of the translation (i.e., adherence to the item intents), the discussions will take between 3 and 5 hours. The final translation is arrived at through these negotiations between the chairperson and the translation team. The translated items are then ready to undergo validity testing.
The following table outlines the participants who need to attend the group cognitive interview. Approaches to questionnaire development and validity testing are described elsewhere [1][2][3][4][5].

Figure 2. Group cognitive interview - participants
Validity evidence for a translated questionnaire can include data from analysis of the documented translation process, cognitive interviews with the target population, and psychometric analyses to compare data with validity studies of the source language questionnaire. Evaluation of these types of validity evidence will determine the extent to which the intended interpretation of scores is valid for the intended use in the new linguistic context.
It is recommended that existing or newly generated evidence from five sources be considered for the evaluation of the validity of score interpretation and use in a new context [5]:
1. Evidence based on test content - the relationship of the item themes, wording and format with the intended construct, including administration process
2. Evidence based on response processes - the cognitive processes and interpretation of items by respondents and users, as measured against the intended construct [5][6][7].
Feedback from the target respondents can provide rich information about the appropriateness of the language used, the concepts conveyed, and the suitability for the target country or culture. Quantitative statistical validation is required to test the psychometric properties of a translated questionnaire against those of the source language version.
An evidence-based argument for the extent to which score interpretations of the translated questionnaire are valid for the intended use is needed before conducting measurement studies in the field.