Coding linguistic elements in clinical interactions: a step-by-step guide for analyzing communication form

Stortenbeker, Inge; Salm, Lisa; olde Hartman, Tim; Stommel, Wyke; Das, Enny; van Dulmen, Sandra

doi:10.1186/s12874-022-01647-0

BMC Medical Research Methodology

Table 5 A case study illustrating the codebook development for CLECI [16]

Phase 1: Research question and data collection
Step	Example from Stortenbeker et al. (2022)
Research question	“To what extent do linguistic markers in utterances differ between general practice patients presenting MUS and MES?”
Data collection	Verbatim transcripts of general practice consultations were derived from an existing research project [36].
Phase 2: Codebook development
Step	Issue	Action	Example from Stortenbeker et al. (2022)
Selection criteria	Inclusion and exclusion	Define research scope	Language use of patients presenting medically explained or unexplained symptoms to GPs.
		Read through training consultations	Patients talk about their past (‘but it was always low’) or current health problems (‘I am unstable’) as well as about potential future health issues (‘I think it could go wrong’).
		Redefine selection criteria	Scope was limited to include only utterances relating to current or past condition of patients, not prospective conditions.
Unit of analysis	Turn constructional unit	Define unit of analysis	Grammatical finite clauses served as the unit of analysis in earlier stages.
		Read through training consultations	A more flexible unit of analysis was needed for subjectivity markers in cases such as ‘[I notice though] [that I’m getting sensitive to it]’.
		Redefine unit of analysis	Turn constructional unit was selected as the new unit of analysis.
Deductive categorization	Retain predefined category	Scan literature for relevant linguistic elements	Patients with MUS use more negations when describing (non-) occurrences of symptoms than patients with MES [37, 38].
		Formulate code	Negation – a) absent; b) syntactic; c) morphological
		Read through training consultations	Plenty of examples were found, such as ‘I am unstable’ and ‘I cannot move comfortably’, so negation was retained in the revised codebook.
Deductive categorization	Exclude predefined category	Scan literature for relevant linguistic elements	Doctors use more ‘illness terms’ (e.g. urination problems) towards MUS patients, whereas MES patients are often described with ‘disease terms’ (e.g. bladder infection) [39].
		Formulate code	Terminology – a) illness; b) disease
		Read through training consultations	Differentiating between the two was not easy (e.g. ‘I got dizzy’, ‘well then you’re all worn out’) and remained subjective. As an objective definition of the boundaries was not possible, the category was removed from the codebook.
Inductive categorization	Include category based on observations	Read through training consultations	Salient utterances such as ‘that ear keeps on whizzing’ were marked, suggesting that ‘that ear’ operates as a separate agent, as opposed to ‘I can hear pretty badly’.
		Scan literature for relevant studies	Patients can be disconnected from emotional and/or somatic experiences to varying degrees [40].
		Formulate new code	Grammatical subject – a) first person (the patient, ‘I’); b) third person (patient’s biomedical or psychosocial state, ‘that ear’).
Iterative refinement	Add subcategory after test coding	Define code	Grammatical subject – a) first person; b) third person.
		Read through training consultations	Some utterances could not be classified as having a first- or third-person subject, such as ‘[positive though] [that I do not have any new lesions]’, in which no subject is present in the first TCU.
		Redefine code	An “empty subject” subcategory was included in the revised version of the codebook.
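
The code categories that this phase ends with can be written down as a small data structure, which makes it easy to check that every coded turn constructional unit (TCU) uses only values the codebook allows. The following is a minimal sketch, not the authors' tooling: the names `CODEBOOK` and `validate_coding` are hypothetical, only the two categories named in the table are included, and the example codings are illustrative readings of the quoted utterances rather than data from the study.

```python
# Hypothetical representation of two categories from the revised codebook.
CODEBOOK = {
    "negation": ("absent", "syntactic", "morphological"),
    "grammatical_subject": ("first_person", "third_person", "empty"),
}


def validate_coding(coding):
    """Return a list of problems found in one coded TCU (empty if valid)."""
    problems = []
    # Every codebook category must be present with an allowed value.
    for category, allowed in CODEBOOK.items():
        if category not in coding:
            problems.append(f"missing category: {category}")
        elif coding[category] not in allowed:
            problems.append(
                f"invalid value for {category}: {coding[category]!r}"
            )
    # Categories that are not in the codebook are flagged as well.
    for category in coding:
        if category not in CODEBOOK:
            problems.append(f"unknown category: {category}")
    return problems


# Illustrative codings (assumed, not taken from the study's data).
coded_tcus = {
    "I am unstable": {
        "negation": "morphological",
        "grammatical_subject": "first_person",
    },
    "that ear keeps on whizzing": {
        "negation": "absent",
        "grammatical_subject": "third_person",
    },
}

for utterance, coding in coded_tcus.items():
    print(utterance, "->", validate_coding(coding) or "ok")
```

Keeping the allowed values in one place means that a refinement such as adding the “empty subject” subcategory is a one-line change, after which earlier codings can be re-validated.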
Phase 3: (double) coding
Step	Issue	Action	Example from Stortenbeker et al. (2022)
Double-coding	Refine coding categories	Double code session	The intensity category displayed a kappa of .66.
		Explore systematic differences	One coder did not interpret certain time words as intensifiers, whereas the other coder did, e.g. ‘sometimes’, ‘all of a sudden’.
		Fine-tune codebook and coders	Remarks were added to the codebook. Words denoting an increase or decrease in time or frequency are marked only when intensified: ‘after that it was wrong again’ is not intensified, whereas ‘all the time I think oh I’m getting tired’ is intensified.
Coding	N/A	N/A	Final coding was performed by the main researcher in various separate coding sessions. Cases of doubt were marked and evaluated at a later point in time.
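
The double-coding step above reports an agreement statistic and then looks for systematic differences between coders. A minimal sketch of both, assuming two coders and Cohen's kappa (the table says only "Kappa", so the exact variant is an assumption), could look as follows. The function names and the label sequences are hypothetical and are not data from the study.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two equally long sequences of category labels."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty item set")
    n = len(coder_a)
    # Observed agreement: share of items given the same label by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement, from each coder's marginal label frequencies.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    labels = set(freq_a) | set(freq_b)
    expected = sum(freq_a[label] * freq_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)


def disagreements(coder_a, coder_b):
    """Indices of the items on which the two coders differ."""
    return [i for i, (a, b) in enumerate(zip(coder_a, coder_b)) if a != b]


# Hypothetical labels: "I" = intensified, "N" = not intensified.
coder_a = ["I", "I", "I", "I", "N", "N", "N", "N", "N", "N"]
coder_b = ["I", "I", "I", "N", "N", "N", "N", "N", "N", "I"]

kappa = cohens_kappa(coder_a, coder_b)
print(f"kappa = {kappa:.2f}")
print("items to discuss:", disagreements(coder_a, coder_b))
```

Listing the disagreeing items alongside the statistic is what makes patterns visible, such as one coder treating time words like ‘sometimes’ as intensifiers and the other not.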
Phase 4: Analysis and reporting
Step	Example from Stortenbeker et al. (2022)
Analysis	Binary logistic random-intercept models with various linguistic markers as outcome variables, and consultation type (unexplained or explained symptoms) and codes related to message content as predictor variables, controlled for various relevant confounders.
Reporting	Distinguished between hypothesis-based and explorative analyses. For more information, see Stortenbeker et al. (2022).
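
For orientation, the general form of a binary logistic random-intercept model, as named in the Analysis row, can be written as below. This is a generic sketch and not the authors' exact specification: it assumes utterances $i$ nested within consultations $j$ (the table does not state the grouping level), and all symbols are introduced here for illustration. $y_{ij}$ indicates whether a given linguistic marker is present, $\mathrm{type}_j$ is the consultation type, $\mathbf{x}_{ij}$ collects the message-content codes and confounders, and $u_j$ is the random intercept.

```latex
\operatorname{logit}\,\Pr(y_{ij} = 1)
  = \beta_0 + \beta_1\,\mathrm{type}_j
  + \boldsymbol{\beta}_2^{\top}\mathbf{x}_{ij} + u_j,
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2)
```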


ISSN: 1471-2288
