Table 1 COSMIN definitions [49] of the evaluated measurement properties, and their quality criteria [19]

From: A systematic review to investigate the measurement properties of goal attainment scaling, towards use in drug trials

Measurement Property

COSMIN definition

Quality criteria (+ equals good to very good quality, +/− equals intermediate quality and – equals poor quality)

Inter-rater reliability

The extent to which scores for patients who have not changed are the same for repeated measurement by different persons on the same occasion

+ ICCᵃ or weighted Kappa ≥0.7

+/− Unclear design or method

- ICC or weighted Kappa <0.7

Intra-rater reliability

The extent to which scores for patients who have not changed are the same for repeated measurement by the same persons (i.e. raters or responders) on different occasions

+ ICC or weighted Kappa ≥0.7

+/− Unclear design or method

- ICC or weighted Kappa <0.7

Face validity

The degree to which the items of a Health-Related Patient-Reported Outcome (HR-PRO) instrument indeed look as though they are an adequate reflection of the construct to be measured

+ A clear description is provided of the measurement aim, the target population, the concepts that are measured, and the item selection, AND the target population was involved in item selection

+/− A clear description of these aspects is lacking, or only the target population was involved, or doubtful design or method

- No target population involvement

Content validity

The degree to which the content of an HR-PRO instrument is an adequate reflection of the construct to be measured

+ A clear description is provided of the measurement aim, the target population, the concepts that are measured, and the item selection, AND the target population was involved in item selection

+/− A clear description of these aspects is lacking, or only the target population was involved, or doubtful design or method

- No target population involvement

Construct validity

The degree to which the scores of an HR-PRO instrument are consistent with hypotheses (for instance with regard to internal relationships, relationships to scores of other instruments, or differences between relevant groups) based on the assumption that the HR-PRO instrument validly measures the construct to be measured

+ Specific hypotheses were formulated and at least 75 % of the results are in accordance with these hypotheses

+/− Doubtful design or method (e.g. no hypotheses)

- Less than 75 % of hypotheses were confirmed

Responsiveness

The ability of an HR-PRO instrument to detect change over time in the construct to be measured

+ SDCᵇ or SDC < MICᶜ or MIC outside the LoAᵈ or RRᵉ > 1.96 OR AUCᶠ ≥0.70

+/− Doubtful design or method

- Negative SDC or SDC ≥ MIC or MIC equals or inside LoA or RR ≤1.96 OR AUC <0.70, despite adequate design and methods

ᵃ ICC: Intraclass Correlation Coefficient
ᵇ SDC: Smallest Detectable Change
ᶜ MIC: Minimal Important Change
ᵈ LoA: Limits of Agreement
ᵉ RR: Responsiveness Ratio
ᶠ AUC: Area Under the receiver operating characteristics Curve
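
The threshold-based criteria in Table 1 can be read as simple decision rules. The sketch below is illustrative only and is not taken from the review: the function names and the `design_clear` / `design_adequate` flags are hypothetical, ratings use ASCII strings (`"+"`, `"+/-"`, `"-"`), a coefficient of exactly 0.7 is rated `"+"` following the "≥0.7" rule, and responsiveness is rated from the AUC criterion alone, ignoring the SDC, MIC, LoA and RR alternatives.

```python
# Illustrative sketch of the Table 1 quality criteria as decision rules.
# All names here are hypothetical; they do not come from the review.

GOOD, INTERMEDIATE, POOR = "+", "+/-", "-"


def rate_reliability(coefficient: float, design_clear: bool = True) -> str:
    """Rate inter- or intra-rater reliability from an ICC or weighted Kappa."""
    if not design_clear:
        # Unclear design or method gives an intermediate rating.
        return INTERMEDIATE
    # Assumption: a value of exactly 0.7 counts as good (">= 0.7").
    return GOOD if coefficient >= 0.7 else POOR


def rate_construct_validity(confirmed: int, tested: int) -> str:
    """Rate construct validity from the number of confirmed hypotheses."""
    if tested == 0:
        # No hypotheses formulated: doubtful design or method.
        return INTERMEDIATE
    # At least 75 % confirmed; integer arithmetic avoids rounding issues.
    return GOOD if 4 * confirmed >= 3 * tested else POOR


def rate_responsiveness_auc(auc: float, design_adequate: bool = True) -> str:
    """Rate responsiveness using only the AUC criterion (a simplification)."""
    if not design_adequate:
        return INTERMEDIATE
    return GOOD if auc >= 0.70 else POOR
```

For example, `rate_reliability(0.82)` gives `"+"`, `rate_construct_validity(2, 4)` gives `"-"`, and `rate_responsiveness_auc(0.75, design_adequate=False)` gives `"+/-"`. Face and content validity are left out because their criteria are qualitative judgments rather than numeric thresholds.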