Ancient Egyptian medical papyri (1550 BC) already emphasised diagnosis by physical examination as the cornerstone of the decision to treat or not to treat an ailment. Today, the clinical assessment of the probability of a disease comes from a series of implicitly and explicitly performed tests. In addition to the implicit diagnostic information from history (risk factors and symptoms) and clinical examination (signs), many additional diagnostic imaging or laboratory tests are available. The accuracy of such tests needs to be appropriately assessed before they can be used in clinical practice.
Primary diagnostic studies typically examine the accuracy of a test in isolation from history and clinical examination, or do not adjust for the overlap of information captured by clinical history, physical examination and additional tests. Such studies, and conventional meta-analyses of their reported results, will therefore not show how useful the test will be in practice [2–4].
In addition to the predominance of isolated, single-test evaluations in the published literature, variations in the design and quality of studies on diagnostic topics [5–8] make the interpretation of test accuracy data difficult [9–12]. Systematic reviews and meta-analyses, by definition, cannot overcome these difficulties. Apart from intrinsic flaws in the original studies and methodological challenges in statistically pooling results [14, 15], there is concern about the generalisability of the results of such meta-analyses, because assumptions about the constancy of accuracy measures (sensitivities, specificities, and likelihood ratios) across different patient groups may be invalid [16–20].
Due to the limited space in medical journals and the lack of standard procedures to make original data accessible, little empirical evidence is available about the influence of many patient and study characteristics (e.g. patient selection criteria, spectrum of disease, frequency of indeterminate test results and of drop-outs, and the degree of blinding) on the estimates of diagnostic performance of tests [13, 21].
Another limitation is that many original reports included in diagnostic and prognostic meta-analyses present data only in dichotomous form, because test results that are continuous in nature are classified as abnormal or normal. Meta-analyses of such reports are therefore based on reduced information and neglect the potential diagnostic information contained in continuous test results. They may also overestimate accuracy, because the original studies may have selected optimal cut-off values [3, 22–24].
As a consequence, it is difficult to make a good assessment of the generalisability of the accuracy of tests, either in isolation or in the context of other tests.
In contrast with conventional meta-analysis of test accuracy studies, individual patient data (IPD) meta-analysis has the potential to establish the value of test combinations. First, in IPD meta-analysis test results can be analysed as continuous measurements rather than in the dichotomous classification that is generally used in reports of diagnostic and prognostic tests. Using the original continuous data instead of the reported dichotomised results makes it possible to detect a (gradual) relation between test result and disease, and to estimate test accuracy at different cut-off values. Second, the additional information provided by diagnostic tests can be examined in light of the diagnostic information already known from history and clinical examination, and from less expensive or less invasive tests [16, 22, 25–28].
Third, assumptions about the invariance of test accuracy across a range of disease prevalences (prior probabilities) can be tested. Finally, associations among patient-level characteristics, or between patient-level and study-level characteristics (setting, study design), can be assessed without the ecological fallacy problem.
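As a rough illustration of the first of these advantages, the sketch below computes sensitivity and specificity at several cut-off values from pooled patient-level records. The records, study labels, test values and the function name `accuracy_at_cutoff` are all hypothetical and invented for this example, and the convention that higher test values indicate disease is an assumption. It is not the analysis planned in this program; in particular, it ignores the clustering of patients within studies, which a real IPD meta-analysis would need to model.

```python
from typing import List, Tuple

# Hypothetical pooled IPD: (study label, continuous test result, disease present).
# All values are invented for illustration only.
Record = Tuple[str, float, bool]

IPD: List[Record] = [
    ("study_A", 2.0, False),
    ("study_A", 3.5, False),
    ("study_A", 6.0, True),
    ("study_A", 8.5, True),
    ("study_B", 3.0, False),
    ("study_B", 4.5, True),
    ("study_B", 5.5, False),
    ("study_B", 9.0, True),
]


def accuracy_at_cutoff(records: List[Record], cutoff: float) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when results >= cutoff count as positive.

    Assumes higher test values point towards disease (an assumption of this sketch).
    """
    tp = fn = tn = fp = 0
    for _study, value, diseased in records:
        positive = value >= cutoff
        if diseased:
            if positive:
                tp += 1
            else:
                fn += 1
        else:
            if positive:
                fp += 1
            else:
                tn += 1
    # Guard against empty groups so the function never divides by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


if __name__ == "__main__":
    # With continuous data, accuracy can be recomputed at any cut-off of interest.
    for cutoff in (4.0, 5.0, 6.0, 7.0):
        sens, spec = accuracy_at_cutoff(IPD, cutoff)
        print(f"cut-off {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

With only the dichotomised results reported by each study, accuracy would be fixed at whichever cut-off that study happened to choose; with the continuous values, the same records can be re-evaluated at any threshold.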
To our knowledge, no IPD meta-analyses of diagnostic or prognostic research have been conducted so far. In this paper we outline a research program to systematically evaluate the potential benefits of IPD meta-analyses in the evaluation of diagnostic tests. To this end, we selected four clinical problems from gynaecology, obstetrics and reproductive medicine that will be used as clinical cases for this methodological project:
1. Diagnosis of endometrial cancer in women with postmenopausal bleeding (PMB)
2. Prediction of preterm birth
3. Diagnosis of tubal pathology in subfertile women
4. Assessment of ovarian response in women undergoing in vitro fertilisation (IVF)
The objectives and research methods are outlined below, and the practical, methodological and clinical issues that we expect to encounter are discussed.