A survey of statistics in three UK general practice journals
BMC Medical Research Methodology volume 4, Article number: 28 (2004)
Many medical specialities have reviewed the statistical content of their journals. To our knowledge this has not been done in general practice. Given the main role of a general practitioner as a diagnostician we thought it would be of interest to see whether the statistical methods reported reflect the diagnostic process.
Hand search of three UK journals of general practice namely the British Medical Journal (general practice section), British Journal of General Practice and Family Practice over a one-year period (1 January to 31 December 2000).
A wide variety of statistical techniques were used. The most common methods included t-tests and Chi-squared tests. There were few articles reporting likelihood ratios and other useful diagnostic methods. There was evidence that the journals with the more thorough statistical review process reported a more complex and wider variety of statistical techniques.
The BMJ had a wider range and greater diversity of statistical methods than the other two journals. However, in all three journals there was a dearth of papers reflecting the diagnostic process. Across all three journals there were relatively few papers describing randomised controlled trials thus recognising the difficulty of implementing this design in general practice.
"Diagnosis is the keystone of good medical practice"
General practitioners (GPs) are primarily diagnosticians, yet it appears that diagnosis remains their Achilles heel. The problem has its origins in a misunderstanding of how the five Ps (patients, pathologies, presentations, prevalences and predictive values) differ between hospital practice and primary care. Decisions made by GPs are different from those made by hospital clinicians. The precise diagnostic labels may be less important than deciding on an appropriate course of action. Hence, diagnoses are often framed in terms of binary decisions: treatment versus non-treatment, disease versus non-disease, referral versus non-referral, and serious versus non-serious, for example.
From a statistical viewpoint the binary decision making process has a lot of appeal. For example, the use of the naïve Bayes' discriminant function (and from it the derivation of likelihood ratios) is appropriate. Proponents of Bayes' argue for its simplicity and ease of interpretation[5, 6]. In contrast, opponents argue that data are not used efficiently if they are simply ploughed through the "black box" of Bayes'[7, 8]. Whatever the rights and wrongs of Bayes' as a technique, it is time for GPs to become more familiar with statistical methods aimed at diagnosis. In relation to haematuria (blood in the urine) and the diagnosis of urological malignancy, two of the authors of this paper (NS and ASR) have used Bayesian techniques to seek to refine diagnostic discrimination by general practitioners. The results from this work have been incorporated successfully into local primary care oriented referral guidance.
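To make the link between likelihood ratios and the naïve Bayes approach concrete, the following is a minimal sketch rather than the method used in the haematuria study. The function names, the sensitivity and specificity values and the 5% pre-test probability are all assumptions chosen for illustration.

```python
from math import prod
from typing import Sequence, Tuple


def likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Return (LR+, LR-) for a binary clinical finding."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


def post_test_probability(pre_test_prob: float, ratios: Sequence[float]) -> float:
    """Update a pre-test probability using one likelihood ratio per finding.

    Multiplying the ratios together is the naive Bayes step: it assumes the
    findings are conditionally independent given disease status.
    """
    pre_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_odds = pre_odds * prod(ratios)
    return post_odds / (1.0 + post_odds)


# Hypothetical figures, for illustration only (not taken from the survey).
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.80, specificity=0.90)
p_one_finding = post_test_probability(0.05, [lr_pos])
p_two_findings = post_test_probability(0.05, [lr_pos, lr_pos])
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
print(f"Post-test probability: {p_one_finding:.3f} (one finding), "
      f"{p_two_findings:.3f} (two findings)")
```

The independence assumption is exactly the point on which the opponents of Bayes' cited above focus: if two findings are correlated, multiplying their likelihood ratios will overstate the evidence.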
Many medical journals, both generalist[10, 11] and specialist[12–18], have been reviewed for their statistical content. Articles have been published in the fields of radiology[12–14], otolaryngology[15, 16], rehabilitation medicine and ophthalmology, to name but a few. However, general practice is under-researched in this area. The aim of this paper is to review three leading UK journals in general practice and to see what statistical methods are being used. It is not our intention to see if the methods are being used correctly but to look at the range of techniques reported. The outcome of this research should give pointers to the future education of GPs who wish to undertake research.
Three statisticians (MJC, ASR and GKA), comprising one Professor, one Senior Lecturer and one Lecturer, two of whom hold Chartered status of the Royal Statistical Society, each reviewed one leading UK journal in general practice. The fourth author (NS) is a Primary Care Physician. The journals chosen were the British Medical Journal (BMJ) (general practice section), British Journal of General Practice (BJGP) and Family Practice. These three journals were chosen because they reflected the main primary care journals in the UK. The journals were hand searched for original research articles over a one-year period (1 January to 31 December 2000). Articles were classified for both their statistical content and methods of design according to criteria laid down elsewhere[10, 20]. Tables 1 and 2 list the classification criteria used for both study design and statistical methods. Letters were excluded on the grounds that they are typically responses to previously published material rather than original contributions in themselves. We are aware, of course, that not all primary care research is published in these three journals alone and we comment on this later.
The main study was preceded by a pilot phase in which a random sample of 10 articles was classified both by statistical content and study design by the three statisticians. Where there were differences of opinion, consensus was reached by discussion. We met once to discuss our classification system, and to iron out differences of opinion. One problem lay in how we actually classified study design. For example, one of us used the phrase 'cross-sectional survey' while another used the phrase 'questionnaire survey' when both meant the same in terms of study design. Another problem was that we missed some of the statistical techniques (where there were many) and this required much more careful reading of the articles when we carried out the main survey. We did not carry out a formal reliability study of the pilot phase but instead relied on our experience both as statisticians and as journal reviewers. Similarly, we chose not to carry out a formal reliability analysis in the main study.
The total number of articles reviewed over a one year period was as follows: BMJ (general practice section) (n = 79), BJGP (n = 145) and Family Practice (n = 81).
The most common design was the cross-sectional survey, found in 24.1%, 39.3% and 35.1% of articles in the BMJ, BJGP and Family Practice respectively (Table 3). Although we classified articles by the term 'cross-sectional survey', this was not necessarily the term adopted by the journal. Sometimes the phrase 'questionnaire survey' was used and we assumed that the data had been collected cross-sectionally. We found a similar difference in nomenclature for our phrase 'cohort study', for which the phrase 'prospective survey' was also found. The highest proportion of qualitative studies was in Family Practice (21.0% compared to an average of 11.8%). Qualitative studies included those encompassing terms such as 'focus groups' and 'semi-structured interviews'. Figure 1 shows the proportion of papers ranked by a qualitative design. In all three journals, diagnostic study designs were used infrequently. Examples include those based on screening (e.g., the usefulness of N-terminal brain natriuretic peptide level for screening of patients with heart failure), and those calculating the sensitivity and specificity of diagnostic tests (e.g., Helicobacter pylori for the detection of peptic ulcer). Examples of more unusual study designs include those based on video recordings, literature reviews and quasi-experimental designs.
The range of statistical methods reported can be seen in Table 4. The number of methods exceeds the number of articles as some articles reported more than one technique. There are differences between the journals. The BMJ shows a greater range and breadth of methods than Family Practice. More sophisticated techniques are reported more often in the BMJ than in either of the other two journals. In the BMJ, the two most common statistical methods used were logistic regression (n = 14, 17.7%) and the Chi-squared test (n = 13, 16.5%). The two least common were the Mantel-Haenszel statistic (n = 1, 1.3%) and Cronbach's alpha (n = 1, 1.3%). Relatively new innovations such as random effects models were seen in both the BMJ and the BJGP. The least sophisticated statistical methods appeared in Family Practice. Methods based on likelihood ratios were seldom found in either the BMJ or BJGP and not at all in Family Practice. Nonparametric tests were often unspecified, but where they were specified they included the Mann-Whitney U test, Spearman's correlation coefficient and the Wilcoxon matched-pairs signed ranks test. Multiple comparisons included Bonferroni techniques and Scheffé's contrasts. Survival analysis included Kaplan-Meier curves and Cox regression.
One-third of all articles reported no statistics or simple summaries (for example, mean, median, percentage, standard deviation, interquartile range). No journal article with a qualitative design had any statistical content.
A large number of articles reported other statistical methods, in particular the BJGP. This was due to a wide range of statistical methods being reported only once. Examples include time series, multilevel modelling and factor analysis. In others, we could not decipher which statistical techniques had been used.
Table 5 shows the rank order of the statistical methods by each journal. Differences between the journals can be seen more clearly.
Two-thirds of all journal articles relied on some type of statistical analysis beyond descriptive statistics (Table 4). The Chi-squared test and t-tests were commonly used in the BJGP and Family Practice. Papers in the BMJ and the BJGP used more sophisticated statistical methods than Family Practice (Table 4). While both the BMJ and the BJGP used sophisticated methods, the BMJ used them more often. Why might this be so? The sophistication of methods used is influenced by three factors. First, a journal can issue statistical instructions to authors; this requires a bank of statisticians available for review, to which the BMJ has access. Second, it can publish general articles on statistical aspects of writing papers. Third, it can publish tutorial-type articles explaining specific techniques. The BMJ continues to take a lead in the latter two areas and indeed published statistical guidelines for contributions to medical journals over 20 years ago. Despite the lack of sophistication in Family Practice, there has been a trend towards using more advanced statistics elsewhere[14, 15, 17, 20], and this has been linked to the increasing availability of computer packages. The BJGP is currently struggling to find statistical reviewers (personal communication by Editor to ASR). It is perhaps too easy for us to lay the blame for this lack of sophistication at the Editors' door. Statisticians are relatively rare, and review, for the most part, is unpaid.
Although these three journals publish a large proportion of the research in general practice within the UK, they by no means represent 100% of it. To look at this further we examined the year 2000 and undertook a MEDLINE search using the key indexing phrase 'General Practice'. We found over 800 articles in a diversity of journals. Articles were published in the fields of rheumatology, medical ethics, obstetrics, public health, clinical pharmacology, clinical neurology and telemedicine to name but a few.
We chose to look at the year 2000. Would our results be different had we selected a different year? The published literature suggests otherwise. In a 20-year-old study, Emerson and Colditz found that t-tests (44%) and Chi-squared tests (27%) were the most common statistical methods reported, although now Chi-squared tests are more common than t-tests (Table 5). Given the emphasis on statistical computing today we might have expected less reliance on these two methods. What lies behind this lack of progress? Altman and Goodman looked at the speed of the transfer of technology of new statistical methods into the medical literature. They concluded that many methodological innovations of the 1980s had still not made their way into the medical literature of the 1990s, suggesting a typical lag-time of 4–6 years. Lag-time is likely to be related to the quality of statistical review, and it may be longer in journals with less impact. It is also worth reporting that since we carried out this survey (year 2000) there will have been a modest increase in the use of newer, more sophisticated statistical techniques.
Now let us turn to study design. The gold standard research design is considered to be the randomised controlled trial (RCT). It has been acknowledged that carrying out RCTs in general practice is difficult[23, 24]. In our survey we found few RCTs (Table 3). There are particular problems of recruitment with respect to primary care. Many issues have been discussed. For example, most practices have no formal contractual arrangement to participate in research and may be unwilling to participate unless there is immediate benefit to their patients. It is known that motivating practices for long-term follow-up studies in particular is not easy. Practices may feel uncomfortable about randomising their patients, but delegation of this duty to another may lead to a breakdown of the special doctor/patient relationship. There are statistical and sample size concerns also. Randomisation by practice (so-called cluster randomisation) leads to larger sample sizes being required[27, 28].
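The sample size penalty of cluster randomisation is usually expressed through the design effect, 1 + (m - 1) × ICC, where m is the average number of patients per practice and ICC is the intracluster correlation coefficient. The sketch below applies that standard formula; the function names and the input values are assumptions for illustration and do not come from the survey.

```python
from math import ceil


def design_effect(mean_cluster_size: float, icc: float) -> float:
    """Inflation factor for cluster randomisation: 1 + (m - 1) * ICC."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


def inflated_sample_size(n_individual: int, mean_cluster_size: float, icc: float) -> int:
    """Total participants needed once clustering by practice is allowed for.

    n_individual is the size an individually randomised trial would need.
    Rounding to 9 decimals guards against floating-point noise before the
    result is rounded up to a whole participant.
    """
    inflated = n_individual * design_effect(mean_cluster_size, icc)
    return ceil(round(inflated, 9))


# Hypothetical inputs: 21 patients per practice, ICC of 0.05, and 300
# participants required under individual randomisation.
deff = design_effect(mean_cluster_size=21, icc=0.05)
n_total = inflated_sample_size(n_individual=300, mean_cluster_size=21, icc=0.05)
print(f"Design effect = {deff:.2f}; participants required = {n_total}")
```

Even a small intracluster correlation can double the required sample size when practices contribute many patients each, which is one reason such trials are hard to mount in primary care.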
What are the issues here? Are they really that different from secondary care? A recent publication posed the question 'What do residents really need to know about statistics?'. The authors surveyed six journals and catalogued them for their statistical and methodological content. The most popular statistical tests across the whole range of journals were the Chi-squared test followed by the t-test. The authors concluded that with knowledge of each of these two tests clinicians should be able to interpret up to 70% of the medical literature.
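Since the Chi-squared test recurs throughout these surveys, a short sketch of the calculation for a 2 × 2 table may be helpful. It implements the usual Pearson statistic without continuity correction, using only the Python standard library; the counts are hypothetical and the function name is an assumption.

```python
from math import erfc, sqrt
from typing import Tuple


def chi_squared_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson Chi-squared test (no continuity correction) for the table

        [[a, b],
         [c, d]]

    Returns (statistic, p_value) on 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts: rows are two patient groups, columns are
# outcome present / outcome absent.
stat, p = chi_squared_2x2(30, 70, 15, 85)
print(f"Chi-squared = {stat:.2f}, p = {p:.4f}")
```

In practice a statistical package would normally be used, and a continuity correction or Fisher's exact test may be preferable when expected counts are small.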
For all three journals there was a dearth of articles reflecting the diagnostic process. Why is this? It has already been said that diagnosis is the Achilles heel of GPs. If it is not to remain this way we must start to educate doctors. The question is how. The latest "Tomorrow's Doctors" states that students must have "Adequate knowledge of the sciences on which medicine is based and a good understanding of the scientific methods including principles of measuring biological functions, the evaluation of scientifically established facts and the analysis of data". Clearly, there is a role for teaching statistics in the education of doctors who wish to undertake research. The much greater prevalence of methods concerning binary data (Chi-squared test, logistic regression, odds ratios/relative risks) over methods concerned with continuous data should be reflected in our (statistical) teaching. Initial training in means, medians and modes should be replaced by relative risk, absolute risk and numbers needed to treat.
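As an illustration of the binary-data summaries recommended above, the following sketch computes the relative risk, absolute risk reduction and number needed to treat from a two-arm comparison. The event counts are hypothetical and the function name is an assumption; the definitions themselves are the standard ones.

```python
from math import ceil
from typing import Dict, Optional


def risk_summary(events_treated: int, n_treated: int,
                 events_control: int, n_control: int) -> Dict[str, Optional[float]]:
    """Summarise a binary outcome in two groups."""
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    arr = risk_control - risk_treated  # absolute risk reduction
    # NNT is conventionally rounded up; it is only defined here when the
    # treatment reduces risk. Rounding to 9 decimals first guards against
    # floating-point noise.
    nnt = ceil(round(1.0 / arr, 9)) if arr > 0 else None
    return {
        "relative_risk": risk_treated / risk_control,
        "absolute_risk_reduction": arr,
        "number_needed_to_treat": nnt,
    }


# Hypothetical trial: 10/100 events on treatment versus 20/100 on control.
summary = risk_summary(10, 100, 20, 100)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The same relative risk can correspond to very different numbers needed to treat depending on the baseline risk, which is why the absolute measures deserve a place in teaching alongside the relative ones.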
Anonymous: Diagnosis: logic and pseudo-logic. The Lancet. 1987, 1: 840-841.
Morrell DC: Diagnosis in General Practice. Art or Science?. 1993, London: Nuffield Provincial Hospitals Trust
Howie JGR: Diagnosis – the Achilles Heel?. Journal of the Royal College of General Practitioners. 1972, 22: 310-315.
Summerton N: Diagnosing Cancer in Primary Care. 1999, Oxford: Radcliffe Medical Press
Hilden J: Statistical diagnosis based on conditional independence does not require it. Computers in Biology and Medicine. 1984, 14: 429-435. 10.1016/0010-4825(84)90043-X.
Crichton NJ, Fryer JC, Spicer CC: Some points on the use of 'Independent Bayes' to diagnose acute abdominal pain. Statistics in Medicine. 1987, 6: 945-959.
Feinstein AR: The haze of Bayes, the aerial palaces of decision making and the computerised Ouija board. Clinical Pharmacology and Therapeutics. 1977, 21: 482-496.
Morton BA, Teather D, du Boulay GH: Statistical modelling and diagnostic aids. Medical Decision Making. 1984, 4: 339-348.
Summerton N, Mann S, Rigby AS, Ashley J, Palmer S, Hetherington JW: Patients with new onset haematuria in relation to urological cancer attending an 'open access' clinic: assessing the discriminant value of items of clinical information in relation to urological malignancies. British Journal of General Practice. 2002, 52: 284-289.
Emerson JD, Colditz GA: Use of statistical analysis in The New England Journal of Medicine. New England Journal of Medicine. 1983, 309: 709-713.
Ripoll RM, Terren CA, Vilalta JS: The current use of statistics in biomedical investigation: A comparison of general medicine journals. Medicina Clinica. 1996, 106: 451-456.
Elster AD: Use of statistical analysis in the AJR and Radiology- Frequency, methods and subspeciality differences. American Journal of Roentgenology. 1994, 163: 711-715.
Goldin J, Zhu W, Sayre JW: A review of statistical analysis used in papers published in Clinical Radiology and British Journal of Radiology. Clinical Radiology. 1996, 51: 47-50.
Golder W: Statistical analyses in German radiological periodicals: The last decade's development. RöFo - Fortschritte auf dem Gebiet der Röntgenstrahlen und der bildgebenden Verfahren. 1999, 171: 232-239. 10.1055/s-1999-246.
Rosenfeld RM, Rockette HE: Biostatistics in otolaryngology journals. Archives of Otolaryngology, head and neck surgery. 1991, 117: 1172-1176.
Bhattacharyya N: Peer review: Studying the major otolaryngology journals. Laryngoscope. 1999, 104: 640-644. 10.1097/00005537-199904000-00023.
Schwartz SJ, Sturr M, Goldberg G: Statistical methods in rehabilitation literature: A survey of recent publications. Archives of Physical Medicine and Rehabilitation. 1996, 77: 497-500. 10.1016/S0003-9993(96)90040-4.
Juzych MS, Shin DH, Seyedsadr M, Siegner SW, Juzych LA: Statistical techniques in ophthalmic journals. Archives of Ophthalmology. 1992, 110: 1225-1229.
Thomas T, Fahey T, Somerset M: The content and methodology of research papers published in three United Kingdom primary care journals. British Journal of General Practice. 1998, 48: 1229-1232.
Wang Q, Zhang BH: Research design and statistical methods in Chinese medical journals. Journal of the American Medical Association. 1998, 280: 283-285. 10.1001/jama.280.3.283.
Altman DG, Gore SM, Gardner MJ, Pocock SJ: Statistical guidelines for contributions to medical journals. British Medical Journal. 1983, 286: 1489-1493.
Altman DG, Goodman SN: Transfer technology from statistical journals to the biomedical literature – Past trends and future predictions. Journal of the American Medical Association. 1994, 272: 129-132. 10.1001/jama.272.2.129.
Pringle M, Churchill R: Randomised controlled trials in general practice. Gold standard or fool's gold ?. British Medical Journal. 1995, 311: 1382-1383.
Sheikh S, Smeeth L, Ascroft R: Randomised controlled trials in general practice: scope and application. British Journal of General Practice. 2002, 52: 746-751.
Tognoni G, Alii C, Avanzini F, Bettelli G, Colombo F, Corso R, et al: Randomised controlled trials in general practice: lessons from failure. British Medical Journal. 1991, 303: 969-971.
King M, Broster G, Lloyd M, Horder J: Controlled trials in the evaluation of counselling in general practice. British Journal of General Practice. 1994, 44: 229-232.
Donner A, Brown KS, Brasher P: A methodological review of non-therapeutic intervention trials employing cluster randomisation, 1979–1989. International Journal of Epidemiology. 1990, 19: 795-800.
Campbell MJ: Cluster randomised controlled trials in general (family) practice. Statistical Methods in Medical Research. 2000, 9: 81-94. 10.1191/096228000676246354.
Reed III, Salen P, Bagher P: Methodological and statistical techniques: what do residents really need to know about statistics?. Journal of Medical Systems. 2003, 27: 233-238. 10.1023/A:1022519227039.
General Medical Council: Tomorrow's doctors. Recommendations on undergraduate medical education. 2002, [http://www.gmc-uk.org/med_ed/tomdoc.htm]
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2288/4/28/prepub
We wish to thank the referees for their constructive comments.
The author(s) declare that they have no competing interests.
Three authors (ASR, GKA and MJC) carried out the literature review while all four authors contributed to the writing.