Our results demonstrate that the distribution of citations differs between types of article and between journals within journal subject areas. It appears that the journals with the highest impact factors generally had a lower proportion of articles which were never cited. This finding was reproduced in two subject fields which share only two journals (Transplantation and Transplantation Proceedings).
The quality of a journal is difficult to assess objectively and perhaps impossible to define numerically. As such, any method is likely to be open to criticism. Quality measures are useful to publishers, advertisers, librarians, editors and authors alike. Editors are often stirred to put pen to paper when a journal's impact factor rises[7, 8], whilst authors may use impact factors to decide where to submit scientific research. Whether the increasing ease of access to the abstracts and full text of articles via the internet and electronic publications, particularly open access, will change the importance attached to such measures is unknown. Furthermore, simple citation measures may more accurately reflect the usefulness of an article to another author's work than its quality. It has previously been recognised that citation rates and impact factors vary with the field of a journal, with basic science journals having higher impact factors than clinical medicine journals. This is supported by the observation in this study that impact factors were higher, and levels of non-citation lower, in the immunology literature than in the surgical literature. There may also be differences in the relevance of citation counting between clinical and scientific research. It is reasonable to hypothesise that pure clinicians may read articles and journals which influence their clinical practice but never cite this work themselves. Some important clinical papers which guide clinical practice may gain a high number of citations; for example, the North American Symptomatic Carotid Endarterectomy Trial has gained over 500 citations and the UK Small Aneurysm Trial over 170. Yet it is easy to find papers which probably do not influence clinical practice as widely, published at similar times in the same journals, which gather large numbers of citations. For example, a clinical report of 'Buffalo Hump' in males with HIV infection has accumulated over 280 citations.
However, it is not known whether research which changes clinical practice is cited more frequently than research which does not. Citation practices of authors may also be influenced by factors other than quality, including language of publication and personal choice. Across the literature, authors are more likely to cite longer articles and review articles.
Despite these limitations, citation counts provide a convenient and objective method of ranking articles and journals. It is therefore important to use the most appropriate and transparent way of communicating this information, particularly if such rankings are used to define quality.
Criticism of the impact factor itself has grown as its influence has increased. Articles such as editorials, letters and news items are classified as "non-source" items and as such do not count towards the total number of articles used to calculate the impact factor. However, such items may attract numerous citations which are counted towards a journal's impact factor. Journals may therefore increase the number of non-source items to artificially increase impact factors. It is also suggested that the calculation provides a method for comparing journals regardless of their size. However, journal size may be a confounding factor: journals publishing more articles tend to have higher impact factors per se. Small journals may be disadvantaged by this bias. Most importantly, the impact factor does not communicate any information about the citation distribution to the reader.
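The effect of non-source items on the calculation can be illustrated with a minimal sketch. It assumes the usual two-year form of the impact factor (citations received in one year to items published in the previous two years, divided by the number of source items published in those two years). The function name and all figures are hypothetical and are not taken from the data in this study.

```python
def impact_factor(cites_to_source, cites_to_non_source, n_source_items):
    """Citations to ALL items divided by the number of SOURCE items only.

    Non-source items (editorials, letters, news) add to the numerator
    but never to the denominator.
    """
    return (cites_to_source + cites_to_non_source) / n_source_items


# Hypothetical journal: 200 source articles attracting 400 citations.
baseline = impact_factor(400, 0, 200)

# Same journal, with editorials and letters attracting 100 further citations.
with_non_source = impact_factor(400, 100, 200)

print(baseline, with_non_source)
```

In this invented example the source articles are unchanged, yet the impact factor rises solely because citations to non-source items enter the numerator.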
Which, then, is the more appropriate method of measuring quality: levels of non-citation or impact factor? Clearly this depends on the definition of quality. One can define a quality journal, in terms of non-citation, as one which maximises the amount of useful, interesting and original information per issue. What is the definition of quality as measured by impact factor? When the calculation is studied, it is clear that the impact factor represents a mean number of citations; yet we have demonstrated that the distribution of citations to articles within the vast majority of journals is non-parametric. Statistically, at the very least, the impact factor should represent the median number of citations to articles and not the mean. It is of great concern that the tool which is accepted as a measure of journal quality contains the type of fundamental statistical error which would make most editors and peer reviewers recoil.
The non-parametric distribution of citations to articles lies at the heart of the problem with impact factors. A journal which contains a handful of very useful articles alongside a large number of articles which are never subsequently cited may have the same impact factor as one with a small number of citations spread evenly across most of its articles. It was observed that journals with high impact factors do tend to have a broader distribution of citations amongst articles (although rarely Gaussian) and lower levels of non-citation. It would be easy to conclude from this that the impact factor therefore also reflects non-citation and that it is unnecessary to consider different methods of ranking journals. However, using a non-citation rate as a measure of the quality of a journal does have advantages over the impact factor, particularly for contributors and institutions. Firstly, the definition of quality is explicit and logical. Most importantly, however, it creates a clear distinction between how citation analysis is used to measure the quality of a journal (low level of non-citation) and of an individual piece of work (citation counting). This will hopefully remove the temptation to use a journal's ranking to judge individual articles. Articles are, of course, best assessed by reading them, but they may be evaluated by counting citations. Although this is a less than ideal way of measuring quality, it may be preferable to the current method of assuming that an article is good because it is published in a journal which attracts many citations, even though these citations are unevenly dispersed.
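The comparison above, of two journals with the same impact factor but very different citation distributions, can be made concrete with a short sketch. The two journals and their citation counts are invented for illustration only, and the helper name `summarise` is an assumption rather than a method used in this study.

```python
from statistics import mean, median


def summarise(citations):
    """Return the mean, median and non-citation rate for a list of
    per-article citation counts."""
    uncited = sum(1 for c in citations if c == 0)
    return {
        "mean": mean(citations),
        "median": median(citations),
        "non_citation_rate": uncited / len(citations),
    }


# Hypothetical journal A: three highly cited articles, 17 never cited.
journal_a = [40, 30, 10] + [0] * 17

# Hypothetical journal B: 20 articles, each cited four times.
journal_b = [4] * 20

summary_a = summarise(journal_a)
summary_b = summarise(journal_b)
print(summary_a)
print(summary_b)
```

Both invented journals have the same mean number of citations per article, so a mean-based measure cannot separate them, whereas the median and the non-citation rate differ markedly between the two.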
In this study we have defined our measure of journal quality in terms of non-citation by evaluating the non-citation of one year's literature (2001) from publication to the present day, owing to the practicalities of data retrieval. Should publication of non-citation rates be embraced, this information could be presented in a number of ways. Firstly, the level of non-citation within the current year of the previous two years' articles could be presented alongside the impact factor. However, this does not overcome the problem of temporal bias produced by reporting only the citation statistics relating to two recent years. The level of non-citation could therefore also be reported yearly, or even continuously, for every previous year of each journal. This would provide an index which takes into account every citation made to a journal rather than just those made in a short period of time following publication.
The impact factor does have some advantages over non-citation measures for a handful of journals. For those journals which have no uncited literature (2.2% of journals included in this report), the impact factor offers a further way of discriminating between journals. Interestingly, however, of the 5 journals with no uncited literature only one, Nature Immunology, was a primary research journal. Perhaps non-citation is most useful for ranking the primary research literature. As reviews and original articles attract different levels of citation, it may be most appropriate to use the level of non-citation of original articles as the measure of a journal's quality. This would also mean that the citation of non-source items and the publication of numerous reviews would not improve a journal's ranking, as they may do under the impact factor. This is an area for debate. Furthermore, given that non-citation and citation practices differ between individual subject fields, the measure of non-citation is probably no more valid than the impact factor for comparing journals between fields. However, within individual subject fields, non-citation provides a more logical and explicit measure of a journal's quality than the impact factor.