The search strategies developed performed well in terms of sensitivity, with the expected trade-off against precision and vice versa. Depending on the objective of the search, each of the three strategies is suitable for specific purposes, such as the use of sensitive strategies in systematic reviews or e-mail alerts.
All strategies were assessed against three different test sets. For the measurement of sensitivity, the development and journal screening sets produced similar results, while the results of the precision set showed greater differences. We assume these differences were caused by the insufficient sample size of the precision set. Although 2,195 references may not appear to be a small sample, given a prevalence of 0.0027% of relevant references in the PubMed/Medline population, the sample is still small. Precision ranged from 0.2% to 6.1% in the development set, 0.3% to 14.7% in the precision set, and 8.1% to 32.0% in the journal screening set. Although the ranges varied considerably between the test sets, the overall pattern in the comparison of the search strategies remained consistent: the precise strategies achieved higher precision than the balanced and sensitive strategies. Conceptually, the precision set was the closest to the true population. Even with 2,195 screened references, this set lacked the accuracy to differentiate between strategies in terms of sensitivity. However, precision derived from this set was not hampered by the small sample size and produced less biased estimates for the PubMed/Medline population than the development and journal screening sets.
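The two performance measures, and the reason a sample of 2,195 references is small at this prevalence, can be illustrated with a minimal Python sketch. The function names are hypothetical, and the calculation assumes a simple random sample from the PubMed/Medline population, which is an illustrative assumption rather than a description of how the precision set was drawn.

```python
def sensitivity(relevant_retrieved, relevant_total):
    """Share of all relevant references that the strategy retrieves."""
    return relevant_retrieved / relevant_total


def precision(relevant_retrieved, retrieved_total):
    """Share of retrieved references that are relevant."""
    return relevant_retrieved / retrieved_total


def expected_relevant(sample_size, prevalence):
    """Expected number of relevant references in a simple random sample.

    Assumption: every reference in the population is equally likely
    to be drawn.
    """
    return sample_size * prevalence


# Figures quoted in the text: prevalence of 0.0027% and 2,195 references.
PREVALENCE = 0.0027 / 100
SAMPLE_SIZE = 2195

expected = expected_relevant(SAMPLE_SIZE, PREVALENCE)
print(f"Expected relevant references in the sample: {expected:.3f}")
```

Under this assumption the expected count is well below one relevant reference, which shows why a sample of this size cannot support reliable sensitivity comparisons between strategies.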
The comparison of PubMed's HSR Queries with the newly developed strategies favours the latter. This favourable assessment could be expected, given the broader scope of the HSR Queries. The performance comparison is therefore perhaps better regarded as a validation of the developed strategies and of the test sets used.
One of the strengths of the population set is that it makes it possible to extrapolate to the overall PubMed/Medline population and to calculate the expected number of relevant references on the topic of interest. However, it should be taken into account that this estimate is based on the Medline references of PubMed/Medline and therefore ignores a small percentage of non-Medline references. Although limited by wide confidence intervals, the estimate allows an inference about the overall number of relevant papers, which the development and journal screening sets alone do not permit.
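The extrapolation from a sample to the whole database, and the reason the confidence intervals are wide, can be sketched as follows. The text does not state how the intervals were computed; the Wilson score interval is used here purely as an assumption, and all counts and the population size are hypothetical placeholders, not study data.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def expected_relevant_range(successes, n, population_size):
    """Scale the sample proportion and its interval to the population."""
    low, high = wilson_interval(successes, n)
    point = successes / n
    return (low * population_size,
            point * population_size,
            high * population_size)


# Hypothetical inputs for illustration only.
low, point, high = expected_relevant_range(
    successes=3, n=100_000, population_size=20_000_000
)
print(f"Estimated relevant references: {point:.0f} "
      f"(interval {low:.0f} to {high:.0f})")
```

With only a handful of relevant references in the sample, the upper bound is several times the lower bound, which mirrors the wide intervals noted above.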
In addition to the aims outlined, the study employed a development process for search strategies with some features that might be useful for search strategy development in general. While the identification of candidate terms and the testing of the strategy against a development set have previously been done in research on methods filters, in our opinion the population and precision sets employed are unique features of this study. These sets allow search strategy developers (1) to select terms that are not only frequently used in relevant publications but also specific to the topic of interest, and (2) to achieve more realistic precision estimates for the PubMed/Medline population. The frequency of a term should not be the sole criterion for including it in a search strategy. For example, the term "patients" is present in 65% of the references in the development set, but also in 77% of the references in the population set, indicating a lack of specificity for the topic of interest. The population set enables the developer to identify such nonspecific terms and to preselect topic-specific terms in order to develop sensitive and precise searches.
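The term selection principle described above can be sketched by comparing a term's frequency in the development set with its frequency in the population set. The ratio used here is an illustrative assumption, not the selection rule applied in the study, and the second term with its frequencies is hypothetical; only the figures for "patients" come from the text.

```python
def specificity_ratio(freq_development, freq_population):
    """Ratio of term frequency in relevant references to the background.

    Values near or below 1 suggest the term is not specific to the topic.
    """
    if freq_population == 0:
        return float("inf")
    return freq_development / freq_population


# (frequency in development set, frequency in population set)
terms = {
    "patients": (0.65, 0.77),           # figures quoted in the text
    "hypothetical term": (0.40, 0.02),  # invented for illustration
}

# Rank candidate terms from most to least topic-specific.
ranked = sorted(
    terms.items(),
    key=lambda item: specificity_ratio(*item[1]),
    reverse=True,
)

for term, (dev, pop) in ranked:
    print(f"{term}: ratio {specificity_ratio(dev, pop):.2f}")
```

A frequent term such as "patients" ends up with a ratio below 1 and would be a poor candidate despite appearing in most relevant references.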
Although the development process described could support the development of performance-oriented search strategies in general, some limitations apply to this study and to the generalizability of the process.
We assumed that the selected systematic reviews used for building the development set are the most comprehensive reviews in the topic area. However, we cannot rule out that other reviews containing additional relevant references exist.
The search filters developed require references to be fully indexed in Medline and might not fully capture in-process citations; this applies to many search filters and also limits the use of the search strategies as e-mail update filters.
An untested search strategy without MeSH terms is provided in Additional file 1 (Table S3).