Development and validation of MIX: comprehensive free software for meta-analysis of causal research data
BMC Medical Research Methodology volume 6, Article number: 50 (2006)
Meta-analysis has become a well-known method for synthesis of quantitative data from previously conducted research in applied health sciences. So far, meta-analysis has been particularly useful in evaluating and comparing therapies and in assessing causes of disease. Consequently, the number of software packages that can perform meta-analysis has increased over the years. Unfortunately, it can take a substantial amount of time to get acquainted with some of these programs and most contain little or no interactive educational material. We set out to create and validate an easy-to-use and comprehensive meta-analysis package that would be simple enough programming-wise to remain available as a free download. We specifically aimed at students and researchers who are new to meta-analysis, with important parts of the development oriented towards creating internal interactive tutoring tools and designing features that would facilitate usage of the software as a companion to existing books on meta-analysis.
We took an unconventional approach and created a program that uses Excel as a calculation and programming platform. The main programming language was Visual Basic, as implemented in Visual Basic 6 and Visual Basic for Applications in Excel 2000 and higher. The development took approximately two years and resulted in the 'MIX' program, which can be downloaded from the program's website free of charge. Next, we set out to validate the MIX output with two major software packages as reference standards, namely STATA (metan, metabias, and metatrim) and Comprehensive Meta-Analysis Version 2. Eight meta-analyses that had been published in major journals were used as data sources. All numerical and graphical results from analyses with MIX were identical to their counterparts in STATA and CMA. The MIX program distinguishes itself from most other programs by the extensive graphical output, the click-and-go (Excel) interface, and the educational features.
The MIX program is a valid tool for performing meta-analysis and may be particularly useful in educational environments. It can be downloaded free of charge via http://www.mix-for-meta-analysis.info or http://sourceforge.net/projects/meta-analysis.
The amount of data produced by researchers in health sciences has been growing explosively and advances in genetics, genomics, and information technology are likely to further contribute to this growth. In the past two decades, meta-analysis has evolved into the statistical method par excellence to make sense out of the growing number of research reports. As the quantitative analytical part of a systematic review, it has been used for evaluating data from both experimental and observational studies in therapeutic, diagnostic, prognostic, and etiologic settings. In the commonly used definition of the hierarchy of scientific data for medical decision making, meta-analyses are considered as providing the highest level of evidence [1, 2]. As such, they can have a major impact on medical practice and health care policies, especially if aggregating data and investigating sources of heterogeneity provide new insights. Two well-known examples are the meta-analyses by Yusuf et al. and Lau et al., both showing that meta-analysis can be a powerful tool to reveal intervention effects that would remain beneath the surface of single study data without proper synthesis and re-analysis.
Although meta-analysis can be applied to all types of medical research, its primary application so far has been in the therapeutic realm. One of the main forces behind the rise of therapeutic meta-analysis is the Cochrane Collaboration, whose effort to systematically assess and synthesize evidence from randomized controlled trials has so far produced more than 4400 Cochrane systematic reviews, many with quantitative meta-analyses. The increasing interest in meta-analysis in health sciences over the past twenty years has been reported by several authors [6–11], and a small search we did in preparation for this project reveals that between 1990 and 2005 approximately 12,000 publications were classified as a meta-analysis by PubMed. A bar graph of the annual numbers suggests that interest in meta-analysis is still increasing (figure 1).
Many general statistical software packages have included options for meta-analysis in their basic program configuration, and user-communities have written numerous meta-analysis add-ons. Specialized software packages, meant exclusively for meta-analysis, are also available in various types and price ranges. Although the number of software packages for performing meta-analysis is substantial, in our opinion, most share one common limitation: low applicability in educational settings or environments with beginning researchers. Even though numerous researchers in health care are nowadays confronted with data from published meta-analyses or are even requested to do a meta-analysis themselves, there is still little or no electronic educational material and none of the existing software has explicit educational features. Cost is another issue that may have an impact on the use of software by students and lecturers: only a few of the modern meta-analysis packages are free and if academic pricing is available, prices can still be rather high for many.
After reading previously published software reviews [12–15] and using existing meta-analysis software, we made an inventory of what we thought was lacking or could be improved. Next, we set out to implement our ideas and create an innovative and comprehensive statistical meta-analysis package that would be freely accessible and user-friendly enough for students and beginning researchers. The program, called MIX (Meta-analysis with Interactive eXplanations), has been developed over the past two years and has been presented at several stages of the development at a number of conferences [16–19]. In October 2005, the first public version (1.0) was released during the Cochrane Colloquium in Melbourne and became available for download via the MIX website. It has been receiving a lot of interest (100–150 unique visitors to the MIX website each week) and was downloaded over 1800 times within 6 months of its first release. This has prompted us to validate the results of all tests in the program formally, and this article provides the official introduction of the MIX program together with the results of the validation.
Our primary objective was to develop a free program for meta-analysis of causal research (therapeutic trials as well as etiologic cohorts and case-control studies) that could be applied in both analytical and educational settings. Our secondary aim was to validate the analytical tests in the program with output from established reference standards.
Before the actual development, we started with making an inventory of the most important meta-analytical tests and approaches, and brainstormed on ideas for an interface. Since causal meta-analysis methods are relatively well-established (in contrast to diagnostic or prognostic approaches to meta-analysis), we focused on meta-analysis of controlled trials and cohort or case-control studies. In these studies, outcome differences between exposed or treated and non-exposed or untreated groups are compared to assess a causal relationship between the determinant (treatment or exposure) and an outcome (mortality or morbidity). As far as the program structure was concerned, our a priori idea was to create an add-in for Excel. Although a rather unorthodox approach in this area (all existing meta-analysis programs are stand-alone programs and work independently of Microsoft Office), Excel provides a sophisticated calculation and graphics platform that is well-suited to many meta-analytical methods and is at the programmer's disposal before any programming is done. Consequently, development and maintenance are relatively easy and costs can be kept to a minimum (one of the main aims in our program development). Furthermore, the spreadsheet environment of Microsoft Excel is familiar to almost all researchers in medical, social, and economic sciences, which was very much in line with our attempt to develop a package that is fit for beginning researchers. Although we realized that even recent versions of Excel can be inaccurate with regard to some statistical calculations [21–23], we were confident that we could program around these difficulties if necessary.
Since we wanted to move beyond the occasional spreadsheet that can perform meta-analytical calculations, we started by designing a programming structure in which the already existing Excel functionality could be exploited to its maximum. Sophisticated procedures were custom-programmed with Visual Basic in the Visual Basic for Applications (VBA) editor of Excel 2003 (and tested in Excel 2000 and onward). The so-called front-loader (a start-up program initiated with an icon) and some small assistant programs, all being non-Excel entities, were developed with Visual Basic 6.0 (VB6).
Program architecture and operation
The current version of the program (version 1.5) is still only compatible with Windows operating systems running Excel 2000 or later, but versions for use with Excel on Macintosh and Linux are in preparation. The descriptions below apply to the Windows version, though most of it can be extended to future versions for other operating systems.
Installation is made easy with a set-up program that installs the necessary files in a folder that can be specified by the user (default is C:\Program Files\MIX). It will also create a MIX item in the Windows Start Menu (installing additional start-up icons on the Desktop or in the Quick-Launch bar is optional) and provides the option to start a Flash®-based program introduction. The MIX menu item contains an icon for starting up the MIX program, a folder with a shortcut to the uninstall program, a folder with shortcuts to programs for loading and unloading the Excel add-in, and a folder with educational programs and information. Loading the small MIX add-in that is supplied with the main program (typically automatically loaded during installation) results in a MIX menu-item under the Tools menu in Excel. This MIX menu contains several functions that can be accessed when the MIX program itself is not running. The files that form the core of the program are recognizable by their Mix file extension (*.mix) and currently contain approximately 16,000 lines of command code in 26 code modules and 17 custom user forms. These core files take up approximately 22 Mb of space on a hard-disk and their primary functions are (A) running interface procedures, (B) showing and manipulating output, (C) performing analyses, and finally (D) exporting and communicating with external files and programs. One of the core files is a large Excel workbook with 23 worksheets that forms the calculation engine of the program. It contains 6 sheets with primarily worksheet formulas and 10 sheets with various kinds of pre-calculated graphical and numerical results from meta-analytical tests. The remaining sheets contain information for help functions or programming purposes. This Excel workbook remains hidden from the users at all times. Figure 2 gives a graphical representation of the full program structure.
At start-up, a dedicated instance (an independent fully functional running program) of Excel is created and becomes visible once all regular Excel menus and toolbars are hidden and replaced by the MIX graphical interface. The Excel instance used by MIX is secured for exclusive use by the MIX program and does not interfere with existing Excel windows or settings.
The interface consists of a menu bar, two toolbars, and several shortcut menus. The menu bar and toolbars are directly accessible and the shortcut menus pop up with a right click of the mouse. The MIX menu bar has eight main menus (File, Edit, View, Numerical Output, Graphical Output, Analysis, and Help) via which all functions of the MIX program can be executed. Most of the common functions require only a single click on the toolbars. Double clicking graph items skips the shortcut menu and directly provides options for changing the graph item's format. Figure 3 shows the MIX program's user-interface with a forest plot and a format box to change the graph's format.
The MIX program provides several options for importing or creating data sets for meta-analysis. The most convenient option is to create an Excel or CSV file with data (standard output option in Excel) and import this file into the MIX program. The variable ranges are then selected in Excel-manner to create a data set (see figure 4), which is subsequently loaded for analysis and optionally saved as a MIX data set file (*.mxd). The program accepts descriptive data from studies with continuous outcomes, e.g. sample size, mean, and standard deviation, and from studies with dichotomous outcomes, e.g. group sizes and event numbers (two-by-two table data). Comparative data can also be loaded by means of association measures with their standard errors. Initially, however, it is not necessary to make a data set since 19 data sets from the most authoritative books on the subject ("Meta-analysis in Medical Research" by Sutton et al., "Systematic Reviews in Health Care, Meta-Analysis in Context" by Egger et al., and "Systematic Reviews in Health Care, A Practical Guide" by Glasziou et al.) have been included in the program. Most analyses and graphs presented in these books can be reproduced with a few clicks and the program can be used as a learning or teaching companion to these books. We hope to support more books in this way in the future. In addition, the MIX website also contains a data set repository where users can contribute and download MIX data sets.
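As an illustration of the descriptive dichotomous input described above, the following Python sketch reads two-by-two table data from CSV text. It is not part of MIX: the column names (study, events_treated, n_treated, events_control, n_control) and the two example rows are hypothetical, and MIX itself lets the user select the variable ranges interactively.

```python
import csv
import io

# Hypothetical CSV layout; column names and numbers are illustrative only.
EXAMPLE_CSV = """study,events_treated,n_treated,events_control,n_control
Study A,10,100,20,100
Study B,15,120,25,118
"""

COUNT_COLUMNS = ("events_treated", "n_treated", "events_control", "n_control")


def read_two_by_two(text):
    """Parse CSV text into a list of per-study dicts with integer counts."""
    studies = []
    for row in csv.DictReader(io.StringIO(text)):
        study = {"study": row["study"]}
        for column in COUNT_COLUMNS:
            study[column] = int(row[column])
        # Basic sanity check: events cannot exceed the group size.
        if (study["events_treated"] > study["n_treated"]
                or study["events_control"] > study["n_control"]):
            raise ValueError(f"events exceed group size in {row['study']}")
        studies.append(study)
    return studies


studies = read_two_by_two(EXAMPLE_CSV)
```

Each resulting dict holds the four counts needed to build one study's two-by-two table.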
A large variety of numerical and graphical output can be produced by the program. Besides the association measure values from the meta-analysis, several formal tests for heterogeneity, small study effects (publication bias), single study influence, and cumulative trends are also available in MIX. The graphical output is particularly comprehensive, with no less than eighteen informative plots that can be formatted in detail.
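To make the idea of a formal heterogeneity test concrete, the sketch below computes Cochran's Q and the I² statistic in Python. It assumes that per-study effect estimates and their variances are already available; the function name and the example numbers are hypothetical and do not come from the MIX source code.

```python
def heterogeneity(effects, variances):
    """Return Cochran's Q, its degrees of freedom, and I-squared (in percent).

    effects:   per-study effect estimates (e.g. log odds ratios)
    variances: the corresponding within-study variances
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Q is the weighted sum of squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared is the share of Q in excess of its expectation (df), floored at 0.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2


# Hypothetical two-study example.
q, df, i2 = heterogeneity([0.0, 2.0], [0.5, 0.5])
```

Under the null hypothesis of homogeneity, Q is usually compared with a chi-square distribution with df degrees of freedom; the p-value step is omitted here to keep the sketch dependency-free.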
Possible association measures from continuous outcome data input are mean difference (MD), Hedges' g (HG), and Cohen's d (CD), analyzed by inverse variance fixed or random effects models. Data from studies with dichotomous outcomes can be analyzed with a risk difference (RD), risk ratio (RR), or odds ratio (OR), weighted by inverse variance, Mantel-Haenszel, Peto (only odds ratio), or DerSimonian-Laird approaches. Analyses based on correlation coefficients or Fisher's Z are also possible, though only if the data are provided as comparative input, i.e. the association measures themselves with their standard errors. If correlation or effect size data are not in this format, they can be transformed via the MIX Statistics Converter that comes with the program. Table 1 gives an overview of the general features and the numerical and graphical methods in version 1.5 of the MIX program.
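Two of the weighting approaches named above can be sketched in a few lines of Python: inverse variance fixed effects pooling and the DerSimonian-Laird random effects estimate, both applied to log odds ratios from two-by-two tables. This is a textbook-style illustration under stated assumptions, not the MIX implementation; the function names and the example tables are hypothetical, and no continuity correction for zero cells is applied.

```python
import math


def log_odds_ratio(a, b, c, d):
    """Log odds ratio and its variance from one two-by-two table.

    a, b: events and non-events in the treated (exposed) group
    c, d: events and non-events in the control (unexposed) group
    Assumes all four cells are non-zero.
    """
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def pool_fixed(effects, variances):
    """Inverse variance fixed effects estimate and its variance."""
    weights = [1.0 / v for v in variances]
    estimate = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return estimate, 1.0 / sum(weights)


def pool_random_dl(effects, variances):
    """DerSimonian-Laird random effects estimate, its variance, and tau-squared."""
    weights = [1.0 / v for v in variances]
    fixed, _ = pool_fixed(effects, variances)
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    # Method-of-moments between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c)
    star = [1.0 / (v + tau2) for v in variances]
    estimate = sum(w * y for w, y in zip(star, effects)) / sum(star)
    return estimate, 1.0 / sum(star), tau2


# Hypothetical tables: (events, non-events) treated, (events, non-events) control.
tables = [(10, 90, 20, 80), (15, 85, 25, 75), (5, 95, 12, 88)]
effects, variances = zip(*(log_odds_ratio(*t) for t in tables))
fixed_est, fixed_var = pool_fixed(effects, variances)
random_est, random_var, tau2 = pool_random_dl(effects, variances)
pooled_or = math.exp(fixed_est)  # back-transform to the odds ratio scale
```

Pooling is done on the log scale because the log odds ratio is approximately normally distributed; the pooled estimate is exponentiated only for reporting.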
The most important educational features are the program's Output Tutor and Concept Tutor. Both are interactive dialog boxes that provide information about epidemiological and statistical concepts and tests. The Output Tutor changes with each analysis and always explains tests and results that are displayed or changed at the very moment. Additional teaching material includes a Flash®-based Theory Tour that explains the fundamentals of systematic reviews and meta-analyses and a Program Tour that shows the basics of how to use the program. The educational materials take up approximately 25 Mb and can also be downloaded separately.
To increase program stability and prevent users from accidentally altering the Visual Basic procedures, the source code cannot be accessed while the program is running. Codes to unlock the VBA modules are provided by the first author upon request.
Version 9.2 of STATA, and more specifically version 1.81 of the metan program, version 1.2.4 of the metabias program, and version 1.0.5 of the metatrim program were used as the general reference standards for most tests. Details on the development of these user-written programs themselves can be found in the STATA Technical Bulletins [25–27]. The meta-analysis software Comprehensive Meta-Analysis (CMA) version 2 was used for validation of the Fail-safe N output and to double check the results of the other tests. Two investigators (LB, LMY) performed the validation independently with the MIX program (version 1.5 running in Excel 2003) and the reference standard(s) by analyzing eight data sets from meta-analyses that have been published in major journals [4, 29–35].
The data sets represent three of the most often used types of data for meta-analysis in health care research: 1) descriptive data for dichotomous outcomes, 2) descriptive data for continuous outcomes, and 3) comparative (association measure) data. For all three data types we chose a relatively small (less than 10 studies) and large data set (more than 20 studies) and we used two extra data sets in the 'descriptive dichotomous' category (one representing a meta-analysis of substantially heterogeneous studies and one with a rare event). The data sets are summarized in table 2. The tests that were subject to the validation procedures are shown in table 3. The items include individual study association measures, combined association measures, and several heterogeneity and small study effect assessments. Whenever applicable, p-values and/or confidence intervals were also compared.
Results from the analyses of the eight data sets with MIX and the reference software were entered independently in identical custom-made spreadsheets. These spreadsheets were later compared in separate analysis sheets that used a cell-based formula to check for discrepancies of results up to 4 decimals.
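The comparison step can be pictured with the following Python sketch, which flags any result that differs between two outputs after rounding to four decimals. It is an analogue of the cell-based spreadsheet check, not the formula we used; the dictionary keys and the numbers are hypothetical, and rounding both values before comparing is an assumption about how "up to 4 decimals" is operationalized.

```python
def find_discrepancies(mix_results, reference_results, decimals=4):
    """Return (name, mix_value, reference_value) for every mismatching result.

    A result mismatches if it is missing from either output or if the two
    values differ after rounding to the given number of decimals.
    """
    mismatches = []
    for name in sorted(set(mix_results) | set(reference_results)):
        mix_value = mix_results.get(name)
        ref_value = reference_results.get(name)
        if (mix_value is None or ref_value is None
                or round(mix_value, decimals) != round(ref_value, decimals)):
            mismatches.append((name, mix_value, ref_value))
    return mismatches


# Hypothetical outputs from two programs for the same data set.
mix_output = {"pooled_OR": 0.44441, "Q": 3.21}
reference_output = {"pooled_OR": 0.44443, "Q": 3.2102}
mismatches = find_discrepancies(mix_output, reference_output)
```

An empty list means the two outputs agree at the chosen precision.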
Results and Discussion
In summary, we have been able to achieve our objective of developing a comprehensive and yet free program for meta-analysis. The Excel platform, although not without problems, has proved to be flexible enough to create an easy-to-use, and graphically and numerically comprehensive program.
In its current state (version 1.5) all results from the MIX program are identical (to at least 4 decimals) to results from the most recent versions of the metan, metabias, and metatrim commands in STATA. The small study effect regression test by Macaskill that was tested via STATA's regress command also turned out to be accurate. Tables 4 and 5 are examples of the odds ratio validation results for data set 1.
With regard to the trim-and-fill analysis, the MIX program allows for calculations using the weighting method applied in the original meta-analysis, whereas both CMA and STATA use only fixed or random effects inverse variance methods when trimming and filling. While the calculations in MIX for trim-and-fill analyses with other weighting methods were verified manually and we have no reason to believe anything is wrong, we recommend using the inverse variance methods until more is known about approaches with alternative weighting.
Although we are in the process of completing a formal software comparison project, we are confident that the MIX program can compete in many respects (usability, analytical options, comprehensiveness, and export options) with most of the existing meta-analysis programs such as Comprehensive Meta-Analysis, MetaWin, RevMan, or WEasyMA. However, there are also still some limitations. One is the maximum number of studies that can be analyzed in the meta-analysis, which is now 100. Though systematic reviews finding 100 studies for analysis are still very rare, this is something that may change in the future. Furthermore, while sub-group analyses are easy to perform within MIX, they are currently not automated and during a sub-group analysis not all subgroups can be shown simultaneously in a single forest plot. The subgroup forest plot can, however, be created manually because the Excel graphs of individual forest plots are relatively easily formatted and stacked. We intend to improve the program with regard to these limitations in the near future.
Another important issue that we will focus on in upcoming updates is meta-regression. Although some univariable regression methods are integrated in the tests for small study effects, the MIX program cannot currently perform meta-regression. We realize that meta-regression, especially with multiple independent variables, is a valuable tool for assessing heterogeneity and adapting a meta-analysis accordingly, but it requires matrix calculations that are far more difficult to program in Excel or VBA than the standard tests. Currently, univariable meta-regression is possible with Comprehensive Meta-Analysis and MetaWin. However, like all dedicated meta-analysis packages they lack the option for multivariable meta-regression. We have started working on facilities for meta-regression within the MIX program and we hope it will be integrated sometime in 2007.
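For readers unfamiliar with the technique, the sketch below shows a fixed effects univariable meta-regression as inverse variance weighted least squares in Python. With a single covariate the solution has a closed form; with several covariates the weighted normal equations must be solved, which is where the matrix calculations come in. This is an illustration only, not a planned MIX routine; the function name and example data are hypothetical, and standard errors of the coefficients are omitted.

```python
def weighted_meta_regression(covariate, effects, variances):
    """Fit effect = intercept + slope * covariate by weighted least squares.

    Each study is weighted by the inverse of its within-study variance.
    Returns (intercept, slope).
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, covariate)) / total
    y_bar = sum(w * y for w, y in zip(weights, effects)) / total
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, covariate))
    if sxx == 0:
        raise ValueError("covariate has no variation across studies")
    sxy = sum(w * (x - x_bar) * (y - y_bar)
              for w, x, y in zip(weights, covariate, effects))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical data lying exactly on the line effect = 1 + 2 * covariate.
intercept, slope = weighted_meta_regression(
    [1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0], [0.1, 0.2, 0.3, 0.4]
)
```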
Finally, because we are still frequently updating the program and including new features, we have postponed the making of a hard-copy manual or methods guide until this process has stabilized.
The MIX program provides researchers, students, and lecturers with a free tool to perform state-of-the-art meta-analyses and learn or teach about what it is they are doing. It uses an innovative approach with Excel as a computing platform and even provides some numerical and graphical output that is not provided by other software. Results from version 1.5 of the MIX program are identical to those from STATA, and MIX can be regarded as a comprehensive and valid tool for performing causal meta-analyses.
Availability and requirements
Project name: MIX
Project homepage: http://www.mix-for-meta-analysis.info or http://www.sourceforge.net/projects/meta-analysis/
Operating system(s): Microsoft Windows
Programming language: Visual Basic (VB6, VBA)
Other requirements: Microsoft Excel 2000 or later
License: Open Source, free
Oxford-Center for Evidence Based Medicine, Levels of Evidence and Grades of Recommendation. [http://www.cebm.net/levels_of_evidence.asp]
Yusuf S, Cairns J, Camm A, Fallen E, Gersh B: Evidence-Based Cardiology. 1998, London: BMJ Publishing Group
Yusuf S, Zucker D, Peduzzi P, Fisher L, Takaro T, Kennedy J, Davis K, Killip T, Passamani E, Norris R, et al: Effect of coronary artery bypass graft surgery on survival: overview of 10-year results from randomised trials by the Coronary Artery Bypass Graft Surgery Trialists Collaboration. Lancet. 1994, 344 (8922): 563-70. 10.1016/S0140-6736(94)91963-1.
Lau J, Antman E, Jimenez-Silva J, Kupelnick B, Mosteller F, Chalmers T: Cumulative meta-analysis of therapeutic trials for myocardial infarction. N Engl J Med. 1992, 327 (4): 248-54.
The Cochrane Collaboration. [http://www.cochrane.org]
Egger M, Davey Smith G, Altman D: Systematic Reviews in Health Care: Meta-Analysis in Context. 2001, London: BMJ Publishing Group
Glasziou P, Irwig L, Bain C, Colditz G: Systematic Reviews in Health Care: A Practical Guide. 2001, Cambridge: Cambridge University Press
Petitti D: Meta-Analysis, Decision Analysis, and Cost-Effectiveness Analysis: Methods for Quantitative Synthesis in Medicine. 2000, Oxford: Oxford University Press, 2nd edition
Stangl D, Berry D: Meta-analysis in Medicine and Health Policy. 2000, New York: Marcel Dekker
Sutton A, Abrams K, Jones D, Sheldon T, Song F: Methods for Meta-Analysis in Medical Research. 2000, Chichester: Wiley
Whitehead A: Meta-Analysis of Controlled Clinical Trials. 2002, Chichester: Wiley
Egger M, Sterne J, Smith G: Meta-analysis software. BMJ. 1998, 316 (7126): [http://bmj.bmjjournals.com/archive/7126/7126ed9.htm]
Normand S: Meta-analysis software – a comparative review – DSTAT, version 1.10. Am Statistician. 1995, 49: 298-309. 10.2307/2684205.
Sterne J, Egger M, Sutton A: Meta-analysis software. Systematic Reviews in Health Care: Meta-Analysis in Context. Edited by: Egger M, Davey Smith G, Altman D. 2001, London: BMJ Books, 2nd edition
Sutton A, Lambert P, Hellmich M, Abrams K, Jones D: Meta-analysis in practice: A critical review of available software. Meta-Analysis in Medicine and Health Policy. Edited by: Berry D, Stangl D. 2000, New York: Marcel Dekker
Bax L, Ikeda N, Shirataka M, Takeuchi A: Explaining common meta-analytic statistics in Japan with a simple Excel add-in. The 24th Joint Conference on Medical Informatics. 2004, Nagoya, Japan
Bax L, Ikeda N: Explaining and performing common meta-analytic procedures in Japan: development of bilingual interactive software. The 12th Cochrane Colloquium. 2004, Ottawa, Canada
Bax L, Tsuruta H, Ikeda N, Takeuchi A, Shirataka M: The MIX program, free software for learning, teaching, and exploring meta-analysis with Excel. The 13th Cochrane Colloquium, Melbourne, Australia. 2005
Bax L, Tsuruta H, Shirataka M, Takeuchi A, Ikeda N: The MIX program, an active way of learning about meta-analysis with Excel. International Symposium: Systematic Review and Meta-Analysis, Wako, Japan. 2005
Meta-analysis with Interactive explanations. [http://www.mix-for-meta-analysis.info]
Knusel L: On the accuracy of statistical distributions in Microsoft Excel 2003. Comput Statist Data Anal. 2005, 48 (3): 445-449. 10.1016/j.csda.2004.02.008.
McCullough B, Wilson B: On the accuracy of statistical procedures in Microsoft Excel 2000 and Excel XP. Comput Statist Data Anal. 2002, 40: 713-721. 10.1016/S0167-9473(02)00095-6.
McCullough B, Wilson B: On the accuracy of statistical procedures in Microsoft Excel 2003. Comput Statist Data Anal. 2005, 49 (4): 1244-1252. 10.1016/j.csda.2004.06.016.
StataCorp: Stata Statistical Software, Release 9. 2005, College Station, TX: StataCorp LP
Bradburn M, Deeks J, Altman D: Metan – an alternative meta-analysis command (Metan 1.81). Stata Technical Bulletin. 2003, STB 44 (sbe24): 4-15.
Steichen T: Tests for publication bias in meta-analysis (Metabias 1.2.4). Stata Journal. 2003, SJ3-4 (sbe19_5): 11-
Steichen T: Nonparametric trim and fill analysis of publication bias in meta-analysis (Metatrim 1.0.5). Stata Technical Bulletin. 2003, STB61 (sbe39.2): 11-
Borenstein M, Hedges L, Higgins J, Rothstein H: Comprehensive Meta-Analysis Version 2. 2005, Englewood, NJ: Biostat
Hodnett E: Caregiver support for women during childbirth. Cochrane Database Syst Rev. 2000, CD000199-2
Teo K, Yusuf S, Collins R, Held P, Peto R: Effects of intravenous magnesium in suspected acute myocardial infarction: overview of randomised trials. BMJ. 1991, 303 (6816): 1499-503.
Crowley P: Interventions for preventing or improving the outcome of delivery at or beyond term. Cochrane Database Syst Rev. 2000, CD000170-2
Lightowler J, Wedzicha J, Elliott M, Ram F: Non-invasive positive pressure ventilation to treat respiratory failure resulting from exacerbations of chronic obstructive pulmonary disease: Cochrane systematic review and meta-analysis. BMJ. 2003, 326 (7382): 185. 10.1136/bmj.326.7382.185.
Wahlbeck K, Cheine M, Essali M: Clozapine versus typical neuroleptic medication for schizophrenia. Cochrane Database Syst Rev. 2000, CD000059-2
Pagliaro L, D'Amico G, Sorensen T, Lebrec D, Burroughs A, Morabito A, Tine F, Politi F, Traina M: Prevention of first bleeding in cirrhosis. A meta-analysis of randomized trials of nonsurgical treatment. Ann Intern Med. 1992, 117: 59-70.
Law M, Wald N, Thompson S: By how much and how quickly does reduction in serum cholesterol concentration lower risk of ischaemic heart disease?. BMJ. 1994, 308 (6925): 367-72.
Macaskill P, Walter S, Irwig L: A comparison of methods to detect publication bias in meta-analysis. Stat Med. 2001, 20 (4): 641-54. 10.1002/sim.698.
Duval S, Tweedie R: Trim and fill: A simple funnel-plot-based method of testing and adjusting for publication bias in meta-analysis. Biometrics. 2000, 56 (2): 455-63. 10.1111/j.0006-341X.2000.00455.x.
Rosenberg M, Adams D, Gurevitch J: MetaWin: Statistical Software for Meta-Analysis Version 2. 2000, Sunderland, Massachusetts: Sinauer Associates
The Nordic Cochrane Centre: Review Manager (RevMan) Version 4.2 for Windows. 2003, Copenhagen: The Cochrane Collaboration
Chevarier P, Cucherat M, Freiburger T, Maupas J, Visele N, Bugnard F, Bazog P: WeasyMA. 2000, Lyon: ClinInfo
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2288/6/50/prepub
The development and validation of the MIX program was supported by a one-year grant from the Graduate School of Medical Sciences of Kitasato University, # 3042. We are furthermore grateful to all members of the Department of Medical Informatics at Kitasato University for the stimulating discussions during the project.
The author(s) declare that they have no competing interests.
LB designed and developed the MIX program, under supervision of NI and HT and with testing by and recommendations from all authors. The validation was performed by LB and LY and supervised by KGM. LB drafted the manuscript and all authors participated in the writing.