 Research article
 Open Access
This article has Open Peer Review reports available.
A simple ratio-based approach for power and sample size determination for 2-group comparison using Rasch models
 Véronique Sébille^{1},
 Myriam Blanchin^{1},
 Francis Guillemin^{2},
 Bruno Falissard^{3, 4} and
 Jean-Benoit Hardouin^{1}
https://doi.org/10.1186/1471-2288-14-87
© Sébille et al.; licensee BioMed Central Ltd. 2014
Received: 18 December 2013
Accepted: 27 June 2014
Published: 5 July 2014
Abstract
Background
Despite the widespread use of patient-reported outcomes (PRO) in clinical studies, their design remains a challenge. Justification of study size is rarely provided, especially when a Rasch model is planned for analysing the data in a 2-group comparison study. The classical sample size formula (CLASSIC) for comparing normally distributed endpoints between two groups has been shown to be inadequate in this setting (underestimated study sizes). A correction factor (RATIO) has been proposed to reach an adequate sample size from the CLASSIC when a Rasch model is intended to be used for analysis. The objective was to explore the impact of the parameters used for study design on the RATIO and to identify the most relevant ones, in order to provide a simple method for sample size determination for Rasch modelling.
Methods
A large number of combinations of the parameters used for study design was simulated using a Monte Carlo method: variance of the latent trait, group effect, sample size per group, number of items and item difficulty parameters. A linear regression model explaining the RATIO and including all of these parameters as covariates was fitted.
Results
The most relevant parameters explaining the ratio’s variations were the number of items and the variance of the latent trait (R^{2} = 99.4%).
Conclusions
Using the classical sample size formula adjusted with the proposed RATIO can provide a straightforward and reliable formula for sample size computation for 2-group comparison of PRO data using Rasch models.
Keywords
 Patient-reported outcomes
 Item response theory
 Rasch model
 Sample size
 Power
Background
Patient-reported outcomes (PRO) are increasingly used in clinical research; they have become essential criteria that have gained major importance, especially in chronically ill patients. Consequently, these outcomes are now often considered as main secondary endpoints or even primary endpoints in clinical studies [1–4]. Two main types of analytic strategies are used for PRO data: so-called classical test theory (CTT) and models coming from Item Response Theory (IRT). CTT relies on the observed scores (possibly weighted sums of patients' item responses) that are assumed to provide a good representation of a “true” score, while IRT relies on an underlying response model relating the item responses to a latent trait, interpreted as the true individual quality of life (QoL) for instance. The potential of IRT models for constructing, validating, and reducing questionnaires and for analyzing PRO data has been regularly underlined [5–7]. IRT, and in particular Rasch family models [8], can improve on the classical approach to PRO assessment with advantages that include interval measurements, appropriate management of missing data [9–11] and of possible floor and ceiling effects, and comparison of patients across different instruments [12]. Consequently, many questionnaires are validated (or revalidated) using IRT along with CTT [13–15], which allows PRO data to be analysed with IRT models in clinical research.
Clinical research methodology has reached a high level of requirements through the publication of international guidelines including the CONSORT statement and the STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) or TREND (Transparent Reporting of Evaluations with Nonrandomized Designs) initiatives, for instance [16–19]. All of these published recommendations are aimed at improving the reporting of scientific investigations coming either from randomized clinical trials or observational studies, and they systematically include an item related to sample size justification and determination. Furthermore, good methodological standards recommend that methods used for sample size planning and for subsequent statistical analysis should be based on similar grounds. Even if guidelines have also been recently published for PRO-based studies [20, 21], the reporting of such studies often fails to mention the justification of study size and its computation. Three main types of situations are often encountered in 2-group comparison studies: i) sample size determination is not performed whatever the intended analysis for PRO data (CTT and/or IRT), ii) tentative justification is occasionally given a posteriori for the size of studies, iii) sample size computation is made a priori but only relies on CTT (mostly using the classical formula for comparing normally distributed endpoints on expected mean scores) even if IRT models are envisaged for data analysis. In this latter case, previous studies have shown that the classical formula was inadequate for IRT models because it leads to underestimation of the required sample size [22]. From this perspective, a method has been recently developed for power and sample size determination when designing a study using a PRO as a primary endpoint when IRT models coming from the Rasch family are intended to be used for subsequent analysis of the data [23].
This method, named Raschpower, provides the power for a given sample size during the planning stage of a study in the framework of Rasch models. It depends on the following parameters (that are a priori assumed and fixed): the parameters related to the items of the questionnaire (number of items J and item difficulty parameters δ_{j}, j = 1,…,J), the variance of the latent trait (σ^{2}) and the mean difference between groups on the latent trait (γ). Some of these parameters are easily known a priori when planning a study (e.g. number of items); others are sometimes more difficult to obtain (e.g. item difficulties, σ^{2}, γ) and initial estimates based on the literature or pilot studies are required. Besides, whether all these parameters have the same importance regarding sample size determination for Rasch models is unknown. The aim of our paper is to explore the relative impact of these parameters on sample size computation and to identify the most relevant ones to be used during study design for reliable power determination for Rasch models. Our main objective is to provide a simple method for sample size determination when a Rasch model is planned for analysing PRO data in a 2-group comparison study.
Methods
The Rasch model
In the Rasch model [8], the responses to the items are modelled as a function of a latent variable representing the so-called ability of a patient measured by the questionnaire (e.g. QoL, anxiety, fatigue…). The latent variable is often considered as a random variable assumed to follow a normal distribution. In this model, each item is characterized by one parameter (δ_{j} for the jth item), named item difficulty because the higher its value, the lower the probability of a positive (favourable) response of the patient to this item regarding the latent trait being measured.
Let us consider that two groups of patients are compared and that a total of N patients have answered a questionnaire containing J binary items. Let X_{ij} be a binary random variable representing the response of patient i to item j with realization x_{ij}, θ_{i} be the realization of the latent trait Θ for this patient, and γ the group effect defined as the difference between the means of the latent trait in the two groups. The Rasch model including a group effect can be written as (Eq 1):
$$P\left(X_{ij}=x_{ij}\mid\theta_{i};\delta_{j},\gamma\right)=\frac{\exp\left(x_{ij}\left(\theta_{i}+g_{i}\gamma-\delta_{j}\right)\right)}{1+\exp\left(\theta_{i}+g_{i}\gamma-\delta_{j}\right)}$$
where δ_{j} represents the difficulty parameter of item j and g_{i} = 0,1 for patients in the first or second group, respectively. The latent variable Θ is usually a random variable following a normal distribution with unknown parameters μ and σ^{2}. Marginal maximum likelihood estimation is often used for estimating the parameters of the model.
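As an illustration of the model just described, the following minimal sketch computes Rasch response probabilities and simulates a two-group dataset. It assumes the usual logistic form in which the log-odds of a positive response equal θ_{i} + g_{i}γ − δ_{j}, and it fixes the latent mean at 0 in the first group. The function names `rasch_prob` and `simulate_responses` and the numerical values are hypothetical and are not taken from the article.

```python
import numpy as np

def rasch_prob(theta, delta, gamma=0.0, g=0):
    """Probability of a positive response, P(X_ij = 1 | theta_i).

    Assumes log-odds equal to theta_i + g_i * gamma - delta_j.
    """
    eta = theta + g * gamma - delta
    return 1.0 / (1.0 + np.exp(-eta))

def simulate_responses(n_per_group, delta, gamma, sigma2, seed=0):
    """Simulate binary responses for two equal-sized groups (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    delta = np.asarray(delta, dtype=float)
    g = np.repeat([0, 1], n_per_group)            # group indicator g_i
    # Latent trait: mean fixed at 0 in the first group (assumption), variance sigma2.
    theta = rng.normal(0.0, np.sqrt(sigma2), size=g.size)
    p = rasch_prob(theta[:, None], delta[None, :], gamma, g[:, None])
    x = (rng.random(p.shape) < p).astype(int)     # N x J matrix of 0/1 responses
    return g, x

g, x = simulate_responses(n_per_group=200, delta=[-1.0, 0.0, 1.0],
                          gamma=0.5, sigma2=1.0)
print(x.shape)                  # (400, 3)
print(x[g == 0].mean(axis=0))   # observed proportions of positive responses, group 0
print(x[g == 1].mean(axis=0))   # same for group 1
```

Items with a higher difficulty should show a lower proportion of positive responses, and the second group should tend to respond positively more often when γ > 0.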
Sample size determination in the framework of the Rasch model – The Raschpower method
We assume that we want to design a clinical trial using a given dimension of a PRO (e.g. the Mental Health dimension of the SF-36) as a primary outcome in a two-group cross-sectional study. Let γ (assumed > 0) be the difference between the mean values of the latent trait (e.g. mental health) in the two groups and σ^{2} the common variance of the latent trait in both groups. We assume that the study involves the comparison of the two hypotheses H_{0}: γ = 0 against the two-sided alternative H_{1}: γ ≠ 0. If we plan to use a Rasch model that includes a group effect γ (Eq 1) to test this null hypothesis on the data that will be gathered during the study with a given power 1 − β_{R} and type I error α, determination of the required sample size can be made using an adapted formula that has been implemented in the Raschpower method [23]. This method is based on the power of the Wald test of group effect γ for a given sample size and is briefly described here. To perform a Wald test, an estimate Γ of γ is required as well as its standard error. Since we are designing a study, some assumptions are made regarding the expected values of these parameters. More specifically, Γ is set at the assumed value for the group effect, γ, and its standard error is obtained as follows: an expected dataset of the patients' responses is created conditionally on the planning values that are assumed for the sample size in each group, the group effect γ, the item difficulties δ_{j}, and the variance of the latent trait σ^{2}. The probabilities and the expected frequencies of all possible response patterns for each group are computed with the statistical model that will be used for analyzing the data that will be gathered during the study: a Rasch model. The variance of the group effect $\widehat{\mathrm{Var}}\left(\hat{\gamma}\right)$ is subsequently estimated using a Rasch model including a group effect with δ_{j} and σ^{2} fixed to their planned expected values.
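The power of the Wald test of group effect can then be approximated by (Eq 2):
$$1-\beta_{R}\approx 1-\Phi\left(z_{1-\alpha/2}-\frac{\gamma}{\sqrt{\widehat{\mathrm{Var}}\left(\hat{\gamma}\right)}}\right)$$
where Φ is the cumulative standard normal distribution function and z_{1 − α/2} the 100(1 − α/2)th percentile of the standard normal distribution. 1 − β_{R} is the power of the Wald test of group effect when a Rasch model is used to detect γ at level α. In practice, γ, σ^{2}, and the item difficulties are unknown population parameters and initial estimates based on the literature or pilot studies are required for calculations.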
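The power computation described above can be sketched as follows, assuming the usual normal approximation for the Wald test, 1 − β_{R} ≈ 1 − Φ(z_{1 − α/2} − γ/√Var(γ̂)). The function name `wald_power` is hypothetical, and the variance of the group effect estimate is taken as an input: in the Raschpower method it comes from fitting a Rasch model to the expected dataset, which is not reproduced here.

```python
from math import sqrt
from statistics import NormalDist

def wald_power(gamma, var_gamma, alpha=0.05):
    """Approximate power of the Wald test of the group effect.

    gamma     : assumed group effect (difference in latent trait means)
    var_gamma : planned variance of the group effect estimate (an input here)
    alpha     : two-sided type I error
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    return 1 - std_normal.cdf(z_alpha - gamma / sqrt(var_gamma))

# Illustrative values only: a group effect of 0.5 with standard error 0.2.
print(round(wald_power(0.5, 0.2 ** 2), 3))
```

The smaller the variance of the group effect estimate (larger samples, more items), the larger the ratio γ/√Var(γ̂) and hence the power.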
Relationship between the Raschpower method and the classical formula for manifest normal variables
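The classical formula for comparing two normally distributed endpoints gives the sample size in the first group, N_{C0}, required to detect a mean difference γ with common variance σ^{2}, power 1 − β and two-sided type I error α (Eq 3):
$$N_{C0}=\frac{k+1}{k}\,\frac{\sigma^{2}\left(z_{1-\alpha/2}+z_{1-\beta}\right)^{2}}{\gamma^{2}}$$
where N_{C1} = k × N_{C0} is the sample size in the second group (when k = 1, the sample sizes are assumed equal in both groups).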

For a sample size of N_{g} per group, obtained with the classical formula for a desired power 1 − β, the Raschpower method provides the expected power 1 − β_{R} for Rasch modelling (Figure 1, RP①). The ratio relies on the following observations:
 since 1 − β_{R} ≤ 1 − β, the sample size that provides a power of 1 − β_{R} using the classical formula (Eq 3 and Figure 1, CF②), say N_{c}, is lower than N_{g} and the ratio $\mathrm{Ra}=\frac{N_{g}}{N_{c}}$ (Figure 1, ③) is therefore higher than 1;
 previous observations [23] have shown that this ratio Ra remained stable for different values of N_{g} and 1 − β_{R}, given γ, J and item difficulties;
 it has been noticed that multiplying N_{g} by this ratio gave a sample size of N_{R} = N_{g} × Ra (Figure 1, ①) that could provide the desired power 1 − β for Rasch modelling (Figure 1, RP⑤).
Hence this ratio Ra depends on the well-known classical formula and can be used to provide sample size calculations for Rasch modelling.
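The construction of the ratio can be sketched numerically. The example below assumes the standard classical formula for two normally distributed endpoints, N = ((k + 1)/k) σ²(z_{1 − α/2} + z_{1 − β})²/γ², rounded up to the next integer, and uses the planning values of the NHP example given later in the article (γ = 0.649, σ² = 3.9323, desired power 90%, Raschpower power of 80% at N_{g}). The function name `classical_n` is hypothetical; the 80% value is taken from the text and is not recomputed here.

```python
from math import ceil
from statistics import NormalDist

def classical_n(gamma, sigma2, power, alpha=0.05, k=1.0):
    """Sample size in the first group from the classical formula.

    The second group has k times this size (k = 1: equal groups).
    Rounding up to the next integer is an assumption of this sketch.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_beta = std_normal.inv_cdf(power)
    return ceil((k + 1) / k * sigma2 * (z_alpha + z_beta) ** 2 / gamma ** 2)

gamma, sigma2 = 0.649, 3.9323                   # NHP planning values from the article
n_g = classical_n(gamma, sigma2, power=0.90)    # desired power 1 - beta
power_rasch = 0.80                              # 1 - beta_R reported for N_g (from the text)
n_c = classical_n(gamma, sigma2, power=power_rasch)
ra = n_g / n_c                                  # ratio Ra = N_g / N_c
n_r = round(n_g * ra)                           # N_R = N_g x Ra

print(n_g, n_c, round(ra, 2), n_r)              # 197 147 1.34 264
```

These values agree with the figures quoted for the NHP example (N_{g} = 197, N_{c} = 147, Ra = 1.34, N_{R} ≈ 264).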
Simulations
A simulation study has been performed in order to get more insight into the relationships between the parameters that are required when planning a study for power determination for a given sample size (γ, σ^{2}, δ_{j}, J) and the ratio Ra. A large number of cases (10^{6}) were simulated with each case corresponding to a single parameter combination (γ, σ^{2}, δ_{j}, J, N_{g}). The parameter values were randomly drawn from continuous or discrete uniform distributions, U[min–max], for: the variance of the latent trait σ^{2} (U[0.25–9]), the group effect γ (U[0.2×σ – 0.8×σ]), the number of items J (U[3–20]), and the sample size per group N_{g}, assumed to be equal in both groups (U[50–500]). The item difficulty parameters δ_{j}, j = 1,…,J, were drawn from a centred normal distribution with variance σ^{2} and set to the percentiles of the distribution. The Raschpower method was applied to each parameter combination and provided the power 1 − β_{R} for Rasch modelling as well as the ratio Ra. Multiple linear regression was performed to assess the contribution of N_{g}, γ, J, σ^{2} and the difficulty parameters δ_{j}, j = 1,…,J, to the variation of the ratio Ra. The effects of the difficulty parameters on Ra were investigated in several ways for different values of J: i) by introducing each parameter δ_{j}, j = 1,…,J, individually, ii) by introducing their mean and variance. A two-tailed P-value < 0.05 was considered significant. The variance explained by the model (R^{2}) and the root mean square error (RMSE) were obtained and contributed to variable selection. Variables were removed if R^{2} and RMSE remained stable (within a 0.01 range). Post-regression diagnostics were performed to check that the linear regression assumptions were met (normality and homoscedasticity of residuals). Statistical analysis was performed using SAS statistical software version 9.3 (SAS Institute Inc, Cary, North Carolina).
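The sampling of one parameter combination can be sketched as below. This is only an illustration of the design, not the authors' SAS code. It reads the ranges as σ² in [0.25, 9], γ in [0.2σ, 0.8σ], J in {3, …, 20} and N_{g} in {50, …, 500}, and it assumes that "set to the percentiles" means the quantiles of order j/(J + 1) of a N(0, σ²) distribution. The function name `draw_design` is hypothetical.

```python
import numpy as np
from statistics import NormalDist

def draw_design(rng):
    """Draw one parameter combination (gamma, sigma2, delta_j, J, N_g)."""
    sigma2 = rng.uniform(0.25, 9.0)                 # variance of the latent trait
    sigma = float(np.sqrt(sigma2))
    gamma = rng.uniform(0.2 * sigma, 0.8 * sigma)   # group effect
    n_items = int(rng.integers(3, 21))              # J in 3..20 (upper bound exclusive)
    n_g = int(rng.integers(50, 501))                # N_g in 50..500
    # Item difficulties at equally spaced percentiles of N(0, sigma2);
    # the j / (J + 1) spacing is an assumption of this sketch.
    latent = NormalDist(0.0, sigma)
    delta = np.array([latent.inv_cdf(j / (n_items + 1))
                      for j in range(1, n_items + 1)])
    return {"sigma2": sigma2, "gamma": gamma, "J": n_items,
            "N_g": n_g, "delta": delta}

rng = np.random.default_rng(2014)
design = draw_design(rng)
print(design["J"], design["N_g"], np.round(design["delta"], 2))
```

Each such combination would then be passed to the Raschpower method to obtain 1 − β_{R} and Ra, which is the step this sketch leaves out.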
Results
Among the 10^{6} parameter combinations, 15278 corresponded to the largest power for CTT and Rasch-based analysis (100%), for which the ratio cannot be computed. Hence all analyses were performed on the remaining 984722 parameter combinations.
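The linear regression model retained to explain the ratio was:
$$\mathrm{Ra}_{i}=\beta_{0}+\beta_{1}\frac{1}{\sigma_{i}^{2}}+\beta_{2}\frac{1}{J_{i}}+\beta_{3}\left(\frac{1}{\sigma_{i}^{2}}\times\frac{1}{J_{i}}\right)+\epsilon_{i}$$
where $\epsilon_{i}\sim N\left(0,\sigma_{\epsilon}^{2}\right)$, for i = 1, …, 984722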
Parameter estimates of the linear regression model explaining the ratio provided by the Raschpower method
Variables  Estimate (standard error), N_{POP} = 984722  P-values
Intercept  1.012 (7.0 × 10^{−5})  <10^{−3}
1/σ^{2}  0.095 (1.0 × 10^{−4})  <10^{−3}
1/J  0.939 (5.0 × 10^{−4})  <10^{−3}
Interaction (1/σ^{2} × 1/J)  3.730 (7.5 × 10^{−4})  <10^{−3}
R^{2}  0.994  /
RMSE  0.030  /
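The fitted model can be turned into a small helper that predicts the ratio from the number of items and the variance of the latent trait. The coefficients are those reported in the table above; the function name `predicted_ratio` and the grid of values are hypothetical and only meant to show how the ratio behaves.

```python
def predicted_ratio(sigma2, n_items):
    """Ratio Ra predicted by the regression model (coefficients from the table)."""
    return (1.012
            + 0.095 / sigma2
            + 0.939 / n_items
            + 3.730 / (sigma2 * n_items))

# Illustrative grid within the simulated ranges (J in 3..20, sigma2 in 0.25..9).
for sigma2 in (1.0, 4.0, 9.0):
    for n_items in (3, 8, 20):
        print(sigma2, n_items, round(predicted_ratio(sigma2, n_items), 3))
```

The predicted ratio decreases towards 1.012 as the number of items and the variance of the latent trait increase, so the correction to the classical sample size is largest for short questionnaires and small variances.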
Distributions of the difference between the ratio (respectively number of subjects per group) predicted by the model and the one expected by the Raschpower method, Δ_{R} (respectively Δ_{N}), and distribution of Δ_{N} according to the threshold (Thres)
Variables  N_{POP} = 1996077
Δ_{R}, 2.5% / Median / 97.5%  −0.049 / 0.002 / 0.043
Δ_{R}, [min ; max]  [−1.236 ; 0.230]
Δ_{N}, 2.5% / Median / 97.5%  −10.623 / 0.438 / 13.499
Δ_{N}, [min ; max]  [−179.576 ; 112.064]
− Thres < Δ_{N} < + Thres, n (%)  968364 (98.34%)
Δ_{N} < − Thres, n (%)  10865 (1.10%)^{§}
Δ_{N} > + Thres, n (%)  5493 (0.56%)^{†}
An example of sample size determination in clinical research using the ratio – NHP data
Multiplying N_{g} by this ratio gives a sample size of $\hat{N}_{R}$ = 197 × 1.27210 ≈ 251 patients per group that should provide the desired power of 90% for Rasch modelling of the pain dimension of the NHP questionnaire. These results were compared to those obtained with the Raschpower method using the estimated difficulty parameters from the pilot study (2.61, 2.94, 1.75, 0.46, −0.11, 0.36, 1.28, 2.23), $\hat{\gamma}$ = 0.649, $\hat{\sigma}^{2}$ = 3.9323, and N_{g} = 197 per group. An 80% power (1 − β_{R}) is expected using the Raschpower method for Rasch modelling with a sample size of N_{g} = 197 per group (Figure 1, RP①). The proposed ratio is therefore equal to 197 / 147 = 1.34 where N_{c} = 147 is the number of subjects per group that provides a power of 80% using the classical formula (Figure 1, CF②). Hence, using the ratio, 197 × 1.34 ≈ 264 (N_{R}) patients per group should provide the desired power of 90% for Rasch modelling (Figure 1, RP⑤).
Comparison of the required parameters and the results obtained using the linear regression model and the Raschpower method on the NHP data
Variables  Linear regression model  Raschpower method 

σ^{2}  3.9323  3.9323 
J  8  8 
γ  /  0.649 
N_{g}  /  197 
δ  /  (2.61, 2.94, 1.75, 0.46, −0.11, 0.36, 1.28, 2.23) 
Ra  1.27210  1.34 
N_{R}  251  264 
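As a check on the linear regression column of this table, the ratio can be recomputed by hand from the regression coefficients reported earlier (intercept 1.012, 0.095 for 1/σ^{2}, 0.939 for 1/J and 3.730 for their interaction) with σ^{2} = 3.9323 and J = 8. The intermediate values below are rounded to six decimals and are only a worked illustration of the arithmetic.

```latex
\begin{align*}
\widehat{\mathrm{Ra}} &= 1.012 + \frac{0.095}{3.9323} + \frac{0.939}{8} + \frac{3.730}{3.9323 \times 8} \\
                      &\approx 1.012 + 0.024159 + 0.117375 + 0.118569 \\
                      &\approx 1.2721, \\
\hat{N}_{R} &= N_{g} \times \widehat{\mathrm{Ra}} = 197 \times 1.2721 \approx 250.6, \text{ rounded up to } 251 .
\end{align*}
```

The regression-based ratio (1.2721) is somewhat lower than the one given by the Raschpower method (1.34), which explains the difference between the two sample sizes in the table (251 versus 264 patients per group).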
Discussion
The fact that the number of subjects given by the classical formula, based on the latent trait, has to be increased using the ratio to reach the expected power for Rasch modelling could deliver a wrong message. Indeed, it could be interpreted as if Rasch models required more subjects than CTT-based analyses would. In fact, the classical formula is directly computed from the expected difference between the latent traits in both groups and the latent trait's variance in each group, assumed to be equal. By doing so, we assume that the means and variance of the latent traits are "perfectly" known and thus do not take into account the fact that the latent trait is not an observed (manifest) variable. Hence, its estimation requires the use of a model, which creates uncertainty, unlike scores that can be directly observed and measured. This uncertainty is taken into account by adjusting the sample size using the ratio to obtain an adequately sized study for Rasch modelling. Moreover, it has been underlined that the so-called effect size (difference in means over the standard deviation) on the score scale was lower than the corresponding effect size on the latent trait scale. Consequently, the sample size requested for CTT-based analysis using the effect size on the score scale is higher than its counterpart on the latent trait scale.
The proposed method can be used with confidence when J lies between 3 and 20 and especially when the variance of the latent trait is expected to be higher than 1. Otherwise (when σ^{2} < 1), the Raschpower method should be preferred since the ratio-based approach might under- or overestimate the sample size. One of the limitations of our study is that we focused on one of the most well-known IRT models, the Rasch model. The Raschpower method has also been developed for other models that are well suited for the analysis of polytomous item responses, such as the Partial Credit Model or the Rating Scale Model (Hardouin, under revision). Moreover, the Raschpower method has recently been extended to deal with longitudinal designs [28] and it might be expected that this ratio would also be worthwhile in these contexts. Finally, the Raschpower method (for dichotomous and polytomous items and for cross-sectional and longitudinal designs) and the ratio-based approach (for dichotomous items) have been implemented in the free Raschpower module available at the website PRO-online http://pro-online.univ-nantes.fr.
Conclusion
Using the classical formula for normally distributed endpoints along with the proposed ratio, which depends only on the number of items and the variance of the latent trait, can provide a straightforward and reliable formula for sample size computation for subsequent Rasch-based analysis of PRO data.
Declarations
Acknowledgments
This study was supported by the French National Research Agency, under reference N 2010 PRSP 008 01.
Authors’ Affiliations
References
 1. Smith EM, Pang H, Cirrincione C, Fleishman S, Paskett ED, Ahles T, Bressler LR, Fadul CE, Knox C, Le-Lindqwister N, Gilman PB, Shapiro CL, Alliance for Clinical Trials in Oncology: Effect of duloxetine on pain, function, and quality of life among patients with chemotherapy-induced painful peripheral neuropathy: a randomized clinical trial. JAMA. 2013, 309: 1359-1367.
 2. Lamy A, Devereaux PJ, Prabhakaran D, Taggart DP, Hu S, Paolasso E, Straka Z, Piegas LS, Akar AR, Jain AR, Noiseux N, Padmanabhan C, Bahamondes JC, Novick RJ, Vaijyanath P, Reddy SK, Tao L, Olavegogeascoechea PA, Airan B, Sulling TA, Whitlock RP, Ou Y, Pogue J, Chrolavicius S, Yusuf S, CORONARY Investigators: Effects of off-pump and on-pump coronary-artery bypass grafting at 1 year. N Engl J Med. 2013, 368: 1179-1188.
 3. Cunningham MA, Swanson V, Holdsworth RJ, O'Carroll RE: Late effects of a brief psychological intervention in patients with intermittent claudication in a randomized clinical trial. Br J Surg. 2013, 100: 756-760.
 4. Cartwright M, Hirani SP, Rixon L, Beynon M, Doll H, Bower P, Bardsley M, Steventon A, Knapp M, Henderson C, Rogers A, Sanders C, Fitzpatrick R, Barlow J, Newman SP, Whole Systems Demonstrator Evaluation Team: Effect of telehealth on quality of life and psychological outcomes over 12 months (Whole Systems Demonstrator telehealth questionnaire study): nested study of patient reported outcomes in a pragmatic, cluster randomised controlled trial. BMJ. 2013, 346: f653
 5. Weis J, Arraras JI, Conroy T, Efficace F, Fleissner C, Görög A, Hammerlid E, Holzner B, Jones L, Lanceley A, Singer S, Wirtz M, Flechtner H, Bottomley A: Development of an EORTC quality of life phase III module measuring cancer-related fatigue (EORTC QLQ-FA13). Psychooncology. 2013, 22: 1002-1007.
 6. Cella D, Riley W, Stone A, Rothrock N, Reeve B, Yount S, Amtmann D, Bode R, Buysse D, Choi S, Cook K, Devellis R, DeWalt D, Fries JF, Gershon R, Hahn EA, Lai JS, Pilkonis P, Revicki D, Rose M, Weinfurt K, Hays R: The patient-reported outcomes measurement information system (PROMIS) developed and tested its first wave of adult self-reported health outcome item banks: 2005–2008. J Clin Epidemiol. 2010, 63: 1179-1194.
 7. Langer MM, Hill CD, Thissen D, Burwinkle TM, Varni JW, DeWalt DA: Item response theory detected differential item functioning between healthy and ill children in quality-of-life measures. J Clin Epidemiol. 2008, 61: 268-276.
 8. Fischer GH, Molenaar IW: Rasch Models: Foundations, Recent Developments, and Applications. 1995, New York: Springer-Verlag
 9. Sébille V, Hardouin JB, Mesbah M: Sequential analysis of latent variables using mixed-effect latent variable models: impact of non-informative and informative missing data. Stat Med. 2007, 26: 4889-4904.
 10. Hardouin JB, Conroy R, Sébille V: Imputation by the mean score should be avoided when validating a patient reported outcomes questionnaire by Rasch model in presence of informative missing data. BMC Med Res Meth. 2011, 11: 105
 11. De Bock E, Hardouin JB, Blanchin M, Le Neel T, Kubis G, Bonnaud-Antignac A, Dantan E, Sébille V: Rasch-family models are more valuable than score-based approaches for analyzing longitudinal PRO with missing data. Stat Meth Med Res. in press
 12. Andrich D: Rating scales and Rasch measurement. Expert Rev Pharmacoecon Outcomes Res. 2011, 11: 571-585.
 13. Ravens-Sieberer U, Herdman M, Devine J, Otto C, Bullinger M, Rose M, Klasen F: The European KIDSCREEN approach to measure quality of life and well-being in children: development, current application, and future advances. Qual Life Res. in press
 14. Waller J, Ostini R, Marlow LAV, McCaffery K, Zimet G: Validation of a measure of knowledge about human papillomavirus (HPV) using item response theory and classical test theory. Prev Med. 2013, 56: 35-40.
 15. Sapin C, Simeoni MC, El Khammar M, Antoniotti S, Auquier P: Reliability and validity of the VSP-A, a health-related quality of life instrument for ill and healthy adolescents. J Adolesc Health. 2005, 36: 327-336.
 16. Begg C, Cho M, Eastwood S, Horton R, Moher D, Olkin I, Pitkin R, Rennie D, Schulz KF, Simel D, Stroup DF: Improving the quality of reporting of randomized controlled trials: the CONSORT statement. JAMA. 1996, 276: 637-639.
 17. Schulz KF, Altman DG, Moher D, CONSORT Group: CONSORT 2010 Statement: Updated Guidelines for Reporting Parallel Group Randomized Trials. Ann Intern Med. 2010, 152: 726-732.
 18. von Elm E, Altman DG, Egger M, Pocock SJ, Gotzsche PC, Vandenbroucke JP: The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. Ann Intern Med. 2007, 147: 573-577.
 19. Des Jarlais DC, Lyles C, Crepaz N: Improving the reporting quality of nonrandomized evaluations of behavioral and public health interventions: the TREND statement. Am J Public Health. 2004, 94: 361-366.
 20. Calvert M, Blazeby J, Altman DG, Revicki DA, Moher D, Brundage MD, CONSORT PRO Group: Reporting of patient-reported outcomes in randomized trials: the CONSORT PRO extension. JAMA. 2013, 309: 814-822.
 21. US Department of Health and Human Services: Guidance for Industry (Patient-reported outcome measures: use in medical product development to support labeling claims). http://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM193282.pdf
 22. Sébille V, Hardouin JB, Le Neel T, Kubis G, Boyer F, Guillemin F, Falissard B: Methodological issues regarding power of classical test theory and IRT-based approaches for the comparison of Patient-Reported Outcome measures – A simulation study. BMC Med Res Meth. 2010, 10: 24
 23. Hardouin JB, Amri S, Feddag ML, Sébille V: Towards power and sample size calculations for the comparison of two groups of patients with item response theory models. Stat Med. 2012, 31: 1277-1290.
 24. Julious SA: Sample sizes for clinical trials with normal data. Stat Med. 2004, 30: 1921-1986.
 25. Glas CAW: The derivation of some tests for the Rasch model from the multinomial distribution. Psychometrika. 1988, 53: 525-546.
 26. Blanchin M, Hardouin JB, Le Neel T, Kubis G, Blanchard C, Miraillé E, Sébille V: Comparison of CTT and IRT based-approach for the analysis of longitudinal Patient Reported Outcome. Stat Med. 2011, 30: 825-838.
 27. Hardouin JB, Audureau E, Leplège A, Coste J: Spatio-temporal Rasch analysis of Quality of life outcomes in the French general population. Measurement invariance and group comparisons. BMC Med Res Meth. 2012, 212: 182
 28. Feddag ML, Blanchin M, Hardouin JB, Sébille V: Power analysis on the time effect for the longitudinal Rasch model. J Appl Meas. 2014, in press
Pre-publication history
 The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2288/14/87/prepub
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.