Ridge parameter

Stef van Buuren, Netherlands Organization for Applied Scientific Research TNO

15 February 2013

The paper by Hardt, Herke and Leonhart is a welcome addition to the literature. It warns against simplistic approaches that throw just anything into the imputation model. While the imputation model is generally robust against the inclusion of junk variables, the paper clearly demonstrates that we should not push this to the extreme. In general, building the imputation model requires appropriate care. My personal experience is that it is not beneficial to include more than, say, 25 well-chosen variables in the imputation model.

In their simulations the authors investigate cases where the number of variables specified in the imputation model exceeds the number of cases. Many programs break down in this situation, but MICE will run because it uses ridge regression instead of the usual OLS estimate. The price for this increased computational stability is, as confirmed by Hardt et al., that the parameter estimates will be biased towards zero. It is therefore likely that some of the bias observed by the authors is not intrinsic to PMM, but rather due to the setting of the ridge parameter (the default value 1E-5 is easily changed, e.g., mice(..., ridge = 1E-6)). Would a tighter ridge setting (e.g., 1E-6 or 1E-7) appreciably reduce the bias?

The '1 out of 3 of the complete cases' rule is interesting and easily remembered. However, a complication in practice is that real data often contain no complete cases at all, especially in merged datasets. What would the authors think of the slightly more liberal rule 'n/3 variables'?

Stef van Buuren

Competing interests: Author of the mice package.
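As a minimal sketch of the suggestion above: the ridge parameter can be tightened directly in the call to mice(). This assumes the nhanes example dataset that ships with the mice package; the seed and number of imputations are illustrative choices, not recommendations.

```r
# Load the mice package; nhanes is a small example dataset included with it.
library(mice)

# Impute with a tighter ridge penalty than the default 1E-5,
# as discussed in the text.
imp <- mice(nhanes, ridge = 1e-6, m = 5, seed = 123, print = FALSE)

# Inspect the completed data from the first imputation.
head(complete(imp, 1))
```

Whether such a tighter setting appreciably reduces the bias towards zero is precisely the open question posed to the authors.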