Archived Comments for: Comparison of methods for imputing limited-range variables: a simulation study

  1. How much does this article add to prior research?

    Paul von Hippel, University of Texas

    12 June 2015

    I was interested in Rodwell et al.'s [1] article on the topic of imputing limited-range variables, and pleased to see that they reached approximately the same conclusion as my earlier article on imputing skewed variables [2]. Like me, Rodwell et al. concluded that imputing skewed variables as though they were normal can produce good, though not entirely unbiased, estimates for some quantities. Like me, Rodwell et al. concluded that methods that try to “correct” the imputations by rounding, transformation, or truncation often make biases worse instead of better.

    Unfortunately, Rodwell et al.'s description of my prior research is incomplete and gives the impression that their results are more novel than they are. In introducing their article, Rodwell et al. cite “the limited comparison of methods for handling limited-range variables to date.” As an example of those limitations, they report that my prior article “was fairly restrictive as [it] only considered data from an exponential distribution with the lower range restricted.” In the prepublication history I see that one of the reviewers suggested that my article was limited to the case where variables were missing completely at random (MCAR).

    In fact my article was not limited to exponential MCAR data. While I did begin with exponential MCAR data, I continued with a broad simulation that considered 4 different distributions (including but not limited to the exponential); 3 different patterns of missing values (including but not limited to the MCAR pattern); data with 1, 2, and 3 variables, and 5 different relationships between the incomplete variable and the complete variables; 7 different methods of imputation; and 8 different estimands. The simulation was a lot of work and offered a lot of evidence about the properties of different estimators under different circumstances. It troubles me that authors and reviewers working in this area apparently did not see the simulation when they reviewed my article. It might have helped them to identify gaps in knowledge and make a more novel contribution themselves.

    To their credit, Rodwell et al. did evaluate the method of predictive mean matching, a method that I did not cover. There is definitely room for further research in this area.

    Sincerely yours,
    Paul von Hippel
    University of Texas, Austin

    1. Rodwell L, Lee KJ, Romaniuk H, Carlin JB. Comparison of methods for imputing limited-range variables: a simulation study. BMC Medical Research Methodology. 2014;14(1):57. doi:10.1186/1471-2288-14-57.
    2. von Hippel PT. Should a Normal Imputation Model be Modified to Impute Skewed Variables? Sociological Methods & Research. 2013;42(1):105-138. doi:10.1177/0049124112464866.

     

    Competing interests

    I have no competing interests except that I wrote a previous article on the subject.

  2. Response to previous comment

    Laura Rodwell, Murdoch Childrens Research Institute

    26 October 2015

    My co-authors and I would like to thank Paul von Hippel for his comments on our paper, and we apologise that our review of his simulation study on the multiple imputation of skewed variables (2) did not reflect its full breadth. We encourage those who are interested to read von Hippel’s paper.

    As von Hippel noted, there were some similarities between our study, which compared methods for the imputation of limited-range variables (1), and von Hippel’s study, which examined methods for the imputation of skewed variables. Reassuringly, the findings of the two studies were generally consistent, with both concluding that it may not be appropriate to round imputed values to the plausible range of a continuous variable. The practice of rounding imputed values to the plausible range appears to be commonplace, and we think that the consistency in findings between the two studies is an important step towards reducing this practice.

    However, we consider the main focus of our study to be somewhat different from von Hippel’s. We looked more directly at limited-range variables that were restricted at both ends of their range, and we included a weakly skewed, approximately normal variable in addition to variables with moderate and severe skew.

    As von Hippel mentions, we also examined the performance of predictive mean matching (PMM). The results for PMM as presented in the paper did not generally support its use, with a tendency for the method to produce coverage rates lower than the nominal 95%. In our discussion we speculated that this may have been due to the ‘Type 2’ matching algorithm used as part of Stata’s “mi impute chained” command. (Morris et al. (3) provide details on different possible matching algorithms for PMM.) For my PhD thesis, I also looked at the performance of PMM using ‘Type 1’ matching and found that this method performed well, with negligible bias and good coverage. For those wishing to use PMM for the imputation of limited-range variables, we recommend the use of the Type 1 matching method.
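
    The difference between the two matching types can be sketched in a few lines. The following is a minimal illustration only, not the code used in our paper or in Stata: it assumes the usual definitions (as described by Morris et al. (3)), in which Type 1 matching predicts for the observed donors with the fitted coefficients and for the missing cases with coefficients drawn from their posterior, while Type 2 uses the drawn coefficients for both. The function name `pmm_impute`, its arguments, and the default of `k=5` donors are hypothetical choices made for this sketch.

```python
import numpy as np


def pmm_impute(X_obs, y_obs, X_mis, matching="type1", k=5, rng=None):
    """Single predictive mean matching imputation (illustrative sketch).

    matching="type1": donors predicted with beta_hat, recipients with beta_star.
    matching="type2": donors and recipients both predicted with beta_star.
    """
    if matching not in ("type1", "type2"):
        raise ValueError("matching must be 'type1' or 'type2'")
    rng = np.random.default_rng() if rng is None else rng

    # Design matrices with an intercept column.
    A_obs = np.column_stack([np.ones(len(X_obs)), X_obs])
    A_mis = np.column_stack([np.ones(len(X_mis)), X_mis])
    n, p = A_obs.shape

    # Least-squares fit on the observed cases.
    XtX_inv = np.linalg.inv(A_obs.T @ A_obs)
    beta_hat = XtX_inv @ A_obs.T @ y_obs
    resid = y_obs - A_obs @ beta_hat

    # Draw (sigma^2, beta) from their approximate posterior under a
    # normal linear model with a noninformative prior.
    sigma2_star = (resid @ resid) / rng.chisquare(n - p)
    beta_star = rng.multivariate_normal(beta_hat, sigma2_star * XtX_inv)

    # The only difference between the two types is the coefficient
    # vector used to predict for the observed (donor) cases.
    donor_beta = beta_hat if matching == "type1" else beta_star
    pred_obs = A_obs @ donor_beta
    pred_mis = A_mis @ beta_star

    # For each missing case, pick one of the k donors with the closest
    # predicted mean at random and copy that donor's observed value.
    imputed = np.empty(len(X_mis))
    for i, pm in enumerate(pred_mis):
        nearest = np.argsort(np.abs(pred_obs - pm))[:k]
        imputed[i] = y_obs[rng.choice(nearest)]
    return imputed
```

    Because every imputed value is copied from an observed donor, imputations from either matching type stay within the observed range of the variable. A full multiple imputation would repeat the call once per imputed dataset, with a fresh posterior draw each time.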

    Laura Rodwell

    1. Rodwell L, Lee KJ, Romaniuk H, Carlin JB. Comparison of methods for imputing limited-range variables: a simulation study. BMC Medical Research Methodology. 2014;14(1):57. doi:10.1186/1471-2288-14-57.
    2. von Hippel PT. Should a Normal Imputation Model be Modified to Impute Skewed Variables? Sociological Methods & Research. 2013;42(1):105-138. doi:10.1177/0049124112464866.
    3. Morris TP, White IR, Royston P. Tuning multiple imputation by predictive mean matching and local residual draws. BMC Medical Research Methodology. 2014;14(1):75.

    Competing interests

    None 
