Article,

Tuning multiple imputation by predictive mean matching and local residual draws.

T. Morris, I. White, and P. Royston.
BMC medical research methodology (January 2014). Note: Censored data; Multiple imputation.
DOI: 10.1186/1471-2288-14-75

Abstract

BACKGROUND: Multiple imputation is a commonly used method for handling incomplete covariates as it can provide valid inference when data are missing at random. This depends on being able to correctly specify the parametric model used to impute missing values, which may be difficult in many realistic settings. Imputation by predictive mean matching (PMM) borrows an observed value from a donor with a similar predictive mean; imputation by local residual draws (LRD) instead borrows the donor's residual. Both methods relax some assumptions of parametric imputation, promising greater robustness when the imputation model is misspecified.

METHODS: We review development of PMM and LRD and outline the various forms available, and aim to clarify some choices about how and when they should be used. We compare performance to fully parametric imputation in simulation studies, first when the imputation model is correctly specified and then when it is misspecified.

RESULTS: In using PMM or LRD we strongly caution against using a single donor, the default value in some implementations, and instead advocate sampling from a pool of around 10 donors. We also clarify which matching metric is best. Among the current MI software there are several poor implementations.

CONCLUSIONS: PMM and LRD may have a role for imputing covariates (i) which are not strongly associated with outcome, and (ii) when the imputation model is thought to be slightly but not grossly misspecified. Researchers should spend efforts on specifying the imputation model correctly, rather than expecting predictive mean matching or local residual draws to do the work.
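The difference between the two donor-based methods described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `donor_impute`, its parameters, and the use of a plain least-squares fit are assumptions made for illustration. In particular, the sketch matches on point-estimate predictions and ignores uncertainty in the regression coefficients, which a proper multiple imputation procedure would need to reflect. The default pool size of 10 follows the abstract's recommendation.

```python
import numpy as np

def donor_impute(X_obs, y_obs, X_mis, k=10, method="pmm", rng=None):
    """Impute one value per row of X_mis from a pool of k donors.

    Hypothetical helper for illustration only.
    method="pmm": borrow the donor's observed value.
    method="lrd": borrow the donor's residual and add it to the
                  recipient's own predicted mean.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Linear imputation model with intercept, fitted on observed cases.
    A_obs = np.column_stack([np.ones(len(X_obs)), X_obs])
    A_mis = np.column_stack([np.ones(len(X_mis)), X_mis])
    beta, *_ = np.linalg.lstsq(A_obs, y_obs, rcond=None)

    mu_obs = A_obs @ beta      # predicted means of potential donors
    mu_mis = A_mis @ beta      # predicted means of recipients
    resid = y_obs - mu_obs     # donor residuals, used by LRD

    k = min(k, len(y_obs))     # pool cannot exceed the number of donors
    imputed = np.empty(len(mu_mis))
    for i, m in enumerate(mu_mis):
        # Donor pool: the k observed cases with the closest predicted mean.
        dist = np.abs(mu_obs - m)
        pool = np.argpartition(dist, k - 1)[:k]
        donor = rng.choice(pool)   # sample one donor at random from the pool
        if method == "pmm":
            imputed[i] = y_obs[donor]
        else:
            imputed[i] = m + resid[donor]
    return imputed

# Simulated example data (assumed, not from the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 1.0 + X @ np.array([2.0, -1.0]) + rng.normal(size=200)
X_obs, y_obs, X_mis = X[:150], y[:150], X[150:]

pmm = donor_impute(X_obs, y_obs, X_mis, k=10, method="pmm", rng=rng)
lrd = donor_impute(X_obs, y_obs, X_mis, k=10, method="lrd", rng=rng)
```

Under PMM every imputed value is one that was actually observed, whereas LRD can produce values that never occurred in the data. Setting `k=1` in this sketch reproduces the single-donor behaviour the abstract cautions against.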

BibTeX key: Morris2014
entry type: article
year: 2014
month: 1
journal: BMC medical research methodology
pages: 75
volume: 14
pmid: 24903709
issn: 1471-2288
DOI: 10.1186/1471-2288-14-75
url: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=4051964&tool=pmcentrez&rendertype=abstract
note: Censored data; Multiple imputation


Cite this publication

@article{Morris2014,
  abstract = {BACKGROUND: Multiple imputation is a commonly used method for handling incomplete covariates as it can provide valid inference when data are missing at random. This depends on being able to correctly specify the parametric model used to impute missing values, which may be difficult in many realistic settings. Imputation by predictive mean matching (PMM) borrows an observed value from a donor with a similar predictive mean; imputation by local residual draws (LRD) instead borrows the donor's residual. Both methods relax some assumptions of parametric imputation, promising greater robustness when the imputation model is misspecified. METHODS: We review development of PMM and LRD and outline the various forms available, and aim to clarify some choices about how and when they should be used. We compare performance to fully parametric imputation in simulation studies, first when the imputation model is correctly specified and then when it is misspecified. RESULTS: In using PMM or LRD we strongly caution against using a single donor, the default value in some implementations, and instead advocate sampling from a pool of around 10 donors. We also clarify which matching metric is best. Among the current MI software there are several poor implementations. CONCLUSIONS: PMM and LRD may have a role for imputing covariates (i) which are not strongly associated with outcome, and (ii) when the imputation model is thought to be slightly but not grossly misspecified. Researchers should spend efforts on specifying the imputation model correctly, rather than expecting predictive mean matching or local residual draws to do the work.},
  added-at = {2023-02-03T11:44:35.000+0100},
  author = {Morris, Tim P and White, Ian R and Royston, Patrick},
  biburl = {https://www.bibsonomy.org/bibtex/22c5b11b8c952f24c3ec49c24c78baf0d/jepcastel},
  doi = {10.1186/1471-2288-14-75},
  interhash = {eae5078424ea080844063eaf96944d50},
  intrahash = {2c5b11b8c952f24c3ec49c24c78baf0d},
  issn = {1471-2288},
  journal = {BMC medical research methodology},
  keywords = {Albumins Albumins:analysis BiomedicalResearch BiomedicalResearch:methods ComputerSimulation DataInterpretation GlandularandEpithelial GlandularandEpithelial:blood GlandularandEpithelial:mortality Humans Models Neoplasms OvarianNeoplasms OvarianNeoplasms:blood OvarianNeoplasms:mortality SerumAlbumin SerumAlbumin:analysis Statistical},
  month = {1},
  note = {Censored data; Multiple imputation},
  pages = 75,
  pmid = {24903709},
  timestamp = {2023-02-03T11:44:35.000+0100},
  title = {Tuning multiple imputation by predictive mean matching and local residual draws.},
  url = {http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=4051964&tool=pmcentrez&rendertype=abstract},
  volume = 14,
  year = 2014
}
