Article

Using classification tree analysis to generate propensity score weights.

Journal of Evaluation in Clinical Practice, 23 (4): 703-712 (August 2017)
Keywords: Propensity score; Classification trees; Machine learning.
DOI: 10.1111/jep.12744

Abstract

RATIONALE, AIMS AND OBJECTIVES In evaluating non-randomized interventions, propensity scores (PS) estimate the probability of assignment to the treatment group given observed characteristics. Machine learning algorithms have been proposed as an alternative to conventional logistic regression for modelling PS in order to avoid limitations of linear methods. We introduce classification tree analysis (CTA) as a method for generating PS. CTA is a "decision-tree"-like classification model that provides accurate, parsimonious decision rules that are easy to display and interpret, reports P values derived via permutation tests, and evaluates cross-generalizability.

METHOD Using empirical data, we identify all statistically valid CTA PS models and then use them to compute strata-specific, observation-level PS weights that are subsequently applied in outcomes analyses. We compare findings obtained using this framework with those from logistic regression and boosted regression by evaluating covariate balance using standardized differences, model predictive accuracy, and treatment effect estimates obtained using median regression and a weighted CTA outcomes model.

RESULTS While all models had some imbalanced covariates, main-effects logistic regression yielded the lowest average standardized difference, whereas CTA yielded the greatest predictive accuracy. Nevertheless, treatment effect estimates were generally consistent across all models.

CONCLUSIONS Assessing standardized differences in means as a test of covariate balance is inappropriate for machine learning algorithms that segment the sample into two or more strata. Because the CTA algorithm identifies all statistically valid PS models for a sample, it is most likely to identify a correctly specified PS model, and should be considered as an alternative approach to modelling the PS.
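The step of turning tree strata into observation-level weights, and the standardized-difference balance check, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state the weight formula, so the sketch assumes standard inverse-probability-of-treatment weights, with each observation's PS taken as the treated proportion in its stratum (tree end point). The function names `stratum_propensity_weights` and `weighted_std_diff` and the toy data are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def stratum_propensity_weights(strata, treated):
    """Assumed scheme: PS = share of treated units in the observation's stratum;
    weight = 1/PS for treated units and 1/(1 - PS) for controls."""
    counts = defaultdict(lambda: [0, 0])  # stratum -> [n_total, n_treated]
    for s, t in zip(strata, treated):
        counts[s][0] += 1
        counts[s][1] += t
    ps, weights = [], []
    for s, t in zip(strata, treated):
        n_total, n_treated = counts[s]
        p = n_treated / n_total
        if p in (0.0, 1.0):
            # A stratum with only treated or only control units has no overlap.
            raise ValueError(f"stratum {s!r} contains a single treatment group")
        ps.append(p)
        weights.append(1 / p if t else 1 / (1 - p))
    return ps, weights


def weighted_std_diff(x, treated, weights):
    """Weighted standardized difference in means for one covariate."""
    def moments(group):
        pairs = [(w, v) for w, v, t in zip(weights, x, treated) if t == group]
        total = sum(w for w, _ in pairs)
        mean = sum(w * v for w, v in pairs) / total
        var = sum(w * (v - mean) ** 2 for w, v in pairs) / total
        return mean, var

    mean_t, var_t = moments(1)
    mean_c, var_c = moments(0)
    pooled_sd = sqrt((var_t + var_c) / 2)
    return (mean_t - mean_c) / pooled_sd if pooled_sd > 0 else 0.0


if __name__ == "__main__":
    # Toy data: two strata, treatment concentrated in stratum "a".
    strata = ["a", "a", "a", "a", "b", "b", "b", "b"]
    treated = [1, 1, 1, 0, 1, 0, 0, 0]
    in_a = [1 if s == "a" else 0 for s in strata]  # covariate defining the split
    ps, w = stratum_propensity_weights(strata, treated)
    print("unweighted:", weighted_std_diff(in_a, treated, [1.0] * len(strata)))
    print("weighted:  ", weighted_std_diff(in_a, treated, w))
```

In this toy case the weights balance the stratum-defining covariate exactly, since the PS is constant within each stratum. This only shows the mechanics of the calculation; the abstract itself cautions that standardized differences in means are not an appropriate balance test for algorithms that segment the sample into strata.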
