Add augmented synthetic control model #365

Open
@drbenvincent

Currently, "vanilla" synthetic control works with cp.pymc_experiments.SyntheticControl as the experiment class, which is given cp.pymc_models.WeightedSumFitter as the model.

cp.pymc_models.WeightedSumFitter is what implements the vanilla synthetic control model: the control-unit weights are constrained to sum to 1, which is done via a Dirichlet distribution.
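To make the sum-to-1 constraint concrete, here is a minimal NumPy sketch. It is not the WeightedSumFitter code; the flat Dirichlet (all concentration parameters equal to 1), the number of controls, and the simulated control data are assumptions chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_controls = 6  # assumed number of control units
n_draws = 1000

# Each row is one draw of control-unit weights from a flat Dirichlet.
# Every draw is non-negative and sums to 1 by construction.
weights = rng.dirichlet(alpha=np.ones(n_controls), size=n_draws)

# Simulated control-unit outcomes (20 time points), for illustration only.
X = rng.normal(size=(20, n_controls))

# Synthetic control prediction for each draw: a weighted sum of the controls.
mu = X @ weights.T  # shape (20, n_draws)
```

Because every draw lies on the simplex, each prediction is a convex combination of the control units, so it can never fall outside the range spanned by the controls at a given time point.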

We want to add the ability to do augmented synthetic control. This will still use the cp.pymc_experiments.SyntheticControl (cp.SyntheticControl) experiment class, but we will feed it a new model, something like cp.pymc_models.AugmentedSyntheticControlModel. (However, see below, because we may not need a new model.)

Implementation notes

As far as I understand, the algorithm for augmented synthetic control is along the lines of:

  • Based on the pre-treatment data, fit vanilla synthetic control model where weights are constrained to sum to 1.
  • Calculate the residuals between the model's pre-treatment predictions and the observations.
  • Fit a model to these residuals.
  • Use the predictions of that model to adjust the synthetic control predictions.
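The four steps above can be sketched as point estimates in plain NumPy. This is a rough illustration under stated assumptions, not a proposed CausalPy implementation: the helper names (project_to_simplex, fit_simplex_weights, augmented_sc) are hypothetical, the weights are fitted by projected gradient descent rather than a Dirichlet prior, the residual model is an ordinary least squares fit of an intercept and linear trend, and the data are simulated.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1), 0.0)


def fit_simplex_weights(X, y, n_iter=5000):
    """Least squares for y ~ X @ w with w on the simplex (projected gradient)."""
    step = 1.0 / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    w = np.full(X.shape[1], 1.0 / X.shape[1])
    for _ in range(n_iter):
        w = project_to_simplex(w - step * X.T @ (X @ w - y))
    return w


def augmented_sc(X_pre, y_pre, X_post):
    n_pre, n_post = len(X_pre), len(X_post)
    # Step 1: vanilla synthetic control, weights constrained to sum to 1.
    w = fit_simplex_weights(X_pre, y_pre)
    # Step 2: pre-treatment residuals.
    resid = y_pre - X_pre @ w
    # Step 3: fit the residuals with an intercept and linear trend.
    t = np.arange(n_pre + n_post, dtype=float)
    D = np.column_stack([np.ones_like(t), t])
    coef, *_ = np.linalg.lstsq(D[:n_pre], resid, rcond=None)
    # Step 4: adjust the synthetic control predictions over the whole series.
    y_hat = np.vstack([X_pre, X_post]) @ w + D @ coef
    return w, coef, y_hat


# Simulated data, for illustration only.
rng = np.random.default_rng(0)
n_pre, n_post, n_controls = 40, 10, 6
X = rng.normal(size=(n_pre + n_post, n_controls)).cumsum(axis=0)
true_w = rng.dirichlet(np.ones(n_controls))
t = np.arange(n_pre + n_post)
y = X @ true_w + 2.0 + 0.1 * t + rng.normal(scale=0.1, size=t.size)

w, coef, y_hat = augmented_sc(X[:n_pre], y[:n_pre], X[n_pre:])
```

In this sketch the adjustment can only improve the pre-treatment fit, since setting both residual coefficients to zero recovers the unadjusted prediction.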

That need not be done in separate steps. What you could do is to have a model where the weightings of the control groups are constrained to sum to 1, but then simply add in more components to the model, such as an intercept and trend. For example, the model formula in one of the examples is currently:

Denmark ~ 0 + Austria + Belgium + Bulgaria + Croatia + Cyprus + Czech_Republic

but you could implement augmented synthetic control with something like

Denmark ~ 1 + trend + Austria + Belgium + Bulgaria + Croatia + Cyprus + Czech_Republic

You would still have to ensure that the weights of the control units are constrained to sum to 1, while the 1 (intercept) and trend predictors are weighted by 'unconstrained' coefficients.

So in practice we might want to keep the original model formula and add a second formula such as residuals ~ 1 + trend, or something similar. It could be simpler, though, to write a custom model specified with something like:

  • control_units = ["Austria", "Belgium", "Bulgaria", "Croatia", "Cyprus", "Czech_Republic"]
  • residuals ~ 1 + trend
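As a rough sketch of the single-step alternative, the snippet below fits the control-unit weights (constrained to the simplex) and the unconstrained intercept and trend coefficients jointly. It is a least-squares point-estimate analogue written in NumPy, not a PyMC model; the function names fit_joint and project_to_simplex are hypothetical, the trend is rescaled to [0, 1] as an assumption to keep the problem well conditioned, and the data are simulated.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1), 0.0)


def fit_joint(X, D, y, n_iter=20000):
    """Minimise ||y - X @ w - D @ b||^2 with w on the simplex and b free."""
    A = np.hstack([X, D])
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    n_w = X.shape[1]
    theta = np.concatenate([np.full(n_w, 1.0 / n_w), np.zeros(D.shape[1])])
    for _ in range(n_iter):
        theta = theta - step * A.T @ (A @ theta - y)
        # Only the control-unit weights are projected; b stays unconstrained.
        theta[:n_w] = project_to_simplex(theta[:n_w])
    return theta[:n_w], theta[n_w:]


control_units = ["Austria", "Belgium", "Bulgaria", "Croatia", "Cyprus", "Czech_Republic"]

# Simulated pre-treatment data, for illustration only.
rng = np.random.default_rng(1)
n_obs = 40
X = rng.normal(size=(n_obs, len(control_units))).cumsum(axis=0)
trend = np.linspace(0.0, 1.0, n_obs)  # trend rescaled to [0, 1] (assumption)
D = np.column_stack([np.ones(n_obs), trend])  # the "1 + trend" part
y = X @ rng.dirichlet(np.ones(len(control_units))) + 2.0 + 3.0 * trend

w, b = fit_joint(X, D, y)
weights = dict(zip(control_units, w))
```

A Bayesian version would presumably keep the Dirichlet prior on the weights and put ordinary unconstrained priors on the intercept and trend coefficients.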

Labels

enhancement (New feature or request), geo project (Related to geo-testing)
