Multiple Treatments Uplift Model Using Python Package CausalML

Multiple treatments ITE/CATE and ATE estimation using meta-learner uplift model in Python

An experiment sometimes has multiple treatment groups that are each compared with a control group. In this tutorial, we will talk about how to use the Python package CausalML to build meta-learner uplift models for an experiment with multiple treatments.

There are different meta-learner algorithms such as S-learner, T-learner, X-learner, and R-learner. We will use S-learner as an example, and other meta-learners can follow the same process. We will cover:

  • How to implement a meta-learner on multiple treatments using the Python package CausalML?
  • How to estimate the average treatment effect (ATE) for multiple treatments?
  • How to estimate the individual treatment effect (ITE) for multiple treatments?
  • How to get the confidence intervals for the average treatment effect (ATE) and individual treatment effect (ITE) estimation?

Let’s get started!

Step 1: Install and Import Libraries

In step 1, we will install and import the Python libraries.

Firstly, let’s install causalml.

# Install package
!pip install causalml

After the installation is completed, we can import the libraries.

  • pandas and numpy are imported for data processing.
  • synthetic_data is imported for synthetic data creation.
  • LRSRegressor, BaseSRegressor, and XGBRegressor are for the machine learning model training.

# Data processing
import pandas as pd
import numpy as np

# Create synthetic data
from causalml.dataset import synthetic_data

# Machine learning model
from causalml.inference.meta import LRSRegressor, BaseSRegressor
from xgboost import XGBRegressor

Step 2: Create Dataset

In step 2, we will create a synthetic dataset for the S-learner uplift model.

Firstly, a random seed is set to make the synthetic dataset reproducible.

Then, using the synthetic_data function from the causalml Python package, we create a dataset with five features, one treatment variable, and one continuous outcome variable.

  • mode is for the type of simulation for synthetic dataset creation. It is based on the paper by Nie X. and Wager S. (2018) titled “Quasi-Oracle Estimation of Heterogeneous Treatment Effects”.
    • 1 is for difficult nuisance components and an easy treatment effect.
    • 2 is for a randomized trial.
    • 3 is for an easy propensity and a difficult baseline.
    • 4 is for unrelated treatment and control groups.
    • 5 is for a hidden confounder biasing treatment.
  • n takes in the number of observations. 5000 observations are created in this example.
  • p takes in the number of covariates. We created 5 covariates for this dataset.
  • sigma=1 means the standard deviation of the error term is 1.
  • adj is the adjustment term for the distribution of propensity. High values shift the distribution to 0.

The synthetic_data function returns six outputs.

  • y is the outcome.
  • X is a matrix with all the covariates.
  • w is the treatment flag with two values. 0 represents the control group and 1 represents the treatment group. In this tutorial, w is renamed to treatment.
  • tau is the individual treatment effect (ITE). In this tutorial, tau is renamed to ite.
  • b is the expected outcome.
  • e is the propensity score for receiving treatment.
# Set a seed for reproducibility
np.random.seed(1)

# Generate synthetic data using mode 1
y, X, treatment, ite, b, e = synthetic_data(mode=1, n=5000, p=5, sigma=1, adj=0)

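The synthetic_data function creates one control group and one treatment group. We use the random function from numpy to randomly split the treatment group into two treatment groups of roughly equal size, treatment_1 and treatment_2. Because the split is random, the two treatment groups have the same true treatment effect by construction.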

# Create multiple treatments
treatment = np.array([('treatment_1' if np.random.random() > 0.5 else 'treatment_2') 
                      if t==1 else 'control' for t in treatment])

After that, using value_counts on the treatment variable, we can see that out of 5000 samples, 2421 units did not receive treatments, 1235 received treatment 1, and 1344 received treatment 2.

# Check treatment vs. control counts
pd.Series(treatment).value_counts()

Outputs:

control        2421
treatment_2    1344
treatment_1    1235
dtype: int64
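
The list comprehension above draws one random number per treated unit. As an alternative, here is a minimal vectorized sketch of the same kind of split. It runs on a small made-up array (flags is a hypothetical stand-in for the binary treatment flag) and uses numpy's Generator API, so it will not reproduce the exact counts shown above.

```python
import numpy as np

# Hypothetical stand-in for the binary treatment flag (0 = control, 1 = treated)
flags = np.array([0, 1, 1, 0, 1, 1, 0, 1, 1, 1])

# Seeded generator so the split is reproducible
rng = np.random.default_rng(1)

# Draw one candidate arm per unit, then keep it only for the treated units
arms = rng.choice(['treatment_1', 'treatment_2'], size=flags.shape[0])
labels = np.where(flags == 1, arms, 'control')

# Count the units in each group
groups, counts = np.unique(labels, return_counts=True)
print({str(g): int(c) for g, c in zip(groups, counts)})
```

Control units always keep the control label, and each treated unit lands in either arm with equal probability.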

Step 3: Multiple Treatments Average Treatment Effect (ATE)

In step 3, we will estimate the average treatment effect (ATE) for multiple treatments.

The Python causalml package provides two classes for building S-learner models.

  • LRSRegressor is a built-in ordinary least squares (OLS) S-learner model that comes with the CausalML package.
  • BaseSRegressor is a generalized class that can take in existing machine learning models from packages such as sklearn and xgboost, and run S-learners with those models.

To estimate the average treatment effect (ATE) for multiple treatments using LRSRegressor, we first instantiate the LRSRegressor, then get the average treatment effect (ATE) and its lower bound and upper bound using the estimate_ate method.

Note that because we have more than one treatment, control_name='control' needs to be specified so the meta-learner knows which group is the control group.

# Use LRSRegressor
lr = LRSRegressor(control_name='control')

# Estimated ATE, upper bound, and lower bound
ate, lb, ub = lr.estimate_ate(X, treatment, y)

# Print out results
print('Average Treatment Effect for treatment 1: {:.2f} ({:.2f}, {:.2f})'.format(ate[0], lb[0], ub[0]))
print('Average Treatment Effect for treatment 2: {:.2f} ({:.2f}, {:.2f})'.format(ate[1], lb[1], ub[1]))

The output includes the average treatment effect (ATE) for every treatment.

  • The estimated average treatment effect (ATE) for treatment 1 is 0.76. The lower bound is 0.68 and the upper bound is 0.84.
  • The estimated average treatment effect (ATE) for treatment 2 is 0.71. The lower bound is 0.63 and the upper bound is 0.79.
Average Treatment Effect for treatment 1: 0.76 (0.68, 0.84)
Average Treatment Effect for treatment 2: 0.71 (0.63, 0.79)
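
To make the S-learner idea more concrete, here is a hand-rolled sketch using numpy least squares. It illustrates the general technique (one model with the treatment indicator as a feature, fitted for each treatment arm against control) and is not CausalML's internal code. The data, the effect sizes, and the helper s_learner_ols_ate are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: two covariates, three groups, and known constant effects
n = 3000
X = rng.uniform(size=(n, 2))
groups = rng.choice(['control', 'treatment_1', 'treatment_2'], size=n)
true_effect = {'control': 0.0, 'treatment_1': 0.5, 'treatment_2': 1.0}
y = (X[:, 0] + 2 * X[:, 1]
     + np.array([true_effect[g] for g in groups])
     + rng.normal(scale=0.1, size=n))


def s_learner_ols_ate(X, groups, y, treatment_name, control_name='control'):
    """Hypothetical helper: S-learner ATE for one treatment arm vs. control."""
    # Keep only the chosen treatment arm and the control group
    mask = (groups == treatment_name) | (groups == control_name)
    w = (groups[mask] == treatment_name).astype(float)

    # Single model: intercept, treatment indicator, and covariates
    design = np.column_stack([np.ones(w.shape[0]), w, X[mask]])
    coef, *_ = np.linalg.lstsq(design, y[mask], rcond=None)

    # Predict every unit's outcome as treated and as untreated
    treated, untreated = design.copy(), design.copy()
    treated[:, 1] = 1.0
    untreated[:, 1] = 0.0
    ite = treated @ coef - untreated @ coef

    # The ATE is the average of the individual effects
    return ite.mean()


ate_1 = s_learner_ols_ate(X, groups, y, 'treatment_1')
ate_2 = s_learner_ols_ate(X, groups, y, 'treatment_2')
print(f'ATE treatment 1: {ate_1:.2f}, ATE treatment 2: {ate_2:.2f}')
```

With a linear model and no interaction terms, every unit gets the same estimated effect (the coefficient on the treatment indicator). A more flexible base model is needed for the estimated effects to vary across individuals.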

To estimate the average treatment effect (ATE) for multiple treatments using BaseSRegressor, we first instantiate the BaseSRegressor. Here we are using the XGBRegressor as the modeling algorithm, but it can be replaced by any other model algorithm. random_state makes the model results reproducible, and control_name tells the meta-learner which group is the control group.

After instantiating the model, we can get the average treatment effect (ATE) and its lower bound and upper bound using the estimate_ate method.

Besides passing in the covariates, the treatment variable, and the outcome variable, we need to specify return_ci=True to get the confidence interval for the estimated average treatment effect (ATE).

# Use XGBRegressor with BaseSRegressor
xgb = BaseSRegressor(XGBRegressor(random_state=42), control_name='control')

# Estimated ATE, upper bound, and lower bound
ate, lb, ub = xgb.estimate_ate(X, treatment, y, return_ci=True)

# Print out results
print('Average Treatment Effect for treatment 1: {:.2f} ({:.2f}, {:.2f})'.format(ate[0], lb[0], ub[0]))
print('Average Treatment Effect for treatment 2: {:.2f} ({:.2f}, {:.2f})'.format(ate[1], lb[1], ub[1]))

We can see from the outputs that:

  • The estimated average treatment effect (ATE) for treatment 1 is 0.59. The lower bound is 0.53 and the upper bound is 0.65.
  • The estimated average treatment effect (ATE) for treatment 2 is 0.57. The lower bound is 0.51 and the upper bound is 0.63.
Average Treatment Effect for treatment 1: 0.59 (0.53, 0.65)
Average Treatment Effect for treatment 2: 0.57 (0.51, 0.63)

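The estimates from BaseSRegressor with XGBRegressor (0.59 and 0.57) differ from the LRSRegressor estimates (0.76 and 0.71), showing that the choice of base model affects the average treatment effect (ATE) estimate. Because the data is synthetic, the true individual treatment effects are available in ite, so the mean of ite can serve as a benchmark for judging which estimate is closer to the truth.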

Step 4: Multiple Treatments Individual Treatment Effect (ITE)

In step 4, we will estimate the individual treatment effect (ITE) for multiple treatments.

The method fit_predict produces the estimated individual treatment effect (ITE).

From the first five results, we can see that the individual treatment effect (ITE) is estimated for both treatments.

# ITE
xgb_ite = xgb.fit_predict(X, treatment, y)

# Take a look at the data
np.matrix(xgb_ite[:5])

The first column corresponds to the individual treatment effect (ITE) for treatment 1 and the second column corresponds to the individual treatment effect (ITE) for treatment 2.

matrix([[ 0.61860538,  0.79667234],
        [ 0.06657565, -0.00587612],
        [ 0.76414907,  0.63103437],
        [ 0.78046882,  0.78824401],
        [ 0.88066435,  0.83072901]])

We can print out the individual treatment effect (ITE) estimation for each record. For example, the first record has an estimated individual treatment effect (ITE) of 0.62 for treatment 1 and 0.80 for treatment 2.

# Print out estimation for one record
print(f'The estimated ITE for treatment 1 for the first record : {xgb_ite[0][0]:.2f}')
print(f'The estimated ITE for treatment 2 for the first record : {xgb_ite[0][1]:.2f}')

Output:

The estimated ITE for treatment 1 for the first record : 0.62
The estimated ITE for treatment 2 for the first record : 0.80
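
One common use of the ITE matrix is to pick, for each record, the treatment with the largest estimated effect. The sketch below does this for the five rows printed above. Falling back to control when no estimate is positive is an assumption made for illustration; it is a simple decision rule, not something CausalML applies for you.

```python
import numpy as np

# The first five estimated ITE rows shown above (columns: treatment 1, treatment 2)
ite_rows = np.array([[0.61860538, 0.79667234],
                     [0.06657565, -0.00587612],
                     [0.76414907, 0.63103437],
                     [0.78046882, 0.78824401],
                     [0.88066435, 0.83072901]])
arm_names = np.array(['treatment_1', 'treatment_2'])

# Treatment with the largest estimated effect for each record
best_arm = arm_names[ite_rows.argmax(axis=1)]

# Assumed rule: recommend control if even the best estimate is not positive
best = np.where(ite_rows.max(axis=1) > 0, best_arm, 'control')
print(best.tolist())
```

These picks rely on point estimates only, so records with very similar estimates for the two treatments (such as the fourth row) should be read with caution.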

If the confidence interval for the individual treatment effect (ITE) is needed, we can use bootstrapping by setting return_ci=True and specifying the number of bootstrap iterations (n_bootstraps) and the size of each bootstrap sample (bootstrap_size).

The output gives us both the estimated individual treatment effect (ITE) and the estimated lower and upper bounds for each treatment.

# ITE with confidence interval
xgb_ite, xgb_ite_lb, xgb_ite_ub = xgb.fit_predict(X=X, treatment=treatment, y=y, return_ci=True,
                               n_bootstraps=100, bootstrap_size=500)

# Take a look at the data
print('\nThe first five estimated ITEs are:\n', np.matrix(xgb_ite[:5]))
print('\nThe first five estimated ITE lower bound are:\n', np.matrix(xgb_ite_lb[:5]))
print('\nThe first five estimated ITE upper bound are:\n', np.matrix(xgb_ite_ub[:5]))

The first columns for the individual treatment effect (ITE) and its upper bound and lower bound correspond to treatment 1, and the second columns correspond to treatment 2.

We can print out the individual treatment effect (ITE) estimation and its lower bound and upper bound for each record.

  • The first record has the individual treatment effect (ITE) of 0.62 for treatment 1. The lower bound is 0.04, and the upper bound is 1.28.
  • The first record has the individual treatment effect (ITE) of 0.80 for treatment 2. The lower bound is -0.14, and the upper bound is 1.37.
# Print out estimation for one record
print(f'The estimated ITE for treatment 1 for the first record : {xgb_ite[0][0]:.2f}, the lower bound is {xgb_ite_lb[0][0]:.2f}, and the upper bound is {xgb_ite_ub[0][0]:.2f}')
print(f'The estimated ITE for treatment 2 for the first record : {xgb_ite[0][1]:.2f}, the lower bound is {xgb_ite_lb[0][1]:.2f}, and the upper bound is {xgb_ite_ub[0][1]:.2f}')

Outputs:

The estimated ITE for treatment 1 for the first record : 0.62, the lower bound is 0.04, and the upper bound is 1.28
The estimated ITE for treatment 2 for the first record : 0.80, the lower bound is -0.14, and the upper bound is 1.37
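
For intuition about where such bounds come from, here is a generic percentile bootstrap sketch for a simple difference in means. It shows the general resampling idea under made-up data and is not CausalML's internal routine; the helper bootstrap_ci and its defaults are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)

# Made-up outcomes: the treated group is shifted up by 0.5 on average
control = rng.normal(loc=0.0, scale=1.0, size=1000)
treated = rng.normal(loc=0.5, scale=1.0, size=1000)


def bootstrap_ci(treated, control, n_bootstraps=200, bootstrap_size=250, alpha=0.05):
    """Hypothetical helper: percentile bootstrap CI for a difference in means."""
    estimates = np.empty(n_bootstraps)
    for i in range(n_bootstraps):
        # Resample each group with replacement
        t = rng.choice(treated, size=bootstrap_size, replace=True)
        c = rng.choice(control, size=bootstrap_size, replace=True)
        estimates[i] = t.mean() - c.mean()
    # Take the lower and upper percentiles of the bootstrap estimates
    lb, ub = np.percentile(estimates, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lb, ub


point = treated.mean() - control.mean()
lb, ub = bootstrap_ci(treated, control)
print(f'Estimate: {point:.2f} ({lb:.2f}, {ub:.2f})')
```

Smaller bootstrap samples produce wider intervals, which is likely one reason the ITE intervals above, built with bootstrap_size=500 out of 5000 observations, are fairly wide.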
