T-learner is a meta-learner that uses two machine learning models to estimate the individual-level heterogeneous treatment effect. In this tutorial, we will cover:

- How to implement a T-learner manually using two machine learning models in Python
- How to estimate the individual treatment effect (ITE) using a T-learner
- How to calculate the average treatment effect (ATE) using a T-learner

**Resources for this post:**

- Click here for the Colab notebook.
- More video tutorials on Uplift Modeling
- More blog posts on Uplift Modeling
- Video tutorial for this post on YouTube

Let’s get started!

### Step 0: T-learner Algorithm

T-learner follows three steps to estimate individual treatment effect (ITE):

- Firstly, two machine learning models need to be built, one using the treated units and the other using the untreated units.
- Next, the predictions will be made using the two models separately for all the units, both treated and control.
- Finally, the individual treatment effect (ITE) will be estimated by getting the difference between the predicted outcome from the treated model and the predicted outcome from the control model.
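These three steps correspond to the standard T-learner estimator (Künzel et al., 2019). Writing $\hat{\mu}_1$ for the model fit on the treated units and $\hat{\mu}_0$ for the model fit on the control units:

```latex
\hat{\mu}_1(x) \approx \mathbb{E}[\,Y \mid X = x,\, T = 1\,], \qquad
\hat{\mu}_0(x) \approx \mathbb{E}[\,Y \mid X = x,\, T = 0\,]
```

and the ITE estimate for a unit with features $x$ is $\hat{\tau}(x) = \hat{\mu}_1(x) - \hat{\mu}_0(x)$.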

To learn how to estimate the individual treatment effect (ITE) using a single model, please check out my previous tutorial for S-learner.

### Step 1: Install and Import Libraries

In step 1, we will install and import the Python libraries.

Firstly, let's install `causalml` for synthetic dataset creation.

```python
# Install the package
!pip install causalml
```

After the installation is completed, we can import the libraries.

`pandas` and `numpy` are imported for data processing, `synthetic_data` for synthetic data creation, `seaborn` for visualization, and `LGBMRegressor` for machine learning model training.

```python
# Data processing
import pandas as pd
import numpy as np

# Create synthetic data
from causalml.dataset import synthetic_data

# Visualization
import seaborn as sns

# Machine learning model
from lightgbm import LGBMRegressor
```

### Step 2: Create Dataset

In step 2, we will create a synthetic dataset for the T-learner uplift model.

- Firstly, a random seed is set to make the synthetic dataset reproducible.
- Then, using the `synthetic_data` method from the `causalml` Python package, we create a dataset with five features, one treatment variable, and one continuous outcome variable.
- After that, the dataset is saved in a pandas dataframe.
- Finally, using `value_counts` on the `treatment` variable, we can see that out of 5000 samples, 2582 units received the treatment and 2418 did not.

```python
# Set a seed for reproducibility
np.random.seed(42)

# Create a synthetic dataset
y, X, treatment, _, _, _ = synthetic_data(mode=1, n=5000, p=5, sigma=1.0)

# Save the data in a pandas dataframe
df = pd.DataFrame({'y': y,
                   'X1': X.T[0],
                   'X2': X.T[1],
                   'X3': X.T[2],
                   'X4': X.T[3],
                   'X5': X.T[4],
                   'treatment': treatment})

# Check the treatment distribution
df['treatment'].value_counts()
```

Output:

```
1    2582
0    2418
Name: treatment, dtype: int64
```

### Step 3: T-learner Model Data Processing

In step 3, we will process the data for the T-learner.

- Firstly, the dataset `df` is split into two datasets based on the value of the `treatment` variable. The treated units are saved in a dataframe called `df_treated` and the control units are saved in a dataframe called `df_control`.
- Then the feature matrix and the outcome variable are created for the treated and control dataframes separately. The features are `X1`, `X2`, `X3`, `X4`, and `X5`, and the dependent variable (a.k.a. label) is the outcome column `y`. The feature matrices are named `features_treated` and `features_control`, and the outcomes are named `y_treated` and `y_control`.
- Finally, we check the shape of the modeling data for the two models. The treated model has 2582 records and the control model has 2418 records.

```python
# Keep only the treated units
df_treated = df[df['treatment'] == 1]

# Features
features_treated = df_treated.loc[:, ['X1', 'X2', 'X3', 'X4', 'X5']]

# Dependent variable (a 1-D Series, as expected by the model)
y_treated = df_treated.loc[:, 'y']

# Print the data shape
print(f'The feature matrix has {features_treated.shape[0]} records and {features_treated.shape[1]} features.')
print(f'The dependent variable has {y_treated.shape[0]} records.')

# Keep only the control units
df_control = df[df['treatment'] == 0]

# Features
features_control = df_control.loc[:, ['X1', 'X2', 'X3', 'X4', 'X5']]

# Dependent variable (a 1-D Series, as expected by the model)
y_control = df_control.loc[:, 'y']

# Print the data shape
print(f'The feature matrix has {features_control.shape[0]} records and {features_control.shape[1]} features.')
print(f'The dependent variable has {y_control.shape[0]} records.')
```

Output:

```
The feature matrix has 2582 records and 5 features.
The dependent variable has 2582 records.
The feature matrix has 2418 records and 5 features.
The dependent variable has 2418 records.
```

We also need the features for all the samples for the model predictions; we name this dataframe `features`.

```python
# Features for all the samples
features = df.loc[:, ['X1', 'X2', 'X3', 'X4', 'X5']]
```

### Step 4: T-Learner Model Training

In step 4, we will train the T-learner models.

Model selection and hyperparameter tuning are important for the performance of a T-learner. This is because the quality of each model affects its predictions, and hence the accuracy of the individual treatment effect (ITE) estimation.
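Tuning can be done per group with standard cross-validation. Below is a minimal, hypothetical sketch that tunes one base learner with scikit-learn's `GridSearchCV` on toy data, using `GradientBoostingRegressor` as a stand-in for the tutorial's `LGBMRegressor`; the data, parameter grid, and variable names are illustrative:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(42)

# Toy treated-group data; in the tutorial this would be features_treated / y_treated
X_treated = rng.normal(size=(500, 5))
y_treated = X_treated[:, 0] + rng.normal(scale=0.1, size=500)

# Tune the base learner on its own group's data with 3-fold cross-validation
param_grid = {'n_estimators': [50, 100], 'max_depth': [2, 3]}
search = GridSearchCV(GradientBoostingRegressor(random_state=0), param_grid, cv=3)
search.fit(X_treated, y_treated)
print(search.best_params_)

# The tuned model becomes the treated-group learner of the T-learner
t_treated = search.best_estimator_
```

The control-group learner would be tuned the same way on the control data.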

Many machine learning model algorithms can be used to build the T-learner. The model algorithms include but are not limited to LASSO regression, Ridge regression, random forest, XGBoost, and a neural network model.

A LightGBM model is used in this example, and the process is the same for other machine learning algorithms. The two models of the T-learner can even use different algorithms.
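To illustrate that the two base learners need not share an algorithm, here is a self-contained sketch on toy data with a known constant treatment effect of 1.0, pairing a random forest (treated group) with a ridge regression (control group). Everything here (data, effect size, model choices) is illustrative rather than part of the tutorial's pipeline:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge

rng = np.random.default_rng(42)

# Toy data: 5 features, random binary treatment, constant treatment effect of 1.0
n = 2000
X = rng.normal(size=(n, 5))
t = rng.integers(0, 2, size=n)
y = X[:, 0] + 1.0 * t + rng.normal(scale=0.1, size=n)

# The two base learners use different algorithms:
# a random forest for the treated units, a ridge regression for the controls
model_treated = RandomForestRegressor(n_estimators=200, random_state=0).fit(X[t == 1], y[t == 1])
model_control = Ridge().fit(X[t == 0], y[t == 0])

# ITE and ATE estimates follow exactly as with a single algorithm
ite = model_treated.predict(X) - model_control.predict(X)
print(f'Estimated ATE: {ite.mean():.2f}')  # close to the true effect of 1.0
```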

Using the `LGBMRegressor` class, we fit the two models using the features and the outcome variable for the treated and the control groups separately.

```python
# LightGBM model for the treated units
t_treated = LGBMRegressor().fit(features_treated, y_treated)

# LightGBM model for the control units
t_control = LGBMRegressor().fit(features_control, y_control)
```

### Step 5: T-Learner Model Predictions

In step 5, we will make predictions using the T-learner models.

To make the treatment effect estimation, two separate predictions need to be made using the trained models:

- In the first prediction, the treated model is used for the prediction of all the samples. This gives us the predicted outcome values if all the samples received the treatment.
- In the second prediction, the control model is used for the prediction of all the samples. This gives us the predicted outcome values if none of the samples received the treatment.

```python
# Predictions with treatment
with_treatment_predict = t_treated.predict(features)

# Predictions without treatment
without_treatment_predict = t_control.predict(features)
```

### Step 6: T-learner Individual Treatment Effect (ITE)

In step 6, we will calculate the individual treatment effect (ITE) using the T-learner predictions.

Individual treatment effect (ITE) is the difference between the predicted outcomes with and without treatment.

After calculating the individual treatment effect (ITE), the data is saved in a dataframe.

```python
# ITE
ite = with_treatment_predict - without_treatment_predict

# Save the ITE data in a pandas dataframe
ite_df = pd.DataFrame({'ITE': ite,
                       'with_treatment_predict': with_treatment_predict,
                       'without_treatment_predict': without_treatment_predict})

# Take a look at the data
ite_df.head()
```

The histogram of the individual treatment effect (ITE) shows an approximately normal distribution.

- The distribution is centered around the average treatment effect of 0.58.
- Most individuals in the dataset have a positive treatment effect.
- Some individuals have a negative treatment effect.

```python
# Visualization
ite_df.hist(column='ITE', bins=50, grid=True, figsize=(12, 8))
```

### Step 7: T-Learner Average Treatment Effect (ATE)

In step 7, we will estimate the average treatment effect (ATE) using the T-learner predictions.

The average treatment effect (ATE) for the population is the average of the individual treatment effect (ITE). We can see that the average treatment effect (ATE) is 0.58.

To learn more about the definition and calculation for the average treatment effect (ATE), please check out my previous tutorial ATE vs CATE vs ATT vs ATC for Causal Inference.

```python
# Calculate ATE
ATE = ite.mean()

# Print out the result
print(f'The average treatment effect (ATE) is {ATE:.2f}')
```

Output:

```
The average treatment effect (ATE) is 0.58
```

### Step 8: Customer Segmentation Using T-learner Individual Treatment Effect (ITE)

The customer segmentation using individual treatment effect (ITE) from the T-learner is the same as the S-learner. For the details about how to segment customers, please check out my previous tutorial S Learner Uplift Model for Individual Treatment Effect and Customer Segmentation in Python.
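The exact segmentation rules are covered in the S-learner tutorial, but as a generic illustration, one common approach is to bucket customers by ITE quantiles, for example into quartiles with `pandas.qcut`. The ITE values below are simulated stand-ins, not the tutorial's actual predictions:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Simulated ITE values; in the tutorial these come from the T-learner predictions
ite_df = pd.DataFrame({'ITE': rng.normal(loc=0.5, scale=0.3, size=1000)})

# Bucket customers into four equal-sized segments by ITE quartile:
# the top segment contains the customers who respond most to the treatment
ite_df['segment'] = pd.qcut(ite_df['ITE'], q=4,
                            labels=['Q1 (lowest uplift)', 'Q2', 'Q3', 'Q4 (highest uplift)'])
print(ite_df['segment'].value_counts())
```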

For more information about data science and machine learning, please check out my YouTube channel and Medium Page or follow me on LinkedIn.

### Recommended Tutorials

- GrabNGoInfo Machine Learning Tutorials Inventory
- S Learner Uplift Model for Individual Treatment Effect and Customer Segmentation in Python.
- ATE vs CATE vs ATT vs ATC for Causal Inference
- Time Series Causal Impact Analysis in Python
- 3 Ways for Multiple Time Series Forecasting Using Prophet in Python
- Four Oversampling And Under-Sampling Methods For Imbalanced Classification Using Python
- Multivariate Time Series Forecasting with Seasonality and Holiday Effect Using Prophet in Python
- Time Series Anomaly Detection Using Prophet in Python
- Autoencoder For Anomaly Detection Using Tensorflow Keras
- Databricks Mount To AWS S3 And Import Data
- Hyperparameter Tuning For XGBoost
- One-Class SVM For Anomaly Detection
- Sentiment Analysis Without Modeling: TextBlob vs. VADER vs. Flair
- Recommendation System: User-Based Collaborative Filtering
- How to detect outliers | Data Science Interview Questions and Answers
- Causal Inference One-to-one Matching on Confounders Using R for Python Users
- Gaussian Mixture Model (GMM) for Anomaly Detection
- How to Use R with Google Colab Notebook

### References

- Künzel, Sören R., et al. "Metalearners for estimating heterogeneous treatment effects using machine learning." Proceedings of the National Academy of Sciences 116.10 (2019): 4156-4165.
- CausalML documentation