Propensity Score Trimming Using Python Package Causal Inference

CausalInference is a Python package for causal analysis. It has different functionalities such as propensity score trimming, covariates matching, counterfactual modeling, subclassification, and inverse probability weighting.

In this tutorial, we will talk about how to do propensity score trimming using CausalInference, and how that impacts the causal impact analysis results. Other functionalities will be introduced in future tutorials.


Let’s get started!

Step 1: Install and Import Libraries

In step 1, we will install and import libraries.

Firstly, let’s install dowhy for dataset creation and causalinference for propensity score trimming.

# Install dowhy
!pip install dowhy

# Install causal inference
!pip install causalinference

You will see the message below after the libraries are successfully installed.

Successfully installed dowhy-0.8 pydot-1.4.2
Successfully installed causalinference-0.1.3

After the installation is completed, we can import the libraries.

  • The datasets module is imported from dowhy for dataset creation.
  • pandas and numpy are imported for data processing.
  • CausalModel is imported from the causalinference package for propensity score trimming and causality analysis.
# Package to create synthetic data for causal inference
from dowhy import datasets

# Data processing
import pandas as pd
import numpy as np

# Causal inference
from causalinference import CausalModel

Step 2: Create Dataset

In step 2, we will create a synthetic dataset for the causal inference.

  • Firstly, we set a random seed using np.random.seed to make the dataset reproducible.
  • Then a dataset with the true causal impact of 10, four confounders, 10,000 samples, a binary treatment variable, and a continuous outcome variable is created.
  • After that, we create a dataframe for the data. In the dataframe, the columns W0, W1, W2, and W3 are the four confounders, v0 is the treatment indicator, and y is the outcome.
# Set random seed (the exact seed value is illustrative; any fixed value works)
np.random.seed(42)

# Create a synthetic dataset
data = datasets.linear_dataset(
    beta=10,                      # true causal impact
    num_common_causes=4,          # four confounders W0-W3
    num_samples=10000,
    treatment_is_binary=True,
    outcome_is_binary=False)

# Create Dataframe
df = data['df']

# Take a look at the data
df.head()

Causal Inference Data –

Next, let’s rename v0 to treatment, rename y to outcome, and convert the boolean values to 0 and 1.

# Rename columns
df = df.rename({'v0': 'treatment', 'y': 'outcome'}, axis=1)

# Convert the boolean treatment values to 1 and 0
df['treatment'] = df['treatment'].astype(int)

# Take a look at the data
df.head()

Causal Inference Data –

Step 3: Raw Difference

In step 3, we will initialize CausalModel and print the pre-trimming summary statistics. CausalModel takes three arguments:

  • Y is the observed outcome.
  • D is the treatment indicator.
  • X is the covariates matrix.

CausalModel takes arrays as inputs, so .values are used when reading the data.

# Run causal model
causal = CausalModel(
    Y=df['outcome'].values,
    D=df['treatment'].values,
    X=df[['W0', 'W1', 'W2', 'W3']].values)

# Print summary statistics
print(causal.summary_stats)

causal.summary_stats prints out the raw summary statistics. The output shows that:

  • There are 2,269 units in the control group and 7,731 units in the treatment group.
  • The average outcome for the treatment group is 13.94, and the average outcome for the control group is -2.191. So the raw difference between the treatment and the control group is 16.132.
  • Nor-diff is the standardized mean difference (SMD) of each covariate between the treatment group and the control group. An SMD greater than 0.1 indicates that the covariate is imbalanced between the two groups. We can see that most of the covariates have an SMD greater than 0.1.
Python CausalInference raw balance and difference –
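For intuition, the Nor-diff numbers can be reproduced by hand. The sketch below (with made-up data and a hypothetical smd helper) computes a standardized mean difference as the difference in group means divided by the pooled standard deviation:

```python
import numpy as np
import pandas as pd

def smd(x_treat, x_control):
    # Difference in group means scaled by the pooled standard deviation
    pooled_sd = np.sqrt((x_treat.var() + x_control.var()) / 2)
    return (x_treat.mean() - x_control.mean()) / pooled_sd

# Hypothetical covariate that is shifted between the two groups
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    'W0': np.concatenate([rng.normal(1.0, 1, 500),    # treated units
                          rng.normal(0.0, 1, 500)]),  # control units
    'treatment': [1] * 500 + [0] * 500,
})

value = smd(demo.loc[demo.treatment == 1, 'W0'],
            demo.loc[demo.treatment == 0, 'W0'])
print(value > 0.1)  # True: this covariate would be flagged as imbalanced
```

Because the two groups differ by a full standard deviation here, the SMD comes out near 1, far above the 0.1 threshold.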

Step 4: Propensity Score Estimation

In step 4, we will estimate the propensity scores. The propensity score is the predicted probability of receiving treatment. It is calculated by running a logistic regression with the treatment variable as the target and the covariates as the features.
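As an illustration of that idea (not the package's internal code), the same kind of propensity model can be fit directly with scikit-learn; the simulated covariates and coefficients below are made up:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Simulate covariates and a treatment assignment driven by them
rng = np.random.default_rng(1)
W = rng.normal(size=(1000, 4))                       # four confounders
logit = W @ np.array([1.0, 0.5, -0.5, 0.2])          # made-up coefficients
treat = (rng.uniform(size=1000) < 1 / (1 + np.exp(-logit))).astype(int)

# Logistic regression: treatment as target, covariates as features
model = LogisticRegression().fit(W, treat)
propensity = model.predict_proba(W)[:, 1]            # P(treatment = 1 | W)

print(propensity.min() >= 0 and propensity.max() <= 1)  # True
```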

There are two methods for propensity score estimation, est_propensity_s and est_propensity.

  • est_propensity lets users manually add interaction or quadratic terms.
  • est_propensity_s automatically chooses the terms based on a sequence of likelihood ratio tests.

In this step, we will use est_propensity_s to run the propensity score estimation.

# Automated propensity score estimation
causal.est_propensity_s()

# Propensity model results
print(causal.propensity)

From the model results, we can see that the feature selection algorithm decided to include only the raw features, and to not include interaction or quadratic terms.

Python CausalInference propensity score trimming —

To get the propensity score, use causal.propensity['fitted'].

# Propensity scores
causal.propensity['fitted']

array([0.99295272, 0.99217314, 0.00156753, ..., 0.69143426, 0.99983862,
       ...])

Step 5: Propensity Score Trimming

In step 5, we will talk about propensity score trimming.

Propensity score trimming improves the balance between the treatment group and the control group by dropping units with extreme propensity scores.

The rationale behind propensity score trimming is that:

  • for units with an extremely high propensity score, it is hard to find reliably comparable units in the control group.
  • similarly, for units with an extremely low propensity score, it is hard to find reliably comparable units in the treatment group.

By default, the causalinference package sets the cutoff value to 0.1 after the propensity score estimation. We can check the cutoff value by running causal.cutoff.

# Check the default propensity score trimming cutoff value
causal.cutoff

0.1

Running causal.trim() will remove all the units with propensity scores greater than 0.9 or less than 0.1.
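Conceptually, this trim is just a boolean filter on the estimated propensity scores. A minimal sketch with made-up scores:

```python
import numpy as np

cutoff = 0.1
scores = np.array([0.99, 0.05, 0.45, 0.72, 0.12, 0.91])  # hypothetical propensity scores

# Keep only units whose score lies inside [cutoff, 1 - cutoff]
keep = (scores >= cutoff) & (scores <= 1 - cutoff)
print(scores[keep])  # [0.45 0.72 0.12]
```

The units at 0.99, 0.91, and 0.05 are dropped because they fall outside [0.1, 0.9].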

Alternatively, we can use an automated optimal cutoff search procedure to find the best cutoff value that minimizes the asymptotic sampling variance of the trimmed sample. Instead of running causal.trim(), we will run causal.trim_s().
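For intuition about what that search is doing, here is a rough sketch of the Crump-style moment condition that this kind of variance-minimizing procedure is based on; optimal_cutoff is a hypothetical helper, not the package's actual implementation, and the scores are simulated:

```python
import numpy as np

def optimal_cutoff(e):
    # Search for the largest gamma = 1 / (alpha * (1 - alpha)) satisfying
    # gamma <= 2 * E[1 / (e(1-e)) | 1 / (e(1-e)) <= gamma]
    g = np.sort(1.0 / (e * (1.0 - e)))
    for gamma in g[::-1]:
        if gamma <= 2 * g[g <= gamma].mean():
            return 0.5 - np.sqrt(0.25 - 1.0 / gamma)  # back out alpha
    return 0.0

# Simulated scores piled up near 0 and 1, so some trimming is expected
rng = np.random.default_rng(0)
scores = np.clip(rng.beta(0.5, 0.5, 5000), 0.001, 0.999)

alpha = optimal_cutoff(scores)
print(0 < alpha < 0.5)  # True
```

Heavier piles of extreme scores push the selected alpha higher; with well-overlapping groups it shrinks toward zero.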

# Trim using the optimal cutoff value
causal.trim_s()

# Check the optimal cutoff value
causal.cutoff
We can see that the optimal propensity score cutoff value is 0.08.


Step 6: After-Trimming Difference

In step 6, we will check the difference between the treatment and the control group after the propensity score trimming.

causal.summary_stats prints out the summary statistics after trimming. The output shows that:

  • The number of units in the control group decreased from 2,269 to 1,189, and the units in the treatment group decreased from 7,731 to 1,463.
  • The raw difference between the treatment and the control group decreased from 16.132 to 11.02, which is much closer to the true treatment impact of 10.
  • The standardized mean difference (SMD) for covariates between the treatment group and the control group decreased for every covariate.
# Print summary statistics
print(causal.summary_stats)
Python CausalInference balance after propensity score trimming —

For more information about data science and machine learning, please check out my YouTube channel and Medium Page or follow me on LinkedIn.

