Subclassification matching in causal inference stratifies the propensity scores into bins, and the treatment and the control units within the bins are compared to get the treatment effects estimation.

In this tutorial, we will talk about how to do subclassification propensity score matching (PSM) using the Python `CausalInference`

package.

To learn how to do subclassification matching using R, please check out my previous tutorial 8 Matching Methods for Causal Inference Using R

**Resources for this post:**

- Click here for the Colab notebook.
- More video tutorials on Causal Inference
- More blog posts on Causal Inference
- If you are not a Medium member and want to support me as a writer (ðŸ˜„ Buy me a cup of coffee â˜•), join Medium membership through this link. You will get full access to posts on Medium for $5 per month, and I will receive a portion of it. Thank you for your support!
- Give me a tip to show your appreciation
- Video tutorial for this post on YouTube

Let’s get started!

### Step 1: Install and Import Libraries

In step 1, we will install and import libraries.

Firstly, let’s install dowhy for dataset creation and `causalinference`

for subclassification matching.

# Install dowhy !pip install dowhy # Install causal inference !pip install causalinference

After the installation is completed, we can import the libraries.

- The
`datasets`

is imported from`dowhy`

for dataset creation. `pandas`

and`numpy`

are imported for data processing.`CausalModel`

is imported from the`causalinference`

package for subclassification matching.

# Package to create synthetic data for causal inference from dowhy import datasets # Data processing import pandas as pd import numpy as np # Causal inference from causalinference import CausalModel

### Step 2ï¼šCreate Dataset

In step 2, we will create a synthetic dataset for the causal inference.

- Firstly, we set a random seed using
`np.random.seed`

to make the dataset reproducible. - Then a dataset with the true causal impact of 10, four confounders, 10,000 samples, a binary treatment variable, and a continuous outcome variable is created.
- After that, we created a dataframe for the data. In the dataframe, the columns W0, W1, W2, and W3 are the four confounders, v0 is the treatment indicator, and y is the outcome.

# Set random seed np.random.seed(42) # Create a synthetic dataset data = datasets.linear_dataset( beta=10, num_common_causes=4, num_samples=10000, treatment_is_binary=True, outcome_is_binary=False) # Create Dataframe df = data['df'] # Take a look at the data df.head()

Next, let’s rename `v0`

to `treatment`

, rename `y`

to `outcome`

, and convert the boolean values to 0 and 1.

# Rename columns df = df.rename({'v0': 'treatment', 'y': 'outcome'}, axis=1) # Create the treatment variable, and change boolean values to 1 and 0 df['treatment'] = df['treatment'].apply(lambda x: 1 if x == True else 0) # Take a look at the data df.head()

### Step 3: Raw Difference

In step 3, we will initiate `CausalModel`

and print the raw data summary statistics. `CausalModel`

takes three arguments:

`Y`

is the observed outcome.`D`

is the treatment indicator.`X`

is the covariates matrix.

`CausalModel`

takes arrays as inputs, so `.values`

are used when reading the data.

# Run causal model causal = CausalModel(Y = df['outcome'].values, D = df['treatment'].values, X = df[['W0', 'W1', 'W2', 'W3']].values) # Print summary statistics print(causal.summary_stats)

`causal.summary_stats`

prints out the raw summary statistics. The output shows that:

- There are 2,269 units in the control group and 7,731 units in the treatment group.
- The average outcome for the treatment group is 13.94, and the average outcome for the control group is -2.191. So the raw difference between the treatment and the control group is 16.132, which is much higher than the actual treatment effect of 10.
`Nor-diff`

is the standardized mean difference (SMD) for covariates between the treatment group and the control group. Standardized Mean Differences(SMD) greater than 0.1 means that the data is imbalanced between the treatment and the control group. We can see that most of the covariates have SMD greater than 0.1.

### Step 4: Propensity Score Estimation

In step 4, we will get the propensity score estimation. A propensity score is the predicted probability of getting treatment. It is calculated by running a logistic regression with the treatment variable as the target, and the covariates as the features.

There are two methods for propensity score estimation, `est_propensity_s`

and `est_propensity`

.

`est_propensity`

allows users to add interaction or quadratic features.`est_propensity_s`

automatically choose the features based on a sequence of likelihood ratio tests.

In this step, we will use `est_propensity_s`

to run the propensity score estimation.

# Automated propensity score estimation causal.est_propensity_s() # Propensity model results print(causal.propensity)

From the model results, we can see that the feature selection algorithm decided to include only the raw features, and not include interaction or quadratic terms.

To get the propensity score, use `causal.propensity['fitted']`

.

# Propensity scores causal.propensity['fitted']

Output

array([0.99295272, 0.99217314, 0.00156753, ..., 0.69143426, 0.99983862, 0.99943713])

### Step 5: Subclassification Matching by Propensity Score Stratification

In step 5, we will do the subclassification matching by stratifying the propensity scores.

The Python `CausalInference`

package provides two methods for subclassification matching: `stratify`

and `stratify_s`

.

`stratify`

split the units into five equal-sized bins by default. But we can change the hyperparameter value for`blocks`

from 5 to other values to create different number of bins.`stratify_s`

uses a data-driven approach to set the number of bins and the bin boundaries automatically. It recursively divides the samples into two groups until there is no need for further splitting.

We will take the automatic approach for propensity score stratification.

# Propensity score stratification causal.stratify_s() # Print stratification summary print(causal.strata)

From the Stratification Summary, we can see that:

- The samples are divided into 18 subclasses.
- The minimum and maximum propensity scores for each subclass are listed.
- There are more control units than treatment units in the subclasses with low propensity scores. Similarly, there are more treatment units than control units in the subclasses with high propensity scores.
- The within-subclass average propensity scores are similar between the treated and the control group.
- The average outcome raw differences for subclasses are mostly between 9 and 11, which is much closer to the true causal impact of 10 than the raw difference of 16.

### Step 6: Subclassification Treatment Effect Estimation

In step 6, we will talk about how to use the subclassification propensity score matching results for treatment effects estimation.

The treatment effects estimation takes the following steps:

- Step 1: After stratifying the samples into 18 subclasses, estimate the within-subclass treatment effects using ordinary least squares (OLS). To learn more about the ordinary least squares (OLS) treatment effect estimation, check out my previous tutorial OLS Treatment Effects Estimation Using Python Package Causal Inference
- Step 2: Take the sample-weighted average of the within-subclass treatment effects from step 1 to get an overall treatment effects estimation.

Instead of manually calculating the treatment effects, the function est_via_blocking performs the two steps above and outputs the final causal impact estimations.

`causal.estimates`

prints out the treatment effects estimation results.

- ATE is the average treatment effect
- ATC is the average treatment effect on the control
- ATT is the average treatment effect on the treated

To learn more about the average treatment effect (ATE), the average treatment effect on the control (ATC), the average treatment effect on the treated (ATE), and how the values are calculated, please check out my previous tutorial ATE vs CATE vs ATT vs ATC for Causal Inference.

# Subclassification treatment effects estimation causal.est_via_blocking() # Print out the estimation results print(causal.estimates)

From the treatment effect estimation results, we can see that the average treatment effect (ATE), the average treatment effect on the control (ATC), and the average treatment effect on the treated (ATE) are all the same as the true causal impact of 10, which is a much more accurate estimation than the raw difference of 16.

For more information about data science and machine learning, please check out myÂ YouTube channelÂ andÂ Medium PageÂ or follow me onÂ LinkedIn.

### Recommended Tutorials

- GrabNGoInfo Machine Learning Tutorials Inventory
- Multivariate Time Series Forecasting with Seasonality and Holiday Effect Using Prophet in Python
- Four Oversampling And Under-Sampling Methods For Imbalanced Classification Using Python
- 3 Ways for Multiple Time Series Forecasting Using Prophet in Python
- Databricks Mount To AWS S3 And Import Data
- Hyperparameter Tuning For XGBoost
- One-Class SVM For Anomaly Detection
- Time Series Anomaly Detection Using Prophet in Python
- Sentiment Analysis Without Modeling: TextBlob vs. VADER vs. Flair
- Recommendation System: User-Based Collaborative Filtering
- How to detect outliers | Data Science Interview Questions and Answers
- Causal Inference One-to-one Matching on Confounders Using R for Python Users
- Gaussian Mixture Model (GMM) for Anomaly Detection
- Time Series Anomaly Detection Using Prophet in Python
- How to Use R with Google Colab Notebook