Use ChatGPT Code Interpreter for Machine Learning Models

Code Interpreter for data analytics, data science, and machine learning


Code Interpreter is a feature of OpenAI’s ChatGPT. It allows ChatGPT to analyze data, create charts, solve math problems, edit files, and perform other tasks by writing and running code. It works with a wide range of data types and supports uploading and downloading files, which ChatGPT could not do before.

In this tutorial, we will walk through how to use the ChatGPT Code Interpreter to build a machine learning model. We will cover:

  • How to enable Code Interpreter within ChatGPT
  • How to upload datasets to ChatGPT
  • How to summarize and analyze data using Code Interpreter
  • How to run a machine learning model using Code Interpreter
  • How to tune model hyperparameters using Code Interpreter

Resources for this post:

  • Video tutorial for this post on YouTube
  • More video tutorials on GPT
  • More blog posts on GPT

Let’s get started!

Step 1: Enable ChatGPT Code Interpreter

To enable ChatGPT Code Interpreter, you need a ChatGPT Plus subscription, which costs $20 per month. Assuming you have the subscription, follow these steps to enable the Code Interpreter plugin:

1. Log in to ChatGPT. In the bottom-left of the window, next to your login name, click the three-dot menu and select Settings & Beta.

Image by Amy @GrabNGoInfo

2. Select Beta features in the Settings window. Click on the toggle button to enable the Code Interpreter plugin.

Image by Amy @GrabNGoInfo

3. Click the GPT version you would like to use and select Code Interpreter from the dropdown menu.

Image by Amy @GrabNGoInfo

Step 2: Upload Datasets

After enabling the Code Interpreter, you will see a plus sign in the chat window for uploading files.

Image by Amy @GrabNGoInfo

I uploaded an Airbnb listings dataset and added the prompt “Can you identify important features, create visualizations, and make suggestions on potential analyses or models?”

Image by Amy @GrabNGoInfo

Step 3: Code Interpreter Data Analysis

This simple prompt gives us several outputs.

1. Description of the columns. Code Interpreter inferred the meaning of the columns from the dataset.

Image by Amy @GrabNGoInfo

2. Check data types and missing values. Code Interpreter produced an overview of the data type and the amount of missing data for each column.

Image by Amy @GrabNGoInfo
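As a rough idea of what this kind of overview looks like in code, here is a minimal pandas sketch. It is not the code that Code Interpreter generated. The small DataFrame is made up for illustration, and the column name all_missing_col is a hypothetical stand-in for the column with no values.

```python
import numpy as np
import pandas as pd

# Tiny made-up stand-in for the uploaded listings file (all values are invented).
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "room_type": ["Entire home/apt", "Private room", "Private room", None],
    "price": [120.0, 60.0, np.nan, 300.0],
    "last_review": ["2023-01-05", None, "2022-11-20", "2023-03-14"],
    "all_missing_col": [np.nan, np.nan, np.nan, np.nan],  # hypothetical empty column
})

# One row per column: its dtype, the count of missing values, and the share missing.
overview = pd.DataFrame({
    "dtype": df.dtypes.astype(str),
    "missing": df.isna().sum(),
    "missing_pct": (df.isna().mean() * 100).round(1),
})
print(overview)
```

A column that shows 100 in missing_pct carries no information and is a natural candidate for removal.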

3. Data preprocessing. ChatGPT Code Interpreter analyzed the missing data and proposed dropping the column in which every value is missing. It also suggested converting the last_review date from string format to datetime format.

Image by Amy @GrabNGoInfo
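The two preprocessing suggestions can be sketched in a few lines of pandas. This is an illustration under assumptions, not the generated code: the data is invented, and all_missing_col is a hypothetical name for the empty column. Only last_review comes from the dataset described above.

```python
import numpy as np
import pandas as pd

# Invented sample rows; "all_missing_col" is a hypothetical empty column.
df = pd.DataFrame({
    "last_review": ["2023-01-05", None, "2022-11-20"],
    "price": [120.0, 60.0, 300.0],
    "all_missing_col": [np.nan, np.nan, np.nan],
})

# Drop every column in which all values are missing.
df = df.dropna(axis=1, how="all")

# Parse the date strings; missing or unparseable entries become NaT.
df["last_review"] = pd.to_datetime(df["last_review"], errors="coerce")
print(df.dtypes)
```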

Clicking Show work on the white Finished working button reveals the Python code for each step.

Image by Amy @GrabNGoInfo

4. Distribution plots for features. ChatGPT Code Interpreter created distribution plots for numeric and categorical features separately. It also gave a brief summary of each distribution.

For example, for the feature Price, it stated “The majority of prices are below $500 per night, with a significant number of listings priced below $100 per night. However, there are also some listings with much higher prices. We may need to consider removing outliers if we use this feature in a model.”

Image by Amy @GrabNGoInfo
Image by Amy @GrabNGoInfo
Image by Amy @GrabNGoInfo
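To make the outlier remark concrete, here is a small sketch that plots a price histogram and then trims high prices with the 1.5 × IQR rule. Both the synthetic prices and the choice of the IQR rule are assumptions for illustration; the post does not say which outlier method Code Interpreter had in mind.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic right-skewed prices standing in for the real price column.
rng = np.random.default_rng(0)
prices = pd.Series(rng.lognormal(mean=4.6, sigma=0.7, size=1000), name="price")

# Distribution plot for a numeric feature.
fig, ax = plt.subplots()
ax.hist(prices, bins=50)
ax.set_xlabel("price")
ax.set_ylabel("number of listings")

# IQR rule (an assumed choice): treat prices above Q3 + 1.5 * IQR as outliers.
q1, q3 = prices.quantile([0.25, 0.75])
upper = q3 + 1.5 * (q3 - q1)
trimmed = prices[prices <= upper]
print(f"{len(prices) - len(trimmed)} of {len(prices)} prices exceed {upper:.0f}")
```

A log transform of the price is another common way to handle a long right tail without discarding rows.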

5. Correlation matrix of numeric values. Code Interpreter also created a correlation matrix and summarized its key findings.

Image by Amy @GrabNGoInfo
Image by Amy @GrabNGoInfo
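A correlation matrix of the numeric columns takes one line in pandas. The sketch below uses synthetic data, and the column names number_of_reviews and reviews_per_month are assumptions about the dataset rather than names confirmed in this post.

```python
import numpy as np
import pandas as pd

# Synthetic data; column names other than price and room_type are assumed.
rng = np.random.default_rng(1)
n = 200
number_of_reviews = rng.poisson(20, size=n)
df = pd.DataFrame({
    "price": rng.lognormal(4.6, 0.5, size=n),
    "number_of_reviews": number_of_reviews,
    # Derived from number_of_reviews, so the two correlate strongly by construction.
    "reviews_per_month": number_of_reviews / 12 + rng.normal(0, 0.1, size=n),
    "room_type": rng.choice(["Entire home/apt", "Private room"], size=n),
})

# Keep numeric columns only, then compute pairwise Pearson correlations.
corr = df.select_dtypes(include="number").corr()
print(corr.round(2))
```

Pairs of features with a correlation close to 1 or -1 carry largely the same information, which is worth knowing before feature selection.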

6. Suggestions for next steps. ChatGPT Code Interpreter suggested price prediction, classification of room type, further exploratory data analysis, a recommendation system, and sentiment analysis. It also pointed out that preprocessing steps such as handling missing data, outliers, and categorical variables, as well as feature scaling, would be necessary before using these features in a machine learning model.

Image by Amy @GrabNGoInfo

Step 4: Build and Evaluate a Machine Learning Model

To build a machine learning model, I gave ChatGPT Code Interpreter the prompt “Please build a price prediction model.”

It first gave an overview of the steps, which include data processing, feature selection, model training, and model evaluation.

Image by Amy @GrabNGoInfo

Then it used Python to create the model. I like the fact that Code Interpreter was able to identify the columns that are not needed for the model, such as ‘id’, ‘name’, ‘host_id’, and ‘host_name’.

Image by Amy @GrabNGoInfo

After running the code, it summarized the model performance and pointed out that the model was overfitting.

Image by Amy @GrabNGoInfo
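The workflow above can be sketched as follows. This is a minimal illustration, not the code Code Interpreter produced. The random forest is an assumption (the post does not name the algorithm, though the parameters tuned in Step 5 are random forest parameters), the data is synthetic, and minimum_nights and number_of_reviews are assumed column names.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic listings: price depends on room_type plus noise (invented relationship).
rng = np.random.default_rng(42)
n = 500
df = pd.DataFrame({
    "id": np.arange(n),
    "host_id": rng.integers(1, 100, size=n),
    "room_type": rng.choice(["Entire home/apt", "Private room"], size=n),
    "minimum_nights": rng.integers(1, 30, size=n),
    "number_of_reviews": rng.poisson(20, size=n),
})
df["price"] = (
    60 + 80 * (df["room_type"] == "Entire home/apt") + rng.normal(0, 25, size=n)
)

# Drop identifier columns and the target, then one-hot encode the categorical feature.
X = pd.get_dummies(df.drop(columns=["id", "host_id", "price"]), columns=["room_type"])
y = df["price"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# A training score well above the test score is the usual sign of overfitting.
train_r2 = r2_score(y_train, model.predict(X_train))
test_r2 = r2_score(y_test, model.predict(X_test))
print(f"train R^2: {train_r2:.2f}, test R^2: {test_r2:.2f}")
```

Comparing the score on the training split with the score on the held-out split is the check that exposes overfitting: the larger the gap, the less the model generalizes.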

It also suggested a few ways to improve the model, including hyperparameter tuning, using a different model algorithm, adding more features, feature selection, ensemble methods, and cross-validation.

Image by Amy @GrabNGoInfo

Step 5: Hyperparameter Tuning

I asked Code Interpreter to proceed with hyperparameter tuning, and it suggested using grid search on four parameters: n_estimators, max_depth, min_samples_split, and min_samples_leaf.

Image by Amy @GrabNGoInfo

ChatGPT Code Interpreter was not able to finish the grid search because it is a computationally intensive process: every combination of parameter values has to be trained and cross-validated.

Image by Amy @GrabNGoInfo

It did, however, provide the code for running the hyperparameter tuning.

Image by Amy @GrabNGoInfo
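For readers who want to run the tuning on their own machine, here is a minimal grid search sketch over the same four parameters using scikit-learn's GridSearchCV. It is not the code from the screenshot. The random forest, the synthetic data, and every value in the grid are assumptions chosen so the example finishes quickly.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic regression data standing in for the prepared listings features.
X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)

# Illustrative values only; the post does not list the values that were searched.
param_grid = {
    "n_estimators": [25, 50],
    "max_depth": [5, None],
    "min_samples_split": [2, 10],
    "min_samples_leaf": [1, 4],
}

# 2 * 2 * 2 * 2 = 16 combinations, each evaluated with 3-fold cross-validation.
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=3,
    scoring="neg_mean_squared_error",
)
search.fit(X, y)

print(search.best_params_)
print(f"best CV RMSE: {(-search.best_score_) ** 0.5:.2f}")
```

The number of model fits grows multiplicatively with each parameter added to the grid, which is why a full search can exceed what the Code Interpreter sandbox will run. Sampling a fixed number of combinations with RandomizedSearchCV is one way to cap the cost.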

If you are interested in doing everything using ChatGPT, please check out my previous tutorial on using the ChatGPT plugin Noteable for machine learning projects. Noteable was able to handle tasks with higher computational requirements, and all the code is conveniently saved in a notebook format.

For more information about data science and machine learning, please check out my YouTube channel and Medium Page or follow me on LinkedIn.

