Data Science Project Completed in 5 Minutes Using ChatGPT Plugin Noteable

Automating your data science workflow using ChatGPT plugin Noteable

The ChatGPT plugin Noteable is a tool designed to streamline and enhance the process of data analysis and modeling. Noteable allows users to describe in natural language what they want to do, such as the data analysis techniques they want to use, and it generates a complete notebook with Python, SQL, or Markdown cells as a result. It can be used for data exploration, data transformation, data visualization, and identifying patterns or trends, as well as for building machine learning models. Additionally, Noteable supports collaborative work, making it an excellent tool for team-based data science projects.

In this tutorial, we will talk about:

  • How to enable the Noteable plugin and connect ChatGPT with a Noteable project?
  • How to conduct exploratory data analysis using the ChatGPT Noteable plugin?
  • How to build a machine learning model and perform hyperparameter tuning?
  • How to share the notebook with others?

Resources for this post:

Data Science Project Using ChatGPT Plugin Noteable — GrabNGoInfo.com

Let’s get started!

Step 1: Enable ChatGPT Plugins

Before you can start using Noteable, you need to make sure that you have access to plugins. As of June 2023, when this tutorial was created, plugins are available only to ChatGPT Plus users. To enable plugins, navigate to the Settings option associated with your username.

Image by Amy @GrabNGoInfo

From there, proceed to the Beta features section where you’ll find the option to enable plugins.

Image by Amy @GrabNGoInfo

Step 2: Install Noteable Plugin

Once plugins are enabled, we can install the Noteable plugin. Select the GPT-4 model and click Plugins (beta) from the dropdown list.

Image by Amy @GrabNGoInfo

The phrase No plugins enabled shows up after the selection.

Image by Amy @GrabNGoInfo

Clicking on No plugins enabled reveals the Plugin store button.

Image by Amy @GrabNGoInfo

The plugin store popup window shows up after clicking the Plugin store button.

Image by Amy @GrabNGoInfo

You can search for Noteable if it does not show up on the first page.

Image by Amy @GrabNGoInfo

Click the green Install button to install the Noteable plugin for ChatGPT.

You will be directed to the Noteable login page. You can log in with Google, GitHub, LinkedIn, or email, or click Sign up to create a new account.

Image by Amy @GrabNGoInfo

After logging in, you will be directed back to ChatGPT with the Noteable plugin enabled.

Image by Amy @GrabNGoInfo

Step 3: Set Default Project for ChatGPT

The ChatGPT Noteable plugin requires a default project so that it knows where to create the notebook. To set the default project, log in to your noteable.io account, select your space name under Spaces, and click the Create Project button.

Image by Amy @GrabNGoInfo

A new window pops up asking for the Project Name and Space. You can also give your project a description or create the project from a git repository.

Image by Amy @GrabNGoInfo

Fill in the information and click the black Create button. You will see the project showing up in the selected space.

Image by Amy @GrabNGoInfo

Click the project name, and you will see that a new notebook called “My First Notebook” has been automatically generated. Copy the project URL from the browser.

Image by Amy @GrabNGoInfo

Go back to ChatGPT, type “Use this link as my default project”, and paste the Noteable project URL. ChatGPT will automatically call the Noteable plugin and confirm: “I’ve set your default project to ‘Demo’. Now, any new notebooks that you create will be added to this project by default.”

Image by Amy @GrabNGoInfo

Step 4: Read Data Using ChatGPT Plugin Noteable

Noteable supports uploading data from a local file, connecting to a database, or reading data directly from a URL. In this example, we will use a public Airbnb listing dataset from insideairbnb.com. Note that older data is archived on this website, so the same link may not work at a later time.

Image by Amy @GrabNGoInfo

In ChatGPT, I asked “Can you read the Airbnb data from this link: http://data.insideairbnb.com/united-states/dc/washington-dc/2023-03-19/visualisations/listings.csv”, and the Noteable plugin created a new notebook, read the data from the URL, and printed out the first 5 rows.

It provided the link to the notebook and even suggested performing some basic data analysis.

Image by Amy @GrabNGoInfo
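
The cell that the plugin generated is not reproduced in the text, but a minimal sketch of an equivalent step, assuming the notebook uses pandas, might look like this. The helper name load_listings and the two inline sample rows are illustrative assumptions, not part of the generated notebook.

```python
import io

import pandas as pd

# URL quoted in the prompt above; older snapshots get archived, so it may stop working.
DATA_URL = (
    "http://data.insideairbnb.com/united-states/dc/washington-dc/"
    "2023-03-19/visualisations/listings.csv"
)


def load_listings(source):
    """Read a listings CSV from a URL, a local path, or a file-like object."""
    return pd.read_csv(source)


# Tiny inline sample (made-up rows) so the sketch runs without network access.
# In the notebook, call load_listings(DATA_URL) instead.
sample_csv = io.StringIO(
    "id,room_type,price\n"
    "1,Entire home/apt,150\n"
    "2,Private room,80\n"
)
listings = load_listings(sample_csv)
print(listings.head())  # first 5 rows (only 2 exist in this sample)
print(listings.shape)
```

Because pd.read_csv accepts URLs, local paths, and file-like objects alike, the same helper covers reading from a URL and reading an uploaded local file.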

Step 5: Exploratory Data Analysis (EDA)

Next, let’s continue with the EDA with the following instruction: “Sounds good! Can you identify important features, create visualizations, and make suggestions on potential analyses or models? Please add comments to the code and create markdowns to explain each step.”

Here is what we got from the prompt. It provided a quick overview of the dataset, listed important features for our analysis, and created some basic visualizations to better understand the data.

Image by Amy @GrabNGoInfo
Image by Amy @GrabNGoInfo
Image by Amy @GrabNGoInfo
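
For readers who want to see what such an overview can look like in code, here is a rough sketch of typical first EDA steps with pandas. The rows are made up, and the column names (room_type, price, number_of_reviews) are assumed to match the Inside Airbnb summary file; the actual notebook may have used different steps.

```python
import pandas as pd

# Made-up rows standing in for the real listings data.
listings = pd.DataFrame(
    {
        "room_type": ["Entire home/apt", "Private room", "Entire home/apt", "Private room"],
        "price": [200.0, 80.0, 150.0, None],
        "number_of_reviews": [10, 3, 25, 0],
    }
)

# Quick overview: column types and the number of missing values per column.
print(listings.dtypes)
missing_counts = listings.isna().sum()
print(missing_counts)

# Summary statistics for the numeric columns.
print(listings.describe())

# Median price per room type, a first look at how room type relates to price.
# Missing prices are skipped by median().
median_price = listings.groupby("room_type")["price"].median()
print(median_price)
```

In a notebook, a call such as listings["price"].plot.hist(bins=50) would draw the price distribution, provided matplotlib is installed.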

At the end of the output, ChatGPT provided four options to continue the analysis:

  • Analyze the relationship between room type and price.
  • Investigate how the number of reviews affects the price.
  • Examine the distribution of prices in different neighbourhoods.
  • Predict the price of a listing based on its features using a regression model.

Let’s choose option four and continue with a model.

Step 6: Build Machine Learning Models

Building a machine learning model with the ChatGPT plugin Noteable is as simple as giving the prompt “Please predict the price of a listing based on its features using a regression model”. A simple linear regression model was automatically built and its performance was evaluated.

Image by Amy @GrabNGoInfo
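
The generated code appears only in the screenshot, so the following is a sketch of how a simple linear regression for price could be set up with scikit-learn, not the plugin's actual output. The data is synthetic, and the three feature columns are assumptions chosen to resemble numeric columns in the listings file.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the listings data (assumed feature columns).
rng = np.random.default_rng(42)
n = 200
listings = pd.DataFrame(
    {
        "minimum_nights": rng.integers(1, 30, size=n),
        "number_of_reviews": rng.integers(0, 300, size=n),
        "availability_365": rng.integers(0, 366, size=n),
    }
)
listings["price"] = (
    100 + 0.2 * listings["availability_365"] + rng.normal(0, 20, size=n)
)

features = ["minimum_nights", "number_of_reviews", "availability_365"]
X = listings[features]
y = listings["price"]

# Hold out 20% of the rows to evaluate the model on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LinearRegression()
model.fit(X_train, y_train)

# RMSE is the square root of the mean squared error, in the units of price.
predictions = model.predict(X_test)
rmse = float(np.sqrt(mean_squared_error(y_test, predictions)))
print(f"Linear regression RMSE: {rmse:.2f}")
```

Evaluating on a held-out test set rather than on the training rows gives a fairer estimate of how the model would do on new listings.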

It also recommended four ways to improve the model performance:

  • Use more features for prediction. We only used a few features here, but there might be other features that are also important for predicting the price.
  • Handle outliers. The price variable might have some extreme values that could affect our model.
  • Use a more complex model. Linear regression is a simple model that might not be able to capture all the complexities of our data. Other models like decision trees, random forest, or gradient boosting might give better results.
  • Tune the model parameters. Most models have parameters that can be tuned to improve their performance.

I would like to continue with option four and see how it handles hyperparameter tuning.

Step 7: Model Hyperparameter Tuning

With the prompt “Please proceed with tuning the model parameters.”, Noteable started to tune the model parameters with grid search. It also changed the model algorithm from a simple linear regression model to a random forest model. Because tuning hyperparameters takes a long time, ChatGPT even asked “Would you like to continue waiting, or would you like to proceed with something else while the grid search is running?”

Image by Amy @GrabNGoInfo

After the grid search completed, the best parameters for the random forest model were reported.

Image by Amy @GrabNGoInfo

This is the code in the notebook corresponding to the hyperparameter tuning.

Image by Amy @GrabNGoInfo
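
Since the screenshot is not available as text, here is a hedged sketch of what grid search over a random forest can look like with scikit-learn's GridSearchCV. The parameter grid and the synthetic data are hypothetical; the values tried in the actual notebook are not stated in this post.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic regression data so the sketch runs quickly on its own.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(120, 3))
y = 100 + 50 * X[:, 0] + rng.normal(0, 5, size=120)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Hypothetical grid; the notebook's real grid is not shown in the text.
param_grid = {
    "n_estimators": [20, 50],
    "max_depth": [3, None],
}

# Every combination in the grid is scored with 3-fold cross-validation,
# which is why larger grids take a long time to run.
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=3,
    scoring="neg_root_mean_squared_error",
)
search.fit(X_train, y_train)

print("Best parameters:", search.best_params_)

# best_estimator_ is refit on the whole training set with the best parameters.
test_predictions = search.best_estimator_.predict(X_test)
test_rmse = float(np.sqrt(np.mean((y_test - test_predictions) ** 2)))
print(f"Random forest test RMSE: {test_rmse:.2f}")
```

The number of model fits grows as the product of the grid sizes times the number of folds (here 2 × 2 × 3 = 12, plus one final refit), which explains the waiting time mentioned above.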

After using the best parameters to train our random forest model, we got the following model performance evaluation: “The root mean squared error (RMSE) of our random forest model on the test set is approximately 215.55. This means that on average, our model’s predictions are about $215.55 away from the actual price. This is a slight improvement from the linear regression model, which had an RMSE of about $221.61.”
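
To make the metric concrete, the sketch below computes RMSE by hand on three made-up prices and then works out the relative improvement implied by the two RMSE values quoted above. Note that RMSE squares the errors before averaging, so it weights large misses more heavily than a plain average of absolute errors would; reading it as the "average distance" is a convenient approximation.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error of two equal-length sequences."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up prices with errors of 10, 10, and 100 dollars.
actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 400.0]
print(round(rmse(actual, predicted), 2))  # 58.31, while the mean absolute error is 40

# Relative improvement implied by the two RMSE values quoted above.
linear_rmse = 221.61
forest_rmse = 215.55
improvement = (linear_rmse - forest_rmse) / linear_rmse
print(f"{improvement:.1%}")  # 2.7%
```

An improvement of roughly 2.7% matches the description of the random forest as only slightly better than the linear regression.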

It also gave suggestions for further improving the model:

  • Use more features for prediction. We only used a few features here, but there might be other features that are also important for predicting the price.
  • Handle outliers. The price variable might have some extreme values that could affect our model.
  • Use a more complex model. Random forest is a more complex model than linear regression, but there might be other models that could give better results, such as gradient boosting or neural networks.
Image by Amy @GrabNGoInfo
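
The first two suggestions can be sketched in a few lines of pandas. This is only an illustration on made-up rows: the quantile cutoff, the column names, and the choice of one-hot encoding are assumptions, not steps taken in the tutorial's notebook.

```python
import pandas as pd

# Made-up rows; one listing has an extreme price.
listings = pd.DataFrame(
    {
        "room_type": ["Entire home/apt", "Private room", "Private room", "Shared room"],
        "minimum_nights": [2, 1, 30, 1],
        "price": [150.0, 80.0, 95.0, 10000.0],
    }
)

# Handle outliers: drop rows whose price is above a chosen quantile.
# The 0.75 cutoff suits this tiny sample; a real dataset would more
# likely use something like 0.99 (an assumption, not a rule).
price_cap = listings["price"].quantile(0.75)
trimmed = listings[listings["price"] <= price_cap]

# Use more features: one-hot encode the categorical room_type column
# so that a regression model can consume it.
encoded = pd.get_dummies(trimmed, columns=["room_type"])
print(encoded)
```

Whether to drop, cap, or log-transform extreme prices is a modeling decision, so it is worth comparing the test RMSE before and after any such change.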

Step 8: Sharing the Notebook

To share the notebook, click the notebook link in ChatGPT to open the notebook in noteable.io, then click the black Share button in the top-right corner.

Image by Amy @GrabNGoInfo

A new window pops up and you can choose to share the notebook with others and decide the permission levels.

Image by Amy @GrabNGoInfo

Summary

In this beginner’s tutorial, we explored the powerful combination of ChatGPT and Noteable, a dynamic duo that brings together the capabilities of AI and data science. We walked through the process of creating a new notebook, loading a dataset, and performing exploratory data analysis. We also built a simple linear regression model and a random forest model to predict Airbnb listing prices, demonstrating how easy it is to build and tune models with this setup.

The ChatGPT plugin for Noteable is a game-changer for data science projects. It allows you to interact with your Jupyter notebooks using natural language, making data science more accessible and intuitive.

In this tutorial, we only scratched the surface of what’s possible with ChatGPT and the Noteable plugin. In future tutorials, we’ll dive deeper and explore how to use this powerful tool to build different types of models, handle more complex data, and tackle more advanced data science tasks.

Stay tuned for more tutorials on how to leverage the power of ChatGPT with Noteable in your data science projects. Follow me to get notified when new tutorials are released. Happy coding!


For more information about data science and machine learning, please check out my YouTube channel and Medium Page or follow me on LinkedIn.
