Unsupervised Model

4 Clustering Model Algorithms in Python and Which is the Best

K-means, Gaussian Mixture Model (GMM), hierarchical model, and DBSCAN model. Which one to choose for your project? PCA and t-SNE

Welcome to GrabNGoInfo! In this tutorial, we will talk about four clustering model algorithms, compare their results, and discuss how to choose a clustering algorithm for a project. Step 0: Clustering Model Algorithms. Based on the underlying algorithm for grouping the data, clustering models can be divided …
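As a rough illustration of the four algorithms named above, here is a minimal sketch using scikit-learn. The synthetic dataset, the cluster centers, and the parameter values (`n_clusters=3`, `eps=0.8`, `min_samples=5`) are assumptions chosen for this example and are not taken from the tutorial.

```python
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

# Synthetic data (assumed for illustration): three well-separated blobs
X, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [6, 6], [0, 6]],
    cluster_std=0.6,
    random_state=0,
)

# Fit each of the four clustering models and collect the cluster labels
labels = {
    "kmeans": KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X),
    "gmm": GaussianMixture(n_components=3, random_state=0).fit_predict(X),
    "hierarchical": AgglomerativeClustering(n_clusters=3).fit_predict(X),
    # DBSCAN infers the number of clusters itself; label -1 marks noise points
    "dbscan": DBSCAN(eps=0.8, min_samples=5).fit_predict(X),
}

for name, assigned in labels.items():
    n_clusters = len(set(assigned) - {-1})
    print(f"{name}: {n_clusters} clusters")
```

Note that K-means, GMM, and the hierarchical model need the number of clusters up front, while DBSCAN needs a neighborhood radius and a minimum point count instead. That difference is often one of the first things to weigh when choosing between them.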


Gaussian Mixture Model (GMM) for Anomaly Detection

Predict anomalies from a Gaussian Mixture Model (GMM) using percentage threshold and value threshold, and improve anomaly prediction performance

Gaussian Mixture Model (GMM) is a probabilistic clustering model that assumes the data are generated from a mixture of Gaussian distributions. Anomaly detection is the process of identifying unusual data points. GMM detects outliers by identifying the data points in low-density regions [1]. In this tutorial, we will use Python’s sklearn library to implement Gaussian Mixture …
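The low-density idea can be sketched with scikit-learn's `GaussianMixture`. This is a minimal example under assumptions: the synthetic data, the 3% percentage threshold, and the value threshold of `-8.0` are all hypothetical choices for illustration, not values from the tutorial.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(42)

# Synthetic data (assumed): two dense blobs plus a few scattered points
inliers = np.vstack([
    rng.normal(loc=[0, 0], scale=0.5, size=(200, 2)),
    rng.normal(loc=[5, 5], scale=0.5, size=(200, 2)),
])
scattered = rng.uniform(low=-10, high=15, size=(10, 2))
X = np.vstack([inliers, scattered])

# Fit the mixture and get the log-likelihood of each sample under the model
gmm = GaussianMixture(n_components=2, random_state=42).fit(X)
log_density = gmm.score_samples(X)

# Percentage threshold: flag the lowest-density 3% of points (assumed cutoff)
pct_threshold = np.percentile(log_density, 3)
is_anomaly_pct = log_density < pct_threshold

# Value threshold: flag points below a fixed log-density (hypothetical value)
value_threshold = -8.0
is_anomaly_value = log_density < value_threshold

print("Flagged by percentage threshold:", int(is_anomaly_pct.sum()))
print("Flagged by value threshold:", int(is_anomaly_value.sum()))
```

A percentage threshold always flags a fixed share of the data, even when nothing is unusual. A value threshold flags only points below a chosen density, but the right value depends on the scale of the fitted model.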


How to decide the number of clusters | Data Science Interview Questions and Answers

Elbow method, Silhouette score, Hierarchical graph, AIC & BIC from GMM, Gap statistics, and when to use which method

In data science and machine learning interviews, how to decide the number of clusters for an unsupervised model is one of the most commonly asked questions. In this tutorial, we will talk about five ways to decide the number of clusters and when to use each one. The five methods are the elbow method, the silhouette score, the hierarchical graph, AIC and BIC from GMM, and gap statistics. Let’s …
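Three of these methods can be sketched in a few lines with scikit-learn: inertia for the elbow method, the silhouette score, and BIC from a GMM. The synthetic dataset and the candidate range of 2 to 6 clusters are assumptions made for this example.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture

# Synthetic data (assumed): three well-separated blobs
X, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [6, 6], [0, 6]],
    cluster_std=0.6,
    random_state=0,
)

results = {}
for k in range(2, 7):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    gmm = GaussianMixture(n_components=k, random_state=0).fit(X)
    results[k] = {
        "inertia": km.inertia_,                         # plotted for the elbow method
        "silhouette": silhouette_score(X, km.labels_),  # higher is better
        "bic": gmm.bic(X),                              # lower is better
    }

best_k_silhouette = max(results, key=lambda k: results[k]["silhouette"])
best_k_bic = min(results, key=lambda k: results[k]["bic"])

for k, scores in results.items():
    print(k, {name: round(value, 2) for name, value in scores.items()})
print("Best k by silhouette:", best_k_silhouette)
print("Best k by BIC:", best_k_bic)
```

The elbow method has no single "best" value to compute. It is usually read off a plot of inertia against the number of clusters, which is why the sketch only stores the inertia values.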


Topic Modeling by Group Using Deep Learning in Python

Topics by category using the Python package BERTopic on Airbnb reviews

Building one general topic model is not enough in some cases, especially when there are different categories with various properties and characteristics. For example, a commercial bank may be interested in topic models built for different lines of products such as credit cards, checking accounts, or student loans. A hotel chain may be interested in …


Hyperparameter Tuning for BERTopic Model in Python

Hyperparameter optimization for Transformer-based NLP topic modeling using the Python package BERTopic

Hyperparameter tuning is an important optimization step for building a good topic model. BERTopic is a Python topic modeling library that combines transformer embeddings and clustering algorithms to identify topics in NLP (Natural Language Processing). Please check out my previous tutorial Topic Modeling with Deep …


Hierarchical Topic Model for Airbnb Reviews

Extracting the hierarchical structure of topics and sub-topics in Airbnb reviews using the Python package BERTopic

Hierarchical topic models use the semantic hierarchy to identify topics and sub-topics in a collection of text. This tutorial uses Airbnb review data, and the Python package used for the hierarchical model is BERTopic. For more details about using this package, please …
