Notebook: Customer Segmentation with K-Means Clustering using BigQuery Dataframes #26

NiloFreitas · 2024-09-26T21:18:55Z

Description

This issue proposes the development of a new notebook that demonstrates how to perform customer segmentation using K-Means clustering with BigQuery Dataframes. The notebook should cover the following aspects:

Data Preparation

Select an appropriate customer dataset from the BigQuery public datasets or from our public GCS bucket, if one is available.
Perform feature engineering and selection relevant to customer segmentation (e.g., recency, frequency, monetary value - RFM analysis).
Prepare the data for K-Means clustering using BigQuery Dataframes.
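
As a rough illustration of the RFM features mentioned above, the sketch below computes recency, frequency, and monetary value per customer. It runs on local pandas with a tiny made-up orders table; the column names (customer_id, order_id, order_date, amount) and the helper compute_rfm are assumptions for illustration, not part of any dataset or library. BigQuery Dataframes aims to mirror the pandas API, so similar code may carry over to bigframes.pandas, but that should be verified in the notebook.

```python
import pandas as pd


def compute_rfm(orders: pd.DataFrame, snapshot_date: pd.Timestamp) -> pd.DataFrame:
    """Return one row per customer with recency (days), frequency, and monetary value.

    Column names are hypothetical and must be adapted to the chosen dataset.
    """
    grouped = orders.groupby("customer_id")
    return pd.DataFrame(
        {
            # Days between the snapshot date and the customer's latest order
            "recency_days": (snapshot_date - grouped["order_date"].max()).dt.days,
            # Number of distinct orders placed by the customer
            "frequency": grouped["order_id"].nunique(),
            # Total amount spent by the customer
            "monetary": grouped["amount"].sum(),
        }
    )


# Toy data, invented purely to show the shape of the input
orders = pd.DataFrame(
    {
        "customer_id": ["a", "a", "b"],
        "order_id": [1, 2, 3],
        "order_date": pd.to_datetime(["2024-01-01", "2024-03-01", "2024-02-15"]),
        "amount": [20.0, 35.0, 10.0],
    }
)
rfm = compute_rfm(orders, pd.Timestamp("2024-03-31"))
print(rfm)
```

Because K-Means is distance based, the three features usually need to be brought to a comparable scale before training.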

Model Training

Use bigframes.ml.cluster.KMeans to train a K-Means clustering model.
Optimize the number of clusters (k) using techniques like the elbow method or silhouette analysis.
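
The two steps above could be sketched as follows. train_kmeans is a hypothetical wrapper around bigframes.ml.cluster.KMeans; it imports bigframes lazily because running it needs the bigframes package and a configured Google Cloud project. find_elbow is a plain-Python helper (also hypothetical) that picks the elbow as the point farthest from the straight line joining the first and last (k, score) points, which is only one of several ways to locate an elbow. The scores in the usage line are made-up numbers, not results from any model.

```python
def train_kmeans(features, n_clusters):
    """Fit a K-Means model on a BigQuery Dataframes feature table (sketch)."""
    # Lazy import: requires bigframes and Google Cloud credentials at call time
    from bigframes.ml.cluster import KMeans

    model = KMeans(n_clusters=n_clusters)
    model.fit(features)
    return model


def find_elbow(ks, scores):
    """Return the k whose (k, score) point lies farthest from the chord
    between the first and last points, after scaling both axes to [0, 1].

    Expects ks in increasing order and scores (for example, within-cluster
    distance) that decrease overall as k grows.
    """
    if len(ks) != len(scores) or len(ks) < 3:
        raise ValueError("need at least three (k, score) pairs of equal length")
    k_span = ks[-1] - ks[0]
    score_span = scores[0] - scores[-1]
    if k_span <= 0 or score_span <= 0:
        raise ValueError("expected increasing k and decreasing scores")

    best_k, best_gap = ks[0], -1.0
    for k, score in zip(ks, scores):
        x = (k - ks[0]) / k_span
        y = (score - scores[-1]) / score_span
        # After scaling, the chord is the line x + y = 1; the gap below is
        # proportional to the point's perpendicular distance from it
        gap = abs(x + y - 1.0)
        if gap > best_gap:
            best_k, best_gap = k, gap
    return best_k


# Illustrative values only
best_k = find_elbow([1, 2, 3, 4, 5, 6], [1000, 400, 150, 120, 100, 90])
print(best_k)
```

In the notebook, the scores would come from evaluating one trained model per candidate k; which evaluation metric to use is left to the contributor.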

Cluster Analysis

Analyze the characteristics of each customer segment.
Visualize the clusters using appropriate techniques (e.g., scatter plots, t-SNE).
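
One simple way to describe each segment is to average the features per cluster and count its customers. The sketch below does this on local pandas with invented values; the label column name "segment" and the helper profile_segments are assumptions for illustration, and the actual name of the cluster label column produced by the model should be checked in the notebook.

```python
import pandas as pd


def profile_segments(labelled: pd.DataFrame, label_col: str = "segment") -> pd.DataFrame:
    """Return mean feature values and customer counts for each cluster label."""
    grouped = labelled.groupby(label_col)
    profile = grouped.mean()
    profile["customers"] = grouped.size()
    return profile


# Toy labelled data, invented to show the expected input shape
labelled = pd.DataFrame(
    {
        "segment": [0, 0, 1],
        "recency_days": [10, 20, 90],
        "frequency": [5, 7, 1],
        "monetary": [100.0, 200.0, 15.0],
    }
)
profile = profile_segments(labelled)
print(profile)
```

The resulting table can feed directly into bar charts or scatter plots that compare segments.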

Interpretation and Application

Draw insights from the customer segments.
Discuss potential applications of the segmentation results (e.g., targeted marketing, personalized recommendations).

Instructions for Contributors

Use the existing notebooks in the repository as a template for structure and style.
Ensure the notebook is well-documented and easy to follow.
Include a clear explanation of the concepts and techniques used.
Provide visualizations to illustrate the results.

Use a publicly available dataset or provide instructions on how to generate synthetic data.
Test the notebook thoroughly before submitting a pull request.

Resources

BigQuery Dataframes KMeans reference: https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.cluster.KMeans
K-Means Clustering documentation: https://cloud.google.com/bigquery/docs/kmeans-tutorial

Contributing guidelines: CONTRIBUTING.md

Note: Please refer to the contributing guidelines for detailed instructions on how to contribute to this repository.

This notebook will provide a valuable resource for users interested in applying K-Means clustering for customer segmentation using BigQuery Dataframes. We encourage contributions from the community to help develop this notebook.

We really appreciate your contribution! :)

NiloFreitas added the "good first issue" label Sep 26, 2024
