Embeddings + Clustering + GPT Workshop

This repo contains a Jupyter notebook for leading a quick workshop on how embeddings, clustering, and GPT can be combined to extract high-level insights from a dataset. Here, the dataset is about 23k posts from the r/AITA subreddit :)

Setup

Git LFS Setup

$ brew install git-lfs
$ git lfs install

If you haven't cloned the repo yet, clone it now. If you already have it, fetch the LFS files from inside the repo:

$ cd clustering_workshop
$ git lfs pull https://github.com/shaw-matt/clustering_workshop.git
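
If the notebook later fails to read the dataset, one possible cause is that the LFS files were never downloaded and the working tree still holds small text pointer stubs. The sketch below is one way to check for that. The helper name `is_lfs_pointer` is hypothetical, and it assumes the LFS-tracked files sit under `data/`; adjust the path if `.gitattributes` tracks something else.

```shell
# Hypothetical helper: succeeds when the given file is still a Git LFS
# pointer stub rather than the real content.
is_lfs_pointer() {
  # Pointer stubs are short text files whose first line names the LFS spec.
  head -c 200 "$1" 2>/dev/null | grep -q '^version https://git-lfs.github.com/spec/v1'
}

# Assumption: the LFS-tracked dataset lives under data/.
for f in data/*; do
  [ -f "$f" ] || continue
  if is_lfs_pointer "$f"; then
    echo "still a pointer, run 'git lfs pull' again: $f"
  fi
done
```

If the loop prints nothing, no file under `data/` is a pointer stub. `git lfs ls-files` gives a similar overview of the files LFS tracks.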

Docker Setup

Install Docker
Open Docker Desktop
Go to Settings > Resources > Advanced
Set 'Memory' to 8 GB

Build Docker Container

$ cd clustering_workshop
$ docker build -t clustering-workshop-container .

Run Docker Container

$ docker run --memory="8g" -p 8888:8888 -v "$(pwd)":/home/jovyan/work clustering-workshop-container
