This repo contains a Jupyter notebook for a quick workshop on using embeddings, clustering, and GPT to extract high-level insights from a dataset. Here, the dataset is about 23k posts from the r/AITA subreddit :)
$ brew install git-lfs
$ git lfs install
If you haven't cloned the repo yet, clone it now. Otherwise, pull the LFS files into your existing clone:
$ cd clustering_workshop
$ git lfs pull https://github.com/shaw-matt/clustering_workshop.git
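To confirm that the pull actually fetched the large files, one option is to look for files that are still Git LFS pointer stubs. The sketch below is not part of the repo: the helper names `is_lfs_pointer` and `list_lfs_pointers` are made up for illustration, and it relies only on the fact that a pointer stub is a small text file whose first line names the LFS spec. Run it from the repo root.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repo): succeeds if the given file is
# still a Git LFS pointer stub instead of the real content.
is_lfs_pointer() {
    # Pointer stubs start with a line naming the LFS spec version.
    head -n 1 "$1" 2>/dev/null | grep -q '^version https://git-lfs.github.com/spec/v1'
}

# Hypothetical helper: print every small file under the current directory
# that is still a pointer stub, skipping Git's own .git directory.
list_lfs_pointers() {
    # Pointer stubs are a few hundred bytes at most, so only small files are checked.
    find . -path ./.git -prune -o -type f -size -2k -print | while IFS= read -r f; do
        if is_lfs_pointer "$f"; then
            printf '%s\n' "$f"
        fi
    done
}

list_lfs_pointers
```

If this prints nothing, no pointer stubs remain. Any path it does print is a file whose real content has not been downloaded yet.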
- Install Docker
- Open Docker Desktop
- Go to Settings > Resources > Advanced
- Set 'Memory' to at least 8 GB
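To check that the memory setting took effect, you can ask the Docker daemon how much memory it sees. This is a minimal sketch, not something the repo ships: `bytes_to_gib` is a hypothetical helper, and the value Docker reports may come out slightly lower than the number you configured.

```shell
#!/bin/sh
# Hypothetical helper: convert a byte count to GiB with one decimal place.
bytes_to_gib() {
    LC_ALL=C awk -v b="$1" 'BEGIN { printf "%.1f\n", b / (1024 * 1024 * 1024) }'
}

# Ask the Docker daemon for its total memory in bytes.
# If Docker is missing or not running, print a hint instead of failing.
if mem_bytes=$(docker info --format '{{.MemTotal}}' 2>/dev/null) && [ -n "$mem_bytes" ]; then
    echo "Docker reports $(bytes_to_gib "$mem_bytes") GiB of memory"
else
    echo "Could not query Docker; is Docker Desktop running?"
fi
```

If the reported figure is well below 8 GiB, revisit the Resources settings before building the image.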
$ cd clustering_workshop
$ docker build -t clustering-workshop-container .
$ docker run --memory="8g" -p 8888:8888 -v "$(pwd)":/home/jovyan/work clustering-workshop-container