Datasets are available here. An alternative dataset is available here; it has data from before Jan 28, 2021, but does not include post content.
Stock prices can be downloaded with the yfinance Python library.
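As a rough illustration of the yfinance step, the sketch below wraps `yf.download` in a small helper. The helper name `download_prices` and its parameters are hypothetical and not taken from this project's code; yfinance is a third-party package (`pip install yfinance`) and needs network access when the helper is actually called.

```python
def download_prices(tickers, start, end):
    """Download daily price data for one or more ticker symbols.

    Hypothetical helper for illustration; not part of this project's code.
    `tickers` is a symbol string or a list of symbols, and `start`/`end`
    are date strings such as "2021-01-01".
    """
    if not tickers:
        raise ValueError("at least one ticker symbol is required")

    # Imported here so that the check above works without yfinance installed.
    import yfinance as yf

    # yf.download returns a pandas DataFrame indexed by date.
    return yf.download(tickers, start=start, end=end)
```

For example, `download_prices(["GME", "AMC"], "2021-01-01", "2021-03-01")` would request daily prices for two symbols over that window; the symbols and dates here are placeholders chosen only for illustration.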
- dataset.py: For loading wsb posts and stock data
- words.py: Define custom stop words, stock symbols and market lexicon
- analysis.py: Run data analysis
Run
python3 dataset.py
to generate the market jargon lexicon.
Copy the original vader_lexicon.txt file in the VADER root folder, rename the copy to market_lexicon.txt, and append the newly generated market jargon entries to that file.
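The copy-and-append step can also be scripted. The sketch below is one possible way to do it, assuming the jargon entries are available as a token-to-score mapping and that each lexicon line starts with `token<TAB>score`. The function name `build_market_lexicon` and the example entry are hypothetical; adapt the entry format to whatever dataset.py actually produces.

```python
import shutil
from pathlib import Path


def build_market_lexicon(vader_dir, jargon_scores):
    """Copy vader_lexicon.txt to market_lexicon.txt and append jargon entries.

    Hypothetical helper: `jargon_scores` maps a token to a sentiment score.
    """
    vader_dir = Path(vader_dir)
    source = vader_dir / "vader_lexicon.txt"
    target = vader_dir / "market_lexicon.txt"

    # Copy instead of renaming, so the original lexicon stays untouched.
    shutil.copyfile(source, target)
    if not jargon_scores:
        return target

    # Make sure the appended entries start on a fresh line.
    existing = target.read_text(encoding="utf-8")
    prefix = "" if not existing or existing.endswith("\n") else "\n"

    # Assumed entry format: token<TAB>score, one entry per line.
    lines = [f"{token}\t{score}" for token, score in jargon_scores.items()]
    with target.open("a", encoding="utf-8") as handle:
        handle.write(prefix + "\n".join(lines) + "\n")
    return target
```

A call such as `build_market_lexicon("path/to/vader", {"moon": 3.0})` would leave vader_lexicon.txt unchanged and write the extended copy next to it; the path and the score are placeholders, not values from this project.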
In the project root folder, run
python3 analysis.py