Repo for HUMA2026 Data Environmentalism 'ExxonMobil's climate change communications: data analysis for climate justice'.
This repo contains two directories:
- one containing 17 pdfs James found on the House of Commons Library page 'Climate change: an overview', published October 2021.
- a second containing txt files created by James from those pdfs, totalling 65k words. The txt files were created using the bash command line script
for file in *.pdf; do pdftotext "$file" "${file%.*}.txt"; done
The txt files were then combined into one file (with a little descriptive detail at the start of each line), also using the bash command line, and made into a spreadsheet.
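
The script used to combine the txt files is not included in this README. As a minimal sketch of what that step could look like, the hypothetical function `combine_txt` below writes every non-empty line of each txt file to a single tab-separated file, which spreadsheet software can open. Using the source file name as the descriptive detail at the start of each line is an assumption, as are the function name and the example paths; the actual spreadsheet may have been built differently.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not the original script): combine every .txt file in a
# directory into one tab-separated file, prefixing each non-empty line with
# the name of the file it came from. The prefix format is an assumption.
combine_txt() {
  local src_dir="$1" out_file="$2"
  local file name line
  : > "$out_file"                        # start with an empty output file
  for file in "$src_dir"/*.txt; do
    [ -e "$file" ] || continue           # nothing to do if no .txt files match
    name="$(basename "$file" .txt)"
    # The "|| [ -n "$line" ]" part keeps a final line that lacks a newline.
    while IFS= read -r line || [ -n "$line" ]; do
      [ -n "$line" ] || continue         # drop blank lines
      printf '%s\t%s\n' "$name" "$line" >> "$out_file"
    done < "$file"
  done
}

# Example usage (directory and file names are placeholders):
#   combine_txt txt combined.tsv
#   cat txt/*.txt | wc -w    # check the total word count of the txt files
```

Give the output file an extension other than .txt, or write it outside the source directory, so that it is not picked up as an input on a later run.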
To run the analysis:
- Download UK-climate-docs_for-LDA_v4.xlsx.
- Upload this document to your Google Drive account, open it, select 'File' and then 'Save as Google Sheets'.
- Go to Topic Modeling for HUMA2026 wk8 (data analytics for climate justice) on Google Colab, select 'File' and then 'Save a Copy in Drive', then start analysing!
Documents in this repo are shared under an Open Parliament Licence.