This repository parses conversation data from Google Hangouts and gives
diagnostics on the number of messages in conversations. Two scripts are
currently supported: parser.py
and visualize.py
. The parsing script parses
raw JSON data from Google Takeout and creates pickled summary files for each
conversation. The visualization script creates a histogram of messages over
time using the pickled conversation summaries.
- Clone this repository
- Download your hangouts data
- Navigate to Google Takeout
- Choose "Select None" and manually select Hangouts
- Download the data in zip format and move the
Hangouts.json
file into theraw
folder in this repository
- Install dependencies via
pip
pip install -r requirements.txt
- No dependencies are required for the
parser.py
script, butvisualize.py
requires the dependencies
- Run the parser
- Note: if you did not place your hangouts data as
raw/Hangouts.json
you can specify the path to the.json
file as an argument to theparser.py
script via the-f
flag
- Note: if you did not place your hangouts data as
python parser.py
- Run the visualization
- The
<conversation_id>
can be found in the output of theparser.py
script
- The
python visualize.py -f output/<conversation_id>.pkl
This code is freely available under the GNU Public License (GPL).
All of the data processing in these scripts happens locally on your computer. The data you provide to the script is NOT uploaded to an external server. Feel free to examine the code if you are concerned.
This repository was inspired by MasterScrat/Chatistics. Chatistics can parse Facebook Messenger and Telegram data, but not Hangouts group messages. I originally intended to contribute to that repository and add Hangouts group message support, but my design drifted far from the existing design in that repository so I created a new project. Shoutout to MasterScrat for great work and thanks for the inspiration!