Hackathon - Foreign Affairs Data Visualization

Introduction

In the CANIS Data Visualization and Foreign Interference challenge, teams are given a dataset and asked to visualize it. Our goal as participants was to decode the dataset, uncover patterns that might not be immediately apparent, and turn the raw data into meaningful insights for a broader audience. A YouTube demo is available at this link.

Collaborators

Sina Ziaee - Computer Science, MSc - University of Calgary
Sajad Dadgar - Computer Science, MSc - University of Calgary

Methodology

Our methodology for visualizing the "Foreign Affairs" dataset combines preprocessing of the provided data with extraction of additional data. Our goal is not just to depict the data but to show its potential applications. The methodology comprises the following steps:

1. Data Preprocessing:

Before visualizing the "Foreign Affairs" dataset, we preprocessed and refined the data to improve its quality. First, we dropped the "Name (Chinese)" and "Entity owner (Chinese)" columns as redundant, since the dataset already contains the corresponding English names and entity owners. Additionally, because the "Region of Focus" column mixes cities, countries, and continents, we mapped each region to its corresponding countries so that the data can be represented on a world map.
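The two preprocessing steps above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code in preprocess.ipynb. The column names "Name (Chinese)", "Entity owner (Chinese)", and "Region of Focus" come from the dataset description; the English column names, the sample rows, and the contents of the region lookup are assumptions made for the example (the real lookup lives in the JSON files under assets).

```python
import csv
import io

# Columns dropped as redundant, as described in the text.
REDUNDANT_COLUMNS = ("Name (Chinese)", "Entity owner (Chinese)")

# Hypothetical lookup for illustration only; the real mapping is stored
# in the repository's assets JSON files.
REGION_TO_COUNTRIES = {
    "Paris": ["France"],                                    # a city
    "Japan": ["Japan"],                                     # a country
    "North America": ["Canada", "United States", "Mexico"], # a continent
}


def preprocess(rows):
    """Drop redundant columns and attach a list of countries to each row."""
    cleaned = []
    for row in rows:
        record = {k: v for k, v in row.items() if k not in REDUNDANT_COLUMNS}
        region = record.get("Region of Focus", "").strip()
        # Unknown regions map to an empty list rather than a guess.
        record["Countries"] = REGION_TO_COUNTRIES.get(region, [])
        cleaned.append(record)
    return cleaned


# Made-up sample rows; the English column names are assumed.
SAMPLE_CSV = """Name (English),Name (Chinese),Entity owner (English),Entity owner (Chinese),Region of Focus
Outlet A,zh-name-a,Owner A,zh-owner-a,Paris
Outlet B,zh-name-b,Owner B,zh-owner-b,North America
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
cleaned = preprocess(rows)
for record in cleaned:
    print(record["Name (English)"], "->", record["Countries"])
```

Storing a list of countries per row keeps the original "Region of Focus" value intact while giving the map layer a uniform, country-level key.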

Furthermore, to work with up-to-date and more accurate follower counts for each name in the dataset, we used social media APIs to retrieve the most recent follower counts on the platforms widely used by those names. The follower counts in the dataset were updated accordingly. The data was collected on November 18, 2023.
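One way to structure such a refresh is sketched below. The README does not say which APIs or client libraries were used, so the API call is represented by a caller-supplied function; `fetch_count`, `fake_fetch`, and the record fields `platform`, `handle`, `followers`, and `collected_on` are hypothetical names chosen for this example. Only the collection date comes from the text.

```python
from datetime import date

# Collection date stated in the text.
COLLECTION_DATE = date(2023, 11, 18)


def refresh_followers(records, fetch_count):
    """Return copies of the records with follower counts refreshed.

    fetch_count(platform, handle) is a hypothetical callable that wraps
    whatever API client is available and returns an int, or None when
    no count could be retrieved.
    """
    refreshed = []
    for record in records:
        updated = dict(record)  # leave the input untouched
        latest = fetch_count(record["platform"], record["handle"])
        if latest is not None:
            updated["followers"] = latest
            updated["collected_on"] = COLLECTION_DATE.isoformat()
        refreshed.append(updated)
    return refreshed


def fake_fetch(platform, handle):
    """Stand-in for a real API client, with made-up numbers."""
    counts = {("x", "outlet_a"): 1200}
    return counts.get((platform, handle))


records = [
    {"platform": "x", "handle": "outlet_a", "followers": 1000},
    {"platform": "facebook", "handle": "outlet_b", "followers": 500},
]
updated = refresh_followers(records, fake_fetch)
print(updated)
```

Keeping the old count when a lookup fails, and stamping only the refreshed rows with a date, makes it clear afterwards which values are recent.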

2. Visualization Approach:

Our approach consists of two steps. First, we focused on the dataset itself, extracting valuable information by visualizing the preprocessed data. We did not stop there: we then turned to the social media platforms of each name in the dataset to extract more data, such as follower and following counts, locations, account types, and whether each account is verified. As shown in the data visualization part, this additional data helped us show the scale of China's influence on social media.
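Before account-level data can be drawn on a world map, it has to be aggregated per country. The sketch below shows one possible aggregation; it is an illustration, not the project's implementation. The field names `followers`, `verified`, and `countries` and the sample values are assumptions made for the example.

```python
from collections import defaultdict


def summarize_by_country(accounts):
    """Aggregate account counts, followers, and verified accounts per country.

    An account that targets several countries is counted once in each of
    them, so the per-country totals can add up to more than the overall total.
    """
    totals = defaultdict(lambda: {"accounts": 0, "followers": 0, "verified": 0})
    for account in accounts:
        for country in account["countries"]:
            entry = totals[country]
            entry["accounts"] += 1
            entry["followers"] += account["followers"]
            entry["verified"] += int(account["verified"])
    return dict(totals)


# Made-up sample accounts.
accounts = [
    {"name": "Outlet A", "followers": 1200, "verified": True,
     "countries": ["France"]},
    {"name": "Outlet B", "followers": 500, "verified": False,
     "countries": ["Canada", "France"]},
]

summary = summarize_by_country(accounts)
for country, stats in sorted(summary.items()):
    print(country, stats)
```

A per-country dictionary like this can be passed to a map chart, with any of the three metrics used for shading.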

3. Technologies

In our data visualization, we employed the following tools and technologies:

Python: Python is a versatile programming language widely used in data science and visualization. For the implementation, we used version 3.12.0, the latest stable release at the time of the project.
Streamlit: Streamlit is a user-friendly Python library for creating interactive web applications for data visualization. Its features allowed us to design engaging, dynamic visualizations. We used version 1.28.2 for this project.
Jupyter Notebook: Jupyter Notebook is an open-source tool that facilitates interactive computing and data analysis.

Implementation Code

All the source code can be found on GitHub. The repository contains all code associated with this project, including the preprocessing and data extraction code, along with the resulting datasets. To make the repository easier to navigate, the important files and directories are explained below:

Repository

Description of the repository content:

assets: Contains JSON files that map each region to its corresponding countries, so that the data can be shown on the world map by various metrics.
dataset: All datasets, including the main dataset and the newly obtained datasets.
findings: Statistics (e.g., names, entity owners, accounts) obtained from the datasets.
pages: Implementation of the charts and graphs in the project.
data_extraction.ipynb: Code to extract additional information about users via social media APIs.
Home.py: Description of the project (first page).
preprocess.ipynb: Code to preprocess the raw data.

Setup

Below are the steps to set up the Streamlit application using code from the GitHub repository:

Clone the Repository:

git clone https://github.com/sinaziaee/foreign-affairs

Install Dependencies:

pip install -r requirements.txt

Once the dependencies are installed, run the app locally by executing the following command in your terminal:

streamlit run Home.py
