Node Classification in Citation Networks using Graph Neural Networks (GNNs)

This project applies Graph Neural Networks (GNNs) to classify academic papers in a citation network by subject area. Using the Cora dataset, we train two GNN architectures, Graph Convolutional Network (GCN) and GraphSAGE, which make predictions by learning from the citation relationships between papers.

Project Overview

This project demonstrates the use of GNNs for node classification tasks, specifically focusing on predicting the subject area of academic papers in a citation network. Citation networks are modeled as graphs where nodes represent papers and edges represent citation relationships.

Dataset

The project uses the Cora dataset, a popular dataset in graph-based research. It consists of:

Nodes: Representing academic papers.
Edges: Representing citation links between papers.
Features: Sparse binary bag-of-words vectors for each paper (1,433 dimensions).
Labels: Seven subject areas, such as Neural Networks, Reinforcement Learning, and Genetic Algorithms.

The Cora dataset is available through libraries like PyTorch Geometric.
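
To make the four components concrete, here is a minimal sketch of the same structure on a toy graph. The vocabulary, papers, and citations are hypothetical and not taken from Cora, and treating citations as undirected is an assumption of this sketch; it only illustrates how nodes, edges, features, and labels fit together.

```python
from collections import Counter

# Hypothetical toy vocabulary and papers (illustration only, not Cora data).
vocab = ["graph", "neural", "reward", "policy"]
papers = {
    0: {"words": {"graph", "neural"}, "label": "Neural_Networks"},
    1: {"words": {"neural"}, "label": "Neural_Networks"},
    2: {"words": {"reward", "policy"}, "label": "Reinforcement_Learning"},
}
citations = [(1, 0), (2, 0)]  # (citing paper, cited paper)

# Features: one binary bag-of-words vector per paper (1 if the word occurs).
features = {
    pid: [1 if word in paper["words"] else 0 for word in vocab]
    for pid, paper in papers.items()
}

# Edges: stored in both directions, i.e. the graph is treated as undirected
# (an assumption of this sketch).
neighbors = {pid: set() for pid in papers}
for citing, cited in citations:
    neighbors[citing].add(cited)
    neighbors[cited].add(citing)

# Labels: one subject area per paper.
label_counts = Counter(paper["label"] for paper in papers.values())

print(features[0])       # bag-of-words vector of paper 0
print(neighbors[0])      # papers linked to paper 0
print(label_counts)      # class distribution
```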

Model Architecture

The project implements two main GNN architectures:

Graph Convolutional Network (GCN)
GraphSAGE
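
The two architectures differ mainly in how a node combines information from its neighbors. The sketch below shows one common formulation of each aggregation step in plain Python: a degree-normalized sum with self-loops for GCN, and a mean aggregator concatenated with the node's own features for GraphSAGE. It leaves out the learned weight matrices and nonlinearity, uses a hypothetical three-node graph, and is not necessarily how `main.py` implements these layers.

```python
import math

def gcn_aggregate(neighbors, feats):
    """GCN-style step: degree-normalized sum over each node's neighbors and itself."""
    # Add a self-loop so every node also keeps its own features.
    closed = [set(nbrs) | {v} for v, nbrs in enumerate(neighbors)]
    out = []
    for v, group in enumerate(closed):
        row = [0.0] * len(feats[v])
        for u in group:
            # Symmetric normalization 1 / sqrt(deg(v) * deg(u)); degrees count self-loops.
            weight = 1.0 / math.sqrt(len(closed[v]) * len(closed[u]))
            for k, value in enumerate(feats[u]):
                row[k] += weight * value
        out.append(row)
    return out

def sage_mean_aggregate(neighbors, feats):
    """GraphSAGE-style step: concatenate a node's own features with its neighbors' mean."""
    out = []
    for v, nbrs in enumerate(neighbors):
        dim = len(feats[v])
        if nbrs:
            mean = [sum(feats[u][k] for u in nbrs) / len(nbrs) for k in range(dim)]
        else:
            mean = [0.0] * dim  # isolated node: no neighbor signal
        out.append(list(feats[v]) + mean)
    return out

# Hypothetical toy graph: a path 0 - 1 - 2 with 2-dimensional features.
neighbors = [[1], [0, 2], [1]]
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

gcn_out = gcn_aggregate(neighbors, feats)
sage_out = sage_mean_aggregate(neighbors, feats)
print(gcn_out[0])
print(sage_out[1])
```

In a full layer, the aggregated vector would then be multiplied by a learned weight matrix and passed through a nonlinearity such as ReLU.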

The GNN models follow a multi-layer structure:

Input Layer: Takes each paper's bag-of-words vector as its initial node features.
GNN Layers: Perform message-passing operations to learn node embeddings.
Output Layer: Classifies nodes into subject categories using a softmax function.
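
The layer structure above can be sketched as a two-layer forward pass in plain Python. This is a simplified illustration under stated assumptions: message passing is a plain mean over a node and its neighbors, the weights are fixed hypothetical numbers rather than trained parameters, and the graph has three nodes and three classes.

```python
import math

def mean_aggregate(adj, h):
    """Simplified message passing: average each node's vector with its neighbors'."""
    out = []
    for v, row in enumerate(h):
        group = [row] + [h[u] for u in adj[v]]
        out.append([sum(col) / len(group) for col in zip(*group)])
    return out

def linear(h, weight):
    """Multiply each node vector (length d_in) by a d_in x d_out weight matrix."""
    return [[sum(x * w for x, w in zip(row, col)) for col in zip(*weight)] for row in h]

def relu(h):
    return [[max(0.0, x) for x in row] for row in h]

def softmax(logits):
    """Numerically stable softmax over one node's class scores."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def forward(adj, x, w1, w2):
    h = relu(linear(mean_aggregate(adj, x), w1))  # hidden GNN layer -> embeddings
    logits = linear(mean_aggregate(adj, h), w2)   # output GNN layer -> class scores
    return [softmax(row) for row in logits]       # per-node class probabilities

# Hypothetical inputs: path graph 0 - 1 - 2, 2 input features, 3 classes.
adj = [[1], [0, 2], [1]]
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w1 = [[0.5, -0.2], [0.1, 0.3]]               # 2 -> 2 hidden units
w2 = [[1.0, -1.0, 0.5], [0.0, 0.7, -0.3]]    # 2 -> 3 classes

probs = forward(adj, x, w1, w2)
predictions = [max(range(len(p)), key=p.__getitem__) for p in probs]
print(predictions)
```

Each row of `probs` sums to 1, and the predicted class of a node is the index of its largest probability.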

Installation

Clone the repository:

```
git clone https://github.com/yourusername/gnn-node-classification.git
cd gnn-node-classification
```

Install dependencies:

PyTorch with CUDA (for GPU support, replace cu118 with your CUDA version):

```
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu118
```

PyTorch Geometric:

```
pip install torch-scatter torch-sparse torch-cluster torch-spline-conv torch-geometric
```

Other required libraries:
```
pip install scikit-learn matplotlib
```

Usage

Run the training script:
```
python main.py
```
Evaluation results, including accuracy, precision, recall, and F1-score, will be displayed in the console.
You can visualize training loss and node embeddings using the provided code.
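
For reference, the reported metrics can be computed from true and predicted labels as in the sketch below. It is a dependency-free illustration of the definitions, not the project's evaluation code; macro averaging over classes is an assumption, since the averaging mode is not stated here, and the label lists are hypothetical.

```python
def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 (assumed averaging mode)."""
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = sum(1 for t, p in pairs if t == c and p == c)  # correctly predicted as c
        fp = sum(1 for t, p in pairs if t != c and p == c)  # wrongly predicted as c
        fn = sum(1 for t, p in pairs if t == c and p != c)  # missed members of c
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }

# Hypothetical labels for six nodes and three classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
metrics = macro_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```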

Results

The trained GNN model provides metrics like accuracy, precision, recall, and F1-score for node classification on the Cora dataset. Additionally, visualizations of node embeddings (e.g., using t-SNE or PCA) and training loss are included to assess model performance.

Extensions (in the extensions folder of the 01 directory)

Experiment with different GNN architectures, such as Graph Attention Networks (GAT).
Tune hyperparameters like learning rate, number of layers, and hidden layer size.
Use additional datasets like PubMed or CiteSeer.
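
Hyperparameter tuning can be organized as a simple grid search. The sketch below is one possible shape for it: `grid_search`, the grid values, and `dummy_train_and_evaluate` are all hypothetical names introduced for illustration, and the dummy scoring function is a placeholder that should be replaced by real training that returns a validation score.

```python
from itertools import product

def grid_search(train_and_evaluate, grid):
    """Try every combination in `grid` and return the best (score, config)."""
    names = list(grid)
    best_score, best_config = float("-inf"), None
    for values in product(*(grid[name] for name in names)):
        config = dict(zip(names, values))
        score = train_and_evaluate(**config)
        if score > best_score:
            best_score, best_config = score, config
    return best_score, best_config

# Hypothetical search space for the hyperparameters listed above.
grid = {
    "lr": [0.1, 0.01, 0.001],
    "num_layers": [2, 3],
    "hidden_size": [16, 64],
}

def dummy_train_and_evaluate(lr, num_layers, hidden_size):
    # Placeholder score so the sketch runs; replace with real training
    # that returns a validation metric such as accuracy.
    return -abs(lr - 0.01) - abs(num_layers - 2) - abs(hidden_size - 64) / 100

best_score, best_config = grid_search(dummy_train_and_evaluate, grid)
print(best_config)
```

Selecting on a validation split rather than the test split keeps the final test metrics an honest estimate.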

Project Structure

```
gnn-node-classification/
├── 00.Project-Description/
│   └── Project.Description.docx          # Project overview document
├── 01.Dataset-And-Codes/
│   ├── data/                             # Folder for storing the dataset
│   └── main.py                           # Main script for training and evaluation
├── 02.Reports-And-Presentation/
│   ├── report.pdf                        # Detailed project report
│   └── presentation.pdf                  # Project presentation slides
├── .gitignore                            # Git ignore file
├── README.md                             # Project README file
└── requirements.txt                      # List of dependencies
```

Contributing

Contributions are welcome! Please submit issues or pull requests for any improvements or new features.

License

This project is licensed under the MIT License.

Repository: Ryan-PG/citation-network-gnn
