Code and datasets for the thesis "Classification of Cyber-Security Requirements based on open datasets and GitHub harvesting"
Models Training and Testing.ipynb
- notebook with the models
language_detection.py
- script that detects the language of a text using CLD2
github_scraper.py
- script that harvests issues and repositories from GitHub
data:
- security_terms.csv
  - list of security terms
- repositories.csv
  - table of the GitHub repositories and links that we used for harvesting
- datasets
  - folder with 4 datasets for model training
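
As a rough illustration of the harvesting step that github_scraper.py covers, the sketch below pages through a repository's issues using the public GitHub REST API. It is not the code from the thesis: the function names (`issues_url`, `keep_issues`, `fetch_issues`), the `User-Agent` string, and the choice of `urllib` are assumptions made for this example. It may still be useful as a starting point, since it shows one detail that is easy to miss: the issues endpoint also returns pull requests, which carry a `pull_request` key.

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://api.github.com"


def issues_url(owner, repo, page=1, per_page=100):
    """Build the REST URL for one page of a repository's issues (hypothetical helper)."""
    query = urllib.parse.urlencode(
        {"state": "all", "per_page": per_page, "page": page}
    )
    return f"{API_ROOT}/repos/{owner}/{repo}/issues?{query}"


def keep_issues(items):
    """Drop pull requests, which the issues endpoint returns alongside real issues."""
    return [item for item in items if "pull_request" not in item]


def fetch_issues(owner, repo, max_pages=1):
    """Download up to max_pages pages of issues; needs network access."""
    collected = []
    for page in range(1, max_pages + 1):
        request = urllib.request.Request(
            issues_url(owner, repo, page),
            headers={
                "Accept": "application/vnd.github+json",
                # Placeholder client name, chosen for this sketch only.
                "User-Agent": "issue-harvest-sketch",
            },
        )
        with urllib.request.urlopen(request) as response:
            batch = json.load(response)
        if not batch:  # an empty page means there is nothing left to read
            break
        collected.extend(keep_issues(batch))
    return collected
```

Calling `fetch_issues("owner", "repo", max_pages=2)` with a real repository from repositories.csv would return a list of issue dictionaries. Unauthenticated requests are rate limited by GitHub, so a larger harvest would likely need an access token in the request headers.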