This is a CTI dataset generator that builds datasets from public APT reports.
The source code has been tested with Python 2.7 and MISP 2.4.109 running on Ubuntu 18.04.
This work was published in the Security and Communication Networks special issue on Data-Driven Cybersecurity under the following title. A detailed description of the system and the dataset is available in the paper:
"Automated Dataset Generation System for Collaborative Research of Cyber Threat Analysis"
The following shows the event data structure, with comments on each data element.
The following is an example of event data. The left image is a report event and the right image is the corresponding malware event. Data in a report event is retrieved from security reports by the parser, while data in a malware event is extracted from the analysis results in the malware repository.
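As a rough illustration of how such an event could be inspected, the sketch below assumes the events are available in MISP's standard JSON event format (a top-level `Event` object with an `Attribute` list). That is an assumption, not something this README confirms; the actual field layout is the one described in the paper. The helper `summarize_event` and the sample values are hypothetical.

```python
import json
from collections import Counter


def summarize_event(event_json):
    """Return the event title and a count of attribute types.

    Assumes MISP's standard JSON layout: {"Event": {"info": ..., "Attribute": [...]}}.
    """
    event = json.loads(event_json)["Event"]
    attributes = event.get("Attribute", [])
    type_counts = Counter(attr["type"] for attr in attributes)
    return event.get("info", ""), type_counts


# Hypothetical sample event; the values are placeholders, not dataset content.
sample_event = json.dumps({
    "Event": {
        "info": "Example APT report (hypothetical)",
        "Attribute": [
            {"type": "md5", "category": "Payload delivery",
             "value": "d41d8cd98f00b204e9800998ecf8427e"},
            {"type": "domain", "category": "Network activity",
             "value": "example.com"},
            {"type": "md5", "category": "Payload delivery",
             "value": "0cc175b9c0f1b6a831c399e269772661"},
        ],
    }
})

info, counts = summarize_event(sample_event)
print(info)
for attr_type in sorted(counts):
    print("{}: {}".format(attr_type, counts[attr_type]))
```

Counting attribute types per event is a quick way to check that a report event and its malware event carry the indicators you expect before doing further analysis.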
The generated CTI dataset covers security reports published from 2008 to June 2019. It can be downloaded from the following links: