
Ransomware-Detection-Mechanism

Ransomware Detection Mechanism (RDM) is a dual-purpose tool: it uses machine learning to detect ransomware within a network, and it collects, visualizes, and analyzes indicators of compromise (IOCs) for Emotet. This is a 2020 University of Ottawa undergraduate honours project. For a more detailed summary of our work, see our report "Report-Ransomware_Detection_using_Supervised Learning.pdf".

Project Members

Project Supervisor

Professor Miguel A. Garzón

Faculty Member, Ph.D., P.Eng., School of Electrical Engineering and Computer Science


Project Organization

├── LICENSE            <- Currently NO License
├── Makefile           <- Makefile with commands like `make data` or `make train`
├── README.md          <- The top-level README for developers using this project.
├── data
│   ├── external       <- Data from third party sources.
│   │    ├── binetflow <- Bidirectional netflow files for training set.
│   │    └── validation <- Bidirectional netflow files for validation.
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── preprocessed   <- Clean data set.
│   ├── processed      <- The final, canonical data sets for modeling.
│   ├── raw            <- The original, immutable data dump.
│   ├── trained        <- The final data set after training and testing.
│   └── validation     <- Results of validating model with validation data.
│
├── docs               <- A default Sphinx project; see sphinx-doc.org for details
│
├── metric             <- Generated files for training and validation metrics
│
├── models             <- Trained and serialized models, model predictions, or model summaries
│
├── notebooks          <- Jupyter notebooks. Naming convention is a number (for ordering),
│                         the creator's initials, and a short `-` delimited description, e.g.
│                         `1.0-jqp-initial-data-exploration`.
│
├── references         <- Data dictionaries, manuals, and all other explanatory materials.
│
├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures        <- Generated graphics and figures to be used in reporting
│
├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
│                         generated with `pip freeze > requirements.txt`
│
└── src                <- Source code for use in this project.
    ├── __init__.py    <- Makes src a Python module
    │
    ├── data           <- Scripts to download or generate data
    │   └── make_dataset.py
    │
    ├── features       <- Scripts to turn raw data into features for modeling
    │   └── build_features.py
    │
    ├── kibana         <- Scripts to collect IOCs for Kibana.
    │
    ├── models         <- Scripts to train models and then use trained models to make
    │   │                 predictions
    │   ├── predict_model.py
    │   └── train_model.py
    │
    └── visualization  <- Scripts to create exploratory and results oriented visualizations
        └── visualize.py

Project based on the cookiecutter data science project template.


Getting Started

Follow these instructions to set up a development environment on your local machine for development or testing purposes.

Prerequisites

You must install and set up the following to run the project.

Python Environment

First, install Python 3 and add it to your PATH.

Check that Python is correctly installed:

> C:\Users\alan1>python -V
Python 3.8.1

Next, create a Python 3 virtual environment (venv) for the project. In CMD or a Linux terminal, choose a directory for your venv, then create it:

> C:\>python -m venv RDM-env

This creates the venv in a folder called RDM-env. To activate it, run the activate script (this must be done for every new terminal or CMD session).

> C:\>RDM-env\Scripts\activate.bat (Windows)

> $ source RDM-env/bin/activate (macOS/Linux)

Once the venv is activated, Python runs from the venv and libraries are installed into it rather than system-wide. Now install the project's dependencies:

> (RDM-env) C:\Ransomware-Detection-Mechanism>pip install -r requirements.txt
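If you are unsure whether the venv is active, here is a quick check from Python (a minimal sketch; on Python 3.3+ a venv makes sys.prefix differ from sys.base_prefix):

# check_venv.py -- prints True when running inside a virtual environment
import sys

# In a venv, sys.prefix points at the venv while sys.base_prefix points
# at the base interpreter; outside a venv the two are the same.
print(sys.prefix != sys.base_prefix)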

You are now ready to run scripts. Proceed to Kibana or Training for more information.

IOC Dashboard (Kibana)

Elasticsearch

  1. Download and unzip from https://www.elastic.co/downloads/elasticsearch
  2. Run bin/elasticsearch (or bin\elasticsearch.bat on Windows)
  3. Run curl http://localhost:9200/ or Invoke-RestMethod http://localhost:9200 with PowerShell (a Python equivalent is sketched below)
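A minimal sketch of that check from Python, assuming the requests library is available (a common dependency, but an assumption here, since it may not be in requirements.txt):

# es_check.py -- confirm Elasticsearch is responding on the default port
import requests

resp = requests.get("http://localhost:9200/")
resp.raise_for_status()                  # raises if Elasticsearch is not up
print(resp.json()["version"]["number"])  # the root endpoint reports the version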

Kibana

  1. Download and unzip from https://www.elastic.co/downloads/kibana
  2. Open config/kibana.yml in an editor
  3. Set elasticsearch.hosts to point at your Elasticsearch instance
  4. Run bin/kibana (or bin\kibana.bat on Windows)
  5. Point your browser at http://localhost:5601 (or use the Python check sketched below)
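As with Elasticsearch, you can confirm Kibana is serving from Python. A minimal sketch against Kibana's /api/status endpoint (requests assumed again; the JSON layout of the response varies across Kibana versions, so only the HTTP status is checked here):

# kibana_check.py -- confirm Kibana is up on the default port
import requests

resp = requests.get("http://localhost:5601/api/status")
resp.raise_for_status()                  # HTTP 200 means Kibana is serving
print("Kibana responded with HTTP", resp.status_code)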

Setting up RDM Kibana Environment

For information on how to set up the RDM Kibana environment with the IOCs, see the Setting Up Kibana Environment page.

Using BulkAPI JSON Scripts

For information on how to use BulkAPI JSON scripts, see the How to Use Bulk JSON Scripts page.
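For a rough idea of what a bulk load looks like, here is a hedged sketch that posts two made-up IOC documents to Elasticsearch's _bulk endpoint (the index name iocs and the document fields are hypothetical; the project's actual scripts live under src/kibana and are documented on the wiki page above):

# bulk_load.py -- sketch of an Elasticsearch Bulk API upload
import json
import requests

docs = [
    {"type": "ip", "value": "203.0.113.7"},       # example IOC documents
    {"type": "domain", "value": "example.test"},
]

# The Bulk API expects newline-delimited JSON: an action line, then a source line.
lines = []
for doc in docs:
    lines.append(json.dumps({"index": {"_index": "iocs"}}))
    lines.append(json.dumps(doc))
body = "\n".join(lines) + "\n"                    # trailing newline is required

resp = requests.post(
    "http://localhost:9200/_bulk",
    data=body,
    headers={"Content-Type": "application/x-ndjson"},
)
resp.raise_for_status()
print(resp.json()["errors"])                      # False when every action succeeded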

Pipeline

├── 1. Preparing Data (1.3 mil rows)                  <- /src/data> python make_dataset.py
│      ├── 1. Download data sets        (8.5  min)
│      ├── 2. Create raw data           (8.5  sec)
│      ├── 3. Create interim data       (8    sec)
│      └── 4. Create preprocessed data  (30   sec)
├── 2. Build Features (1.3 mil rows)    (2.28 hours)   <- /src/features> python build_features.py
├── 3. Train Model    (1.3 mil rows)    (6.38  hours)  <- /src/models> python train_model.py
│      ├── One Class SVM                (5.03  hours)
│      ├── Confidence Score             (36.4  min)
│      ├── Save OC Features CSV         (4.4   min)
│      ├── Linear Regression            (22.2  min)
│      └── Save LR Features CSV         (4.6   min)
└── 4. Predict Model                    (24.7  min)     <- /src/models> python predict_model.py
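Step 3 above trains a One-Class SVM among other models. A minimal sketch of that technique, assuming scikit-learn is available; this is not the project's actual training code (see src/models/train_model.py), and the feature columns are hypothetical:

# ocsvm_sketch.py -- minimal One-Class SVM anomaly detector on netflow-style features
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Stand-in training data: rows are flows; columns might be duration,
# total bytes, and packet count (hypothetical choices for this example).
benign_flows = rng.normal(loc=[1.0, 500.0, 10.0],
                          scale=[0.2, 50.0, 2.0],
                          size=(1000, 3))

# nu bounds the fraction of training flows treated as outliers.
model = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale")
model.fit(benign_flows)

suspect = np.array([[30.0, 90000.0, 400.0]])      # an out-of-profile flow
print(model.predict(suspect))                     # -1 = anomaly, +1 = normal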

Docker

Prerequisite: Download and extract the CSV files (see CSV Files) and place them in the proper paths, or generate them by running make_dataset.py and build_features.py. The files are distributed as archives because GitHub does not allow uploads larger than 100 MB without Git LFS.

To extract files:

tar -xzvf processed.tar.gz
tar -xzvf val_processed.tar.gz

Train

Place processed.csv in /data/processed

$ docker build -f DockerTrain/Dockerfile -t train_model:latest .
$ docker run -d train_model:latest

Predict

Place processed.csv and val_processed.csv in /data/processed/.

$ docker build -f DockerPredict/Dockerfile -t predict_model:latest .
$ docker run -d predict_model:latest

Interact with Container

$ docker ps -a
$ docker exec -it (container id) bash

Training and Predict Locally

Follow these instructions to run training or prediction right away, assuming the processed data is already in place.

$ cd project/src/models
$ python train_model.py
$ python predict_model.py

Performing Pipeline Entirely

Follow these instructions to train or test models from scratch.

$ cd project/src/data
$ python make_dataset.py
$ cd ../features
$ python build_features.py
$ cd ../models
$ python train_model.py
$ python predict_model.py
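To drive all four steps from one script instead of cd-ing between directories, here is a minimal standard-library sketch (it assumes each script takes no required arguments, as in the commands above, and that it is run from the project root):

# run_pipeline.py -- run the four pipeline scripts in order
import subprocess

steps = [
    ("src/data", "make_dataset.py"),
    ("src/features", "build_features.py"),
    ("src/models", "train_model.py"),
    ("src/models", "predict_model.py"),
]

for cwd, script in steps:
    print(f"Running {script} ...")
    subprocess.run(["python", script], cwd=cwd, check=True)  # stop on first failure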

More Information

Visit the GitHub Wiki for more documentation and research on the project.
