This repository has been archived by the owner on Dec 14, 2022. It is now read-only.

dsp-uga/daphne-p1


daphne-p1 | CSCI 8360 Data Science Practicum, Spring 2021: Malware Classifier

Goal

To develop a machine learning pipeline that classifies malware (input as byte strings) as accurately as possible.

Getting Started

These instructions describe the prerequisites and steps to get the project up and running.

Setup

This project can be set up on the Google Cloud Platform using its Dataproc service for batch processing. Learn about Dataproc here: https://cloud.google.com/dataproc/docs/concepts/overview

We recommend that you set up a virtual environment and install the software listed in requirements.txt. We use Python 3.7.

Usage

Features

For features, we used basic word counts. By default, the word counts are adjusted using additive (Laplace) smoothing.
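As a rough illustration of additive smoothing applied to word counts, the sketch below turns the tokens of one class into smoothed probabilities. It is not the project's actual code: the function name `smoothed_word_probs`, the `alpha` parameter, and the choice of two-character hex byte tokens as "words" are all assumptions made for this example.

```python
from collections import Counter


def smoothed_word_probs(tokens, vocabulary, alpha=1.0):
    """Estimate P(word) over a fixed vocabulary with additive smoothing.

    tokens:     iterable of words observed for one class (assumed here to be
                hex byte tokens such as "00" or "FF")
    vocabulary: collection of all words considered
    alpha:      smoothing constant; alpha=1.0 is classic Laplace smoothing
    """
    vocab = list(vocabulary)
    vocab_set = set(vocab)
    # Count only tokens that belong to the vocabulary.
    counts = Counter(t for t in tokens if t in vocab_set)
    total = sum(counts.values())
    # Every word receives alpha pseudo-counts, so no probability is zero.
    denominator = total + alpha * len(vocab)
    return {w: (counts[w] + alpha) / denominator for w in vocab}


if __name__ == "__main__":
    probs = smoothed_word_probs(["00", "FF", "00"], ["00", "FF", "1A"])
    for word, p in probs.items():
        print(word, round(p, 4))
```

The unseen word "1A" still receives a nonzero probability, which is the point of smoothing: a classifier that multiplies word probabilities is not zeroed out by a single word missing from the training data for a class.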

Output

A list of numerical predictions, one per malware sample, which we submit to an online scoring app.
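A minimal sketch of how such a prediction list could be written to disk before submission. The one-integer-per-line layout, the function name `write_predictions`, and the file name are assumptions for illustration; the format the scoring app actually expects is not stated here.

```python
from pathlib import Path


def write_predictions(predictions, path):
    """Write one integer class label per line and return the line count.

    The one-label-per-line layout is an assumed format, not a documented one.
    """
    lines = [str(int(p)) for p in predictions]
    Path(path).write_text("\n".join(lines) + "\n")
    return len(lines)


if __name__ == "__main__":
    # Hypothetical labels and output file name.
    n = write_predictions([3, 1, 9], "predictions.txt")
    print("wrote", n, "predictions")
```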

Directories

  • datasets: contains much smaller subsets of our final training and testing datasets for setup and initial experiments
  • features: csv files containing the features we find for our malware data
  • notebooks: Jupyter notebook Python files, .jnb
  • output: files with results from experiments

Branches

  • main: primary project branch for tested, working code accepted via pull requests
  • zain: Meekail's development branch
  • vance: Jonathan's development branch
  • shihan: Shihan's development branch

Contributors

See Contributors file for more details.

License

This project is licensed under the MIT License. See LICENSE for more details.
