DSCI-644 Project Repository

This repository contains the following items:

project1-commitsRefactoring.xlsx - Raw data file. Contains the base raw data with duplicates removed (a duplicate is an entire row occurring more than once)
data_train.csv - Training data extracted from the raw data file
data_test.csv - Test data extracted from the raw data file
dataone_train.csv - Training data extracted from data_train.csv, containing only rows with a single refactoring label
datamulti_train.csv - Training data extracted from data_train.csv, containing only rows with multiple refactoring labels
x_data.csv, y_data.csv, pred_data.csv - Files used to analyse the false positive (fp) and false negative (fn) cases and understand where the predictions go wrong
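
The data preparation described above can be illustrated with a minimal sketch. It is not the code used in this repository: the column name `label`, the comma separator between labels, and the sample rows are all assumptions made up for the example, and the real files may be laid out differently. It shows whole-row de-duplication, the single-label versus multi-label split, and one way to list fp and fn cases for a class.

```python
import csv
import io

# Hypothetical column name and separator; the real files may differ.
LABEL_COLUMN = "label"
LABEL_SEPARATOR = ","


def drop_duplicate_rows(rows):
    """Keep the first occurrence of each fully identical row."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def split_by_label_count(rows):
    """Separate rows with exactly one label from rows with several labels."""
    single, multi = [], []
    for row in rows:
        parts = row[LABEL_COLUMN].split(LABEL_SEPARATOR)
        labels = [part.strip() for part in parts if part.strip()]
        if len(labels) == 1:
            single.append(row)
        elif len(labels) > 1:
            multi.append(row)
    return single, multi


def fp_fn_indices(y_true, y_pred, positive):
    """Return the row indices of false positives and false negatives for one class."""
    pairs = list(enumerate(zip(y_true, y_pred)))
    fp = [i for i, (t, p) in pairs if p == positive and t != positive]
    fn = [i for i, (t, p) in pairs if t == positive and p != positive]
    return fp, fn


# Small in-memory stand-in for a training file (made-up rows).
sample = io.StringIO(
    "message,label\n"
    "rename helper,Rename Method\n"
    "rename helper,Rename Method\n"
    "move and rename,\"Move Class,Rename Class\"\n"
)
rows = drop_duplicate_rows(list(csv.DictReader(sample)))
single, multi = split_by_label_count(rows)
print(len(rows), len(single), len(multi))
```

With real data, the same functions could be applied to rows read from data_train.csv, and `fp_fn_indices` to the values in y_data.csv and pred_data.csv, once the actual column names are known.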

Note: This repository contains several experimental scripts. For Phase-3 of the project, refer to prod.py and test.py

Steps for Execution:

Run test.py. This executes prod.py, which contains the implementation. By default the code runs on dataone_train.csv (single-class data)
To run on the multi-class data, comment out line #105 and uncomment line #108 so that datamulti_train.csv is used
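
As a possible alternative to editing line numbers by hand, the training file could be chosen with a command-line option. The sketch below is only an illustration: the `--mode` flag and the `choose_train_file` function are hypothetical and do not exist in prod.py or test.py. Only the two file names come from this README.

```python
import argparse

# File names as listed in this README; the mode names are hypothetical.
TRAIN_FILES = {
    "single": "dataone_train.csv",
    "multi": "datamulti_train.csv",
}


def choose_train_file(argv=None):
    """Return the training CSV for the requested mode (hypothetical --mode flag)."""
    parser = argparse.ArgumentParser(description="Pick the training data set.")
    parser.add_argument(
        "--mode",
        choices=sorted(TRAIN_FILES),
        default="single",
        help="single-class or multi-class training data",
    )
    args = parser.parse_args(argv)
    return TRAIN_FILES[args.mode]


# Example: request the multi-class data explicitly.
print(choose_train_file(["--mode", "multi"]))
```

Defaulting to `single` mirrors the current behaviour described above, where dataone_train.csv is used unless the code is changed.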
