ML_MiniProj

Essay Classification Project

Overview

The Essay Classification Project is a machine learning project that categorizes essays into specific genres or prompts. It extracts features from the essay text and classifies essays with two models: XGBoost, and AdaBoost with a logistic regression base estimator.

Key Components

Feature Extraction: The feature_extraction.py module extracts features from essay texts. These include sentence and word counts, the presence of punctuation marks, and the presence of specific words or phrases.
Model Training: The model_training.py module trains the two models, XGBoost and AdaBoost with a logistic regression base estimator. It computes each model's ROC AUC score and plots the scores in a bar graph to compare model performance.
Data Handling: The data/ directory stores the input data, for example train_essays_7_prompts.csv. The output/ directory contains the generated submission file, submission.csv, which holds the average of the XGBoost and AdaBoost predictions.
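
As an illustration of the kind of features described above, here is a minimal sketch in plain Python. It is not the code in feature_extraction.py: the function name `extract_features`, the feature names, and the `MARKER_WORDS` list are all assumptions made for this example, since the README does not list the exact features or words used.

```python
import re

# Hypothetical marker words; the README does not say which words or phrases are used.
MARKER_WORDS = ("however", "although", "because")


def extract_features(text: str) -> dict:
    """Return a small dictionary of surface features for one essay (illustrative only)."""
    # Rough sentence split: break after '.', '!' or '?' when followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Words are runs of letters and apostrophes.
    words = re.findall(r"[A-Za-z']+", text)
    lowered = [w.lower() for w in words]

    features = {
        "sentence_count": len(sentences),
        "word_count": len(words),
        # Punctuation presence encoded as 0/1 flags.
        "has_question_mark": int("?" in text),
        "has_semicolon": int(";" in text),
        "has_parenthesis": int("(" in text or ")" in text),
    }
    # One 0/1 flag per marker word.
    for marker in MARKER_WORDS:
        features[f"has_{marker}"] = int(marker in lowered)
    return features


sample = "Essays vary in style. However, is punctuation a useful signal? It may be; we test it."
features = extract_features(sample)
print(features)
```

Each essay becomes one row of numeric features, which is the shape tree-based and linear models typically expect as input.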

Usage

Install the required dependencies:
```
pip install -r requirements.txt
```
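
The evaluation and submission steps described under Key Components (ROC AUC scoring and averaging the two models' predictions) can be sketched without third-party packages. This is a hedged illustration, not the code in model_training.py: the helper names `roc_auc` and `average_predictions` are hypothetical, the probabilities are made-up toy values rather than project results, and the project itself may well use a library routine for the metric.

```python
def roc_auc(labels, scores):
    """ROC AUC via the rank-sum (Mann-Whitney U) formulation; ties get average ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC AUC needs at least one positive and one negative label")
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def average_predictions(first, second):
    """Element-wise mean of two equally long lists of predicted probabilities."""
    return [(a + b) / 2 for a, b in zip(first, second)]


# Toy labels and probabilities, invented for illustration only.
y_true = [0, 0, 1, 1, 0, 1]
xgb_probs = [0.10, 0.40, 0.35, 0.80, 0.20, 0.90]
ada_probs = [0.30, 0.45, 0.60, 0.55, 0.50, 0.70]

ensemble = average_predictions(xgb_probs, ada_probs)
for name, probs in [("xgboost", xgb_probs), ("adaboost", ada_probs), ("average", ensemble)]:
    print(name, round(roc_auc(y_true, probs), 3))
```

Averaging is the simplest way to combine two probabilistic classifiers; it needs no extra training and often smooths out the individual models' errors.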

References

This project implements the conclusions of the paper cited below.

Heather Desaire, Aleesa E. Chua, Min-Gyu Kim, David Hua. "Accurately detecting AI text when ChatGPT is told to write like a chemist." Elsevier, 2023, DOI or Link.
