Salary Prediction Based on Personal Characteristics

This project predicts whether a person's salary exceeds $50,000 per year based on demographic and employment characteristics. It uses the Adult dataset from the UCI Machine Learning Repository (University of California, Irvine).

Dataset Information

The dataset contains 14 predictor variables and one target variable, with a total of 32,561 samples. The target variable indicates whether a person's salary is above 50K (">50K") or at most 50K ("<=50K"). The predictor variables are age, work class, final weight (fnlwgt), education, education number, marital status, occupation, relationship, race, sex, capital gain, capital loss, hours per week, and native country.
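As a rough illustration of the layout described above, the following sketch parses a few rows in the comma-separated format of the Adult data file using only the Python standard library. The three rows in SAMPLE are made up for illustration and are not taken from the dataset, and the helper name parse_rows is hypothetical. The sketch assumes missing values are marked with "?", as they are in the Adult data file; it is not the code used in the project notebook.

```python
import csv
import io

# Column names of the Adult dataset: 14 predictors plus the income target.
COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education-num",
    "marital-status", "occupation", "relationship", "race", "sex",
    "capital-gain", "capital-loss", "hours-per-week", "native-country",
    "income",
]
NUMERIC = {
    "age", "fnlwgt", "education-num",
    "capital-gain", "capital-loss", "hours-per-week",
}

# Illustrative rows only (made up for this sketch, not real records).
SAMPLE = """\
39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K
52, Self-emp-inc, 200000, Masters, 14, Married-civ-spouse, Exec-managerial, Husband, White, Male, 15024, 0, 50, United-States, >50K
28, ?, 150000, HS-grad, 9, Never-married, ?, Own-child, Black, Female, 0, 0, 30, ?, <=50K
"""


def parse_rows(text):
    """Parse rows into dicts, mapping '?' to None and adding a 0/1 target."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    records = []
    for raw in reader:
        if len(raw) != len(COLUMNS):
            continue  # skip blank or malformed lines
        record = {}
        for name, value in zip(COLUMNS, raw):
            value = value.strip()
            if value == "?":
                record[name] = None          # missing value
            elif name in NUMERIC:
                record[name] = int(value)
            else:
                record[name] = value
        # Binary target: 1 for ">50K", 0 for "<=50K".
        record["target"] = 1 if record["income"] == ">50K" else 0
        records.append(record)
    return records


records = parse_rows(SAMPLE)
missing = sum(value is None for row in records for value in row.values())
print(f"{len(records)} rows, {len(COLUMNS) - 1} predictors, {missing} missing values")
```

Keeping missing values as None at parse time leaves the choice of how to handle them (dropping rows or imputing) to the preprocessing step.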

Project Objectives

The project involves several key tasks:

Preprocessing the dataset, which includes handling numerical and categorical features, dealing with outliers and null values, and transforming data.

Implementing various supervised learning models, specifically Logistic Regression, K-Nearest Neighbors, and Decision Trees.

Evaluating the performance of these models and selecting the most effective one.
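The three tasks above can be sketched with scikit-learn as follows. This is a minimal sketch under stated assumptions, not the notebook's actual code: the data is a small synthetic frame with made-up column names (age, hours_per_week, education) standing in for the real features, the hyperparameters are arbitrary, and 5-fold cross-validated accuracy is assumed as the selection criterion. Outlier handling is not covered here.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption for illustration, not the Adult dataset).
rng = np.random.default_rng(0)
n = 300
age = rng.integers(18, 70, n)
hours = rng.integers(10, 60, n).astype(float)
education = rng.choice(["HS-grad", "Bachelors", "Masters"], n)
signal = 0.05 * (age - 40) + 0.05 * (hours - 35) + (education == "Masters")
y = (signal + rng.normal(0, 1, n) > 0).astype(int)
hours[::25] = np.nan  # inject some null values into a numeric feature

X = pd.DataFrame({"age": age, "hours_per_week": hours, "education": education})
numeric_cols = ["age", "hours_per_week"]
categorical_cols = ["education"]

# Preprocessing: impute and scale numeric features, impute and one-hot encode categorical ones.
preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), numeric_cols),
    ("cat", Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]), categorical_cols),
])

# The three supervised models named above, with illustrative hyperparameters.
models = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "DecisionTree": DecisionTreeClassifier(max_depth=5, random_state=0),
}

# Evaluate each model with 5-fold cross-validation and keep the mean accuracy.
scores = {}
for name, model in models.items():
    pipeline = Pipeline([("preprocess", preprocess), ("model", model)])
    fold_scores = cross_val_score(pipeline, X, y, cv=5, scoring="accuracy")
    scores[name] = fold_scores.mean()

best_model = max(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
print("Best model:", best_model)
```

Wrapping the preprocessing and the model in one pipeline means the imputer, scaler, and encoder are fitted only on the training folds, which avoids leaking information from the validation fold into the evaluation.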

The whole process is implemented and illustrated in a Jupyter notebook, which forms the main deliverable of the project.