Introduction:
This project uses a wine quality data set. It contains various chemical properties of wine, such as acidity, sugar, pH, and alcohol, together with a quality score (3-9, where higher is better) and a color (red or white). The objective is to use data mining techniques such as multiple regression models and classification tree methods to predict wine quality.
The project will use the Python programming language and its packages, such as sklearn, numpy, pandas, and matplotlib. The following steps will be followed:
- Data Preprocessing: The data will be cleaned and preprocessed to remove missing values, outliers, and any other anomalies.
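- Determine Data Mining Task: Predict wine quality from the chemical properties. The score can be treated as a class label (classification) or as a numeric value (regression); the technique and metrics listed here correspond to the regression framing.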
- Determine Data Mining Technique: Multiple linear regression model
- Model Evaluation: The performance of the selected model will be evaluated using various metrics such as mean squared error (MSE) and mean absolute error (MAE).
- Deploying the Best Technique/Model: The selected model will be applied to new data records.
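
The steps above can be sketched end to end in a few lines. This is a minimal illustration under stated assumptions, not the project's actual code: the data is synthetic, the column names (`acidity`, `sugar`, `pH`, `alcohol`, `quality`), the relation used to generate the quality score, the 3-standard-deviation outlier rule, and the helper `preprocess` are all hypothetical. Only the overall flow (clean the data, fit a multiple linear regression, report MSE and MAE, predict on new records) comes from the list.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the wine data (assumption: real column names may differ).
rng = np.random.default_rng(0)
n = 500
wine = pd.DataFrame({
    "acidity": rng.normal(7.0, 1.0, n),
    "sugar": rng.normal(5.0, 2.0, n),
    "pH": rng.normal(3.2, 0.15, n),
    "alcohol": rng.normal(10.5, 1.2, n),
})
# Made-up relation between features and quality, for illustration only.
score = (6.0 + 0.4 * (wine["alcohol"] - 10.5)
         - 0.2 * (wine["acidity"] - 7.0)
         + rng.normal(0.0, 0.5, n))
wine["quality"] = score.round().clip(3, 9).astype(int)
# Inject a few missing values so the cleaning step has something to do.
wine.loc[wine.index[::50], "sugar"] = np.nan


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows with missing values, then rows with any feature
    more than 3 standard deviations from its column mean."""
    df = df.dropna()
    features = df.drop(columns="quality")
    z = (features - features.mean()) / features.std()
    return df[(z.abs() <= 3).all(axis=1)]


# Data preprocessing
clean = preprocess(wine)
X = clean.drop(columns="quality")
y = clean["quality"]

# Multiple linear regression on a train/test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
model = LinearRegression().fit(X_train, y_train)

# Model evaluation with MSE and MAE
pred = model.predict(X_test)
mse = mean_squared_error(y_test, pred)
mae = mean_absolute_error(y_test, pred)
print(f"MSE: {mse:.3f}  MAE: {mae:.3f}")

# Applying the fitted model to new records (here: three held-out rows)
new_records = X_test.head(3)
new_predictions = model.predict(new_records)
print(new_predictions)
```

With the real data set, the synthetic frame would be replaced by the loaded file, and the outlier rule would need to be chosen based on the actual feature distributions.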