This was a two-person team project developed for the "Data Mining and Machine Learning" course. Our team implemented the following objectives:
-
Part A: Given a dataset of wines described by chemical attributes (e.g. pH, alcohol, density), we developed and fine-tuned an SVM classifier to predict the quality of each wine. We then improved the model's accuracy by 10% after testing different strategies for filling in the missing pH values (mean-value assignment, K-means clustering of similar wines).
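The two imputation strategies named above can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the project's actual code: the function names (`impute_mean`, `kmeans`, `impute_by_cluster`), the toy `wines` rows, and the choice of `k=2` are all assumptions made for the example, and a real pipeline would normally scale the features before clustering.

```python
from statistics import mean

def impute_mean(rows, col):
    """Replace None in column `col` with the mean of the observed values."""
    fill = mean([r[col] for r in rows if r[col] is not None])
    return [[fill if (i == col and v is None) else v for i, v in enumerate(r)]
            for r in rows]

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; centroids start at the first k points (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [mean(dim) for dim in zip(*members)]
    return labels

def impute_by_cluster(rows, col, k=2):
    """Cluster rows on the other columns, then fill None with the cluster's mean."""
    others = [[v for i, v in enumerate(r) if i != col] for r in rows]
    labels = kmeans(others, k)
    global_mean = mean([r[col] for r in rows if r[col] is not None])
    filled = []
    for r, lab in zip(rows, labels):
        r = list(r)
        if r[col] is None:
            peers = [q[col] for q, m in zip(rows, labels)
                     if m == lab and q[col] is not None]
            # Fall back to the global mean if the cluster has no observed value.
            r[col] = mean(peers) if peers else global_mean
        filled.append(r)
    return filled

# Toy data (made up for illustration): columns are [alcohol, density, pH].
wines = [
    [9.0, 0.998, 3.50],
    [13.0, 0.990, 3.10],
    [9.2, 0.997, 3.40],
    [12.8, 0.991, 3.00],
    [9.1, 0.998, None],
    [12.9, 0.990, None],
]

mean_filled = impute_mean(wines, col=2)
cluster_filled = impute_by_cluster(wines, col=2, k=2)
```

On this toy data, mean-value assignment gives both incomplete wines the same pH, while the cluster-based variant gives each one the average pH of the wines it most resembles, which is the intuition behind trying both strategies.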
-
Part B: Given a dataset of news titles, we developed a neural network classifier that predicted whether each title was published by a specific magazine (a binary classification problem). After performing essential preprocessing steps (stemming, stopword removal, TF-IDF weighting), our classifier achieved a precision of 0.88 and a recall of 0.83.
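To make the preprocessing and evaluation steps more concrete, here is a small standard-library sketch of stopword removal, TF-IDF weighting, and the precision/recall computation. It is illustrative only and not the project's code: the `STOPWORDS` set, the sample `titles`, and the helper names are assumptions, stemming is omitted because the standard library has no stemmer, and the weighting uses the plain `tf * log(N / df)` form, whereas libraries often apply smoothed variants.

```python
import math
from collections import Counter

STOPWORDS = {"the", "a", "of", "in", "to", "and"}  # tiny illustrative list

def tokenize(title):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in title.lower().split() if w not in STOPWORDS]

def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    weights = []
    for toks in tokenized:
        tf = Counter(toks)
        weights.append({t: (c / len(toks)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return weights

def precision_recall(y_true, y_pred):
    """Precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Made-up titles, purely for illustration.
titles = ["The markets rally in Europe", "Markets fall", "A guide to the stars"]
weights = tfidf(titles)

# Made-up labels: 1 = published by the target magazine, 0 = otherwise.
precision, recall = precision_recall([1, 1, 1, 1, 0, 0], [1, 1, 0, 0, 1, 0])
```

In this sketch, a term that appears in several titles ("markets") receives a lower weight than one unique to a single title ("rally"), and the resulting weight vectors are the kind of input a classifier would be trained on.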