Breast cancer is the most common cancer among women worldwide. It accounts for 25% of all cancer cases and affected over 2.1 million people in 2015 alone. It starts when cells in the breast begin to grow out of control. These cells usually form tumors that can be seen on an X-ray or felt as lumps in the breast area.
A key challenge in its detection is classifying tumors as malignant (cancerous) or benign (non-cancerous). We analyze this classification problem using machine learning and the Breast Cancer Wisconsin (Diagnostic) Dataset.
We train a k-nearest neighbors (KNN) classifier, a logistic regression model, a decision tree classifier, and a random forest classifier, and compare each model's accuracy and confusion matrix to determine the best model.
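The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the exact pipeline used in this project: it assumes scikit-learn is installed and uses the copy of the Breast Cancer Wisconsin (Diagnostic) data that ships with it. The 75/25 stratified split, the feature scaling for KNN and logistic regression, the hyperparameters, and the random seeds are all assumptions chosen for the example.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Load the features and labels. In scikit-learn's copy of the dataset,
# label 0 means malignant and label 1 means benign.
X, y = load_breast_cancer(return_X_y=True)

# Hold out 25% of the samples for evaluation (assumed split, not from the source).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The four models named in the text. KNN and logistic regression are
# sensitive to feature scale, so they are wrapped with a StandardScaler.
models = {
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model and record its test accuracy and confusion matrix.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
    }

for name, result in results.items():
    print(f"{name}: accuracy = {result['accuracy']:.3f}")
    # Rows are true labels, columns are predicted labels (order: 0, 1).
    print(result["confusion_matrix"])
```

Accuracy alone can hide the errors that matter most here. In the confusion matrix, the top-right cell counts malignant tumors predicted as benign, which is usually the costliest mistake in a diagnostic setting, so it is worth reading alongside the accuracy figure when picking a model.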