
Scikit-learn Models

Shophine edited this page Mar 19, 2021 · 3 revisions

Naive Bayes

Naive Bayes methods are a set of supervised learning algorithms based on applying Bayes’ theorem with the “naive” assumption of conditional independence between every pair of features given the value of the class variable.

Approach

  • Separate the dataset into training and testing datasets
  • Take the training dataset and separate it by the target values
  • Calculate summary statistics (the mean and the standard deviation) for each feature
  • Summarize the data by class
  • Calculate the Gaussian Probability Density Function
  • Estimate the class probabilities
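
The steps above can be sketched from scratch in plain Python. This is a minimal illustration, not the code behind the reported accuracy: the page does not show its implementation, and in scikit-learn the same kind of model is available as `sklearn.naive_bayes.GaussianNB`. The helper names (`split_dataset`, `summarize_by_class`, `gaussian_pdf`, `predict`), the 25% test ratio, and the toy data are all assumptions made for the example.

```python
import math
import random
from collections import defaultdict


def split_dataset(rows, labels, test_ratio=0.25, seed=0):
    """Shuffle the row indices and split rows/labels into train and test parts."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    cut = int(len(indices) * (1 - test_ratio))
    train, test = indices[:cut], indices[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])


def summarize_by_class(rows, labels):
    """Group rows by label; keep the class prior and (mean, stdev) per feature."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    summaries = {}
    for label, group in grouped.items():
        stats = []
        for column in zip(*group):
            mean = sum(column) / len(column)
            variance = sum((x - mean) ** 2 for x in column) / max(len(column) - 1, 1)
            # The small constant avoids a division by zero for constant features.
            stats.append((mean, math.sqrt(variance) + 1e-9))
        summaries[label] = (len(group) / len(rows), stats)
    return summaries


def gaussian_pdf(x, mean, stdev):
    """Gaussian probability density of x for the given mean and stdev."""
    exponent = math.exp(-((x - mean) ** 2) / (2 * stdev ** 2))
    return exponent / (math.sqrt(2 * math.pi) * stdev)


def predict(summaries, row):
    """Return the label with the highest log prior plus log likelihood."""
    best_label, best_score = None, -math.inf
    for label, (prior, stats) in summaries.items():
        score = math.log(prior)
        for x, (mean, stdev) in zip(row, stats):
            # The tiny offset keeps log() defined if the density underflows to 0.
            score += math.log(gaussian_pdf(x, mean, stdev) + 1e-300)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up toy data (not the page's dataset): two features, two classes.
X = [[1.0, 2.1], [1.2, 1.9], [0.9, 2.0], [1.1, 2.2],
     [5.0, 8.1], [5.2, 7.9], [4.9, 8.0], [5.1, 8.2]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

X_train, y_train, X_test, y_test = split_dataset(X, y)
model = summarize_by_class(X_train, y_train)
accuracy = sum(predict(model, row) == label
               for row, label in zip(X_test, y_test)) / len(y_test)
print(f"accuracy on toy data: {accuracy:.2f}")
```

Scores are summed as logarithms rather than multiplied as raw probabilities, which avoids numeric underflow when there are many features.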

Result

Accuracy: 73%

Logistic Regression

Logistic regression is the appropriate regression analysis when the dependent variable is dichotomous (binary). Like all regression analyses, it is predictive: it models the probability of one of the two outcomes as a logistic (sigmoid) function of a weighted sum of the features.

Approach

  • Separate the dataset into training and testing datasets
  • Train the classifier on the training dataset
  • Use the trained classifier to predict the target values for the test dataset
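
As a rough illustration of these three steps, here is a from-scratch sketch in plain Python that trains the classifier by batch gradient descent. It is not the page's code: scikit-learn provides `sklearn.linear_model.LogisticRegression`, which uses different solvers and applies regularization by default. The function names, the learning rate, the epoch count, and the toy data are assumptions chosen for the example.

```python
import math
import random


def split_dataset(rows, labels, test_ratio=0.25, seed=0):
    """Shuffle the row indices and split rows/labels into train and test parts."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    cut = int(len(indices) * (1 - test_ratio))
    train, test = indices[:cut], indices[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(rows, labels, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_features = len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for row, label in zip(rows, labels):
            z = sum(w * x for w, x in zip(weights, row)) + bias
            error = sigmoid(z) - label  # predicted probability minus true label
            for j, x in enumerate(row):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def predict_label(weights, bias, row, threshold=0.5):
    """Return 1 if the predicted probability reaches the threshold, else 0."""
    z = sum(w * x for w, x in zip(weights, row)) + bias
    return 1 if sigmoid(z) >= threshold else 0


# Made-up toy data (not the page's dataset): two features, binary target.
X = [[-2.0, -1.5], [-1.5, -2.0], [-1.0, -1.0], [-2.5, -0.5],
     [1.0, 1.5], [2.0, 1.0], [1.5, 2.5], [0.5, 2.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

X_train, y_train, X_test, y_test = split_dataset(X, y)
weights, bias = train_logistic(X_train, y_train)
accuracy = sum(predict_label(weights, bias, row) == label
               for row, label in zip(X_test, y_test)) / len(y_test)
print(f"accuracy on toy data: {accuracy:.2f}")
```

The 0.5 threshold in `predict_label` is the usual default for a binary target; a different threshold trades false positives against false negatives.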

Result

Accuracy: 78%

Random Forest

Random forest is a supervised machine learning algorithm based on ensemble learning: it trains many decision trees and combines their predictions.

Approach

  • Pick N random records from the dataset
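  • Build a decision tree based on these N records
  • Choose the number of trees you want to build and repeat the two steps above for each tree (more trees generally give higher prediction accuracy, with diminishing returns)
  • Predict by letting every tree vote and taking the majority class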
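
The procedure above can be sketched from scratch in plain Python as follows. Several things here are assumptions rather than statements about the page's code: records are drawn with replacement (a bootstrap sample), trees split on Gini impurity up to a hypothetical `max_depth` of 3, and the forest predicts by majority vote. Unlike scikit-learn's `sklearn.ensemble.RandomForestClassifier`, this sketch does not sample a random subset of features at each split. All function names and the toy data are made up for the example.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return the (feature index, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    for j in range(len(rows[0])):
        values = sorted({row[j] for row in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [l for row, l in zip(rows, labels) if row[j] <= threshold]
            right = [l for row, l in zip(rows, labels) if row[j] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (j, threshold), score
    return best


def build_tree(rows, labels, max_depth=3):
    """Grow a tree recursively; a leaf is simply the majority label."""
    split = best_split(rows, labels) if max_depth > 0 else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    j, threshold = split
    left = [i for i, row in enumerate(rows) if row[j] <= threshold]
    right = [i for i, row in enumerate(rows) if row[j] > threshold]
    return (j, threshold,
            build_tree([rows[i] for i in left], [labels[i] for i in left], max_depth - 1),
            build_tree([rows[i] for i in right], [labels[i] for i in right], max_depth - 1))


def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(node, tuple):
        j, threshold, left, right = node
        node = left if row[j] <= threshold else right
    return node


def build_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees trees, each on N records drawn with replacement."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in sample],
                                 [labels[i] for i in sample]))
    return forest


def predict_forest(forest, row):
    """Majority vote over the predictions of all trees."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return votes.most_common(1)[0][0]


# Made-up toy data (not the page's dataset): two features, two integer classes.
X = [[1.0, 2.1], [1.2, 1.9], [0.9, 2.0], [1.1, 2.2],
     [5.0, 8.1], [5.2, 7.9], [4.9, 8.0], [5.1, 8.2]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

forest = build_forest(X, y, n_trees=15)
print(predict_forest(forest, [1.0, 2.0]), predict_forest(forest, [5.0, 8.0]))
```

Because internal nodes are stored as tuples and leaves as bare labels, this sketch assumes the class labels themselves are not tuples.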

Result

Accuracy: 80%
