
Understanding Regularization in Machine Learning

How to deal with overfitting using regularization

Introduction

In my previous repo, I used only two features (x1, x2), so the decision boundary was a straight line in a 2D coordinate plane. In most real-world cases, the data set has many more features and the decision boundary is more complicated. With so many features, we often overfit the data. Overfitting is a modeling error in which a function fits the training data too closely: it captures the noise in the data set and may fail to generalize to new incoming data.

To overcome this issue, we mainly have two choices: 1) remove less useful features, or 2) penalize the complexity of our model, which is called regularization. This repo focuses on regularization.
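As a rough illustration of the second choice, here is a minimal NumPy sketch (not the tutorial's actual code) of a logistic regression cost function with an L2 penalty. The function names and the regularization parameter `lam` are my own choices for this example; by convention the bias term `theta[0]` is left unpenalized.

```python
import numpy as np

def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y, lam):
    """Cross-entropy cost with an L2 penalty on the non-bias weights.

    X is assumed to have a leading column of ones, so theta[0] is the
    bias term and is excluded from the penalty.
    """
    m = len(y)
    h = sigmoid(X @ theta)
    cross_entropy = -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m
    penalty = (lam / (2 * m)) * np.sum(theta[1:] ** 2)
    return cross_entropy + penalty

def gradient(theta, X, y, lam):
    """Gradient of the regularized cost with respect to theta."""
    m = len(y)
    h = sigmoid(X @ theta)
    grad = X.T @ (h - y) / m
    grad[1:] += (lam / m) * theta[1:]  # penalty term, skipping the bias
    return grad
```

Setting `lam = 0` recovers plain (unregularized) logistic regression; increasing `lam` shrinks the weights toward zero, which smooths the decision boundary and reduces overfitting.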

Example data to be classified

Logistic regression with no regularization

Logistic regression with regularization

You can see the tutorial here.

Similar tutorial with Python can be viewed here.