
Understanding Regularization in Machine Learning

How to deal with overfitting using regularization

Introduction

In my previous repo, I used only two features (x1, x2), so the decision boundary was a straight line in a 2D coordinate plane. In most real-world cases, the data set has many more features and the decision boundary is more complicated. With so many features, we often overfit the data. Overfitting is a modeling error in which a function fits the training data too closely: it captures the noise in the data set and may fail to generalize to new incoming data.

To overcome this issue, we mainly have two choices: 1) remove less useful features, or 2) penalize the complexity of our model, which is called regularization. This repo focuses on regularization.
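As a rough illustration of the second choice, here is a minimal NumPy sketch (not the tutorial's actual code) of a logistic regression cost function with an L2 penalty. The function names and the regularization parameter `lam` are my own choices for this example; by convention the bias term `theta[0]` is left unpenalized.

```python
import numpy as np

def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y, lam):
    """Cross-entropy cost with an L2 penalty on the non-bias weights.

    X is assumed to have a leading column of ones, so theta[0] is the
    bias term and is excluded from the penalty.
    """
    m = len(y)
    h = sigmoid(X @ theta)
    cross_entropy = -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m
    penalty = (lam / (2 * m)) * np.sum(theta[1:] ** 2)
    return cross_entropy + penalty

def gradient(theta, X, y, lam):
    """Gradient of the regularized cost with respect to theta."""
    m = len(y)
    h = sigmoid(X @ theta)
    grad = X.T @ (h - y) / m
    grad[1:] += (lam / m) * theta[1:]  # penalty term, skipping the bias
    return grad
```

Setting `lam = 0` recovers plain (unregularized) logistic regression; increasing `lam` shrinks the weights toward zero, which smooths the decision boundary and reduces overfitting.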

Example data to be classified

Logistic regression with no regularization

Logistic regression with regularization

You can see the tutorial here.

Similar tutorial with Python can be viewed here.