This repository contains my own Python implementation of least-squares regression with elastic net regularization.
This work was performed for the Polished Code Release assignment in DATA 558 (Machine Learning), University of Washington, Spring 2017.
The elastic net is a hybrid of the ever-popular ℓ1 (LASSO) and squared ℓ2 (Ridge) regularization penalties. It mitigates some of LASSO's weaknesses (arbitrary selection among highly correlated variables; selecting at most n variables when p > n) while still maintaining LASSO's desirable variable selection capabilities.
The elastic net least-squares minimization problem can be written as follows (cf. Hastie, Tibshirani, and Wainwright, Section 4.2):

$$\min_{\beta \in \mathbb{R}^p} \; \frac{1}{2n} \lVert y - X\beta \rVert_2^2 + \lambda \left( \alpha \lVert \beta \rVert_1 + \frac{1 - \alpha}{2} \lVert \beta \rVert_2^2 \right)$$

where α ∈ [0, 1], with the two extremes (α = 0 and α = 1) corresponding to the Ridge and LASSO problems, respectively.
Because of the ℓ1 component, the objective function above is non-differentiable and therefore cannot be minimized by gradient descent. Instead, we leverage the subgradient of the absolute value function to define a soft-thresholding operator which is used to minimize the objective function one coordinate at a time. This process is known as coordinate descent.
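For intuition, here is a minimal sketch of the soft-thresholding operator (the function name is illustrative, not necessarily what the repository uses):

```python
import numpy as np

def soft_threshold(z, gamma):
    """Soft-thresholding: the proximal operator of gamma * |.|.

    Shrinks z toward zero by gamma and returns exactly 0 when |z| <= gamma,
    which is what produces sparse coefficients.
    """
    return np.sign(z) * np.maximum(np.abs(z) - gamma, 0.0)
```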
This repository contains Python code for solving the minimization problem described above. Specifically, I provide a function called `coorddescent` (in `src/coorddescent.py`) which runs the coordinate descent algorithm described above in one of two ways:
- cyclic: proceeds sequentially through the coordinates, returning to the first coordinate after all coordinates have been updated; repeats until the stopping criterion is met
- random: proceeds through the coordinates in a random order; repeats until the stopping criterion is met (see the sketch below)
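To make the two orderings concrete, here is a minimal sketch of the update loop. This illustrates the technique only; it is not the repository's `coorddescent` function, and the names and exact objective scaling are my own assumptions:

```python
import numpy as np

def coord_descent_sketch(X, y, lam, alpha, order="cyclic", max_iter=1000, tol=1e-6):
    """Minimal elastic net coordinate descent (illustrative, not the repo's API).

    Minimizes (1/2n)||y - X b||^2 + lam * (alpha*||b||_1 + (1-alpha)/2*||b||_2^2).
    order: "cyclic" sweeps coordinates 0..p-1 in turn; "random" shuffles
    the sweep order on every pass.
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n                  # (1/n) ||x_j||^2 per column
    for _ in range(max_iter):
        coords = np.arange(p) if order == "cyclic" else np.random.permutation(p)
        max_step = 0.0
        for j in coords:
            r_j = y - X @ beta + X[:, j] * beta[j]     # residual excluding coordinate j
            z = X[:, j] @ r_j / n                      # univariate least-squares fit
            # Soft-threshold z (l1 part), then shrink (squared-l2 part).
            b_new = np.sign(z) * max(abs(z) - lam * alpha, 0.0)
            b_new /= col_sq[j] + lam * (1 - alpha)
            max_step = max(max_step, abs(b_new - beta[j]))
            beta[j] = b_new
        if max_step < tol:                             # stopping criterion
            break
    return beta
```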
`src/coorddescent.py` also includes several supplemental functions which are either called by `coorddescent` or are useful for visualizing how the algorithm arrives at its solution. I also include a cross-validation function (`coorddescentCV`).
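To illustrate the idea behind cross-validating the penalty parameter, here is a generic sketch that reuses `coord_descent_sketch` from above; it is not the actual `coorddescentCV`:

```python
import numpy as np

def cv_lambda_sketch(X, y, lambdas, alpha=0.5, n_folds=5):
    """Choose lambda by k-fold cross-validated mean squared error (illustrative)."""
    n = X.shape[0]
    folds = np.array_split(np.random.permutation(n), n_folds)
    mse = np.zeros(len(lambdas))
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(n), test_idx)
        for i, lam in enumerate(lambdas):
            beta = coord_descent_sketch(X[train_idx], y[train_idx], lam, alpha)
            resid = y[test_idx] - X[test_idx] @ beta
            mse[i] += (resid ** 2).mean() / n_folds    # average test MSE across folds
    return lambdas[int(np.argmin(mse))]                # lambda with lowest CV error
```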
Please refer to the following examples in which I demonstrate the functionality of the code in this repository:
- Demo 1: Coordinate descent on a simulated dataset
- Demo 2: Coordinate descent on a "real-world" dataset
- Demo 3: Comparison of my functions to those from scikit-learn
I wrote most of this code for the take-home portion of my DATA 558 Midterm in Spring 2017. I subsequently enhanced and cleaned the code prior to releasing it on GitHub as part of the Polished Code Release assignment for the course.
To use the code in this repository:
- clone the repository
- navigate to the main directory (i.e., the one that contains this README.md file)
- launch `python`
- enter `import src.coorddescent` (or `import src.coorddescent as cd`, or whatever shorthand you prefer)
The functions from `coorddescent.py` should now be available to you in Python by typing `src.coorddescent.<function_name>` (or `cd.<function_name>` if you use the shorthand recommended in the last bullet above).
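For example (argument lists omitted, since they are documented in the module itself):

```python
# Launched from the repository's main directory:
import src.coorddescent as cd

# The module's functions are now available, e.g.:
# cd.coorddescent(...)     # run coordinate descent (see its docstring for arguments)
# cd.coorddescentCV(...)   # cross-validated variant
```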
This code was developed in Python 3.6.0; functionality is not guaranteed for older versions. The `copy` module used here ships with the Python standard library, but you may need to install the following third-party packages if they do not already exist on your machine:
- matplotlib (for `matplotlib.pyplot`)
- numpy
- pandas
- scikit-learn (for `sklearn.metrics.mean_squared_error`)
Hastie, Trevor, Robert Tibshirani, and Martin Wainwright. *Statistical Learning with Sparsity: The Lasso and Generalizations*, Section 4.2. Boca Raton: CRC Press, Taylor & Francis Group, 2015.