Skip to content

Linear models including instrumental variable estimators and panel data models

Notifications You must be signed in to change notification settings

nhmatheson/linearmodels

 
 

Repository files navigation

Linear Models

Build Status codecov

Linear (regression) models for Python. Extends statsmodels to include Panel regression and instrumental variable estimators:

  • Panel models:

    • Fixed effects (maximum two-way)
    • First difference regression
    • Between estimator for panel data
    • Pooled regression for panel data
    • Fama-MacBeth estimation of panel models
  • Instrumental Variable estimators

    • Two-stage Least Squares
    • Limited Information Maximum Likelihood
    • k-class Estimators
    • Generalized Method of Moments, also with continuously updating
  • Factor Asset Pricing Models:

    • 2- and 3-step estimation
    • Time-series estimation
    • GMM estimation
  • System Regression:

    • Seemingly Unrelated Regression (SUR/SURE)

Designed to work equally well with NumPy, Pandas or xarray data.

Panel models

Like statsmodels to include, supports patsy formulas for specifying models. For example, the classic Grunfeld regression can be specified

import numpy as np
from statsmodels.datasets import grunfeld
data = grunfeld.load_pandas().data
data.year = data.year.astype(np.int64)
# MultiIndex, entity - time
data = data.set_index(['firm','year'])
from linearmodels import PanelOLS
mod = PanelOLS(data.invest, data[['value','capital']], entity_effect=True)
res = mod.fit(cov_type='clustered', cluster_entity=True)

Models can also be specified using the formula interface.

from linearmodels import PanelOLS
mod = PanelOLS.from_formula('invest ~ value + capital + EntityEffect', data)
res = mod.fit(cov_type='clustered', cluster_entity=True)

The formula interface for PanelOLS supports the special values EntityEffects and TimeEffects which add entity (fixed) and time effects, respectively.

Instrumental Variable Models

IV regression models can be similarly specified.

import numpy as np
from linearmodels.iv import IV2SLS
from linearmodels.datasets import mroz
data = mroz.load()
mod = IV2SLS.from_formula('np.log(wage) ~ 1 + exper + exper ** 2 + [educ ~ motheduc + fatheduc]', data)

The expressions in the [ ] indicate endogenous regressors (before ~) and the instruments.

Installing

The latest release can be installed using pip

pip install linearmodels

The master branch can be installed by cloning the repo and running setup

git clone https://github.com/bashtage/linearmodels
cd linearmodels
python setup.py install

Documentation

Stable Documentation is built on every tagged version using doctr. Development Documentation is automatically built on every successful build of master.

Plan and status

Should eventually add some useful linear model estimators such as panel regression. Currently only the single variable IV estimators are polished.

  • Linear Instrumental variable estimation - complete
  • Linear Panel model estimation - complete
  • Fama-MacBeth regression - complete
  • Linear Factor Asset Pricing - complete
  • System regression - partially complete (3SLS not started)
  • Linear IV Panel model estimation - not started

Requirements

Running

With the exception of Python 3.5+, which is a hard requirement, the others are the version that are being used in the test environment. It is possible that older versions work.

  • Python 3.5+: extensive use of @ operator
  • NumPy (1.11+)
  • SciPy (0.17+)
  • pandas (0.18+)
  • xarray (0.9+)
  • statsmodels (0.8+)

Testing

  • py.test

Documentation

  • sphinx
  • guzzle_sphinx_theme
  • nbsphinx
  • nbconvert
  • nbformat
  • ipython
  • jupyter

About

Linear models including instrumental variable estimators and panel data models

Resources

Stars

Watchers

Forks

Packages

No packages published

Languages

  • Python 100.0%