nshahpazov/houses-ds-project

Houses Data Science Pipeline

This is a Houses data science pipeline, built from the analysis documented step by step in the notebooks under notebooks/.

Requirements

  • conda

Installation

Run the following to install the project as a Python package:

pip install houses_pipeline

Starting and exploring the project

conda env create -f environment.yml
conda activate houses

Building the package

python setup.py bdist_wheel

Pipeline steps

Fetch the dataset

./houses_pipeline/fetch/fetch_dataset.sh data/raw

Or run the equivalent commands directly:

kaggle competitions download -c house-prices-advanced-regression-techniques -p data/raw ;
unzip -o data/raw/*.zip -d data/raw/
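
The extraction step can also be done from Python, which may help on systems without the unzip command. This is a minimal sketch using only the standard library, not part of the project itself: the function name extract_archives is hypothetical, and it assumes the archives have already been downloaded into data/raw.

```python
import zipfile
from pathlib import Path
from typing import List


def extract_archives(raw_dir: str = "data/raw") -> List[str]:
    """Extract every .zip archive found in raw_dir into raw_dir itself.

    Hypothetical helper (not part of houses_pipeline). Mirrors
    `unzip -o data/raw/*.zip -d data/raw/`: existing files are overwritten.
    Returns the names of all extracted archive members.
    """
    raw_path = Path(raw_dir)
    extracted: List[str] = []
    for archive in sorted(raw_path.glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(raw_path)  # overwrites files that already exist
            extracted.extend(zf.namelist())
    return extracted
```

Calling extract_archives("data/raw") after the kaggle download should leave the CSV files next to the archive.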

Preprocess

python houses_pipeline/preprocess data/raw/train.csv data/interim/train.csv
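
When this step is scripted, the output directory may need to exist before the command runs. The sketch below wraps the command above under that assumption; whether the preprocess entry point creates data/interim itself is not stated here. The helper name run_preprocess and its dry_run flag are hypothetical and not part of the project.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def run_preprocess(src: str, dst: str, dry_run: bool = False) -> List[str]:
    """Create the parent directory of dst, then run the preprocess step.

    Hypothetical wrapper (not part of houses_pipeline). It assumes the
    command is run from the repository root, as in the README.
    """
    Path(dst).parent.mkdir(parents=True, exist_ok=True)
    cmd = [sys.executable, "houses_pipeline/preprocess", src, dst]
    if not dry_run:
        # check=True raises CalledProcessError if preprocessing fails
        subprocess.run(cmd, check=True)
    return cmd
```

For example, run_preprocess("data/raw/train.csv", "data/interim/train.csv") creates data/interim if needed and then runs the same command as above.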

Data Splitting

  • Not yet implemented

Model Training

  • Not yet implemented

Running tests

conda develop .
pytest

Usage

Preprocessing

python houses_pipeline/preprocess

Training the Lasso Regression

python houses_pipeline/modelling/train_lasso.py

Contributing / Developing

pip install -e ".[dev]"
