This repository contains code for the Walmart Sales Forecasting project. The project forecasts weekly sales for 45 Walmart stores located in different regions, using historical sales data, holiday events, and store information. It provides a framework for running and evaluating multiple models in a Spark environment, along with code to build a Spark Docker image and run the Spark container locally if needed.
- Clone the repository:

  ```bash
  git clone https://github.com/selewaut/forecast_forge.git
  cd forecast_forge
  ```
- Create and activate a virtual environment:

  ```bash
  python3 -m venv .venv
  source .venv/bin/activate
  ```
- Install the required dependencies:

  ```bash
  pip install -r requirements.txt
  ```
- Install the `forecast_forge` package:

  ```bash
  pip install -e .
  ```
- Install OpenJDK on the local machine (Spark requires a JVM):

  ```bash
  sudo apt-get update
  sudo apt-get install openjdk-8-jdk
  ```
The data is stored in the `data/` directory, in the following files:

- `train.csv`: historical sales data for 45 Walmart stores
- `test.csv`: test data for forecasting
- `features.csv`: additional data related to the stores and regional activity
- `stores.csv`: store information
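As a minimal sketch of how these files relate, the example below joins tiny made-up rows with pandas. The column names follow the Kaggle schema (`Store`, `Dept`, `Date`, `Weekly_Sales`, etc.); verify them against the downloaded CSVs.

```python
import pandas as pd

# Illustrative rows only -- not real data from the competition files.
train = pd.DataFrame({"Store": [1, 1], "Dept": [1, 1],
                      "Date": ["2010-02-05", "2010-02-12"],
                      "Weekly_Sales": [24924.50, 46039.49]})
stores = pd.DataFrame({"Store": [1], "Type": ["A"], "Size": [151315]})
features = pd.DataFrame({"Store": [1, 1],
                         "Date": ["2010-02-05", "2010-02-12"],
                         "Temperature": [42.31, 38.51],
                         "IsHoliday": [False, True]})

# train joins stores on Store, and features on (Store, Date).
df = train.merge(stores, on="Store").merge(features, on=["Store", "Date"])
```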
The data originally comes from the following Kaggle competition: https://www.kaggle.com/c/walmart-recruiting-store-sales-forecasting/data. To download it you need a Kaggle account and a Kaggle API key:
- Install the kaggle package:

  ```bash
  pip install kaggle
  ```

- Generate an API key from your Kaggle account and save it in `~/.kaggle/kaggle.json`.
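With the CLI installed and the key in place, the dataset can then be fetched. The competition slug below is taken from the URL above, and `-p data/` targets the `data/` directory described earlier:

```shell
# Requires a valid ~/.kaggle/kaggle.json; downloads the archive into data/.
kaggle competitions download -c walmart-recruiting-store-sales-forecasting -p data/
unzip -o data/walmart-recruiting-store-sales-forecasting.zip -d data/
```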
- Move to the directory containing the Spark Dockerfile:

  ```bash
  cd spark-setup
  ```
- Start the container:

  ```bash
  make run
  ```

  This will start a Spark container with the code mounted inside it. The container keeps running in the background.
- Run the `univariate_weekly.py` script to train and test univariate weekly sales forecasting models:

  ```bash
  spark-submit --master local[*] src/forecast_forge/univariate_weekly.py
  ```
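To illustrate what a univariate weekly forecaster looks like, here is a seasonal-naive baseline that repeats the value observed 52 weeks earlier. This is a hedged sketch for orientation only, not the implementation in `univariate_weekly.py`:

```python
import pandas as pd

def seasonal_naive(history: pd.Series, horizon: int, season: int = 52) -> pd.Series:
    """Forecast each future week with the value observed `season` weeks earlier."""
    last_season = history.iloc[-season:]
    reps = -(-horizon // season)  # ceiling division to cover long horizons
    values = pd.concat([last_season] * reps).iloc[:horizon].to_numpy()
    return pd.Series(values)

# Tiny example: 104 weeks of synthetic history, forecast 4 weeks ahead.
history = pd.Series(range(104))
fc = seasonal_naive(history, horizon=4)
```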
Results are saved in Parquet format under the `evaluation_output` path.