Lazy and Fast Greedy MAP Inference for Determinantal Point Process

This code is the official implementation of Lazy and Fast Greedy MAP Inference for Determinantal Point Process.

Requirements

Compile

When first cloning this repository, run the following commands:

git submodule init
git submodule update

To compile the C++ code, run:

cmake --preset make  # replace "make" with "ninja" if you use Ninja
cmake --build --preset release

Data Preprocessing

To generate the input data used in the experiments, follow these steps. The resulting data will be stored in data/.

Synthetic Datasets

To generate synthetic data, run the following:

./build/gen_wishart

Real-world Datasets

To pre-process the real-world datasets, follow these steps:

MovieLens 25M

To download the raw MovieLens 25M dataset and pre-process it, run the following commands:

mkdir -p data
wget -P data https://files.grouplens.org/datasets/movielens/ml-25m.zip
unzip data/ml-25m.zip -d data
./build/pre_process -d movie_lens

Netflix Prize

To get the Netflix Prize dataset, you need a Kaggle account. After logging in to Kaggle, download archive.zip from here and store it in data/. To pre-process it, run the following commands:

mkdir -p data
unzip data/archive.zip -d data/netflix_raw
./build/pre_process -d netflix

Computing Product Matrices

The matrix $L = B^\top B$ for the real-world datasets can be computed with the following commands (run from the repository root):

./build/product -d netflix
./build/product -d movie_lens
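
For readers unfamiliar with the product above, here is a minimal sketch of what computing $L = B^\top B$ means for a dense matrix. It is only an illustration. The `Matrix` alias and the `gram_matrix` function are hypothetical names, not taken from this repository, and the sketch assumes B is stored row by row with one column per item.

```cpp
#include <cstddef>
#include <vector>

// Dense matrix stored as a vector of rows (illustrative type, not from the repository).
using Matrix = std::vector<std::vector<double>>;

// Returns L = B^T B, where B has d rows and n columns (one column per item).
// The result is an n x n symmetric positive semidefinite matrix.
Matrix gram_matrix(const Matrix& B) {
    const std::size_t d = B.size();
    const std::size_t n = (d == 0) ? 0 : B[0].size();
    Matrix L(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i; j < n; ++j) {
            double dot = 0.0;
            for (std::size_t k = 0; k < d; ++k) {
                dot += B[k][i] * B[k][j];  // inner product of columns i and j
            }
            L[i][j] = dot;
            L[j][i] = dot;  // fill the mirrored entry by symmetry
        }
    }
    return L;
}
```

The triple loop costs O(n^2 d) time and O(n^2) memory, which is one reason a precomputed L can be worth storing when n is large.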

Run Experiments

Run the following commands from the repository root.

Greedy, RandomGreedy, StochasticGreedy, InterlaceGreedy

./build/exp -a [algorithm] -d [dataset_name] -m [input_matrix]
  • algorithm: greedy (default), random, stochastic, interlace
  • dataset_name: wishart, wishart_fixed_k, movie_lens, netflix
  • input_matrix: B (default), L
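
As background for the algorithms listed above, the following is a naive sketch of greedy MAP inference for a DPP: it repeatedly adds the item that maximizes the determinant of the selected principal submatrix of L. This is not the lazy and fast method implemented in this repository. It recomputes each determinant from scratch and is meant only to show the objective being maximized. The names `sub_determinant` and `naive_greedy_map`, and the rule of stopping when no item yields a positive determinant, are assumptions made for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Determinant of the principal submatrix L[S, S], computed by Gaussian
// elimination with partial pivoting. The empty submatrix has determinant 1.
double sub_determinant(const Matrix& L, const std::vector<std::size_t>& S) {
    const std::size_t m = S.size();
    Matrix A(m, std::vector<double>(m, 0.0));
    for (std::size_t i = 0; i < m; ++i)
        for (std::size_t j = 0; j < m; ++j)
            A[i][j] = L[S[i]][S[j]];

    double det = 1.0;
    for (std::size_t c = 0; c < m; ++c) {
        std::size_t p = c;  // row holding the largest pivot candidate
        for (std::size_t r = c + 1; r < m; ++r)
            if (std::fabs(A[r][c]) > std::fabs(A[p][c])) p = r;
        if (A[p][c] == 0.0) return 0.0;  // singular submatrix
        if (p != c) {
            std::swap(A[p], A[c]);
            det = -det;  // a row swap flips the sign
        }
        det *= A[c][c];
        for (std::size_t r = c + 1; r < m; ++r) {
            const double f = A[r][c] / A[c][c];
            for (std::size_t j = c; j < m; ++j) A[r][j] -= f * A[c][j];
        }
    }
    return det;
}

// Naive greedy selection of at most k items (hypothetical helper).
// Each round adds the unused item i that maximizes det(L[S + i, S + i]);
// it stops early if no item gives a strictly positive determinant.
std::vector<std::size_t> naive_greedy_map(const Matrix& L, std::size_t k) {
    const std::size_t n = L.size();
    std::vector<std::size_t> S;
    std::vector<bool> used(n, false);
    while (S.size() < k) {
        double best = 0.0;
        std::size_t best_item = n;  // n means "no candidate found"
        for (std::size_t i = 0; i < n; ++i) {
            if (used[i]) continue;
            S.push_back(i);
            const double det = sub_determinant(L, S);
            S.pop_back();
            if (det > best) {
                best = det;
                best_item = i;
            }
        }
        if (best_item == n) break;
        S.push_back(best_item);
        used[best_item] = true;
    }
    return S;
}
```

Each round of this sketch evaluates up to n determinants of growing size, so it is far slower than incremental approaches. It is useful mainly as a reference for checking results on tiny inputs.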

DoubleGreedy

./build/double -d [dataset_name]
  • dataset_name: wishart, movie_lens, netflix

Experimental results are stored in result/ in CSV format.

License

The code is licensed under the MIT License.