Initial release of the Python interface.

@c-bata c-bata released this 06 May 06:29
· 48 commits to feature-dynalyst since this release
3184ba8

Installation

from source code

$ pip install numpy Cython==3.0a6
$ pip install "git+https://github.com/CyberAgent/libffm@py-v0.0.1"

from sdist (source package)

$ pip install numpy
$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.0.1/ffm-0.0.1.tar.gz

from wheel (binary package)

$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.0.1/ffm-0.0.1-cp38-cp38-manylinux2010_x86_64.whl
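The wheel above is built for one interpreter and platform only (cp38, manylinux2010_x86_64). As a rough sketch, the filename can be split into its tags and compared against the running interpreter before choosing between the wheel and the sdist. The helper names `parse_wheel_filename` and `python_tag_matches` are hypothetical, the parser assumes a filename without the optional build tag, and the check assumes CPython and looks at the Python tag only. Real compatibility rules are more involved, so treat this as an illustration rather than a replacement for pip's own resolution.

```python
import sys

WHEEL = "ffm-0.0.1-cp38-cp38-manylinux2010_x86_64.whl"


def parse_wheel_filename(filename):
    """Split a wheel filename into its five dash-separated parts.

    Assumes the common form name-version-python-abi-platform.whl
    (no optional build tag).
    """
    stem = filename[:-len(".whl")]
    name, version, python_tag, abi_tag, platform_tag = stem.split("-")
    return {
        "name": name,
        "version": version,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def python_tag_matches(tags, version_info=sys.version_info):
    """Return True if the wheel's Python tag matches a CPython major.minor."""
    expected = "cp{}{}".format(version_info[0], version_info[1])
    return tags["python"] == expected


tags = parse_wheel_filename(WHEEL)
if python_tag_matches(tags):
    print("The cp38 wheel matches this interpreter's version.")
else:
    print("Interpreter version differs; consider the sdist instead.")
```

If the tags do not match, installing from the sdist or from source as described above is the likely fallback.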

Usage

Python API

import ffm

def main():
    # Prepare the data: each row is a list of (field, index, value) tuples.
    X = [[(1, 2, 1), (2, 3, 1), (3, 5, 1)],
         [(1, 0, 1), (2, 3, 1), (3, 7, 1)],
         [(1, 1, 1), (2, 3, 1), (3, 7, 1), (3, 9, 1)], ]
    y = [1, 1, 0]
    train_data = ffm.Dataset(X, y)

    model = ffm.train(
        train_data,
        quiet=False
    )
    print("Best iteration:", model.best_iteration)

    # Dump FFM weights in ffm-train's "-m" option format.
    # The ./model directory must already exist; open() does not create it.
    with open("./model/prod-cvr.model", 'w') as f:
        model.dump_libffm_weights(f)

if __name__ == '__main__':
    main()
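Training data for ffm-train is usually stored as text, one instance per line in the form `<label> <field>:<index>:<value> ...`. The sketch below shows one way such lines could be turned into the nested (field, index, value) lists used in the example above. The helper names `parse_libffm_line` and `load_libffm_lines` are hypothetical and not part of the ffm package, and it is an assumption that ffm.Dataset accepts float values as produced here.

```python
def parse_libffm_line(line):
    """Parse one '<label> <field>:<index>:<value> ...' line.

    Returns (features, label), where features is a list of
    (field, index, value) tuples.
    """
    tokens = line.split()
    label = int(tokens[0])
    features = []
    for token in tokens[1:]:
        field, index, value = token.split(":")
        features.append((int(field), int(index), float(value)))
    return features, label


def load_libffm_lines(lines):
    """Convert an iterable of libffm-format lines into (X, y) lists."""
    X, y = [], []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        features, label = parse_libffm_line(line)
        X.append(features)
        y.append(label)
    return X, y


sample = [
    "1 1:2:1 2:3:1 3:5:1",
    "0 1:1:1 2:3:1 3:7:1 3:9:1",
]
X, y = load_libffm_lines(sample)
print(X[0], y)
```

The resulting `X` and `y` have the same shape as the lists passed to `ffm.Dataset(X, y)` in the example above. Passing an open file object instead of `sample` would work the same way, since the loader only iterates over lines.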

pyffm-train command

pyffm-train provides the same interface as ffm-train.

$ pyffm-train -h
usage: pyffm-train [-h] [-p P] [-W W] [-WV WV] [-f F] [-m M] [--json-meta JSON_META] [-l L] [-k K] [-t T] [-r R] [-s S]
                   [--no-norm] [--no-rand] [--auto-stop] [--auto-stop-threshold AUTO_STOP_THRESHOLD] [--quiet]
                   tr_path

LibFFM CLI

positional arguments:
  tr_path               File path to training set

optional arguments:
  -h, --help            show this help message and exit
  -p P                  Set path to the validation set
  -W W                  Set path of importance weights file for training set
  -WV WV                Set path of importance weights file for validation set
  -f F                  Set path for production model file
  -m M                  Set key prefix for production model
  --json-meta JSON_META
                        Generate a meta file if sets json file path
  -l L                  Set regularization parameter (lambda)
  -k K                  Set number of latent factors
  -t T                  Set number of iterations
  -r R                  Set learning rate (eta)
  -s S                  Set number of threads
  --no-norm             Disable instance-wise normalization
  --no-rand             Disable random update <training_set>.bin will be generated
  --auto-stop           Stop at the iteration that achieves the best validation loss (must be used with -p)
  --auto-stop-threshold AUTO_STOP_THRESHOLD
                        Set the threshold count for stop at the iteration that achieves the best validation loss (must be used
                        with --auto-stop)
  --quiet, -q           quiet

Example:

$ pyffm-train -p ./bigdata.te.txt -W ./bigdata.iw.txt -f ./model/dummy-2.model -m key --auto-stop --auto-stop-threshold 3 ./bigdata.tr.txt

The above command is equivalent to the following ffm-train command, which differs only in the output model path:

$ ffm-train -p ./bigdata.te.txt -W ./bigdata.iw.txt -f ./model/dummy-1.model -m key --auto-stop --auto-stop-threshold 3 ./bigdata.tr.txt
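When pyffm-train is driven from a script, the argument list can be assembled in Python and the constraints stated in the help text (`--auto-stop` requires `-p`, `--auto-stop-threshold` requires `--auto-stop`) checked before the process is launched. The following is a minimal sketch. The function `build_pyffm_train_args` and its parameter names are hypothetical, and it covers only a subset of the flags listed above.

```python
def build_pyffm_train_args(tr_path, valid_path=None, model_path=None,
                           auto_stop=False, auto_stop_threshold=None,
                           quiet=False):
    """Build an argv list for pyffm-train from keyword options."""
    # Constraints taken from the help text above.
    if auto_stop and valid_path is None:
        raise ValueError("--auto-stop must be used with -p")
    if auto_stop_threshold is not None and not auto_stop:
        raise ValueError("--auto-stop-threshold must be used with --auto-stop")

    args = ["pyffm-train"]
    if valid_path is not None:
        args += ["-p", valid_path]
    if model_path is not None:
        args += ["-f", model_path]
    if auto_stop:
        args.append("--auto-stop")
    if auto_stop_threshold is not None:
        args += ["--auto-stop-threshold", str(auto_stop_threshold)]
    if quiet:
        args.append("--quiet")
    args.append(tr_path)  # positional training-set path goes last
    return args


args = build_pyffm_train_args(
    "./bigdata.tr.txt",
    valid_path="./bigdata.te.txt",
    model_path="./model/dummy-2.model",
    auto_stop=True,
    auto_stop_threshold=3,
)
print(" ".join(args))
# To launch the command: subprocess.run(args, check=True)
```

Passing the list to `subprocess.run` rather than a single shell string avoids quoting problems with paths that contain spaces.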