Releases: CyberAgentAI/libffm

py-v0.4.1

14 Aug 05:06
c2fc26a

Changes

Minor update to build dependencies

Ensured that build dependencies are installed in the correct order.

Installation

from source code

$ pip install numpy Cython==3.0a6
$ pip install "git+https://github.com/CyberAgent/[email protected]"

from sdist (source package)

$ pip install numpy
$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.4.1/ffm-0.4.1.tar.gz

from bdist (M1 Mac)

Suppose you are using Python 3.10 on an M1 Mac with macOS Ventura (version 13.1).

$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.4.1/ffm-0.4.1-cp310-cp310-macosx_13_0_arm64.whl

from bdist (linux on aarch64)

Suppose you are using Python 3.10 on AWS Graviton.

$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.4.1/ffm-0.4.1-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl

py-v0.4.0

30 Jun 05:23
88ee3a3

Changes

Support loading datasets in C++

The usage is like this:

import ffm

# Paths to LIBFFM-format text files for training and validation
train_path = "train_path.txt"
valid_path = "valid_path.txt"
model = ffm.train(train_path=train_path, valid_path=valid_path)
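
The files referenced by train_path and valid_path are plain LIBFFM-style text files, one sample per line. As a minimal sketch (assuming the usual "label field:index:value ..." layout, which matches the (field, index, value) tuples used in the Python API examples further down this page), such a file can be written like this:

# Illustrative only: creates a tiny LIBFFM-format training file.
with open("train_path.txt", "w") as f:
    f.write("1 1:2:1 2:3:1 3:5:1\n")
    f.write("0 1:0:1 2:3:1 3:7:1\n")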

Installation

from source code

$ pip install numpy Cython==3.0a6
$ pip install "git+https://github.com/CyberAgent/[email protected]"

from sdist (source package)

$ pip install numpy
$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.4.0/ffm-0.4.0.tar.gz

from bdist (M1 Mac)

Suppose you are using python 3.10 on an M1 mac with macOS Ventura(version 13.1)

$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.4.0/ffm-0.4.0-cp310-cp310-macosx_13_0_arm64.whl

from bdist (linux on aarch64)

Suppose you are using python 3.10 on AWS Graviton

$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.4.0/ffm-0.4.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl

py-v0.3.1

19 Jan 09:00
719fa62

Changes

Support aarch64 for M1 Mac and AWS Graviton

Matrix computations now use Neon instructions on aarch64, which makes the library run faster on that architecture as well.
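
A quick way to confirm that the installed package imports and that Python reports an aarch64/arm64 machine (an illustrative check only):

$ python -c "import ffm, platform; print(platform.machine())"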

Installation

from source code

$ pip install numpy Cython==3.0a6
$ pip install "git+https://github.com/CyberAgent/[email protected]"

from sdist (source package)

$ pip install numpy
$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.3.1/ffm-0.3.1.tar.gz

from bdist (M1 Mac)

Suppose you are using Python 3.10 on an M1 Mac with macOS Ventura (version 13.1).

$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.3.1/ffm-0.3.1-cp310-cp310-macosx_13_0_arm64.whl

from bdist (linux on aarch64)

Suppose you are using Python 3.10 on AWS Graviton.

$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.3.1/ffm-0.3.1-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl

py-v0.3.0

21 Jul 05:27
74f1dca

Changes

Support a negative down-sampling rate argument

The ffm.train method now provides an nds_rate argument for negative down sampling.
The default nds_rate is 1.0.

Add a best_va_loss field

A best_va_loss field has been added to the trained model for hyperparameter tuning.

import ffm

def main():
    # Prepare the data. (field, index, value) format
    X_train = [[(1, 2, 1), (2, 3, 1), (3, 5, 1)],
               [(1, 0, 1), (2, 3, 1), (3, 7, 1)],
               [(1, 1, 1), (2, 3, 1), (3, 7, 1), (3, 9, 1)]]
    y_train = [1, 1, 0]
    X_valid = [[(1, 2, 1), (2, 3, 1), (3, 5, 1)],
               [(1, 0, 1), (2, 3, 1), (3, 7, 1)],
               [(1, 1, 1), (2, 3, 1), (3, 7, 1), (3, 9, 1)]]
    y_valid = [1, 0, 0]
    train_data = ffm.Dataset(X_train, y_train)
    valid_data = ffm.Dataset(X_valid, y_valid)
    model = ffm.train(train_data, valid_data, quiet=False, nds_rate=0.5)

    va_logloss = model.best_va_loss
    print("Best validation logloss:", va_logloss)
    with open("./model/prod-cvr.model", 'w') as f:
        model.dump_model(f)

if __name__ == '__main__':
    main()
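
Since best_va_loss is exposed for hyperparameter tuning, it can be compared across candidate settings. The helper below is an illustrative sketch built only on the API shown above; the candidate nds_rate values and the function name are arbitrary:

import ffm

# Hypothetical tuning helper: picks the nds_rate with the lowest validation logloss.
def tune_nds_rate(train_data, valid_data, candidates=(0.25, 0.5, 1.0)):
    best_rate, best_loss = None, float("inf")
    for rate in candidates:
        model = ffm.train(train_data, valid_data, quiet=True, nds_rate=rate)
        if model.best_va_loss < best_loss:
            best_rate, best_loss = rate, model.best_va_loss
    return best_rate, best_loss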

Installation

from source code

$ pip install numpy Cython==3.0a6
$ pip install "git+https://github.com/CyberAgent/[email protected]"

from sdist (source package)

$ pip install numpy
$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.3.0/ffm-0.3.0.tar.gz

py-v0.2.0

19 Jul 09:16

CHANGES

Support Model.predict() method.

The usage is like this:

import ffm

model = ffm.Model.read_ffm_model("./dummy.model")
X = [[(1, 2, 1), (2, 3, 1), (3, 5, 1)],
     [(1, 0, 1), (2, 3, 1), (3, 7, 1)],
     [(1, 1, 1), (2, 3, 1), (3, 7, 1), (3, 9, 1)], ]

for x in X:
    pred_y = model.predict(x)
    print(pred_y)

You can also make predictions from a LIBFFM-style dataset:

import ffm

model = ffm.Model.read_ffm_model("./dummy.model")
dataset = ffm.Dataset.read_ffm_data("./bigdata.te.txt")

for x in dataset.data:
    pred_y = model.predict(x)
    print(pred_y)
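
For example, predictions can be written to a file in roughly the same way the pyffm-predict command below does (a sketch using only the API shown above; the output path is arbitrary):

import ffm

model = ffm.Model.read_ffm_model("./dummy.model")
dataset = ffm.Dataset.read_ffm_data("./bigdata.te.txt")

# Write one prediction per line.
with open("./model/predicted.txt", "w") as f:
    for x in dataset.data:
        f.write(f"{model.predict(x)}\n")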

Add pyffm-predict command line interface.

$ pyffm-predict ./bigdata.te.txt ./model/dummy.model ./model/predicted.txt

Note that this is equivalent to the following command.

$ ./ffm-predict ./bigdata.te.txt ./model/dummy.model ./model/predicted.txt

Deprecate Model.dump_libffm_weights() method.

Model.dump_libffm_weights() is deprecated because it does not appear to be used at Dynalyst.

Installation

from source code

$ pip install numpy Cython==3.0a6
$ pip install "git+https://github.com/CyberAgent/[email protected]"

from sdist (source package)

$ pip install numpy
$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.2.0/ffm-0.2.0.tar.gz

py-v0.1.0

04 Jun 05:28

Changes

Add model.dump_model() method to dump model parameters.

The ffm.FFM class now provides a .dump_model() method that dumps model parameters in LIBFFM's format.

import ffm

def main():
    # Prepare the data. (field, index, value) format
    X = [[(1, 2, 1), (2, 3, 1), (3, 5, 1)],
         [(1, 0, 1), (2, 3, 1), (3, 7, 1)],
         [(1, 1, 1), (2, 3, 1), (3, 7, 1), (3, 9, 1)], ]
    y = [1, 1, 0]
    train_data = ffm.Dataset(X, y)
    model = ffm.train(train_data, quiet=False)

    with open("./model/prod-cvr.model", 'w') as f:
        model.dump_model(f)

if __name__ == '__main__':
    main()

Add a command-line option to pyffm-train to dump model parameters.

pyffm-train now accepts a second positional argument that specifies the path of the dumped model file.

$ pyffm-train -p ./bigdata.te.txt -W ./bigdata.iw.txt ./bigdata.tr.txt ./model/dummy.model
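
This mirrors ffm-train, which also takes the model file as its second positional argument, so the command above roughly corresponds to:

$ ffm-train -p ./bigdata.te.txt -W ./bigdata.iw.txt ./bigdata.tr.txt ./model/dummy.model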

Installation

from source code

$ pip install numpy Cython==3.0a6
$ pip install "git+https://github.com/CyberAgent/[email protected]"

from sdist (source package)

$ pip install numpy
$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.1.0/ffm-0.1.0.tar.gz

from wheel (binary package)

$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.1.0/ffm-0.1.0-cp38-cp38-manylinux2010_x86_64.whl

Initial release of the Python interface (py-v0.0.1).

06 May 06:29
3184ba8

Installation

from source code

$ pip install numpy Cython==3.0a6
$ pip install "git+https://github.com/CyberAgent/[email protected]"

from sdist (source package)

$ pip install numpy
$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.0.1/ffm-0.0.1.tar.gz

from wheel (binary package)

$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.0.1/ffm-0.0.1-cp38-cp38-manylinux2010_x86_64.whl

Usage

Python API

import ffm

def main():
    # Prepare the data. (field, index, value) format
    X = [[(1, 2, 1), (2, 3, 1), (3, 5, 1)],
         [(1, 0, 1), (2, 3, 1), (3, 7, 1)],
         [(1, 1, 1), (2, 3, 1), (3, 7, 1), (3, 9, 1)], ]
    y = [1, 1, 0]
    train_data = ffm.Dataset(X, y)

    model = ffm.train(
        train_data,
        quiet=False
    )
    print("Best iteration:", model.best_iteration)

    # Dump FFM weights in ffm-train's "-m" option format.
    with open("./model/prod-cvr.model", 'w') as f:
        model.dump_libffm_weights(f)

if __name__ == '__main__':
    main()

pyffm-train command

pyffm-train provides the same interface as ffm-train.

$ pyffm-train -h
usage: pyffm-train [-h] [-p P] [-W W] [-WV WV] [-f F] [-m M] [--json-meta JSON_META] [-l L] [-k K] [-t T] [-r R] [-s S]
                   [--no-norm] [--no-rand] [--auto-stop] [--auto-stop-threshold AUTO_STOP_THRESHOLD] [--quiet]
                   tr_path

LibFFM CLI

positional arguments:
  tr_path               File path to training set

optional arguments:
  -h, --help            show this help message and exit
  -p P                  Set path to the validation set
  -W W                  Set path of importance weights file for training set
  -WV WV                Set path of importance weights file for validation set
  -f F                  Set path for production model file
  -m M                  Set key prefix for production model
  --json-meta JSON_META
                        Generate a meta file if sets json file path
  -l L                  Set regularization parameter (lambda)
  -k K                  Set number of latent factors
  -t T                  Set number of iterations
  -r R                  Set learning rate (eta)
  -s S                  Set number of threads
  --no-norm             Disable instance-wise normalization
  --no-rand             Disable random update <training_set>.bin will be generated
  --auto-stop           Stop at the iteration that achieves the best validation loss (must be used with -p)
  --auto-stop-threshold AUTO_STOP_THRESHOLD
                        Set the threshold count for stop at the iteration that achieves the best validation loss (must be used
                        with --auto-stop)
  --quiet, -q           quiet

Example:

$ pyffm-train -p ./bigdata.te.txt -W ./bigdata.iw.txt -f ./model/dummy-2.model -m key --auto-stop --auto-stop-threshold 3 ./bigdata.tr.txt

The above command is the same as the following:

$ ffm-train -p ./bigdata.te.txt -W ./bigdata.iw.txt -f ./model/dummy-1.model -m key --auto-stop --auto-stop-threshold 3 ./bigdata.tr.txt