Initial release for Python interface.
Installation
from source code
$ pip install numpy Cython==3.0a6
$ pip install "git+https://github.com/CyberAgent/[email protected]"
from sdist (source package)
$ pip install numpy
$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.0.1/ffm-0.0.1.tar.gz
from wheel (binary package)
$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.0.1/ffm-0.0.1-cp38-cp38-manylinux2010_x86_64.whl
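After any of the three install methods, it can help to confirm that the module is importable before running a training job. The sketch below uses only the standard library. The helper name `is_installed` is made up for illustration; the module name `ffm` is taken from the usage example in this release.

```python
import importlib.util


def is_installed(module_name: str = "ffm") -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Prints True once one of the pip commands above has succeeded.
    print("ffm importable:", is_installed())
```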
Usage
Python API
import ffm


def main():
    # Prepare the data in (field, index, value) format.
    X = [[(1, 2, 1), (2, 3, 1), (3, 5, 1)],
         [(1, 0, 1), (2, 3, 1), (3, 7, 1)],
         [(1, 1, 1), (2, 3, 1), (3, 7, 1), (3, 9, 1)], ]
    y = [1, 1, 0]

    train_data = ffm.Dataset(X, y)
    model = ffm.train(
        train_data,
        quiet=False
    )
    print("Best iteration:", model.best_iteration)

    # Dump FFM weights in ffm-train's "-m" option format.
    with open("./model/prod-cvr.model", 'w') as f:
        model.dump_libffm_weights(f)


if __name__ == '__main__':
    main()
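Training data for ffm-train is usually stored as text, one instance per line in the form `<label> <field>:<index>:<value> ...`. The following sketch converts such lines into the `(field, index, value)` tuples used in the Python API example. The helpers `parse_libffm_line` and `load_libffm_lines` are hypothetical and not part of the `ffm` package. They assume the standard libffm text format with integer labels, fields, and indices, and they parse values as floats, which this release does not confirm as the type `ffm.Dataset` expects.

```python
from typing import Iterable, List, Tuple

Feature = Tuple[int, int, float]


def parse_libffm_line(line: str) -> Tuple[int, List[Feature]]:
    """Parse '<label> <field>:<index>:<value> ...' into (label, features)."""
    label, *tokens = line.split()
    features = []
    for token in tokens:
        field, index, value = token.split(":")
        features.append((int(field), int(index), float(value)))
    return int(label), features


def load_libffm_lines(lines: Iterable[str]) -> Tuple[List[List[Feature]], List[int]]:
    """Build X and y lists from an iterable of text lines, skipping blank lines."""
    X, y = [], []
    for line in lines:
        if not line.strip():
            continue
        label, features = parse_libffm_line(line)
        X.append(features)
        y.append(label)
    return X, y


if __name__ == "__main__":
    sample = ["1 1:2:1 2:3:1 3:5:1", "0 1:1:1 2:3:1 3:7:1 3:9:1"]
    X, y = load_libffm_lines(sample)
    print(X, y)
```

The resulting `X` and `y` have the same shape as the hand-written lists in the example, so they could be passed to `ffm.Dataset(X, y)` in the same way.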
pyffm-train command
pyffm-train provides the same interface as ffm-train.
$ pyffm-train -h
usage: pyffm-train [-h] [-p P] [-W W] [-WV WV] [-f F] [-m M] [--json-meta JSON_META] [-l L] [-k K] [-t T] [-r R] [-s S]
                   [--no-norm] [--no-rand] [--auto-stop] [--auto-stop-threshold AUTO_STOP_THRESHOLD] [--quiet]
                   tr_path

LibFFM CLI

positional arguments:
  tr_path               File path to training set

optional arguments:
  -h, --help            show this help message and exit
  -p P                  Set path to the validation set
  -W W                  Set path of importance weights file for training set
  -WV WV                Set path of importance weights file for validation set
  -f F                  Set path for production model file
  -m M                  Set key prefix for production model
  --json-meta JSON_META
                        Generate a meta file if sets json file path
  -l L                  Set regularization parameter (lambda)
  -k K                  Set number of latent factors
  -t T                  Set number of iterations
  -r R                  Set learning rate (eta)
  -s S                  Set number of threads
  --no-norm             Disable instance-wise normalization
  --no-rand             Disable random update <training_set>.bin will be generated
  --auto-stop           Stop at the iteration that achieves the best validation loss (must be used with -p)
  --auto-stop-threshold AUTO_STOP_THRESHOLD
                        Set the threshold count for stop at the iteration that achieves the best validation loss (must be used
                        with --auto-stop)
  --quiet, -q           quiet
Example:
$ pyffm-train -p ./bigdata.te.txt -W ./bigdata.iw.txt -f ./model/dummy-2.model -m key --auto-stop --auto-stop-threshold 3 ./bigdata.tr.txt
The above command is equivalent to the following:
$ ffm-train -p ./bigdata.te.txt -W ./bigdata.iw.txt -f ./model/dummy-1.model -m key --auto-stop --auto-stop-threshold 3 ./bigdata.tr.txt
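When training is launched from a Python script rather than a shell, the argument list can be assembled programmatically. The sketch below builds the same pyffm-train invocation as the example, using only flags that appear in the help output. The function `build_train_command` and its parameter names are hypothetical, not part of this package, and the check that `--auto-stop` needs `-p` simply mirrors the note in the help text.

```python
import shlex
from typing import List, Optional


def build_train_command(
    tr_path: str,
    valid_path: Optional[str] = None,
    weights_path: Optional[str] = None,
    model_path: Optional[str] = None,
    key_prefix: Optional[str] = None,
    auto_stop_threshold: Optional[int] = None,
    program: str = "pyffm-train",
) -> List[str]:
    """Assemble an argument list from a subset of the documented flags."""
    cmd = [program]
    if valid_path is not None:
        cmd += ["-p", valid_path]
    if weights_path is not None:
        cmd += ["-W", weights_path]
    if model_path is not None:
        cmd += ["-f", model_path]
    if key_prefix is not None:
        cmd += ["-m", key_prefix]
    if auto_stop_threshold is not None:
        # The help text says --auto-stop must be used with -p.
        if valid_path is None:
            raise ValueError("--auto-stop requires a validation set (-p)")
        cmd += ["--auto-stop", "--auto-stop-threshold", str(auto_stop_threshold)]
    cmd.append(tr_path)  # positional argument goes last
    return cmd


if __name__ == "__main__":
    cmd = build_train_command(
        "./bigdata.tr.txt",
        valid_path="./bigdata.te.txt",
        weights_path="./bigdata.iw.txt",
        model_path="./model/dummy-2.model",
        key_prefix="key",
        auto_stop_threshold=3,
    )
    # Print the command instead of running it, so the sketch works
    # even where pyffm-train is not installed.
    print(shlex.join(cmd))
```

The returned list can be handed to `subprocess.run(cmd, check=True)`. Passing `program="ffm-train"` produces the corresponding ffm-train invocation, since both commands share the same interface.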