Releases: CyberAgentAI/libffm
py-v0.4.1
Changes
Minor update to build dependencies
Ensured that build dependencies are installed in the correct order.
Installation
from source code
$ pip install numpy Cython==3.0a6
$ pip install "git+https://github.com/CyberAgent/[email protected]"
from sdist (source package)
$ pip install numpy
$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.4.1/ffm-0.4.1.tar.gz
from bdist (M1 Mac)
Suppose you are using Python 3.10 on an M1 Mac with macOS Ventura (version 13.1).
$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.4.1/ffm-0.4.1-cp310-cp310-macosx_13_0_arm64.whl
from bdist (linux on aarch64)
Suppose you are using Python 3.10 on AWS Graviton.
$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.4.1/ffm-0.4.1-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
py-v0.4.0
Changes
Support loading datasets in C++
`ffm.train` can now take dataset file paths directly, and the files are loaded on the C++ side. The usage is like this:

```python
train_path = "train_path.txt"
valid_path = "valid_path.txt"

ffm.train(train_path=train_path, valid_path=valid_path)
```
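The paths are expected to point at LIBFFM-format text files, where each line is `label field:index:value ...`. A minimal sketch of writing such a file before training (the writer below is illustrative, not part of the ffm API; only the format is assumed):

```python
# Write a tiny dataset in LIBFFM text format: "label field:index:value ..."
rows = [
    (1, [(1, 2, 1), (2, 3, 1), (3, 5, 1)]),
    (0, [(1, 0, 1), (2, 3, 1), (3, 7, 1)]),
]

with open("train_path.txt", "w") as f:
    for label, feats in rows:
        feats_str = " ".join(f"{fld}:{idx}:{val}" for fld, idx, val in feats)
        f.write(f"{label} {feats_str}\n")
```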
Installation
from source code
$ pip install numpy Cython==3.0a6
$ pip install "git+https://github.com/CyberAgent/[email protected]"
from sdist (source package)
$ pip install numpy
$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.4.0/ffm-0.4.0.tar.gz
from bdist (M1 Mac)
Suppose you are using Python 3.10 on an M1 Mac with macOS Ventura (version 13.1).
$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.4.0/ffm-0.4.0-cp310-cp310-macosx_13_0_arm64.whl
from bdist (linux on aarch64)
Suppose you are using Python 3.10 on AWS Graviton.
$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.4.0/ffm-0.4.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
py-v0.3.1
Changes
Support aarch64 for M1 Mac and AWS Graviton
Matrix computations now use Neon SIMD instructions on aarch64, so the library runs fast on that architecture as well.
Installation
from source code
$ pip install numpy Cython==3.0a6
$ pip install "git+https://github.com/CyberAgent/[email protected]"
from sdist (source package)
$ pip install numpy
$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.3.1/ffm-0.3.1.tar.gz
from bdist (M1 Mac)
Suppose you are using Python 3.10 on an M1 Mac with macOS Ventura (version 13.1).
$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.3.1/ffm-0.3.1-cp310-cp310-macosx_13_0_arm64.whl
from bdist (linux on aarch64)
Suppose you are using Python 3.10 on AWS Graviton.
$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.3.1/ffm-0.3.1-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
py-v0.3.0
Changes
Support negative down sampling rate via an argument
The `ffm.train` method now provides an `nds_rate` argument for negative down sampling. The default `nds_rate` is 1.0.
Add `best_va_loss` field
`best_va_loss` has been added to the model for hyperparameter tuning.
```python
import ffm


def main():
    # Prepare the data. (field, index, value) format
    X_train = [[(1, 2, 1), (2, 3, 1), (3, 5, 1)],
               [(1, 0, 1), (2, 3, 1), (3, 7, 1)],
               [(1, 1, 1), (2, 3, 1), (3, 7, 1), (3, 9, 1)], ]
    y_train = [1, 1, 0]
    X_valid = [[(1, 2, 1), (2, 3, 1), (3, 5, 1)],
               [(1, 0, 1), (2, 3, 1), (3, 7, 1)],
               [(1, 1, 1), (2, 3, 1), (3, 7, 1), (3, 9, 1)], ]
    y_valid = [1, 0, 0]
    train_data = ffm.Dataset(X_train, y_train)
    valid_data = ffm.Dataset(X_valid, y_valid)
    model = ffm.train(train_data, valid_data, quiet=False, nds_rate=0.5)
    va_logloss = model.best_va_loss
    with open("./model/prod-cvr.model", 'w') as f:
        model.dump_model(f)


if __name__ == '__main__':
    main()
```
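With `nds_rate=0.5`, roughly half of the negative examples are kept while every positive example is retained. A minimal sketch of the idea (an illustrative helper, not the library's internal implementation):

```python
import random


def negative_down_sample(X, y, nds_rate, seed=0):
    """Keep every positive example and roughly an nds_rate fraction of negatives.

    Illustrative only: the ffm library applies negative down sampling
    internally when nds_rate < 1.0; this helper is not part of its API.
    """
    rng = random.Random(seed)
    kept_X, kept_y = [], []
    for x, label in zip(X, y):
        if label == 1 or rng.random() < nds_rate:
            kept_X.append(x)
            kept_y.append(label)
    return kept_X, kept_y
```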
Installation
from source code
$ pip install numpy Cython==3.0a6
$ pip install "git+https://github.com/CyberAgent/[email protected]"
from sdist (source package)
$ pip install numpy
$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.3.0/ffm-0.3.0.tar.gz
py-v0.2.0
Changes
Support `Model.predict()` method
The usage is like this:
```python
import ffm

model = ffm.Model.read_ffm_model("./dummy.model")
X = [[(1, 2, 1), (2, 3, 1), (3, 5, 1)],
     [(1, 0, 1), (2, 3, 1), (3, 7, 1)],
     [(1, 1, 1), (2, 3, 1), (3, 7, 1), (3, 9, 1)], ]
for x in X:
    pred_y = model.predict(x)
    print(pred_y)
```
You can also predict on a LIBFFM-style dataset:
```python
import ffm

model = ffm.Model.read_ffm_model("./dummy.model")
dataset = ffm.Dataset.read_ffm_data("./bigdata.te.txt")
for x in dataset.data:
    pred_y = model.predict(x)
    print(pred_y)
```
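In the LIBFFM text format each line is `label field:index:value ...`; a minimal parser sketch (a hypothetical helper, not part of the ffm API) that produces the (field, index, value) tuples `Model.predict` expects:

```python
def parse_libffm_line(line):
    """Parse one LIBFFM-format line into (label, [(field, index, value), ...])."""
    tokens = line.split()
    label = int(tokens[0])
    feats = []
    for tok in tokens[1:]:
        field, index, value = tok.split(":")
        feats.append((int(field), int(index), float(value)))
    return label, feats
```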
Add `pyffm-predict` command-line interface
$ pyffm-predict ./bigdata.te.txt ./model/dummy.model ./model/predicted.txt
Note that this is equivalent to the following command:
$ ./ffm-predict ./bigdata.te.txt ./model/dummy.model ./model/predicted.txt
Deprecate `Model.dump_libffm_weights()` method
`Model.dump_libffm_weights()` is deprecated because it does not appear to be used at Dynalyst.
Installation
from source code
$ pip install numpy Cython==3.0a6
$ pip install "git+https://github.com/CyberAgent/[email protected]"
from sdist (source package)
$ pip install numpy
$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.2.0/ffm-0.2.0.tar.gz
py-v0.1.0
Changes
Add `model.dump_model()` method to dump model parameters
The `ffm.FFM` class now provides a `.dump_model()` method that dumps model parameters in LIBFFM's style.
```python
import ffm


def main():
    # Prepare the data. (field, index, value) format
    X = [[(1, 2, 1), (2, 3, 1), (3, 5, 1)],
         [(1, 0, 1), (2, 3, 1), (3, 7, 1)],
         [(1, 1, 1), (2, 3, 1), (3, 7, 1), (3, 9, 1)], ]
    y = [1, 1, 0]
    train_data = ffm.Dataset(X, y)
    model = ffm.train(train_data, quiet=False)
    with open("./model/prod-cvr.model", 'w') as f:
        model.dump_model(f)


if __name__ == '__main__':
    main()
```
Add a pyffm-train command-line option to dump model parameters
A second positional argument (the model output path) can now be passed to `pyffm-train`:
$ pyffm-train -p ./bigdata.te.txt -W ./bigdata.iw.txt ./bigdata.tr.txt ./model/dummy.model
Installation
from source code
$ pip install numpy Cython==3.0a6
$ pip install "git+https://github.com/CyberAgent/[email protected]"
from sdist (source package)
$ pip install numpy
$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.1.0/ffm-0.1.0.tar.gz
from wheel (binary package)
$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.1.0/ffm-0.1.0-cp38-cp38-manylinux2010_x86_64.whl
py-v0.0.1
Initial release of the Python interface.
Installation
from source code
$ pip install numpy Cython==3.0a6
$ pip install "git+https://github.com/CyberAgent/[email protected]"
from sdist (source package)
$ pip install numpy
$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.0.1/ffm-0.0.1.tar.gz
from wheel (binary package)
$ pip install https://github.com/CyberAgent/libffm/releases/download/py-v0.0.1/ffm-0.0.1-cp38-cp38-manylinux2010_x86_64.whl
Usage
Python API
```python
import ffm


def main():
    # Prepare the data. (field, index, value) format
    X = [[(1, 2, 1), (2, 3, 1), (3, 5, 1)],
         [(1, 0, 1), (2, 3, 1), (3, 7, 1)],
         [(1, 1, 1), (2, 3, 1), (3, 7, 1), (3, 9, 1)], ]
    y = [1, 1, 0]
    train_data = ffm.Dataset(X, y)
    model = ffm.train(
        train_data,
        quiet=False
    )
    print("Best iteration:", model.best_iteration)

    # Dump FFM weights in ffm-train's "-m" option format.
    with open("./model/prod-cvr.model", 'w') as f:
        model.dump_libffm_weights(f)


if __name__ == '__main__':
    main()
```
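Building the (field, index, value) rows by hand gets tedious for real feature data. A hypothetical helper (not part of the ffm API) that assigns integer field and feature ids on first sight, assuming one categorical feature per named field:

```python
def encode_rows(rows):
    """Convert [{field_name: feature_name, ...}, ...] into
    [[(field, index, value), ...], ...], assigning integer ids in
    order of first appearance. Value is fixed at 1 (one-hot features).
    """
    field_ids, index_ids = {}, {}
    encoded = []
    for row in rows:
        feats = []
        for field_name, feature_name in row.items():
            field = field_ids.setdefault(field_name, len(field_ids))
            index = index_ids.setdefault(feature_name, len(index_ids))
            feats.append((field, index, 1))
        encoded.append(feats)
    return encoded
```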
`pyffm-train` command
`pyffm-train` provides the same interface as `ffm-train`.
```
$ pyffm-train -h
usage: pyffm-train [-h] [-p P] [-W W] [-WV WV] [-f F] [-m M] [--json-meta JSON_META] [-l L] [-k K] [-t T] [-r R] [-s S]
                   [--no-norm] [--no-rand] [--auto-stop] [--auto-stop-threshold AUTO_STOP_THRESHOLD] [--quiet]
                   tr_path

LibFFM CLI

positional arguments:
  tr_path               File path to training set

optional arguments:
  -h, --help            show this help message and exit
  -p P                  Set path to the validation set
  -W W                  Set path of importance weights file for training set
  -WV WV                Set path of importance weights file for validation set
  -f F                  Set path for production model file
  -m M                  Set key prefix for production model
  --json-meta JSON_META
                        Generate a meta file if sets json file path
  -l L                  Set regularization parameter (lambda)
  -k K                  Set number of latent factors
  -t T                  Set number of iterations
  -r R                  Set learning rate (eta)
  -s S                  Set number of threads
  --no-norm             Disable instance-wise normalization
  --no-rand             Disable random update <training_set>.bin will be generated
  --auto-stop           Stop at the iteration that achieves the best validation loss (must be used with -p)
  --auto-stop-threshold AUTO_STOP_THRESHOLD
                        Set the threshold count for stop at the iteration that achieves the best validation loss (must be used
                        with --auto-stop)
  --quiet, -q           quiet
```
Example:
$ pyffm-train -p ./bigdata.te.txt -W ./bigdata.iw.txt -f ./model/dummy-2.model -m key --auto-stop --auto-stop-threshold 3 ./bigdata.tr.txt
The above command is the same as the following:
$ ffm-train -p ./bigdata.te.txt -W ./bigdata.iw.txt -f ./model/dummy-1.model -m key --auto-stop --auto-stop-threshold 3 ./bigdata.tr.txt