Python can't find executables #202
Comments
Hi @casperkaae ! Let's start from the beginning: please try to install through |
Thanks for the quick response! The log is below.
|
OK, it uses a cached wheel... But the version is fresh. Then please do a couple of things:
And does the warning appear only for FastRGF or for RGF too? |
Hi again
I have only seen the warning for the FastRGF executables. If I run Thanks a lot for the help |
Hmm... Please try to grant executable permission for
Also, you've said
Replace BTW,
only |
Hi again. I've tried re-granting executable permissions as well as using my own compiled versions of the executables, but no luck yet. Could something in the Python script that detects the files be failing? |
It's a pity. The Python package tries to find FastRGF as follows: rgf/python-package/rgf/utils.py Lines 188 to 196 in 0dd8273
I suppose the package finds the executables correctly but they are broken. To be more precise, the rgf/python-package/rgf/utils.py Line 137 in 0dd8273
function returns False.
To check this, you can call this function with It often happens when the gcc version is less than 5, but that's not your case. Maybe something is wrong with the OpenMP library?.. |
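Since the check only reports True or False, it can help to run the executable by hand and read what it prints. Below is a minimal sketch of that idea. The helper name `run_and_capture` is hypothetical and not part of rgf_python, and the command in the demo is a stand-in that should be replaced with the real `forest_train` invocation.

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run cmd and return (ok, output) instead of hiding the error.

    Hypothetical debugging helper; not part of rgf_python.
    """
    try:
        out = subprocess.check_output(cmd, stderr=subprocess.STDOUT)
        return True, out.decode("utf-8", errors="replace")
    except subprocess.CalledProcessError as e:
        # Non-zero exit status: keep whatever the program printed.
        text = e.output.decode("utf-8", errors="replace")
        return False, "exit status {0}\n{1}".format(e.returncode, text)
    except OSError as e:
        # Executable missing or not runnable (e.g. no execute permission).
        return False, str(e)


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; replace it with the
    # path to forest_train followed by its parameters.
    ok, output = run_and_capture([sys.executable, "-c", "print('hello')"])
    print(ok, output)
```

A missing file or missing permission shows up as an `OSError`, while a crash inside the binary shows up as a non-zero exit status, so the two cases can be told apart.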
I've tested the
the subprocess call fails with
.... so something is not working with the executables. By the way, I get the same error if I reinstall the package from scratch through pip, i.e. not using my own compiled versions of the binaries. |
Thank you very much! Let's exclude or confirm that your problem is actually this issue #92. How many processor cores do you have? Please run rgf/python-package/rgf/utils.py Lines 142 to 143 in 0dd8273
4-5 times will be enough. |
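If the failure is intermittent, as a threading problem might be, a single run proves little. Here is a minimal sketch for repeating a command and counting how often it fails. `count_failures` is a hypothetical helper, and the command to pass in is whichever FastRGF invocation is being tested.

```python
import subprocess
import sys


def count_failures(cmd, runs=5):
    """Run cmd `runs` times and return how many runs ended in an error."""
    failures = 0
    for _ in range(runs):
        try:
            subprocess.check_output(cmd, stderr=subprocess.STDOUT)
        except (subprocess.CalledProcessError, OSError):
            # Non-zero exit status, or the executable could not be started.
            failures += 1
    return failures


if __name__ == "__main__":
    # Stand-in command; replace with the forest_train command line.
    print(count_failures([sys.executable, "-c", "pass"], runs=5))
```

A result strictly between 0 and `runs` would suggest the error is not fully determined by the input data.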
Hi @casperkaae ! Have you had a chance to check different #data? |
I ran into this problem while trying to get the RGF R package working on some x86_64 workstations running Ubuntu Linux 16.04 and python3.6. I found that python reported errors trying to "import rgf": dave@alicia:~$ python3
Of my six workstations, the error occurred on the three that had 16, 24, and 24 cores, but not on the machines with 2, 4, and 10 cores. Applying the following mod to /usr/local/lib/python3.6/site-packages/rgf/utils.py "fixed" "import rgf" by limiting the number of threads:
Of course this does not explain why the larger number of cores induces the failure. Sorry about the formatting of my post; I'm not sure how to fix it. To clarify, the line I added to rgf/utils.py was: params_train.append("set.nthreads=4") |
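For readers following along, this is roughly what the modification amounts to: one extra `set.nthreads=...` parameter on the `forest_train` command line. It is a sketch only. The parameter strings are the ones that appear in this thread, while the function name `build_train_cmd` and the example paths are made up for illustration.

```python
import os


def build_train_cmd(fastrgf_dir, x_file, y_file, model_file, nthreads=None):
    """Build a forest_train command line like the one used by the check.

    `nthreads` is optional; when given, it is passed as set.nthreads.
    """
    cmd = [
        os.path.join(fastrgf_dir, "forest_train"),
        "forest.ntrees=10",
        "tst.target=BINARY",
        "trn.x-file=%s" % x_file,
        "trn.y-file=%s" % y_file,
        "model.save=%s" % model_file,
    ]
    if nthreads is not None:
        # Limit the number of threads FastRGF uses.
        cmd.append("set.nthreads=%d" % nthreads)
    return cmd


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(build_train_cmd("/opt/fastrgf", "x.txt", "y.txt", "m.model", nthreads=4))
```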
@dslate Thank you very much for your information! It may be treated as further evidence that the FastRGF error is caused by a hidden multithreading issue (race condition?): #92.
Did you want to say We'd appreciate any help from you. Do you have some time and a wish to dig deeper into this issue? Do you still have access to the machines where the error happens? Before your post I could reproduce the error only on Windows, but your workstations run Linux, which is interesting. PS. I fixed the markdown in your post: just added the code block around the diff. |
@StrikerRUS Thanks for your response. Actually I did mean 16, 24 and 24; two of my machines have 24 cores each. After installing my mod to rgf/utils.py I have been trying to get FastRGF_Regressor working for a current project in R. I am already using lightgbm, xgboost, and catboost with the same data, and recently got the command line version of rgf (version 1.3) running ok with it. This is my first attempt to use FastRGF_Regressor, and I am running into additional difficulties, such as predictions that are basically random numbers that have almost no correlation with the target, which is a 0-1 binary (by the way, should I be treating this as a regression or classification problem; the goal is to maximize AUC score?). I don't know yet whether this problem is due to bugs in my code or in FastRGF. This project has a deadline, so I need to either get FastRGF working pretty soon so I can add it to the algorithms whose predictions I ensemble, or just give up and go with the algorithms I'm already using. I really don't have much time to try to debug this, but if you come up with some potential fixes I may be able to help test them in my environment. I couldn't tell from the posts about this issue whether the errors occur intermittently with the same data or are data dependent. If the latter, perhaps the problem is not a "race condition" but some other bug(s) in the multi-threading algorithm. |
@dslate I agree that there is a possibility that it is not a data race. |
@dslate AUC is a classification metric. For
please take a look at this #199. Unfortunately, the FastRGF multithreading issue is rather old and I'm not sure we'll fix it soon. So, for your project you'd better use RGF, since FastRGF is unstable in your environment. The R wrapper (#208) will be integrated into this repo very soon (just after @fukatani's review). |
I am on a 12-core machine (7920X) and getting the same error. This change solved my problem and it is good to go now... |
Thanks a lot @ryancheunggit ! Your information is very helpful! |
Sure thing. I have done some further testing: in my case, the minimum number I have to replace 14 with is 23, which is the number of threads - 1. Maybe the test function in utils.py could use something like |
@ryancheunggit Thanks for your investigation! I'm not sure the dependency is that deterministic. 14 came from my 8-threaded CPU... |
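To make experiments like the one above easier, the repetition count (14 in the check) can be turned into a parameter. Here is a minimal sketch in plain Python that produces the same rows as the `np.tile` calls in the check. `make_check_data` is a hypothetical name, and, as noted in the reply above, how the required count relates to the number of threads is not established.

```python
def make_check_data(reps):
    """Return (X, y) with 2 * reps rows, mirroring the np.tile calls.

    X repeats the rows [1, 0, 1, 0] and [0, 1, 0, 1]; y repeats 1, -1.
    """
    base_rows = [[1, 0, 1, 0], [0, 1, 0, 1]]
    # Copy each row so the returned rows are independent lists.
    X = [list(row) for _ in range(reps) for row in base_rows]
    y = [label for _ in range(reps) for label in (1, -1)]
    return X, y


if __name__ == "__main__":
    X, y = make_check_data(14)
    print(len(X), len(y))
```

Trying a few values of `reps` on an affected machine would show whether the threshold really follows the thread count.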
Environment Info
rgf_python version: '3.6.0'
Python version (for rgf_python errors): 3.6
Error Message
Hi, |
Thank you for your report! To obtain information necessary for debugging, could you try as follows?
from __future__ import absolute_import

import atexit
import codecs
import glob
import numbers
import os
import platform
import stat
import subprocess
import warnings
from tempfile import gettempdir
from threading import Lock
from uuid import uuid4

import numpy as np
import scipy.sparse as sp
from sklearn.base import BaseEstimator
from sklearn.exceptions import NotFittedError
from sklearn.externals import six
from sklearn.utils.extmath import softmax
from sklearn.utils.multiclass import check_classification_targets
from sklearn.utils.validation import check_array, check_consistent_length, check_X_y, column_or_1d

CURRENT_DIR = os.path.abspath(os.path.dirname(__file__))
FLOATS = (float, np.float, np.float16, np.float32, np.float64, np.double)
INTS = (numbers.Integral, np.integer)
NOT_FITTED_ERROR_DESC = "Estimator not fitted, call `fit` before exploiting the model."
NOT_IMPLEMENTED_ERROR_DESC = "This method isn't implemented in base class."
SYSTEM = platform.system()


def get_paths():
    config = six.moves.configparser.RawConfigParser()
    path = os.path.join(os.path.expanduser('~'), '.rgfrc')
    try:
        with codecs.open(path, 'r', 'utf-8') as cfg:
            with six.StringIO(cfg.read()) as strIO:
                config.readfp(strIO)
    except six.moves.configparser.MissingSectionHeaderError:
        with codecs.open(path, 'r', 'utf-8') as cfg:
            with six.StringIO('[glob]\n' + cfg.read()) as strIO:
                config.readfp(strIO)
    except Exception:
        pass
    if SYSTEM in ('Windows', 'Microsoft'):
        try:
            rgf_exe = os.path.abspath(config.get(config.sections()[0], 'exe_location'))
        except Exception:
            rgf_exe = os.path.join(os.path.expanduser('~'), 'rgf.exe')
        def_rgf = 'rgf.exe'
    else:  # Linux, Darwin (macOS), etc.
        try:
            rgf_exe = os.path.abspath(config.get(config.sections()[0], 'exe_location'))
        except Exception:
            rgf_exe = os.path.join(os.path.expanduser('~'), 'rgf')
        def_rgf = 'rgf'
    try:
        fastrgf_path = os.path.abspath(config.get(config.sections()[0], 'fastrgf_location'))
    except Exception:
        fastrgf_path = os.path.expanduser('~')
    def_fastrgf = ''
    try:
        temp = os.path.abspath(config.get(config.sections()[0], 'temp_location'))
    except Exception:
        temp = os.path.join(gettempdir(), 'rgf')
    return def_rgf, rgf_exe, def_fastrgf, fastrgf_path, temp


DEFAULT_RGF_PATH, RGF_PATH, DEFAULT_FASTRGF_PATH, FASTRGF_PATH, TEMP_PATH = get_paths()
if not os.path.isdir(TEMP_PATH):
    os.makedirs(TEMP_PATH)
if not os.access(TEMP_PATH, os.W_OK):
    raise Exception("{0} is not writable directory. Please set "
                    "config flag 'temp_location' to writable directory".format(TEMP_PATH))
UUIDS = []


def is_fastrgf_executable(path):
    print("Seaching " + path)
    temp_x_loc = os.path.join(TEMP_PATH, 'temp_fastrgf.train.data.x')
    temp_y_loc = os.path.join(TEMP_PATH, 'temp_fastrgf.train.data.y')
    temp_model_loc = os.path.join(TEMP_PATH, "temp_fastrgf.model")
    temp_pred_loc = os.path.join(TEMP_PATH, "temp_fastrgf.predictions.txt")
    X = np.tile(np.array([[1, 0, 1, 0], [0, 1, 0, 1]]), (14, 1))
    y = np.tile(np.array([1, -1]), 14)
    np.savetxt(temp_x_loc, X, delimiter=' ', fmt="%s")
    np.savetxt(temp_y_loc, y, delimiter=' ', fmt="%s")
    UUIDS.append('temp_fastrgf')
    path_train = os.path.join(path, "forest_train")
    params_train = []
    params_train.append("forest.ntrees=%s" % 10)
    params_train.append("tst.target=%s" % "BINARY")
    params_train.append("trn.x-file=%s" % temp_x_loc)
    params_train.append("trn.y-file=%s" % temp_y_loc)
    params_train.append("model.save=%s" % temp_model_loc)
    cmd_train = [path_train]
    cmd_train.extend(params_train)
    path_pred = os.path.join(path, "forest_predict")
    params_pred = []
    params_pred.append("model.load=%s" % temp_model_loc)
    params_pred.append("tst.x-file=%s" % temp_x_loc)
    params_pred.append("tst.output-prediction=%s" % temp_pred_loc)
    cmd_pred = [path_pred]
    cmd_pred.extend(params_pred)
    try:
        os.chmod(path_train, os.stat(path_train).st_mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
        os.chmod(path_pred, os.stat(path_pred).st_mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    except Exception:
        pass
    try:
        subprocess.check_output(cmd_train, stderr=subprocess.STDOUT)
        subprocess.check_output(cmd_pred, stderr=subprocess.STDOUT)
    except Exception as e:
        print(e)
        return False
    print("FastRGF found on " + path)
    return True


FASTRGF_AVAILABLE = True
if is_fastrgf_executable(CURRENT_DIR):
    FASTRGF_PATH = CURRENT_DIR
elif is_fastrgf_executable(DEFAULT_FASTRGF_PATH):
    FASTRGF_PATH = DEFAULT_FASTRGF_PATH
elif is_fastrgf_executable(FASTRGF_PATH):
    pass
else:
    FASTRGF_AVAILABLE = False
    warnings.warn("Cannot find FastRGF executable files. FastRGF estimators will be unavailable for usage.")
|
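The `get_paths` function in the debug script reads `~/.rgfrc` and, if the file has no section header, retries with `[glob]` prepended. Below is a minimal sketch of that fallback using Python 3's standard `configparser` instead of `six`. `parse_rgfrc` is a hypothetical helper, and the paths in the demo are made-up examples.

```python
import configparser
import io


def parse_rgfrc(text):
    """Parse .rgfrc content, tolerating a missing section header.

    Returns the options of the first section as a dict.
    """
    config = configparser.RawConfigParser()
    try:
        config.read_file(io.StringIO(text))
    except configparser.MissingSectionHeaderError:
        # Same fallback as in the script: pretend the file starts with [glob].
        config = configparser.RawConfigParser()
        config.read_file(io.StringIO("[glob]\n" + text))
    sections = config.sections()
    return dict(config.items(sections[0])) if sections else {}


if __name__ == "__main__":
    # Hypothetical locations, for illustration only.
    print(parse_rgfrc("fastrgf_location=/opt/fastrgf\ntemp_location=/tmp/rgf\n"))
```

The option names `exe_location`, `fastrgf_location`, and `temp_location` are the ones the script looks up. Anything it fails to read falls back to a default in the home or temp directory.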
@StrikerRUS |
@fukatani |
Additional Info: |
@casperkaae Could you try another debug script here?
from __future__ import absolute_import

import atexit
import codecs
import glob
import numbers
import os
import platform
import stat
import subprocess
import warnings
from tempfile import gettempdir
from threading import Lock
from uuid import uuid4

import numpy as np
import scipy.sparse as sp
from sklearn.base import BaseEstimator
from sklearn.exceptions import NotFittedError
from sklearn.externals import six
from sklearn.utils.extmath import softmax
from sklearn.utils.multiclass import check_classification_targets
from sklearn.utils.validation import check_array, check_consistent_length, check_X_y, column_or_1d

CURRENT_DIR = os.path.abspath(os.path.dirname(__file__))
FLOATS = (float, np.float, np.float16, np.float32, np.float64, np.double)
INTS = (numbers.Integral, np.integer)
NOT_FITTED_ERROR_DESC = "Estimator not fitted, call `fit` before exploiting the model."
NOT_IMPLEMENTED_ERROR_DESC = "This method isn't implemented in base class."
SYSTEM = platform.system()


def get_paths():
    config = six.moves.configparser.RawConfigParser()
    path = os.path.join(os.path.expanduser('~'), '.rgfrc')
    try:
        with codecs.open(path, 'r', 'utf-8') as cfg:
            with six.StringIO(cfg.read()) as strIO:
                config.readfp(strIO)
    except six.moves.configparser.MissingSectionHeaderError:
        with codecs.open(path, 'r', 'utf-8') as cfg:
            with six.StringIO('[glob]\n' + cfg.read()) as strIO:
                config.readfp(strIO)
    except Exception:
        pass
    if SYSTEM in ('Windows', 'Microsoft'):
        try:
            rgf_exe = os.path.abspath(config.get(config.sections()[0], 'exe_location'))
        except Exception:
            rgf_exe = os.path.join(os.path.expanduser('~'), 'rgf.exe')
        def_rgf = 'rgf.exe'
    else:  # Linux, Darwin (macOS), etc.
        try:
            rgf_exe = os.path.abspath(config.get(config.sections()[0], 'exe_location'))
        except Exception:
            rgf_exe = os.path.join(os.path.expanduser('~'), 'rgf')
        def_rgf = 'rgf'
    try:
        fastrgf_path = os.path.abspath(config.get(config.sections()[0], 'fastrgf_location'))
    except Exception:
        fastrgf_path = os.path.expanduser('~')
    def_fastrgf = ''
    try:
        temp = os.path.abspath(config.get(config.sections()[0], 'temp_location'))
    except Exception:
        temp = os.path.join(gettempdir(), 'rgf')
    return def_rgf, rgf_exe, def_fastrgf, fastrgf_path, temp


DEFAULT_RGF_PATH, RGF_PATH, DEFAULT_FASTRGF_PATH, FASTRGF_PATH, TEMP_PATH = get_paths()
if not os.path.isdir(TEMP_PATH):
    os.makedirs(TEMP_PATH)
if not os.access(TEMP_PATH, os.W_OK):
    raise Exception("{0} is not writable directory. Please set "
                    "config flag 'temp_location' to writable directory".format(TEMP_PATH))
UUIDS = []


def is_fastrgf_executable(path):
    print("Seaching " + path)
    temp_x_loc = os.path.join(TEMP_PATH, 'temp_fastrgf.train.data.x')
    temp_y_loc = os.path.join(TEMP_PATH, 'temp_fastrgf.train.data.y')
    temp_model_loc = os.path.join(TEMP_PATH, "temp_fastrgf.model")
    temp_pred_loc = os.path.join(TEMP_PATH, "temp_fastrgf.predictions.txt")
    X = np.tile(np.array([[1, 0, 1, 0], [0, 1, 0, 1]]), (14, 1))
    y = np.tile(np.array([1, -1]), 14)
    np.savetxt(temp_x_loc, X, delimiter=' ', fmt="%s")
    np.savetxt(temp_y_loc, y, delimiter=' ', fmt="%s")
    UUIDS.append('temp_fastrgf')
    path_train = os.path.join(path, "forest_train")
    params_train = []
    params_train.append("forest.ntrees=%s" % 10)
    params_train.append("set.nthreads=1")
    params_train.append("tst.target=%s" % "BINARY")
    params_train.append("trn.x-file=%s" % temp_x_loc)
    params_train.append("trn.y-file=%s" % temp_y_loc)
    params_train.append("model.save=%s" % temp_model_loc)
    cmd_train = [path_train]
    cmd_train.extend(params_train)
    path_pred = os.path.join(path, "forest_predict")
    params_pred = []
    params_pred.append("model.load=%s" % temp_model_loc)
    params_pred.append("tst.x-file=%s" % temp_x_loc)
    params_pred.append("set.nthreads=1")
    params_pred.append("tst.output-prediction=%s" % temp_pred_loc)
    cmd_pred = [path_pred]
    cmd_pred.extend(params_pred)
    try:
        os.chmod(path_train, os.stat(path_train).st_mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
        os.chmod(path_pred, os.stat(path_pred).st_mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    except Exception:
        pass
    try:
        subprocess.check_output(cmd_train, stderr=subprocess.STDOUT)
        subprocess.check_output(cmd_pred, stderr=subprocess.STDOUT)
    except Exception as e:
        print(e)
        return False
    print("FastRGF found on " + path)
    return True


FASTRGF_AVAILABLE = True
if is_fastrgf_executable(CURRENT_DIR):
    FASTRGF_PATH = CURRENT_DIR
elif is_fastrgf_executable(DEFAULT_FASTRGF_PATH):
    FASTRGF_PATH = DEFAULT_FASTRGF_PATH
elif is_fastrgf_executable(FASTRGF_PATH):
    pass
else:
    FASTRGF_AVAILABLE = False
    warnings.warn("Cannot find FastRGF executable files. FastRGF estimators will be unavailable for usage.")
|
|
@fukatani |
@albertnanda Current FastRGF has trouble handling small data with many threads. I opened #301, which is a workaround for this issue.
git clone https://github.com/RGF-team/rgf.git
cd rgf
git checkout reduce-checking-nthreads
cd python-package
pip install -e .
But I don't know whether this is an appropriate process for Anaconda. We recommend trying a small nthread (I mean |
According to this info from @albertnanda , it seems to be useless.
I think that we need to fix #92 and this issue will disappear. |
Hi there
I'm trying to install rgf/fastrgf and use the Python wrapper to launch the executables.
I've installed using pip install rgf_python
However, when I import the rgf module I get a user warning.
To fix this issue I've compiled the rgf and fastrgf binaries* and added them to my $PATH variable (confirmed in bash that they are in the PATH); however, I still get the same error. I've looked a bit into the rgf/utils get_paths and is_fastrgf_executable functions, but I'm not completely sure why it fails.
*binaries: I was not sure which binaries are needed, so I've added the following: rgf, forest_predict, forest_train, discretized_trainer, discretized_gendata, auc
System
Python: conda 3.6.1
OS: ubuntu 16.04
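One thing worth ruling out in a situation like this is that the PATH seen by the Python process differs from the one in the interactive shell (for example when Python is started from an IDE or a notebook server). Here is a minimal sketch for checking from inside Python, using only the standard library. `missing_from_path` is a hypothetical helper name.

```python
import shutil


def missing_from_path(names):
    """Return the names that shutil.which cannot resolve on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # The two FastRGF executables the check in rgf/utils.py runs.
    print(missing_from_path(["forest_train", "forest_predict"]))
```

An empty list means both executables are visible to Python. The problem then lies in running them rather than in finding them.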