Feature Request: add timeout parameter to the .fit() method #6596
Thanks for using LightGBM and taking the time to open this.

I'm -1 on adding this to LightGBM. I understand why this might be useful, but I don't think LightGBM is the right place for this logic. It would introduce some non-trivial maintenance burden and complexity, and would be better handled outside of LightGBM, in the code you use to invoke it.

Since you mentioned the `.fit()` method... alternatively, you could use a callback, like this:

```python
import lightgbm as lgb
from datetime import datetime
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=10_000, n_features=20)
dtrain = lgb.Dataset(X, label=y)


class TimeoutCallback:

    def __init__(self, timeout_seconds: int):
        # tell LightGBM to run this callback after (not before) each boosting round
        self.before_iteration = False
        self.timeout_seconds = timeout_seconds
        self._start = datetime.utcnow()

    def __call__(self, *args, **kwargs) -> None:
        if (datetime.utcnow() - self._start).total_seconds() > self.timeout_seconds:
            raise RuntimeError(
                f"timing out: elapsed time has exceeded {self.timeout_seconds} seconds"
            )


bst = lgb.train(
    params={
        "objective": "regression",
        "num_leaves": 100,
    },
    train_set=dtrain,
    num_boost_round=1000,
    callbacks=[TimeoutCallback(2)],
)
```

I just tested that with LightGBM 4.5.0 and saw the following:
That's not perfect, as it only runs after each iteration, and individual iterations could run for much longer on a realistic dataset. But hopefully that imperfection also shows one example of how complex this would be to implement in LightGBM. I'm only one vote here, though; maybe other maintainers will have a different perspective.
I did not think of this approach! If I'm using early stopping, are the best "weights" applied to the model after this exception is thrown? In other words, is best_iter set correctly? The goal would be to stay within the time budget but not lose the training progress made up to that point.
Oh interesting! It wasn't clear to me that you would want to see training time out but also keep that model.

No, not with a plain `RuntimeError`. In the Python package, a dedicated Python exception is used to tell the training process that early stopping has been triggered, and to carry forward details like the best iteration and evaluation results:

LightGBM/python-package/lightgbm/callback.py Line 436 in e7edb6c
LightGBM/python-package/lightgbm/callback.py Lines 40 to 44 in e7edb6c
LightGBM/python-package/lightgbm/engine.py Lines 327 to 330 in e7edb6c

You could rely on that behavior in your own callback and have it raise an `EarlyStopException` instead (see the sketch after this comment).

Alternatively... have you tried a timeout at the hyperparameter-tuner level? (That might apply to the entire experiment, though, not per-trial... I'm not sure.)
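To illustrate that suggestion, here is a minimal sketch (not from the thread; it assumes `EarlyStopException` is importable from `lightgbm.callback` and takes `(best_iteration, best_score)`, as in the permalinked lines above):

```python
import time

from lightgbm.callback import EarlyStopException


class TimeoutKeepBestCallback:
    """Stop training after a wall-clock budget while keeping early-stopping bookkeeping."""

    def __init__(self, timeout_seconds: float):
        # run after each boosting round, like the built-in early-stopping callback
        self.before_iteration = False
        self.timeout_seconds = timeout_seconds
        self._start = time.monotonic()

    def __call__(self, env) -> None:
        # env is the CallbackEnv that lgb.train() passes to every callback
        if time.monotonic() - self._start > self.timeout_seconds:
            # lgb.train() catches EarlyStopException and records
            # best_iteration / best_score on the returned Booster.
            # This reports the *current* iteration; reporting the truly best
            # one would require tracking evaluation scores yourself.
            raise EarlyStopException(env.iteration, env.evaluation_result_list)
```

Passing an instance via `callbacks=[TimeoutKeepBestCallback(2)]`, together with `valid_sets` so that `env.evaluation_result_list` is populated, should then leave `bst.best_iteration` set when the budget is exceeded.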
Hah! :) I'm planning to create my own cool hyperparameter tuner; that's one of the reasons why I'm interested in this functionality. I can easily see how to do time budgeting at the level of the tuner - just check the budget in the hyperparameter-search loop after each combination has been tried (see the sketch below) - but the underlying estimator has to finish its training gracefully before that check runs, and for some combinations that can take an extremely long time. Writing a great hyperparameter optimizer is one more use case for this timeout feature.

Now I think it's the EarlyStopping callback I should subclass (as I can hardly imagine training without early stopping). Does it make sense to prepare a PR that adds a timeout parameter to the EarlyStopping callback?

That said, it still seems more natural to me to be able to specify a timeout directly in the estimator's fit or init methods, the same as we do with n_iters - just in this case we are interested in a maximum number of seconds, not trees.
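A minimal sketch of that tuner-level budgeting loop (not from the thread; `candidates` and `train_and_score` are hypothetical stand-ins), showing the limitation just described: the budget is only checked between trials, so a single slow trial can overshoot it:

```python
import time


def tune(candidates, train_and_score, budget_seconds):
    """Try hyperparameter combinations until a wall-clock budget runs out.

    candidates: iterable of hyperparameter dicts (hypothetical)
    train_and_score: maps a hyperparameter dict to a validation score (hypothetical)
    """
    start = time.monotonic()
    best_params, best_score = None, float("-inf")
    for params in candidates:
        # this call may itself run for an arbitrarily long time,
        # which is exactly the problem described above
        score = train_and_score(params)
        if score > best_score:
            best_params, best_score = params, score
        # the budget check only happens between trials
        if time.monotonic() - start > budget_seconds:
            break
    return best_params, best_score
```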
I understand why you want that, but it'd be pretty difficult to get right in a thorough way; there are a number of practical challenges.
Even if we chose to ignore all of these concerns and treat them as "not yet implemented", that'd still add complexity in the form of additional warnings, errors, and notes in documentation.
Sorry it took so long to respond to this, @fingoldo. No, I wouldn't support such a PR here in LightGBM. To be honest, I agree with @trivialfis (dmlc/xgboost#10684 (comment))... this feature is not something that should be in libraries like CatBoost / LightGBM / XGBoost. I think it's better implemented outside of those libraries, e.g. in the hyperparameter tuner you're writing and in similar tools.

I think this should be treated as "won't do" and closed. I'll leave it open a bit longer to give you and others a chance to comment.
Adding a timeout parameter to the .fit() method, which would force the library to return the best solution found so far as soon as the given number of seconds has passed since the start of training, would make it possible to satisfy training SLAs when a user has only a limited time budget to finish a model's training. It would also make fair comparisons between different hyperparameter combinations possible.

Reaching the timeout should have the same effect as reaching the maximum number of iterations, possibly with an additional warning and/or an attribute set so that the reason the training job finished is clear to the end user.
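To make the request concrete, here is a hypothetical illustration of the proposed interface (neither the `timeout` parameter nor the `stopped_reason_` attribute exists in LightGBM; both are invented here for illustration):

```python
import lightgbm as lgb
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=10_000, n_features=20)

model = lgb.LGBMRegressor(n_estimators=10_000)

# hypothetical: stop after 600 seconds and keep the best model found so far
model.fit(X, y, timeout=600)

# hypothetical attribute exposing why training finished:
# "timeout" vs. "max_iterations" vs. "early_stopping"
print(model.stopped_reason_)
```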