Implement ENTMOOT in Bofire #278
Conversation
As shown in […]. Is there any functionality that is missing? If not, I will add tests (especially for the conversion between features/constraints) and better documentation.
Hi @TobyBoyne,
thank you very much for your PR and for contributing to BoFire. I left some high-level comments. Feel free to question and discuss them. I am looking forward to being able to use ENTMOOT in BoFire.
Best,
Johannes
Hi @jduerholt, thank you for the review - lots of helpful feedback! I still haven't addressed the issue of type-hinting the parameters - I started looking into the surrogates, but I'm not sure I understand what function they serve. I can implement the types as a simple class that inherits from BaseModel, but I'd rather use surrogates if that is the proper way to do things. As mentioned in the comments, I can't use any of the provided surrogates since they don't have the same attributes (ENTMOOT uses […]). Kind regards,
For reference, here is a mock-up of what the parameter classes could look like:

```python
from typing import Literal

class EntingParams:
    unc_params: 'UncParams'
    tree_train_parameters: 'TreeTrainParams'

class UncParams:
    beta: float = 1.96  # > 0
    acq_sense: Literal["exploration", "penalty"] = "exploration"
    dist_trafo: Literal["normal", "standard"] = "normal"
    dist_metric: Literal["euclidean_squared", "l1", "l2"] = "euclidean_squared"
    cat_metric: Literal["overlap", "of", "goodall4"] = "overlap"

class TreeTrainParams:
    train_lib: Literal["lgbm"] = "lgbm"
    train_params: 'TrainParams'

class TrainParams:
    # this is determined by lightgbm;
    # the typing in the package uses Dict[str, Any];
    # the defaults used by ENTMOOT are shown below
    objective: str = "regression"
    metric: str = "rmse"
    boosting: str = "gbdt"
    num_boost_round: int = 100
    max_depth: int = 3
    min_data_in_leaf: int = 1
    min_data_per_group: int = 1
    verbose: int = -1
```
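As a side note on validation: the constraints noted in the mock-up's comments (e.g. `beta > 0`) can be enforced directly. The following is only an illustrative, framework-free sketch using stdlib dataclasses (not BoFire or ENTMOOT code); the class name `UncParamsSketch` is hypothetical, and the field names mirror the `UncParams` mock-up above.

```python
from dataclasses import dataclass

# Illustrative sketch only: stdlib stand-in for the validation discussed
# in this thread, not the actual BoFire data model.
_ACQ_SENSES = ("exploration", "penalty")

@dataclass
class UncParamsSketch:
    beta: float = 1.96
    acq_sense: str = "exploration"

    def __post_init__(self):
        # enforce the "beta > 0" constraint noted in the mock-up
        if self.beta <= 0:
            raise ValueError("beta must be > 0")
        # enforce the Literal-style set of allowed values for acq_sense
        if self.acq_sense not in _ACQ_SENSES:
            raise ValueError(f"acq_sense must be one of {_ACQ_SENSES}")
```

With pydantic, the same constraints would instead be expressed via `Annotated[float, Field(gt=0)]` and `Literal[...]` on the model fields.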
Thanks @TobyBoyne, I was not at home over the weekend. I will have a detailed look tomorrow. Best, Johannes
Hi Toby,

I would opt for something like this:

```python
class LGBMSurrogate(Surrogate, TrainableSurrogate):
    type: Literal["LGBMSurrogate"] = "LGBMSurrogate"
    objective: str = "regression"
    metric: str = "rmse"
    boosting: str = "gbdt"
    num_boost_round: Annotated[int, Field(ge=1)] = 100
    max_depth: Annotated[int, Field(ge=1)] = 3
    min_data_in_leaf: Annotated[int, Field(ge=1)] = 1
    min_data_per_group: Annotated[int, Field(ge=1)] = 1
    # verbose: int = -1


class EntingSurrogates(Surrogates):
    # Surrogates is a new common base class for both
    # BotorchSurrogates and EntingSurrogates
    surrogates: List[LGBMSurrogate]

    @validator("surrogates")
    def validate_surrogates(cls, v):
        ...


class EntingStrategy(PredictiveStrategy):
    type: Literal["EntingStrategy"] = "EntingStrategy"
    # unc params
    beta: Annotated[float, Field(gt=0)] = 1.96
    acq_sense: Literal["exploration", "penalty"] = "exploration"
    dist_trafo: Literal["normal", "standard"] = "normal"
    dist_metric: Literal["euclidean_squared", "l1", "l2"] = "euclidean_squared"
    cat_metric: Literal["overlap", "of", "goodall4"] = "overlap"
    # surrogate params
    surrogate_specs: Optional[EntingSurrogates] = None
    # solver params
    solver_params: Dict[str, Any] = {}  # do we have any defaults here?
```

In BoFire we have surrogates and strategies: surrogates are regression models that can be used to model the output values of experiments, and (predictive) strategies use surrogates for optimization. The 100% solution would be to implement a new surrogate model. We do the same for the botorch-based strategies; there, too, different model types and hyperparameters can be used within the same optimization for different outputs. Even subsets of outputs can be used.

I looked a bit into the Entmoot code, and from my understanding the model parameters are used globally, meaning that for every output that should be optimized, a model with the same hyperparameters and input features is trained. One could guarantee this behavior with a validator in the class.

What do you think? Is this somehow clear?

Best,
Johannes
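To illustrate the validator idea above (guaranteeing that every output is modeled with the same hyperparameters), here is a minimal framework-free sketch. Representing the surrogate configs as plain dicts is an assumption for illustration only; in the real data model this logic would live in a pydantic validator on the surrogates field.

```python
def validate_identical_hyperparams(surrogates):
    """Raise if the surrogate configs do not all share the same hyperparameters.

    `surrogates` is assumed (for this sketch) to be a list of dicts of
    hyperparameters; this is not BoFire's actual API.
    """
    if not surrogates:
        return surrogates
    reference = surrogates[0]
    for config in surrogates[1:]:
        if config != reference:
            raise ValueError(
                "All ENTMOOT surrogates must share the same hyperparameters"
            )
    return surrogates
```

The same check applied inside a `@validator("surrogates")` would make invalid specs fail at construction time rather than at optimization time.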
Here is the first version of the mentioned PR: #283
Hi @jduerholt, yes, I think that all makes sense - thank you for the explanation. So to confirm, as an example, the […]? Also, in terms of moving forward - do I have write permissions for the […]? Kind regards,
Good point! Will work on it for Entmoot.
On Tue, 12 Sept 2023 at 13:23, Robert Lee commented on this pull request, in bofire/data_models/strategies/enting.py:
```diff
+)
+from bofire.data_models.features.api import (
+    CategoricalDescriptorInput,
+    CategoricalInput,
+    ContinuousInput,
+    ContinuousOutput,
+    DiscreteInput,
+    Feature,
+)
+from bofire.data_models.objectives.api import MinimizeObjective, Objective
+from bofire.data_models.strategies.strategy import Strategy
+
+
+class EntingStrategy(Strategy):
+    type: Literal["EntingStrategy"] = "EntingStrategy"
+    enting_params: Dict[str, Any] = {}
```
I am not a big fan of having dictionary fields that can contain anything, as it is somehow contrary to using a validation framework like pydantic. Can you provide the possible parameters with their defaults, types, and validation, so that they can be used as fields of the data model?

This change should be made directly in entmoot (@spiralulam wanted to do it); Johannes's argument is in general a good one and not just valid here.
In […].
The […].
In the call to […].
You would have it in both the surrogate and the strategy, but you can call the […] in the strategies.
I'll try to get the PR through by tomorrow at the latest, but in terms of moving forward, I would recommend that you just create a new branch/PR in which you first just implement the […]. For the […]. What do you think?
When we have the […].
Happy to work on that! Thanks for all the reviewing and help, very much appreciated!
I've made an initial commit for the surrogate; however, in order to implement the regression in #285, I need the utils in this PR - any thoughts on how I can overcome that?
I've been working on implementing changes in the Entmoot codebase. Since this was last discussed, there are a few things that have changed: […]
Hi @TobyBoyne, I will do a proper review this evening. Thank you already! Best, Johannes
We are converging! The biggest obstacle from my perspective is currently some problem with the entmoot installation in the test pipeline. But this looks to me like an entmoot problem and not a bofire problem (see my comments).

Best,
Johannes
```python
import numpy as np
import pandas as pd
import pyomo.environ as pyo
```
Also, the pyomo import has to be moved into the try/except.
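For reference, a common pattern for such optional imports looks like the following; this is a generic sketch (module names taken from the discussion in this thread, flag name `ENTMOOT_AVAILABLE` is an assumption), not the exact bofire code.

```python
# Guard optional dependencies behind a single try/except so that importing
# this module succeeds even when the entmoot extra is not installed.
try:
    import entmoot  # noqa: F401
    import pyomo.environ as pyo  # noqa: F401

    ENTMOOT_AVAILABLE = True
except ImportError:
    ENTMOOT_AVAILABLE = False
```

Code that actually needs entmoot/pyomo can then raise a helpful error (or skip tests) when `ENTMOOT_AVAILABLE` is False, instead of failing at import time.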
Furthermore, there is something strange with the entmoot installation. Looking at the logs of the failing test runs, it is installing entmoot but not pyomo, lgbm, and gurobipy, which are entmoot's dependencies. I then tried to install entmoot locally on my laptop into an environment without it, and there too it did not install the dependencies. To me this looks like a problem with entmoot. Any idea?
The solver dependencies need to be installed manually.
https://entmoot.readthedocs.io/en/master/installation.html
Ok, then we have to put them into our dependencies for the entmoot option in setup.py, because as the strategy is now implemented, it is based on pyomo.
I have added these dependencies in setup.py in BoFire. However, these dependencies definitely do need to be fixed on entmoot's side - for example, lightgbm is not installed despite being required, and it should probably just install pyomo anyway.
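The resulting extras declaration could look roughly like this; it is a sketch only, and the exact package list and any version pins in BoFire's setup.py may differ.

```python
# Hypothetical extras_require fragment for setup.py, installable via
# `pip install bofire[entmoot]`; mirrors the dependencies discussed above.
extras_require = {
    "entmoot": [
        "entmoot",
        "pyomo",     # solver interface the strategy is built on
        "lightgbm",  # tree-training backend
        "gurobipy",  # optional solver backend
    ],
}
```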
```python
# This ensures that new points are far from pending candidates
real_experiments = self.experiments
if self.candidates is not None:
    for i in range(len(self.candidates)):
        ...
```
Ok, makes sense. I have to think about it further, but for now we leave it as it is and will merge it like this. So for now, regard this as resolved ;)
```python
preds = self.predict(candidate)
candidate = pd.concat((candidate, preds), axis=1)
as_experiment = self._fantasy_as_experiment(candidate)
self.tell(as_experiment)
```
In your case here, just using _fit should be fine.
```python
assert (input_values != 0).sum().sum() <= allowed_k
```

```python
@pytest.mark.slow
```
This makes sense. Could you check whether gurobi is installed in roughly the same way that you check that entmoot is installed, and then add another skipif? That would be a bit cleaner. If not possible, then we keep it.
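One way to implement such a check without importing gurobi at module import time is via importlib; a skipif marker can then use the resulting flag. This is a sketch under that assumption (flag name `GUROBI_AVAILABLE` is hypothetical), not the code in the PR.

```python
import importlib.util

# True only if gurobipy is importable in the current environment;
# find_spec avoids actually importing (and licensing) gurobi here.
GUROBI_AVAILABLE = importlib.util.find_spec("gurobipy") is not None

# A test module could then mark gurobi-dependent tests like this
# (assuming pytest is available):
#
#   @pytest.mark.skipif(not GUROBI_AVAILABLE, reason="gurobipy not installed")
#   def test_needs_gurobi():
#       ...
```

Note that this only detects the package, not a valid license; a licensing failure would still surface inside the test itself.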
```python
def test_propose_optimal_point(common_args):
    # regression test, ensure that a good point is proposed
    np.random.seed(42)
    random.seed(42)
```
I would prefer passing the seed explicitly to entmoot by calling strategy._get_seed() and then passing the int to entmoot. As a general note, it would also be better in entmoot to not set the seeds globally but to use numpy's rng functionality.
Addressed all of the above comments. I also raised an issue on ENTMOOT about the random state, because I think you are definitely correct! cog-imperial/entmoot#31
Looks overall very good to me. But somehow the tests are still failing. One sees in the test logs that entmoot gets installed, but somehow the stuff in the try/except block fails, so that nothing gets imported. No idea why.

I will try to set the permissions for you in a way that the test pipeline is triggered directly, without needing us for this, so you can iterate faster.
Two sources of issues that I've found so far: Entmoot requires […]; pyright isn't happy with […]. Edit: actually, […].
I've made some progress. There are quite a few […]. However, I'm still getting issues with running the pipeline. On my machine, if I remove the license file, the tests that require Gurobi are skipped. However, the pipeline seems to still be attempting to run these, and then having issues. Any ideas? (Note that […])
Using […].
I found the results from the latest pipeline execution here: https://github.com/TobyBoyne/bofire/actions/runs/8236200085/job/22522063543. I also tried your catch for gurobi locally (I just installed it via pip) without having a license and got this: […] Any ideas?
Here it is stated that it comes with a trial license that can be used only for small problems: https://pypi.org/project/gurobipy/
Hi @TobyBoyne, I thought about it; to get it implemented quickly, just mark the tests in which gurobi is needed as slow and we are done ;) Please also resolve the merge conflicts, and then we can finally have it in. Best, Johannes
Hi @jduerholt, I've added this and resolved the merge conflicts. Looks like the tests are all passing...? Hopefully this should be good to go! Thank you for all of the reviews! Looking forward to the next PR :)
Hi @TobyBoyne,
looks good to me. Thanks for all your patience!
@bertiqwerty @R-M-Lee also ready to merge from your side?
Best,
Johannes
Thanks a lot, Toby and Johannes!
@jduerholt, @TobyBoyne Thanks to both of you for getting this through. Are we happy with the treatment of the optional entmoot dependencies now, or was this more of a short-term fix and we still have some open to-dos?
Initial progress on converting a BoFire problem to an ENTMOOT optimisation problem.