Commit

Added fobo tutorial

yyexela committed Dec 3, 2023
1 parent 504ccea commit a39e6f5
Showing 2 changed files with 578 additions and 0 deletions.

8 comments on commit a39e6f5

@mreumi commented on a39e6f5 Jan 19, 2024

Hi @yyexela,

I am interested in using BO with derivative observations (FOBO, as you term it). I scanned through the BoTorch discussions and found some of your comments and questions, as well as this commit. I also have code for FOBO running with basic acquisition functions that works well; it looks very much like the tutorial you're suggesting.

I would like to test d-KG as an acquisition function (as you do), but ran into problems with fantasize during implementation. I see you have been working on a fix for this. Since I'm not a GitHub expert, I'm not sure I'm reading the status of this work correctly. Can you tell me how far you have gotten? The last update I found was an open pull request. Are you waiting for approval, or did you still have problems?

I would be very pleased to receive a short reply. Thank you for your contribution!

@yyexela (Owner, Author)

Hello @mreumi,
Yes, I think I solved the problem here: cornellius-gp/gpytorch#2452. Try getting those changes locally and let me know if that fixes your problem! I'm not sure about the status of the changes in the main repository, since that is currently out of my control.

@mreumi commented on a39e6f5 Jan 19, 2024 via email

@mreumi commented on a39e6f5 Jan 20, 2024

@yyexela I tested your code briefly and it seems to be running just fine, thanks!

By the way, I ran into some numerical issues in a few cases using fit_gpytorch_mll, which uses L-BFGS to train the derivative-enabled GP model (with other test functions and analytical gradients than in your tutorial case). As far as I understand the error messages, there seems to be an issue with the line search. Training with a stochastic gradient method seems to be significantly more robust (I use the code from https://botorch.org/tutorials/fit_model_with_torch_optimizer), in case you run into similar problems.
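
For reference, roughly the training loop from that tutorial (a minimal sketch with a plain SingleTaskGP and made-up training data; for FOBO the model and targets would be swapped for their derivative-enabled equivalents):

    import torch
    from botorch.models import SingleTaskGP
    from gpytorch.mlls import ExactMarginalLogLikelihood

    # Hypothetical training data: any (n, d) inputs and (n, 1) outcomes
    train_X = torch.rand(20, 2, dtype=torch.double)
    train_Y = (train_X ** 2).sum(dim=-1, keepdim=True)

    model = SingleTaskGP(train_X, train_Y)
    mll = ExactMarginalLogLikelihood(model.likelihood, model)
    mll.train()

    # First-order optimizer (the tutorial uses Adam) instead of L-BFGS,
    # which avoids the line-search failures mentioned above
    optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
    for _ in range(150):
        optimizer.zero_grad()
        output = model(train_X)
        loss = -mll(output, model.train_targets)
        loss.backward()
        optimizer.step()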

@yyexela (Owner, Author)

@mreumi great, thanks for letting me know! Glad the code is working!

@mreumi commented on a39e6f5 Apr 4, 2024

Hi @yyexela - I dug a bit deeper into FOBO. For some test functions I only get qualitatively nice surrogate models, which are off in their quantitative range. I am not sure if this is a scaling issue or a model training error. Thinking about it, I realized that output scaling is not so trivial for FOBO.
In case you are still working in this direction, I have a question that you might already have come across and that affects your tutorial:

In your tutorial, you apply a scaling to the training data:

    # Standardize domain and range, this prevents numerical issues
    mean_Y = train_Y.mean(dim=0)
    std_Y = train_Y.std(dim=0)
    unscaled_train_Y = train_Y
    scaled_train_Y = (train_Y - mean_Y) / std_Y

Thus mean_Y is something like [mean(f), mean(df/dx1), mean(df/dx2), ...], and std_Y accordingly.
Given that train_Y includes function values and the corresponding derivative values, doesn't this scaling lead to a mismatch between the function values and their derivatives?

Since $d(f - c)/dx = df/dx$, I would think that when subtracting the mean of the function values we should not subtract anything from the gradients. But because $d(c \cdot f)/dx = c \cdot df/dx$, we should scale function values and derivatives by the same std to keep them consistent. I would rather expect a scaling of the target values to look like this:

    import torch

    def normalize_target_data(ytrain_nominal):
        # Columns of ytrain_nominal: [f, df/dx1, df/dx2, ...]
        ytrain_normalized = torch.clone(ytrain_nominal)
        # Subtract the mean from f only; gradients are invariant to constant shifts
        ytrain_normalized[:, 0] -= ytrain_normalized[:, 0].mean()
        # Scale f and all gradient columns by the std of f to keep them consistent
        ytrain_normalized /= ytrain_normalized[:, 0].std()
        return ytrain_normalized
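
As a quick numerical check of this argument (a toy sketch with f(x) = x^2 on a uniform grid): if values and derivatives are still consistent after scaling, finite differences of the scaled f column should match the scaled gradient column. Per-column standardization breaks this; the f-only scaling above preserves it:

    import torch

    # Toy data: f(x) = x**2 with exact gradients df/dx = 2*x
    x = torch.linspace(0.0, 1.0, 11, dtype=torch.double)
    train_Y = torch.stack([x ** 2, 2.0 * x], dim=-1)  # columns: [f, df/dx]

    # (a) per-column standardization, as in the tutorial snippet
    per_col = (train_Y - train_Y.mean(dim=0)) / train_Y.std(dim=0)

    # (b) f-only centering with shared scaling, as proposed above
    f_only = normalize_target_data(train_Y)

    def max_mismatch(Y):
        # Forward differences of the f column vs. gradients averaged at midpoints
        fd = (Y[1:, 0] - Y[:-1, 0]) / (x[1:] - x[:-1])
        grad_mid = 0.5 * (Y[1:, 1] + Y[:-1, 1])
        return (fd - grad_mid).abs().max().item()

    print("per-column:", max_mismatch(per_col))  # clearly nonzero
    print("f-only    :", max_mismatch(f_only))   # ~0 (exact for a quadratic f)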

I couldn't find a clear answer from the different tests I ran, so I thought maybe you have an idea? If so, I'd appreciate it if you could share it!

@yyexela (Owner, Author) commented on a39e6f5 Apr 7, 2024

Hey @mreumi, I'm glad you're interested in FOBO! Unfortunately, if I remember correctly, the scaling fixed an error in the code where the values were overflowing; I don't have any theoretical justification for it. Sorry I couldn't give any more insight.

One thing we noticed when working on FOBO was that the theory Peter Frazier presented for FOBO may not be optimal. As a simple experiment to try on your own:

  • Create a zeroth-order Bayesian optimization (ZOBO) model
  • Run some initialization (e.g. a grid search) for your objective and collect gradient information as well
  • Use a linear approximation around each sampled point to generate additional data, and use that data in your ZOBO model (a sketch follows below)

Doing this gave much better results than FOBO in what I've seen empirically. This suggests there may be a better way to do FOBO than what Frazier showed and what's implemented in BoTorch.
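
A minimal sketch of the data-augmentation step (the helper name and perturbation scheme are just illustrative; train_X is (n, d), train_Y is (n, 1), and train_grad is the (n, d) matrix of observed gradients):

    import torch

    def augment_with_taylor(train_X, train_Y, train_grad, eps=0.05, n_extra=4):
        # For each sampled point x with observed value f(x) and gradient g(x),
        # add n_extra points x + d with approximate values f(x) + g(x)^T d,
        # where d has entries drawn uniformly from [-eps, eps]
        n, dim = train_X.shape
        d = eps * (2.0 * torch.rand(n, n_extra, dim, dtype=train_X.dtype) - 1.0)
        X_new = (train_X.unsqueeze(1) + d).reshape(-1, dim)
        Y_new = train_Y.unsqueeze(1) + (d * train_grad.unsqueeze(1)).sum(-1, keepdim=True)
        # The augmented set then feeds a plain ZOBO surrogate (e.g. SingleTaskGP)
        return torch.cat([train_X, X_new]), torch.cat([train_Y, Y_new.reshape(-1, 1)])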

@mreumi commented on a39e6f5 Apr 8, 2024

Thanks for the reply, I will look into that!
