- scikit-learn
- matplotlib
- numpy
- scipy
python Bayes_opt.py
Bayesian optimization (BO) is used to approximate black-box functions that are expensive to evaluate. It has proven useful in several applications, including hyperparameter optimization, neural architecture search, and materials design. Classic BO assumes a search region and a scalar black-box function f evaluated in the presence of additive noise.
Unlike this standard BO formulation, we assume that our search region consists of sets of m points in R^d, for a fixed positive integer m. Thus, f would take in a "set" containing m elements, each of dimension d, and return a noisy function value.
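To make this set-valued interface concrete, here is a minimal sketch of what such an objective looks like. The function `f_set`, its per-element score, and its noise level are illustrative placeholders, not the actual objective used in the repo; the key points are that a set is represented as an (m, d) array and that the reduction over elements is permutation invariant:

```python
import numpy as np

rng = np.random.default_rng(0)

def f_set(X, noise_std=0.1):
    """Evaluate a set-valued objective on a set X of m points in R^d.

    X is represented as an array of shape (m, d). The order of rows must
    not matter, so we use a permutation-invariant reduction (the mean of
    a per-element score), then add Gaussian observation noise.
    """
    per_element = np.sum(X ** 2, axis=1)  # illustrative score for each element
    return per_element.mean() + rng.normal(0.0, noise_std)

# A set with m = 3 elements, each of dimension d = 2.
X = np.array([[0.0, 1.0],
              [1.0, 0.0],
              [0.5, 0.5]])
y = f_set(X)  # one noisy scalar observation for the whole set
```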
We specifically implement equation 4 of the paper "Practical Bayesian Optimization over Sets", which defines a set kernel. To illustrate the algorithm, we take a standard non-convex test function with multiple optima, the Branin function. We discretize its domain into coordinate tuples (x, y) and treat each tuple as a set with cardinality 1 and dimension 2, so the entire domain becomes a set of sets. A Gaussian Process (GP) serves as the surrogate model, with our custom SetKernel as its kernel, and Expected Improvement (EI) is used as the acquisition function.
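The set kernel in question averages a base kernel over all pairs of elements drawn from the two sets, which makes it invariant to the ordering of elements within each set. A minimal sketch, assuming an RBF base kernel (the function names and the lengthscale default are illustrative, not the repo's SetKernel API):

```python
import numpy as np

def rbf(x, y, lengthscale=1.0):
    """Squared-exponential base kernel between two d-dimensional vectors."""
    diff = x - y
    return np.exp(-0.5 * np.dot(diff, diff) / lengthscale ** 2)

def set_kernel(X1, X2, lengthscale=1.0):
    """Set kernel: average the base kernel over all element pairs.

    X1 has shape (m1, d), X2 has shape (m2, d). Averaging over the
    m1 * m2 pairs makes the value independent of element order.
    """
    total = sum(rbf(x, y, lengthscale) for x in X1 for y in X2)
    return total / (len(X1) * len(X2))
```

Note that for cardinality-1 sets, as in the Branin discretization above, the double sum has a single term and the set kernel reduces to the base kernel between the two points.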
In the following GIFs we illustrate the performance of the BO over sets algorithm using contour plots and surface plots. The BO is run for 100 iterations and the plots in the GIFs are captured every 5 iterations. The orange triangles mark the global optima of the function, the red dots mark the last 5 points suggested by the BO, and the green X marks the most recent suggestion. As the iterations proceed, the suggested points move closer to the global optima, which is the expected behaviour. The surface (mesh) plots visualize how the function is learned over the course of the iterations.
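The Expected Improvement acquisition that drives these suggestions has a well-known closed form under the GP posterior. A minimal sketch for minimization (which fits the Branin benchmark); the `xi` exploration offset and its default are illustrative assumptions, not taken from the repo's implementation:

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, f_best, xi=0.01):
    """Closed-form EI for minimization.

    mu, sigma: GP posterior mean and standard deviation at the candidates.
    f_best: best (lowest) value observed so far.
    xi: small exploration offset (illustrative default).
    """
    mu = np.atleast_1d(np.asarray(mu, dtype=float))
    sigma = np.atleast_1d(np.asarray(sigma, dtype=float))
    improve = f_best - mu - xi
    with np.errstate(divide="ignore", invalid="ignore"):
        z = improve / sigma
        ei = improve * norm.cdf(z) + sigma * norm.pdf(z)
    ei[sigma == 0.0] = 0.0  # zero posterior variance: no expected improvement
    return ei
```

At each iteration the candidate set maximizing this quantity over the discretized domain is the next point suggested by the BO loop.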