Implement AxModelManager and allow building GP models from diagnostics #178

Merged
merged 64 commits into optimas-org:main from delaossa:feature/ax_model_manager on Mar 7, 2024
Conversation

@delaossa delaossa (Collaborator) commented Feb 9, 2024

This class provides easy access to building and evaluating surrogate models using Ax.
If an AxModelManager instance is created from an ExplorationDiagnostics object via the build_gp_model method, it automatically takes the varying parameters in ExplorationDiagnostics as the model parameters.

This is an example based on one of the tests in Optimas:

from optimas.diagnostics import ExplorationDiagnostics

diags = ExplorationDiagnostics('tests_output/test_exploration_diagnostics')
print(diags.objectives)
print(diags.varying_parameters)

There are two objectives:

[Objective(name='f', dtype='<f8', minimize=True), 
 Objective(name='f2', dtype='<f8', minimize=False)]

and two varying parameters:

[VaryingParameter(name='x0', dtype='<f8', lower_bound=-50.0, upper_bound=5.0, is_fidelity=False, fidelity_target_value=None, default_value=None), 
 VaryingParameter(name='x1', dtype='<f8', lower_bound=-5.0, upper_bound=15.0, is_fidelity=False, fidelity_target_value=None, default_value=None)]

Build a Gaussian process (GP) model for f from the Optimas diagnostics data:

mm = diags.build_gp_model('f')

Evaluate the model over the 10 best scoring f evaluations in the exploration diagnostics:

df = diags.get_best_evaluations(top=10)[['x0', 'x1', 'f']]
mean, sed = mm.evaluate_model(df)
df['f_model_mean'] = mean
df['f_model_sed'] = sed
print(df)
            x0        x1           f  f_model_mean  f_model_sed
98  -42.583406 -3.549389 -332.679062   -303.868504    24.938903
54  -50.000000 -3.250379 -331.712969   -314.301356    30.215155
69  -50.000000 -4.121062 -278.755411   -273.341414    29.272605
75  -50.000000 -2.527401 -266.861011   -249.542112    29.578532
64  -36.498331 -3.116813 -266.846823   -281.811295    24.962345
94  -36.071225 -4.069215 -258.980964   -243.572727    25.912735
103 -36.337730 -2.629551 -239.410503   -242.014318    26.215060
78  -23.191039 -2.930704 -209.696421   -199.370949    28.178929
107 -50.000000 -4.605092 -207.422854   -215.148531    28.709964
119 -20.812057 -3.720468 -194.718972   -170.123281    28.301797

Note that the best-scoring evaluation in the data (index=98) does not coincide with the best-scoring evaluation according to the model (index=54).

print(mm.get_best_evaluation(use_model_predictions=False))
print(mm.get_best_evaluation())
(98, {'x0': -42.58340615003209, 'x1': -3.549388990644113})
(54, {'x0': -50.0, 'x1': -3.2503794992487682})

Plot the model for f (mean and standard error), marking the top 10 evaluations with crosses and annotating their trial indices:

fig, axs = mm.plot_contour(mode='both', figsize=(10, 4.8))
for ax in axs:
    ax.scatter(df['x0'], df['x1'], c='red', marker='x')
    for x, y, idx in zip(df.x0, df.x1, df.index):
        ax.annotate(
            str(idx),
            (x, y),
            (2, 2),
            va="bottom",
            fontsize="xx-small",
            textcoords="offset points",
        )

[Figure model_f_ax: contour plots of the model mean and standard error for f, with the top 10 evaluations marked and annotated.]

One can also build a model for other objectives or analyzed parameters, e.g.:

mm = diags.build_gp_model('f2')

Or a model with multiple metrics. This example uses all the objectives present in the Optimas diagnostics:

# Note: AxModelManager must be imported from optimas (see the package docs for the exact module path).
mm = AxModelManager(
    source=diags.history,
    objectives=diags.objectives,
    varying_parameters=diags.varying_parameters,
)

Evaluate the model for f2 over the 10 best scoring f2 evaluations in the exploration diagnostics:

df2 = diags.get_best_evaluations(objective='f2', top=10)[['x0', 'x1', 'f2']]
mean, sed = mm.evaluate_model(df2, metric_name='f2')
df2['f2_model_mean'] = mean
df2['f2_model_sed'] = sed
print(df2)
            x0         x1           f2  f2_model_mean  f2_model_sed
10  -46.885061  14.289673  1531.639618    1181.397155     45.673601
45  -50.000000  12.659369  1423.379433    1383.034240     51.006830
65  -35.445952  13.330210  1414.661161    1174.202741     42.240142
112 -50.000000  13.086907  1406.182705    1398.408933     51.072583
90  -50.000000  12.163289  1352.750994    1296.826693     50.739746
68  -50.000000  13.626901  1296.775107    1350.675382     51.690451
53  -50.000000  11.826384  1252.375392    1196.192029     50.619594
89  -43.358383  13.156690  1220.127665    1325.829987     42.757083
74  -33.174457  12.302707  1200.491551    1043.329809     45.902341
85  -50.000000  14.058802  1166.142981    1265.861102     51.450379

Plot the mean of the two models (f and f2):

import matplotlib as mpl
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(10, 4.8))
gs = mpl.gridspec.GridSpec(1, 2, wspace=0.2, hspace=0.3)
ax1 = mm.plot_contour(metric_name='f', pcolormesh_kw={'cmap': 'GnBu'}, subplot_spec=gs[0, 0])
ax2 = mm.plot_contour(metric_name='f2', pcolormesh_kw={'cmap': 'OrRd'}, subplot_spec=gs[0, 1])
ax1.scatter(df['x0'], df['x1'], c='red', marker='x')
ax2.scatter(df2['x0'], df2['x1'], c='royalblue', marker='x')

[Figure model_f_f2_ax: side-by-side contour plots of the model means for f and f2, with the respective top 10 evaluations marked.]

The two models for f and f2 are very similar.
The reason is that the true underlying functions for f and f2 used in the tests are:

f = -(x0 + 10 * np.cos(x0)) * (x1 + 5 * np.cos(x1))
f2 = -2 * (x0 + 10 * np.cos(x0)) * (x1 + 5 * np.cos(x1))

so f2 is just 2 * f.
However, since in the example we told optimas to find the minimum of f and the maximum of f2, the best evaluations fall in opposite regions.
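As a quick sanity check of this relation, one can evaluate both metrics of the multi-metric model at the same point and verify that the predicted means differ by roughly a factor of 2 (only approximately, since the two GP fits are independent). A minimal sketch, using a hypothetical test point and assuming the multi-metric mm from above is in scope:

# Hypothetical test point; any location inside the parameter bounds works.
sample = {'x0': [-40.0], 'x1': [-3.0]}
f_mean, _ = mm.evaluate_model(sample, metric_name='f')
f2_mean, _ = mm.evaluate_model(sample, metric_name='f2')
print(f2_mean / f_mean)  # should be close to 2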

Let's plot the model of 'f' together with the real function:

import numpy as np

parx0 = mm.ax_client.experiment.parameters["x0"]
parx1 = mm.ax_client.experiment.parameters["x1"]
xaxis = np.linspace(parx0.lower, parx0.upper, 200)
yaxis = np.linspace(parx1.lower, parx1.upper, 200)
X, Y = np.meshgrid(xaxis, yaxis)
F = -(X + 10 * np.cos(X)) * (Y + 5 * np.cos(Y))
sample = {'x0': X.flatten(), 'x1': Y.flatten()}
fm, fsem = mm.evaluate_model(metric_name='f', sample=sample)
FM = fm.reshape(X.shape)

fig, axs = plt.subplots(1, 2, figsize=(10, 4.8))
ax = axs[0]
imf = ax.pcolormesh(X, Y, F)
cbarf = plt.colorbar(imf, ax=ax, location="top")
cbarf.set_label('f (analytic)')
ax.set_xlabel('x0')
ax.set_ylabel('x1')
ax = axs[1]
imfm = ax.pcolormesh(X, Y, FM, vmin=cbarf.vmin, vmax=cbarf.vmax)
cbarfm = plt.colorbar(imfm, ax=ax, location="top")
cbarfm.set_label('f (model)')
ax.set_xlabel('x0')
ax.set_ylabel('x1')

[Figure model_f_vs_analytic: the analytic f (left) next to the model prediction (right).]

We can see that, while the oscillations along x1 are well captured by the model, this is not the case along x0, where the model only reproduces the averaged trend. This is most likely due to the sparser sampling in the x0 dimension.
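A rough way to quantify this, sketched below assuming the diags object from above is still in scope, is to compare how many evaluations fall within one period (2*pi) of the cosine oscillation along each axis:

# Rough sampling-density check: evaluations per cosine period (2*pi) along each axis.
# Bounds are taken from the varying parameters of the diagnostics.
n_evals = len(diags.history)
for vp in diags.varying_parameters:
    periods = (vp.upper_bound - vp.lower_bound) / (2 * np.pi)
    print(f"{vp.name}: {periods:.1f} periods, ~{n_evals / periods:.0f} evaluations per period")

Since the x0 range spans almost three times as many periods as the x1 range, each oscillation along x0 is covered by far fewer evaluations.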

Finally, draw a slice of the model along x1, with x0 fixed at its midpoint:

parx0 = mm.ax_client.experiment.parameters["x0"]
parx1 = mm.ax_client.experiment.parameters["x1"]
x0_mid = 0.5 * (parx0.lower + parx0.upper)
Np = 100
x1 = np.linspace(parx1.lower, parx1.upper, Np)
metric_names = mm.ax_client.objective_names
funcs = [-(x0_mid + 10 * np.cos(x0_mid)) * (x1 + 5 * np.cos(x1)),
         -2 * (x0_mid + 10 * np.cos(x0_mid)) * (x1 + 5 * np.cos(x1))]
fig, axs = plt.subplots(len(metric_names), 1, sharex=True)
for i, (ax, metric_name, func) in enumerate(zip(axs, metric_names, funcs)):
    mean, sed = mm.evaluate_model(
        sample={"x0": x0_mid * np.ones_like(x1), "x1": x1}, metric_name=metric_name
    )
    ax.plot(x1, mean, color=f"C{i}", label="model")
    ax.fill_between(
        x1, mean - sed, mean + sed, color="lightgray", alpha=0.5
    )
    ax.plot(x1, func, ls=':', c='black', lw=0.8, label='true value')
    ax.set_ylabel(metric_name)
    ax.legend(frameon=False)
plt.xlabel("x1")

[Figure model_f_f2_1d: 1D slices of the f and f2 models along x1 at the x0 midpoint, compared with the true functions.]

@delaossa delaossa requested a review from AngelFP February 9, 2024 16:09
@AngelFP AngelFP added the enhancement New feature or request label Feb 13, 2024
@AngelFP AngelFP changed the title Implement the AxModelManager class [WIP] Implement the AxModelManager class Feb 15, 2024
delaossa and others added 25 commits March 6, 2024 17:36
- Formatting.
- Rename `plot_model` to `plot_contour`.
- Make `_get_best_point` and `_get_mid_point` internal methods.
- Rename arguments of `plot_contour` for consistency.
- Check if Ax is installed (it should remain an optional dependency of optimas).
- Improve docstrings.
- Rename `fixed_point` to `fixed_parameters`, since it is not really a point (it is actually a slice).
- Replace `RuntimeError` with more appropriate `ValueError`.
- Return figure and axis in `plot_contour`.
- Add `plot_slice` method.
@AngelFP AngelFP changed the title [WIP] Implement the AxModelManager class Implement AxModelManager class and allow building GP models from diagnostics Mar 7, 2024
@AngelFP AngelFP changed the title Implement AxModelManager class and allow building GP models from diagnostics Implement AxModelManager and allow building GP models from diagnostics Mar 7, 2024
@AngelFP AngelFP (Member) left a comment

Great @delaossa, thanks a lot for this!
I really like this feature, and the PR looks great now.

@AngelFP AngelFP merged commit 706f80a into optimas-org:main Mar 7, 2024
8 checks passed
@delaossa delaossa deleted the feature/ax_model_manager branch July 26, 2024 08:08