[Feature] Allow perturbing a configuration #289

Open
eddiebergman opened this issue Jan 18, 2023 · 3 comments

@eddiebergman

eddiebergman commented Jan 18, 2023

space = ConfigurationSpace({...})
optimum = space.sample_configuration()  # Or more likely evaluation

close_neighbor = optimum.perturb(std=0.05)  # std on a scale of 0-1

std doesn't really make sense as a name, since it's more like a sphere around a config whose size is a percentage of the range.
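
To make the "percentage of the range" reading concrete, here is a minimal sketch using only the standard library. The names `perturb_linear` and `fraction` are hypothetical and not part of ConfigSpace; it assumes a plain numeric range with no log scaling.

```python
import random


def perturb_linear(value, lower, upper, fraction, rng):
    """Sample near `value`, with spread given as a fraction of the range."""
    # The normal's standard deviation is `fraction` of the full range width.
    scale = fraction * (upper - lower)
    sample = rng.gauss(value, scale)
    # Clip so the neighbor stays inside the hyperparameter's bounds.
    return min(max(sample, lower), upper)


rng = random.Random(0)
# With bounds [0, 200] and fraction=0.05 the standard deviation is 10.
neighbors = [perturb_linear(100.0, 0.0, 200.0, 0.05, rng) for _ in range(5)]
```

Under this reading, the same `fraction` gives a wider absolute spread on a wider range, which is why a name other than `std` might fit better.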

One thing that is unclear is the behaviour for categoricals.

Here's some reference work for mfpbench::Config::perturb():

def perturb(
    value: ValueT,
    hp: (
        Constant
        | UniformIntegerHyperparameter
        | UniformFloatHyperparameter
        | NormalIntegerHyperparameter
        | NormalFloatHyperparameter
        | CategoricalHyperparameter
        | OrdinalHyperparameter
    ),
    std: float,
    seed: int | np.random.RandomState | None = None,
) -> ValueT:
    # TODO:
    # * https://github.com/automl/ConfigSpace/issues/289
    assert 0 <= std <= 1, "Noise must be between 0 and 1"
    rng: np.random.RandomState
    if seed is None:
        rng = np.random.RandomState()
    elif isinstance(seed, int):
        rng = np.random.RandomState(seed)
    else:
        rng = seed

    if isinstance(hp, Constant):
        return value

    if isinstance(
        hp,
        (
            NormalIntegerHyperparameter,
            NormalFloatHyperparameter,
            UniformFloatHyperparameter,
            UniformIntegerHyperparameter,
        ),
    ):
        # TODO:
        # * https://github.com/automl/ConfigSpace/issues/287
        # * https://github.com/automl/ConfigSpace/issues/290
        # * https://github.com/automl/ConfigSpace/issues/291
        assert hp.upper is not None and hp.lower is not None
        assert hp.q is None
        assert isinstance(value, (int, float))

        if isinstance(hp, UniformIntegerHyperparameter):
            if hp.log:
                _lower = np.log(hp.lower)
                _upper = np.log(hp.upper)
            else:
                _lower = hp.lower
                _upper = hp.upper
        elif isinstance(hp, NormalIntegerHyperparameter):
            _lower = hp.nfhp._lower
            _upper = hp.nfhp._upper
        elif isinstance(hp, (UniformFloatHyperparameter, NormalFloatHyperparameter)):
            _lower = hp._lower
            _upper = hp._upper
        else:
            raise RuntimeError("Wut")

        # std is a fraction of the range, so scale by the range width only once
        space_length = _upper - _lower
        rescaled_std = std * space_length

        if not hp.log:
            sample = np.clip(rng.normal(value, rescaled_std), _lower, _upper)
        else:
            logged_value = np.log(value)
            sample = rng.normal(logged_value, rescaled_std)
            sample = np.clip(np.exp(sample), hp.lower, hp.upper)

        if isinstance(hp, (UniformIntegerHyperparameter, NormalIntegerHyperparameter)):
            return int(np.rint(sample))
        elif isinstance(hp, (UniformFloatHyperparameter, NormalFloatHyperparameter)):
            return float(sample)  # type: ignore
        else:
            raise RuntimeError("Please report to github, shouldn't get here")

        # if isinstance(hp, (BetaIntegerHyperparameter, BetaFloatHyperparameter)):
        # TODO
        # raise NotImplementedError(
        # "BetaIntegerHyperparameter, BetaFloatHyperparameter not implemented"
        # )

    if isinstance(hp, CategoricalHyperparameter):
        # With probability (1 - std) keep the same value, otherwise select
        # uniformly at random among the other choices
        if rng.uniform() < 1 - std:
            return value

        # Use a list rather than a set so a seeded rng gives reproducible picks
        choices = [c for c in hp.choices if c != value]
        return rng.choice(choices)

    if isinstance(hp, OrdinalHyperparameter):
        # TODO:
        # * https://github.com/automl/ConfigSpace/issues/288
        # We build a normal centered at the index of value
        # which acts on index spacings
        index_value = hp.sequence.index(value)
        index_std = std * len(hp.sequence)
        normal_value = rng.normal(index_value, index_std)
        index = int(np.rint(np.clip(normal_value, 0, len(hp.sequence) - 1)))
        return hp.sequence[index]

    raise ValueError(f"Can't perturb {hp}")
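
For the log-scaled case, the idea is that the perturbation happens in log space and is then mapped back. A minimal standard-library sketch of just that step follows; `perturb_log` and `fraction` are hypothetical names, and it assumes strictly positive bounds, so it is an illustration rather than the mfpbench code above.

```python
import math
import random


def perturb_log(value, lower, upper, fraction, rng):
    """Perturb `value` in log space, then map back and clip to the bounds."""
    log_lower, log_upper = math.log(lower), math.log(upper)
    # Spread is a fraction of the log-range width, not the raw width.
    scale = fraction * (log_upper - log_lower)
    sample = math.exp(rng.gauss(math.log(value), scale))
    return min(max(sample, lower), upper)


rng = random.Random(0)
# For a learning rate in [1e-5, 1e-1], fraction=0.05 gives a log-space
# standard deviation of 0.05 * ln(1e4), roughly 0.46.
samples = [perturb_log(1e-3, 1e-5, 1e-1, 0.05, rng) for _ in range(1000)]
```

Because the noise is added to the logarithm, neighbors differ from the original value by a multiplicative factor rather than an additive offset.
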
@mfeurer

mfeurer commented Jan 19, 2023

Hey, I think this is pretty close to the neighborhood retrieval currently implemented. What would be the exact difference?

@eddiebergman

As far as I'm aware, get_one_exchange_neighborhood acts slightly differently (at least from the name it sounds like it would). The get_neighbors functions are not useful here, as they act on the values stored in the configuration's np.ndarray, i.e. they're very much private functions.

If get_one_exchange_neighborhood could be used for this exact same effect, then we should attach it as a method on a configuration; get_one_exchange_neighborhood in ConfigSpace.util is not somewhere I would look.

Edit: I looked at one exchange. By the looks of it, it only acts on one HP at a time, and we also needed to treat categoricals with some sort of "strength" to stick to the current categorical, as captured here:

    if isinstance(hp, CategoricalHyperparameter):
        # With probability (1 - std) keep the same value, otherwise select
        # uniformly at random among the other choices
        if rng.uniform() < 1 - std:
            return value

        # Use a list rather than a set so a seeded rng gives reproducible picks
        choices = [c for c in hp.choices if c != value]
        return rng.choice(choices)

This is because with a low std.dev like 0.1, we would like it to just stick to the same categorical 90% of the time.
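
A quick way to check that 90% figure is to simulate it. This is a minimal standard-library sketch, with `perturb_categorical` as a hypothetical helper that mirrors the rule above (and additionally returns the value unchanged when there is no other choice to switch to):

```python
import random


def perturb_categorical(value, choices, std, rng):
    """Keep `value` with probability 1 - std, else pick another choice."""
    others = [c for c in choices if c != value]
    if not others or rng.random() < 1 - std:
        return value
    return rng.choice(others)


rng = random.Random(0)
choices = ["adam", "sgd", "rmsprop"]
draws = [perturb_categorical("adam", choices, 0.1, rng) for _ in range(10_000)]
kept = draws.count("adam") / len(draws)  # expected to be close to 0.9
```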

I guess functionality like this in ConfigSpace is different than the specific method get_one_exchange_neighborhood, and get_neighbors is not friendly enough to use. (Also, from a practical standpoint, I tried get_neighbors for the uniforms, but we are locked to a ConfigSpace version where using get_neighbors got stuck in rejection sampling.)

@eddiebergman

eddiebergman commented Jan 19, 2023

I think the best course of action is a more usable form of get_neighbors on each HP, and making the current get_neighbors into something private, since it sits in an optimized hot loop that works with the scaled np.ndarray values, which require transformations that are non-obvious and prone to silent errors.
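
Roughly what I have in mind, as a sketch only: a public method that accepts and returns user-facing values and hides the scaling. Everything here is hypothetical (`UniformFloat`, `to_unit`, `from_unit`, `neighbors` are stand-ins, not the ConfigSpace API), and it assumes the internal representation is a simple linear map onto [0, 1].

```python
import random
from dataclasses import dataclass


@dataclass
class UniformFloat:
    """Hypothetical stand-in for a uniform float hyperparameter."""

    lower: float
    upper: float

    def to_unit(self, value):
        # User-facing value -> internal [0, 1] representation.
        return (value - self.lower) / (self.upper - self.lower)

    def from_unit(self, unit):
        # Internal [0, 1] representation -> user-facing value.
        return self.lower + unit * (self.upper - self.lower)

    def neighbors(self, value, n, std, rng):
        """Public wrapper: takes and returns user-facing values."""
        center = self.to_unit(value)
        units = (min(max(rng.gauss(center, std), 0.0), 1.0) for _ in range(n))
        return [self.from_unit(u) for u in units]


hp = UniformFloat(lower=10.0, upper=20.0)
near = hp.neighbors(15.0, n=4, std=0.05, rng=random.Random(0))
```

The caller never touches the unit-scaled numbers, so the transformations stay in one place instead of being repeated (and possibly gotten wrong) at each call site.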
