
Generate neighbors and sample configurations correctly with deeply nested conditions #197

Closed
wants to merge 17 commits

Conversation

filipbartek (Contributor)

Resolves #194.

One-exchange neighborhood iterator rigorously checks some of the
generated configurations. The check is performed with probability 5%,
as decided by `np.random`.

Make the check deterministic (using an explicitly seeded
`np.random.RandomState`) instead of relying on the default numpy random
state, which need not be seeded, inside `get_one_exchange_neighborhood`.

Update the hard-coded expected values.
The PCS contains a conditional dependency with the operator "||",
which is interpreted as `OrConjunction`.
An optimization declares a hyperparameter (HP) inactive
as soon as any of the HP's parents according to a condition is inactive.
This logic is incorrect in case the condition is an `OrConjunction`.
This commit ensures that the optimization is not applied to `OrConjunction`s.
codecov bot commented Sep 17, 2021

Codecov Report

Merging #197 (3b0dbf1) into master (4d69931) will decrease coverage by 0.19%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master     #197      +/-   ##
==========================================
- Coverage   67.15%   66.96%   -0.20%     
==========================================
  Files          17       17              
  Lines        1629     1662      +33     
==========================================
+ Hits         1094     1113      +19     
- Misses        535      549      +14     
Impacted Files Coverage Δ
ConfigSpace/read_and_write/json.py 82.91% <0.00%> (-3.69%) ⬇️
ConfigSpace/nx/classes/graph.py 23.66% <0.00%> (-0.51%) ⬇️
ConfigSpace/read_and_write/pcs_new.py 90.69% <0.00%> (-0.22%) ⬇️
ConfigSpace/__init__.py 100.00% <0.00%> (ø)
ConfigSpace/read_and_write/pcs.py 85.53% <0.00%> (+0.06%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

When searching the hierarchy of hyperparameters (HPs),
every time a HP is activated or deactivated,
its children need to be re-considered for activation or deactivation.
If there is no cycle in the condition graph,
this process will terminate and converge.

Before this commit, each HP was only considered at most once.
This suffices if there is no `OrConjunction` condition:
as soon as one of the parents of a HP is inactive,
the HP is necessarily inactive.
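The need to re-consider children follows from the semantics of conjunctions. A minimal sketch (a toy condition-tree model, not ConfigSpace's actual condition classes) of why the "any inactive parent ⇒ child inactive" shortcut is only valid without `OrConjunction`:

```python
# Toy model: a condition tree is ("leaf", parent_name), ("and", [subtrees])
# or ("or", [subtrees]); a leaf holds iff its parent hyperparameter is active.

def condition_holds(tree, active):
    kind = tree[0]
    if kind == "leaf":
        return active.get(tree[1], False)
    subresults = (condition_holds(sub, active) for sub in tree[1])
    return all(subresults) if kind == "and" else any(subresults)

# The child is conditioned on (a active) OR (b active). With a inactive but
# b active, the child is still active, so deactivating the child as soon as
# *one* parent goes inactive would be wrong:
or_cond = ("or", [("leaf", "a"), ("leaf", "b")])
condition_holds(or_cond, {"a": False, "b": True})  # True
```

With an `AndConjunction` (or a single condition), a single inactive parent does make the child inactive, which is why the one-visit traversal sufficed before.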
@filipbartek filipbartek marked this pull request as ready for review September 17, 2021 12:29
filipbartek (Contributor Author)

This solution resolves the constraints in an arbitrary order, possibly updating the status of a HP more than once. My guess is that following a topological order of the HPs (which seems to be respected by ConfigSpace.get_hyperparameters() and ConfigSpace._children_of) would lead to a linear complexity, while the worst-case complexity of this "arbitrary order" approach is quadratic. Using a topological order would require iterating through all the HPs in the CS, or maintaining to_visit as a priority queue.

Since the parents may be nested in a hierarchy of OrConjunction
and AndConjunction, we may only infer inactivity without inspecting
the whole hierarchy when all the parents are inactive.
Maintain a minimum heap of indices of HPs that should be visited.
Since the HPs are topologically ordered by index,
this ensures that all the parents of a HP are visited
before that HP is visited.
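The correctness of the heap-based scheme rests on one invariant: a HP's index is larger than the indices of all of its condition parents. A tiny sketch (the `parents` mapping is hypothetical, not ConfigSpace's internals) of checking that invariant:

```python
# `parents` maps a HP index to the indices of its condition parents.
# Visiting indices in increasing order visits parents first exactly when
# every parent index is smaller than its child's index.

def indices_are_topological(parents):
    return all(p < child for child, ps in parents.items() for p in ps)

# Example DAG: 0 and 1 are unconditional, 2 depends on 0 and 1, 3 depends on 2.
indices_are_topological({2: [0, 1], 3: [2]})  # True
```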

Forbid changing the value of an inactive HP.
Raise ValueError on such an attempt.
In some configuration spaces that include deeply nested conditions,
sampling could yield an invalid configuration.
This commit ensures that the procedure that normalizes the freshly sampled configuration
handles all configuration spaces, including the complex ones, correctly.

The new implementation of correct_sampled_array
visits all the conditional hyperparameters in topological order.
If any visited HP is deemed inactive, its vector value is set to NaN.
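The described procedure can be sketched as follows, using a toy data model (plain lists and predicates rather than ConfigSpace's vector representation and condition objects):

```python
import math

# Sketch: walk the conditional hyperparameters in topological order and
# overwrite the vector value of every inactive one with NaN. A HP is active
# iff all of its conditions hold; here a condition is just a predicate over
# the vector.

def correct_sampled_vector(vector, conditional, conditions):
    # `conditional` lists HP indices in topological order;
    # `conditions[i]` holds the predicates that must all hold for HP i.
    for i in conditional:
        if not all(cond(vector) for cond in conditions.get(i, [])):
            vector[i] = math.nan
    return vector

# HP 1 requires HP 0 > 0.5; HP 2 requires HP 1 to be active (not NaN).
conditions = {1: [lambda v: v[0] > 0.5],
              2: [lambda v: not math.isnan(v[1])]}
correct_sampled_vector([0.2, 0.9, 0.7], [1, 2], conditions)
# HP 1 is deactivated, and thanks to the topological order HP 2 sees that
# and is deactivated as well: [0.2, nan, nan]
```

Visiting HP 2 before HP 1 would leave HP 2 active even though its parent ends up inactive, which is exactly the class of invalid configurations this PR eliminates.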
Break lines that are longer than 100 characters.
filipbartek (Contributor Author)

I have extended the changes so that generating neighbors and sampling random configurations works correctly with configuration spaces with conditions nested deeper than 1 level of AbstractConjunction. Both of these procedures now use the topological order of hyperparameters when ensuring that the output configuration is valid.

@@ -1269,6 +1269,7 @@ class ConfigurationSpace(collections.abc.Mapping):

unconditional_hyperparameters = self.get_all_unconditional_hyperparameters()
hyperparameters_with_children = list()
conditional_hyperparameters = sorted(self.get_all_conditional_hyperparameters(), key=lambda hp_name: self._hyperparameter_idx[hp_name])
filipbartek (Contributor Author)

n ... number of conditional hyperparameters
N ... number of all hyperparameters

This implementation has the complexity O(n log n d(N)), where d(N) is the complexity of fetching an element from a dictionary indexed by N strings.

Alternative:
[hp_name for hp_name in self._hyperparameters if hp_name in self.get_all_conditional_hyperparameters()]
Complexity: O(N s(n)), where s(n) is the complexity of checking membership in a set of n strings.
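The two constructions can be compared side by side on stand-in data (the names `hyperparameter_idx` and `conditional` mimic `self._hyperparameter_idx` and the result of `get_all_conditional_hyperparameters()`; they are illustrative, not ConfigSpace's code):

```python
# Stand-in data: insertion order of the dict reflects HP index order.
hyperparameter_idx = {"a": 0, "b": 1, "c": 2, "d": 3}
conditional = {"d", "b"}

# O(n log n) with dictionary lookups: sort the conditional HPs by index.
by_sort = sorted(conditional, key=lambda name: hyperparameter_idx[name])

# O(N) with set membership tests, provided the set is built once outside
# the comprehension: iterate all HPs in index order, keep the conditional ones.
by_filter = [name for name in hyperparameter_idx if name in conditional]

by_sort == by_filter  # both yield ["b", "d"]
```

Note that the alternative quoted above calls `get_all_conditional_hyperparameters()` inside the comprehension; hoisting it into a local set, as sketched here, is what makes the membership test cheap.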

to_visit = [index]

# Since one hyperparameter may be reachable in several ways, we need to make sure we don't process it twice.
scheduled = np.zeros(len(configuration_space), dtype=bool)
filipbartek (Contributor Author)

Does using an `np.ndarray` without declaring it with `cdef` harm performance significantly?

filipbartek (Contributor Author)

This pull request probably resolves the issue addressed by #219.

@filipbartek filipbartek changed the title Stop generating invalid neighbors Generate neighbors and sample configurations correctly with deeply nested conditions Jan 26, 2022

mfeurer commented May 2, 2022

Hi @dengdifan and @filipbartek,
Please excuse the delay, but I have finally gotten to your two PRs (this one and #219). As a first step I checked that they both fix the issue in https://github.com/automl/ConfigSpace/pull/219/files#diff-62b93f19eeb2c5ff6e26eaccf3f63ee6641e87f3a6ccfc2ffb0dec86925ae245

As a next step I'd like to figure out why the two PRs are so different in size and whether they implement different things underneath.

From what I can see, the PR of @dengdifan touches the functions:

  • c_util.change_hp_value

and the PR from @filipbartek touches

  • c_util.correct_sampled_array
  • c_util.change_hp_value
  • configuration_space.ConfigSpace.get_active_hyperparameters

and none of your PRs touches

  • c_util.check_configurations()
  • util.deactivate_inactive_hyperparameters()

PR #194 is more complete than PR #219 as it handles the test added there but not vice versa.

I do like the rather small changes of @dengdifan to c_util.change_hp_value and from @filipbartek to configuration_space.ConfigSpace.get_active_hyperparameters. However, I don't immediately understand the large changes from @filipbartek to c_util.correct_sampled_array and c_util.change_hp_value; could you please give some further explanation of what you are doing here? Would it be possible to reproduce the changed behavior in the style of the original algorithms? Lastly, we'd need to check the two functions that weren't touched by either of the PRs to ensure that they don't require an update.

Therefore, I propose the following concrete steps forward, let me know what you think about them:

  1. Merge Diamond cond with OrConjunction (#219)
  2. @filipbartek rebases this PR on master, keeps the changes from #219 for c_util.change_hp_value, and changes c_util.correct_sampled_array to be closer to the original version of the algorithm. We then merge this PR, too.
  3. We check whether c_util.check_configurations() and util.deactivate_inactive_hyperparameters() require any special treatment, too.


mfeurer commented May 5, 2022

We need to check if this PR fixes #196 as well.


dengdifan commented May 6, 2022

This should also fix #253, while #219 cannot.

to_disable = set()
for ch in children:
    to_disable.add(ch.name)
while len(to_disable) > 0:
filipbartek (Contributor Author)

This loop adds all the descendants of current to to_disable. No elements are ever removed from to_disable. On lines 345 to 346, all of the HPs in to_disable are set to NaN. However, if any of these descendants is conditioned by an OrConjunction, deactivating current need not suffice to deactivate the descendant. For each descendant, we should evaluate its conditions before disabling it.

disabled = []
# We maintain to_visit as a minimum heap of indices of hyperparameters that may need to be updated.
# We assume that the hyperparameters are sorted topologically by their index.
to_visit = [index]
filipbartek (Contributor Author)

We use a minimum heap (sorted by HP index) instead of a deque. Since the HP indices form a topological order with respect to the condition graph (a directed acyclic graph) and since we process the HPs in strictly increasing order by index, when we process a HP, we know that all of its parents have been processed already. A minimum heap is a data structure that allows us to ensure the topological order of processing with little computational overhead.
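This traversal can be sketched with Python's `heapq` on a toy condition graph (the helper and its arguments are illustrative, not ConfigSpace's actual code):

```python
import heapq

# Indices are assumed to order the HPs topologically, so popping the smallest
# pending index guarantees all parents of a HP are processed before the HP.

def visit_in_topological_order(start, children, n):
    to_visit = [start]            # minimum heap of pending HP indices
    scheduled = [False] * n       # a HP may be reachable along several paths
    scheduled[start] = True
    order = []
    while to_visit:
        current = heapq.heappop(to_visit)
        order.append(current)     # all parents of `current` already processed
        for child in children.get(current, []):
            if not scheduled[child]:
                scheduled[child] = True
                heapq.heappush(to_visit, child)
    return order

# Diamond: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3. HP 3 is scheduled only once and is
# visited only after both of its parents:
visit_in_topological_order(0, {0: [2, 1], 1: [3], 2: [3]}, 4)  # [0, 1, 2, 3]
```

The `scheduled` bit vector plays the same role as the heap-adjacent deduplication discussed below: without it, HP 3 in the diamond would be pushed (and processed) twice.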

to_visit = [index]

# Since one hyperparameter may be reachable in several ways, we need to make sure we don't process it twice.
scheduled = np.zeros(len(configuration_space), dtype=bool)
filipbartek (Contributor Author)

scheduled is a replacement for visited. While visited is a set of HP names, scheduled is a boolean vector representation of a set of HPs, where each HP is identified by its index. In hindsight, this change is probably a premature optimization.

for condition in conditions:
    if not condition._evaluate_vector(configuration_array):
        active = False
        break
filipbartek (Contributor Author)

The value of active for the HP current need not be definitive. For example, let's assume that current is being deactivated because one of its parent conditions does not hold. Since we do not process the HPs in topological order, there may be a future iteration of the to_visit loop that activates one of the parents of current. If current is conditioned by an OrConjunction, current may need to be re-activated then. However, we will not get to re-visit and re-activate current afterward because of its membership in visited.
Processing the HPs in topological order resolves this issue.
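A toy model (hypothetical names, not ConfigSpace's code) of this order-dependence: deciding each HP exactly once yields a stale answer for an OrConjunction child visited before its parents, and the correct one when parents come first:

```python
# Each HP in `order` is decided exactly once; a HP is active iff any of its
# parents is active (an OrConjunction). `active` carries the activity of the
# unconditional HPs.

def decide_activity(order, parents, active):
    active = dict(active)
    for hp in order:
        active[hp] = any(active.get(p, False) for p in parents.get(hp, []))
    return active

parents = {"b": ["a"], "c": ["b", "x"]}   # c active iff b OR x is active
start = {"a": True, "x": False}
decide_activity(["c", "b"], parents, start)["c"]  # False: c saw b before b was activated
decide_activity(["b", "c"], parents, start)["c"]  # True: parents decided first
```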

if current_idx == index:
    if not active:
        raise ValueError(
            "Attempting to change the value of the inactive hyperparameter '%s' to '%s'." % (hp_name, hp_value))
filipbartek (Contributor Author)

As a nice side effect of the code reorganization, we explicitly prevent the caller from changing the value of an inactive HP.

This assertion was removed because there is another assertion at line 261 that covers mostly the same issues, and because this one is rather uninformative.

filipbartek commented May 11, 2022

[...] I don't immediately understand the large changes from @filipbartek to c_util.correct_sampled_array and c_util.change_hp_value, could you please give some further explanation on what you are doing here? Would it be possible to reproduce the changed behavior in the style of the original algorithms? [...]

To explain the changes in change_hp_value, I have extended the code comments with commit 3b0dbf1 and added some commit comments here on GitHub.
Summary of the new aspects of the implementation of change_hp_value:

We process the HPs in topological order to ensure that the conditions are resolved correctly even if they contain an OrConjunction, while each condition only needs to be evaluated once. We use a minimum heap to maintain the HPs that are yet to be visited, which ensures the topological order. We rely on the assumption that the HP indices order the HPs topologically.

When I modified the function, I attempted to keep the modification rather conservative, preserving variable names and semantics where possible. However, I could still stay closer to the original implementation. I can try to undo some of the changes (namely switching from the set visited to the bit vector scheduled, dropping the unused variable activated_values, and preventing code duplication by introducing update) to concentrate the patch on the crucial parts. Would you prefer that, @mfeurer?

I agree with merging #219 first.

I will try to document the changes in correct_sampled_array and respond to the other concerns you raised later.


mfeurer commented Jun 8, 2022

Hi @filipbartek, thank you very much for your explanation. As discussed, I just merged #219. Could you please rebase your PR? I will then check again how to proceed.


filipbartek commented Oct 18, 2022

I am trying to minimize the changes introduced by this PR. 3 functions are affected:

  • c_util.correct_sampled_array
  • c_util.change_hp_value
  • configuration_space.ConfigSpace.get_active_hyperparameters

I believe each of these is broken in a distinct way and can be fixed independently. Breaking the fix down might further facilitate the inspection of the changes. @mfeurer, would you prefer to have 3 PRs for these 3 fixes, or does breaking them down into sets of commits suffice?


mfeurer commented Jan 5, 2023

Hi @filipbartek, a single PR per problem would be appreciated, as this will make reviewing and merging as simple and quick as possible. Please excuse the delay; I've been on a longer break after finishing my PhD.

eddiebergman (Contributor)

Sorry to close this after such a long period of inactivity. I did a major rework in #346 and the code around this has completely changed. It has also been benchmarked to be just as fast as or faster than the existing solution. I'm not sure if this closes the existing issue #194 though, I will have to take a deeper look.


filipbartek commented Jul 12, 2024

Thanks for the update, @eddiebergman!

I'm not sure if this closes the existing issue #194 though, I will have to take a deeper look

Consider scavenging the tests from this pull request.

Development

Successfully merging this pull request may close these issues.

One-exchange neighborhood may raise ValueError non-deterministically