Hierarchical Search Spaces with Multiple Independent Search Spaces #2539

Closed
Abrikosoff opened this issue Jun 25, 2024 · 3 comments
Assignees
Labels
question Further information is requested

Comments

Abrikosoff commented Jun 25, 2024

Hi Ax Team,

First of all, thanks for all your help with my (8 and counting) questions so far! I now have another one :( I currently have two potential use cases for hierarchical search spaces:

  1. A NAS application, where I have a parameter search space definition of the form:
def generate_parameters():
    # Candidate numbers of conv layers; illustrative values (defined elsewhere in my real code)
    num_conv_layers_to_use = [2, 3, 4, 5]
    parameters = [
        {
            "name": "lr",
            "type": "range",
            "bounds": [1e-6, 0.4],
            "value_type": "float",
            "log_scale": True,
        },
        {
            "name": "momentum",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
    ]
    params = [
        {
            "name": "num_conv_layers_to_use",
            "type": "choice",
            "is_ordered": True,
            "values": num_conv_layers_to_use,
            "dependents": {
                num_layers: [f"kernel_size_of_layer_{i + 1}_of_{num_layers}" for i in range(num_layers)]
                for num_layers in num_conv_layers_to_use
            },
        },
        *[
            {
                "name": f"kernel_size_of_layer_{i + 1}_of_{j}",
                "type": "choice",
                "is_ordered": True, 
                "values": [1, 5, 10, 20, 50]
            } for j in num_conv_layers_to_use for i in range(j)
        ],
        *[    
            {
                "name": f"activation_{i}_of_layer_num_{j}",
                "type": "choice",
                "values": ["ReLU", "Tanh", "LeakyReLU"],  # Specify the possible activation functions for each layer
            } for j in num_conv_layers_to_use for i in range(j)
        ],
        *[
            {
                "name": f"dropout_{i}_of_layer_num_{j}",
                "type": "range",
                "bounds": [0.0, 1.0],  # Dropout rate for each layer
            } for j in num_conv_layers_to_use for i in range(j)
        ],
    ]

    parameters.extend(params)

    return parameters

ax_client.create_experiment(
    name="tune_cnn_on_mnist",
    # parameters=generate_parameters(num_layers=num_layers),
    parameters=generate_parameters(),
    objectives={"MSE": ObjectiveProperties(minimize=True)},
    choose_generation_strategy_kwargs={"use_saasbo": True},
)

where here num_conv_layers_to_use has been set as the root node. Without the activation and dropout definitions this would have worked, but since an MLP should have those elements as well, they should be included. Doing so, however, raises the error
NotImplementedError: Could not find the root parameter; found dependent parameters {'kernel_size_of_layer_2_of_3', 'kernel_size_of_layer_1_of_4', ....}. Having multiple independent parameters is not yet supported.
which I take to mean that the search space must form a single complete tree and cannot consist of separate subspaces.
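Based on that error message, my reading (not verified against Ax internals) is that every dependent parameter must be reachable from the one root, so the activation and dropout parameter names would have to appear in the root's `dependents` mapping as well. A minimal sketch of such a single-rooted mapping, assuming `num_conv_layers_to_use` holds the candidate layer counts:

```python
# Sketch: list every per-layer parameter (kernel size, activation, dropout) as a
# dependent of the single root choice parameter, so the search space forms one tree.
num_conv_layers_to_use = [2, 3, 4, 5]  # illustrative candidate layer counts

dependents = {
    num_layers: (
        [f"kernel_size_of_layer_{i + 1}_of_{num_layers}" for i in range(num_layers)]
        + [f"activation_{i}_of_layer_num_{num_layers}" for i in range(num_layers)]
        + [f"dropout_{i}_of_layer_num_{num_layers}" for i in range(num_layers)]
    )
    for num_layers in num_conv_layers_to_use
}

root = {
    "name": "num_conv_layers_to_use",
    "type": "choice",
    "is_ordered": True,
    "values": num_conv_layers_to_use,
    "dependents": dependents,
}
```

Each value of the root then points at all three parameter families for that depth, instead of only the kernel sizes.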

  2. The other use case is an idea to do discrete multifidelity BO in the Service API (related to this and this). It consists of defining the search space as follows:
[
    {
        "name": "x1",
        "type": "range",
        "bounds": [0.0, 1.0],
        "value_type": "float",  # Optional, defaults to inference from type of "bounds".
        "log_scale": False,  # Optional, defaults to False.
    },
    {
        "name": "x2",
        "type": "range",
        "bounds": [0.0, 1.0],
    },
    {
        "name": "x3",
        "type": "range",
        "bounds": [0.0, 1.0],
    },
    {
        "name": "fidelity_marker",
        "type": "choice",
        "values": ["low", "medium", "high", "max"],
        "dependents": {"low": ["low_fidelity"], "medium": ["medium_fidelity"], "high": ["high_fidelity"], "max": ["max_fidelity"]},
    },
    {
        "name": "low_fidelity",
        "type": "fixed",
        "value": 0.0,
        "is_fidelity": True,
        "target_value": 1.0,  
    },
    {
        "name": "medium_fidelity",
        "type": "fixed",
        "value": 0.5,
        "is_fidelity": True,
        "target_value": 1.0,  
    },
    {
        "name": "high_fidelity",
        "type": "fixed",
        "value": 0.75,
        "is_fidelity": True,
        "target_value": 1.0,  
    },
    {
        "name": "max_fidelity",
        "type": "fixed",
        "value": 1.0,
        "is_fidelity": True,
        "target_value": 1.0,
    }
]

where fidelity_marker acts like a flag selecting the fidelity value. But this throws the same error as above.

So my question boils down to: is there currently no support for multiple independent search spaces? And if so, are there workarounds for such use cases? It seems to me that this kind of scenario would come up much more often than the case where a single full search-space tree can be defined for the complete problem.
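One workaround I could imagine for the multifidelity case (my own sketch, not a documented Ax pattern) is a dummy single-valued root choice parameter whose only value lists every otherwise-independent top-level parameter as a dependent, turning the forest into one tree:

```python
# Sketch of a dummy-root workaround: a choice parameter with exactly one value,
# so it adds no real degree of freedom, but makes every top-level parameter
# reachable from a single root. Names mirror the multifidelity example above;
# whether Ax accepts a single-valued choice root is an assumption.
dummy_root = {
    "name": "root",
    "type": "choice",
    "values": ["on"],  # one value only
    "dependents": {"on": ["x1", "x2", "x3", "fidelity_marker"]},
}

parameters = [dummy_root] + [
    {"name": "x1", "type": "range", "bounds": [0.0, 1.0]},
    {"name": "x2", "type": "range", "bounds": [0.0, 1.0]},
    {"name": "x3", "type": "range", "bounds": [0.0, 1.0]},
    {
        "name": "fidelity_marker",
        "type": "choice",
        "values": ["low", "medium", "high", "max"],
        "dependents": {
            "low": ["low_fidelity"],
            "medium": ["medium_fidelity"],
            "high": ["high_fidelity"],
            "max": ["max_fidelity"],
        },
    },
]
```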

@Abrikosoff Abrikosoff changed the title Hierarchical Search Spaces with Multiple Independent Hierarchical Search Spaces with Multiple Independent Search Spaces Jun 25, 2024
@danielcohenlive

Hi @Abrikosoff, thanks for your question. Are you sure you should be using a hierarchical search space? It seems like you shouldn't need a separate dropout per depth; if so, this setup could be simplified.
cc @esantorella

@danielcohenlive danielcohenlive self-assigned this Jun 26, 2024
@danielcohenlive danielcohenlive added the question Further information is requested label Jun 26, 2024
@Abrikosoff
Author

Hi Daniel, thanks for your remarks! I was able to construct the HSS for the NAS use case in the following way:

parameters = [
    {
        "name": "num_layers",
        "type": "choice",
        "values": ["two_layers", "three_layers", "four_layers", "five_layers",],  # Specify the range of num_layers
        "dependents": {
            "two_layers": ["2_layer_hidden_size_0", 
                            "2_layer_hidden_size_1", 
                            "2_layer_activation_0", 
                            "2_layer_activation_1", 
                            "2_layer_dropout_0", 
                            "2_layer_dropout_1", 
                            "2_layer_lr", 
                            "2_layer_momentum",],
            "three_layers": ["3_layer_hidden_size_0", 
                                "3_layer_hidden_size_1", 
                                "3_layer_hidden_size_2", 
                                "3_layer_activation_0", 
                                "3_layer_activation_1", 
                                "3_layer_activation_2", 
                                "3_layer_dropout_0", 
                                "3_layer_dropout_1", 
                                "3_layer_dropout_2",
                                "3_layer_lr", 
                                "3_layer_momentum",],
            "four_layers": ["4_layer_hidden_size_0", 
                            "4_layer_hidden_size_1", 
                            "4_layer_hidden_size_2", 
                            "4_layer_hidden_size_3", 
                            "4_layer_activation_0", 
                            "4_layer_activation_1", 
                            "4_layer_activation_2", 
                            "4_layer_activation_3", 
                            "4_layer_dropout_0", 
                            "4_layer_dropout_1", 
                            "4_layer_dropout_2", 
                            "4_layer_dropout_3",
                            "4_layer_lr", 
                            "4_layer_momentum",],
            "five_layers": ["5_layer_hidden_size_0", 
                            "5_layer_hidden_size_1", 
                            "5_layer_hidden_size_2", 
                            "5_layer_hidden_size_3", 
                            "5_layer_hidden_size_4", 
                            "5_layer_activation_0", 
                            "5_layer_activation_1", 
                            "5_layer_activation_2", 
                            "5_layer_activation_3", 
                            "5_layer_activation_4", 
                            "5_layer_dropout_0", 
                            "5_layer_dropout_1", 
                            "5_layer_dropout_2", 
                            "5_layer_dropout_3", 
                            "5_layer_dropout_4",
                            "5_layer_lr", 
                            "5_layer_momentum",],
        },
    },
]

which I think keeps the spirit of the original question. But you are right, dropout could just be fixed; my original naive intention was to have a separate dropout for each hidden layer. I'll see if @esantorella has other comments about this question (especially about the discrete fidelities); if not, I'll close it. Again, thanks a lot!
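For anyone landing here later: the hand-written enumeration above can also be generated programmatically (a sketch following the naming scheme in the snippet above; the label mapping is assumed):

```python
# Sketch: build the per-depth dependents mapping programmatically instead of
# writing out every parameter name by hand. Naming scheme follows the snippet above.
NUM_LAYER_LABELS = {2: "two_layers", 3: "three_layers", 4: "four_layers", 5: "five_layers"}

def dependents_for(n: int) -> list[str]:
    """All dependent parameter names for an n-layer network."""
    names = []
    for kind in ("hidden_size", "activation", "dropout"):
        names += [f"{n}_layer_{kind}_{i}" for i in range(n)]
    return names + [f"{n}_layer_lr", f"{n}_layer_momentum"]

root = {
    "name": "num_layers",
    "type": "choice",
    "values": list(NUM_LAYER_LABELS.values()),
    "dependents": {label: dependents_for(n) for n, label in NUM_LAYER_LABELS.items()},
}
```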

@danielcohenlive

@Abrikosoff I'm going to close this issue for now, but feel free to reopen it if you have further questions.
