-
One more potential ease-of-use addition would be some sort of Maestro CLI command that uses an existing pgen to populate this block in a spec, something like:

```shell
$ maestro attach mypgen.py spec.yaml -o new_spec.yaml
```

And perhaps a component of making this work would be requiring (or encouraging, if introspection might work too) numpydoc/googledoc docstrings on the pgen function, from which the description and pargs could be parsed, ensuring the spec stays in sync with the pgen function. So, taking one of the parg examples from the docs and updating it:

```python
from maestrowf.datastructures.core import ParameterGenerator
import itertools as iter


def get_custom_generator(env, **kwargs):
    '''
    Simple filtering parameter generator example

    Parameters
    ----------
    env: dict
        Environment block from the yaml specification
    SIZE_MIN: int, default=10
        Set minimum size of generated parameter set
    SIZE_STEP: int, default=10
        Increment between sizes in generated parameter set
    NUM_SIZES: int, default=3
        Number of parameter sets to generate for sampling the SIZE parameter

    Returns
    -------
    ParameterGenerator containing selected parameter sets
    '''
    p_gen = ParameterGenerator()

    # Unpack any pargs passed in
    size_min = int(kwargs.get('SIZE_MIN', '10'))
    size_step = int(kwargs.get('SIZE_STEP', '10'))
    num_sizes = int(kwargs.get('NUM_SIZES', '3'))

    sizes = range(size_min, size_min + num_sizes*size_step, size_step)
    iterations = (10, 20, 30)

    size_values = []
    iteration_values = []
    trial_values = []

    for trial, param_combo in enumerate(iter.product(sizes, iterations)):
        size_values.append(param_combo[0])
        iteration_values.append(param_combo[1])
        trial_values.append(trial)

    params = {
        "TRIAL": {
            "values": trial_values,
            "label": "TRIAL.%%"
        },
        "SIZE": {
            "values": size_values,
            "label": "SIZE.%%"
        },
        "ITER": {
            "values": iteration_values,
            "label": "ITER.%%"
        },
    }

    for key, value in params.items():
        p_gen.add_parameter(key, value["values"], value["label"])

    return p_gen
```

That could then be used to auto-generate the following section in the yaml spec with the initial `attach` invocation:

```yaml
parameter.generator:
  default_pgen: mypgen  # must be one of the names below, which schema can check for
  generators:
    - name: mypgen
      description: Simple filtering parameter generator example
      path: mypgen.py
      pargs:
        - SIZE_MIN: 10
        - SIZE_STEP: 10
        - NUM_SIZES: 3
```

This still leaves the question of whether we could automate documenting constants in some manner this way (the `iterations` parameter in the above pgen)... Though maybe that would be best left to the docstring of the pgen?
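To make the docstring-driven idea concrete, here is a rough, stdlib-only sketch of how an `attach` command might pull the description and parg defaults out of a numpydoc-style docstring. The function name `parse_pgen_docstring` and the `NAME: type, default=X` matching convention are assumptions for illustration, not an existing Maestro API:

```python
import inspect
import re


def parse_pgen_docstring(func):
    """Hypothetical helper: extract a description and parg defaults
    from a numpydoc-style docstring on a pgen function.

    The summary line becomes the generator description; any
    'NAME: type, default=X' entries in the Parameters section
    become pargs.
    """
    doc = inspect.getdoc(func) or ''
    lines = doc.splitlines()
    description = lines[0].strip() if lines else ''

    pargs = []
    for line in lines:
        # Match entries such as 'SIZE_MIN: int, default=10'
        match = re.match(r'^(\w+):\s*\w+,\s*default=(\S+)$', line.strip())
        if match:
            pargs.append({match.group(1): match.group(2)})
    return {'description': description, 'pargs': pargs}
```

The returned dict maps directly onto the `description` and `pargs` keys in the proposed yaml block; entries without a `default=` (like `env`) are skipped, which conveniently filters out non-parg arguments.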
-
Further thoughts/discussion on this regarding provenance. Upon executing a study, serializing this block into the yaml spec in that specific study's workspace would be a good way to document what was executed. It removes any confusion about the parameter combos in the meta files not matching what's in the global.parameters block. Additional concerns I'm documenting here for the eventual implementation so I don't forget: serializing/copying the pgens themselves. For the copy in the executed study's workspace, I think it makes more sense to keep only the generator that was executed in the generators list, still have the default_pgen key calling it out, and update the pargs (if any) to reflect the actual invocation. One caveat: if users add path dependencies in the env block to verify the pgens are there (useful when relying on default_pgen and skipping the manual command-line invocation), and all generators are in those dependencies, then copying them all into the workspace and leaving them all in that record spec might make more sense.
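The trim-to-executed-generator idea above could be sketched roughly like this. The helper name `record_executed_pgen` and its signature are hypothetical; it just shows the bookkeeping: keep only the executed generator, overwrite its pargs with the actual invocation, and copy the pgen script into the workspace:

```python
import shutil
from pathlib import Path


def record_executed_pgen(spec, pgen_name, actual_pargs, workspace):
    """Hypothetical sketch: build a provenance copy of the spec for a
    study workspace, keeping only the generator that was executed and
    updating its pargs to reflect the actual invocation."""
    generators = spec['parameter.generator']['generators']
    executed = next(g for g in generators if g['name'] == pgen_name)
    # Replace the spec-default pargs with what was actually passed
    executed = dict(executed, pargs=actual_pargs)

    record = dict(spec)
    record['parameter.generator'] = {
        'default_pgen': pgen_name,
        'generators': [executed],
    }
    # Copy the pgen script itself alongside the record spec
    shutil.copy(executed['path'], Path(workspace) / Path(executed['path']).name)
    return record
```

Serializing `record` back to yaml in the workspace (omitted here) would then give the record spec discussed above.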
-
Just starting a discussion on some potential design changes to the spec regarding the parameter generators. The problem this aims to solve is the lack of embedded documentation in a spec that's designed to be run with pgen functions. So, maybe we can add a new block to the spec that functions as a default invocation of the pgen, which would also document it, while still allowing CLI invocations to override it. So, something like this:
One thought here: what about specs that could have multiple pgens instead of one giant one that can do many things (i.e., different parameter sampling strategies)? Perhaps enable making the generator entries a list, with another key under parameter.generator specifying the default_pgen:
Additionally, we could treat these similarly to the dependency block, with a check that they exist at the specified path, blocking study launch if not. Or maybe some more flexibility, like keying on whether default_pgen is empty: if it is, fall back to global.parameters or something?
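The launch-time check described above could look something like the following sketch. The function name `validate_parameter_generators` and the fallback sentinel are assumptions for illustration only:

```python
from pathlib import Path


def validate_parameter_generators(spec):
    """Hypothetical pre-launch check: ensure default_pgen names one of
    the listed generators and each generator's script exists on disk;
    an empty/missing default_pgen falls back to global.parameters."""
    block = spec.get('parameter.generator', {})
    default = block.get('default_pgen')
    if not default:
        # No default pgen configured: use the global.parameters block
        return 'use-global-parameters'

    names = [g['name'] for g in block.get('generators', [])]
    if default not in names:
        raise ValueError(
            f"default_pgen '{default}' is not one of the listed generators: {names}")

    # Dependency-style path check: block launch if any script is missing
    missing = [g['path'] for g in block['generators'] if not Path(g['path']).exists()]
    if missing:
        raise FileNotFoundError(f"pgen scripts not found: {missing}")
    return default
```

This keeps the schema check (default_pgen must name a listed generator) and the dependency-style path check in one place, and makes the global.parameters fallback explicit.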