[ADD] documentation for pipelines and steps (automl#329)
* Add documentation for pipelines and steps

* fix flake

* Apply suggestions from code review

Co-authored-by: nabenabe0928 <[email protected]>

* accept shuhei's suggestions

Co-authored-by: nabenabe0928 <[email protected]>
ravinkohli and nabenabe0928 authored Nov 18, 2021
1 parent f6af46f commit a1512d5
Showing 10 changed files with 384 additions and 144 deletions.
1 change: 1 addition & 0 deletions autoPyTorch/api/tabular_classification.py
@@ -148,6 +148,7 @@ def search(
Fit both optimizes the machine learning models and builds an ensemble out of them.
To disable ensembling, set ensemble_size==0.
using the optimizer.
Args:
X_train, y_train, X_test, y_test: Union[np.ndarray, List, pd.DataFrame]
A pair of features (X_train) and targets (y_train) used to fit a
1 change: 1 addition & 0 deletions autoPyTorch/api/tabular_regression.py
@@ -149,6 +149,7 @@ def search(
Fit both optimizes the machine learning models and builds an ensemble out of them.
To disable ensembling, set ensemble_size==0.
using the optimizer.
Args:
X_train, y_train, X_test, y_test: Union[np.ndarray, List, pd.DataFrame]
A pair of features (X_train) and targets (y_train) used to fit a
12 changes: 7 additions & 5 deletions autoPyTorch/pipeline/base_pipeline.py
@@ -29,14 +29,16 @@


class BasePipeline(Pipeline):
"""Base class for all pipeline objects.
"""
Base class for all pipeline objects.
Args:
config (Optional[Configuration]):
Allows to directly specify a configuration space
steps (Optional[List[Tuple[str, PipelineStepType]]]):
the list of steps that build the pipeline. If provided,
they won't be dynamically produced.
The list of `autoPyTorchComponent` or `autoPyTorchChoice`
that build the pipeline. If provided, they won't be
dynamically produced.
include (Optional[Dict[str, Any]]):
Allows the caller to specify which configurations to honor during
the creation of the configuration space.
@@ -46,12 +48,12 @@ class BasePipeline(Pipeline):
random_state (np.random.RandomState):
allows to produce reproducible results by
setting a seed for randomized settings
init_params (Optional[Dict[str, Any]])
init_params (Optional[Dict[str, Any]]):
Optional initial settings for the config
search_space_updates (Optional[HyperparameterSearchSpaceUpdates]):
search space updates that can be used to modify the search
space of particular components or choice modules of the pipeline
Attributes:
steps (List[Tuple[str, PipelineStepType]]):
the steps of the current pipeline. Each step in an AutoPyTorch
77 changes: 49 additions & 28 deletions autoPyTorch/pipeline/components/base_choice.py
@@ -16,18 +16,22 @@


class autoPyTorchChoice(object):
"""Allows for the dynamic generation of components as pipeline steps.
"""
Allows for the dynamic generation of components as pipeline steps.
Args:
dataset_properties (Dict[str, Union[str, BaseDatasetPropertiesType]]): Describes the dataset
to work on
random_state (Optional[np.random.RandomState]): allows to produce reproducible
results by setting a seed for randomized settings
dataset_properties (Dict[str, Union[str, BaseDatasetPropertiesType]]):
Describes the dataset to work on
random_state (Optional[np.random.RandomState]):
Allows to produce reproducible results by setting a
seed for randomized settings
Attributes:
random_state (Optional[np.random.RandomState]): allows to produce reproducible
results by setting a seed for randomized settings
choice (autoPyTorchComponent): the choice of components for this stage
random_state (Optional[np.random.RandomState]):
Allows to produce reproducible results by setting a seed for
randomized settings
choice (autoPyTorchComponent):
the choice of components for this stage
"""
def __init__(self,
dataset_properties: Dict[str, BaseDatasetPropertiesType],
@@ -67,11 +71,13 @@ def get_components(cls: 'autoPyTorchChoice') -> Dict[str, autoPyTorchComponent]:
for current step.
Args:
cls (autoPyTorchChoice): The choice object from which to query the valid
cls (autoPyTorchChoice):
The choice object from which to query the valid
components
Returns:
Dict[str, autoPyTorchComponent]: The available components via a mapping
Dict[str, autoPyTorchComponent]:
The available components via a mapping
from the module name to the component class
"""
@@ -88,10 +94,13 @@ def get_available_components(
user specification
Args:
dataset_properties (Optional[Dict[str, BaseDatasetPropertiesType]]): Describes the dataset to work on
include: Optional[Dict[str, Any]]: what components to include. It is an exhaustive
dataset_properties (Optional[Dict[str, BaseDatasetPropertiesType]]):
Describes the dataset to work on
include: Optional[Dict[str, Any]]:
what components to include. It is an exhaustive
list, and will exclusively use these components.
exclude: Optional[Dict[str, Any]]: which components to skip
exclude: Optional[Dict[str, Any]]:
which components to skip. Can't be used together with include
Returns:
Dict[str, autoPyTorchComponent]: A dictionary with valid components for this
@@ -137,10 +146,10 @@ def set_hyperparameters(self,
to an actual parameter of the autoPyTorch component.
Args:
configuration (Configuration): which configuration to apply to
the chosen component
init_params (Optional[Dict[str, Any]]): Optional arguments to
initialize the chosen component
configuration (Configuration):
Which configuration to apply to the chosen component
init_params (Optional[Dict[str, Any]]):
Optional arguments to initialize the chosen component
Returns:
self: returns an instance of self
@@ -177,11 +186,15 @@ def get_hyperparameter_search_space(
"""Returns the configuration space of the current chosen components
Args:
dataset_properties (Optional[Dict[str, BaseDatasetPropertiesType]]): Describes the dataset to work on
default: (Optional[str]) : Default component to use in hyperparameters
include: Optional[Dict[str, Any]]: what components to include. It is an exhaustive
dataset_properties (Optional[Dict[str, BaseDatasetPropertiesType]]):
Describes the dataset to work on
default: (Optional[str]):
Default component to use in hyperparameters
include: Optional[Dict[str, Any]]:
what components to include. It is an exhaustive
list, and will exclusively use these components.
exclude: Optional[Dict[str, Any]]: which components to skip
exclude: Optional[Dict[str, Any]]:
which components to skip
Returns:
ConfigurationSpace: the configuration space of the hyper-parameters of the
@@ -193,8 +206,10 @@ def fit(self, X: Dict[str, Any], y: Any) -> autoPyTorchComponent:
"""Handy method to check if a component is fitted
Args:
X (Dict[str, Any]): Dependencies needed by current component to perform fit
y (Any): not used. To comply with sklearn API
X (Dict[str, Any]):
Dependencies needed by current component to perform fit
y (Any):
not used. To comply with sklearn API
"""
# Allows to use check_is_fitted on the choice object
self.fitted_ = True
@@ -205,19 +220,23 @@ def predict(self, X: np.ndarray) -> np.ndarray:
"""Predicts the target given an input, by using the chosen component
Args:
X (np.ndarray): input features from which to predict the target
X (np.ndarray):
input features from which to predict the target
Returns:
np.ndarray: the predicted target
np.ndarray:
the target prediction
"""
assert self.choice is not None, "Cannot call predict without initializing the component"
return self.choice.predict(X)

def transform(self, X: Dict[str, Any]) -> Dict[str, Any]:
"""
Adds the current choice in the fit dictionary
Args:
X (Dict[str, Any]): fit dictionary
X (Dict[str, Any]):
fit dictionary
Returns:
(Dict[str, Any])
@@ -233,7 +252,8 @@ def check_requirements(self, X: Dict[str, Any], y: Any = None) -> None:
are honored before fit.
Args:
X (Dict[str, Any]): Dictionary with fitted parameters. It is a message passing
X (Dict[str, Any]):
Dictionary with fitted parameters. It is a message passing
mechanism, in which during a transform, a component adds relevant information
so that further stages can be properly fitted
"""
@@ -246,7 +266,8 @@ def _check_dataset_properties(self, dataset_properties: Dict[str, BaseDatasetPro
"""
A mechanism in code to ensure the correctness of the initialised dataset properties.
Args:
dataset_properties:
dataset_properties (Dict[str, BaseDatasetPropertiesType]):
Describes the dataset to work on
"""
assert isinstance(dataset_properties, dict), "dataset_properties must be a dictionary"
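
The `include`/`exclude` contract described in the `get_available_components` docstring above (an exhaustive `include` list, an `exclude` list of components to skip, and the two not usable together) can be illustrated with a minimal sketch. The helper `filter_components` is hypothetical and is not the autoPyTorch implementation; it also assumes, for simplicity, that `include` and `exclude` are plain sequences of component names rather than the `Dict[str, Any]` given in the docstring.

```python
from typing import Any, Dict, Optional, Sequence


def filter_components(
    available: Dict[str, Any],
    include: Optional[Sequence[str]] = None,
    exclude: Optional[Sequence[str]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper illustrating the include/exclude rules."""
    # The docstring states that exclude can't be used together with include.
    if include is not None and exclude is not None:
        raise ValueError("include and exclude cannot be used together")

    if include is not None:
        # include is exhaustive: only the named components are kept.
        unknown = [name for name in include if name not in available]
        if unknown:
            raise ValueError(f"Unknown components in include: {unknown}")
        return {name: available[name] for name in include}

    # Otherwise keep everything that is not explicitly excluded.
    excluded = set(exclude or ())
    return {name: comp for name, comp in available.items() if name not in excluded}
```

Rejecting the combination outright, rather than silently giving one argument priority, keeps a caller's intent unambiguous; whether the real method raises or asserts in that case is not shown in this diff.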
69 changes: 46 additions & 23 deletions autoPyTorch/pipeline/components/base_component.py
@@ -27,9 +27,12 @@ def find_components(
that inherit from base_class
Args:
package (str): The associated package that contains the components
directory (str): The directory from which to extract the components
base_class (BaseEstimator): base class to filter out desired components
package (str):
The associated package that contains the components
directory (str):
The directory from which to extract the components
base_class (BaseEstimator):
base class to filter out desired components
that don't inherit from this class
"""
components = OrderedDict()
@@ -60,7 +63,8 @@ class ThirdPartyComponents(object):
space to work.
Args:
base_class (BaseEstimator) component type desired to be created
base_class (BaseEstimator):
Component type desired to be created
"""

def __init__(self, base_class: BaseEstimator):
@@ -96,6 +100,16 @@ def add_component(self, obj: BaseEstimator) -> None:


class autoPyTorchComponent(BaseEstimator):
"""
Provides an abstract interface which can be used to
create steps of a pipeline in AutoPyTorch.
Args:
random_state (Optional[np.random.RandomState]):
Allows to produce reproducible results by setting a
seed for randomized settings
"""
_required_properties: Optional[List[str]] = None

def __init__(self, random_state: Optional[np.random.RandomState] = None) -> None:
@@ -115,7 +129,8 @@ def get_required_properties(cls) -> Optional[List[str]]:
Usually defined in the base class of the component
Returns:
List[str]: list of properties autopytorch component must have for proper functioning of the pipeline
List[str]:
list of properties autopytorch component must have for proper functioning of the pipeline
"""
return cls._required_properties

@@ -125,8 +140,8 @@ def get_fit_requirements(self) -> Optional[List[FitRequirement]]:
that need to be in the fit dictionary
Returns:
List[FitRequirement]: a list containing required keys
in a named tuple (name: str, type: object)
List[FitRequirement]:
a list containing required keys in a named tuple (name: str, type: object)
"""
return self._fit_requirements

@@ -139,11 +154,12 @@ def get_properties(dataset_properties: Optional[Dict[str, BaseDatasetPropertiesT
"""Get the properties of the underlying algorithm.
Args:
dataset_properties (Optional[Dict[str, Union[str, int]]): Describes the dataset
to work on
dataset_properties (Optional[Dict[str, Union[str, int]]):
Describes the dataset to work on
Returns:
Dict[str, Any]: Properties of the algorithm
Dict[str, Any]:
Properties of the algorithm
"""
raise NotImplementedError()

@@ -154,11 +170,12 @@ def get_hyperparameter_search_space(
"""Return the configuration space of this classification algorithm.
Args:
dataset_properties (Optional[Dict[str, Union[str, int]]): Describes the dataset
to work on
dataset_properties (Optional[Dict[str, Union[str, int]]):
Describes the dataset to work on
Returns:
ConfigurationSpace: The configuration space of this algorithm.
ConfigurationSpace:
The configuration space of this algorithm.
"""
raise NotImplementedError()

@@ -167,13 +184,16 @@ def fit(self, X: Dict[str, Any], y: Any = None) -> "autoPyTorchComponent":
model and returns `self`.
Args:
X (Dict[str, Any]): Dictionary with fitted parameters. It is a message passing
X (Dict[str, Any]):
Dictionary with fitted parameters. It is a message passing
mechanism, in which during a transform, a component adds relevant information
so that further stages can be properly fitted
y (Any): Not Used -- to comply with API
y (Any):
Not Used -- to comply with API
Returns:
self : returns an instance of self.
self:
returns an instance of self.
Notes:
Please see the `scikit-learn API documentation
@@ -192,10 +212,10 @@ def set_hyperparameters(self,
to an actual parameter of the autoPyTorch component.
Args:
configuration (Configuration): which configuration to apply to
the chosen component
init_params (Optional[Dict[str, Any]]): Optional arguments to
initialize the chosen component
configuration (Configuration):
Which configuration to apply to the chosen component
init_params (Optional[Dict[str, Any]]):
Optional arguments to initialize the chosen component
Returns:
An instance of self
@@ -226,7 +246,8 @@ def check_requirements(self, X: Dict[str, Any], y: Any = None) -> None:
are honored before fit.
Args:
X (Dict[str, Any]): Dictionary with fitted parameters. It is a message passing
X (Dict[str, Any]):
Dictionary with fitted parameters. It is a message passing
mechanism, in which during a transform, a component adds relevant information
so that further stages can be properly fitted
"""
@@ -267,10 +288,12 @@ def _apply_search_space_update(self, hyperparameter_search_space_update: Hyperpa
"""Allows the user to update a hyperparameter
Args:
name (str): name of hyperparameter
name (str):
name of hyperparameter
new_value_range (List[Union[int, str, float]]):
value range can be either lower, upper or a list of possible candidates
log (bool): Whether to use log scale
log (bool):
Whether to use log scale
"""

self._cs_updates[hyperparameter_search_space_update.hyperparameter] = hyperparameter_search_space_update
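
The docstrings above describe the fit dictionary `X` as a message-passing mechanism: each step checks that the keys it needs are present, fits, and then adds its own information during `transform` so that later steps can be fitted. The following is a minimal sketch of that idea only. `FitRequirement` here is a simplified stand-in modeled on the `(name: str, type: object)` named tuple mentioned in `get_fit_requirements`, and `ToyScaler` and the free function `check_requirements` are hypothetical names, not autoPyTorch code.

```python
from typing import Any, Dict, List, NamedTuple


class FitRequirement(NamedTuple):
    # Simplified stand-in for the (name, type) named tuple in the docstring.
    name: str
    type: Any


def check_requirements(X: Dict[str, Any], requirements: List[FitRequirement]) -> None:
    """Raise if a required key is missing from the fit dictionary or has the wrong type."""
    for req in requirements:
        if req.name not in X:
            raise ValueError(f"Missing required key '{req.name}' in the fit dictionary")
        if not isinstance(X[req.name], req.type):
            raise TypeError(f"Key '{req.name}' must be of type {req.type}")


class ToyScaler:
    """Hypothetical step: reads 'values' and publishes 'scale' for later steps."""

    fit_requirements = [FitRequirement("values", list)]

    def fit(self, X: Dict[str, Any], y: Any = None) -> "ToyScaler":
        # Requirements are verified before any fitting happens.
        check_requirements(X, self.fit_requirements)
        # Fall back to 1.0 when every value is zero.
        self.scale_ = max(abs(v) for v in X["values"]) or 1.0
        return self

    def transform(self, X: Dict[str, Any]) -> Dict[str, Any]:
        # Add this step's result to the fit dictionary for downstream steps.
        X.update({"scale": self.scale_})
        return X
```

A later step could then declare `FitRequirement("scale", (int, float))` and rely on `ToyScaler` having run first, which is the ordering guarantee the requirement check is meant to enforce.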