Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add client side ExperimentDefinition #9

Merged
merged 5 commits into from
Oct 23, 2024
Merged

Add client side ExperimentDefinition #9

merged 5 commits into from
Oct 23, 2024

Conversation

panahiparham
Copy link
Owner

@panahiparham panahiparham commented Sep 24, 2024

Addresses #7

@panahiparham panahiparham requested a review from andnp September 24, 2024 20:43
@panahiparham panahiparham force-pushed the client branch 3 times, most recently from 5286856 to e3f82a8 Compare September 24, 2024 21:29
return _c


def get_results_path(self) -> str:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will want some single source-of-truth for this function. Most of it can be refactored out into a utility, I think:

def get_experiment_name():
  import __main__
  return __main__.__file__.split('/')[-2]

def get_results_path(base_path: str, name: str):
  return os.path.join(
    base_path,
    name,
    get_experiment_name(),
  )

Then both sides can reference this (the def'n creator side and the client side).

Copy link
Owner Author

@panahiparham panahiparham Oct 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed! But this change breaks stubbed_DefinitionPart in tests/test_ExperimentDefinition.py as this class stubs the get_results_path method. For now I added a property that points to the 'single source of truth` function that we can override in tests.

from ml_experiment._utils.path import get_results_path

class ExperimentDefinition:
    def __init__(self, ...):
        self.get_results_path = get_results_path
        ...

ml_experiment/ExperimentDefinition.py Show resolved Hide resolved
with sqlite3.connect(db_path) as con:
cur = con.cursor()
# ensure config_ids exist
assert all(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be worth pushing the assertion down into the:

self.table.get_configuration(cur, config_id)

code instead. This way, we can be more specific with our error message:

raise ValueError(f"config_id <{config_id}> is not in table <{self.get_table_name()}>")

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense!

ml_experiment/metadata/MetadataTable.py Outdated Show resolved Hide resolved
from ml_experiment.ExperimentDefinition import ExperimentDefinition


def test_ExperimentDefinition():
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great spot for the tmp_path fixture, which produces (and cleans up) a temporary file path for storing test results and guaranteeing there is no overlapping global state.

def test_ExperimentDefinition(tmp_path):
  ...
  part = stubbed_DefinitionPart(exp_name, part_name, base=str(tmp_path))
  ...

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorcery!

assert config == {'alpha': 0.125, 'beta': 0.5, 'id': 0}

configs = exp.get_configs(config_ids)
assert configs == [
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does make me wonder if we should consider making some guarantees about ordering. I'm leaning towards "yes", since that's also what PyExpUtils does.

We can make that an issue, then do the guaranteed ordering in another PR.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created #11 for this feature!

@panahiparham panahiparham requested a review from andnp October 2, 2024 23:37
Copy link
Collaborator

@andnp andnp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! I would go ahead and do the dict(zip) change and whatever git cleanup you'd like, then it's good to merge!

@panahiparham
Copy link
Owner Author

After clean up this pr has 5 commits and resolves issue #7 by adding an experiment_definition class that handles the client side interaction with the experiment after it is created and stored in a metadata database.

Commit 1 and 2 implement the experiment_definition class and its corresponding test. Commit 3 and 4 implement a utility for getting experiment path and commit 5 adds an init file to allow pytest to import this package.

@panahiparham panahiparham merged commit 0343981 into main Oct 23, 2024
7 checks passed
@andnp andnp deleted the client branch October 31, 2024 15:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants