Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove metric plots from metadata panel + remove pandas/plotly dependency from Kedro-viz #1268

Merged
merged 12 commits into from
Mar 2, 2023

Conversation

rashidakanchwala
Copy link
Contributor

@rashidakanchwala rashidakanchwala commented Feb 27, 2023

Description

Resolves #999 and #1000

Development notes

I have removed the metrics plot and now we have a link that takes us straight to the plots on Experiment Tracking. Gabriel and Mackay are fine with the approach but they prefer if @AntonyMilneQB can also comment from a DS perspective.

kedro-viz-gif

QA notes

Checklist

  • Read the contributing guidelines
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added new entries to the RELEASE.md file
  • Added tests to cover my changes

@rashidakanchwala rashidakanchwala marked this pull request as ready for review February 27, 2023 15:18
@rashidakanchwala rashidakanchwala changed the title Remove metric plots from metadata panel. Remove metric plots from metadata panel + remove pandas/plotly dependency from Kedro-viz Feb 27, 2023
Signed-off-by: Rashida Kanchwala <[email protected]>
@@ -49,7 +49,7 @@ def _setup_context_with_venv(context, venv_dir):
context.pip = str(bin_dir / "pip")
context.python = str(bin_dir / "python")
context.kedro = str(bin_dir / "kedro")
context.requirements_path = Path("requirements.txt").resolve()
context.requirements_path = Path("test_requirements.txt").resolve()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because kedro extras requirements are in the test_requirements and not in requirements. So e2e_tests keep failing

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is right (both before and after your change). The e2e test should be installing the pandas-iris starter project's requirements, not the kedro-viz requirements or test requirements. Look at the feature file:

        And I have run a non-interactive kedro new with pandas-iris starter
        And I have installed the project's requirements

The installation step is currently this:

@given("I have installed the project's requirements")
def install_project_requirements(context):
    """Run ``pip install -r src/requirements.txt``."""
    cmd = [context.pip, "install", "-r", str(context.requirements_path)]
    res = run(cmd, env=context.env)

    if res.returncode != OK_EXIT_CODE:
        print(res.stdout)
        print(res.stderr)
        assert False

The docstring is describing correctly what the function should do but actually the code isn't doing that. Instead it should be something like this:

@given("I have installed the project's requirements")
def install_project_requirements(context):
    """Run ``pip install -r src/requirements.txt``."""
     = run( [context.pip, "install", "-r", "src/requirements.txt"], env=context.env,  cwd=str(context.root_project_dir))

    if res.returncode != OK_EXIT_CODE:
        print(res.stdout)
        print(res.stderr)
        assert False

And then we can remove context.requirements_path completely.

@Huongg
Copy link
Contributor

Huongg commented Feb 27, 2023

hey @rashidakanchwala quick check, is the known issue in your description now fixed? I thought the remaining issue would be the go back tab doesn't work in the flowchart only?

@rashidakanchwala
Copy link
Contributor Author

hey @rashidakanchwala quick check, is the known issue in your description now fixed? I thought the remaining issue would be the go back tab doesn't work in the flowchart only?

You are right. thanks Huong <3

Copy link
Contributor

@Huongg Huongg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @rashidakanchwala 😄

Signed-off-by: Rashida Kanchwala <[email protected]>
Signed-off-by: Rashida Kanchwala <[email protected]>
Copy link
Contributor

@antonymilne antonymilne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great change, very happy to see it 👍 Generally looks good to me but two things we need to check.

Testing

See comments in the review but also very important to check: please try out on a project like pandas-iris that doesn't have any plotly or tracking datasets and do pip uninstall plotly. Does everything still work ok?

Where should links go from and to?

At the moment it seems we have this:

  • tracking.metrics_dataset.MetricsDataSet ➡️ links to /experiment-tracking?view=Metrics. Fine but I wonder whether it would be possible some point in the future to point to the right graph exactly like #metric_name. It might also just make more sense to link to view=Overview instead (again, ideally to the right place using # at some point in the future). Not a biggy though.
  • tracking.json_dataset.JSONDataSet ➡️ links to same place but I don't think that's intentional? It doesn't make much sense to link these to the metrics page, but it would make sense to link to view=Overview#... ultimately I think
  • plots with versioned: True ➡️ no link to experiment tracking. Fine, but at some point in the future might be nice to put the link for these too

No need to implement all the above now, and I think it's fine how you have it as a quick win. But maybe as a separate ticket it would be good to think about exactly what should link where.

@@ -49,7 +49,7 @@ def _setup_context_with_venv(context, venv_dir):
context.pip = str(bin_dir / "pip")
context.python = str(bin_dir / "python")
context.kedro = str(bin_dir / "kedro")
context.requirements_path = Path("requirements.txt").resolve()
context.requirements_path = Path("test_requirements.txt").resolve()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is right (both before and after your change). The e2e test should be installing the pandas-iris starter project's requirements, not the kedro-viz requirements or test requirements. Look at the feature file:

        And I have run a non-interactive kedro new with pandas-iris starter
        And I have installed the project's requirements

The installation step is currently this:

@given("I have installed the project's requirements")
def install_project_requirements(context):
    """Run ``pip install -r src/requirements.txt``."""
    cmd = [context.pip, "install", "-r", str(context.requirements_path)]
    res = run(cmd, env=context.env)

    if res.returncode != OK_EXIT_CODE:
        print(res.stdout)
        print(res.stderr)
        assert False

The docstring is describing correctly what the function should do but actually the code isn't doing that. Instead it should be something like this:

@given("I have installed the project's requirements")
def install_project_requirements(context):
    """Run ``pip install -r src/requirements.txt``."""
     = run( [context.pip, "install", "-r", "src/requirements.txt"], env=context.env,  cwd=str(context.root_project_dir))

    if res.returncode != OK_EXIT_CODE:
        print(res.stdout)
        print(res.stderr)
        assert False

And then we can remove context.requirements_path completely.

@@ -1,5 +1,5 @@
-r requirements.txt
kedro[pandas.ParquetDataSet]>=0.17.0
kedro[pandas.ParquetDataSet,pandas.CSVDataSet,plotly.JSONDataSet,plotly.PlotlyDataSet]>=0.17.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need this change? See above comment on e2e tests. I think we can probably do without the additional datasets here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need it for unit tests.

@rashidakanchwala
Copy link
Contributor Author

This is a great change, very happy to see it 👍 Generally looks good to me but two things we need to check.

Testing

See comments in the review but also very important to check: please try out on a project like pandas-iris that doesn't have any plotly or tracking datasets and do pip uninstall plotly. Does everything still work ok?

Where should links go from and to?

At the moment it seems we have this:

  • tracking.metrics_dataset.MetricsDataSet ➡️ links to /experiment-tracking?view=Metrics. Fine but I wonder whether it would be possible some point in the future to point to the right graph exactly like #metric_name. It might also just make more sense to link to view=Overview instead (again, ideally to the right place using # at some point in the future). Not a biggy though.
  • tracking.json_dataset.JSONDataSet ➡️ links to same place but I don't think that's intentional? It doesn't make much sense to link these to the metrics page, but it would make sense to link to view=Overview#... ultimately I think
  • plots with versioned: True ➡️ no link to experiment tracking. Fine, but at some point in the future might be nice to put the link for these too

No need to implement all the above now, and I think it's fine how you have it as a quick win. But maybe as a separate ticket it would be good to think about exactly what should link where.

Thanks Antony. the JSONDataSet should no longer have any links but I think your idea to link it to Overview tab is better so I will do that.

Signed-off-by: Rashida Kanchwala <[email protected]>
Signed-off-by: Rashida Kanchwala <[email protected]>
Signed-off-by: Rashida Kanchwala <[email protected]>
Signed-off-by: Rashida Kanchwala <[email protected]>
Copy link
Contributor

@antonymilne antonymilne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! ⭐

Just to check, did you try this:

very important to check: please try out on a project like pandas-iris that doesn't have any plotly or tracking datasets and do pip uninstall plotly. Does everything still work ok?

Also what do you think about adding anchors to the experiment tracking page so we can link to the exact right part? If you think it's a good idea then please do make a new issue for it 🙂 But it's not very important.

@rashidakanchwala
Copy link
Contributor Author

Looks great! ⭐

Just to check, did you try this:

very important to check: please try out on a project like pandas-iris that doesn't have any plotly or tracking datasets and do pip uninstall plotly. Does everything still work ok?

Also what do you think about adding anchors to the experiment tracking page so we can link to the exact right part? If you think it's a good idea then please do make a new issue for it 🙂 But it's not very important.

Yes, it works with pandas-iris :)

The adding anchors is a known issue and @Huongg said it's difficult to do now but she will be working on it.

Signed-off-by: Rashida Kanchwala <[email protected]>
@rashidakanchwala rashidakanchwala removed the request for review from yetudada March 2, 2023 13:50
@rashidakanchwala rashidakanchwala merged commit 44dc3d4 into main Mar 2, 2023
@rashidakanchwala rashidakanchwala deleted the remove-metrics-plot-from-metadata branch March 2, 2023 14:01
@tynandebold tynandebold mentioned this pull request Mar 24, 2023
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remove pandas and plotly dependency
3 participants