Provide simple mechanism for adding icons to datasets #480

datajoely · 2021-06-14T10:24:26Z

Description

Is your feature request related to a problem? A clear and concise description of what the problem is: "I'm always frustrated when ..."

Users can label their dataset in the catalog and provide layers - but there is very little they can do to differentiate datasets beyond this from a visual perspective. Adding the facility to apply an icon from an existing library of icons would be an effective mechanism for making the pipeline visualisation a clearer and more efficient story-telling tool.

Context

Why is this change important to you? How would you use it? How can it benefit other users?

A simple example of where this would be useful: users could tell Excel data sources from SQL data sources at a glance, even more so in the collapsed, label-less view.

Possible Implementation

(Optional) Suggest an idea for implementing the addition or change.

On the YAML catalog side there could be an extra key for icon like so:

flight_times:
    type: pandas.CSVDataSet
    layer: raw
    load_args:
        sep: '|'
    icon: carbon-csv

This could pull in the following icon from the Carbon Design System (by IBM), provided by the Iconify framework, which collects several open-source icon libraries:
https://iconify.design/icon-sets/carbon/csv.html

By using the [iconify-react](https://github.com/iconify/iconify-react) library, this would hopefully be a low-effort addition.

Checklist

Include labels so that we can categorise your feature request


datajoely · 2022-08-15T09:59:07Z

I'm going to reopen this, since users on Discord have been asking for the same thing.

nicolasboisseau · 2022-08-16T12:00:02Z

Such a feature would be nice, not only at the dataset level, but also at the node level.

Personally, when I make pipelines with Kedro, I try to make each node responsible for only one main activity (one that can be clearly explained to others). It would be nice to represent each node in Kedro-Viz with an icon summarizing the main task the node is doing, e.g. clean, stack, join, filter, train, predict.

And naturally, as you said in the issue title, it would also be interesting to identify dataset types by their icon, giving a hint about the data source's origin (CSV, SQL, Excel file), as a pipeline is often a mix of various dataset types.

Possibilities are quite numerous if a good icon collection can be provided!

b4rlw · 2022-08-16T16:23:55Z

I would second this. Could help keep the pipeline maintainer right as well as improve comprehension among a non-technical audience. It should go without saying that it should not be compulsory, however. Perhaps could be toggled in Viz.

To play devil's advocate, I would also say that the wrong implementation could add unnecessary complexity.
Should Kedro be Laissez-faire about what icon packs can be used?
If so, would it still be easy in most cases to distinguish the difference between a node and a dataset?
Should the user be able to specify custom icons for their custom datasets?
If yes, would certain visualisations begin to look messy? If no, could things look incomplete?
Would the OCD among us end up spending more time customising icons than writing code? lol.

I think it's a great idea, but only with the right design choices. Users frequently say they like Kedro because it's opinionated, so whatever the implementation is, it should be congruent with that broader ethos.

That's my two pence anyways!

tynandebold · 2022-08-22T15:51:26Z

We do support this pattern already for image datasets on Kedro so users can differentiate images and know to click on them. It would be great to infer which icons to use based on what Kedro dataset was chosen so that we reduce complexity for end users and so they won't have to spend a lot of time digging through an icon pack figuring out what icon to use.
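A minimal sketch of the inference idea above, written in Python for illustration rather than taken from Kedro-Viz. The mapping, the `resolve_icon` helper, and every icon name other than `carbon-csv` are hypothetical; the optional `icon` key is the one proposed in the issue description, not an existing catalog option.

```python
from typing import Optional

# Hypothetical mapping from a catalog `type` string to an icon name.
# Only `carbon-csv` appears in this issue; the other names are placeholders.
DEFAULT_ICONS = {
    "pandas.CSVDataSet": "carbon-csv",
    "pandas.ExcelDataSet": "example-excel",
    "pandas.SQLTableDataSet": "example-sql",
}
FALLBACK_ICON = "example-generic-dataset"


def resolve_icon(catalog_entry: dict) -> str:
    """Pick an icon for one catalog entry.

    An explicit `icon` key (the proposal in this issue) wins; otherwise the
    icon is inferred from the dataset `type`, with a generic fallback.
    """
    explicit: Optional[str] = catalog_entry.get("icon")
    if explicit:
        return explicit
    return DEFAULT_ICONS.get(catalog_entry.get("type", ""), FALLBACK_ICON)


# The catalog entry from the issue description, minus the explicit icon.
flight_times = {"type": "pandas.CSVDataSet", "layer": "raw", "load_args": {"sep": "|"}}
print(resolve_icon(flight_times))
```

With a default per dataset type, most users would never need to browse an icon pack, and dropping the explicit override would give the fully inferred behaviour.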

For now we've marked this as a minor priority because our current scope of work consists of working on experiment tracking and increasing the general adoption of Viz.

antonymilne · 2022-08-24T08:20:06Z

I like this idea although agree it's not a high priority.

In terms of implementation, to me this is part of a more general question about how to add custom properties to datasets. In this case it actually goes beyond just datasets since @nicolasboisseau said you might like to add an icon to a node, which could obviously not be done by adding a new attribute in catalog.yml. But the question of custom attributes for datasets keeps coming up and we should figure it out. It's particularly relevant for kedro-viz but really a more general kedro problem. My comment from #907:

The question of adding custom properties to datasets comes up quite a bit, e.g. #662 (put number of rows in dataset on kedro-viz), https://github.com/quantumblacklabs/private-kedro/issues/1148 (add metadata to catalog entries that can be consumed by plugins), kedro-org/kedro#1076 (very long-standing issue on how to add metadata to catalog entries). This is not just limited to kedro-viz but there's a more general kedro question of how to attach metadata to a catalog entry.

Also, just for completeness, I mooted the idea of a new viz.yml configuration file in #903 and #907. This is probably the right approach for the implementation here since it would cater for datasets and nodes. In practical terms, solving the question of custom properties is also quickest and easiest to do on the kedro-viz side without needing a general kedro solution which might take a long time.
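To make the viz.yml idea concrete, here is a rough Python sketch of how one configuration could serve both datasets and nodes. The dictionary stands in for the parsed contents of a hypothetical viz.yml; neither the file layout, the `lookup_viz_property` helper, the `clean_flights` node, nor the `example-clean` icon name exist in Kedro-Viz.

```python
from typing import Optional

# Stand-in for the parsed contents of a hypothetical viz.yml.
# The layout and keys are assumptions made for illustration only.
VIZ_CONFIG = {
    "datasets": {"flight_times": {"icon": "carbon-csv"}},
    "nodes": {"clean_flights": {"icon": "example-clean"}},
}


def lookup_viz_property(config: dict, kind: str, name: str, prop: str) -> Optional[str]:
    """Return a custom property for a dataset or node, or None if it is unset."""
    if kind not in ("datasets", "nodes"):
        raise ValueError(f"unknown element kind: {kind!r}")
    return config.get(kind, {}).get(name, {}).get(prop)


print(lookup_viz_property(VIZ_CONFIG, "datasets", "flight_times", "icon"))
print(lookup_viz_property(VIZ_CONFIG, "nodes", "clean_flights", "icon"))
```

Keeping these properties in a separate file would leave catalog.yml untouched, which is why this route would not need a general Kedro solution for catalog metadata.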

tynandebold · 2022-10-31T11:02:38Z

I'm going to close this again. We're considering an idea in which we give a more robust set of icons based on datasets and the like, though not anything that could ever be chosen by users. See here #1148.

NeroOkwa · 2022-11-25T17:50:22Z

Context

This feature request was also an output from the Kedro-Viz adoption synthesis #987

Users want to be able to tag datasets, and these tags are inherited by the node. This would allow their team to then tag in the catalog, and then the data scientists can understand better where certain datasets with these tags flow.

Supporting quotes

- “Providing additional tags, especially because we could then use that within our data life cycle management and Azure in order to kind of get rid of intermediate data sets in experiment runs that we don't need anymore”.
- “Ideally then we could also filter on certain dataset tags in the kedro visualisation to just see where do we have a type of information coming from, and which pipelines are affected by that. So when we look at like the raw data layer and we want to just see data pipelines that use datasets that are tagged in a specific way. That could be incredibly useful”.
- “So what I'm wondering here is, and that might be missing something if these tags are coming from the nodes and the data sets inherit them. I know that that works, but I'm more thinking about the other way around, I don't know if you can tag datasets and then the nodes inherit from that”.
- “That would allow the data sourcing team to then tag in the catalog, and then the data scientists then understands better where certain datasets with these tags flow”.
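The requested behaviour (tags set on datasets, inherited by the nodes that consume them, then used for filtering) could be sketched roughly as below. This is not how Kedro or Kedro-Viz handle tags today; the dataset names, node names, tags, and both helper functions are hypothetical, and the sketch only propagates tags one hop, from a dataset to the nodes that read it directly.

```python
from typing import Dict, List, Set

# Hypothetical inputs: tags attached to catalog entries, and the datasets
# each node reads. All names and tags are illustrative only.
DATASET_TAGS: Dict[str, Set[str]] = {
    "flight_times": {"pii"},
    "airports": {"reference"},
}
NODE_INPUTS: Dict[str, List[str]] = {
    "clean_flights": ["flight_times"],
    "join_airports": ["flight_times", "airports"],
    "train_model": ["model_input"],
}


def inherit_tags(
    node_inputs: Dict[str, List[str]], dataset_tags: Dict[str, Set[str]]
) -> Dict[str, Set[str]]:
    """Give each node the union of the tags on the datasets it consumes."""
    return {
        node: set().union(*(dataset_tags.get(ds, set()) for ds in inputs))
        for node, inputs in node_inputs.items()
    }


def nodes_with_tag(node_tags: Dict[str, Set[str]], tag: str) -> List[str]:
    """Filter: names of nodes carrying the given tag, sorted for stable output."""
    return sorted(node for node, tags in node_tags.items() if tag in tags)


node_tags = inherit_tags(NODE_INPUTS, DATASET_TAGS)
print(nodes_with_tag(node_tags, "pii"))
```

Following tagged data further downstream would mean repeating the propagation through each node's outputs, which this sketch leaves out.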

datajoely added the Issue: Feature Request label Jun 14, 2021

rashidakanchwala changed the title ~~Provide simple mechanism for adding icons to datasets~~ [KED-2725] Provide simple mechanism for adding icons to datasets Jun 23, 2021

datajoely closed this as completed Oct 4, 2021

datajoely reopened this Aug 15, 2022

tynandebold added this to Kedro-Viz Aug 15, 2022

tynandebold moved this to Inbox in Kedro-Viz Aug 15, 2022

tynandebold changed the title ~~[KED-2725] Provide simple mechanism for adding icons to datasets~~ Provide simple mechanism for adding icons to datasets Aug 15, 2022

tynandebold moved this from Inbox to Backlog in Kedro-Viz Aug 22, 2022

tynandebold mentioned this issue Oct 31, 2022

Allow customization of flowchart icons #1148

Open

tynandebold closed this as completed Oct 31, 2022

Repository owner moved this from Backlog to Done in Kedro-Viz Oct 31, 2022

NeroOkwa mentioned this issue Nov 25, 2022

Evaluating Kedro-Viz adoption #987

Closed

datajoely mentioned this issue Jan 18, 2024

[Experiment] Show/Hide Memory Datasets on the flowchart #1707

Closed

5 tasks
