Update hvPlot default categorical colormap to use glasbey_hv #1447

Open
Azaya89 opened this issue Oct 22, 2024 · 4 comments

@Azaya89
Contributor

Azaya89 commented Oct 22, 2024

Is your feature request related to a problem? Please describe.

This is not a problem per se, but rather an observation that was raised on the Discord channel: the default categorical colormaps in HoloViews and hvPlot do not match.

Describe the solution you’d like

I propose changing the default categorical colormap in hvPlot to glasbey_hv, which is currently used in HoloViews. My reasoning is as follows:

hvPlot is often promoted as a high-level, more user-friendly alternative to HoloViews. For users moving from HoloViews to hvPlot, it would be beneficial to maintain certain features that enhance the user experience. By using the same default categorical colormap, we can create a more seamless transition and avoid the need for users to manually adjust the colormap to preserve their familiar visual settings.

Describe alternatives you’ve considered

1.	Continuing to use the current default colormap in hvPlot.
2.	Manually importing the `glasbey_hv` colormap from colorcet to maintain consistency across HoloViews and hvPlot.

Additional context

Here's a visual representation of the differences:

import pandas as pd

data = {'x': range(10),
 'y': range(10),
 'cat': ['cat1', 'cat2', 'cat3', 'cat4', 'cat5',
         'cat6', 'cat7', 'cat8', 'cat9', 'cat10']}
df = pd.DataFrame(data)

# Using hvplot
import hvplot.pandas # noqa

df.hvplot.points('x', 'y', by='cat')

hvplot_color

# using holoviews
import holoviews as hv
from holoviews.operation.datashader import datashade
import datashader as ds
hv.extension('bokeh')

plot = hv.Points(df, ["x", "y"], ['cat'])
datashade(plot, aggregator=ds.by("cat"))

holoviews_color

@Azaya89
Contributor Author

Azaya89 commented Oct 23, 2024

Using holoviews without the datashader aggregator seems to use a continuous colormap as the default:

hv.Points(df, ["x", "y"], ['cat']).opts(color='cat')

holoview_sans_ds

@maximlt maximlt changed the title Update hvPlot default colormap to use glasbey_hv Update hvPlot default categorical colormap to use glasbey_hv Oct 24, 2024
@maximlt
Member

maximlt commented Oct 24, 2024

I propose changing the default categorical colormap in hvPlot to glasbey_hv, which is currently used in HoloViews.

I don't think this is exactly right: HoloViews seems to just cycle through a default set of colors (blue, red, gold, etc.) defined here:
https://github.com/holoviz/holoviews/blob/4b668d390f63ea5bdb9f96cad1933805ea6400fb/holoviews/plotting/__init__.py#L59-L60

N = 20
data = {'x': range(N), 'y': range(N), 'cat': [f'cat{i}' for i in range(N)]}
df2 = pd.DataFrame(data)
points = hv.Points(df2, ["x", "y"], ['cat'])
points.groupby('cat', hv.NdOverlay).opts(hv.opts.Points(size=5)).opts(width=700, legend_position='right', legend_cols=2)

image
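
To make the cycling behavior described above concrete, here is a minimal sketch in plain Python. The color list is a made-up stand-in, not the actual list defined in HoloViews, and `assign_cycle_colors` is a hypothetical helper; the only point is that a short cycle wraps around once there are more categories than colors.

```python
from itertools import cycle

# Hypothetical stand-in for a short default color cycle; these hex values are
# illustrative and are NOT the actual list defined in HoloViews.
DEFAULT_COLORS = ["#1f77b4", "#d62728", "#e6b800", "#2ca02c"]


def assign_cycle_colors(categories, colors=DEFAULT_COLORS):
    """Pair each category with the next color in the cycle, wrapping around."""
    return dict(zip(categories, cycle(colors)))


cats = [f"cat{i}" for i in range(6)]
mapping = assign_cycle_colors(cats)

# With 6 categories and only 4 colors, cat4 and cat5 reuse the colors
# already given to cat0 and cat1.
for cat, color in mapping.items():
    print(cat, color)
```

A longer categorical palette only postpones the wrap-around, but with enough entries it will rarely be hit in practice.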

Using holoviews without the datashader aggregator seems to use a continuous colormap as the default:

Yep, it looks like HoloViews just defaults to a linear colormap in this case. The style mapping user guide shows how to set a categorical colormap instead.

image
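
As a rough illustration of why a linear colormap is a poor default for categories, the plain-Python sketch below compares the two kinds of mapping. It assumes the categories are spread evenly along a two-color gradient, which may not be exactly how HoloViews samples its linear cmap, and both the gradient endpoints and `PALETTE` are hypothetical values chosen for the example.

```python
def hex_to_rgb(value):
    """Convert '#rrggbb' to an (r, g, b) tuple of ints."""
    value = value.lstrip("#")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(rgb):
    """Convert an (r, g, b) tuple of ints to '#rrggbb'."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def sample_linear(start, end, n):
    """Sample n evenly spaced colors on a straight line between two colors."""
    a, b = hex_to_rgb(start), hex_to_rgb(end)
    colors = []
    for i in range(n):
        t = i / (n - 1) if n > 1 else 0.0
        colors.append(rgb_to_hex(tuple(round(x + (y - x) * t) for x, y in zip(a, b))))
    return colors


cats = [f"cat{i}" for i in range(5)]

# Linear mapping: neighboring categories get neighboring shades of one hue.
linear = dict(zip(cats, sample_linear("#deebf7", "#08306b", len(cats))))

# Categorical mapping: each category gets an unrelated, distinct color.
# PALETTE is a hypothetical list, not glasbey_hv or glasbey_category10.
PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd"]
categorical = dict(zip(cats, PALETTE))

for cat in cats:
    print(cat, linear[cat], categorical[cat])
```

With the linear mapping, cat1 and cat2 differ only slightly in shade, which suggests an ordering the categories do not have; the categorical mapping avoids that.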


import holoviews as hv
import hvplot.pandas

import pandas as pd
import numpy as np

N_POINTS_PER_GROUP = 1_000
GRID_DIM = 3
cat_k = 0
groups = []
for i in range(GRID_DIM):
    for j in range(GRID_DIM):
        groups.append((j, i, 0.25, f'cat{cat_k}'))
        cat_k += 1
dists = [
    pd.DataFrame(dict(x=np.random.normal(x, s, N_POINTS_PER_GROUP), y=np.random.normal(y, s, N_POINTS_PER_GROUP), cat=cat))
     for x,  y, s, cat in groups
]
df = pd.concat(dists, ignore_index=True)
  • Only by=<cat_col>
df.hvplot.points(by='cat').opts(legend_cols=2)

image

Internally, this is equivalent to:

points = hv.Points(df, ["x", "y"], ['cat'])
points.groupby('cat', hv.NdOverlay).opts(hv.opts.Points(size=5)).opts(width=700, legend_position='right')

image

The two plots above from hvPlot and HoloViews use the default HoloViews cycle.

  • by=<cat_col> and datashade=True
df.hvplot.points(by='cat', datashade=True, dynspread=True, threshold=1)

image

hvPlot uses glasbey_category10.

  • color=<cat_col>
df.hvplot.points(color='cat').opts(legend_cols=8)  # had to set a weird value for legend_cols...

image

hvPlot uses glasbey_category10, while, as already mentioned, HoloViews defaults to the linear blues cmap:

hv.Points(df, ["x", "y"], ['cat']).opts(color='cat', legend_position='right', width=700, size=5, legend_cols=8)

image

The hvPlot styling can be reproduced with:

import colorcet as cc
hv.Points(df, ["x", "y"], ['cat']).opts(color='cat', legend_position='right', width=700, size=5, cmap=cc.palette['glasbey_category10'], legend_cols=8)

image


The two categorical colormaps glasbey_hv and glasbey_category10 can easily be compared on colorcet's website:

image

❓ In fact, I don't know if the base color cycle of HoloViews is ad-hoc or comes from a reference color set?

For reference, glasbey_hv was added in holoviz/colorcet#25 (March 2019), see the discussion from holoviz/colorcet#11. hvPlot seems to have been using Category10 since its creation and then glasbey_category10 when it was added to colorcet.

@maximlt
Member

maximlt commented Oct 24, 2024

From the above:

  • hvPlot does not use the same cmap when by is set as when color is set; this should be fixed imo.
  • I'm not sure the argument for moving to glasbey_hv is strong enough. It's quite an annoying change for users, so we'd really have to motivate it. Sure, making the life of HoloViews users easier would be nice! But hvPlot is designed as a replacement for the Pandas .plot API (and Xarray, etc.), and I'd rather make life easier for users coming from these libraries (there are many more of them than HoloViews users!). Note that this discussion followed the attempt to update the OpenSky example on examples.holoviz.org (Modernize notebook: opensky holoviz-topics/examples#396) from HoloViews' API to hvPlot's API and, yes, in this kind of exercise the difference in cmap was noticeable and disturbing.

@Azaya89
Contributor Author

Azaya89 commented Oct 24, 2024

Thank you for the clarification @maximlt. However, there is a sort of disconnect in my head that happens when you do

points = hv.Points(df, ['x', 'y'])
points.opts(color='cat')

for example and get a plot like this:

holoviews_cat

versus when you do:

points.opts(color='cat', cmap='glasbey_hv')

which is equivalent to

points.groupby('cat', hv.NdOverlay)

and get a plot like this:

holoviews_color_cat

This seems to imply (I don't know if I'm correct) that the glasbey_hv color mapping only applies by default to groups in an NdOverlay that have been... 'overlaid' together. This is not intuitive for new users and may cause confusion about what to do if they get the linear colormapping on the first try without specifying the cmap argument.
