One of the biggest hurdles to getting started quickly is that users must register their datasets with PZ.
While dataset registration makes it easier for the system to track the lineage of computation:
- Caching is nowhere near being fully supported in PZ
- In an ideal world, PZ could still cache intermediate results effectively without always requiring users to register datasets and provide dataset IDs
To emphasize this latter point: a user who wants to quickly test out PZ should not need to provide a `dataset_id` (or register their dataset) just because some small (and currently non-existent) set of power users needs this feature for their workloads.
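As a rough illustration of how caching could work without user-supplied IDs, the sketch below derives an identifier from the dataset's contents. This is not PZ's API: `derive_dataset_id` is a hypothetical helper, and the choice of SHA-256 over file names and bytes, truncated to 16 hex characters, is an assumption made only for the example.

```python
import hashlib
from pathlib import Path


def derive_dataset_id(path: str) -> str:
    """Hypothetical helper (not part of PZ): derive a stable ID from a
    file or directory by hashing file names and contents."""
    root = Path(path)
    if root.is_file():
        files = [root]
    else:
        # Sort so the ID does not depend on filesystem traversal order.
        files = sorted(p for p in root.rglob("*") if p.is_file())

    digest = hashlib.sha256()
    for f in files:
        # Include the relative name so renaming a file changes the ID.
        rel = f.name if root.is_file() else f.relative_to(root).as_posix()
        digest.update(rel.encode("utf-8"))
        digest.update(b"\0")
        digest.update(f.read_bytes())

    # Assumption: a truncated hex digest is long enough for a cache key.
    return digest.hexdigest()[:16]
```

With something along these lines, an explicit `dataset_id` could remain an optional override for power users, while everyone else gets an automatically derived one. Hashing full file contents may be too slow for large datasets, so a real design would probably need a cheaper fingerprint.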
There was a lot of discussion in Slack about the best approach to solving this issue, which I am linking to here: https://mitdsg.slack.com/archives/C076WBNJJAH/p1737222894617139