Order Hierarchy API
The Order Hierarchy API is available on the Dataset and Project classes. It lets you organize the order of variables and datasets, respectively.
Consider a freshly created project named 'my project' and two datasets named '1st dataset' and '2nd dataset', which do not belong to the project yet. If a project is likely to hold many datasets in the future, you might want to organize them in groups.
The respective Order classes let you access those groups much like folders in a filesystem. You can also list items or place an entity somewhere in the order tree. The pipe character | separates the levels of groups, and a path consisting of | alone refers to the root of the order tree. You can also address groups with relative paths, which omit the leading |.
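To make the path rules concrete, here is a minimal sketch in plain Python of how a pipe-separated path could be split into group names. It is not Scrunch's actual implementation: the helper name split_order_path is hypothetical, and it only illustrates that a leading | marks an absolute path and that | alone denotes the root.

```python
def split_order_path(path):
    """Split a pipe-separated order path into (is_absolute, group_names).

    Hypothetical helper for illustration only; Scrunch resolves paths
    internally and may treat edge cases differently.
    """
    is_absolute = path.startswith('|')
    # Drop empty fragments so that '|' alone yields no group names (the root).
    names = [part for part in path.split('|') if part]
    return is_absolute, names


print(split_order_path('|'))                    # (True, [])
print(split_order_path('|A Group|A SubGroup'))  # (True, ['A Group', 'A SubGroup'])
print(split_order_path('A Group|A SubGroup'))   # (False, ['A Group', 'A SubGroup'])
```

In this model an absolute and a relative path resolve to the same list of group names; they differ only in where the lookup starts.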
Here are some basic usage examples:
>>> from scrunch import get_project
>>> pro = get_project('my project')
>>> pro.order
[
]
>>> type(pro.order)
<class 'scrunch.order.ProjectDatasetsOrder'>
>>> type(pro.order['|'])
<class 'scrunch.order.Group'>
>>> pro.order['|'].is_root
True
>>> pro.order['|'].create_group('A Group')
>>> pro.order
[
{
"A Group": []
}
]
You can create groups on any level in the order tree.
>>> pro.order['|A Group'].create_group('A SubGroup')
# or, equivalently, using a relative path:
>>> a_group = pro.order['A Group']
>>> a_group.create_group('A SubGroup')
>>> pro.order
[
{
"A Group": [
{
"A SubGroup": []
}
]
}
]
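The printed order maps naturally onto nested Python lists and dicts: each group appears as a single-key dict whose value is the list of its children. The sketch below models that shape with plain data. The helpers find_group and create_group are hypothetical stand-ins written for this illustration, not Scrunch code.

```python
import json


def find_group(order, names):
    """Walk a nested order (a list of items) down the given group names."""
    children = order
    for name in names:
        for item in children:
            # A group is modelled as a one-key dict: {group_name: [children]}.
            if isinstance(item, dict) and name in item:
                children = item[name]
                break
        else:
            raise KeyError(name)
    return children


def create_group(order, parent_names, new_name):
    """Append an empty group under the group addressed by parent_names."""
    find_group(order, parent_names).append({new_name: []})


order = []                                      # empty root
create_group(order, [], 'A Group')              # group at the root level
create_group(order, ['A Group'], 'A SubGroup')  # group nested one level down
print(json.dumps(order, indent=2))
```

The final print produces the same nesting as the order shown above: a list holding one dict for "A Group", which in turn holds one dict for "A SubGroup".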
Rename a group by using the rename method:
# absolute path
>>> pro.order['|A Group|A SubGroup'].rename('renamed SubGroup')
# or, equivalently, a relative path
>>> pro.order['A Group|A SubGroup'].rename('renamed SubGroup')
>>> pro.order
[
{
"A Group": [
{
"renamed SubGroup": []
}
]
}
]
To add a dataset to a project, change the dataset's owner to the project. The change_owner method accepts your project object as an argument:
>>> from scrunch import get_dataset
>>> pro
<Project: name='my project'; id='...'>
>>> ds1 = get_dataset('1st dataset')
>>> ds1.owner
<User: email='[email protected]'; id='...'>
>>> ds1.change_owner(project=pro)
>>> ds1.owner
<Project: name='my project'; id='...'>
Note that we have to refresh the project's order object:
>>> pro.order.load()
>>> pro.order
[
{
"A Group": [
{
"renamed SubGroup": []
}
]
},
"1st dataset"
]
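One way to picture why the refresh is needed, assuming the order object keeps a local snapshot of server-side state (an assumption made for this sketch, not something the text above states): a change made through another object does not reach the snapshot until it is reloaded. FakeServer and CachedOrder are hypothetical names used only for illustration.

```python
class FakeServer:
    """Stand-in for remote state; purely hypothetical."""

    def __init__(self):
        self.order = []


class CachedOrder:
    """Holds a local snapshot of the server order until load() is called."""

    def __init__(self, server):
        self._server = server
        self.items = []
        self.load()

    def load(self):
        # Copy the current server state into the local snapshot.
        self.items = list(self._server.order)


server = FakeServer()
cached = CachedOrder(server)

server.order.append('1st dataset')  # change made elsewhere, e.g. an ownership change
print(cached.items)                 # [] (the snapshot is stale)

cached.load()
print(cached.items)                 # ['1st dataset']
```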
You can organize datasets with the place method. Note that only absolute paths are allowed here. To position a dataset relative to another entry in the same group, pass a reference to that entry via the before or after keyword argument: a dataset's id or, as in the last example below, a group's name.
>>> pro.order.place(ds1, '|A Group')
>>> pro.order
[
{
"A Group": [
{
"renamed SubGroup": []
},
"1st dataset"
]
}
]
# get the 2nd dataset into our project
>>> ds2 = get_dataset('2nd dataset')
>>> ds2.change_owner(project=pro)
>>> pro.order.load()
>>> pro.order.place(ds2, '|A Group', before=ds1.id)
>>> pro.order
[
{
"A Group": [
{
"renamed SubGroup": []
},
"2nd dataset",
"1st dataset"
]
}
]
>>> pro.order.place(ds1, '|A Group', before='renamed SubGroup')
>>> pro.order
[
{
"A Group": [
"1st dataset",
{
"renamed SubGroup": []
},
"2nd dataset"
]
}
]
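The before and after semantics shown above can be modelled on a plain Python list. This is a sketch under simplifying assumptions: entries are plain strings, whereas Scrunch takes a dataset object plus dataset ids or group names, and the function place here is a hypothetical stand-in rather than the library method. Rejecting a call that passes both keywords is a choice made in this sketch.

```python
def place(children, item, before=None, after=None):
    """Move item within children, optionally next to a reference entry."""
    if before is not None and after is not None:
        raise ValueError("pass either 'before' or 'after', not both")
    if item in children:
        children.remove(item)  # avoid duplicates when re-placing an entry
    if before is not None:
        children.insert(children.index(before), item)
    elif after is not None:
        children.insert(children.index(after) + 1, item)
    else:
        children.append(item)  # default: end of the group


group = ['renamed SubGroup']
place(group, '1st dataset')
place(group, '2nd dataset', before='1st dataset')
print(group)  # ['renamed SubGroup', '2nd dataset', '1st dataset']

place(group, '1st dataset', before='renamed SubGroup')
print(group)  # ['1st dataset', 'renamed SubGroup', '2nd dataset']
```

The two printed lists follow the same sequence as the contents of "A Group" in the examples above.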
With recent API changes, Crunch is dropping support for a per-project Shoji order organization in favor of a nested project hierarchy to organize datasets.
A project's index members will contain both the datasets that belong to it and any projects nested inside it.
As a result, Scrunch now uses that API to organize projects. The Project class now provides the same methods as the old project.order helper, which remains available for compatibility purposes.
>>> from scrunch import get_project
>>> pro = get_project('my project')
>>> type(pro.order)
<class 'scrunch.datasets.Project'>
>>> type(pro.order['|'])
<class 'scrunch.datasets.Project'>
>>> pro.order['|'].is_root # There is no concept of root anymore
True
>>> project_a = pro.order['|'].create_group('A Group')
>>> project_a
The same methods of the original Scrunch API are available:
- rename: Renames a project
- place: Moves a project or dataset inside the current project
- create_group: Creates a new subproject as a child of the current one
- reorder: Receives a list of the datasets and projects in the current project and updates their order
A new Scrunch API is also available:
- create_project: Preferred method; create_group is an alias of this
- move_here: Receives a list of projects or datasets and places them inside the current project
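The relationship between the old and new method names can be sketched with a toy class. This is not the real scrunch.datasets.Project: the class below is a hypothetical in-memory model, and it assumes that the legacy order attribute simply returns the project itself, which matches the type output shown earlier but is not confirmed beyond that.

```python
class Project:
    """Toy model of a nested project, for illustration only."""

    def __init__(self, name):
        self.name = name
        self.children = []  # dataset names (str) and sub-projects (Project)

    @property
    def order(self):
        # Assumption: the legacy accessor hands back the project itself.
        return self

    def create_project(self, name):
        child = Project(name)
        self.children.append(child)
        return child

    # Old name kept as an alias of the preferred method.
    create_group = create_project

    def move_here(self, items):
        # Simplification: only appends; no previous parent is updated.
        self.children.extend(items)

    def reorder(self, items):
        # Accept only a permutation of the current children.
        remaining = list(self.children)
        for item in items:
            remaining.remove(item)  # ValueError if not a current child
        if remaining:
            raise ValueError('reorder must list every current child')
        self.children = list(items)


root = Project('my project')
group = root.order.create_group('A Group')  # same effect as create_project
group.move_here(['1st dataset', '2nd dataset'])
group.reorder(['2nd dataset', '1st dataset'])
print(type(root.order) is Project)  # True
print(group.children)               # ['2nd dataset', '1st dataset']
```

Binding create_group to the same function as create_project keeps old call sites working while new code uses the preferred name.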