pyCPA port: tools.conversion.map_data #6

Open
mikapfl opened this issue Feb 11, 2021 · 2 comments
Labels
enhancement New feature or request priority: high High priority issue: prioritize and solve as soon as possible

Comments

@mikapfl
Member

mikapfl commented Feb 11, 2021

Port the functionality of pyCPA's tools.conversion.map_data

@mikapfl mikapfl added the enhancement New feature or request label Feb 11, 2021
@mikapfl
Member Author

mikapfl commented Apr 14, 2021

I'm struggling to understand the use cases for this function. The only usage I found is https://github.com/JGuetschow/pyCPA/blob/master/examples/load_CRF.py#L84 with this specification file: https://github.com/JGuetschow/pyCPA/blob/master/examples/metadata/category_conversion/conversion_IPCC2006-1996_pyCPA_example.csv . It contains a list of identical mappings and then one example which sums up a couple of 1996 categories into a single 2006 category (or the other way around, I'm not sure). The use case I extracted from that would be:

User story:
I want to convert data between classifications. For this, I want to state corresponding categories easily. Later, I want to convert another data set between the same classifications. For this, I want to re-use the conversion I defined earlier.

Conditions:
The mapping between classifications can be done by summing or taking the difference between categories.

This is certainly an important use case, and I have been thinking about it already in the context of climate_categories. If the mapping between classifications is nicely just a sum of / difference between categories, it should be rather straightforward to implement, and the information could be included in climate_categories, for example with a function which produces conversion matrices and a corresponding function in primap2 which then transforms from one classification into the other while handling the meta data. However, in my (limited) experience, most classifications don't map so nicely, and some amount of e.g. splitting needs to be done (think, for example, of the splitting of countries or of finer sub-categories in a newer classification scheme). So it might be worth thinking about the most common use cases and how they could be solved.
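To make the "conversion matrix" idea concrete, here is a minimal sketch in plain Python, assuming the mapping really is only sums and differences. The category codes, the rule format and the function names are hypothetical and chosen for illustration; none of this is an existing climate_categories or primap2 API, and metadata handling is left out.

```python
# Hypothetical source and target category codes, for illustration only.
source_categories = ["A", "B", "C", "TOTAL"]
target_categories = ["X", "Y", "Z"]

# Hypothetical rules: each target category is a signed sum of source categories.
# +1 means "add this source category", -1 means "subtract it".
rules = {
    "X": {"A": 1},
    "Y": {"B": 1, "C": 1},
    "Z": {"TOTAL": 1, "A": -1, "B": -1, "C": -1},
}


def conversion_matrix(rules, source_categories, target_categories):
    """Build a (targets x sources) matrix of signed factors from the rules."""
    matrix = [[0] * len(source_categories) for _ in target_categories]
    for row, target in enumerate(target_categories):
        for source, factor in rules[target].items():
            matrix[row][source_categories.index(source)] = factor
    return matrix


def convert(matrix, source_values):
    """Apply the matrix to one vector of source values (matrix-vector product)."""
    return [
        sum(factor * value for factor, value in zip(row, source_values))
        for row in matrix
    ]


matrix = conversion_matrix(rules, source_categories, target_categories)
source_values = [10.0, 2.0, 3.0, 20.0]  # made-up values for A, B, C, TOTAL
target_values = convert(matrix, source_values)
```

Because the rules are stored separately from the data, the same matrix can be re-applied to any other data set that uses the same source classification, which is the re-use asked for in the user story. Splitting one source category over several targets would need fractional, data-dependent factors and is not covered by this sketch.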

Or maybe I missed the use cases of map_data completely. The function has a lot of potential parameters - do we have other usage examples which would show other use cases?

@JGuetschow
Contributor

map_data can be used both to convert between specifications (but you're correct, usually some downscaling is needed) and to aggregate data within a categorization. The aggregation for full and clean hierarchies can easily be implemented using climate_categories. In reality, data sources often deviate from full and clean categorizations and need more complex mappings. The mapping can also be between several meta data columns at once, e.g. multiple secondary categories, or entity-dependent target categories. That is the reason for the complicated parameters.
An example of a relatively easy mapping is IPCC1996 to IPCC2006 categories. FAO to IPCC2006 is more complicated.
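For the simple case mentioned above, aggregation within a full and clean hierarchy, a minimal sketch could look as follows. The hierarchy, the category codes and the `aggregate` helper are hypothetical and not taken from pyCPA or climate_categories; the deviations found in real data sources are deliberately not handled.

```python
# Hypothetical hierarchy: each parent category maps to its direct children.
hierarchy = {
    "1": ["1.A", "1.B"],
    "1.A": ["1.A.1", "1.A.2"],
}

# Made-up data, available only for some categories.
data = {"1.A.1": 4.0, "1.A.2": 6.0, "1.B": 5.0}


def aggregate(category, hierarchy, data):
    """Return the value for a category, summing its children if it has no data."""
    if category in data:
        return data[category]
    children = hierarchy.get(category)
    if not children:
        # A clean hierarchy is assumed: a leaf without data cannot be filled in.
        raise KeyError(f"no data and no children for category {category!r}")
    return sum(aggregate(child, hierarchy, data) for child in children)


total_1a = aggregate("1.A", hierarchy, data)
total_1 = aggregate("1", hierarchy, data)
```

A data source that reports, say, only one of two children, or that spreads a category over several meta data columns, breaks the assumption made here and would need the more complex mappings described above.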

@JGuetschow JGuetschow added the priority: high High priority issue: prioritize and solve as soon as possible label Mar 27, 2023
@JGuetschow JGuetschow added this to the terminology conversion milestone Oct 8, 2024
@JGuetschow JGuetschow linked a pull request Oct 8, 2024 that will close this issue
3 tasks
2 participants