Handle different primary keys? #2
@lapidus I think I understand the issue, but for the sake of clarity, could you very briefly specify the expected vs. actual output for the data in the example?
Overall I think I need to experiment a bit more and understand when exactly we would export multiple indicators from one big dataframe vs. having multiple dataframes that each generate one indicator. But the scenario I described above would result in this actual output:

Primary key: geo-gender-year

Whereas the preferred output is:

Primary key: varies per indicator, depending on which dimensions the data source provides
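To make the "varying primary key" idea concrete, here is a minimal sketch of how per-indicator keys could map to datapoint file names. It assumes the usual DDFcsv naming pattern (`ddf--datapoints--<indicator>--by--<dim>--<dim>.csv`). The helper `datapoint_filename` and the indicators `population` and `gdp` are hypothetical and not part of this library.

```python
def datapoint_filename(indicator, primary_key):
    """Build a DDFcsv-style datapoints file name for one indicator.

    Hypothetical helper for illustration; not this library's API.
    """
    return "ddf--datapoints--{}--by--{}.csv".format(
        indicator, "--".join(primary_key)
    )


# Made-up indicators with different disaggregation levels.
keys = {
    "population": ["geo", "gender", "year"],
    "gdp": ["geo", "year"],
}

filenames = {name: datapoint_filename(name, key) for name, key in keys.items()}

for name in filenames:
    print(filenames[name])
```

With a single shared key, both indicators would end up `by--geo--gender--year`. With per-indicator keys, `gdp` only carries the dimensions it actually has.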
Maybe let's simply put this to the test with 3-4 different data sources and see if we can streamline further :) For example, take these use cases. Produce a DDF from:
I'll try some things from my side, and I might submit issues or pull requests :)
I think there are two issues at play here. 1. Tidy data 2. Disaggregation levels
which I believe is what you are referring to with the preferred output example? I will have to have a think about how to deal with this in an automated fashion. Please let me know if I've misunderstood something. :)
I believe number 2 above and the original question in this issue were resolved with this commit. Please let me know if that is not the case.
I need to experiment a bit more with the library but I'm not sure it has the functionality to specify primary keys / generate correct keys for a sparse dataframe?
Currently it assumes that each non-measure is a dimension for every measure:
I am thinking of a scenario where you have a more sparse frame:
I am not sure if this would occur in the wild, or if one would instead split the data into separate dataframes?
But in the above case the files expected would be something like:
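As a rough illustration of the sparse case, here is a minimal sketch that infers a separate primary key for each measure: a dimension belongs to a measure's key only if it is filled in on every row where that measure has a value. The data is made up, rows are plain dicts so the sketch has no dependencies, and `infer_primary_keys` is a hypothetical helper, not something this library provides. With a pandas dataframe the same check could be written with `notna()`.

```python
def infer_primary_keys(rows, measures):
    """Map each measure to the dimensions populated wherever it has a value.

    Hypothetical helper for illustration; not this library's API.
    """
    # Every column that is not a measure is a candidate dimension.
    dimensions = [col for col in rows[0] if col not in measures]
    keys = {}
    for measure in measures:
        # Only look at rows where this measure is actually present.
        populated = [row for row in rows if row[measure] is not None]
        keys[measure] = [
            dim for dim in dimensions
            if all(row[dim] is not None for row in populated)
        ]
    return keys


# Made-up sparse frame: population is split by gender, gdp is not.
rows = [
    {"geo": "swe", "year": 2000, "gender": "male", "population": 4.4, "gdp": None},
    {"geo": "swe", "year": 2000, "gender": "female", "population": 4.5, "gdp": None},
    {"geo": "swe", "year": 2000, "gender": None, "population": None, "gdp": 260.0},
    {"geo": "swe", "year": 2001, "gender": None, "population": None, "gdp": 240.0},
]

keys = infer_primary_keys(rows, ["population", "gdp"])
print(keys)
```

Under the current assumption, both measures would be keyed by `geo`, `year` and `gender`. The sketch instead drops `gender` from the key for `gdp`, because it is never filled in on the rows that carry a `gdp` value.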