Address how to generate document views (e.g. documents by type with "latest" versions) #9

Open
gmaclennan opened this issue Dec 9, 2024 · 2 comments

Comments

@gmaclennan
Member

This issue is not directly related to migration, but we may want to settle it before users start migrating their data, so that the MLEF file can also serve as an archive of that data.

We previously discussed including "head" data in the MLEF, organized by document type (e.g. observation, node, way). By "head" data I mean the latest version of the document, resolved by the same mechanism that we use in the app when there are forks.
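To make "head" concrete, here is a minimal sketch of one way head resolution could work. It is not the mechanism the app uses. The field names (`id`, `version`, `links`, `timestamp`) and the "newest timestamp wins" fork rule are assumptions for illustration; the real resolver would need to match Mapeo Legacy exactly.

```python
from collections import defaultdict


def resolve_heads(versions):
    """Return one winning head version per document id.

    Each version is a dict with these assumed (hypothetical) fields:
      id        - document id shared by all versions of a document
      version   - unique id of this particular version
      links     - list of version ids that this version supersedes
      timestamp - ISO 8601 UTC string in a uniform format (tie-breaker)
    """
    by_id = defaultdict(list)
    for v in versions:
        by_id[v["id"]].append(v)

    winners = {}
    for doc_id, doc_versions in by_id.items():
        # A head is a version that no other version links to.
        superseded = {link for v in doc_versions for link in v.get("links", [])}
        heads = [v for v in doc_versions if v["version"] not in superseded]
        # Assumed fork rule: newest timestamp wins, version id breaks ties.
        winners[doc_id] = max(
            heads, key=lambda v: (v.get("timestamp", ""), v["version"])
        )
    return winners
```

If two devices both edited version `v1` and produced `v2` and `v3`, both are heads (a fork), and this sketch would pick whichever has the later timestamp.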

This data is useful if users remove their Mapeo Legacy installations after exporting to MLEF, and if they want to use the MLEF for purposes other than migrating to CoMapeo. Currently the contents of the file are technically complex: they are neither human-readable nor compatible with other tools without further processing.

For purposes other than migrating, I think most users would want their Mapeo Legacy data in JSON, GeoJSON, or CSV format, reflecting what they currently see in the app, i.e. with forks resolved the same way the app resolves them.
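As a rough illustration of the GeoJSON case, the sketch below turns already-resolved observations into a FeatureCollection. The observation fields used here (`id`, `lat`, `lon`, `tags`, `deleted`) are assumptions about the document shape, not a confirmed schema.

```python
def observations_to_geojson(observations):
    """Convert resolved observation documents to a GeoJSON FeatureCollection.

    Assumed (hypothetical) fields: id, lat, lon, tags, deleted.
    Deleted observations are skipped; observations without coordinates
    are kept with a null geometry, which GeoJSON permits.
    """
    features = []
    for obs in observations:
        if obs.get("deleted"):
            continue
        has_coords = obs.get("lat") is not None and obs.get("lon") is not None
        geometry = (
            # GeoJSON positions are ordered [longitude, latitude].
            {"type": "Point", "coordinates": [obs["lon"], obs["lat"]]}
            if has_coords
            else None
        )
        features.append(
            {
                "type": "Feature",
                "id": obs["id"],
                "geometry": geometry,
                "properties": obs.get("tags", {}),
            }
        )
    return {"type": "FeatureCollection", "features": features}
```

Skipping deleted observations here is one possible choice; marking them with a property instead would be equally valid.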

I can think of two solutions:

  1. We include this data in the MLEF, e.g. current/observations.json, current/nodes.json etc. The advantage of this is that the MLEF file stays useful into the future even if the tools that process it stop working, or knowledge about its structure is lost. It's a zip file, so we can assume people will still be able to open it.
  2. We write an additional reader that indexes the data in the MLEF and outputs data by data type with latest versions resolved and deleted documents marked or excluded. The advantage is that we keep the MLEF simple, but the disadvantage is that if users want to use data in the MLEF then they need external tooling, which may stop working at some point. We also need to ensure that we use the same indexing and version resolution strategy as Mapeo Legacy does, so that the user sees the same data.
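Either option ends with the same step: take the resolved heads, group them by document type, and write one file per type. A minimal sketch of that step follows, writing `current/<type>.json` entries into a zip archive as in option (1). The `type` and `deleted` fields, the `FILE_NAMES` mapping, and the `write_current_views` helper are all hypothetical; for option (2) the same grouping could write to a directory instead.

```python
import io
import json
import zipfile
from collections import defaultdict

# Hypothetical mapping from document type to output file name.
FILE_NAMES = {"observation": "observations", "node": "nodes", "way": "ways"}


def write_current_views(heads, archive, include_deleted=False):
    """Write one JSON file per document type into an open ZipFile.

    `heads` is an iterable of already-resolved head documents, each
    assumed to carry `id`, `type`, and optionally `deleted`.
    Returns the list of archive entry names that were written.
    """
    by_type = defaultdict(list)
    for doc in heads:
        # Deleted documents are excluded unless the caller asks for them.
        if doc.get("deleted") and not include_deleted:
            continue
        by_type[doc["type"]].append(doc)

    names = []
    for doc_type, docs in sorted(by_type.items()):
        name = f"current/{FILE_NAMES.get(doc_type, doc_type)}.json"
        # Sort by id so repeated exports produce identical files.
        ordered = sorted(docs, key=lambda d: d["id"])
        archive.writestr(name, json.dumps(ordered, indent=2))
        names.append(name)
    return names


# Usage: build the views in an in-memory zip.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    write_current_views(
        [{"id": "1", "type": "observation"}, {"id": "2", "type": "node"}],
        archive,
    )
```

The `include_deleted` flag reflects the "marked or excluded" choice above: when it is set, deleted documents stay in the output with their `deleted` field intact.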

If we think option (2) is better, then we don't need to address this in the short term. If we think option (1) is better, then we should ensure that this data is included in the MLEF before users start exporting and possibly deleting their Mapeo Legacy installations.

@EvanHahn
Contributor

I think option 2 is better because:

  • it is less work in the short term
  • it is no work in the long term if we never need it (versus possible maintenance burden for edge cases)
  • it simplifies this module
    • it exports the data as is with limited "thought", versus a heavier-handed data processor
    • it doesn't have to think about edge cases around version resolution
  • I won't have time to implement option 1 if we decide we want it

@gmaclennan
Member Author

Ok agreed, let's go with option (2).

2 participants