The purpose of this issue is to discuss and track the progress of exporting Tales that have version/run structures.
Background:
Tales were recently updated to include the notions of versions and runs. Users can create multiple versions of a Tale; each version may contain multiple runs. Each run has a results/ directory where computational outputs are stored. Each run also has a workspace & data directory which are symlinked back to the version's respective folders. Note that this is where the mutability of the workspace folder comes into play.
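As a rough illustration of the layout described above, the following sketch builds one version folder with a single run whose workspace and data folders are symlinks back to the version. The folder names (`version-1`, `runs/`, `run-1`) and the helper `make_run` are assumptions made for this example, not the actual Whole Tale on-disk layout.

```python
import tempfile
from pathlib import Path


def make_run(version_dir: Path, run_name: str) -> Path:
    """Create a run folder with its own results/ and symlinked workspace/ and data/.

    The "runs/" subfolder is a hypothetical convention used only in this sketch.
    """
    run_dir = version_dir / "runs" / run_name
    (run_dir / "results").mkdir(parents=True)
    for shared in ("workspace", "data"):
        (version_dir / shared).mkdir(exist_ok=True)
        # The run does not copy these folders; it links back to the version's copy.
        (run_dir / shared).symlink_to(version_dir / shared, target_is_directory=True)
    return run_dir


with tempfile.TemporaryDirectory() as tmp:
    version = Path(tmp) / "version-1"
    version.mkdir()
    run = make_run(version, "run-1")

    # A file written to the version's workspace is visible through the run,
    # which is why workspace mutability matters for runs.
    (version / "workspace" / "analysis.py").write_text("print('hi')\n")
    visible_through_run = (run / "workspace" / "analysis.py").exists()
    workspace_is_link = (run / "workspace").is_symlink()
    results_is_link = (run / "results").is_symlink()
```

Only results/ is a real directory owned by the run; anything exported from workspace/ or data/ has to be resolved through the links.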
During the Jan 11 2021 dev call we discussed a few different possibilities as to what this might look like when Tales are exported and re-imported.
The main takeaway is that we can export Tales either with or without the run/version structure.
Proposed Approaches:
The following approaches each have strengths, weaknesses, and varying levels of complexity.
Exporting All Runs in a Version
This approach exports all of the runs under a particular version of a Tale. The advantage of this is that users can have a record of all of the runs in the version rather than a limited view of what happened. When importing, a more complete version of the Tale is reconstructed. Note that the original Tale may have many versions. The versions that aren't exported will be lost on an import.
This may be confusing for some published Tales because (presumably) only one of the runs is going to be referenced in a linked paper. This also conflicts with the idea of exporting/publishing individual recorded runs (how do we let users export either ALL runs or only a single recorded run?).
To import a Tale with multiple versions, we need to know:
The name of the version
The name of each run
A mapping between the run folders on the exported Tale and the name of the run that the user may have specified in Whole Tale.
These constraints can be tackled by:
Enforcing a naming convention on the folder names (the version folder name is the name of the Tale's version, each run folder is the name of each run). This can easily be parsed during import.
Adding additional structure to the manifest.json to include metadata about each run and version (most likely requires us to come up with new terms for runs & versions).
We can make this arbitrarily complex by introducing membership predicates (wasPartOf, etc.) to describe relations between versions and runs.
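A minimal sketch of how the two ideas above might combine on import: the version and run names are read from folder names, and a manifest fragment carries the mapping from run folders to the names users chose in Whole Tale. The layout (run folders directly inside the version folder, recognized by their results/ subfolder) and the keys `version`, `runs`, `folder`, and `name` are hypothetical and do not reflect existing manifest.json terms.

```python
import tempfile
from pathlib import Path


def parse_export(version_dir: Path) -> dict:
    """Recover the version name and run folder names from a naming convention.

    Assumes (for this sketch only) that the version folder is named after the
    version and that every subfolder containing results/ is a run.
    """
    runs = sorted(
        p.name
        for p in version_dir.iterdir()
        if p.is_dir() and (p / "results").is_dir()
    )
    return {"version": version_dir.name, "runs": runs}


def manifest_fragment(parsed: dict, display_names: dict) -> dict:
    """Build hypothetical manifest metadata mapping run folders to user-facing names."""
    return {
        "version": {"name": parsed["version"]},
        "runs": [
            {"folder": folder, "name": display_names.get(folder, folder)}
            for folder in parsed["runs"]
        ],
    }


with tempfile.TemporaryDirectory() as tmp:
    version_dir = Path(tmp) / "Version 1"
    for run_name in ("Run B", "Run A"):
        (version_dir / run_name / "results").mkdir(parents=True)
    (version_dir / "workspace").mkdir()  # not a run: it has no results/ folder

    parsed = parse_export(version_dir)
    fragment = manifest_fragment(parsed, {"Run A": "Baseline model"})
```

The folder convention alone is enough when folder names and display names match; the manifest mapping is only needed when they can differ.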
Exporting Individual Runs
This approach exports a single run of a version, in contrast to exporting all of the runs. The visible difference is that the export looks a little cleaner (personal opinion) and can be useful for users who are interested in a particular result.
This approach is also more streamlined for the use case of exporting reproducible runs: the user interface should look the same whether a user is exporting a recorded or a non-recorded run.
The constraints for exporting are the same as the case for exporting all of the runs. It may be useful to preserve the original naming that was done in the frontend.
Proposed BagIt structure (2)
This BagIt structure is different than the first in that there isn't any indication that the exported Tale is a version/run other than the filesystem artifacts from the run. This is nice because it's conceptually not that confusing (compared to many symlinks that users would be asking about) and much easier to navigate.
We also need to consider users that want to export Tales without Recorded Runs or versions. I think that this is still a legitimate use case that we should support. On the girder_wholetale side this should be mostly trivial since it's already implemented; the trick is getting a flag from the export endpoint that indicates whether a run or a whole Tale is being exported.
ThomasThelen changed the title from "Export Tales that have at least one version and run" to "[Draft] Export Tales that have at least one version and run" on Jan 22, 2021
Importing Changes (2)
When importing a Tale with this structure there are a few options.
If we want to preserve the version/run names to partially reconstruct the Tale, these can be encoded in the manifest.json file.
We can also ignore the version/run information and place the content of the results/ folder into the workspace/ folder.
The third option is to create a generic Version & Run name and place the results/ artifacts in the appropriate place.
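The three options above could be sketched as a single decision function. The strategy names (`preserve`, `flatten`, `generic`), the `run_info` manifest key, and the generic names are all placeholders invented for this example; none of them come from the existing import code.

```python
from typing import Optional


def plan_import(strategy: str, manifest: Optional[dict] = None) -> dict:
    """Decide where the exported results/ artifacts go on import.

    Returns the version name, run name, and destination folder for results.
    All names and keys here are hypothetical.
    """
    if strategy == "preserve":
        # Option 1: names were encoded in manifest.json at export time.
        info = (manifest or {}).get("run_info")
        if not info:
            raise ValueError("manifest has no version/run names to preserve")
        return {"version": info["version"], "run": info["run"], "results_dest": "results/"}
    if strategy == "flatten":
        # Option 2: drop version/run information and merge results into the workspace.
        return {"version": None, "run": None, "results_dest": "workspace/"}
    if strategy == "generic":
        # Option 3: create placeholder names and keep results in a run.
        return {"version": "Imported Version", "run": "Imported Run", "results_dest": "results/"}
    raise ValueError(f"unknown import strategy: {strategy}")


preserved = plan_import("preserve", {"run_info": {"version": "Version 1", "run": "Run A"}})
flattened = plan_import("flatten")
generic = plan_import("generic")
```

Only the `preserve` path depends on extra manifest metadata; the other two work with the plain BagIt structure.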