-
Notifications
You must be signed in to change notification settings - Fork 674
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PoC for CID store annotations and workflow outputs structure #5885
base: cid-store
Are you sure you want to change the base?
Conversation
Signed-off-by: jorgee <jorge.ejarque@seqera.io>
✅ Deploy Preview for nextflow-docs-staging ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great so far. Left a few minor suggestions. I will try to play with it when I have time
modules/nextflow/src/main/groovy/nextflow/extension/PublishOp.groovy
Outdated
Show resolved
Hide resolved
modules/nextflow/src/main/groovy/nextflow/trace/TraceObserver.groovy
Outdated
Show resolved
Hide resolved
modules/nextflow/src/main/groovy/nextflow/trace/TraceObserver.groovy
Outdated
Show resolved
Hide resolved
modules/nf-cid/src/main/nextflow/data/cid/model/WorkflowResults.groovy
Outdated
Show resolved
Hide resolved
Co-authored-by: Ben Sherman <bentshermann@gmail.com> Signed-off-by: Jorge Ejarque <jorgee@users.noreply.github.com>
Signed-off-by: jorgee <jorge.ejarque@seqera.io>
I have updated the code with annotations field in the output. It can be Map or a closure. The closure is evaluated per sample. So, we could support the case of adding sample information as annotation such as the sampleId
|
// emit workflow publish event | ||
session.notifyWorkflowPublish(name, normalized) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something I've been thinking about saving the index metadata to the workflow result. Currently it will always be a list because of how this publish event works. If the source channel is a value channel, it will be saved as a JSON array with a single element rather than just a JSON object
So I would like to refactor this event to instead happen in the onComplete of the publish op, and include either (1) the one record as an object if the source was a value channel or (2) the entire list of index records if the source was a queue channel.
If you think you can implement that, feel free, otherwise I can do it myself
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will do it.
The other doubt I had in this part is about normalized paths. Originally, it was calling notifyWorkflowPublish per value without normalizing the path. It was just normalized for the index file. For testing purposes I moved the notification to this place. But I think it should notify values with normalized paths in all the cases, even if the index file is not set. What do you think about it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I wasn't sure originally whether to normalize the paths here or not, since a trace observer could use the onFilePublish events to recover the source-target mapping.
But as we integrate the index file with the cid store, I think it makes sense to normalize them in the index file.
And yes, I think we should do the onWorkflowPublish regardless of whether the index file is specified. The only complication is the index "mapper", but I plan to remove it anyway in the third preview, so you can remove it here if you want
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have updated the workflow outputs branch based on our discussion. So you can just copy the changes from that branch if you want
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
Signed-off-by: jorgee <jorge.ejarque@seqera.io>
void annotations(Map value) { | ||
setOption('annotations', value) | ||
} | ||
|
||
void annotations(Closure value) { | ||
setOption('annotations', value) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: decide whether to use tags
or phase it out in favor of annotations
Last changes:
|
Signed-off-by: jorgee <jorge.ejarque@seqera.io>
Signed-off-by: jorgee <jorge.ejarque@seqera.io>
@jorgee also some conflicts to solve here 🙏 |
This PR is a PoC for adding the annotations to metadata entries in the CID and restructure the workflow outputs in the way defined in output DSL
It is currently getting annotations from tags
Tested with a variation of the e2e test with a small modification to include tags.
An example of is the generated WorkflowResults.
publishedData
is a list of all files published and outputs are how they are defined as records in the ouputsDslWorkflow output files are annotated with the provided tags