Derek furst/multiple components datasets #545

Merged · 17 commits · Oct 16, 2023

Conversation

DerekFurstPitt (Contributor)

Implemented the datasets/components endpoint to register multiple component datasets from a multi-assay ancestor. These entities are not created through the typical entity-creation process: we can't use the normal triggers because those are configured to work with a single ancestor per child dataset, so all of the validation is handled within this endpoint. New functions were created for activity creation and for linking to the direct ancestor. For now we only allow a single parent ancestor, but the functions were written to accommodate multiple ancestors. The main difference between this schema function and the existing ones is that it accepts multiple child entities rather than just one. The activity created is of the type Multi-Assay Split.
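As an illustration, a request to this endpoint might carry a body shaped roughly like the following. The field names here are illustrative stand-ins, not the actual entity-api contract:

```python
# Hypothetical request body for the datasets/components endpoint.
# Field names are illustrative only; consult the entity-api schema
# for the real contract.
payload = {
    # the single multi-assay ancestor allowed for now
    "direct_ancestor_uuids": ["<multi-assay-ancestor-uuid>"],
    # multiple child component datasets registered in one call
    "datasets": [
        {"dataset_type": "RNAseq", "lab_dataset_id": "component-1"},
        {"dataset_type": "ATACseq", "lab_dataset_id": "component-2"},
    ],
}
```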

Commit messages (truncated in GitHub's display):

- "…d. For now, returns uuids of the entities created. containing both datasets. Still need to apply the same modifications and read triggers done with create_entity. Creation action field is not currently supported in the triggers. Need to merge in changes from creation_action branch"
- "…has its support merged in. Normalized output. Added reindex call like is done on a normal create entity request."
- "…eation action from the dataset since this is set manually at activity creation time. Replaced json_data_dict with dataset where it was put mistakenly during validation."
@yuanzhou (Member) left a comment:

@DerekFurstPitt the way you handle this specialized endpoint with mixed use of schema components can be simplified and the SenNet project will appreciate this effort too. I'll describe the steps:

  1. First, check the required X-Hubmap-Application header, and group all the input JSON validations at the beginning without using schema_manager.validate_json_data_against_schema(). The purpose is to ensure data integrity by checking the required fields and their values for existence, size, and type.

  2. Similar to create_multiple_samples_details(), create two sets of IDs (which also validates the group_uuid comprehensively) via uuid-api using schema_manager.create_hubmap_ids(), one for each component dataset. Then build two merged dicts and call schema_manager.generate_triggered_data() on each to generate all the dataset node properties via the schema.

  3. As you've already done, generate one Activity node to be linked to the direct ancestor dataset and the two new component datasets. This is also the time to set Activity.creation_action to "Multi-Assay Split", as you did.

  4. Create an app-specific Neo4j query similar to app_neo4j_queries.create_multiple_samples() and write everything (nodes and relationships) into Neo4j, without using the schema.

  5. Add the new datasets to the indices via reindex, and return the desired data structure to ingest-api, just as you did.

By following the above approach, we'll have one specialized endpoint that handles the custom JSON input with custom validations, plus one app-specific Neo4j query.
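The five steps above can be sketched in outline as a single function. Every helper here (mint_ids, generate_node_properties, create_in_neo4j, reindex) is a hypothetical stand-in for the real schema_manager / app_neo4j_queries internals, not the actual entity-api code:

```python
"""Illustrative outline of the five-step flow; all helpers are
hypothetical stand-ins for entity-api internals."""
import uuid


def mint_ids(component):
    # Stand-in for schema_manager.create_hubmap_ids(), which also
    # validates group_uuid comprehensively via uuid-api.
    return {"uuid": uuid.uuid4().hex}


def generate_node_properties(merged):
    # Stand-in for schema_manager.generate_triggered_data(), which
    # fills in the schema-driven dataset node properties.
    return dict(merged)


def create_in_neo4j(ancestor_uuid, activity, datasets):
    # Stand-in for one app-specific query analogous to
    # app_neo4j_queries.create_multiple_samples(): a single
    # transaction writes the Activity node, both dataset nodes,
    # and all relationships, without using the schema.
    pass


def reindex(dataset_uuid):
    # Stand-in for the reindex call that adds the dataset to the
    # search indices.
    pass


def register_multiple_components(json_data, ancestor_uuid):
    # Step 1: inline validation of the custom input (no
    # schema_manager.validate_json_data_against_schema()).
    components = json_data.get("datasets", [])
    if len(components) != 2:
        raise ValueError("exactly two component datasets are expected")

    # Step 2: mint IDs, build merged dicts, generate node properties.
    datasets = []
    for component in components:
        merged = {**component, **mint_ids(component)}
        datasets.append(generate_node_properties(merged))

    # Step 3: one Activity node, with creation_action set manually
    # at activity-creation time.
    activity = {"creation_action": "Multi-Assay Split"}

    # Step 4: write all nodes and relationships in one query.
    create_in_neo4j(ancestor_uuid, activity, datasets)

    # Step 5: reindex and return the new dataset identifiers.
    for ds in datasets:
        reindex(ds["uuid"])
    return [ds["uuid"] for ds in datasets]
```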

@maxsibilla left a comment:

I think this addresses the major points that we all discussed. I believe you also addressed all the issues @yuanzhou mentioned in the card, but I will leave that to him to confirm.

@yuanzhou (Member) left a comment:

@DerekFurstPitt the rework looks great! In addition to updating the comments to indicate the correct return results, can you also sync the latest main into your branch? When I tested your branch locally, I ran into a dependency issue that had already been addressed in main.
