Collections response fix endpoints #121

mbaudis · 2024-03-21T18:29:15Z

This addresses the easy part of #116 (collections responses w/ strange endpoints etc.)

costero-e

Everything makes sense, as cohorts and datasets endpoints should be a collection response, not a resultSet response.
I also see that the way arrays are shown in examples has been updated:

        examples:
          - 
            - OMIABIS:0001017

This will need to be updated everywhere to follow this pattern, I guess.

redmitry · 2024-03-22T08:33:18Z

I have my doubts...
We are removing the "CollectionsResponse" and modifying "ResultsOKResponse" to be the collection.
What will happen with endpoints like this one?
/cohorts/{id}/individuals

Before the changes, only the "root" endpoint returned the collection, and the rest returned the "usual" ResultsOKResponse (like any other entryType, e.g. "biosamples").

Are we changing the contract?

D.

mbaudis · 2024-03-22T08:52:16Z

@redmitry Oh well, you are right w/ this breaking the resultsetResponses if allowing those endpoints (which I guess we should). I guess the naming of ResultsOKResponse makes this look special but is extremely confusing. I'll iterate another version for discussion...

Reviewer spotted other breaking fixes that need to be reviewed again.

mbaudis · 2024-03-22T09:13:31Z

Have a look at ea4324b Only for cohorts so far.

My reasoning there is that ResultsOKResponse does not make sense as a name, especially if there are several possible responses.

But generally all of this is also a big argument for ditching the concept of having separate Boolean and Count response types and just having a single one for each data response flavour (collections, resultsets, filtering terms ...) where the data payload (and count) are optional... This is probably against some "how it should be done" rule, and I remember that the different response schemas looked nice when drafting them; but now we see a confusion of alternative response types and alternative payload types.

Alas, for now how about this?

redmitry · 2024-03-22T09:28:27Z

IMO "ResultsetsResponse" is a more appropriate name.
So we'll have "ResultsetsResponse" which is one of three responses (boolean, count, resultsets) and the "CollectionsResponse".

costero-e · 2024-03-22T09:34:14Z

But the resultSets response gives results split by dataset and, if I'm not wrong, the boolean and count responses don't do that, do they?
On the other hand, I see with your last commits @mbaudis that cohorts have the resultsets response and datasets still have the resultsOkResponse.

mbaudis · 2024-03-22T09:35:17Z

Yes and I really would like to see this (since the previous generic ResultsOKResponse is kind of confusing - at least it confused me; I thought it was a required term ...).

Now, in principle we should then propagate this everywhere we have such a generic to make it clearer what is being executed.

Pro: clarity, won't break anything since references inside of schemas.
Con: change a parameter name just for these reasons, w/o any functional gain

I'm clearly pro here, but for all entry types.

mbaudis · 2024-03-22T09:40:05Z

@costero-e I left the datasets just since I didn't want to do changes I have to reverse ...

ResultSets are actually split by any type of collection:

      setType:
        description: Entry type of resultSet. It SHOULD MATCH an entry type declared
          as collection in the Beacon configuration.
        type: string
        default: dataset

... so a datasets response for individuals could be for the dataset or split for its cohorts, theoretically.
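To make the `setType` point concrete, here is a minimal sketch of splitting the same records into result sets by different collection types. The record keys `datasetId` and `cohortId`, and the helper name, are assumptions made for illustration; the result-set fields are only loosely modelled on the Beacon schema.

```python
from collections import defaultdict


def split_into_result_sets(records, set_type="dataset"):
    """Group records into result sets keyed by the collection they belong to.

    Each record is assumed to carry the id of its collection under the
    hypothetical key f"{set_type}Id" (e.g. "datasetId" or "cohortId").
    """
    key = f"{set_type}Id"
    grouped = defaultdict(list)
    for record in records:
        grouped[record[key]].append(record)
    # One result set per collection, in order of first appearance.
    return [
        {"id": set_id, "setType": set_type, "resultsCount": len(items), "results": items}
        for set_id, items in grouped.items()
    ]


individuals = [
    {"id": "ind1", "datasetId": "ds1", "cohortId": "cohortA"},
    {"id": "ind2", "datasetId": "ds1", "cohortId": "cohortB"},
]
by_dataset = split_into_result_sets(individuals)             # a single result set
by_cohort = split_into_result_sets(individuals, "cohort")    # two result sets
```

The same two individuals end up in one result set when split by dataset and in two when split by cohort, which is the theoretical case described above.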

costero-e · 2024-03-22T09:47:25Z

I think you tagged another Oriol @mbaudis, jeje. Ok, thank you, it's good you are just giving an example using cohorts.
On the other hand, what I meant is that we don't have resultSets in a typical Count or Boolean response:

{
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "description": "Payload definition for the \"count\" response.",
    "properties": {
        "exists": {
            "$ref": "../../common/beaconCommonComponents.json#/definitions/Exists"
        },
        "numTotalResults": {
            "$ref": "../../common/beaconCommonComponents.json#/definitions/NumTotalResults",
            "description": "Total number of results."
        }
    },
    "required": [
        "exists",
        "numTotalResults"
    ],
    "type": "object"
}

For me, putting the "countResponse" and "booleanResponse" under a "resultSetsresponse" makes it a bit confusing. That's why I'm more in favor of keeping "resultsOkResponse".

mbaudis · 2024-03-22T10:04:09Z

@costero-e It is confusing in any case. Therefore I would just prefer that we have only a ResultSetsResponse with a Boolean granularity instead of switching responses.

But overall the new way would be slightly less confusing... You need to name the response anyway and ResultsOKResponse + CollectionsResponse is worse compared to ResultsetsResponse + CollectionsResponse.

Anyway, naming is schema internal. And IMO it is correct to have an endpoint with a ResultsetsResponse, which can take the format of one of the options. If not doing it for ResultsetsResponse then you have additional inconsistencies.

But again, this is a reason for having a single response, with the granularity handled inside it.
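A rough sketch of what "a single response with the granularity handled inside it" could look like. The class, the function, and the granularity values `boolean`, `count` and `record` are assumptions for illustration, not something defined in this PR.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class UnifiedResponse:
    """Hypothetical single response shape; count and payload are optional."""
    exists: bool
    num_total_results: Optional[int] = None              # left out at "boolean" granularity
    result_sets: Optional[list[dict[str, Any]]] = None   # only filled at "record" granularity


def build_response(records: list[dict[str, Any]], granularity: str) -> UnifiedResponse:
    """Return the same response type every time, filling in only what the granularity allows."""
    if granularity not in ("boolean", "count", "record"):
        raise ValueError(f"unknown granularity: {granularity}")
    response = UnifiedResponse(exists=len(records) > 0)
    if granularity in ("count", "record"):
        response.num_total_results = len(records)
    if granularity == "record":
        response.result_sets = [{"setType": "dataset", "results": records}]
    return response
```

With this shape a client always parses one type and checks which optional fields are present, instead of first working out which of several response schemas it received.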

mbaudis · 2024-03-22T10:06:51Z

... and another inconsistency is the "cohorts have individuals for records retrieval while datasets have all entry types". I understand the argument ("cohorts as a group of individuals") but still you may want to get the samples etc. Not wanting to change right now but good to keep in mind.

mbaudis · 2024-03-25T08:25:10Z

@redmitry @costero-e So WDYT - moving ahead in the current version w/o going for the other "ResultsOKResponse" instances, but doing it for the collections which have these 2 types of responses? I'd like to close #116

I guess we should have the "response schemas for entry types should be defined somewhere" in a general "remove dependencies on OpenAPI" discussion/issue.

redmitry · 2024-03-25T10:27:45Z

Again we are changing the contract:

was:

        "/cohorts/{id}": {
            "get": {
                "responses": {
                    "200": {
                        "$ref": "#/components/responses/ResultsOKResponse"
                    },

now:

        "/cohorts/{id}": {
            "get": {
                "responses": {
                    "200": {
                        "$ref": "#/components/responses/CollectionsResponse"
                    },

Most probably that was an error in the spec, because I remember reporting this to Oriol (#98), but I wrote my Java implementation from the spec.

Should we fix both the spec. and the implementation?

Cheers,

Dmitry

mbaudis · 2024-03-25T11:27:42Z

@redmitry This just changes the name ResultsOKResponse => CollectionsResponse; the operation is getting a single collection (as it should).

I've fixed now the messy partial changes in the last commit (all the separate ones should be squashed...).

This fixes the wrong response for some of the collections endpoints and changes the (now definitely) ambiguous ResultsOKResponse to the correct instances of

CollectionsResponse

ResultsetsResponse

redmitry · 2024-03-25T11:37:52Z

> This just changes the name ResultsOKResponse => CollectionsResponse; the operation is getting a single collection (as it should).

I do not doubt that it probably (😏) should, but the current spec's "ResultsOKResponse" is
"oneOf": [ beaconBooleanResponse.json, beaconCountResponse.json, beaconResultsetsResponse.json ]

Hence my comment: if we want "/cohorts/{id}" to return "a single collection", we have to change the implementations as well...

Dmitry

This is just a temporary removal of files colliding w/ case insensitive file systems. We'll have to address the doc file re-generation at some point ...

mbaudis · 2024-03-25T14:43:30Z

@redmitry That is actually in line with all entry types: biosamples etc. also only know the ResultsOKResponse which is defined as

    ResultsOKResponse:
      description: Successful operation.
      content:
        application/json:
          schema:
            oneOf:
              - $ref: https://raw.githubusercontent.com/ga4gh-beacon/beacon-v2/main/framework/json/responses/beaconBooleanResponse.json
              - $ref: https://raw.githubusercontent.com/ga4gh-beacon/beacon-v2/main/framework/json/responses/beaconCountResponse.json
              - $ref: https://raw.githubusercontent.com/ga4gh-beacon/beacon-v2/main/framework/json/responses/beaconResultsetsResponse.json

... so yes, a biosamples/{id} response will return a beaconResultsetsResponse where the single biosample is wrapped in results (list) in resultSets (list) in the response (object).
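For illustration, this is roughly what a client has to do to get at the single record under that nesting. The helper name and the sample body are made up; only the `response` -> `resultSets` -> `results` path comes from the description above.

```python
def unwrap_single_record(response_body):
    """Pull the one record out of response -> resultSets -> results."""
    result_sets = response_body["response"]["resultSets"]
    # Flatten all result sets; an {id} request is expected to yield one record.
    records = [rec for rs in result_sets for rec in rs.get("results", [])]
    if len(records) != 1:
        raise ValueError(f"expected exactly one record, found {len(records)}")
    return records[0]


# Hypothetical body for a biosamples/{id} request.
body = {
    "response": {
        "resultSets": [
            {"id": "ds1", "setType": "dataset", "results": [{"id": "biosample-1"}]}
        ]
    }
}
record = unwrap_single_record(body)
```

Two levels of list unwrapping for a single document is the overhead that a dedicated {id} response type would remove.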

One can argue that we should have a 3rd type of response, a beaconIdrequestResponse, where the result could be the single document (and the responseSummary would be as it is now).

How do you respond to {id} requests?

redmitry · 2024-03-25T14:55:21Z

> How do you respond to {id} requests?

Currently all /{entryType}/{} endpoints return ResultsetsResponse(s), even for "collection" entryTypes.

https://beacons.bsc.es/beacon/v2.0.0/cohorts/CINECA_synthetic_cohort_UK1
https://beacon-apis-demo.ega-archive.org/api/cohorts/CINECA_synthetic_cohort_EUROPE_UK1

mbaudis · 2024-03-25T15:07:33Z

@redmitry Ah; that's what you mean; i.e. the wrong response type (BeaconResultsetsResponse instead of BeaconCollectionsResponse). Yes, that should just be a single collection info. At least according to the description in the endpoints files:

  /cohorts/{id}:
    parameters:
      - $ref: '#/components/parameters/entryId'
    get:
      parameters:
        - $ref: '#/components/parameters/requestedSchema'
      description: Get details about one cohort, identified by its (unique) 'id'

... and

  /datasets/{id}:
    parameters:
      - $ref: '#/components/parameters/entryId'
    get:
      parameters:
        - $ref: '#/components/parameters/requestedSchema'
      description: Get details about one dataset, identified by its (unique) 'id'
      operationId: getOneDataset
      tags:

... which is IMO the clear intention; in BeaconCollectionsResponse this would result in a response.results list of one collection. (I didn't write this ¯\_(ツ)_/¯).

My point above was already a bit further - a separate BeaconIdResponse ... or such which would move the list -> object, i.e. a single cohort or biosample ... in response.result.

costero-e · 2024-03-25T18:54:06Z

Agree @redmitry that this will change the implementations but anyway, I'm changing it almost every day for adding new features that are on demand so no problem. Plus, it makes more sense to have a collectionResponse rather than resultSetsResponse as @mbaudis says. For me, the PR is good to be merged and we can proceed.

redmitry · 2024-03-25T18:59:01Z

So be it. Once merged, I will update the Java implementation accordingly.

mbaudis · 2024-03-25T19:05:10Z

Great! Note: I removed some schema .md files from the documentation since they always break merges on Mac OS (same name, different case). This doesn't affect anything since the documentation branch is kept separately, and the website is being built from it. IMO the schema -> .md -> web scripts should be redone (and rerun!) but that's for a separate discussion.
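A small sketch of how such same-name, different-case clashes could be detected before they break a merge. The function name and the example paths are hypothetical; nothing like this exists in the repository as far as this thread shows.

```python
from collections import defaultdict


def find_case_collisions(paths):
    """Return groups of paths that differ only by letter case."""
    groups = defaultdict(list)
    for path in paths:
        # casefold() gives a case-insensitive key, similar to what a
        # case-insensitive file system would compare.
        groups[str(path).casefold()].append(str(path))
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


# Hypothetical file names, for illustration only.
collisions = find_case_collisions(
    ["docs/schemas/Cohort.md", "docs/schemas/cohort.md", "docs/schemas/dataset.md"]
)
```

The input could be any listing of tracked files, and the check could run before regenerating the .md files.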

jrambla · 2024-03-25T19:06:49Z

Without having read the whole thread in detail yet, as I'm trying to understand which issue we are trying to solve here...
The original issue was described by @redmitry as

> cohorts/endpoints.json and datasets/endpoints.json have the CollectionsResponse response object defined as a
> choice of beaconBooleanResponse, beaconCountResponse and beaconCollectionsResponse, which is wrong.
>
> It should be defined as beaconCollectionsResponse, specifying the type according to the endpoint cohorts/defaultSchema.json or cohorts/defaultSchema.json.

The assertion "is wrong" needs to be clarified. What is exactly wrong in that?
A collection response is different from a resultsets response.

mbaudis · 2024-03-25T19:14:28Z

@jrambla Yeah, well, this filtered down to the collections response in cohorts mapping to a resultsets response as the data option in the beaconOKresponse (in contrast to what was written everywhere else). But we need a resultset response, too, for the cohorts/{id}/individuals (and more in datasets) items. Unless we say they are also just a list since the cohort / dataset is already the wrapper ... But this just isn't clear; it was wrong in any case, and the current change basically allows us to have resultsets for the records payloads (e.g. a /datasets/{id}/ endpoint could have individuals from multiple cohorts resultsets).

mrueda · 2024-03-25T19:17:57Z

> Great! Note: I removed some schema .md files from the documentation since they always break merges on Mac OS (same name, different case). This doesn't affect anything since the documentation branch is kept separately, and the website is being built from it. IMO the schema -> .md -> web scripts should be redone (and rerun!) but that's for a separate discussion.

@mbaudis I'll have a look and rename the clashing .md files. My idea was to create a GitHub action to run the scripts... but that was before we had the re-branching...

jrambla · 2024-03-25T19:19:18Z

I fear this is incorrect.
Cohorts/{id} and datasets/{id} should return details of THAT collection, NOT the contents of the collection.
In contrast, cohorts/{id}/individuals will return a resultset with the individuals (simplifying at this level of discussion).
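The distinction can be written down as a small lookup from path to response schema. This is only a sketch of the rule as stated here: the routing table and function are illustrative, and it ignores that the records endpoints may also answer with a boolean or count response.

```python
import re

# Illustrative routing table: path pattern -> response schema name.
# Only the cohort/dataset paths discussed in this thread are covered.
ROUTES = [
    (re.compile(r"^/(cohorts|datasets)$"), "beaconCollectionsResponse"),
    (re.compile(r"^/(cohorts|datasets)/[^/]+$"), "beaconCollectionsResponse"),
    (re.compile(r"^/(cohorts|datasets)/[^/]+/[^/]+$"), "beaconResultsetsResponse"),
]


def expected_response(path):
    """Return the schema name the rule above assigns to a path."""
    for pattern, schema in ROUTES:
        if pattern.match(path):
            return schema
    raise LookupError(f"no rule for path: {path}")
```

So `/cohorts/{id}` describes the collection itself, while `/cohorts/{id}/individuals` returns the records inside it.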

redmitry · 2024-03-25T19:19:43Z

> The assertion "is wrong" needs to be clarified. What is exactly wrong in that?
> A collection response is different from a resultsets response.

Well... I may have messed it up...
You are right that we have 2 things here: my initial issue and the response from /cohorts/{id}.
I would pause it, as Jordi suggests.

costero-e · 2024-03-25T19:34:25Z

Yes, @jrambla is right that this is more a change than a fix and can wait. Maybe it needs more discussion on whether the solution proposed here is better than the one we had so I would also stop the PR until it is properly discussed.

mbaudis · 2024-03-27T09:45:58Z

I'm closing this - see fresh start in #123

mbaudis added 2 commits March 21, 2024 19:18

fixing ResultsOKResponse for cohorts

ea3be85

beaconCollectionsResponse

2553223

mbaudis changed the base branch from schema-urgent-fixes to main March 21, 2024 18:29

mbaudis requested review from redmitry and costero-e March 21, 2024 18:30

costero-e previously approved these changes Mar 22, 2024

ResultsetsResponse for cohorts/{id}/individuals

ea4324b

mbaudis and others added 3 commits March 25, 2024 15:13

file case bug fix

063cb8a

This is just a temporary removal of files colliding w/ case insensitive file systems. We'll have to address the doc file re-generation at some point ...

file case bug fix

b854dc4

This is just a temporary removal of files colliding w/ case insensitive file systems. We'll have to address the doc file re-generation at some point ...

mbaudis force-pushed the CollectionsResponse-fix-endpoints branch from 3500d55 to b854dc4 Compare March 25, 2024 14:13

mbaudis closed this Mar 27, 2024

mbaudis deleted the CollectionsResponse-fix-endpoints branch March 27, 2024 09:46
