docs: add caveats about some schema models (alternative) #2450

chaoran-chen · 2024-08-19T11:53:23Z

This is an alternative suggestion to #2433. This proposal more systematically mentions the availability of preprocessing pipeline for each schema, does not put the pipeline information in a box and offers support.

anna-parker

For me both https://github.com/loculus-project/loculus/pull/2433/files and this PR are good - happy to approve this and increase consensus :-)

chaoran-chen · 2024-08-19T12:00:22Z

The documentation is now mentioning very often that people can contact us but I feel that at this stage, it's good to show our strong willingness to provide support (which many people might need because of the lack of documentation).

theosanderson · 2024-08-19T12:14:56Z

Thanks for this! My takes:

(I'm definitely fine with the additional messages offering support -as long as we feel we have bandwidth for that)
IMO an aside is the appropriate place for these messages as they are not actually a description of the schema - they are indeed "asides"
I find the "no known preprocessing pipeline" phrasing a bit strange (I understand what you mean by I don't think the reader will)
I think the fact that we haven't tested these models is important information. As a user I would want to know that and feel a bit misled if these options were set out to me but I was not told that some had been extensively tested and others not at all (and which were which)

theosanderson · 2024-08-19T12:16:43Z

(I think a slightly fainter aside could be nice - I actually preferred the warning one in terms of its paler colors - but I understand why you don't - I could try to make a paler blue note if that were helpful)

chaoran-chen · 2024-08-19T12:36:43Z

IMO an aside is the appropriate place for these messages as they are not actually a description of the schema - they are indeed "asides"

I'm not against asides as a concept but I think it should be below the description and look rather desemphasized than emphasized (e.g., smaller font or lighter color).

I find the "no known preprocessing pipeline" phrasing a bit strange (I understand what you mean by I don't think the reader will)

What about writing instead: "Our Nextclade-based preprocessing pipeline does not support this use case (yet?)."?

chaoran-chen · 2024-08-19T12:42:43Z

I think the fact that we haven't tested these models is important information. As a user I would want to know that and feel a bit misled if these options were set out to me but I was not told that some had been extensively tested and others not at all (and which were which)

Do you have serious doubts that a model won't work at all - not due to a small bug (which we might have anywhere and affect other schemas) but that there is a fundamental problem that will prevent this model to be used? I personally don't and therefore don't feel that it makes sense to add such a note.

theosanderson · 2024-08-19T12:57:35Z

Do you have serious doubts that a model won't work at all - not due to a small bug (which we might have anywhere and affect other schemas) but that there is a fundamental problem that will prevent this model to be used?

No - I have concerns that it may not work due to a small bug, as with #787. But it's incredibly hard for an external person to figure out whether it's not working due to a bug in Loculus or because they are doing something wrong in their configuration. Especially if they've just written a preprocessing pipeline from scratch. And if it's not working due to a bug then they are in our hands in terms of when it is fixed. Personally as a potential user I might well stick with schemas the devs had already tried, and I think that's a decision we should make possible. I might have 2 days to turn around a database in which case I definitely don't have time to wait for the Loculus team to fix some bugs.

chaoran-chen · 2024-08-19T13:05:36Z

I might have 2 days to turn around a database in which case I definitely don't have time to wait for the Loculus team to fix some bugs.

In such a case, I don't think that anyone should try to set up Loculus at this stage. I tried to indicate that in the "Current state and roadmap" page but am happy to strenghten the wording (and I'd also be happy to link to the page for administrators as a warning).

theosanderson · 2024-08-19T13:12:46Z

I mean one can have different views about exactly where the threshold is (that's a decision for users), but don't you agree that there is a meaningful difference between the situation where you are following a pattern that we have tested and shown to work, and a case where there may be unknown bugs?

chaoran-chen · 2024-08-19T13:26:49Z

Firstly, for me, these schemas are high-level models and the intent of the page is to show a few general ways to use Loculus. It's not a detailed guideline on how to implement them: we can write those in the future but for now, the main aim for people is to understand what they can do (in principle - assuming the availability of the necessary resources and knowledge) with Loculus.

Secondly, I really don't see a significant difference from what we already tried out. For the no alignment case, you can always just output a single-character sequence with an N. Technically, it has a sequence but conceptually, it doesn't. For the multi-reference case, you might have to create a few pseudo-sequences but still, I don't see why it wouldn't work. The UI might not be perfect (therefore, I am fine with a note regarding the UI) but that is something we can improve with the time and should not prevent someone to think about these use cases already.

theosanderson · 2024-08-19T13:37:05Z

It's not a detailed guideline on how to implement them: we can write those in the future but for now, the main aim for people is to understand what they can do (in principle - assuming the availability of the necessary resources and knowledge) with Loculus.

I agree. I think "we have tried these things, we have not tried these things" is pretty high level information that fits here. I'm not proposing turning this into an implementation guide.

Secondly, I really don't see a significant difference from what we already tried out. For the no alignment case, you can always just output a single-character sequence with an N. Technically, it has a sequence but conceptually, it doesn't. For the multi-reference case, you might have to create a few pseudo-sequences but still, I don't see why it wouldn't work. The UI might not be perfect (therefore, I am fine with a note regarding the UI) but that is something we can improve with the time and should not prevent someone to think about these use cases already.

Sure - but:

we haven't documented that you have to output an alignment with just an N anywhere yet (and we don't know whether or not you do need to do that)
arguably if you do that then the actual "schema" is one of the first two

chaoran-chen · 2024-08-19T21:18:34Z

we haven't documented that you have to output an alignment with just an N anywhere yet (and we don't know whether or not you do need to do that)

That goes back to my first point in my opinion: this text is about what you can do in principle but not about how exactly you would do that. And I would refrain from documenting the how before we refactored the configuration.

arguably if you do that then the actual "schema" is one of the first two

I wouldn't say so because, again, this text is about the concepts and conceptually, a single-character sequence is not an alignment. The aim of mentioning this schema is to show that Loculus can be used if you don't want to align the sequences / don't know how to / it doesn't make sense.

I agree. I think "we have tried these things, we have not tried these things" is pretty high level information that fits here.

I don't think that this is relevant by itself. It indicates that we are not certain that it (the concept) will not work or that it is not recommended which both, in my opinion, are not the case. So for me, the statement, although true, is more misleading than useful.

theosanderson · 2024-08-19T21:32:36Z

That goes back to my first point in my opinion: this idea is about what you can do in principle but not about how exactly you would do that. And I would refrain from documenting the how before we refactored the configuration.

Sure, I'm not saying we should go into that detail here - I'm just saying that there is a big difference between a schema where the developers know exactly how it works, and there are examples to show how to implement it and one where there isn't, and IMO users will want to know which of those applies to which schema.

I wouldn't say so because, again, this text is about the concepts

I totally accept that one can reasonably argue this! (I think one can argue it both ways)

I don't think that this is relevant by itself. It indicates that we are not certain that it (the concept) will not work or is not recommended which both, in my opinion, are not the case. So for me, the statement, although true, is more misleading than useful.

A) I'd be very happy for us to go into more detail on our view "This approach has not been tested, but we are confident that conceptually Loculus will be able to support it, potentially after some minor tweaks. Feel free to reach out if you have difficulties with implementation" or something to really captures what we believe the true situation is.

B) I think this is another thing where one can poll some people asking "Would you as a potential user want to know X?" where X is a well-described-account of the situation.

chaoran-chen requested review from theosanderson and anna-parker August 19, 2024 11:53

chaoran-chen mentioned this pull request Aug 19, 2024

docs: add caveats about some schema models #2433

Open

anna-parker approved these changes Aug 19, 2024

View reviewed changes

docs: add caveats about some schema models

1259691

theosanderson force-pushed the docs--add-caveats-about-some-schema-models2 branch from 9fa7633 to 1259691 Compare August 19, 2024 14:34

theosanderson added preview Triggers a deployment to argocd and removed preview Triggers a deployment to argocd labels Aug 19, 2024

corneliusroemer added preview Triggers a deployment to argocd and removed preview Triggers a deployment to argocd labels Sep 16, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

docs: add caveats about some schema models (alternative) #2450

docs: add caveats about some schema models (alternative) #2450

chaoran-chen commented Aug 19, 2024

anna-parker left a comment

chaoran-chen commented Aug 19, 2024

theosanderson commented Aug 19, 2024 •

edited

Loading

theosanderson commented Aug 19, 2024 •

edited

Loading

chaoran-chen commented Aug 19, 2024

chaoran-chen commented Aug 19, 2024

theosanderson commented Aug 19, 2024

chaoran-chen commented Aug 19, 2024

theosanderson commented Aug 19, 2024

chaoran-chen commented Aug 19, 2024

theosanderson commented Aug 19, 2024

chaoran-chen commented Aug 19, 2024 •

edited

Loading

theosanderson commented Aug 19, 2024 •

edited

Loading

docs: add caveats about some schema models (alternative) #2450

Are you sure you want to change the base?

docs: add caveats about some schema models (alternative) #2450

Conversation

chaoran-chen commented Aug 19, 2024

anna-parker left a comment

Choose a reason for hiding this comment

chaoran-chen commented Aug 19, 2024

theosanderson commented Aug 19, 2024 • edited Loading

theosanderson commented Aug 19, 2024 • edited Loading

chaoran-chen commented Aug 19, 2024

chaoran-chen commented Aug 19, 2024

theosanderson commented Aug 19, 2024

chaoran-chen commented Aug 19, 2024

theosanderson commented Aug 19, 2024

chaoran-chen commented Aug 19, 2024

theosanderson commented Aug 19, 2024

chaoran-chen commented Aug 19, 2024 • edited Loading

theosanderson commented Aug 19, 2024 • edited Loading

theosanderson commented Aug 19, 2024 •

edited

Loading

theosanderson commented Aug 19, 2024 •

edited

Loading

chaoran-chen commented Aug 19, 2024 •

edited

Loading

theosanderson commented Aug 19, 2024 •

edited

Loading