Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: add caveats about some schema models (alternative) #2450

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

chaoran-chen
Copy link
Member

This is an alternative suggestion to #2433. This proposal more systematically mentions the availability of preprocessing pipeline for each schema, does not put the pipeline information in a box and offers support.

Copy link
Contributor

@anna-parker anna-parker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me both https://github.com/loculus-project/loculus/pull/2433/files and this PR are good - happy to approve this and increase consensus :-)

@chaoran-chen
Copy link
Member Author

The documentation is now mentioning very often that people can contact us but I feel that at this stage, it's good to show our strong willingness to provide support (which many people might need because of the lack of documentation).

@theosanderson
Copy link
Member

theosanderson commented Aug 19, 2024

Thanks for this! My takes:

  • (I'm definitely fine with the additional messages offering support -as long as we feel we have bandwidth for that)
  • IMO an aside is the appropriate place for these messages as they are not actually a description of the schema - they are indeed "asides"
  • I find the "no known preprocessing pipeline" phrasing a bit strange (I understand what you mean by I don't think the reader will)
  • I think the fact that we haven't tested these models is important information. As a user I would want to know that and feel a bit misled if these options were set out to me but I was not told that some had been extensively tested and others not at all (and which were which)

@theosanderson
Copy link
Member

theosanderson commented Aug 19, 2024

(I think a slightly fainter aside could be nice - I actually preferred the warning one in terms of its paler colors - but I understand why you don't - I could try to make a paler blue note if that were helpful)

@chaoran-chen
Copy link
Member Author

IMO an aside is the appropriate place for these messages as they are not actually a description of the schema - they are indeed "asides"

I'm not against asides as a concept but I think it should be below the description and look rather desemphasized than emphasized (e.g., smaller font or lighter color).

I find the "no known preprocessing pipeline" phrasing a bit strange (I understand what you mean by I don't think the reader will)

What about writing instead: "Our Nextclade-based preprocessing pipeline does not support this use case (yet?)."?

@chaoran-chen
Copy link
Member Author

I think the fact that we haven't tested these models is important information. As a user I would want to know that and feel a bit misled if these options were set out to me but I was not told that some had been extensively tested and others not at all (and which were which)

Do you have serious doubts that a model won't work at all - not due to a small bug (which we might have anywhere and affect other schemas) but that there is a fundamental problem that will prevent this model to be used? I personally don't and therefore don't feel that it makes sense to add such a note.

@theosanderson
Copy link
Member

Do you have serious doubts that a model won't work at all - not due to a small bug (which we might have anywhere and affect other schemas) but that there is a fundamental problem that will prevent this model to be used?

No - I have concerns that it may not work due to a small bug, as with #787. But it's incredibly hard for an external person to figure out whether it's not working due to a bug in Loculus or because they are doing something wrong in their configuration. Especially if they've just written a preprocessing pipeline from scratch. And if it's not working due to a bug then they are in our hands in terms of when it is fixed. Personally as a potential user I might well stick with schemas the devs had already tried, and I think that's a decision we should make possible. I might have 2 days to turn around a database in which case I definitely don't have time to wait for the Loculus team to fix some bugs.

@chaoran-chen
Copy link
Member Author

I might have 2 days to turn around a database in which case I definitely don't have time to wait for the Loculus team to fix some bugs.

In such a case, I don't think that anyone should try to set up Loculus at this stage. I tried to indicate that in the "Current state and roadmap" page but am happy to strenghten the wording (and I'd also be happy to link to the page for administrators as a warning).

@theosanderson
Copy link
Member

I mean one can have different views about exactly where the threshold is (that's a decision for users), but don't you agree that there is a meaningful difference between the situation where you are following a pattern that we have tested and shown to work, and a case where there may be unknown bugs?

@chaoran-chen
Copy link
Member Author

Firstly, for me, these schemas are high-level models and the intent of the page is to show a few general ways to use Loculus. It's not a detailed guideline on how to implement them: we can write those in the future but for now, the main aim for people is to understand what they can do (in principle - assuming the availability of the necessary resources and knowledge) with Loculus.

Secondly, I really don't see a significant difference from what we already tried out. For the no alignment case, you can always just output a single-character sequence with an N. Technically, it has a sequence but conceptually, it doesn't. For the multi-reference case, you might have to create a few pseudo-sequences but still, I don't see why it wouldn't work. The UI might not be perfect (therefore, I am fine with a note regarding the UI) but that is something we can improve with the time and should not prevent someone to think about these use cases already.

@theosanderson
Copy link
Member

It's not a detailed guideline on how to implement them: we can write those in the future but for now, the main aim for people is to understand what they can do (in principle - assuming the availability of the necessary resources and knowledge) with Loculus.

I agree. I think "we have tried these things, we have not tried these things" is pretty high level information that fits here. I'm not proposing turning this into an implementation guide.

Secondly, I really don't see a significant difference from what we already tried out. For the no alignment case, you can always just output a single-character sequence with an N. Technically, it has a sequence but conceptually, it doesn't. For the multi-reference case, you might have to create a few pseudo-sequences but still, I don't see why it wouldn't work. The UI might not be perfect (therefore, I am fine with a note regarding the UI) but that is something we can improve with the time and should not prevent someone to think about these use cases already.

Sure - but:

  • we haven't documented that you have to output an alignment with just an N anywhere yet (and we don't know whether or not you do need to do that)
  • arguably if you do that then the actual "schema" is one of the first two

@theosanderson theosanderson force-pushed the docs--add-caveats-about-some-schema-models2 branch from 9fa7633 to 1259691 Compare August 19, 2024 14:34
@theosanderson theosanderson added preview Triggers a deployment to argocd and removed preview Triggers a deployment to argocd labels Aug 19, 2024
@chaoran-chen
Copy link
Member Author

chaoran-chen commented Aug 19, 2024

  • we haven't documented that you have to output an alignment with just an N anywhere yet (and we don't know whether or not you do need to do that)

That goes back to my first point in my opinion: this text is about what you can do in principle but not about how exactly you would do that. And I would refrain from documenting the how before we refactored the configuration.

  • arguably if you do that then the actual "schema" is one of the first two

I wouldn't say so because, again, this text is about the concepts and conceptually, a single-character sequence is not an alignment. The aim of mentioning this schema is to show that Loculus can be used if you don't want to align the sequences / don't know how to / it doesn't make sense.

I agree. I think "we have tried these things, we have not tried these things" is pretty high level information that fits here.

I don't think that this is relevant by itself. It indicates that we are not certain that it (the concept) will not work or that it is not recommended which both, in my opinion, are not the case. So for me, the statement, although true, is more misleading than useful.

@theosanderson
Copy link
Member

theosanderson commented Aug 19, 2024

That goes back to my first point in my opinion: this idea is about what you can do in principle but not about how exactly you would do that. And I would refrain from documenting the how before we refactored the configuration.

Sure, I'm not saying we should go into that detail here - I'm just saying that there is a big difference between a schema where the developers know exactly how it works, and there are examples to show how to implement it and one where there isn't, and IMO users will want to know which of those applies to which schema.

I wouldn't say so because, again, this text is about the concepts

I totally accept that one can reasonably argue this! (I think one can argue it both ways)

I don't think that this is relevant by itself. It indicates that we are not certain that it (the concept) will not work or is not recommended which both, in my opinion, are not the case. So for me, the statement, although true, is more misleading than useful.

A) I'd be very happy for us to go into more detail on our view "This approach has not been tested, but we are confident that conceptually Loculus will be able to support it, potentially after some minor tweaks. Feel free to reach out if you have difficulties with implementation" or something to really captures what we believe the true situation is.

B) I think this is another thing where one can poll some people asking "Would you as a potential user want to know X?" where X is a well-described-account of the situation.

@corneliusroemer corneliusroemer added preview Triggers a deployment to argocd and removed preview Triggers a deployment to argocd labels Sep 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants