docs: add caveats about some schema models #2433

@@ -2,6 +2,7 @@
title: Schema designs
description: Different ways to design the schema of a Loculus instance
---
import { Aside } from '@astrojs/starlight/components';

Loculus is very flexible in its data model and there are different ways to design the [schema](../../introduction/glossary#schema). Technically, a Loculus instance can have one or multiple organisms and each organism has

@@ -40,10 +41,19 @@ This is a good model if:

### Multiple references for an organism

An organism has one unaligned sequence (per segment) but multiple aligned ones. Users submit an unaligned nucleotide sequence and the processing pipeline aligns it against all multiple references.
<Aside>
This approach has not yet been tested. It would definitely require building your own custom preprocessing pipeline, and currently would result in some confusing UI elements.
</Aside>


In this model, an organism has one unaligned sequence (per segment) but multiple aligned ones. Users submit an unaligned nucleotide sequence and the processing pipeline aligns it against each of the references.

This is a good model if there are multiple reference genomes for an organism.

### No alignments at all

It is also possible to use Loculus without defining any aligned nucleotide or amino acid sequences but just use it to share metadata and unaligned nucleotide sequences.
<Aside>
This approach has not yet been tested. It would currently require building your own custom preprocessing pipeline.
</Aside>

In this model, Loculus is not configured to perform any alignment, but simply used to share unaligned sequences and associated metadata.