Commit

Merge pull request #1187 from ScilifelabDataCentre/develop
send latest changes to live
LianeHughes authored Sep 5, 2024
2 parents 6612da3 + a152323 commit 35a69f6
Showing 73 changed files with 2,827 additions and 741 deletions.
4 changes: 4 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -11,3 +11,7 @@ __pycache__
# backups
*~
\#*\#

# Node modules and npm cache directory
node_modules/
.npm/
2 changes: 1 addition & 1 deletion README.md
Expand Up @@ -20,7 +20,7 @@ The website is intended to provide a central place to provide information about:
- Ongoing research projects and funding opportunities for COVID-19 and other topics within pandemic preparedness research

The site is built using the [Hugo](https://gohugo.io/) static web site generator.
It uses the [Bootstrap](https://getbootstrap.com/) framework. In addition, it uses [Vega-Lite](https://vega.github.io/vega-lite/), [DataTables](https://datatables.net/), [OpenLayers](https://openlayers.org/), [plotly](https://plotly.com/), [ImJoy](https://imjoy.io/) for various features.
It uses the [Bootstrap](https://getbootstrap.com/) framework. In addition, it uses [DataTables](https://datatables.net/), [OpenLayers](https://openlayers.org/), [plotly](https://plotly.com/), [ImJoy](https://imjoy.io/) for various features.

## Cite this portal

25 changes: 20 additions & 5 deletions content/english/about/_index.md
Expand Up @@ -39,29 +39,44 @@ The team are happy to help with using the Portal, to take suggestions regarding
<div class="container mb-3">
<div class="row row-cols-2 row-cols-md-3 row-cols-lg-6">
<div class="col pt-2">
<div><img src="/img/people/lh.jpg" alt="Picute of Liane H" width="150" class="img-thumbnail"/></div>
<div><img src="/img/people/lh.jpg" alt="Picture of Liane H" width="150" class="img-thumbnail"/></div>
<div><b>Liane Hughes</b></div>
<div><span class="text-muted">Project leader</span></div>
</div>
<div class="col pt-2">
<div><img src="/img/people/kos.jpg" alt="Picute of Katarina ÖS" width="150" class="img-thumbnail"/></div>
<div><img src="/img/people/kos.jpg" alt="Picture of Katarina ÖS" width="150" class="img-thumbnail"/></div>
<div><b>Katarina Öjefors Stark</b></div>
<div><span class="text-muted">Data steward</span></div>
</div>
<div class="col pt-2">
<div><img src="/img/people/sp.jpg" alt="Picute of Senthilkumar P" width="150" class="img-thumbnail"/></div>
<div><img src="/img/people/sp.jpg" alt="Picture of Senthilkumar P" width="150" class="img-thumbnail"/></div>
<div><b>Senthilkumar Panneerselvam</b></div>
<div><span class="text-muted">Systems developer</span></div>
</div>
<div class="col pt-2">
<div><img src="/img/people/hk.jpg" alt="Picute of Hanna K" width="150" class="img-thumbnail"/></div>
<div><img src="/img/people/hk.jpg" alt="Picture of Hanna K" width="150" class="img-thumbnail"/></div>
<div><b>Hanna Kultima</b></div>
<div><span class="text-muted">Vice head of SciLifeLab Data Centre</span></div>
</div>
<div class="col pt-2">
<div><img src="/img/people/jr.jpg" alt="Picute of Johan R" width="150" class="img-thumbnail"/></div>
<div><img src="/img/people/jr.jpg" alt="Picture of Johan R" width="150" class="img-thumbnail"/></div>
<div><b>Johan Rung</b></div>
<div><span class="text-muted">Head of SciLifeLab Data Centre</span></div>
</div>
<div class="col pt-2">
<div><img src="/img/people/pd.jpeg" alt="Picture of Paul D" width="150" class="img-thumbnail"/></div>
<div><b>Paul Dulaud</b></div>
<div><span class="text-muted">Systems Developer</span></div>
</div>
<div class="col pt-2">
<div><img src="/img/people/nh.jpeg" alt="Picture of Nalina H" width="150" class="img-thumbnail"/></div>
<div><b>Nalina Hamsaiyni Venkatesh</b></div>
<div><span class="text-muted">Data Steward</span></div>
</div>
<div class="col pt-2">
<div><img src="/img/people/aa.jpg" alt="Picture of Abdullah A" width="150" class="img-thumbnail"/></div>
<div><b>Abdullah Aziz</b></div>
<div><span class="text-muted">Data Engineer</span></div>
</div>
</div>
</div>
18 changes: 14 additions & 4 deletions content/english/about/editorial_committee.md
Expand Up @@ -2,13 +2,23 @@
title: Editorial committee
menu:
navbar_about:
name: "Editorial committee<br><br>"
name: "Editorial committee<br><br>"
weight: 30
layout: about_navbar
---

Below are the six editorial committee members for 2023, alongside their affiliation and area of expertise. The editorial committee enable a more direct link between the Portal and the wider research community. They collaborate with the Portal team to create content and advise on resources that would be beneficial to researchers in their area of expertise.
#### Current Editorial Committee

{{< editorial_committee_cards >}}
Below are the current editorial committee members, alongside their affiliation and area of expertise. The editorial committee enable a more direct link between the Portal and the wider research community. They collaborate with the Portal team to create content and advise on resources that would be beneficial to researchers in their area of expertise.

{{< editorial_committee_cards type="current">}}
<br>

#### Alumni

The following individuals are former members of the editorial committee who have significantly contributed to the development and growth of our platform. During their tenure, they brought invaluable insights from their respective fields, helping to shape the direction of our content and resources. Their legacy continues to influence the work we do, as they laid a strong foundation for the ongoing collaboration between the research community and the Portal. We extend our deepest gratitude to these distinguished alumni for their dedication and impactful service.

{{< editorial_committee_cards type="alumni">}}
<br>
*Images courtesy of: Uppsala University (LH, MN), SciLifeLab (LC), Karolinska Instititet (BM), Stockholm University (JA) and Chalmers (JB-P).*

_Images courtesy of: Uppsala University (LH, MN), SciLifeLab (LC), Karolinska Institutet (AF, BM), Stockholm University (JA) and Chalmers (JB-P)._
4 changes: 2 additions & 2 deletions content/english/about/organisations_and_programs.md
Expand Up @@ -2,7 +2,7 @@
title: Programs and organisations behind the portal
menu:
navbar_about:
name: "Programs & organisations<br>behind the portal"
name: "Programs & organisations<br>behind the portal"
weight: 20
layout: about_navbar
---
Expand Up @@ -30,7 +30,7 @@ layout: about_navbar
</div>
<div class="col-12 col-md-8 col-lg-9">
<h6>SciLifeLab & Wallenberg National Program for Data-Driven Life Science</h6>
<p>Life science research is becoming increasingly data-driven. The amount and complexity of data is also growing exponentially. Data is the most valuable product of research, and it is therefore crucially important that we ensure it is managed appropriately throughout its lifecycle. To this end, SciLifeLab and The Knut and Alice Wallenberg Foundation have established the Data-Driven Life Science (DDLS) program in Sweden. The mission of the DDLS program is to recruit and train the next generation of life scientists, and to create strong data science capabilities for life science in Sweden that are internationally competitive. The DDLS program has been funded by The Knut and Alice Wallenberg foundation for 12 years. SciLifeLab, as a national infrastructure for life science, coordinates this program in close collaboration with ten Swedish universities and the Swedish Museum of Natural History. You can read more about the DDLS program <a href="https://www.scilifelab.se/data-driven/">here</a>.</p>
<p>Life science research is rapidly becoming more data-driven. As the volume and complexity of data grow exponentially, the effective management of data throughout its lifecycle becomes ever more important. To address this, SciLifeLab and The Knut and Alice Wallenberg Foundation launched the Data-Driven Life Science (DDLS) program in Sweden. This program focuses on recruiting and training the next generation of life scientists whilst building data science capabilities that will position Sweden at the forefront of life sciences globally. Coordinated by SciLifeLab with ten Swedish universities and the Swedish Museum of Natural History, the DDLS program benefits from 12 years of funding from the Knut and Alice Wallenberg Foundation.<br><br>The DDLS program is structured around four strategically focused research areas: Cell and Molecular Biology (hosted by Chalmers University of Technology), Evolution and Biodiversity (hosted by Uppsala University), Precision Medicine and Diagnostics (hosted by Karolinska Institutet), and Epidemiology and Biology of Infection (hosted by Umeå University). Each research area is supported by a national Data Science Node (DSN) at these institutions. The DSNs connect local research communities with the national DDLS program and SciLifeLab's network. For instance, the Epidemiology and Biology of Infection DSN, hosted at the <a href="https://www.hpc2n.umu.se/">High-Performance Computing Center North (HPC2N)</a> at Umeå University, contributes to the <a href="https://www.pathogens.se/">Swedish Pathogens Portal</a>. The DSN is crucial in reaching out to the Epidemiology and Biology of Infection research community, and in ensuring that their needs are understood and appropriately prioritised. You can read more about the DDLS program <a href="https://www.scilifelab.se/data-driven/">on the SciLifeLab website</a>.</p>
</div>
</div>
<hr class="faded" />
4 changes: 2 additions & 2 deletions content/english/dashboards/RECOVAC.md
Expand Up @@ -56,7 +56,7 @@ In order to infer the impact of vaccination on ICU admissions, it is best to dir

The graphs below have multiple interactive features. In brief, it is possible to view different parts of the data using the buttons above the graphs. For example, it is possible to look at data from only those over 60 years of age by clicking '>60'. The 'Align timeline' button will change the timeline of the graphs so that only the period for which both types of data are available is visible. The 'Show all data' button can be used to see all of the available data for both datasets (the timelines of the two are not the same).
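
As a rough illustration of what 'Align timeline' does, the sketch below computes the period common to two datasets. It is a minimal Python example under stated assumptions, not the dashboard's actual code: the function name `common_period` and the date ranges are hypothetical.

```python
from datetime import date


def common_period(range_a, range_b):
    """Return the (start, end) period covered by both ranges, or None if they do not overlap."""
    start = max(range_a[0], range_b[0])
    end = min(range_a[1], range_b[1])
    return (start, end) if start <= end else None


# Made-up date ranges, for illustration only
icu_range = (date(2020, 3, 1), date(2022, 5, 1))
vaccination_range = (date(2020, 12, 27), date(2022, 6, 1))

aligned = common_period(icu_range, vaccination_range)
print(aligned)
```

'Show all data' would correspond to plotting each dataset over its own full range instead.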

<div id="dwbuttons"><button class="btn btn-secondary" type="button" data-bs-toggle="collapse" data-bs-target="#vis_instr_one" aria-expanded="False" aria-controls="mandatorycollapse">
<div id="dwbuttonsone"><button class="btn btn-secondary" type="button" data-bs-toggle="collapse" data-bs-target="#vis_instr_one" aria-expanded="False" aria-controls="mandatorycollapse">
Click here for more detailed instructions on using the features of the graphs
</button>
</div>
Expand Up @@ -112,7 +112,7 @@ There is clear evidence of the benefits of vaccination for patients with each co

The graphs below have multiple interactive features. It is possible to see all of the data available for a given comorbidity by clicking on the corresponding button. The 'Align timeline' button will change the timeframe shown so that only the time period that is common between the two graphs is shown. The 'Show all data' button can be used to see all of the available data for both datasets (the timelines of the two are not the same).

<div id="dwbuttons"><button class="btn btn-secondary" type="button" data-bs-toggle="collapse" data-bs-target="#vis_instr_two" aria-expanded="False" aria-controls="mandatorycollapse">
<div id="dwbuttonstwo"><button class="btn btn-secondary" type="button" data-bs-toggle="collapse" data-bs-target="#vis_instr_two" aria-expanded="False" aria-controls="mandatorycollapse">
Click here for more detailed instructions on using the features of the graphs
</button>
</div>
2 changes: 2 additions & 0 deletions content/english/dashboards/covid_publications.md
Expand Up @@ -18,6 +18,8 @@ The visualisations on this page evaluate the development of COVID-19 and SARS-Co

The code used to produce the visualisations on this page can be found on [GitHub](https://github.com/ScilifelabDataCentre/pathogens-portal-visualisations). Specifically, code related to the number of publications can be found in the ['Count_publications' folder of the repository](https://github.com/ScilifelabDataCentre/pathogens-portal-visualisations/tree/main/Count_publications), and code used to generate the wordclouds can be found in the ['Wordcloud' folder](https://github.com/ScilifelabDataCentre/pathogens-portal-visualisations/tree/main/Wordcloud).

{{% publication_updated_date %}}

## Number of new publications

This graph displays the number of publications (including both journal publications and preprints) published each month, as well as the cumulative daily total of publications contained in the database. The dates reflect either the dates that the articles were uploaded to preprint servers (in the case of preprints) or the official journal publication date, whichever is the most recent. Where a given day of publication is not specified, we assign the date as the first of the month. This causes the appearance of a relatively sharp increase at the start of each month.
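
The date rule described above can be sketched in a few lines of Python. This is only an illustration, not the code in the 'Count_publications' folder: the input format (`YYYY-MM-DD` or `YYYY-MM` strings), the function names and the sample records are all assumptions.

```python
from collections import Counter
from datetime import date


def parse_publication_date(raw):
    """Parse 'YYYY-MM-DD' or 'YYYY-MM'; a missing day is assigned the first of the month."""
    parts = [int(p) for p in raw.split("-")]
    if len(parts) == 2:
        year, month = parts
        return date(year, month, 1)
    year, month, day = parts
    return date(year, month, day)


def daily_counts(raw_dates):
    """Count publications per assigned date."""
    return Counter(parse_publication_date(r) for r in raw_dates)


# Made-up records, for illustration only
records = ["2021-03-17", "2021-03", "2021-03", "2021-04-02"]
counts = daily_counts(records)
print(counts[date(2021, 3, 1)])
```

Because every record without a day lands on the first of its month, those dates accumulate extra counts, which is the source of the step at the start of each month.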
67 changes: 67 additions & 0 deletions content/english/dashboards/lineage_competition.md
@@ -0,0 +1,67 @@
---
title: "SARS-CoV-2 Variant Competition"
description: "Estimates of SARS-CoV-2 variant frequencies and growth rate advantages from global SARS-CoV-2 genotype sequencing data"
banner: /dashboard_thumbs/lineage_competition.png
toc: true
plotly: true
menu:
dashboard_menu:
identifier: lineage_competition
name: SARS-CoV-2 variant competition
dashboards_topics: [COVID-19, Infectious diseases]
data_status: "updating" # or "historic"
---

<div class="alert alert-info">All data last updated: {{< last_modified_lineage >}}</div>

SARS-CoV-2 is constantly evolving, with new variants competing against one another for dominance in different regions. This model integrates SARS-CoV-2 genotype sequencing data from around the world to estimate the growth advantage of different variants, which is then used to provide regional estimates of variant frequencies and how these are changing over time. This can be used by researchers who might wish to know which variants to focus on in their studies, or by public health officials who might wish to know which variants are likely to become dominant in their region.

The full set of model estimates, which includes estimates for countries other than Sweden, can be found in the [GitHub repository of the Murrell research group](https://github.com/MurrellGroup/lineages), who conduct this research.

## Global statistics on lineage competition

### Advantage estimates

Growth rate advantages are estimated from all variant frequency data, globally. A variant with a higher growth rate advantage is expected to increase in frequency relative to other variants.
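
To make the link between growth rate advantage and frequency concrete, the sketch below uses the standard multinomial logistic form, in which each variant's frequency is proportional to `exp(intercept + growth_rate * t)`. It is a minimal illustration with made-up numbers, not the Murrell group's implementation; the function name and parameter values are assumptions.

```python
import math


def variant_frequencies(intercepts, growth_rates, t):
    """Softmax over per-variant linear predictors at time t (in days)."""
    logits = [a + r * t for a, r in zip(intercepts, growth_rates)]
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(x - top) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Made-up values: variant B starts rare but has a growth advantage of 0.1 per day
intercepts = [0.0, -3.0]
growth_rates = [0.0, 0.1]

for day in (0, 30, 60):
    freq_a, freq_b = variant_frequencies(intercepts, growth_rates, day)
    print(day, round(freq_a, 3), round(freq_b, 3))
```

Only differences between growth rates matter in this form, which is why the estimates are reported as advantages relative to other variants.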

<figure><img alt="Growth advantage estimates for the top variants" width="1000" src="https://raw.githubusercontent.com/MurrellGroup/lineages/main/plots/pruned_lineages_MCMC_lineage_growths.svg"></figure>

For convergent mutations (occurring independently at least three times), the contribution to the growth rate advantage of each mutation is estimated.

<figure><img alt="Estimates of contribution to growth of convergently occurring mutations" width="1000" src="https://raw.githubusercontent.com/MurrellGroup/lineages/main/plots/N_MCMC_mutations.svg"></figure>

The relatedness of SARS-CoV-2 variants, with their estimated growth rate advantages, can be visualised in a phylogenetic tree. Only recent variants, and key ancestral variants, are shown. Lineages with low growth rate estimates are excluded.

<figure><img alt="Growth-annotated phylogeny" width="1000" src="https://raw.githubusercontent.com/MurrellGroup/lineages/main/plots/tree_pruned.svg"></figure>

### Variant trajectories

The "model average" variant frequencies are forecasts from the model with all region-specific effects set to zero. This provides a single "snapshot" of the global variant situation. This is not meant to be representative of the true global variant frequencies, since it is influenced by different sequencing coverage in different regions, but it is useful to understand the model's estimates of how quickly one variant might be expected to replace others.

Variants are coloured such that related variants should be similar in colour.

<figure><img alt="Model average variant trajectories" width="1000" src="https://raw.githubusercontent.com/MurrellGroup/lineages/main/plots/variant_trajectories_model_avg.svg"></figure>

<figure><img alt="Model average variant frequencies" width="1000" src="https://raw.githubusercontent.com/MurrellGroup/lineages/main/plots/muller_trajectories_model_avg.svg"></figure>

## Results from Sweden

Estimates of variant frequencies and growth rate advantages for Sweden are always included in the model. As with all data used in the model, Swedish genotype data comes from [GISAID](https://gisaid.org). Sequencing volumes are often low for Sweden, especially when the case counts themselves are low (and there are not many infections to sequence). In such cases, the estimates for variant frequencies in Sweden can be very uncertain. It is therefore important to treat the results of the model with caution when sequencing volumes are low.

<figure><img alt="SARS-CoV-2 genotype volumes" width="1000" src="https://raw.githubusercontent.com/MurrellGroup/lineages/main/plots/sequence_volume_Sweden.svg"></figure>

<figure><img alt="Variant trajectories in Sweden" width="1000" src="https://raw.githubusercontent.com/MurrellGroup/lineages/main/plots/variant_trajectories_Sweden.svg"></figure>

<figure><img alt="Variant frequencies in Sweden" width="1000" src="https://raw.githubusercontent.com/MurrellGroup/lineages/main/plots/muller_trajectories_country_Sweden.svg"></figure>

## Methods

Lineage dynamics are modelled using a Bayesian multinomial logistic regression over lineage counts. The latest global [GISAID](https://gisaid.org/) SARS-CoV-2 dataset (obtained via bulk download) is filtered to include only sequences with collection dates within the last 100 days. Lineage assignment is performed using [NextClade](https://docs.nextstrain.org/projects/nextclade/en/stable/user/nextclade-cli/index.html), retaining sequences with a “good” overall quality control (QC) status, and >90% coverage. Daily lineage counts are aggregated by region (including only countries with sufficient recent sequencing volumes), with low-frequency sub-lineages (too rare to model) merged into their most recent ancestors.
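
The filtering and counting steps described above can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the group's pipeline: the record fields, the inclusive 100-day boundary, the representation of coverage as a fraction and the sample records are all hypothetical, and region aggregation and sub-lineage merging are omitted.

```python
from collections import Counter
from datetime import date, timedelta
from typing import NamedTuple


class SequenceRecord(NamedTuple):
    lineage: str
    collection_date: date
    qc_status: str
    coverage: float  # fraction of the genome covered, 0 to 1


def keep_record(rec, today, window_days=100, min_coverage=0.90):
    """Keep recent sequences with 'good' QC status and more than 90% coverage."""
    earliest = today - timedelta(days=window_days)
    recent = earliest <= rec.collection_date <= today
    return recent and rec.qc_status == "good" and rec.coverage > min_coverage


def daily_lineage_counts(records, today):
    """Count retained sequences per (collection date, lineage)."""
    return Counter(
        (r.collection_date, r.lineage) for r in records if keep_record(r, today)
    )


# Made-up records, for illustration only
today = date(2024, 9, 5)
records = [
    SequenceRecord("JN.1", date(2024, 8, 20), "good", 0.97),
    SequenceRecord("JN.1", date(2024, 8, 20), "good", 0.95),
    SequenceRecord("KP.3", date(2024, 8, 21), "bad", 0.99),   # fails QC
    SequenceRecord("KP.3", date(2024, 8, 21), "good", 0.85),  # coverage too low
    SequenceRecord("KP.3", date(2024, 4, 1), "good", 0.99),   # older than 100 days
    SequenceRecord("KP.3", date(2024, 8, 21), "good", 0.93),
]
counts = daily_lineage_counts(records, today)
print(counts)
```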

Growth rates are modelled with a hierarchical approach. Each lineage's growth rate in a given region is the sum of a global rate and a region-specific random effect. The global rate for each lineage comprises three components: i) branch-specific terms for each branch ancestral to the lineage, ii) terms for convergent spike mutations occurring 3 or more times independently that are present in the lineage, and iii) a lineage-specific term. This parameterisation allows for shared evidence when mutations occur across multiple branches, and phylogenetic heritability of growth rates, such that growth rates for closely-related lineages are more likely, under the model’s prior, to be similar to one another. Recombinants inherit weighted mixtures of their multiple parents' growth terms. Lineage-specific intercept terms, which control the relative timing of the emergence of variants, comprise a global shared term and region-specific random effects.
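
The additive structure of the growth rates can be written out as a short sketch. The Python below only illustrates the decomposition described above with made-up numbers. The function names are hypothetical, and treating a recombinant's inherited terms as a weighted average with weights summing to one is an assumption about what "weighted mixtures" means here.

```python
def global_growth_rate(branch_terms, mutation_terms, lineage_term):
    """Global rate = ancestral branch terms + convergent mutation terms + lineage-specific term."""
    return sum(branch_terms) + sum(mutation_terms) + lineage_term


def regional_growth_rate(global_rate, region_effect):
    """Regional rate = global rate + region-specific random effect."""
    return global_rate + region_effect


def recombinant_inherited_terms(parent_terms, weights):
    """Weighted mixture of the parents' growth terms (weights assumed to sum to 1)."""
    return sum(w * g for w, g in zip(weights, parent_terms))


# Made-up values, for illustration only
global_rate = global_growth_rate(
    branch_terms=[0.02, 0.01],  # one term per branch ancestral to the lineage
    mutation_terms=[0.005],     # convergent spike mutations present in the lineage
    lineage_term=-0.003,
)
sweden_rate = regional_growth_rate(global_rate, region_effect=0.004)
mixed = recombinant_inherited_terms(parent_terms=[0.03, 0.01], weights=[0.75, 0.25])
print(global_rate, sweden_rate, mixed)
```

Because closely related lineages share most of their ancestral branch terms, their global rates differ only in a few components, which is how the parameterisation encodes heritability of growth rates.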

Gaussian priors (centred on zero) are used for each parameter type, with Gaussian hyperpriors over the log of their standard deviations. Posterior distributions are jointly sampled (for global and local parameters for all global data) using Hamiltonian Monte Carlo with the No-U-Turn sampler, implemented in the [AdvancedHMC.jl](https://github.com/TuringLang/AdvancedHMC.jl) package of [Julia](https://julialang.org/).

The [Pango designations](https://github.com/cov-lineages/pango-designation/) are used for lineage names in all of the plots produced using the model.

The Murrell group gratefully acknowledges all data contributors, i.e. the Authors and their Originating Laboratories responsible for obtaining the specimens, and their Submitting Laboratories that generated the genetic sequence and metadata and shared via the GISAID Initiative the data on which this research is based.