Data-visualisation-course

Interactive data visualisation

Interactive-bioinformatics-data-visualisation-dev

Course Title: Interactive bioinformatics-data visualisation

Course summary

Visualising and plotting big data can become tricky, or even impossible, to do in a single Excel sheet. This course introduces the basics of scientific data visualisation and helps you tease out the interesting parts of your data by going interactive. Interactive plotting and visualisation are becoming more popular in scientific publications with the advent of tools such as D3 (https://github.com/d3/d3/wiki/Gallery) and plotly (https://plot.ly/). We will go through the most common scenario from next-generation sequencing data, specifically RNAseq expression data visualisation. Starting from an Excel sheet or typical output files from RNAseq pipelines, we will select a subset of genes and interactively visualise their variation in 2D and 3D across experimental conditions such as time, temperature and replicates. We will use Degust, an interactive web tool for RNA-seq analysis (https://github.com/drpowell/degust), and work through a few examples.
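As a rough illustration of what "selecting a subset of genes" can mean before any plotting, here is a minimal Python sketch that filters a differential-expression table by fold change and false discovery rate. The column names (`gene`, `logFC`, `FDR`), the thresholds and the values are assumptions made for this example; real pipeline output will use its own headers.

```python
import csv
import io

# Hypothetical differential-expression table; the column names
# (gene, logFC, FDR) and all values are made up for illustration.
EXAMPLE_CSV = """gene,logFC,FDR
geneA,2.5,0.001
geneB,-0.2,0.80
geneC,-3.1,0.004
geneD,1.8,0.20
"""


def select_genes(handle, min_abs_logfc=1.0, max_fdr=0.05):
    """Return rows with a large absolute fold change and a small FDR."""
    selected = []
    for row in csv.DictReader(handle):
        logfc = float(row["logFC"])
        fdr = float(row["FDR"])
        if abs(logfc) >= min_abs_logfc and fdr <= max_fdr:
            selected.append({"gene": row["gene"], "logFC": logfc, "FDR": fdr})
    return selected


# Read from an in-memory string here; pass an open file in practice.
subset = select_genes(io.StringIO(EXAMPLE_CSV))
for row in subset:
    print(row["gene"], row["logFC"], row["FDR"])
```

With these made-up values only geneA and geneC pass both thresholds. Interactive tools typically expose the same two cut-offs as sliders.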

Objectives

After this course you should be able to:

  • Visualise big RNAseq expression data
  • Fetch an interesting subset of the data
  • Understand what interactive RNAseq visualisation is about

Course Content

  • Introduction - what we'll cover
  • A quick demonstration of a few exemplar cases
  • Practical: A hands-on introduction
  • Practical: A quick tour of RNAseq expression data in the IGV viewer
  • Practical: RNA-seq exploration using Degust
  • Visualising and comparing Gene Ontology (GO) terms from an expression data set using GOView

Poll

Data visualisation has become mainstream: no more big tables, spreadsheets or bar/pie charts.

For inspiration, let's visit the story "The 25 Best Data Visualizations of 2023" (https://visme.co/blog/best-data-visualizations/)

Interactive plotting

MSA

We will go through the Alignment Viewer (MSA) component, which is used to view alignments of multiple genomic or proteomic sequences from a FASTA or Clustal file. Among its extensive set of features, the multiple sequence alignment viewer can display multiple subplots showing gap and conservation information, alongside industry-standard colorscale support and a consensus sequence. No matter what size your alignment is, Alignment

(https://dash-bio.plotly.host/dash-alignment-viewer/)
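To make the gap, conservation and consensus tracks more concrete, the sketch below computes them for a toy aligned FASTA using only the Python standard library. The sequences are invented, and "conservation" is taken here to be simply the fraction of sequences that carry the most common residue in a column, which is an assumption for illustration and not necessarily the measure the Alignment Viewer uses.

```python
from collections import Counter

# Toy aligned FASTA; the sequences are made up for illustration.
EXAMPLE_FASTA = """>seq1
ACG-T
>seq2
ACGAT
>seq3
AC--T
"""


def parse_fasta(text):
    """Parse FASTA text into a dict of name -> sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def column_summary(records, gap="-"):
    """Per column: (consensus residue, conservation, gap fraction)."""
    seqs = list(records.values())
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    summary = []
    for column in zip(*seqs):
        gaps = column.count(gap)
        residues = Counter(c for c in column if c != gap)
        if residues:
            consensus, count = residues.most_common(1)[0]
        else:
            consensus, count = gap, 0  # column made only of gaps
        summary.append((consensus, count / len(column), gaps / len(column)))
    return summary


summary = column_summary(parse_fasta(EXAMPLE_FASTA))
consensus = "".join(residue for residue, _, _ in summary)
print(consensus)
```

Each tuple in `summary` corresponds to one alignment column, so the three values map directly onto the consensus, conservation and gap subplots described above.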

Circos

RNA-seq mapping data visualisation (IGV)

  • We will use IGV (Integrative Genomics Viewer)

  • Integrated visualisation tool for multiple data types and genome annotations

  • Data from NGS analysis can be visualised easily: interactive and fast, and runs locally on the desktop

  • Handles large datasets

  • View multiple datasets in separate panels of the same window (hence "integrative")

  • Supports several track management options (filtering, grouping, sorting) and built-in tools (index, sort, motif finder, etc.)

  • Direct visualisation from web resources in formats such as bed, wig, gff, gff3, BAM, bigWig, bigBed and vcf

  • IGV exercises: viewing BAM files and spotting SNPs (data under the repo igv-mutant-data)
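When the same regions have to be inspected in several BAM files, IGV can also be driven by a batch script (Tools > Run Batch Script in the desktop application). The sketch below only generates the script text; the genome ID, file names and locus are placeholders chosen for this example, not the contents of igv-mutant-data.

```python
def igv_batch_script(genome, tracks, loci, snapshot_dir):
    """Build the text of an IGV batch script, one command per line."""
    lines = ["new", f"genome {genome}"]
    lines += [f"load {track}" for track in tracks]
    lines.append(f"snapshotDirectory {snapshot_dir}")
    for locus in loci:
        lines.append(f"goto {locus}")
        # Name each image after its locus, replacing ':' for safe file names.
        lines.append(f"snapshot {locus.replace(':', '_')}.png")
    lines.append("exit")
    return "\n".join(lines) + "\n"


script = igv_batch_script(
    genome="hg19",                          # placeholder genome ID
    tracks=["mutant.bam", "wildtype.bam"],  # hypothetical file names
    loci=["chr1:10000-10500"],              # hypothetical region
    snapshot_dir="snapshots",
)
print(script)
```

Saving the printed text to a file and running it in IGV would load both tracks, jump to each locus and write one PNG per region, which is handy for comparing a candidate SNP across samples.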

RNAseq Expression data visualisation (Degust)

GO term visualisation: GOView

  • We will use GOView (http://www.webgestalt.org/2017/GOView/)

  • A simple way to view and compare Gene Ontology terms in a DAG structure

  • Sample file for Biological Process, searching by GO ID ()

  • It allows users to visualise and compare multiple user-provided GO term lists in a directed acyclic graph (DAG) to reveal relationships among the terms
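The idea of comparing term lists in a DAG can be sketched in a few lines of Python: each list is expanded to include all ancestors of its terms, and the expanded sets are then intersected. The term IDs and parent links below are placeholders invented for this example, not real GO accessions, and this is not a description of how GOView is implemented.

```python
# Toy DAG: child term -> set of parent terms. The IDs are placeholders,
# not real GO accessions.
PARENTS = {
    "T:root": set(),
    "T:stimulus": {"T:root"},
    "T:stress": {"T:stimulus"},
    "T:temperature": {"T:stimulus"},
    "T:heat": {"T:stress", "T:temperature"},
    "T:cold": {"T:stress", "T:temperature"},
    "T:metabolism": {"T:root"},
}


def ancestors(term, parents):
    """Return every term reachable by following parent links upwards."""
    seen = set()
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def induced_terms(term_list, parents):
    """Terms in the list plus all of their ancestors."""
    covered = set(term_list)
    for term in term_list:
        covered |= ancestors(term, parents)
    return covered


list_a = ["T:heat"]
list_b = ["T:cold", "T:metabolism"]
shared = induced_terms(list_a, PARENTS) & induced_terms(list_b, PARENTS)
print(sorted(shared))
```

Although the two lists have no term in common, their expanded sets share the stress and temperature branches, which is the kind of relationship a DAG view makes visible.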

Gene Network construction and Visualisation

  • GeNeCK (Gene Network Construction Kit) is a comprehensive online tool kit that integrates various statistical methods to construct gene networks based on gene expression data and optional hub gene information (http://lce.biohpc.swmed.edu/geneck/index.php)

  • Paper: GeNeCK: a web server for gene network construction and visualization, Zhang et al. (2019), BMC Bioinformatics
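To give a feel for how a network can be derived from expression data, here is a deliberately simple sketch that connects two genes when the absolute Pearson correlation of their expression profiles exceeds a threshold. This is a basic co-expression approach swapped in for illustration; it is not one of GeNeCK's methods, and the gene names, values and threshold are made up.

```python
from itertools import combinations
from math import sqrt

# Made-up expression values for four genes across five samples.
EXPRESSION = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.0, 4.1, 5.9, 8.2, 9.8],
    "geneC": [5.0, 4.0, 3.0, 2.0, 1.0],
    "geneD": [3.0, 1.0, 4.0, 1.0, 5.0],
}


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either profile is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def build_network(expression, threshold=0.9):
    """Return (gene1, gene2, r) for every pair with |r| >= threshold."""
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, round(r, 3)))
    return edges


def degrees(edges):
    """Count how many edges touch each gene; high counts suggest hubs."""
    counts = {}
    for g1, g2, _ in edges:
        counts[g1] = counts.get(g1, 0) + 1
        counts[g2] = counts.get(g2, 0) + 1
    return counts


edges = build_network(EXPRESSION)
print(edges)
print(degrees(edges))
```

In this toy data geneA, geneB and geneC end up fully connected while geneD stays isolated. Counting edges per gene is one simple way to look for candidate hub genes.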

Cytoscape

  • We can import the SIganlingFlow.cys file downloaded from (http://dp.univr.it/~laudanna/LCTST/downloads/) as a network using File --> Import --> Network --> File...

  • The Cytoscape user interface has three main elements: the Control Panel on the left, the Table Panel on the bottom, and the Network View in the middle

  • Explore various options

  • Another network example is 05062019-signalink-UTh27h.cys, which we downloaded from (http://signalink.org/download) using the following config:

SignaLink export config:
   Species:
       H. sapiens
   Layers:
       Pathway members
       Pathway regulators
   Pathways:
       RTK
   Include TF network: Yes
   Range from pathway specific TF: 1
   Format: cytoscape
   Compression: None

Using Expression Atlas: Transcription profiling by high throughput sequencing of Arabidopsis roots, leaves, flowers and siliques (https://www.ebi.ac.uk/gxa/experiments/E-GEOD-38612/Results)

Further resources

  • GeNeCK (http://lce.biohpc.swmed.edu/geneck/): a web server for gene network construction and visualization, Zhang et al. (2019), BMC Bioinformatics 2019; 20: 12. We will use it to spot the important genes in networks built from gene expression data
  • Cytoscape: a software environment for integrated models of biomolecular interaction networks, Genome Research 2003 Nov; 13(11):2498-504