This is a summer training course at CCIC (Faculty of Computers and Information), Mansoura University, Egypt.

SaraEl-Metwally/Informatics-on-High-throughput-Sequencing-Data-Course-Summer-2020-

Informatics-on-High-throughput-Sequencing-Data-Course-Summer-2020-

Course Description / aims

This workshop introduces bioinformatics as the intersection of informatics and biology, and shows how bioinformatics plays an important role in areas of our lives ranging from agricultural and environmental sciences to the healthcare system. Bioinformatics research has accelerated with the advent of Next-Generation Sequencing (NGS) machines. These machines produce a deluge of genomic data that must be cleaned, managed, processed, analyzed, and interpreted to gain a clear understanding of biological systems. This course will help you gain the skills required to start your career as a bioinformatician.

Target audience

Undergraduate and graduate-level students in biology, medicine, science, agriculture, pharmacy, veterinary medicine, biomedical engineering, bioinformatics and other related fields.

Learning Objectives

Participants will gain practical experience and skills to be able to:

  1. Use basic Unix commands for managing genomic data.
  2. Access and manipulate different genomic databases.
  3. Explore different file formats for biological data.
  4. Perform quality control of sequencing reads.
  5. Map and align sequencing reads.
  6. Perform de novo assembly tasks.
  7. Call small variants and annotate them.
  8. Visualize NGS data using visualization tools.

Pre-requisites

  • Basic biology concepts
  • Basic computer skills

Course Agenda and Materials

Day Objective Materials
1
  • Differences among Bioinformatics, Medical Informatics, and Biomedical Informatics
  • Bioinformatics Careers in Egypt
  • How to Survive in the Bioinformatics Field
  • Introduction to Sequencing Technologies (Three Generations of Sequencing Machines)
  • Set up our working environment (Linux Virtual Machine using VirtualBox)
2
  • Unix-based Systems
  • Why Linux for Bioinformatics
  • What is Shell
  • Getting Started with Command Line Interface, i.e. Terminal
  • Playing with Linux Directory Tree Structure
3
  • Ownership of Linux files
  • Understanding Linux Permissions
  • Playing with Files using Unix/Linux commands
4
  • Linux Commands for System Management (e.g. processes, disk, and memory)
  • Linux Commands for Archiving and Compression
  • FASTA/FASTQ Files Format
  • Understand Quality Scores and Phred+33 Format
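
To make the Phred+33 topic above concrete, here is a minimal Python sketch (not part of the course materials). It assumes the standard Phred+33 convention, in which each quality character encodes a score Q as the ASCII code minus 33, and Q relates to the error probability P by Q = -10·log10(P). The function names `decode_phred33` and `error_probability` are illustrative, not taken from any library.

```python
def decode_phred33(quality_string):
    """Convert a FASTQ quality string (Phred+33) into integer quality scores."""
    # Each character's ASCII code minus the offset 33 is the Phred score.
    return [ord(ch) - 33 for ch in quality_string]


def error_probability(q):
    """Probability that a base call with Phred score q is wrong: 10^(-q/10)."""
    return 10 ** (-q / 10)


scores = decode_phred33("II5+!")
print(scores)  # [40, 40, 20, 10, 0]

# A score of 20 corresponds to roughly a 1-in-100 chance of a wrong base call.
print(error_probability(20))
```

The lowest printable score character is `!` (ASCII 33, Q = 0), which is why the offset is 33.
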
5
  • Introduction to GTF files
  • Searching FASTA/FASTQ/GTF files using grep command
  • Regex with grep command (e.g. counting reads in FASTA/FASTQ files)
  • Piping the output of Linux commands
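
The read-counting topic above has a well-known pitfall that a short Python sketch can illustrate (the course itself uses grep; this is only an assumed, equivalent illustration). In FASTA, every record starts with `>`, so counting header lines works. In FASTQ, `@` is also a valid quality character, so counting lines that start with `@` can overcount. The sketch assumes the common four-lines-per-record FASTQ layout; the function names are hypothetical.

```python
import io


def count_fasta_records(handle):
    """Count FASTA records by counting header lines, which start with '>'."""
    return sum(1 for line in handle if line.startswith(">"))


def count_fastq_records(handle):
    """Count FASTQ records, assuming exactly four lines per record."""
    # Blank lines (for example a trailing newline at end of file) are ignored.
    line_count = sum(1 for line in handle if line.strip())
    if line_count % 4 != 0:
        raise ValueError("line count is not a multiple of 4")
    return line_count // 4


def naive_at_count(handle):
    """Naive count of lines starting with '@'; may overcount FASTQ records."""
    return sum(1 for line in handle if line.startswith("@"))


# Two records; the first record's quality line happens to start with '@'.
fastq_text = "@read1\nACGT\n+\n@III\n@read2\nTTGA\n+\nII@I\n"

print(count_fastq_records(io.StringIO(fastq_text)))  # 2
print(naive_at_count(io.StringIO(fastq_text)))       # 3
```

Dividing the line count by four is only valid when sequences and qualities are not wrapped across multiple lines.
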
6
  • Piping/Redirecting output
  • cut, sort, uniq, comm, and diff
  • Linux Commands for networking and communicating with a remote machine
7
  • Introduction to Shell Scripting
  • Variables in Bash
  • User Inputs in Bash
  • Arithmetic in Bash
8
  • Numeric and String Comparisons
  • If statements, Nested If statements, If Else, If Elif Else, Boolean Operations
  • Case Statements
9
  • While, Until, For loops
  • Ranges and Select Statements
  • Functions in bash
10
  • What is SRA? How is it organized?
  • Playing with SRA toolkit
11
  • What is FastQC?
  • Box and Whisker Plot
  • Playing with FastQC Modules
12
  • Continue with FastQC Modules
  • Run FastQC in non-interactive mode
  • Playing with Trimmomatic
13
  • Sequence Alignment Programs
  • BWA Aligner
  • SAM file format
14
  • Flag field in SAM format
  • CIGAR statement in SAM format
  • MD statement in SAM format
  • Introduction to SAMtools
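
As a companion to the FLAG and CIGAR topics above, the following Python sketch (an illustration, not course code) decodes a SAM FLAG integer into its set bits and summarizes a CIGAR string. The bit values and the set of CIGAR operators follow the SAM specification; the short names in `SAM_FLAG_BITS` and all function names are labels chosen here for readability.

```python
import re

# FLAG bits as defined in the SAM specification; the names are informal labels.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def decode_flag(flag):
    """Return the labels of all bits that are set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    return [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]


def reference_span(cigar):
    """Number of reference bases covered (operations M, D, N, = and X)."""
    return sum(n for n, op in parse_cigar(cigar) if op in "MDN=X")


def read_length(cigar):
    """Number of read bases described (operations M, I, S, = and X)."""
    return sum(n for n, op in parse_cigar(cigar) if op in "MIS=X")


print(decode_flag(99))
# ['paired', 'proper_pair', 'mate_reverse_strand', 'first_in_pair']
print(parse_cigar("3S5M2I4M1D6M"))
print(reference_span("3S5M2I4M1D6M"), read_length("3S5M2I4M1D6M"))  # 16 20
```

Soft-clipped bases (S) count toward the read length but not the reference span, while deletions (D) do the opposite.
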
15
  • Install SAMtools
  • SAM <-> BAM Conversion
  • SAMtools view, flagstat, sort, and index commands
  • Extract alignments using SAMtools view command
  • Introduction to BED format
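
One detail of the BED format that often causes off-by-one errors is its coordinate system: BED starts are 0-based and ends are exclusive, whereas the POS field in SAM is 1-based. The Python sketch below is an assumed illustration of converting a 1-based, inclusive interval into a minimal BED line; the function names and the example interval are hypothetical.

```python
def one_based_to_bed(chrom, start, end, name="."):
    """Convert a 1-based, inclusive interval to a tab-separated BED line."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    # BED start is 0-based, so subtract one; the exclusive end stays the same.
    return f"{chrom}\t{start - 1}\t{end}\t{name}"


def bed_interval_length(bed_line):
    """Length of a BED interval: end minus start, because the end is exclusive."""
    fields = bed_line.rstrip("\n").split("\t")
    return int(fields[2]) - int(fields[1])


line = one_based_to_bed("chr1", 100, 200, "example_region")
print(line)                       # chr1	99	200	example_region
print(bed_interval_length(line))  # 101
```

With half-open intervals, the length is simply end minus start, with no "+1" correction.
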
16
  • What is Variant Calling, Examples of Genomic Variations
  • SAMtools mpileup format
  • SAMtools mpileup command
17
  • Variant Calling Pipeline
  • Genotype vs. Phenotype vs. Alleles vs. Haplotypes
  • Genotype vs. Phenotype vs. Allele Frequencies
  • Phased/unphased Genotypes
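
The genotype, allele frequency, and phasing topics above can be sketched in a few lines of Python. This is an assumed illustration using the genotype notation of VCF, where `0` is the reference allele, `1` an alternate allele, `/` separates unphased alleles, `|` separates phased alleles, and `.` marks a missing call. The function names and the sample genotypes are made up for this example.

```python
import re


def split_genotype(gt):
    """Return (alleles, phased) for a genotype string such as '0/1' or '1|0'."""
    phased = "|" in gt
    # Missing alleles ('.') are represented as None.
    alleles = [None if a == "." else int(a) for a in re.split(r"[/|]", gt)]
    return alleles, phased


def alt_allele_frequency(genotypes):
    """Fraction of called alleles that are non-reference, or None if none are called."""
    called = [a for gt in genotypes for a in split_genotype(gt)[0] if a is not None]
    if not called:
        return None
    return sum(1 for a in called if a != 0) / len(called)


samples = ["0/0", "0/1", "1|1", "0|1", "./."]
print(split_genotype("0|1"))          # ([0, 1], True)
print(alt_allele_frequency(samples))  # 0.5
```

Here four of the eight called alleles are non-reference, so the alternate allele frequency is 0.5; the missing genotype contributes nothing to the count.
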
18
  • VCF/BCF format
  • Generate VCF/BCF files with SAMtools
19
  • Generate raw VCF/BCF files with SAMtools
  • Install BCFtools
  • BCFtools view, call and mpileup commands
  • Visualizing Alignment Information with SAMtools tview command
  • Recap of Variant Calling Pipeline
20
  • Recap of Variant Calling Pipeline

References
