Commit

update course page
tobiasrausch committed Nov 22, 2024
1 parent 60029d4 commit 33b8302
Showing 1 changed file with 15 additions and 57 deletions.
72 changes: 15 additions & 57 deletions courses/cg/index.html
@@ -2,79 +2,36 @@
<html lang="en">
<head>
<BASE HREF="https://tobiasrausch.com/courses/cg/">
<title>Analytical Methods in Cancer Genomics</title>
<title>Analytical Methods in Cancer and Population Genomics</title>
</head>
<body>

<h2>Analytical Methods in Cancer Genomics</h2>
<h2>Analytical Methods in Cancer and Population Genomics</h2>

<h3>Course Content</h3>

This course will focus on the analysis of short-read and long-read sequencing data from cancer genomics studies. Bioinformatic concepts, tools and methods required to analyse tumor sequencing data will be introduced. Learning outcomes include an overview of the challenges in the study of cancer genomics, discovery and visualisation of copy-number and structural variants, understanding the principles of tumor purity, heterogeneity and ploidy and an overview of cancer epigenetics. The course covers different sequencing data modalities (short-reads vs. long-reads) and data types (bulk vs. single-cell). Practical data analysis sessions will complement the course.
This part of the course will focus on the analysis of short-read and long-read sequencing data from population and cancer genomics studies. Bioinformatic concepts, tools and methods required to analyse (tumor) sequencing data will be introduced. Learning outcomes include an overview of the challenges in the study of (cancer) genomic data sets, discovery and visualisation of copy-number and structural variants, understanding the principles of tumor purity, heterogeneity and ploidy and an overview of cancer epigenetics. The course covers different sequencing data modalities (short-reads vs. long-reads) and data types (bulk vs. single-cell). Practical data analysis sessions will complement the course.

<h3>Schedule</h3>

<ul>
<li>Thursday 11th April, 12pm-2pm: Course Overview (Zoom), <a href="https://gear-genomics.embl.de/data/.slides/CourseOverview.pdf">Slides</a></li>
<li>Thursday 11th April - Wednesday 24th April: Watch pre-recorded lectures. See email for videos, slides are below.</li>
<li><a href="https://gear-genomics.embl.de/data/.slides/Lecture1_CancerGenomics.pdf">Lecture1 - Introduction to Cancer Genomics</a></li>
<li><a href="https://gear-genomics.embl.de/data/.slides/Lecture2_GenomeVariation.pdf">Lecture2 - Genome Variation</a></li>
<li><a href="https://gear-genomics.embl.de/data/.slides/Lecture3_StructuralVariants.pdf">Lecture3 - Structural Variants</a></li>
<li><a href="https://gear-genomics.embl.de/data/.slides/Lecture4_Epigenetics.pdf">Lecture4 - Cancer Epigenetics</a></li>
<li>Wednesday 24th April, 9am-4pm: Biocev day (Lectures and Practicals)</li>
<li><a href="https://gear-genomics.embl.de/data/.slides/Lecture5_LongReads.pdf">Lecture5 - Long-reads</a></li>
<li>Thursday 2nd May, 12pm-2pm: RNA-Seq lecture (Zoom)</li>
<li><a href="https://gear-genomics.embl.de/data/.slides/Lecture6_RNASeq.pdf">Lecture6 - RNA-Seq</a></li>
<li>Thursday 16th May: Exercises and Questionnaires are due</li>
<li>Friday 22nd November, 12:20pm-2pm: Lecture1 - Introduction to Cancer Genomics (Zoom), <a href="https://gear-genomics.embl.de/data/.slides/Lecture1_CancerGenomics.pdf">Slides</a></li>
</ul>

<h3>Exercise 1: Variant Calling (due date 2nd May 2024)</h3>

Please create a GitHub account or log in to your existing account and create a new repository to analyse sequencing data. The goal of this exercise is to create a simple variant calling workflow for human sequencing data. Please describe the steps of your workflow using markdown (<a href="https://docs.github.com/en/get-started/writing-on-github">GitHub Markdown</a>). The workflow should contain steps to align the FASTQ files to the human reference genome (<a href="https://github.com/lh3/bwa">bwa</a>), call variants (<a href="https://samtools.github.io/bcftools/howtos/variant-calling.html">bcftools</a>) and annotate all single-nucleotide variants (<a href="https://www.ensembl.org/Tools/VEP">VEP</a>). For this exercise you can ignore all InDels. Once you have finished the exercise, send me the URL of your GitHub repository and the likely causative variant via email.

<ul>
<li>Chromosome 7 human reference, <a href="https://hgdownload.soe.ucsc.edu/goldenPath/hg38/chromosomes/chr7.fa.gz">chr7</a></li>
<li>FASTQ of read1, <a href="https://gear-genomics.embl.de/data/.slides/R1.fastq.gz">Read1</a></li>
<li>FASTQ of read2, <a href="https://gear-genomics.embl.de/data/.slides/R2.fastq.gz">Read2</a></li>
</ul>
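The Exercise 1 workflow described above could be sketched as follows. This is a minimal sketch under stated assumptions, not the course's reference solution: it only assembles command strings and does not run them. The function name `variant_calling_commands`, the output prefix `sample` and the assumption that the reference has been decompressed to `chr7.fa` are illustrative and do not come from the course page. VEP annotation is left out; the resulting SNV-only VCF could be submitted to the VEP page linked above.

```python
import shlex


def variant_calling_commands(ref="chr7.fa", r1="R1.fastq.gz",
                             r2="R2.fastq.gz", prefix="sample"):
    """Return shell commands for a minimal align -> call -> SNV-filter workflow.

    All file names are hypothetical defaults; the reference is assumed to be
    an uncompressed FASTA file.
    """
    q = shlex.quote
    bam = f"{prefix}.bam"
    bcf = f"{prefix}.bcf"
    snvs = f"{prefix}.snvs.vcf"
    return [
        # Index the reference for bwa and for random access by bcftools
        f"bwa index {q(ref)}",
        f"samtools faidx {q(ref)}",
        # Align the paired-end reads and sort the alignments by coordinate
        f"bwa mem {q(ref)} {q(r1)} {q(r2)} | samtools sort -o {q(bam)} -",
        f"samtools index {q(bam)}",
        # Call variants with the multiallelic caller, keeping variant sites only
        f"bcftools mpileup -f {q(ref)} {q(bam)} | bcftools call -mv -Ob -o {q(bcf)}",
        # Keep single-nucleotide variants only (InDels are ignored in this exercise)
        f"bcftools view -v snps -Ov -o {q(snvs)} {q(bcf)}",
    ]


if __name__ == "__main__":
    for cmd in variant_calling_commands():
        print(cmd)
```

Printing the commands instead of executing them keeps the sketch independent of any installed tools; the lines can be pasted into a shell or into the README of the exercise repository.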

<h3>Exercise 2: Cancer Genomics Data Analysis (due date 2nd May 2024)</h3>
In this exercise we want to analyze a cancer genomics data set, namely a matched tumor-normal sample pair.
You can download the data set <a href="https://gear-genomics.embl.de/data/.exercise/">here</a>.
The main objective of this exercise is to align the data to the human reference genome (<a href="https://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz">https://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz</a>), to sort and index the alignments and to generate a read-depth plot, as discussed in the lectures. Please note that I downsampled the dataset and kept only the data for chromosome X from 20Mbp to 40Mbp (GRCh37/hg19 coordinates) because otherwise all analyses would take a lot of time for a human genome. Once you have generated the alignment in BAM format you can subset the BAM to the region of interest using `samtools view -b input.bam chrX:20000000-40000000 > output.bam`.
<br>
Please write up your analysis pipeline using <a href="https://guides.github.com/features/mastering-markdown/">GitHub markdown</a> and use your GitHub repository to store your analysis scripts in your favorite language, e.g., bash scripts, <a href="https://snakemake.readthedocs.io/en/stable/">Snakemake</a> or <a href="https://www.nextflow.io/">Nextflow</a> pipelines, <a href="https://www.r-project.org/">R</a> or <a href="https://www.python.org/">Python</a> scripts.
Likewise feel free to check in a Makefile or a requirements file for <a href="https://conda.io/projects/conda/en/latest/user-guide/getting-started.html">Bioconda</a> if you use these to install tools.
At the very minimum the repository should contain the produced read-depth plot and a README.md file that explains the steps you have executed to generate the read-depth plot.
Once you are done please email me again the repository link, thanks!
<br>
**Optional**: Once you have successfully computed a read-depth plot you may also want to call structural variants and overlay these with the read-depth plot as arcs or points that indicate SV breakpoints.
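
The windowed read counting behind the read-depth plot in Exercise 2 could be sketched as below. This is an illustrative sketch only: the function names `window_counts` and `log2_ratio`, the 10 kbp window size and the pseudocount are assumptions, not values given on the course page. Read start positions are assumed to have been extracted from the BAM file beforehand, and normalisation for library size is omitted.

```python
import math


def window_counts(read_starts, region_start, region_end, window=10_000):
    """Count reads per fixed-size window over [region_start, region_end)."""
    n_windows = (region_end - region_start + window - 1) // window
    counts = [0] * n_windows
    for pos in read_starts:
        # Ignore reads that start outside the region of interest
        if region_start <= pos < region_end:
            counts[(pos - region_start) // window] += 1
    return counts


def log2_ratio(tumor_counts, normal_counts, pseudocount=1):
    """Per-window log2(tumor/normal); the pseudocount avoids division by zero."""
    return [
        math.log2((t + pseudocount) / (n + pseudocount))
        for t, n in zip(tumor_counts, normal_counts)
    ]


if __name__ == "__main__":
    # Toy positions inside the chrX:20000000-40000000 region of the exercise
    tumor = window_counts([20_000_000, 20_000_500, 20_010_000],
                          20_000_000, 40_000_000)
    normal = window_counts([20_000_100, 20_012_000],
                           20_000_000, 40_000_000)
    print(log2_ratio(tumor[:2], normal[:2]))
```

Plotting the per-window counts, or the tumor versus normal log2 ratio, against the window start coordinate gives the kind of read-depth profile the exercise asks for; the tools listed under "Tools to compute read counts in windows" produce such counts directly from BAM files.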


<h3>Exercise 3: Working with count matrices (due date 16th May 2024)</h3>
In this exercise we want to run a differential gene expression analysis using an RNA-Seq count matrix (<a href="https://gear-genomics.embl.de/data/.slides/sample.counts">sample.counts</a>).
The sample metadata is available here: <a href="https://gear-genomics.embl.de/data/.slides/sample.info">sample.info</a>.
Starting from an <a href="https://gear-genomics.embl.de/data/.slides/template.R">Rscript template</a> please run a differential expression analysis, generate PCA, heatmap and MA plots and export the results into a CSV file.
Once you are done please upload your Rscript to your GitHub repository and email me again the repository link, thanks!
<br>
**Optional**: You may also want to run a gene set enrichment analysis on the differentially expressed genes.

<h3>Exercise 4: Fill out the questionnaires (Google Forms, due date 16th May 2024)</h3>

To be sent via email.

<h3>Useful links</h3>

Below are a couple of links to commonly used Bioinformatics tools in Cancer Genomics (certainly not comprehensive).
Below are a couple of links to commonly used Bioinformatics tools in population and cancer genomics (certainly not comprehensive).
<br>
Next-generation sequencing analysis tutorials
<ul>
<li><a href="https://github.com/ekg/alignment-and-variant-calling-tutorial">Alignment and variant calling</a></li>
<li><a href="https://github.com/tobiasrausch/vc">Structural variant calling tutorial</a></li>
<li><a href="https://github.com/tobiasrausch/vc">Short-read structural variant calling tutorial</a></li>
<li><a href="https://github.com/tobiasrausch/sv">Long-read structural variant calling tutorial</a></li>
</ul>
Commonly used alignment tools
<ul>
<li><a href="https://github.com/lh3/bwa">BWA</a></li>
<li><a href="https://github.com/lh3/minimap2">Minimap2</a></li>
<li><a href="http://bowtie-bio.sourceforge.net/bowtie2/index.shtml">Bowtie2</a></li>
</ul>
Tools for working with alignment files (BAM files)
@@ -83,14 +40,20 @@ <h3>Useful links</h3>
<li><a href="https://github.com/samtools/samtools">SAMtools</a></li>
<li><a href="https://github.com/arq5x/bedtools2">bedtools</a></li>
</ul>
Tools for working with variant files (VCF/BCF files)
<ul>
<li><a href="https://github.com/samtools/bcftools">BCFtools</a></li>
<li><a href="https://brentp.github.io/cyvcf2/">cyvcf2 python library</a></li>
<li><a href="https://github.com/samtools/htslib">HTSlib</a></li>
</ul>
Tools to compute read counts in windows
<ul>
<li><a href="https://github.com/brentp/mosdepth">mosdepth</a></li>
<li><a href="https://github.com/tobiasrausch/alfred">alfred</a></li>
<li><a href="https://github.com/samtools/samtools">SAMtools</a></li>
<li><a href="https://github.com/dellytools/delly">delly</a></li>
</ul>
Tools for short variant calling, i.e., point mutations (SNVs) and short insertions and deletions (InDels)
Tools for variant calling, i.e., point mutations (SNVs) and short insertions and deletions (InDels)
<ul>
<li><a href="https://github.com/Illumina/strelka">Strelka</a></li>
<li><a href="https://github.com/freebayes/freebayes">FreeBayes</a></li>
@@ -100,11 +63,6 @@ <h3>Useful links</h3>
<li><a href="https://github.com/dellytools/delly">delly</a></li>
<li><a href="https://github.com/arq5x/lumpy-sv">lumpy</a></li>
</ul>
Tools for working with variant call files (VCF/BCF)
<ul>
<li><a href="https://github.com/samtools/htslib">HTSlib</a></li>
<li><a href="https://github.com/samtools/bcftools">BCFtools</a></li>
</ul>
Working with count matrices
<ul>
<li><a href="http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html">DESeq2 Tutorial</a></li>