Lecture 6: Differential Gene Expression Analysis (07/01/2024)
What we will learn
Principles of Differential Gene Expression (DE) Analysis
DE analysis identifies statistically significant changes in gene expression between conditions, and it is a commonly performed analysis on NGS data
This requires at least two data inputs: Gene expression data and Metadata
Various challenges associated with RNA-Seq data and experimental design need to be dealt with in DE analysis
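As a rough illustration of the "two data inputs" point above, the sketch below pairs a toy count table with toy sample metadata and checks that every sample appears in both. It is plain Python rather than DESeq2 code; the gene names, sample names, numbers, and the helper `check_inputs` are all hypothetical and exist only for this example.

```python
# Hypothetical toy inputs: all names and numbers are made up for illustration.
# Gene expression data: one entry per gene, one count per sample.
counts = {
    "geneA": {"sample1": 120, "sample2": 98, "sample3": 310, "sample4": 287},
    "geneB": {"sample1": 15, "sample2": 22, "sample3": 9, "sample4": 14},
}

# Metadata: one entry per sample, describing the experimental variables.
metadata = {
    "sample1": {"condition": "control", "sex": "F"},
    "sample2": {"condition": "control", "sex": "M"},
    "sample3": {"condition": "treated", "sex": "F"},
    "sample4": {"condition": "treated", "sex": "M"},
}


def check_inputs(counts, metadata):
    """Return a list of mismatches between the count table and the metadata."""
    problems = []
    meta_samples = set(metadata)
    for gene, per_sample in counts.items():
        missing = meta_samples - set(per_sample)   # in metadata, no count
        extra = set(per_sample) - meta_samples     # counted, no metadata
        if missing:
            problems.append(f"{gene}: no count for {sorted(missing)}")
        if extra:
            problems.append(f"{gene}: samples without metadata {sorted(extra)}")
    return problems


problems = check_inputs(counts, metadata)
print(problems if problems else "inputs are consistent")
```

A check of this kind is worth running before any DE tool, since a sample that is present in only one of the two inputs cannot be assigned to a condition.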
DESeq2 package & Basic Commands
A low number of replicates, non-normally distributed counts, overdispersed variance, and complex experimental designs all pose challenges in finding differentially expressed genes (DEGs)
DESeq2 (and similar tools) is a powerful all-in-one method for detecting DEGs in RNA-seq data
Raw counts for unique gene names/IDs should be used as input to DESeq2; normalized counts (TPM, FPKM/RPKM) must be avoided
The greatest statistical power and the most reliable results are achieved by giving the model (glm) the full experimental design, containing all known and relevant information for a given group of samples, e.g. sex, genotype, etc.
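Two of the points above, raw counts as input and overdispersed variance, can be made concrete with a small sketch. This is plain Python, not DESeq2 code, and the replicate counts are made up. The helper names `looks_like_raw_counts` and `moment_dispersion` are hypothetical. The dispersion estimate is a crude method-of-moments calculation based on the negative binomial relationship variance = mean + dispersion × mean², which is a simplification and not how DESeq2 itself estimates dispersion.

```python
from statistics import mean, variance

# Hypothetical replicate counts for one condition (made-up numbers).
replicates = {
    "geneA": [95, 130, 160],
    "geneB": [8, 10, 12],
}


def looks_like_raw_counts(values):
    """True if every value is a non-negative whole number.

    Fractional values usually indicate normalized data (e.g. TPM or FPKM).
    """
    return all(v >= 0 and float(v).is_integer() for v in values)


def moment_dispersion(values):
    """Crude dispersion estimate: (variance - mean) / mean^2, floored at 0.

    A value above 0 means the variance exceeds the mean, i.e. the counts
    are overdispersed relative to a Poisson model.
    """
    m = mean(values)
    v = variance(values)  # sample variance across replicates
    if m == 0:
        return 0.0
    return max(0.0, (v - m) / (m * m))


for gene, values in replicates.items():
    print(gene,
          "raw counts:", looks_like_raw_counts(values),
          "mean:", round(mean(values), 2),
          "variance:", round(variance(values), 2),
          "dispersion:", round(moment_dispersion(values), 4))
```

With only three replicates, a per-gene estimate like this is very noisy, which is one reason the low-replicate setting is listed among the challenges.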
What we will practice
Using publicly available RNA-seq data (see overview here; the same dataset used in Lecture 5), we will perform each step of a differential gene expression analysis.