Lecture 6: Differential Gene Expression Analysis (07/01/2024)


What we will learn

Principles of Differential Gene Expression (DE) Analysis

  • DE analysis identifies statistically significant changes in gene expression between conditions and is a commonly performed analysis on NGS data

  • It requires at least two inputs: gene expression data (a count matrix) and sample metadata

  • Various challenges associated with RNA-seq data and experimental design need to be dealt with in DE analysis
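The two inputs listed above have to agree with each other: every sample column in the count matrix needs a matching metadata entry, and the counts themselves should be non-negative integers. A minimal sketch of such a consistency check is shown below. The function name, sample IDs, and toy values are all hypothetical and are not taken from the lecture dataset.

```python
# Illustrative consistency check between a count matrix and sample metadata.
# All names and values here are made up for the example.

def validate_inputs(count_columns, count_rows, metadata):
    """Return a list of problems; an empty list means the inputs agree."""
    problems = []
    if len(set(count_columns)) != len(count_columns):
        problems.append("duplicate sample IDs in count matrix")
    # Every sample column needs a metadata entry, and vice versa.
    missing = sorted(set(count_columns) - set(metadata))
    if missing:
        problems.append(f"samples without metadata: {missing}")
    unused = sorted(set(metadata) - set(count_columns))
    if unused:
        problems.append(f"metadata without counts: {unused}")
    # Each gene row must have one non-negative integer per sample.
    for gene, row in count_rows.items():
        if len(row) != len(count_columns):
            problems.append(
                f"{gene}: expected {len(count_columns)} values, got {len(row)}"
            )
        elif any((not isinstance(v, int)) or v < 0 for v in row):
            problems.append(f"{gene}: counts must be non-negative integers")
    return problems


# Toy inputs: two control and two treated samples, two genes.
sample_ids = ["ctrl_1", "ctrl_2", "treat_1", "treat_2"]
count_rows = {
    "geneA": [10, 12, 30, 28],
    "geneB": [0, 1, 0, 2],
}
metadata = {
    "ctrl_1": {"condition": "control"},
    "ctrl_2": {"condition": "control"},
    "treat_1": {"condition": "treated"},
    "treat_2": {"condition": "treated"},
}

print(validate_inputs(sample_ids, count_rows, metadata))
```

Running a check like this before model fitting catches mislabeled or missing samples early, when they are still cheap to fix.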

DESeq2 package & Basic Commands

  • A low number of replicates, non-normally distributed counts, overdispersion (variance larger than the mean), and complex experimental designs all pose challenges in finding differentially expressed genes (DEGs)

  • DESeq2 (and similar tools) is a powerful all-in-one package for detecting DEGs in RNA-seq data

  • Raw counts, with one row per unique gene name/ID, should be used as input to DESeq2; normalized values (TPM, FPKM/RPKM) must be avoided because DESeq2 performs its own normalization

  • The most statistical power and the most reliable results are achieved by giving the model (a GLM) the full experimental design, including all known and relevant information about the samples, e.g. sex, genotype, etc.
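The raw-count requirement in the list above is easier to see with a small example. DESeq2 corrects for sequencing depth internally using median-of-ratios size factors, so it needs unnormalized counts as its starting point. The sketch below reimplements that idea in plain Python for illustration only. It is not DESeq2's code (DESeq2 is an R package), and the function name and toy counts are hypothetical.

```python
import math
from statistics import median

# Illustrative median-of-ratios size factors (the idea behind DESeq2's
# normalization), not DESeq2's actual implementation.

def size_factors(counts):
    """counts maps gene -> list of raw counts, one per sample."""
    n_samples = len(next(iter(counts.values())))
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts.values():
        # Genes with a zero count have no finite log geometric mean; skip them.
        if any(c <= 0 for c in row):
            continue
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    if not log_ratios[0]:
        raise ValueError("no gene has non-zero counts in every sample")
    # Size factor per sample = median ratio to the per-gene geometric mean.
    return [math.exp(median(r)) for r in log_ratios]


# Toy data: sample 2 was sequenced twice as deeply as sample 1.
toy_counts = {
    "geneA": [10, 20],
    "geneB": [100, 200],
    "geneC": [50, 100],
    "geneD": [0, 3],  # skipped because of the zero count
}

sf = size_factors(toy_counts)
normalized = {g: [c / s for c, s in zip(row, sf)] for g, row in toy_counts.items()}
print(sf)
print(normalized["geneA"])
```

In this toy case the second size factor is twice the first, so dividing by the size factors puts geneA, geneB, and geneC on the same scale in both samples. Feeding already-normalized values such as TPM or FPKM into this step would distort the counts that the statistical model expects.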

What we will practice

Using a publicly available RNA-seq dataset (see overview here; the same one used in Lecture 5), we will perform each step of the differential gene expression analysis.