
Create Count Table

One of the most common applications of RNA-Seq is estimating gene and transcript expression, a required step before differential expression analysis. Quantification starts with the alignment or mapping of reads, for which there are two alternatives: mapping to the genome when a reference sequence is available (RNA-Seq Alignment), or mapping to the transcriptome (RNA-Seq de novo Assembly). After mapping, read quantification is performed. The simplest approach is to aggregate raw counts of mapped reads according to gene coordinates. Other cases require more sophisticated algorithms capable of allocating multi-mapping reads among similar transcripts (isoforms).
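The simplest approach can be illustrated with a minimal sketch in plain Python. It is not the implementation used by the tool: the function name `count_reads_per_gene`, the tuple layouts, and the example coordinates are all hypothetical. It also assumes that reads with more than one alignment, and reads overlapping more than one gene, are simply skipped.

```python
from collections import Counter


def count_reads_per_gene(alignments, genes):
    """Count uniquely aligned reads per gene by coordinate overlap.

    alignments: list of (read_id, chrom, start, end), 1-based inclusive.
    genes: dict mapping gene_id -> (chrom, start, end), 1-based inclusive.
    Both layouts are illustrative assumptions, not a real file format.
    """
    # A read that appears in more than one alignment record is multi-mapped.
    hits_per_read = Counter(read_id for read_id, _, _, _ in alignments)
    counts = {gene_id: 0 for gene_id in genes}

    for read_id, chrom, start, end in alignments:
        if hits_per_read[read_id] > 1:
            continue  # assumption: multi-mapped reads are not counted

        # Two intervals overlap if each one starts before the other ends.
        overlapping = [
            gene_id
            for gene_id, (g_chrom, g_start, g_end) in genes.items()
            if g_chrom == chrom and start <= g_end and end >= g_start
        ]
        if len(overlapping) == 1:  # assumption: ambiguous reads are skipped
            counts[overlapping[0]] += 1

    return counts


# Hypothetical toy data: two genes and four reads on one chromosome.
example_genes = {"geneA": ("chr1", 100, 200), "geneB": ("chr1", 300, 400)}
example_alignments = [
    ("r1", "chr1", 110, 160),  # inside geneA
    ("r2", "chr1", 190, 240),  # overlaps the end of geneA
    ("r3", "chr1", 310, 360),  # r3 aligns twice, so it is multi-mapped
    ("r3", "chr1", 120, 170),
    ("r4", "chr1", 250, 290),  # falls between the two genes
]
example_counts = count_reads_per_gene(example_alignments, example_genes)
```

In practice the alignments would come from SAM/BAM files and the gene coordinates from an annotation file, and a real counter would use an interval index rather than scanning every gene for every read.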

This functionality can be found under transcriptomics → RNA-Seq Read Quantification.

Two strategies are available:

  • Gene-level Quantification: The gene-level quantification tool aggregates raw counts of mapped reads (SAM/BAM files) using a gene annotation file (GTF/GFF) containing the genome coordinates of exons and genes. This approach discards reads that align to multiple locations (multi-mapped reads). It runs locally.
  • Transcript-level Quantification: The transcript-level quantification tool estimates gene and isoform expression from RNA-Seq reads (FASTQ). It is based on the RSEM software package, which allocates multi-mapping reads among transcripts using an expectation-maximization approach. This tool requires a set of reference transcript sequences (FASTA), such as one produced by a de novo transcriptome assembler. It is executed via the BioBam Bioinformatics Cloud Platform.
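
The idea of allocating multi-mapping reads with expectation-maximization can be sketched as follows. This is a deliberately simplified illustration, not RSEM's actual model: it ignores transcript length and any other read-level information, and the function name `em_allocate` and the input layout are hypothetical.

```python
def em_allocate(compatibility, n_transcripts, n_iter=100):
    """Allocate reads among transcripts with a basic EM loop.

    compatibility: one list per read, holding the indices of the
    transcripts that read maps to (a hypothetical input layout).
    Returns the expected number of reads assigned to each transcript.
    """
    n_reads = len(compatibility)
    # Start from equal relative abundances for all transcripts.
    theta = [1.0 / n_transcripts] * n_transcripts
    expected = [0.0] * n_transcripts

    for _ in range(n_iter):
        # E-step: split each read among its compatible transcripts in
        # proportion to the current abundance estimates.
        expected = [0.0] * n_transcripts
        for transcripts in compatibility:
            total = sum(theta[t] for t in transcripts)
            for t in transcripts:
                expected[t] += theta[t] / total
        # M-step: re-estimate abundances from the fractional counts.
        theta = [count / n_reads for count in expected]

    return expected


# Hypothetical toy data: 3 reads unique to transcript 0, 1 read unique to
# transcript 1, and 4 reads that map equally well to both isoforms.
example_reads = [[0]] * 3 + [[1]] + [[0, 1]] * 4
example_expected = em_allocate(example_reads, n_transcripts=2)
```

With this toy input, the ambiguous reads are drawn toward the transcript that has more uniquely mapped reads, instead of being discarded or split evenly.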