This repository documents a de novo RNA-seq transcriptome analysis pipeline using Trinity, RSEM, edgeR, TransDecoder, and BLAST+.
It is designed for samples from tumor-free and tumor-bearing Drosophila melanogaster larvae.
- Organism: Drosophila melanogaster
- Conditions:
  - Control (tumor-free larvae)
  - Treated (tumor-bearing larvae)
- Input Data: Adapter-trimmed paired-end FASTQ files
Make sure the following tools are installed before running the pipeline:
- Trinity
- RSEM
- Bowtie
- edgeR (R package)
- TransDecoder
- BLAST+
- samtools
- assembly-stats
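
As a quick pre-flight check, the sketch below looks for the tools above on `PATH`. The executable names in `REQUIRED_TOOLS` are assumptions about a typical installation (for example, `rsem-calculate-expression` for RSEM and `Rscript` for R); adjust them to match your setup. Finding `Rscript` only shows that R is present, not that the edgeR package is installed.

```python
import shutil

# Assumed executable names for a typical installation; edit as needed.
REQUIRED_TOOLS = [
    "Trinity",
    "rsem-calculate-expression",
    "bowtie",
    "Rscript",                # R itself; edgeR must be checked inside R
    "TransDecoder.LongOrfs",
    "blastp",
    "samtools",
    "assembly-stats",
]


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

The `which` parameter exists only so the lookup can be swapped out when testing.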
- Concatenate FASTQ samples
- Convert FASTQ → FASTA
- Perform de novo assembly with Trinity
- Calculate assembly statistics (`assembly-stats`)
- Estimate transcript abundance with RSEM
- Generate count matrix (`abundance_estimates_to_matrix.pl`)
- Perform differential expression with edgeR
- Predict coding regions with TransDecoder
- Functional annotation with BLASTp (against UniProt)
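
The steps from assembly onward could be laid out roughly as in the sketch below, which only builds the command strings and runs nothing. It is a dry-run outline under stated assumptions, not the exact commands used in this repository. The function `build_commands`, the directory names (`trinity_out`, `rsem_<sample>`, `edgeR_out`), the `samples.txt` file, the memory and thread settings, and the E-value cutoff are all hypothetical. The sketch also assumes that reads have already been concatenated and converted to FASTA, and that the Trinity helper scripts live under a `trinity_home` directory.

```python
import shlex


def build_commands(trinity_home, left_fa, right_fa, samples, uniprot_fasta, cpu=4):
    """Return the pipeline steps as shell command strings; nothing is executed.

    `samples` maps a sample name to its (left_fastq, right_fastq) pair.
    """
    # Assumed assembly location; the exact path depends on the Trinity release.
    assembly = "trinity_out/Trinity.fasta"
    # TransDecoder names its outputs after the input file.
    pep = "Trinity.fasta.transdecoder.pep"
    cmds = []

    # De novo assembly from the concatenated, FASTA-converted reads.
    cmds.append(shlex.join([
        "Trinity", "--seqType", "fa", "--left", left_fa, "--right", right_fa,
        "--CPU", str(cpu), "--max_memory", "20G", "--output", "trinity_out",
    ]))

    # assembly-stats prints to stdout, so the report is redirected to a file.
    cmds.append(shlex.join(["assembly-stats", assembly]) + " > Assembly_Statistics.txt")

    # Per-sample abundance estimation with RSEM on Bowtie alignments.
    rsem_results = []
    for name, (left_fq, right_fq) in samples.items():
        cmds.append(shlex.join([
            f"{trinity_home}/util/align_and_estimate_abundance.pl",
            "--transcripts", assembly, "--seqType", "fq",
            "--left", left_fq, "--right", right_fq,
            "--est_method", "RSEM", "--aln_method", "bowtie",
            "--trinity_mode", "--prep_reference",
            "--output_dir", f"rsem_{name}",
        ]))
        rsem_results.append(f"rsem_{name}/RSEM.isoforms.results")

    # Merge per-sample estimates into abundance_count.isoform.counts.matrix.
    cmds.append(shlex.join([
        f"{trinity_home}/util/abundance_estimates_to_matrix.pl",
        "--est_method", "RSEM", "--gene_trans_map", "none",
        "--out_prefix", "abundance_count", "--name_sample_by_basedir",
        *rsem_results,
    ]))

    # Differential expression with edgeR via Trinity's wrapper script.
    cmds.append(shlex.join([
        f"{trinity_home}/Analysis/DifferentialExpression/run_DE_analysis.pl",
        "--matrix", "abundance_count.isoform.counts.matrix",
        "--method", "edgeR", "--samples_file", "samples.txt",
        "--output", "edgeR_out",
    ]))

    # Coding-region prediction.
    cmds.append(shlex.join(["TransDecoder.LongOrfs", "-t", assembly]))
    cmds.append(shlex.join(["TransDecoder.Predict", "-t", assembly]))

    # Functional annotation against a local UniProt protein database.
    cmds.append(shlex.join(["makeblastdb", "-in", uniprot_fasta, "-dbtype", "prot"]))
    cmds.append(shlex.join([
        "blastp", "-query", pep, "-db", uniprot_fasta,
        "-outfmt", "6", "-evalue", "1e-5", "-max_target_seqs", "1",
        "-num_threads", str(cpu), "-out", "result.txt",
    ]))
    return cmds


if __name__ == "__main__":
    plan = build_commands(
        "/path/to/trinity", "all_1.fa", "all_2.fa",
        {"control": ("control_1.fq", "control_2.fq"),
         "treated": ("treated_1.fq", "treated_2.fq")},
        "uniprot_sprot.fasta",
    )
    print("\n".join(plan))
```

Printing the plan first makes it easy to review paths and options before running anything.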
- Assembly: `Trinity.fasta`, `Assembly_Statistics.txt`
- Abundance Estimates: RSEM output directories
- Count Matrix: `abundance_count.isoform.counts.matrix`
- Differential Expression: edgeR results (`DE_results.txt`)
- Predicted ORFs: TransDecoder `.pep` and `.cds` files
- Functional Annotation: BLAST results (`result.txt`)
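
To summarize the annotation output, one option is to keep the best-scoring hit per query. The sketch below assumes `result.txt` is in BLAST's standard 12-column tabular format (`-outfmt 6`), which this README does not confirm. The function name `best_hits` and the E-value cutoff are illustrative choices.

```python
import csv


def best_hits(handle, max_evalue=1e-5):
    """Map each query ID to (subject ID, E-value, bit score) of its best hit.

    Expects tab-separated rows in the standard outfmt 6 column order:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send
    evalue bitscore.
    """
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank, truncated, or comment lines.
        if len(row) < 12 or row[0].startswith("#"):
            continue
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue > max_evalue:
            continue
        # Keep the hit with the highest bit score for each query.
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


if __name__ == "__main__":
    with open("result.txt", newline="") as handle:
        for query, (subject, evalue, bitscore) in best_hits(handle).items():
            print(query, subject, evalue, bitscore, sep="\t")
```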
Devraj Pokhrel
B.Sc. Biotechnology