Basic pipeline using Nextflow to automate bulk RNAseq data preprocessing

aa9gj/bulk-rnaseq-nf

Introduction

bulk-rnaseq-nf is a bioinformatics pipeline that can be used to analyse RNA sequencing data. It takes a samplesheet and FASTQ files as input, performs lane concatenation, quality control (QC), trimming, alignment, assembly, quantification, and prepares data for input into packages (e.g. DESeq2) for differential expression analysis.

  1. Lane concatenation for samples sequenced across multiple lanes
  2. Adapter trimming and read QC (Trim Galore!)
  3. HISAT2 index generation if a prebuilt index is not available
  4. HISAT2 alignment
  5. Sort and index alignments (SAMtools)
  6. Transcript assembly and quantification (StringTie)
  7. Present QC for raw reads, alignment, gene biotype, sample similarity, and strand-specificity checks (MultiQC, R)
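To make step 1 concrete, here is a minimal Python sketch of lane concatenation. It is not the pipeline's own implementation. It assumes Illumina-style file names of the form `<sample>_L<lane>_R<read>_001.fastq.gz`, and the function names `group_by_sample` and `concatenate_lanes` are hypothetical. It relies on the fact that gzip files can be joined byte-for-byte into a valid multi-member gzip file.

```python
import re
import shutil
from collections import defaultdict
from pathlib import Path

# Assumed naming scheme (not confirmed by the pipeline):
# <sample>_L<lane>_R<read>_001.fastq.gz, e.g. s1_L001_R1_001.fastq.gz
LANE_PATTERN = re.compile(
    r"^(?P<sample>.+)_L(?P<lane>\d{3})_R(?P<read>[12])_001\.fastq\.gz$"
)


def group_by_sample(fastq_dir):
    """Map (sample, read) to that sample's lane files, sorted by lane number."""
    groups = defaultdict(list)
    for path in Path(fastq_dir).glob("*.fastq.gz"):
        match = LANE_PATTERN.match(path.name)
        if match is None:
            continue  # skip files that do not follow the assumed naming scheme
        groups[(match["sample"], match["read"])].append((match["lane"], path))
    return {key: [p for _, p in sorted(files)] for key, files in groups.items()}


def concatenate_lanes(fastq_dir, out_dir):
    """Write one merged FASTQ per sample and read direction; return the outputs."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    merged = {}
    for (sample, read), lane_files in group_by_sample(fastq_dir).items():
        target = out_dir / f"{sample}_R{read}.fastq.gz"
        with open(target, "wb") as out_handle:
            for lane_file in lane_files:
                with open(lane_file, "rb") as in_handle:
                    # Appending gzip members byte-for-byte yields a valid gzip file
                    shutil.copyfileobj(in_handle, out_handle)
        merged[(sample, read)] = target
    return merged
```

Calling `concatenate_lanes("raw_fastq", "merged_fastq")` (both directory names are placeholders) would produce one `<sample>_R1.fastq.gz` and, for paired-end data, one `<sample>_R2.fastq.gz` per sample. Lanes are merged in lane order so that R1 and R2 records stay in sync.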

Pipeline structure

The repository is organised by pipeline stage, with configuration kept separate from the workflow code:

  • conf/: Configuration files related to the project.
  • modules/: Contains sub-modules, each serving specific roles like preprocessing, alignment, and transcript assembly.
    • preprocess/: Preprocessing scripts, such as lane concatenation and trimming of FASTQ files.
    • align/: Contains scripts for indexing and alignment using HISAT2.
    • transcript_assembly/: Scripts for transcript assembly and quantification using StringTie.
    • qc/: Quality control scripts.
  • workflows/: Main pipeline scripts.
  • bin/: Directory for helper scripts.
  • params.yaml: Configuration file specifying input parameters.
  • nextflow.config: Pipeline-wide configuration settings.

Pipeline usage

nextflow run workflows/main.nf -params-file params.yaml
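If the pipeline is launched from a script rather than an interactive shell, the command above can be wrapped as in the following sketch. The helper names `build_command` and `run_pipeline` are hypothetical and not part of this repository. `-params-file` and `-resume` are standard `nextflow run` options. The sketch assumes it is run from the repository root with `nextflow` on `PATH`.

```python
import shutil
import subprocess


def build_command(params_file="params.yaml", resume=False):
    """Return the nextflow invocation as an argument list."""
    command = ["nextflow", "run", "workflows/main.nf", "-params-file", params_file]
    if resume:
        # -resume reuses cached task results from a previous run
        command.append("-resume")
    return command


def run_pipeline(params_file="params.yaml", resume=False):
    """Run the pipeline and return its exit code."""
    if shutil.which("nextflow") is None:
        raise RuntimeError("nextflow was not found on PATH")
    return subprocess.run(build_command(params_file, resume)).returncode
```

Passing `resume=True` is useful after a failed or interrupted run, since Nextflow then skips tasks whose cached results are still valid.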
