This tutorial introduces a beginner-friendly, copy-paste-ready RNA-seq workflow optimized for personal PCs using: FastQC -> fastp -> STAR -> samtools -> featureCounts -> MultiQC. The executable script supports Single-End and Paired-End FASTQ.gz inputs and produces BAMs, gene counts, and a MultiQC report.
bash rnaseq_pipeline.sh ...
All workflow files are hosted on GitHub for download. You can choose one of the two options below.
On GitHub, click the Code button and choose Download ZIP. This will download the complete package.
If you only want the main script, download or copy rnaseq_pipeline.sh from the repository
and follow the step-by-step instructions on this page.
This workflow has been successfully tested on local personal PCs, but it is still under active development and may require minor adjustments in some environments. If you encounter errors or issues, please report them to: rsat2026@gmail.com. Your input helps improve the tool.
Project location (important): Place your RNA-seq project (including reference/STAR_index) on a local disk under the system root or your home directory. Do not use shared, synced, or network folders (e.g., OneDrive, Dropbox, Google Drive, or Windows /mnt/c/... paths in WSL2), as they can be slow and cause file errors.
Recommended locations
/home/<user>/my_rnaseq_project (inside the Linux root filesystem)
~/my_rnaseq_project (local home directory)
Avoid: shared or mounted paths like /mnt/c/.../OneDrive/...
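The location advice above can be checked before any data is copied. The following is a minimal sketch and is not part of rnaseq_pipeline.sh; the function name is_local_project_path and the list of folder patterns are assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of rnaseq_pipeline.sh): flag project paths that
# look like Windows drives mounted in WSL2 or cloud-synced folders.
is_local_project_path() {
  case "$1" in
    /mnt/[a-z]|/mnt/[a-z]/*) return 1 ;;                  # Windows drives in WSL2
    *OneDrive*|*Dropbox*|*"Google Drive"*|*GoogleDrive*) return 1 ;;  # synced folders
    *) return 0 ;;
  esac
}

project_dir="$HOME/my_rnaseq_project"
if is_local_project_path "$project_dir"; then
  echo "OK: $project_dir looks like a local path"
else
  echo "WARNING: move the project to a local folder such as ~/my_rnaseq_project"
fi
```

The check only inspects the path string, so it cannot detect every network or synced location; treat a passing result as a hint rather than a guarantee.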
Keep your STAR index inside the main project folder, under reference/ (as shown here).
my_rnaseq_project/
  rnaseq_pipeline.sh
  fastqs/                  # input FASTQ.gz files here
    sample1_R1.fastq.gz
    sample1_R2.fastq.gz
    ...
  reference/
    GRCh38.primary_assembly.genome.fa.gz
    gencode.v44.primary_assembly.annotation.gtf.gz
    GRCh38.primary_assembly.genome.fa
    gencode.v44.primary_assembly.annotation.gtf
    STAR_index/            # STAR index folder (GENOME_DIR)
  rna_output/              # pipeline output (OUTPUT_DIR), generated automatically by the run
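The script accepts both Single-End and Paired-End inputs. As a rough illustration of how files in fastqs/ could be grouped into samples, here is a minimal sketch that assumes the _R1/_R2 naming used in the layout. The helper list_samples is hypothetical, and the detection rules inside rnaseq_pipeline.sh may differ.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: classify samples in a FASTQ folder as paired-end (PE)
# or single-end (SE), assuming files are named <sample>_R1.fastq.gz and
# <sample>_R2.fastq.gz. An R2 file without a matching R1 is ignored here.
list_samples() {
  local dir="$1" f base
  for f in "$dir"/*.fastq.gz; do
    [ -e "$f" ] || continue            # nothing matched: the glob stays literal
    base="$(basename "$f" .fastq.gz)"
    case "$base" in
      *_R2) ;;                         # mates are reported together with R1
      *_R1)
        if [ -e "$dir/${base%_R1}_R2.fastq.gz" ]; then
          printf '%s\tPE\n' "${base%_R1}"
        else
          printf '%s\tSE\n' "${base%_R1}"
        fi ;;
      *) printf '%s\tSE\n' "$base" ;;
    esac
  done
}

list_samples fastqs
```

Running it from the project directory prints one line per sample, which makes it easy to spot a missing mate file before a long run.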
Run these commands in Windows Command Prompt (CMD) or PowerShell:
wsl --install -d Ubuntu
Reboot the computer, then launch Ubuntu from the Start menu and continue the workflow inside the Ubuntu terminal.
1) Install Miniforge (Conda) and initialize it:
cd ~
curl -L -o Miniforge3.sh https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh
bash Miniforge3.sh -b -p "$HOME/miniforge3"
echo 'export PATH="$HOME/miniforge3/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc
conda init
2) Restart the terminal. Install Mamba into the base environment:
conda install -n base -c conda-forge mamba -y
1) Install Miniforge (Conda) on macOS:
cd ~
# Apple Silicon (M1/M2/M3):
curl -L -o Miniforge3.sh https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-arm64.sh
# Intel Mac (older):
# curl -L -o Miniforge3.sh https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-x86_64.sh
bash Miniforge3.sh -b -p "$HOME/miniforge3"
echo 'export PATH="$HOME/miniforge3/bin:$PATH"' >> ~/.zshrc
source ~/.zshrc
2) Install Mamba:
conda install -n base -c conda-forge mamba -y
Important (macOS only): the script needs Bash 4+. Install a modern bash:
brew install bash gawk
Create an isolated environment and install all tools:
mamba create -n rnaseq -c conda-forge -c bioconda \
fastqc multiqc fastp star samtools subread python pigz wget -y
mamba activate rnaseq
Verify installs:
fastqc --version
multiqc --version
fastp --version
STAR --version
samtools --version
featureCounts -v
python --version
pigz --version
wget --version
Create the project folders:
cd ~
mkdir -p my_rnaseq_project/{fastqs,reference}
cd my_rnaseq_project/reference
Copy the commands below to download the reference genome and annotation:
wget -O gencode.v44.primary_assembly.annotation.gtf.gz \
https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_44/gencode.v44.primary_assembly.annotation.gtf.gz
wget -O GRCh38.primary_assembly.genome.fa.gz \
https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_44/GRCh38.primary_assembly.genome.fa.gz
Unzip both gz files using pigz:
pigz -dk gencode.v44.primary_assembly.annotation.gtf.gz
pigz -dk GRCh38.primary_assembly.genome.fa.gz
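Before moving on to the index build, it can help to confirm that both decompressed files are present and not empty. This is a minimal sketch, not part of the workflow; check_reference is a hypothetical helper, and it only tests for existence and non-zero size, not for file integrity.

```shell
#!/usr/bin/env bash
# Hypothetical check: confirm the decompressed FASTA and GTF exist and are
# non-empty in the given reference directory.
check_reference() {
  local ref_dir="$1" f status=0
  for f in GRCh38.primary_assembly.genome.fa \
           gencode.v44.primary_assembly.annotation.gtf; do
    if [ -s "$ref_dir/$f" ]; then
      echo "found: $f"
    else
      echo "missing or empty: $f"
      status=1
    fi
  done
  return "$status"
}

# Run from inside my_rnaseq_project/reference.
check_reference . || echo "Re-run the download or pigz step before building the index."
```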
From this point onward, run all commands, including the STAR index build, from inside your project directory (cd ~/my_rnaseq_project).
Create the STAR index folder directly under reference/:
cd ~/my_rnaseq_project
mkdir -p reference/STAR_index
Build the STAR index (standard build):
STAR \
--runMode genomeGenerate \
--runThreadN 8 \
--genomeDir reference/STAR_index \
--genomeFastaFiles reference/GRCh38.primary_assembly.genome.fa \
--sjdbGTFfile reference/gencode.v44.primary_assembly.annotation.gtf \
--sjdbOverhang 100 \
--genomeSAindexNbases 14 \
--limitGenomeGenerateRAM 16000000000
Low-RAM version:
STAR \
--runMode genomeGenerate \
--runThreadN 4 \
--genomeDir reference/STAR_index \
--genomeFastaFiles reference/GRCh38.primary_assembly.genome.fa \
--sjdbGTFfile reference/gencode.v44.primary_assembly.annotation.gtf \
--sjdbOverhang 100 \
--genomeSAindexNbases 12 \
--genomeSAsparseD 2 \
--genomeChrBinNbits 18 \
2>&1 | tee reference/star_genomeGenerate.lowram.log
STAR index generation is a critical step and may take hours. Try the standard build first; if memory errors occur, use the low-RAM version, which builds a more compact genome index and runs reliably on most personal PCs. The complete STAR index folder is roughly 15 to 20 GB for the human genome.
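To decide which build to try first, you can look at the machine's total RAM. The sketch below is illustrative only: suggest_star_build and total_ram_gb are hypothetical helpers, and the 16 GB cutoff is an assumption borrowed from the --limitGenomeGenerateRAM value in the standard build, not a tested threshold.

```shell
#!/usr/bin/env bash
# Hypothetical helper: suggest which STAR build to try, given total RAM in GB.
# The 16 GB cutoff is an assumption taken from --limitGenomeGenerateRAM above.
suggest_star_build() {
  local ram_gb="${1:-0}"
  if [ "$ram_gb" -ge 16 ]; then
    echo "standard"
  else
    echo "low-ram"
  fi
}

# Total RAM in whole GB: /proc/meminfo on Linux and WSL2, sysctl on macOS.
total_ram_gb() {
  if [ -r /proc/meminfo ]; then
    awk '/^MemTotal:/ {printf "%d\n", $2 / 1024 / 1024}' /proc/meminfo
  else
    sysctl -n hw.memsize | awk '{printf "%d\n", $1 / 1024 / 1024 / 1024}'
  fi
}

ram="$(total_ram_gb)"
ram="${ram:-0}"
echo "Detected about ${ram} GB RAM; suggested build: $(suggest_star_build "$ram")"
```

Reported RAM is usually slightly below the nominal size (a 16 GB machine often shows 15), so such machines are pointed to the low-RAM version by this sketch.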
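Place your FASTQ.gz files in fastqs/ and confirm they are listed:
ls fastqs/*.gz
Make the script executable and run it:
chmod +x rnaseq_pipeline.sh
bash rnaseq_pipeline.sh \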
-i fastqs \
-o rna_output \
-g reference/STAR_index \
-a reference/gencode.v44.primary_assembly.annotation.gtf \
-t 4 \
--stranded 0
On macOS, find your brew bash path:
which -a bash
Then run using the brew bash path (example shown):
/opt/homebrew/bin/bash rnaseq_pipeline.sh \
-i fastqs \
-o rna_output \
-g reference/STAR_index \
-a reference/gencode.v44.primary_assembly.annotation.gtf \
-t 4 \
--stranded 0
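Since the script needs Bash 4+, it may be worth confirming which bash binary meets that requirement before launching a long run. This is a minimal sketch; bash_major_ok is a hypothetical helper, and the candidate paths are the usual Homebrew locations on Apple Silicon and Intel Macs plus the system bash.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a BASH_VERSION string has major version >= 4.
bash_major_ok() {
  local major="${1%%.*}"               # text before the first dot
  [ "$major" -ge 4 ] 2>/dev/null
}

# Report the first candidate bash that is new enough.
for candidate in /opt/homebrew/bin/bash /usr/local/bin/bash /bin/bash; do
  [ -x "$candidate" ] || continue
  version="$("$candidate" -c 'echo "$BASH_VERSION"')"
  if bash_major_ok "$version"; then
    echo "Use: $candidate rnaseq_pipeline.sh ... (bash $version)"
    break
  fi
done
```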
Processing time varies by dataset and hardware; a paired-end sample may take 30 minutes or more. If the job freezes, is killed, or shows errors, this usually indicates insufficient RAM. To solve this, reduce the thread value (-t, e.g., use 2), close other programs, or run the workflow on a PC with more RAM. If unsure, copy the error message into an AI tool for help resolving the memory issue. The default threads are set conservatively for personal PCs, but you can increase -t (e.g., 8) on stronger machines for faster execution.
rna_output/multiqc/multiqc_report.html (open in browser)
rna_output/SUMMARY.tsv (per-sample summary)
rna_output/counts/combined_counts.txt (combined gene counts matrix)
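For a quick sanity check of the counts matrix, the sketch below reports how many genes and samples it contains. The file layout is an assumption, since it is not documented on this page: tab-separated, one header row, gene ID in the first column, one count column per sample. summarize_counts is a hypothetical helper; adjust it if your file has extra annotation columns.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count genes (rows) and samples (columns after the first)
# in a tab-separated counts matrix with a single header row (assumed layout).
summarize_counts() {
  awk -F'\t' '
    /^#/ { next }                                   # skip comment lines, if any
    !header_seen { samples = NF - 1; header_seen = 1; next }
    { genes++ }
    END { printf "genes=%d samples=%d\n", genes, samples }
  ' "$1"
}

counts_file="rna_output/counts/combined_counts.txt"
if [ -f "$counts_file" ]; then
  summarize_counts "$counts_file"
fi
```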