Aligns raw sequencing reads against the GRCh38 human reference genome. Produces a sorted, indexed BAM file.
Alignment maps each 150 bp sequencing read to its position in the human genome. This step is required for all downstream variant calling.
quay.io/biocontainers/minimap2:2.28--he4a0461_0 (minimap2 aligner)
staphb/samtools:1.20 (samtools sort + index)
Reference FASTA (Homo_sapiens_assembly38.fasta)
minimap2 index (.mmi file, ~7GB, generated once)

SAMPLE=your_sample
GENOME_DIR=/path/to/your/data
REF=${GENOME_DIR}/reference/Homo_sapiens_assembly38.fasta
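# Step 1: Create minimap2 index (one-time, ~30 min)
# Build with the same preset used for alignment (-x sr): k-mer and window size are fixed in the index
minimap2 -x sr -d "${GENOME_DIR}/reference/GRCh38.mmi" "$REF"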
# Step 2: Align + sort (1-2 hours for 30X WGS)
mkdir -p "${GENOME_DIR}/${SAMPLE}/aligned"  # output directory must exist before sorting
minimap2 -a -x sr -t 16 \
"${GENOME_DIR}/reference/GRCh38.mmi" \
"${GENOME_DIR}/${SAMPLE}/fastq/${SAMPLE}_R1.fastq.gz" \
"${GENOME_DIR}/${SAMPLE}/fastq/${SAMPLE}_R2.fastq.gz" \
| samtools sort -@ 8 -o "${GENOME_DIR}/${SAMPLE}/aligned/${SAMPLE}_sorted.bam" -
# Step 3: Index BAM
samtools index "${GENOME_DIR}/${SAMPLE}/aligned/${SAMPLE}_sorted.bam"
# Output: ~30-40GB BAM + ~9MB BAI index
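
Step 1 is described as a one-time operation, so a rerun of the script can reasonably skip it when a usable index is already on disk. The following is a minimal sketch of such a guard, not part of the original script: the function name index_is_current is hypothetical, and the rule it applies (index exists, is non-empty, and is not older than the reference FASTA) is an assumption about what "usable" should mean here.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original pipeline script).
# Succeeds (exit 0) when the index file exists, is non-empty,
# and is not older than the reference FASTA it was built from.
index_is_current() {
    local ref="$1" mmi="$2"
    [ -s "$mmi" ] && [ ! "$mmi" -ot "$ref" ]
}

# Usage sketch (commented out so the snippet has no side effects):
# if ! index_is_current "$REF" "${GENOME_DIR}/reference/GRCh38.mmi"; then
#     : # run the Step 1 indexing command here
# fi
```

The check relies only on file timestamps, so it will not notice an index that was built from a different reference with the same file name.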
-x sr selects the short-read preset, intended for Illumina short reads.
bwa-mem2 is equally valid, but minimap2 is faster (see scripts/02a-alignment-bwamem2.sh).
The index (.bai) must always accompany the BAM file.
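
Because downstream tools expect the index next to the BAM, it can help to verify the pair before moving or handing off the files. The sketch below is illustrative only: bam_has_index is a hypothetical helper, and it assumes the two common naming conventions (sample.bam.bai, as written by samtools index, and sample.bai). It also treats an index older than its BAM as missing, since that usually means the BAM was rewritten after indexing.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original pipeline script).
# Succeeds (exit 0) when a non-empty BAM has a non-empty index beside it
# that is not older than the BAM itself.
bam_has_index() {
    local bam="$1" bai
    [ -s "$bam" ] || return 1
    # Try both naming conventions: sample.bam.bai and sample.bai
    for bai in "${bam}.bai" "${bam%.bam}.bai"; do
        if [ -s "$bai" ] && [ ! "$bai" -ot "$bam" ]; then
            return 0
        fi
    done
    return 1
}

# Usage sketch (commented out so the snippet has no side effects):
# BAM="${GENOME_DIR}/${SAMPLE}/aligned/${SAMPLE}_sorted.bam"
# bam_has_index "$BAM" || samtools index "$BAM"
```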