Detects copy number variants (CNVs) using read-depth analysis — complementary to Manta’s paired-end/split-read approach. Especially effective for large CNVs (>1 kb) that Manta may miss.
Manta (step 4) detects SVs from discordant read pairs and split reads, which works well for balanced SVs (inversions, translocations) and smaller deletions/duplications. CNVnator uses only the read-depth signal, making it better suited to the large CNVs (>1 kb) that Manta may miss.
quay.io/biocontainers/cnvnator:0.4.1--py312h99c8fb2_11
# Step 1: Extract read mapping from BAM
docker run --rm \
--cpus 4 --memory 8g \
-v ${GENOME_DIR}:/genome \
quay.io/biocontainers/cnvnator:0.4.1--py312h99c8fb2_11 \
cnvnator \
-root /genome/${SAMPLE}/cnvnator/${SAMPLE}.root \
-tree /genome/${SAMPLE}/aligned/${SAMPLE}_sorted.bam
# Step 2: Generate read-depth histogram
docker run --rm \
--cpus 4 --memory 8g \
-v ${GENOME_DIR}:/genome \
quay.io/biocontainers/cnvnator:0.4.1--py312h99c8fb2_11 \
cnvnator \
-root /genome/${SAMPLE}/cnvnator/${SAMPLE}.root \
-his 1000 \
-fasta /genome/reference/Homo_sapiens_assembly38.fasta
# Step 3: Statistics
docker run --rm \
--cpus 4 --memory 8g \
-v ${GENOME_DIR}:/genome \
quay.io/biocontainers/cnvnator:0.4.1--py312h99c8fb2_11 \
cnvnator \
-root /genome/${SAMPLE}/cnvnator/${SAMPLE}.root \
-stat 1000
# Step 4: Partition
docker run --rm \
--cpus 4 --memory 8g \
-v ${GENOME_DIR}:/genome \
quay.io/biocontainers/cnvnator:0.4.1--py312h99c8fb2_11 \
cnvnator \
-root /genome/${SAMPLE}/cnvnator/${SAMPLE}.root \
-partition 1000
# Step 5: Call CNVs
docker run --rm \
--cpus 4 --memory 8g \
-v ${GENOME_DIR}:/genome \
quay.io/biocontainers/cnvnator:0.4.1--py312h99c8fb2_11 \
cnvnator \
-root /genome/${SAMPLE}/cnvnator/${SAMPLE}.root \
-call 1000 \
> ${GENOME_DIR}/${SAMPLE}/cnvnator/${SAMPLE}_cnvs.txt
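The five commands above differ only in their stage flags, so they can be wrapped in a small helper that sets the bin size once. The following is a minimal sketch, not part of CNVnator: the names `IMAGE`, `BIN_SIZE`, `DRY_RUN`, `run_cnvnator`, and `cnvnator_pipeline` are illustrative, and it assumes `GENOME_DIR` and `SAMPLE` are already set and that the `${SAMPLE}/cnvnator` output directory exists.

```shell
# Sketch only: IMAGE, BIN_SIZE, DRY_RUN, run_cnvnator and cnvnator_pipeline
# are illustrative names, not part of CNVnator or of the commands above.
IMAGE="quay.io/biocontainers/cnvnator:0.4.1--py312h99c8fb2_11"
BIN_SIZE="${BIN_SIZE:-1000}"   # bin size in bp, shared by -his/-stat/-partition/-call

# Run one CNVnator stage in the container; the arguments are the stage flags.
# With DRY_RUN=1 the docker command is printed instead of executed.
run_cnvnator() {
  set -- docker run --rm --cpus 4 --memory 8g \
    -v "${GENOME_DIR}:/genome" "${IMAGE}" \
    cnvnator -root "/genome/${SAMPLE}/cnvnator/${SAMPLE}.root" "$@"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Run the five stages in order, stopping at the first failure.
# Output of the first four stages goes to stderr so that only the
# CNV calls from the final stage reach stdout.
cnvnator_pipeline() {
  run_cnvnator -tree "/genome/${SAMPLE}/aligned/${SAMPLE}_sorted.bam" >&2 &&
  run_cnvnator -his "${BIN_SIZE}" \
    -fasta /genome/reference/Homo_sapiens_assembly38.fasta >&2 &&
  run_cnvnator -stat "${BIN_SIZE}" >&2 &&
  run_cnvnator -partition "${BIN_SIZE}" >&2 &&
  run_cnvnator -call "${BIN_SIZE}"
}
```

With this in place, redirecting the standard output of `cnvnator_pipeline` to `${SAMPLE}_cnvs.txt` captures the calls, and setting `DRY_RUN=1` prints the docker commands for inspection without running them.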
The 1000 parameter is the bin size in base pairs. Use:
1000 for 30X WGS (recommended)
500 for higher coverage (>50X)
100 for targeted/exome data

Output files:
${SAMPLE}.root: ROOT file with read-depth data (intermediate, can be deleted)
${SAMPLE}_cnvs.txt: Tab-separated CNV calls with columns: CNV type, coordinates, CNV size, normalized read depth, e-val1, e-val2, e-val3, e-val4, q0
# Keep only significant CNVs (e-value < 0.01, size > 1kb)
awk '$5 < 0.01 && $3 > 1000' ${SAMPLE}_cnvs.txt
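The same filter can also emit BED, which is convenient for intersecting the calls with other tracks. This is a sketch under stated assumptions: the function name `cnvs_to_bed` is hypothetical, and the input is assumed to follow the usual CNVnator column order (CNV type in column 1, coordinates as `chr:start-end` in column 2, size in column 3, normalized read depth in column 4, first e-value in column 5) with 1-based inclusive coordinates. The coordinate is split at the last `:start-end` suffix so that contig names containing `:` or `-` (such as the HLA contigs in GRCh38) are kept intact.

```shell
# Sketch only: cnvs_to_bed is an illustrative helper, not a CNVnator tool.
# Reads CNVnator calls from the given files (or stdin) and prints BED lines:
# chrom, 0-based start, end, CNV type, normalized read depth.
cnvs_to_bed() {
  awk -F'\t' 'BEGIN { OFS = "\t" }
    $5 < 0.01 && $3 > 1000 {
      # Locate the trailing ":start-end" part of the coordinate field.
      if (match($2, /:[0-9]+-[0-9]+$/) == 0) next
      chrom = substr($2, 1, RSTART - 1)
      split(substr($2, RSTART + 1), pos, "-")
      # Assumed 1-based inclusive input, so BED start is start - 1.
      print chrom, pos[1] - 1, pos[2], $1, $4
    }' "$@"
}
```

For example, `cnvs_to_bed ${SAMPLE}_cnvs.txt > ${SAMPLE}_cnvs.bed` writes the significant calls as BED.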
Runtime: approximately 2-4 hours per 30X WGS genome.