genomics-pipeline

Recommended Resources

Free resources for understanding your genomic data, organized from most accessible to most technical.


Start Here (No Background Needed)

Understanding Genetics Basics

Understanding Your Specific Results


Intermediate (Some Biology Background Helpful)

Variant Interpretation

Structural Variants and CNVs


Advanced (For Researchers and Power Users)

Tools and Methods

Population Genetics

Online Courses (Free)


Useful Command-Line References

bcftools Cheat Sheet

# Count variants by type
bcftools stats file.vcf.gz | grep "^SN"

# Extract specific fields
bcftools query -f '%CHROM\t%POS\t%REF\t%ALT[\t%GT]\n' file.vcf.gz

# Filter to PASS variants only
bcftools view -f PASS file.vcf.gz

# Filter to a specific region (requires an index; create one with: bcftools index file.vcf.gz)
bcftools view -r chr22:16000000-17000000 file.vcf.gz

# Keep sites where at least one sample is heterozygous
bcftools view -g het file.vcf.gz

# Keep sites where at least one sample is homozygous ALT
# (-g hom would also match homozygous REF genotypes)
bcftools view -i 'GT="AA"' file.vcf.gz

# Count PASS variants per chromosome (-H omits the header lines)
bcftools view -H -f PASS file.vcf.gz | cut -f1 | sort | uniq -c | sort -rn
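
The tab-separated output of the bcftools query line in the cheat sheet can be summarized with standard tools. The following is a minimal sketch, assuming a single-sample file so that the genotype sits in column 5. The function name tally_genotypes is hypothetical and not part of bcftools.

```shell
# Hypothetical helper: tally genotype strings from `bcftools query` output.
# Assumes tab-separated input with columns CHROM, POS, REF, ALT, GT (one sample).
tally_genotypes() {
    awk -F'\t' '
        { count[$5]++ }                                   # count each raw GT string
        END { for (gt in count) printf "%s\t%d\n", gt, count[gt] }
    ' | sort
}

# Usage (requires bcftools and a real VCF):
# bcftools query -f '%CHROM\t%POS\t%REF\t%ALT[\t%GT]\n' file.vcf.gz | tally_genotypes
```

Genotype strings are counted exactly as written, so phased (0|1) and unphased (0/1) calls are reported separately.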

samtools Cheat Sheet

# Quick alignment summary
samtools flagstat file.bam

# Per-chromosome read counts
samtools idxstats file.bam

# Extract reads from a specific region (requires an index; create one with: samtools index file.bam)
samtools view -b file.bam chr22:16000000-17000000 > region.bam

# View BAM header (reference info, read groups)
samtools view -H file.bam

# Calculate average depth over all positions (-a includes zero-coverage positions)
samtools depth -a file.bam | awk '{sum+=$3; n++} END {if (n) print sum/n}'
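
The mean-depth calculation can also be wrapped in a small function that reports breadth of coverage at the same time. This is a sketch under stated assumptions: the function name depth_summary and the default threshold of 10 are illustrative choices, not samtools features.

```shell
# Hypothetical helper: summarize `samtools depth` output read from stdin.
# Input columns: chromosome, position, depth (tab-separated).
# Prints the mean depth and the fraction of positions at or above a minimum depth.
depth_summary() {
    min_depth="${1:-10}"   # assumed default threshold; adjust as needed
    awk -F'\t' -v min="$min_depth" '
        { sum += $3; n++; if ($3 >= min) covered++ }
        END {
            if (n == 0) { print "no positions"; exit 1 }   # avoid dividing by zero
            printf "mean=%.2f\tfrac_ge_%d=%.4f\n", sum / n, min, covered / n
        }
    '
}

# Usage (requires samtools and a real BAM):
# samtools depth -a file.bam | depth_summary 20
```

The function exits with a non-zero status when it receives no input, which makes an empty or mistyped region easier to notice in a script.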

Community