In this tutorial we compare the performance of three statistically based variant detection tools:
- SAMtools: Mpileup
- GATK: Unified Genotyper
Each of these tools takes a BAM file of aligned reads as its input and generates a list of likely variants in VCF format.
This tutorial runs on the GVL Galaxy Tutorial Server. All the tools you need are available on the server.
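Because every tool writes its calls in VCF format, the outputs can be compared record by record. The following is a minimal Python sketch of such a comparison, not part of the tutorial workflow itself. It assumes plain-text, tab-separated VCF and treats two calls as the same variant when chromosome, position, reference allele and alternate allele all match. The function names `variant_keys` and `compare_callsets` and the two toy call sets are hypothetical and made up for illustration; they are not output from the tools listed above.

```python
def variant_keys(vcf_text):
    """Return the set of (chrom, pos, ref, alt) tuples found in VCF text."""
    keys = set()
    for line in vcf_text.splitlines():
        # Skip blank lines and header lines (both "##" meta lines and "#CHROM").
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        chrom, pos, _id, ref, alt = fields[:5]
        # A record may list several alternate alleles separated by commas;
        # count each one as its own variant.
        for allele in alt.split(","):
            keys.add((chrom, int(pos), ref, allele))
    return keys


def compare_callsets(vcf_a, vcf_b):
    """Split two call sets into shared variants and variants unique to each."""
    a = variant_keys(vcf_a)
    b = variant_keys(vcf_b)
    return {"shared": a & b, "only_a": a - b, "only_b": b - a}


# Toy records, invented for this example only.
HEADER = (
    "##fileformat=VCFv4.1\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
)
CALLS_A = HEADER + (
    "20\t1000\t.\tA\tG\t50\tPASS\t.\n"
    "20\t2000\t.\tC\tT,G\t30\tPASS\t.\n"
)
CALLS_B = HEADER + (
    "20\t1000\t.\tA\tG\t45\tPASS\t.\n"
    "20\t3000\t.\tG\tA\t60\tPASS\t.\n"
)

result = compare_callsets(CALLS_A, CALLS_B)
for name, variants in result.items():
    print(name, len(variants))
```

Matching on exact position and alleles is a deliberate simplification: the same indel can be written in more than one way, so a real comparison would normally normalise the records first.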
The data was produced from human whole-genome DNA. To keep the dataset small enough for an interactive tutorial, only reads that mapped to a part of chromosome 20 are included. The dataset contains about one million 100 bp reads, sequenced on an Illumina HiSeq 2000. The data was generated as part of the 1000 Genomes Project: http://www.1000genomes.org/
The required datasets are available in Shared Libraries on the server, and can also be downloaded via URL.