The alignment.index file is a tab-delimited file containing all the metadata you should need to download the bam alignment files on this ftp site.

The columns are:

1. Alignment file
2. Alignment file MD5
3. Study id
4. Individual
5. bai index file
6. bai index file MD5
7. bas statistics file (the format of this file is explained in README.bas)
8. bas statistics file MD5

Most files represent an alignment to the whole genome. The high coverage individuals (SRP000032) are split into chromosome-based alignments to make the file sizes more manageable.

All files listed in the index use the reference genome, which can be found under:

ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/
ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/technical/reference/

The file names are formatted as follows:

data/NA12878/alignment/NA12878.SLX.SRP000033.2009_07.bam
data/NA12878/alignment/NA12878.chrom1.454.SRP000032.2009_07.bam

The filename starts with the sample name from Coriell/HapMap.

If the alignment has been split by chromosome, a chromosome name comes next.

The sequencing technology is next: SLX for Illumina, 454 for 454 and SOLID for SOLiD.

The SRP is the study identifier: SRP000031 is pilot 1 (low coverage), SRP000032 is pilot 2 (high coverage) and SRP000033 is pilot 3 (gene targeted sequencing).

The last field before the extension (2009_07 in the examples above) is the release date.

If the filename contains unmapped, the bam holds the reads associated with that individual which did not map to the reference.

Each mapped bam file is paired with an index file which has the same name but the extension .bai.
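
The column layout and file naming convention described above can be sketched in Python as follows. This is only an illustration, not an official parser: the names IndexRow, BamName, parse_index_line and parse_bam_name are hypothetical, and the code assumes that "unmapped", when present, appears as its own dot-separated field in the file name.

```python
from typing import NamedTuple, Optional


class IndexRow(NamedTuple):
    """One row of alignment.index, in the column order listed above."""
    bam: str
    bam_md5: str
    study: str
    individual: str
    bai: str
    bai_md5: str
    bas: str
    bas_md5: str


class BamName(NamedTuple):
    """Fields encoded in a bam file name."""
    sample: str
    chromosome: Optional[str]  # None for whole genome alignments
    technology: str            # SLX, 454 or SOLID
    study: str                 # e.g. SRP000032
    release: str               # e.g. 2009_07
    unmapped: bool


def parse_index_line(line: str) -> IndexRow:
    """Split one tab-delimited index line into its eight columns."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 8:
        raise ValueError(f"expected 8 columns, found {len(fields)}")
    return IndexRow(*fields)


def parse_bam_name(path: str) -> BamName:
    """Decode the dot-separated fields of a bam file name."""
    name = path.rsplit("/", 1)[-1]
    parts = name.split(".")
    if parts[-1] != "bam":
        raise ValueError(f"not a bam file name: {name}")
    parts = parts[:-1]

    # Assumption: "unmapped" is a separate dot-delimited field.
    unmapped = "unmapped" in parts
    parts = [p for p in parts if p != "unmapped"]

    if len(parts) < 4:
        raise ValueError(f"too few fields in file name: {name}")

    # Sample is first; technology, study and release are the last three.
    # Anything left in between is taken to be the chromosome name.
    sample = parts[0]
    technology, study, release = parts[-3], parts[-2], parts[-1]
    middle = parts[1:-3]
    chromosome = middle[0] if middle else None
    return BamName(sample, chromosome, technology, study, release, unmapped)
```

Calling parse_bam_name on the second example name above should report the chromosome as chrom1, the technology as 454 and the study as SRP000032. Reading the fields from the end of the name keeps the optional chromosome field from shifting the others.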