identifier PRJNA74581
type bioproject
sameAs
organism Homo sapiens
title Beyond the CCDS exome: Variant Frequencies in Different Targeted Genome Regions May Fuel Evolutionary Development of New Exons
description Background: Enrichment of biologically interesting loci by DNA hybridization followed by high-throughput sequencing has become an important tool in modern genetics, especially for finding disease-causing mutations. Currently, the most common capture target is the Consensus CDS (CCDS). The CCDS, however, excludes many actual or computationally predicted coding exons present in other databases, such as RefSeq and Vega, and non-coding functional elements such as untranslated and regulatory regions. The dynamics of capture sequencing outside of the CCDS regions are consequently less well understood. Results: We examine capture sequence data outside of the CCDS regions and find that extremes of GC content in different subregions of the genome can reduce the local coverage to less than 50% relative to the CCDS. Further, we show that while this effect is primarily due to biases inherent in both the Illumina and SOLiD sequencing platforms, it is exacerbated by the capture process. Interestingly, for two subregion types, miRNA and predicted exons, the capture process seems to favor high relative coverage. Lastly, we examine the mutational spectrum of non-CCDS regions and find that predicted exons, as well as exonic regions specific to RefSeq and Vega, show much higher variant frequencies than the CCDS. Predicted exons, strikingly, show a variant frequency of 1 per 660 bp, more than twice the rate of the CCDS and 30% higher than the overall genomic rate. Conclusions: We show that regions outside of the CCDS capture less efficiently than the CCDS itself, and that variant frequencies vary dramatically in different biologically important loci.
data type Exome
organization
Baylor College of Medicine
publication
Targeted enrichment beyond the consensus coding DNA sequence exome reveals exons with higher variant densities.
dbXrefs
sra-run  SRR073372, SRR073373, SRR073380, SRR085445, SRR085446, SRR085447, SRR118422
sra-submission  SRA026545
biosample  SAMN00138192, SAMN00138193, SAMN00138194, SAMN00138191, SAMN00138195
sra-study  SRP004501
sra-sample  SRS139093, SRS139094, SRS139095, SRS139092, SRS139096
sra-experiment  SRX031550, SRX031551, SRX031552, SRX031547, SRX031548, SRX031549, SRX031553
distribution JSON, JSON-LD
Download
bioproject.xml  HTTPS FTP
status public
visibility unrestricted-access
dateCreated 2011-10-19T00:00:00Z
dateModified 2011-10-19T00:00:00Z
datePublished