identifier PRJEB11702
type bioproject
sameAs
organism
title NG-Tax, a highly accurate and validated pipeline for analysis of 16S rRNA amplicons from complex biomes
description Background: Massive high-throughput sequencing of short, hypervariable segments of the 16S ribosomal RNA (rRNA) gene is transforming the methodological landscape describing microbial diversity within and across complex biomes. However, there is a strong need for standardization, as each new combination of experimental choices affects the results in different ways, restricting true meta-analyses. Results: Here we present NG-Tax, a pipeline for 16S rRNA gene amplicon sequence analysis that was validated with different mock communities, specifically designed to challenge issues regarding optimization of routinely used filtering parameters. By sequencing two tandem variable 16S rRNA gene regions, V4 and V5-V6, in three separate sequencing runs on Illumina’s HiSeq2000 platform, the microbial composition of 49 independently amplified mock samples was characterized. This setup allowed for the evaluation of important factors of technical bias in taxonomic classification: 1) run-to-run sequencing variation, 2) PCR error, and 3) region/primer-specific amplification bias. Despite the short read length (~140 nt) and all technical biases, the average specificity of the taxonomic assignment for the phylotypes included in the mock communities was 96%. On average, 99.94% of the reads could be assigned to at least family level, while assignment to ‘spurious genera’ represented on average only 0.02% of the reads per sample. Pearson correlations between obtained and expected compositions at genus level were as high as 0.94, and UniFrac distance-based PCoA plots confirmed that samples clustered by biology rather than by the aforementioned technical aspects. Conclusions: NG-Tax demonstrated improved qualitative and quantitative representation of the true sample composition. The high robustness of the pipeline against technical biases associated with 16S rRNA gene amplicon sequencing studies will additionally improve comparability between studies and facilitate efforts towards standardization.
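The description reports Pearson correlations between obtained and expected genus-level compositions. As a minimal sketch of how such a comparison could be computed (not the NG-Tax implementation), the Python below correlates two relative-abundance profiles. The function names and the genus labels `genus_A` to `genus_C` with their abundances are hypothetical placeholders, not data from this project, and treating a genus missing from one profile as abundance 0 is an assumption of this sketch.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def composition_correlation(expected, observed):
    """Correlate two {genus: relative abundance} profiles.

    Assumption of this sketch: a genus absent from one profile
    counts as abundance 0.0 in that profile.
    """
    genera = sorted(set(expected) | set(observed))
    exp_values = [expected.get(g, 0.0) for g in genera]
    obs_values = [observed.get(g, 0.0) for g in genera]
    return pearson(exp_values, obs_values)


if __name__ == "__main__":
    # Hypothetical mock-community profiles, for illustration only.
    expected = {"genus_A": 0.50, "genus_B": 0.30, "genus_C": 0.20}
    observed = {"genus_A": 0.46, "genus_B": 0.33, "genus_C": 0.21}
    print(round(composition_correlation(expected, observed), 3))
```

A coefficient near 1 would indicate that the observed profile tracks the expected one closely; the value says nothing about which technical factor caused any deviation.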
data type Other
organization
publication
dbXrefs
sra-run  ERR1121625, ERR1121626, ERR1121627, ERR1121628, ERR1121629, ERR1121630, ERR1121631, ERR1121632, ERR1121633, ERR1121634 (more)
sra-submission  ERA532273
biosample  SAMEA3649247, SAMEA3649248, SAMEA3649249, SAMEA3649250, SAMEA3649251, SAMEA3649252, SAMEA3649253, SAMEA3649254, SAMEA3649255, SAMEA3649256 (more)
sra-study  ERP013110
sra-sample  ERS956396, ERS956397, ERS956398, ERS956399, ERS956400, ERS956401, ERS956402, ERS956403, ERS956404, ERS956405 (more)
sra-experiment  ERX1201053, ERX1201054, ERX1201055, ERX1201056, ERX1201057, ERX1201058, ERX1201059, ERX1201060, ERX1201061, ERX1201062 (more)
distribution JSON, JSON-LD
status public
visibility unrestricted-access
dateCreated 2015-11-12T00:00:00Z
dateModified 2015-11-12T00:00:00Z
datePublished 2015-11-11T00:00:00Z