identifier PRJDB2278
type bioproject
sra-study DRP000390
organism Secale cereale
title Construction of a rye (Secale cereale L.) reference transcriptome
description As a test case for building a complete reference transcriptome for a eukaryotic species with a large unsequenced genome, more than 5.6 million high-quality 454 pyrosequencing cDNA reads from rye (Secale cereale) were generated and assembled. A novel post-assembly sequence analysis program, the BLAST-based post-assembly process (BbPAP), was designed to eliminate redundancy, maximize contig length, and correct erroneous assemblies by recognizing and eliminating alternatively spliced transcripts and error-rich terminal regions while integrating EST and genome sequence information from related species. A rye reference transcriptome consisting of 56,091 sequences with an average median length value of 1,301 nucleotides was obtained. It includes 41,567 coding loci, covering more than 90% of the hypothetical gene loci. More than 80% of the identified genes have complete coding regions, representing the most complete gene index available in any Triticeae species. In addition, approximately 20% of the genes were found to have putative alternative splicing sites. A minimum expansion of rye genes of 17% was calculated based on comparison with orthologous single-copy genes from Brachypodium. A total of 4,461 novel rye- or Triticeae-specific transcripts were identified, and half of these were subject to purifying selection, indicating important functions in the formation of the Triticeae species.
data type DDBJ SRA Study