<?xml version="1.0" encoding="UTF-8"?>
<STUDY_SET xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <STUDY center_name="The Research Institute of Forestry, Chinese Academ" alias="De novo root transcriptome sequencing of purple sweet potato" accession="SRP007758">
    <IDENTIFIERS>
      <PRIMARY_ID>SRP007758</PRIMARY_ID>
      <EXTERNAL_ID namespace="BioProject" label="primary">PRJNA80119</EXTERNAL_ID>
      <SUBMITTER_ID namespace="The Research Institute of Forestry, Chinese Academ">De novo root transcriptome sequencing of purple sweet potato</SUBMITTER_ID>
    </IDENTIFIERS>
    <DESCRIPTOR>
      <STUDY_TITLE>De novo sequencing and comprehensive analysis of the purple sweetpotato (Ipomoea batatas L.) transcriptome</STUDY_TITLE>
      <STUDY_TYPE existing_study_type="Other"/>
      <STUDY_ABSTRACT>High-throughput RNA sequencing (RNA-seq) was performed on fresh tuberous roots of the purple sweetpotato. Paired-end sequencing generated a total of 25,888,890 high-quality reads comprising 2,330,000,100 nucleotides (nt), with a GC content of 48.92%. De novo assembly with SOAPdenovo yielded 58,800 unigenes ranging from 200 nt to 10,380 nt, with an average length of 476 nt. On average, each unigene was assembled from 137 reads, with a maximum of 6,242 reads. Using the RPKM (Reads Per Kb per Million reads) method, we estimated the average expression of a unigene at 34 RPKM, with a maximum of 1,935 RPKM. BLASTX searches of all unigenes against the NCBI non-redundant protein database (nr) showed that at least 40,280 (68.5%) unigenes were identified as protein-coding genes. Among these 40,280 unigenes, 11,978 and 5,184 are homologous to Arabidopsis proteins and rice proteins, respectively. Based on analysis against the Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases, 19,707 (33.5%) unigenes were classified into 1,807 GO terms covering molecular functions, biological processes, and cellular components, and 9,970 (17.0%) unigenes were assigned to 11,119 KEGG pathways. Functional analysis of the unigenes indicated that at least 3,553 genes may be involved in the biosynthesis pathways of starches, alkaloids, anthocyanin pigments, and vitamins. In addition, to mine more potential molecular markers for sweetpotato breeding, 851 simple sequence repeats (SSRs, also called microsatellites) were identified across all unigenes.</STUDY_ABSTRACT>
      <CENTER_PROJECT_NAME>Ipomoea batatas</CENTER_PROJECT_NAME>
    </DESCRIPTOR>
  </STUDY>
</STUDY_SET>
