*--------------------------------------------------------------------------------------------------*
 Release information of H-InvDB_9.0
 http://www.h-invitational.jp
 Dataset fixed on June 25, 2013.
 Released on May 27, 2015.
*--------------------------------------------------------------------------------------------------*

---------------------------
H-InvDB statistics
---------------------------
1. Number of H-Invitational transcripts (HIT)
   all HIT: 220,058
   * protein coding transcripts: 196,634
   * non-protein-coding transcripts: 23,290
   * pseudogene candidates: 134

2. Number of H-Invitational clusters (HIX)
   all HIX: 48,065
   * protein coding: 39,443
   * non-protein-coding: 8,579
   * pseudogene candidates: 43

3. Number of H-Invitational proteins (HIP)
   all HIP: 139,573

---------------------------
Human nucleotide datasets
---------------------------
1. Human full-length cDNA dataset
   The dataset contains sequences produced by six institutes.
   All the sequences are already in DDBJ/EMBL/GenBank.

2. Human mRNA dataset
   Human mRNA sequences registered in DDBJ/EMBL/GenBank, other than
   full-length cDNA, were extracted from DDBJ release 93 obtained on
   June 25, 2013.

3. Human genome dataset
   Repeat-masked human genome assembly NCBI build 37.1 was obtained from UCSC.
   (GRCh37: UCSC hg19, Feb. 2009: human genome NCBI b37.1)

---------------------------
Databases
---------------------------
1. RefSeq mRNA
   RefSeq curated mRNAs were obtained from NCBI on May 20, 2013.
   (RefSeq release 59)

2. Ensembl transcripts
   Ensembl transcripts were obtained from Ensembl on May 23, 2013.
   (Ensembl release 71)

3. RefSeq protein
   RefSeq proteins were obtained from NCBI on May 20, 2013.
   (RefSeq release 59)

4. UniProt (SWISS-PROT/TrEMBL)
   UniProt (SWISS-PROT/TrEMBL) entries were obtained from EBI on May 23, 2013.
   (UniProt release 2013_05)

5. HUGO approved gene symbol
   http://www.gene.ucl.ac.uk/nomenclature/
   Human gene name data fixed on March 4, 2015.

6. Entrez Gene database
   http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene
   Relations of H-InvDB genes to Entrez Genes were fixed on May 24, 2013.

7. dbSNP
   Relations of H-InvDB genes to dbSNP build 137 were fixed on May 20, 2013.

---------------------------
Contents
---------------------------
README.txt                 :this file
acc2hinv_id.txt.gz         :Summary of cDNA data providers and INSD
                            (DDBJ/EMBL/GenBank) accession numbers versus
                            H-Invitational identifiers
new_del_update_hinvid.txt  :List of new, deleted and updated H-Invitational IDs
jbirc_ff/                  :H-InvDB annotated datasets in jbirc-format
                            (refer to http://www.h-invitational.jp/hinv/help/help_index.html
                            or http://www.h-invitational.jp/hinv/dataset/download.cgi
                            for more information)
jbirc_xml/                 :H-InvDB annotated datasets in jbirc-xml-format
                            (refer to http://www.h-invitational.jp/hinv/help/help_index.html
                            or http://www.h-invitational.jp/hinv/dataset/download.cgi
                            for more information)
analysis/                  :H-InvDB annotated datasets produced by computational analysis
sequence/                  :H-InvDB annotated sequence datasets for transcript,
                            protein and genome sequences
*--------------------------------------------------------------------------------------------------*