DNA Data Bank of Japan

                              DNA Database

Release 70.1, July 24 2007, including 72,801,679 entries, 76,788,510,646 bases
--------------------------------------------------------------------------------
DDBJ release 70.0 revised as 70.1 on July 24, 2007
--------------------------------------------------------------------------------
  Feature Table format errors were found in DDBJ release 70.0 (released in June
  2007).  We corrected these errors and released the files again on July 24,
  2007.
  Corrected file: ddbjbct1.seq
                  ddbjhum5.seq
                  ddbjinv1.seq
  Reference URL: http://www.ddbj.nig.ac.jp/whatsnew/2007/070724-e.html

-------------------------------------------------------------------------------
Table of contents
-------------------------------------------------------------------------------

  1. Introduction
    1.1.  Announcement for changes in the present release
    1.2.  Announcement for the forthcoming changes

  2. DDBJ flat file format
    2.1.  LOCUS line
    2.2.  DEFINITION line
    2.3.  ACCESSION line
    2.4.  VERSION line
    2.5.  KEYWORDS line
    2.6.  SOURCE line
    2.7.  REFERENCE line
    2.8.  COMMENT line
    2.9.  FEATURES line
    2.10. BASE COUNT line
    2.11. ORIGIN line

  3. Dataset categories
    3.1.  Division categories
    3.2.  TPA separated from primary dataset
    3.3.  Notice for patented data

  4. DDBJ staff

  5. Acknowledgment

  6. File categories

  7. Sample of the contents in each file
    7.1.  Part of the contents in the file 'ddbjbct1.seq'
    7.2.  Part of the contents in the accession number index file 'ddbjacc1.idx'
    7.3.  Part of the contents in the keyword phrase index file 'ddbjkey1.idx'
    7.4.  Part of the contents in the journal citation index file 'ddbjjou1.idx'
    7.5.  Part of the contents in the gene name index file 'ddbjgen.idx'

  8. Release history

  9. File list

-------------------------------------------------------------------------------

1. Introduction

This database contains nucleotide sequence data for any organism, not only 
those with DNA genomes but also those with RNA genomes.

This database may be copied and redistributed without permission on the 
condition that all the statements in this release note are reproduced in each 
copy.  See also '3.3. Notice for patented data' below.  

The present release contains the newest data prepared by the DNA Data Bank of 
Japan (DDBJ), GenBank (*), and European Molecular Biology Laboratory/European 
Bioinformatics Institute (EMBL/EBI) as of May 25, 2007.  This unified 
database was made possible thanks to the international collaboration among the 
three data banks.  All the entries have accordingly been annotated using the 
feature keys common to them.  

*'GenBank' is a trademark of NIH, USA, and is operated by the National Center 
for Biotechnology Information (NCBI) at NIH.

1.1. Announcement for changes in the present release

Nothing in particular.  


1.2. Announcement for the forthcoming changes

Deletion of E-mail addresses and phone and fax numbers from the DDBJ flat file

To comply with the Japanese law protecting personal information, DDBJ will 
delete phone numbers, fax numbers, and E-mail addresses from the flat files of 
entries submitted to DDBJ.  The deletion should also help to protect DDBJ 
releases against senders of spam mail.  
DDBJ plans to retrofit almost all entries submitted to DDBJ (not those 
submitted to GenBank or EMBL) by periodical release 72, at the end of December 
2007.  

At present, the submitter information is described in the JOURNAL line of 
REFERENCE 1 as follows: 
--------------------------------------------------------------------------------
REFERENCE   1  (bases 1 to 1200)
  AUTHORS   Mishima,T.
  TITLE     Direct Submission
  JOURNAL   Submitted (01-Jan-1990) to the DDBJ/EMBL/GenBank databases.
            Taro Mishima, DNA Data Bank of Japan, National Institute of
            Genetics; 1111, Yata, Mishima, Shizuoka 411-8540, Japan
            (E-mail:ddbj@ddbj.nig.ac.jp, URL:http://www.ddbj.nig.ac.jp/,
            Tel:81-12-345-6789, Fax:81-12-345-9876)
--------------------------------------------------------------------------------

After the deletion of the information in question, the DDBJ flat file will 
take one of the following two forms: 

Type 1: Phone and fax numbers and E-mail address are deleted.  
--------------------------------------------------------------------------------
REFERENCE   1  (bases 1 to 1200)
  AUTHORS   Mishima,T.
  TITLE     Direct Submission
  JOURNAL   Submitted (01-Jan-1990) to the DDBJ/EMBL/GenBank databases.
            Contact:Taro Mishima
            DNA Data Bank of Japan, National Institute of Genetics; 1111, 
            Yata, Mishima, Shizuoka 411-8540, Japan
            URL    :http://www.ddbj.nig.ac.jp/
-------------------------------------------------------------------------------

Type 2: When the submitters wish to keep their contact information disclosed, 
it will be described as follows: 
-------------------------------------------------------------------------------
REFERENCE   1  (bases 1 to 1200)
  AUTHORS   Mishima,T.
  TITLE     Direct Submission
  JOURNAL   Submitted (01-Jan-1990) to the DDBJ/EMBL/GenBank databases.
            Contact:Taro Mishima
            DNA Data Bank of Japan, National Institute of Genetics; 1111, 
            Yata, Mishima, Shizuoka 411-8540, Japan
            URL    :http://www.ddbj.nig.ac.jp/
            E-mail :ddbj@ddbj.nig.ac.jp
            Phone  :81-12-345-6789
            Fax    :81-12-345-9876
-------------------------------------------------------------------------------


2. DDBJ flat file format

The database is a collection of "entries"; an entry is the unit of the data.  
The entries submitted to the data banks are processed and made public in the 
DDBJ format for distribution (flat file).  The flat file includes the sequence 
together with information on the submitters, references, source organisms, 
"features", etc.  The items of the DDBJ flat file are explained below, using 
the following sample entry: 

-------------------------------------------------------------------------------
LOCUS       AB000000                 450 bp    mRNA    linear   HUM 08-JUL-2002 
DEFINITION  Homo sapiens GAPD mRNA for glyceraldehyde-3-phosphate
            dehydrogenase, partial cds.
ACCESSION   AB000000
VERSION     AB000000.1
KEYWORDS    .
SOURCE      Homo sapiens
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 
REFERENCE   1  (bases 1 to 450)
  AUTHORS   Mishima,H. and Shizuoka,T.
  TITLE     Direct Submission
  JOURNAL   Submitted (30-NOV-2000) to the DDBJ/EMBL/GenBank databases.
            Hanako Mishima, National Institute of Genetics, DNA Data
            Bank of Japan; Yata 1111, Mishima, Shizuoka 411-8540, Japan
            (E-mail:mishima@supernig.nig.ac.jp, Tel:81-55-981-6853,
             Fax:81-55-981-6849)
REFERENCE   2  (sites)
  AUTHORS   Mishima,H., Shizuoka,T. and Fuji,I.
  TITLE     Glyceraldehyde-3-phosphate dehydrogenase expressed in human liver
  JOURNAL   Unpublished (2002)
COMMENT     Human cDNA sequencing project.
FEATURES             Location/Qualifiers
     source          1..450
                     /chromosome="12"
                     /clone="GT200015"
                     /clone_lib="lambda gt11 human liver cDNA (GeneTech.
                     No.20)"
                     /map="12p13"
                     /mol_type="mRNA"
                     /organism="Homo sapiens"
                     /tissue_type="liver"
     CDS             86..>450
                     /codon_start=1
                     /gene="GAPD"
                     /product="glyceraldehyde-3-phosphate dehydrogenase"
                     /protein_id="BAA12345.1"
                     /transl_table=1
                     /translation="MAKIKIGINGFGRIGRLVARVALQSDDVELVAVNDPFITTDYMT
                     YMFKYDTVHGQWKHHEVKVKDSKTLLFGEKEVTVFGCRNPKEIPWGETSAEFVVEYTG
                     VFTDKDKAVAQLKGGAKKV"
BASE COUNT          102 a          119 c          131 g           98 t
ORIGIN
        1 cccacgcgtc cggtcgcatc gcacttgtag ctctcgaccc ccgcatctca tccctcctct
       61 cgcttagttc agatcgaaat cgcaaatggc gaagattaag atcgggatca atgggttcgg
      121 gaggatcggg aggctcgtgg ccagggtggc cctgcagagc gacgacgtcg agctcgtcgc
      181 cgtcaacgac cccttcatca ccaccgacta catgacatac atgttcaagt atgacactgt
      241 gcacggccag tggaagcatc atgaggttaa ggtgaaggac tccaagaccc ttctcttcgg
      301 tgagaaggag gtcaccgtgt tcggctgcag gaaccctaag gagatcccat ggggtgagac
      361 tagcgctgag tttgttgtgg agtacactgg tgttttcact gacaaggaca aggccgttgc
      421 tcaacttaag ggtggtgcta agaaggtctg
//
-------------------------------------------------------------------------------
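
As the sample above shows, an entry ends with a line containing only "//", and
each top-level item (LOCUS, DEFINITION, ..., ORIGIN) starts with a keyword in
the first columns, while sub-items and continuation lines are indented.  The
following Python sketch splits flat-file lines into entries and groups the
lines of an entry by top-level keyword.  The function names are hypothetical,
and the 12-column keyword field is an assumption inferred from the sample
entry; this is an illustration, not a DDBJ tool.

```python
KEYWORD_WIDTH = 12  # assumption: top-level keywords occupy columns 1-12


def split_entries(lines):
    """Yield each entry as a list of lines; a "//" line closes an entry."""
    entry = []
    for line in lines:
        line = line.rstrip("\n")
        if line.strip() == "//":
            if entry:
                yield entry
            entry = []
        else:
            entry.append(line)


def group_fields(entry):
    """Group the lines of one entry under their top-level keywords.

    Returns a list of (keyword, lines) pairs, because a keyword such as
    REFERENCE may occur more than once in an entry.
    """
    fields = []
    for line in entry:
        if line[:1].strip():
            # Text in column 1 starts a new top-level item.
            fields.append((line[:KEYWORD_WIDTH].strip(), [line]))
        elif fields:
            # Indented lines belong to the preceding item.
            fields[-1][1].append(line)
    return fields


sample = [
    "ACCESSION   AB000000",
    "VERSION     AB000000.1",
    "SOURCE      Homo sapiens",
    "  ORGANISM  Homo sapiens",
    "ORIGIN",
    "        1 cccacgcgtc cggtcgcatc",
    "//",
]
for entry in split_entries(sample):
    print([keyword for keyword, _ in group_fields(entry)])
```

The ORGANISM line is kept under SOURCE and the sequence line under ORIGIN,
because both are indented.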


2.1. LOCUS line

The format of the LOCUS line in the flat file is shown below: 
---------  --------
Positions  Contents
---------  --------
  01-05    'LOCUS'
  06-12     spaces
  13-28     Locus name
  29-29     space
  30-40     Length of sequence, right-justified
  41-41     space
  42-43     'bp'
  44-47     spaces
  48-54     Molecule type (DNA, RNA, mRNA, pre-RNA, rRNA, scRNA, snRNA, snoRNA, 
             tRNA), left-justified
  55-55     space
  56-63     'linear' followed by two spaces, or 'circular'
  64-64     space
  65-67     The division code (see '3.1. Division categories')
  68-68     space
  69-79     Date, in the form dd-MMM-yyyy (e.g., 08-JUL-2002)
------------------------------------------------------------------------------
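
Because the LOCUS line is laid out by fixed positions, it can be read by
slicing rather than by splitting on spaces.  The following Python sketch
builds a LOCUS line from the positions in the table above and parses it back.
The function names and dictionary keys are hypothetical; a 1-based position
range M-N in the table corresponds to the Python slice [M-1:N].

```python
def format_locus_line(name, length, mol_type, topology, division, date):
    """Lay out a LOCUS line using the positions given in the table."""
    return (
        f"LOCUS{'':7}{name:<16} {length:>11} bp{'':4}"
        f"{mol_type:<7} {topology:<8} {division:<3} {date}"
    )


def parse_locus_line(line):
    """Slice a LOCUS line by position and return its items as a dict."""
    if not line.startswith("LOCUS"):
        raise ValueError("not a LOCUS line")
    return {
        "name": line[12:28].strip(),      # positions 13-28
        "length": int(line[29:40]),       # positions 30-40, right-justified
        "mol_type": line[47:54].strip(),  # positions 48-54
        "topology": line[55:63].strip(),  # positions 56-63
        "division": line[64:67].strip(),  # positions 65-67
        "date": line[68:79].strip(),      # positions 69-79
    }


line = format_locus_line("AB000000", 450, "mRNA", "linear", "HUM", "08-JUL-2002")
record = parse_locus_line(line)
print(record["name"], record["length"], record["division"])  # AB000000 450 HUM
```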


2.2. DEFINITION line

The DEFINITION line briefly describes the gene(s) contained in the entry.  It 
is composed by each of the three data banks.  


2.3. ACCESSION line

This line shows the accession number of the entry.  
A unique accession number is issued to the data submitter by each of the three 
data banks.  The accession number is composed of one letter and 5 digits 
(e.g., A12345) or two letters and 6 digits (e.g., AB123456).  The former style 
was used in the 1980s; the latter style was introduced later because of the 
rapid growth of the data.  
All the entries whose accession numbers carry the prefixes given below have 
been collected and processed by DDBJ, and the rest have been processed by 
GenBank and EMBL/EBI.  

-------------------------------------------------------------------------------
  C, D, E, AB, AG, AK, AP, AT, AU, AV, BA, BB, BD, BJ, BP, BR, BS, BW, BY, 
  CI, CJ, DA, DB, DD, DE, DF, DG, DH
-------------------------------------------------------------------------------

The list of accession number prefixes is available at the following URL: 
http://www.ddbj.nig.ac.jp/sub/prefix.html
If multiple entries are merged into one entry, or if an entry is extensively 
modified after submission, the responsible data bank may assign a new 
accession number to it.  In these cases, the new accession number is called 
the primary accession number, and the old accession number(s) is/are called 
the secondary accession number(s).  In the flat file, the primary accession 
number is given first, followed by the secondary accession number(s).  The 
updated entry can be retrieved with either the primary or the secondary 
accession numbers.  
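
The two accession number styles and the DDBJ prefix list above can be checked
mechanically.  The following Python sketch validates the style of an accession
number, tests whether its prefix is one of the DDBJ prefixes listed in this
section, and separates the primary accession number from the secondary ones.
The function names are hypothetical, and the sketch assumes that the accession
numbers on an ACCESSION line are separated by spaces.

```python
import re

# Prefixes listed in this section for entries processed by DDBJ.
DDBJ_PREFIXES = frozenset(
    "C D E AB AG AK AP AT AU AV BA BB BD BJ BP BR BS BW BY "
    "CI CJ DA DB DD DE DF DG DH".split()
)

# One letter and 5 digits, or two letters and 6 digits.
ACCESSION_RE = re.compile(r"([A-Z])\d{5}|([A-Z]{2})\d{6}")


def accession_prefix(accession):
    """Return the letter prefix, or raise ValueError for an unknown style."""
    match = ACCESSION_RE.fullmatch(accession)
    if match is None:
        raise ValueError(f"unrecognized accession number: {accession!r}")
    return match.group(1) or match.group(2)


def is_ddbj_accession(accession):
    """True if the prefix is in the DDBJ list of the present release."""
    return accession_prefix(accession) in DDBJ_PREFIXES


def parse_accession_line(line):
    """Return (primary, [secondary, ...]) from an ACCESSION line."""
    numbers = line.split()[1:]
    return numbers[0], numbers[1:]


print(is_ddbj_accession("AB000000"))  # True
print(is_ddbj_accession("X04516"))    # False
# A made-up line with one secondary accession number, for illustration only.
print(parse_accession_line("ACCESSION   AB123456 D12345"))
```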


2.4. VERSION line

This line consists of an accession number and a version number, like 
"AB123456.1", in which the digit(s) after the period form the version number.  
Data made public for the first time are given version number "1".  VERSION 
was introduced because a released sequence is sometimes revised by the 
submitter, so that the accession number alone cannot specify the sequence in 
question, which causes trouble for users.  The version number is increased by 
one every time a revised sequence is made public.  
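
As a small illustration of the accession.version form described above, the
following Python sketch splits a VERSION value into its two parts and derives
the value that a revised sequence would receive.  The function names are
hypothetical.

```python
def parse_version(value):
    """Split "AB123456.1" into ("AB123456", 1)."""
    accession, _, version = value.rpartition(".")
    if not accession or not version.isdigit():
        raise ValueError(f"not an accession.version value: {value!r}")
    return accession, int(version)


def next_version(value):
    """Return the value used when a revised sequence is made public."""
    accession, version = parse_version(value)
    return f"{accession}.{version + 1}"


print(parse_version("AB123456.1"))  # ('AB123456', 1)
print(next_version("AB123456.1"))   # AB123456.2
```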


2.5. KEYWORDS line

The data banks fill in this line if necessary.  In many cases, the categories 
of the data (EST, HTG, etc.), gene names, and product names are included in 
"KEYWORDS".  


2.6. SOURCE line

This line shows the scientific name of the organism from which the sequence 
was obtained, and an organelle type if the sequence is derived from an 
organelle other than the nucleus.  


2.7. REFERENCE line

Information on the submitters and on references related to the submitted 
sequence is given in the REFERENCE line.  


2.8. COMMENT line

This line holds information about an entry that cannot be described using 
FEATURES or the other fields.  


2.9. FEATURES line

Biological features of a submitted sequence are described with a "Feature" 
key (the biological nature of the annotated feature), a "Location" (the region 
of the sequence which corresponds to the Feature), and "Qualifiers" 
(supplementary information about the Feature).  The "Feature" and "Qualifier" 
keys used in the present release are defined by the DDBJ/EMBL/GenBank Feature 
Table: Definition (Version 6.7, April 2007).  The document is updated every 
half year.  Its newest version is available at the following URL: 
http://www.ddbj.nig.ac.jp/FT/full_index.html
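
To make the Feature key / Location / Qualifier structure concrete, the
following Python sketch groups the lines that follow the FEATURES header into
one record per feature.  It is a simplified illustration and not the parser
used by the data banks.  The function name is hypothetical, the 21-column
qualifier indentation is an assumption inferred from the sample entries in
this note, and the first line is expected to be a Feature key line.  A wrapped
free-text line that happens to begin with "/" would be misread as a new
qualifier.

```python
QUALIFIER_INDENT = 21  # assumption: qualifier lines are indented 21 columns


def parse_feature_table(lines):
    """Group feature-table lines into key / location / qualifiers records."""
    features = []
    for raw in lines:
        text = raw.strip()
        if not text:
            continue
        indent = len(raw) - len(raw.lstrip(" "))
        if indent < QUALIFIER_INDENT:
            # A less-indented line carries a Feature key and its Location.
            key, location = text.split(None, 1)
            features.append(
                {"key": key, "location": location, "qualifiers": []}
            )
        elif text.startswith("/"):
            # "/name=value" starts a new Qualifier (the value may be absent).
            name, _, value = text[1:].partition("=")
            features[-1]["qualifiers"].append([name, value])
        elif features[-1]["qualifiers"]:
            # Continuation of a wrapped value; amino acid translations are
            # rejoined without a space, other text with one.
            qualifier = features[-1]["qualifiers"][-1]
            joiner = "" if qualifier[0] == "translation" else " "
            qualifier[1] += joiner + text
        else:
            # Continuation of a wrapped Location.
            features[-1]["location"] += text
    for feature in features:
        feature["qualifiers"] = [
            (name, value.strip('"')) for name, value in feature["qualifiers"]
        ]
    return features


PAD = " " * QUALIFIER_INDENT
sample = [
    "     source          1..450",
    PAD + '/clone_lib="lambda gt11 human liver cDNA (GeneTech.',
    PAD + 'No.20)"',
    PAD + '/mol_type="mRNA"',
    "     CDS             86..>450",
    PAD + "/codon_start=1",
    PAD + '/gene="GAPD"',
]
for feature in parse_feature_table(sample):
    print(feature["key"], feature["location"], feature["qualifiers"])
```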


2.10. BASE COUNT line

In the BASE COUNT line of the DDBJ flat file, 9 digits are allocated for each 
of the counts of a (adenine), c (cytosine), g (guanine), and t (thymine).  In 
an RNA sequence, uracil is indicated as "t" according to the rule of the 
international nucleotide sequence databases.  Following the relaxation of the 
sequence length limit, GenBank dropped the BASE COUNT line from its flat file 
format as of GenBank Release 138 (Oct. 2003).  DDBJ has decided to keep the 
BASE COUNT line in its flat file format, on the view that GC content is still 
important information for characterizing a sequence.  
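
The following Python sketch shows how the four counts relate to the sequence
and to the GC content mentioned above.  It counts the bases of a sequence,
reads the counts back from a BASE COUNT line, and computes the GC fraction.
The function names are hypothetical, and the line is read by pattern rather
than by column position, since the exact column layout is not specified here.

```python
import re
from collections import Counter


def count_bases(sequence):
    """Count a, c, g and t in a sequence (uracil is already written as t)."""
    counts = Counter(sequence.lower())
    return {base: counts[base] for base in "acgt"}


def parse_base_count(line):
    """Read the four counts from a BASE COUNT line."""
    pairs = re.findall(r"(\d+)\s+([acgt])\b", line)
    return {base: int(number) for number, base in pairs}


def gc_content(counts):
    """Fraction of g and c among the counted a, c, g and t bases."""
    total = sum(counts[base] for base in "acgt")
    return (counts["g"] + counts["c"]) / total


line = "BASE COUNT          102 a          119 c          131 g           98 t"
counts = parse_base_count(line)
print(counts)                        # {'a': 102, 'c': 119, 'g': 131, 't': 98}
print(round(gc_content(counts), 4))  # 0.5556
```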


2.11. ORIGIN line

The sequence data start on the line following ORIGIN.  The sequence is written 
in lower case letters, separated by a space every 10 bases, with a new line 
every 60 bases.  The number at the left of each line is the position of the 
first base on that line.  
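
The layout described above can be sketched in Python as follows: one function
writes a sequence as numbered lines of six 10-base blocks, and the other
rebuilds the sequence from such lines.  The function names are hypothetical,
and the 9-column width of the position number is an assumption inferred from
the sample entries in this note.

```python
def format_origin(sequence, number_width=9):
    """Lay out a sequence as the lines that follow ORIGIN."""
    sequence = sequence.lower()
    lines = []
    for start in range(0, len(sequence), 60):
        row = sequence[start:start + 60]
        blocks = [row[i:i + 10] for i in range(0, len(row), 10)]
        # The leading number is the 1-based position of the row's first base.
        lines.append(f"{start + 1:>{number_width}} " + " ".join(blocks))
    return lines


def parse_origin(lines):
    """Rebuild the sequence by dropping the position numbers and spaces."""
    return "".join("".join(line.split()[1:]) for line in lines)


sequence = "acgt" * 40  # 160 bases, for illustration only
lines = format_origin(sequence)
for line in lines:
    print(line)
print(parse_origin(lines) == sequence)  # True
```

The 160-base example produces three lines numbered 1, 61, and 121, and the
last line holds the remaining 40 bases in four blocks.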


3. Dataset categories

A number of genome projects are under way worldwide.  Among them, the human 
genome projects have probably been the most productive and have yielded a 
large number of ordinary sequences and huge amounts of genome sequences and 
ESTs (expressed sequence tags).  DDBJ therefore has the human (HUM) division 
solely for human sequences and the primate (PRI) division for non-human 
primate sequences, whereas the PRI division of the GenBank database also 
contains human sequences.  Note that other divisions such as EST, GSS, and HTC 
may also contain human sequences.  
The present release is divided into 21 categories of organisms and others.  See 
also '6. File categories' and '9. File list' below.  The 21 categories are 
described in the following.


3.1. Division categories

The first 20 divisions are given below; 

HUM; human 
PRI; primates (other than human) 
ROD; rodents 
MAM; mammals (other than primates and rodents) 
VRT; vertebrates (other than mammals) 
INV; invertebrates (animals other than vertebrates)
PLN; plants, fungi, plastids (eukaryotes other than animals)
BCT; bacteria (including both Eubacteria and Archaea)
VRL; viruses 
PHG; bacteriophages 
ENV; sequences obtained via environmental sampling methods 
SYN; synthetic constructs 
EST; expressed sequence tags; short single pass cDNA sequences 
GSS; genome survey sequences; short single pass genomic sequences 
HTC; high throughput cDNA sequences; 
     The sequence submitted from cDNA sequencing projects except for EST.  
     This division is to include unfinished high throughput cDNA sequences, 
     each of which has 5'UTR and 3'UTR at both ends and part of a coding region.
     The sequence may also include introns.  When the sequence becomes finished 
     later, it moves to the corresponding taxonomic division.  
HTG; high throughput genomic sequences 
     The sequence submitted mainly from genome sequencing projects which 
     regarded a clone as a sequencing unit.  
STS; sequence tagged sites 
     The tag site for genome sequencing.  The information of chromosome, map, 
     PCR_condition is mandatory for this division.  
PAT; patented data 
     The data submitted to JPO (Japan Patent Office), EPO (European Patent 
     Office), or USPTO (United States Patent and Trademark Office).  
     See also '3.3. Notice for patented data' below.  
UNA; the data not annotated 
     The UNA division is not used for recently submitted sequences.  
CON; Contig / Constructed 
     To conjugate a series of entries, such as those submitted from a genome 
     project, each of the data banks constructs an entry and assigns an 
     accession number to a large scale sequence dataset.  Such entries are 
     classified into the CON division.  An entry in the CON division holds the 
     joined accession numbers instead of the sequence data.  The entries that 
     make up a CON entry have been submitted to other divisions.  The entries 
     and bases in the CON division are not counted in the release totals given 
     at the top of the release note.  


3.2. TPA separated from primary dataset

TPA (Third Party Annotation) data are also available.  The TPA data are a 
complement to the existing DDBJ/EMBL/GenBank comprehensive database of primary 
nucleotide sequences, which typically result from direct sequencing of cDNAs, 
ESTs, genomic DNAs etc.  Primary entries are defined to be data for which the 
submitting group has done the sequencing and annotation, and as 'owner' of 
these data has privileges to submit updates/corrections etc.  Primary entries 
used to build a TPA sequence are those that have been experimentally determined 
and are publicly available in the DDBJ/EMBL/GenBank databases.  They may not be 
from a proprietary database.  The entries and bases in TPA are not counted in 
the released numbers given on the top of the release note.  
See also the following URLs;  
http://www.ddbj.nig.ac.jp/sub/tpa-e.html
http://www.insdc.org/TPA.html


3.3. Notice for patented data

This release includes the PAT division for patented data as described above.  The 
patented data are those which the Japanese Patent Office (JPO), United States 
Patent and Trademark Office (USPTO), and the European Patent Office (EPO) 
collected, processed and released.  The prefixes of accession numbers for the 
patented data are shown below; 
   -----------------------
    JPO  : E, BD, DD
    USPTO: I, AR, DZ, EA
    EPO  : A, AX, CQ, CS
   -----------------------
Note also that unauthorized use of the patented data may cause legal issues 
for which DDBJ takes no responsibility.


4. DDBJ staff

This release is published by the following DDBJ staff.  

  Gojobori T, Tateno Y, Sugawara H, Saitou N, Okubo K, 
  Ikeo K, Suzuki Y, Fukuchi S, Sumiyama K, Ogasawara O, Ogura A, Minezaki Y,
  Aono H, Atsumi T, Ehara Y, Ejima M, Fukuda D, Gojobori M, Hikino Y, 
  Hirai T, Hoshi N, Hosokawa T, Ikesaka T, Ishida K, Kawamoto T, Kohira J, 
  Koike T, Kosuge T, Kusakabe A, Lee K, Maesako H, Mamiya K, Maruyama M, 
  Maruyama N, Mashima J, Murakata N, Nagira S, Nagura M, Nishida S, 
  Nishinomiya N, Nozaki A, Okido T, Sakai K, Sugita R, Suzuki S, Tanabe W,
  Tsuboi M, Tsutsui H, Yagi H, Yagi Yamada H, Yamamoto K, Yamamoto M, 
  and Yokoyama E

Center for Information Biology and DNA Data Bank of Japan
National Institute of Genetics
Research Organization of Information and Systems

Mishima 411-8540, Japan 
Phone:  +81 55 981 6853
FAX:    +81 55 981 6849
E-mail: ddbj@ddbj.nig.ac.jp  (for general inquiry)
        ddbjsub@ddbj.nig.ac.jp  (for data submission)
        ddbjupdt@ddbj.nig.ac.jp (for updates and notification of publication)
WWW:    http://www.ddbj.nig.ac.jp/ (for DDBJ WWW server)
        http://sakura.ddbj.nig.ac.jp/ 
        (for DDBJ sequence data submission system)


5. Acknowledgment

We are grateful to NCBI and EMBL/EBI for their firm friendship and excellent 
collaboration.  We also thank the Japanese Patent Office for its steady 
cooperation.  The operation of DDBJ is supported by the Ministry of 
Education, Culture, Sports, Science and Technology, which we gratefully 
acknowledge.  DDBJ uses the Super-SINET computer network for data 
collection, data exchange and various services.  


6. File categories

This release covers 21 categories of organisms and others (see also 
'3. Dataset categories'), as follows: 
------------------------------------------------------------------------------
ddbjbct*** Category for bacteria
ddbjcon*** Category for CON (contig sequences)
ddbjenv*** Category for ENV (environmental samples)
ddbjest*** Category for EST (expressed sequence tags)
ddbjgss*** Category for GSS (genome survey sequences)
ddbjhtc*** Category for HTC (high throughput cDNA sequences)
ddbjhtg*** Category for HTG (high throughput genomic sequences)
ddbjhum*** Category for human
ddbjinv*** Category for invertebrates
ddbjmam*** Category for mammals other than primates and rodents
ddbjpat*** Category for patents
ddbjphg*** Category for phages
ddbjpln*** Category for plants
ddbjpri*** Category for primates other than human
ddbjrod*** Category for rodents
ddbjsts*** Category for STS (sequence tagged sites)
ddbjsyn*** Category for synthetic DNAs
ddbjtpa*** Category for TPA (Third Party Annotation)
ddbjuna*** Category for unannotated sequences
ddbjvrl*** Category for viruses
ddbjvrt*** Category for vertebrates other than mammals
------------------------------------------------------------------------------

Some of the categories above are recorded in multiple ddbj***##.seq files in 
the present release, each of which is at most 1.5 GB in size.  The number of 
files for each of these categories is as follows: 

---------------------
ddbjbct :   4 files
ddbjest :  96 files
ddbjgss :  41 files
ddbjhtc :   2 files
ddbjhtg :  16 files
ddbjhum :   5 files
ddbjinv :   2 files
ddbjpat :   5 files
ddbjpln :   5 files
ddbjrod :   5 files
ddbjsts :   3 files
ddbjvrl :   2 files
ddbjvrt :   3 files
ddbjcon :  13 files
---------------------

The index files included in this release are ddbjacc#.idx, ddbjgen.idx, 
ddbjjou#.idx, and ddbjkey#.idx.  See also '9. File list'.  All of them except 
ddbjgen.idx are recorded in multiple ddbj****.idx files, each of which is at 
most 1.5 GB in size.  


7. Sample of the contents in each file

7.1. Part of the contents in the file 'ddbjbct1.seq'

The following shows all the information of one entry in DDBJ format.  
------------------------------------------------------------------------------
LOCUS       D87069                   993 bp    mRNA    linear   BCT 14-APR-2000
DEFINITION  Escherichia coli mRNA for RNA polymerase sigma subunit, truncated
            form of sigma-38, complete cds.
ACCESSION   D87069
VERSION     D87069.1
KEYWORDS    RNA polymerase sigma subunit, truncated form of sigma-38.
SOURCE      Escherichia coli
  ORGANISM  Escherichia coli
            Bacteria; Proteobacteria; gamma subdivision; Enterobacteriaceae;
            Escherichia.
REFERENCE   1  (bases 1 to 993)
  AUTHORS   Jishage,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (14-AUG-1996) to the DDBJ/EMBL/GenBank databases. Miki
            Jishage, National Institute of Genetics, Molecular Genetics; Yata
            1111, Mishima, Shizuoka 411, Japan (E-mail:mjishage@lab.nig.ac.jp,
            Tel:0559-81-6742, Fax:0559-81-6746)
REFERENCE   2  (bases 1 to 993)
  AUTHORS   Jishage,M. and Ishihama,A.
  TITLE     Variation in RNA polymerase sigma subunit composition within
            different stocks of Escherichia coli starin W3110
  JOURNAL   Unpublished (1996)
REFERENCE   3
  AUTHORS   Ivanova,A., Renshaw,M., Guntaka,R. and Eisenstark,A.
  TITLE     DNA base sequence variability in katF (putative sigma factor) gene
            Escherichia coli
  JOURNAL   Nucleic Acids Res. 20, 5479-5480 (1992)
REFERENCE   4
  AUTHORS   Takayanagi,Y., Tanaka,K. and Takahashi,H.
  TITLE     Structure of the 5' upstream region and the regulation of the rpoS
            gene of Escherichia coli
  JOURNAL   Mol. Gen. Genet. 243, 525-531 (1994)
COMMENT
FEATURES             Location/Qualifiers
     source          1..993
                     /mol_type="mRNA"
                     /organism="Escherichia coli"
                     /strain="W3110"
     CDS             1..810
                     /note="the gene has four single base changes, resulting
                     in two amino acid substitutions and an amber mutation"
                     /product="RNA polymerase sigma subunit, truncated form of
                     sigma-38"
                     /protein_id="BAA13238.1"
                     /transl_table=11
                     /translation="MSQNTLKVHDLNEDAEFDENGVEVFDEKALVEYEPSDNDLAEEE
                     LLSQGATQRVLDATQLYLGEIGYSPLLTAEEEVYFARRALRGDVASRRRMIESNLRLV
                     VKIARRYGNRGLALLDLIEEGNLGLIRAVEKFDPERGFRFSTYATWWIRQTIERAIMN
                     QTRTIRLPIHIVKELNVYLRTARELSHKLDHEPSAEEIAEQLDKPVDDVSRMLRLNER
                     ITSVDTPLGGDSEKALLDILADEKENGPEDTTQDDDMKQSIVKWLFELNAK"
     variation       75
                     /citation=[3]
                     /replace="t"
     variation       97
                     /citation=[3]
                     /replace="t"
     variation       99
                     /citation=[3]
                     /replace="t"
     variation       808
                     /citation=[3]
                     /replace="t"
BASE COUNT          254 a          223 c          291 g          225 t
ORIGIN
        1 atgagtcaga atacgctgaa agttcatgat ttaaatgaag atgcggaatt tgatgagaac
       61 ggagttgagg tttttgacga aaaggcctta gtagaatatg aacccagtga taacgatttg
      121 gccgaagagg aactgttatc gcagggagcc acacagcgtg tgttggacgc gactcagctt
      181 taccttggtg agattggtta ttcaccactg ttaacggccg aagaagaagt ttattttgcg
      241 cgtcgcgcac tgcgtggaga tgtcgcctct cgccgccgga tgatcgagag taacttgcgt
      301 ctggtggtaa aaattgcccg ccgttatggc aatcgtggtc tggcgttgct ggaccttatc
      361 gaagagggca acctggggct gatccgcgcg gtagagaagt ttgacccgga acgtggtttc
      421 cgcttctcaa catacgcaac ctggtggatt cgccagacga ttgaacgggc gattatgaac
      481 caaacccgta ctattcgttt gccgattcac atcgtaaagg agctgaacgt ttacctgcga
      541 accgcacgtg agttgtccca taagctggac catgaaccaa gtgcggaaga gatcgcagag
      601 caactggata agccagttga tgacgtcagc cgtatgcttc gtcttaacga gcgcattacc
      661 tcggtagaca ccccgctggg tggtgattcc gaaaaagcgt tgctggacat cctggccgat
      721 gaaaaagaga acggtccgga agataccacg caagatgacg atatgaagca gagcatcgtc
      781 aaatggctgt tcgagctgaa cgccaaatag cgtgaagtgc tggcacgtcg attcggtttg
      841 ctggggtacg aagcggcaac actggaagat gtaggtcgtg aaattggcct cacccgtgaa
      901 cgtgttcgcc agattcaggt tgaaggcctg cgccgtttgc gcgaaatcct gcaaacgcag
      961 gggctgaata tcgaagcgct gttccgcgag taa
//
------------------------------------------------------------------------------


7.2. Part of the contents in the accession number index file 'ddbjacc1.idx'

The following excerpt from the accession number index file illustrates the
format of the index.  
------------------------------------------------------------------------------
D00001       ECPBPA       BCT X04516
D00002       ECPYRC       BCT X04469
D00003       HUMP450M     HUM D00003
D00004       FLBFLBL40    VRL D00004
D00005       IBAMEM682    VRL D00005
D00006       BACPNS1981   BCT D00006
D00007       CHKCALGRP    VRT D00007
D00008       ECPNTAB      BCT X04195
D00009       DROPER1      INV D00009
------------------------------------------------------------------------------
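
Each line of the excerpt above has four whitespace-separated columns.  The
following Python sketch reads one line into a named record.  The column
meanings are not stated in this note; the names used here (indexed accession
number, locus name, division code, primary accession number of the entry) are
an assumption based on the excerpt, and the class and function names are
hypothetical.

```python
from typing import NamedTuple


class AccessionIndexRecord(NamedTuple):
    accession: str          # the accession number being indexed
    locus_name: str         # locus name of the entry (assumed meaning)
    division: str           # division code, e.g. BCT
    primary_accession: str  # primary accession number (assumed meaning)


def parse_accession_index_line(line):
    """Split one line of the accession number index into its four columns."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 columns, found {len(fields)}: {line!r}")
    return AccessionIndexRecord(*fields)


record = parse_accession_index_line("D00001       ECPBPA       BCT X04516")
print(record.accession, record.division, record.primary_accession)
```

Under this reading, a line whose first and last columns differ (such as D00001
and X04516) would index a secondary accession number.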


7.3. Part of the contents in the keyword phrase index file 'ddbjkey1.idx'

Keyword phrases consist of names for gene products and other characteristics
of sequence entries.  
------------------------------------------------------------------------------
"COAT PROTEIN
             SMO511347    VRL AJ511347
'TNPA GENE
             UBA564903    BCT AJ564903
'ZINC-FINGER' MOTIF
             PRNS53       VRL X60546
(+) MATING TYPE SURFACE PROTEIN
             ABGPSSP      PLN M94861
(1,3
             TABETGLUB    PLN Z22874
(1,3)-BETA-D-GLUCAN BINDING PROTEIN
             AJ606470     INV AJ606470
(1,3)BETA-GLUCAN SYNTHASE
             NCU09275     PLN U09275
(1,4)-BETA-D-ARABINOXYLAN ARABINOFURANOHYDROLASE
             ANAXHA       PLN Z78011      ANTUAXHA     PLN Z78010
(1,6)-BETA-GLUCAN BIOSYNTHESIS
             YSAKRE1A     PLN M81588
(1-3)-BETA-GLUCANASE
             NTSP41AGN    PLN X81560      PA13BGPT     PLN X57794
(1-3,1-4)-BETA-D-GLUCANASE
             HVBDG        PLN X52572
(1-4)-BETA-MANNAN ENDOHYDROLASE
             CAR278996    PLN AJ278996    CAR293305    PLN AJ293305
(2',5'-OLIGOISOADENYLATE SYNTHETASE-DEPENDENT)
             AL138776     HUM AL138776
(2'-5') OLIGO(A) SYNTHASE E16
             SSO4G06      EST F14610
(2'-5')OLIGOADENYLATE SYNTHETASE
             HSA225089    HUM AJ225089    HUMSYN25A    HUM D00068
             SSA225090    MAM AJ225090
(6')-IB' AMINOGLYCOSIDE ACETYLTRANSFERASE
             AXY278514    BCT AJ278514    PAE291609    BCT AJ291609
(8,11)-LINOLEOYL DESATURASE
             COF245938    PLN AJ245938
------------------------------------------------------------------------------
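
In the excerpt above, each keyword phrase starts in the first column and is
followed by indented lines that hold one or two groups of three columns.  The
following Python sketch collects these groups under their phrase.  The
function name is hypothetical, and reading each group as locus name, division
code, and accession number is an assumption based on the excerpt.  The same
layout appears in the journal citation and gene name index samples in 7.4 and
7.5.

```python
def parse_phrase_index(lines):
    """Map each phrase to a list of (locus_name, division, accession)."""
    index = {}
    phrase = None
    for line in lines:
        if not line.strip():
            continue
        if not line[0].isspace():
            # An unindented line is a new phrase.
            phrase = line.rstrip()
            index.setdefault(phrase, [])
            continue
        if phrase is None:
            raise ValueError("entry line found before any phrase")
        fields = line.split()
        if len(fields) % 3 != 0:
            raise ValueError(f"expected groups of 3 columns: {line!r}")
        for i in range(0, len(fields), 3):
            index[phrase].append(tuple(fields[i:i + 3]))
    return index


sample = [
    "(1-3)-BETA-GLUCANASE",
    "             NTSP41AGN    PLN X81560      PA13BGPT     PLN X57794",
    "(1-3,1-4)-BETA-D-GLUCANASE",
    "             HVBDG        PLN X52572",
]
for phrase, entries in parse_phrase_index(sample).items():
    print(phrase, entries)
```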


7.4. Part of the contents in the journal citation index file 'ddbjjou1.idx'

The journal citation index file lists all of the citations that appear in the
references.  
------------------------------------------------------------------------------
(ER) AAPS PHARMSCI. 4 (3), DOI 10.1208/PS040315 (2002)
             AY170916     ROD AY170916
(ER) AM. J. HUM. GENET. 76 (1) (2004) IN PRESS
             AY753209S1   HUM AY753209    AY753209S2   HUM AY753210
(ER) ARCH. VIROL. (2004) IN PRESS
             AF531505     VRL AF531505    AY518899     VRL AY518899
             AY518900     VRL AY518900    AY518901     VRL AY518901
             AY518902     VRL AY518902    AY518903     VRL AY518903
             AY518904     VRL AY518904    AY518905     VRL AY518905
             AY518906     VRL AY518906    AY518907     VRL AY518907
             AY518908     VRL AY518908    AY518909     VRL AY518909
             AY518910     VRL AY518910    AY518911     VRL AY518911
             AY518912     VRL AY518912    AY518913     VRL AY518913
             AY518914     VRL AY518914    AY518915     VRL AY518915
             AY518916     VRL AY518916    AY518917     VRL AY518917
             AY518918     VRL AY518918    AY518919     VRL AY518919
             AY518920     VRL AY518920    AY518921     VRL AY518921
             AY518922     VRL AY518922    AY518923     VRL AY518923
             AY518924     VRL AY518924    AY518925     VRL AY518925
             AY518926     VRL AY518926    AY518927     VRL AY518927
             AY518928     VRL AY518928    AY518929     VRL AY518929
             AY518930     VRL AY518930    AY518931     VRL AY518931
             AY518932     VRL AY518932    AY521234     VRL AY521234
             AY521235     VRL AY521235    AY521236     VRL AY521236
             AY521237     VRL AY521237    AY521238     VRL AY521238
(ER) ARTERIOSCLER. THROMB. VASC. BIOL. (2004) IN PRESS
             AY563557     HUM AY563557
(ER) BIOCHEM. BIOPHYS. RES. COMMUN. 325 (1), 203-214 (2004)
             AY563137     HUM AY563137
(ER) BIOCHEM. J./10.1042/BJ20030293
             HSA496460    HUM AJ496460
------------------------------------------------------------------------------


7.5. Part of the contents in the gene name index file 'ddbjgen.idx'

This file lists all the gene names that appear in the feature table.  
------------------------------------------------------------------------------
'ARR
             BX927156     BCT BX927156
'BGLG
             BX927156     BCT BX927156
'BGLS
             BX927148     BCT BX927148
'BGLY'
             BX927156     BCT BX927156
'BRNQ
             AF305888     BCT AF305888
'COMK
             AL591983     BCT AL591983    AL596172     BCT AL596172
'CRCB
             BX927155     BCT BX927155
'CRTI
             BX927155     BCT BX927155
'DPPE
             LDDIPEP      BCT Z34898
'FIC
             BX936398     BCT BX936398
------------------------------------------------------------------------------


8. Release history

Release  Date     Entries     Bases          Comments
 70     06/07  72,801,679  76,788,510,646
 69     03/07  67,523,680  71,775,679,500   PROJECT line started
                                            Indexes for each category terminated
 68     12/06  64,267,978  68,259,314,742   1.5 GB storage started
 67     09/06  61,144,621  65,443,024,193
 66     06/06  58,176,628  62,945,843,881
 65     03/06  55,890,995  60,564,721,635   TPA subcategories started
 64     12/05  52,272,669  56,098,558,378   Some index files split
 63     09/05  47,741,593  52,246,110,341
 62     06/05  45,249,444  49,158,155,283   ENV started
                                            Version for release note started
 61     03/05  43,118,204  47,099,081,750   Changed style of release note
 60     12/04  40,583,945  44,416,752,273   /db_xref="H-inv:**" started
 59     09/04  37,926,117  42,245,956,937
 58     06/04  34,917,581  39,812,635,108
 57     03/04  32,693,678  38,008,449,840
 56     12/03  30,405,173  36,079,046,032
 55     09/03  27,753,140  34,280,225,489
 54     06/03  25,149,821  32,162,041,177
 53     02/03  23,250,813  29,711,299,332
 52     12/02  20,354,812  26,931,456,316
 51     09/02  18,401,358  22,782,404,136   TPA started
 50     06/02  17,260,693  20,158,357,982
 49     04/02  16,503,157  18,579,627,226
 48     01/02  15,016,100  16,197,713,855
 47     10/01  13,266,610  14,145,671,645
 46     07/01  12,313,759  13,037,646,166
 45     04/01  11,434,113  12,207,092,905   HTC division started
 44     01/01  10,165,597  11,136,298,841
 43     10/00   8,666,551  10,034,532,698
 42     07/00   7,554,995   8,880,721,093
 41     04/00   5,962,608   6,409,581,885   CON division started
 40     01/00   5,388,125   4,762,696,173   RNA division terminated
 39     10/99   4,810,773   3,728,000,562   NID and PID discarded
 38     07/99   4,294,369   3,098,519,597
 37     03/99   3,311,627   2,375,261,951   VERSION, /protein_id started
 36     01/99   3,073,166   2,190,425,560
 35     10/98   2,759,261   1,957,341,169
 34     07/98   2,412,785   1,708,580,623
 33     04/98   2,174,769   1,479,303,279
 32     01/98   1,956,669   1,300,950,613
 31     10/97   1,731,532   1,139,869,464   Adoption of the unified taxonomy 
                                            database
 30     07/97   1,534,115     992,788,339   NID and PID terminated
 29     04/97   1,270,194     841,415,232   
 28     01/97   1,154,120     756,785,219   HTG division started
                                            ORG division terminated
 27     10/96     936,697     608,103,057   GSS division started
 26     07/96     835,552     551,932,448   
 25     04/96     744,490     499,300,364   /translation started
 24     01/96     637,508     431,771,652   
 23     10/95     569,757     390,694,350   
 22     07/95     437,588     322,982,425   HUM division started
 21     04/95     274,596     250,875,023   
 20     01/95     239,689     231,299,557   
 19     10/94     204,332     205,274,131   
 18     07/94     185,230     192,473,021   
 17     04/94     169,957     179,942,209   
 16     01/94     154,626     165,017,628   
 15     10/93     131,649     147,224,690   
 14     07/93     120,350     138,686,333   
 13     04/93     112,067     129,784,445   
 12     01/93      97,683     120,815,244   EST division started
 11     07/92      65,693      84,839,075   
 10     01/92      59,317      77,805,556   GenBank/EMBL inclusion started
  9     07/91       1,130       2,002,124   
  8     01/91         879       1,573,442   
  7     07/90         681       1,154,211   
  6     01/90         496         841,236   
  5     07/89         395         679,378   
  4     01/89         302         535,985   
  3     07/88         230         345,850   
  2     01/88         142         199,392   
  1     07/87          66         108,970   Started with DDBJ only


------------------
Since release 69
------------------

Introduction of the project ID in the PROJECT line of the DDBJ flat file:
Following the agreement at the INSD collaborative meeting in 2006, INSDC has
started to assign a project ID to submissions from sequencing projects.
The project ID is described as follows:
----------------------------------------------------------------------------
  A unique identifier, assigned at the time of the submission by a sequencing 
  project that informed INSDC of the submission beforehand.  It is recommended 
  that the submitter quotes the assigned project ID in all communication with 
  INSDC databases to allow for easier and faster tracking of issues.  
  The project ID field provides an umbrella identifier that points to all 
  related sequence data for the project.  
----------------------------------------------------------------------------
The PROJECT line contains the INSDC-assigned ID for the sequencing project.
It appears between the VERSION and KEYWORDS lines in DDBJ flat files from
periodical release 69 onward, as shown below.  See also '2. DDBJ flat
file format'.
----------------------------------------------------------------------------
ACCESSION   AB012345
VERSION     AB012345.1
PROJECT     GenomeProject:123
KEYWORDS    .
----------------------------------------------------------------------------
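As a rough illustration of how the PROJECT line might be read by a program,
the following Python sketch extracts the project ID from the header lines of
an entry.  The function name read_project_id is hypothetical, and the sketch
assumes that the PROJECT line carries a single 'Database:ID' token as in the
sample above.

```python
def read_project_id(header_lines):
    """Return (database, id) from the PROJECT line, or None if there is none."""
    for line in header_lines:
        if line.startswith("PROJECT"):
            # Everything after the keyword is the value, e.g. 'GenomeProject:123'.
            value = line[len("PROJECT"):].strip()
            database, _, ident = value.partition(":")
            return database, ident
    return None


sample = [
    "ACCESSION   AB012345",
    "VERSION     AB012345.1",
    "PROJECT     GenomeProject:123",
    "KEYWORDS    .",
]
print(read_project_id(sample))
```

Entries released before release 69, or entries that do not belong to a
registered project, have no PROJECT line; the sketch returns None for them.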


Termination of the index files for each category:
For users logging in to one of our computers (supernig), we used to provide
index files for each category.  However, the computer system in our institute
was replaced with a new one that does not offer a service using the index
files, so we stopped providing them.


------------------
Since release 68
------------------

Split of files:
We raised the maximum file size from 300 MB to 1.5 GB, because network
capacity has increased remarkably.  Each file named ddbj***##.seq is at
most 1.5 GB in size.  See also the sections '6. File categories' and
'9. File list'.


------------------
Since release 65
------------------

Introduction of two types of TPA entries:
Following the decision of ICM 2005, the TPA dataset is now classified into
two categories, "TPA:experimental" and "TPA:inferential", to distinguish TPA
annotation supported by wet-lab experimental evidence from annotation that is
inferred.  The retrofit to divide TPA entries into the two categories starts
with release 65.
You can find the description of the two TPA categories at the following URLs:
http://www.ddbj.nig.ac.jp/sub/tpa-e.html
http://www.insdc.org/TPA.html
See also '3.2. TPA separated from primary dataset'.  


------------------
Since release 64
------------------

Split of index files:
As of release 64, some of the index files (ddbjacc.idx, ddbjjou.idx, and
ddbjkey.idx) exceeded 2 GB in size.  Each of them is therefore recorded in
multiple ddbj****.idx files, each of which is at most 1.5 GB in size.
See also 6., 7.2., 7.3., 7.4. and 9.


------------------
Since release 62
------------------

Introduction of the release version number:
DDBJ has started to include a 'version' item in its release note, which
indicates the version of the periodical release.  It is expressed like '62.0',
in which the digit(s) after the period are the version number.  The version
number was added because released data are sometimes revised for urgent and
necessary corrections.  The number is increased by one each time a revised
periodical release is made public before the next release.
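
The numbering scheme can be sketched in Python as follows.  The function
names are hypothetical; the only assumption taken from the text is the
'release.version' form, in which the part after the period counts the
revisions of that periodical release.

```python
def parse_release_version(label):
    """Split a label such as '70.1' into (release, version) integers."""
    release, _, version = label.partition(".")
    # A label without a period (e.g. '61') is treated as version 0.
    return int(release), int(version or 0)


def next_revision(label):
    """Return the label of the next revised release, e.g. '70.0' -> '70.1'."""
    release, version = parse_release_version(label)
    return f"{release}.{version + 1}"


print(parse_release_version("70.1"))
print(next_revision("70.0"))
```

Comparing the parsed tuples rather than the raw strings keeps, for example,
'70.10' after '70.9' when labels are sorted.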

Introduction of ENV division:  
Recently, the submissions of the sequences derived from environmental samples 
have rapidly increased.  To accommodate such submissions, a new division, ENV, 
has been created (See also '3.1. Division categories').  This division contains 
the sequences obtained via direct molecular isolation such as PCR, DGGE, or any 
anonymous method.  In the past, the sequences derived from environmental 
samples belonged to taxonomic divisions, mainly BCT.  At DDBJ, the retrofit to 
transfer relevant entries from taxonomic divisions to the ENV division starts 
in the present release, and ends by the next periodical release.  Please note 
that during this transitional period, some entries to be eventually placed in 
the ENV division will be found in other divisions.  

Strand information is removed:  
The strand information of LOCUS line in the flat file has been removed as shown 
below.  See also '2.1. LOCUS line'.  
----------------------------------------------------------------------------
Old (-rel. 61):
  44-44     space
  45-47     spaces, ss- (single-stranded), ds- (double-stranded), or 
             ms- (mixed-stranded)
New (rel. 62-):
  44-47     spaces
----------------------------------------------------------------------------


------------------
Since release 61
------------------
The style of release note (this file) has been changed.  

Some entries use a range notation for the secondary accession numbers in
the ACCESSION line, in order to shorten a long list of consecutive secondary
accession numbers.  For example;
------------------------------------------------------------------------------
Before;
ACCESSION   AB000802 D85885 D85886 D85887
After;
ACCESSION   AB000802 D85885-D85887
------------------------------------------------------------------------------
See also '2.3. ACCESSION line'.  
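
A program that needs the individual accession numbers can expand the range
notation again.  The following Python sketch is one possible way to do so;
the function name expand_accessions is hypothetical, and it assumes that
both ends of a range share the same letter prefix and the same number of
digits, as in the example above.

```python
import re

# An accession number is taken to be a letter prefix followed by digits.
_ACCESSION = re.compile(r"^([A-Z]+)(\d+)$")


def expand_accessions(line):
    """Expand ranges such as 'D85885-D85887' in an ACCESSION line."""
    result = []
    for token in line.split()[1:]:  # skip the 'ACCESSION' keyword
        if "-" not in token:
            result.append(token)
            continue
        first, last = token.split("-", 1)
        m1, m2 = _ACCESSION.match(first), _ACCESSION.match(last)
        if (not m1 or not m2 or m1.group(1) != m2.group(1)
                or len(m1.group(2)) != len(m2.group(2))):
            raise ValueError(f"unexpected accession range: {token}")
        prefix, width = m1.group(1), len(m1.group(2))
        for number in range(int(m1.group(2)), int(m2.group(2)) + 1):
            # Zero-pad so that the digit count of the original is preserved.
            result.append(f"{prefix}{number:0{width}d}")
    return result


print(expand_accessions("ACCESSION   AB000802 D85885-D85887"))
```

The first number on the line is the primary accession number; the remaining
ones are the secondary accession numbers.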


------------------
Since release 60
------------------
A cross-reference to the H-invitational database, /db_xref="H-inv:**", has
been included.


------------------
Since release 56
------------------
The three data banks have agreed that the maximum length limitation (350 kb)
of a submitted sequence be relaxed.

The BASE COUNT line of the DDBJ flat file format has been changed to
accommodate the relaxation of the maximum sequence length restriction at the
DDBJ/EMBL/GenBank International Nucleotide Sequence Databases.  Previously,
6 digits were allocated in the BASE COUNT line for each count of a, c, g, t
and other bases in the sequence.  In the new flat file format, 9 digits are
allocated for each count of a, c, g and t, and the count of other bases is
removed.  In accordance with the relaxation of the sequence length
limitation, GenBank dropped the BASE COUNT line from its flat file format as
of GenBank Release 138 (Oct. 2003).  DDBJ has decided to keep the BASE COUNT
line in its flat file format, in the view that GC content is still important
information for characterizing a sequence.  The changes in the BASE COUNT
line are shown below.
----------------------------------------------------------------------------
Old (-rel. 55): 
    1    6   11   16   21   26   31   36   41   46   51   56   61   66   71
    |----|----|----|----|----|----|----|----|----|----|----|----|----|----|
    BASE COUNT   123456 a 123456 c 123456 g 123456 t 123456 others

New (rel. 56-): 
    1    6   11   16   21   26   31   36   41   46   51   56   61   66   71
    |----|----|----|----|----|----|----|----|----|----|----|----|----|----|
    BASE COUNT    123456789 a    123456789 c    123456789 g    123456789 t
----------------------------------------------------------------------------
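
The new layout and the GC content mentioned above can be sketched in Python
as follows.  The function names are hypothetical, and the field widths
(four spaces, a 9-digit right-justified count, a space and the base letter)
are read off the sample line above rather than taken from a formal
specification.

```python
def format_base_count(sequence):
    """Build a BASE COUNT line in the layout used from release 56."""
    seq = sequence.lower()
    fields = []
    for base in "acgt":
        # 4 separating spaces, 9 digits for the count, then the base letter.
        fields.append(f"    {seq.count(base):>9d} {base}")
    return "BASE COUNT" + "".join(fields)


def gc_content(base_count_line):
    """Return the fraction of g and c among a, c, g and t."""
    tokens = base_count_line.split()[2:]  # drop 'BASE' and 'COUNT'
    counts = {name: int(n) for n, name in zip(tokens[0::2], tokens[1::2])}
    total = sum(counts[base] for base in "acgt")
    return (counts["g"] + counts["c"]) / total if total else 0.0


line = format_base_count("acgtacgtgg")
print(line)
print(gc_content(line))
```

Because gc_content only looks at the a, c, g and t fields, it also accepts a
line in the old layout, where an additional 'others' field is present.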

The SOURCE line in the flat file is reviewed and revised where necessary in
accordance with the unified taxonomy database common to the three data banks.


------------------
Since release 54
------------------
The '/sequenced_mol' qualifier has been changed to the '/mol_type' qualifier,
and we have completed retrofitting the pertinent entries accordingly.
This change follows the agreement at the INSD collaborative meeting in 2002.


------------------
Since release 51
------------------
The TPA (Third Party Annotation) dataset has been made available.  The
dataset complements the existing DDBJ/EMBL/GenBank database of primary
nucleotide sequences, which were obtained by direct sequencing of cDNAs,
ESTs, genomic DNAs etc.

The format of LOCUS line in the flat file has been changed as shown below 
to adjust to the GenBank format.  
------------------------------------------------------------------------------
Old (-rel. 50): 
LOCUS       AB000001      660 bp    DNA             PLN       01-FEB-2001
New (rel. 51-): 
LOCUS       AB000001                 660 bp    DNA     linear   PLN 01-FEB-2001
------------------------------------------------------------------------------


------------------
Since release 45
------------------
The HTC (High Throughput cDNA) division has been included.  This is to include
unfinished high throughput cDNA sequences, each of which has 5'UTR and 3'UTR
at both ends and part of a coding region.  The sequence may also include
introns.  When the sequence is later finished, it moves to the
corresponding taxonomic division.  The sequence is accompanied by the keyword
HTC (High Throughput cDNA), which is dropped when the sequence is finished and
moved to a taxonomic division.


------------------
Since release 41
------------------
The CON division has been included.  This division shows the order of
related sequences in a genome, expressed by a join of the accession numbers
of the sequences.  The contents of the CON division are compiled by the three
data banks, not by the data submitter.


------------------
Since release 40
------------------
The RNA division was terminated.  The RNA data have been redistributed 
according to the category of the organism.  Therefore, you will find a human 
RNA sequence, for example, in the HUM division.  


------------------
Since release 37
------------------
The three data banks include the item VERSION in the flat file, which
indicates the version of a submitted nucleotide sequence.  It is expressed
like AB123456.1, in which the digit(s) after the period are the version number.
VERSION was added because a released sequence is sometimes revised by the
submitter, so the accession number alone cannot identify the sequence in
question, which causes trouble for users.  The number is increased by
one each time a revised sequence is made public.

Accordingly, the translated protein sequence is accompanied by a
/protein_id, expressed like BAA12345.1, in which the digit(s) after the
period are again a version number.  The number is increased by one when the
corresponding nucleotide sequence is revised, the protein sequence
changes as a result, and the revised protein sequence is made public.


------------------
Since release 31
------------------
We have started adopting the unified taxonomy database to unify the biological
source of the sequence.  The database is made up of scientific names, IDs of
unidentified organisms, synthetic constructs etc.


------------------
Since release 30
------------------
NID and PID were terminated.  This change was made on the agreement at the 
INSD collaborative meeting in 1999.  


------------------
Since release 28
------------------
The HTG (High Throughput Genomic sequence) division has been included.  This
division was created to accommodate genome project teams that treat a clone
as a sequencing unit.

We terminated the ORG (Organelle) division.  Thus, if you are interested in 
human mitochondrial sequences, for example, you are now advised to refer to 
the HUM division.  


------------------
Since release 27
------------------
The GSS division has been included.  GSS stands for Genome Survey Sequence, 
which is similar to EST, except that GSS is genomic DNA whereas EST is cDNA.  


------------------
Since release 25
------------------
The DDBJ release contains amino acid sequences translated from the
corresponding nucleotide sequences in the database.  In the translation, we
took care to use the proper codon table, since some species and organelles
use a genetic code different from the universal one.
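
To illustrate why the choice of codon table matters, the following Python
sketch translates the same nucleotide sequence with the standard genetic code
and with the vertebrate mitochondrial code.  It is not the procedure used at
DDBJ; the function name translate and the constant names are hypothetical,
and the two 64-letter strings list the amino acids in TCAG codon order.

```python
BASES = "TCAG"
# Amino acids for codons TTT, TTC, TTA, TTG, TCT, ... GGG ('*' is a stop).
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
VERT_MITO = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"


def translate(dna, table=STANDARD):
    """Translate a DNA sequence in frame 1 with the given codon table."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        try:
            index = (16 * BASES.index(codon[0])
                     + 4 * BASES.index(codon[1])
                     + BASES.index(codon[2]))
        except ValueError:
            # Codon contains a symbol other than A, C, G or T.
            protein.append("X")
            continue
        protein.append(table[index])
    return "".join(protein)


print(translate("ATGTGAATA"))             # M*I
print(translate("ATGTGAATA", VERT_MITO))  # MWM
```

In this example TGA is a stop codon in the standard code but encodes
tryptophan in vertebrate mitochondria, and ATA encodes isoleucine in the
standard code but methionine in vertebrate mitochondria.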


------------------
Since release 22
------------------
The HUM division has been included.  Human genome projects have probably been
the most productive and have yielded a large number of sequences.  Thus, we
have the human (HUM) division solely for human sequences and the primate (PRI)
division for non-human primate sequences.


------------------
Since release 12
------------------
The EST (Expressed Sequence Tag) division has been included.  The number of
ESTs has been increasing at an enormous rate and is expected to grow even
more rapidly in the future.  Thus, we created a division for ESTs.


------------------
Since release 10
------------------
The sequences submitted to GenBank or EMBL have been included in the release.  


9. File list

The files in this release are arranged in the following order with non-labeled 
format.  

-----------------------------------------------------------------------
file name                                               file size
-----------------------------------------------------------------------
ddbjrel.txt   (DDBJ release note)                           64321
ddbjacc1.idx  (Accession number index file 1)          1499999975
ddbjacc2.idx  (Accession number index file 2)          1339766753
ddbjgen.idx   (Gene name index file)                    104770898
ddbjjou1.idx  (Journal citation index file 1)          1423922988
ddbjjou2.idx  (Journal citation index file 2)          1383276720
ddbjjou3.idx  (Journal citation index file 3)           317494465
ddbjkey1.idx  (Keyword phrase index file 1)             396145508
ddbjkey2.idx  (Keyword phrase index file 2)            1187440463
ddbjkey3.idx  (Keyword phrase index file 3)            1496349953


-----------------------------------------------------------------------
file name          number of entries   number of bases  file size
-----------------------------------------------------------------------
ddbjbct1.seq               116790       612204313      1500729056
ddbjbct2.seq                81482       636592710      1499441986
ddbjbct3.seq                  332       676439812      1509203405
ddbjbct4.seq               115635       311092165       812162638
ddbjenv.seq                402505       349565340      1132129359
ddbjest1.seq               462723       173137934      1499001882
ddbjest2.seq               490586       191924402      1499001909
ddbjest3.seq               497952       205900930      1499000841
ddbjest4.seq               480048       204838191      1499001969
ddbjest5.seq               546092       299467851      1499001704
ddbjest6.seq               560107       337844378      1499000418
ddbjest7.seq               496045       237894443      1499003741
ddbjest8.seq               344549       107822939      1499000357
ddbjest9.seq               566901       270738634      1499000670
ddbjest10.seq              470062       197023652      1499002351
ddbjest11.seq              481435       206957234      1499002806
ddbjest12.seq              270149        81538639      1499000722
ddbjest13.seq              268825        82287458      1499000224
ddbjest14.seq              268308       116313271      1499002724
ddbjest15.seq              437525       207319899      1499002776
ddbjest16.seq              480797       251188850      1499000738
ddbjest17.seq              460404       259232762      1499000566
ddbjest18.seq              444035       243416617      1499000224
ddbjest19.seq              469183       219174864      1499002196
ddbjest20.seq              464350       288971246      1499000555
ddbjest21.seq              481749       272181559      1499001398
ddbjest22.seq              449806       250528423      1499000845
ddbjest23.seq              446002       262994447      1499000229
ddbjest24.seq              564256       309042359      1499002453
ddbjest25.seq              511364       297587339      1499000167
ddbjest26.seq              410969       216668496      1499002540
ddbjest27.seq              424488       252666839      1499000099
ddbjest28.seq              497951       273048625      1499001908
ddbjest29.seq              510276       236211503      1499000933
ddbjest30.seq              436446       245569935      1499001886
ddbjest31.seq              446456       262588722      1499002484
ddbjest32.seq              436272       304850329      1499002021
ddbjest33.seq              412050       270775601      1499004198
ddbjest34.seq              580977       350781213      1499000624
ddbjest35.seq              586862       316847732      1499000048
ddbjest36.seq              452068       304810123      1499000436
ddbjest37.seq              323731       171194853      1499005806
ddbjest38.seq              253132        96377495      1499001998
ddbjest39.seq              248702       103083115      1499000725
ddbjest40.seq              394603       194771830      1499001142
ddbjest41.seq              441804       280424562      1499001766
ddbjest42.seq              479705       244419541      1499000492
ddbjest43.seq              452368       247150562      1499001238
ddbjest44.seq              532887       305439182      1499001925
ddbjest45.seq              445178       238981574      1499002358
ddbjest46.seq              500953       287378266      1499000527
ddbjest47.seq              518876       266341458      1499002701
ddbjest48.seq              425910       257903329      1499002097
ddbjest49.seq              300428       157062142      1499004778
ddbjest50.seq              257933       119199555      1499002915
ddbjest51.seq              261869       102732221      1499004992
ddbjest52.seq              334982       152371904      1499002806
ddbjest53.seq              500674       306453122      1499001786
ddbjest54.seq              494657       304834657      1499003755
ddbjest55.seq              434864       254080855      1499002599
ddbjest56.seq              446487       257965058      1499001398
ddbjest57.seq              479502       273915775      1499001404
ddbjest58.seq              452932       258455834      1499002292
ddbjest59.seq              426107       242184139      1499005372
ddbjest60.seq              444268       258305262      1499001283
ddbjest61.seq              508220       311494156      1499000231
ddbjest62.seq              449032       294751160      1499000631
ddbjest63.seq              461555       243749746      1499001449
ddbjest64.seq              461483       279771619      1499000569
ddbjest65.seq              400307       260194079      1499003653
ddbjest66.seq              397874       252299416      1499003119
ddbjest67.seq              424830       235127751      1499003336
ddbjest68.seq              418692       234801249      1499002445
ddbjest69.seq              426550       234101508      1499001593
ddbjest70.seq              453172       244903404      1499000141
ddbjest71.seq              517568       307213952      1499001037
ddbjest72.seq              464722       285776645      1499000322
ddbjest73.seq              399225       304021684      1499002674
ddbjest74.seq              507362       291677419      1499000795
ddbjest75.seq              398696       285171998      1499001582
ddbjest76.seq              378817       253146078      1499002666
ddbjest77.seq              380407       270703608      1499002380
ddbjest78.seq              421569       303028278      1499002433
ddbjest79.seq              437558       314513859      1499001196
ddbjest80.seq              466918       311978083      1499001134
ddbjest81.seq              470865       297310493      1498999985
ddbjest82.seq              555209       236689391      1499000356
ddbjest83.seq              560309       281705060      1499001149
ddbjest84.seq              474156       313982672      1499000326
ddbjest85.seq              528040       308455319      1499000723
ddbjest86.seq              606679       318607740      1499000884
ddbjest87.seq              640990       278560600      1499000731
ddbjest88.seq              482884       299142919      1499001420
ddbjest89.seq              496777       322044249      1499001509
ddbjest90.seq              525133       317296300      1499001783
ddbjest91.seq              503145       277362361      1499000818
ddbjest92.seq              423386       242916117      1499000367
ddbjest93.seq              648134       180897990      1499001713
ddbjest94.seq              462653       291504690      1499004118
ddbjest95.seq              475954       217449263      1499001425
ddbjest96.seq              212382        79036030       663660936
ddbjgss1.seq               474292       342579165      1499001508
ddbjgss2.seq               469487       328763532      1499002043
ddbjgss3.seq               469577       329480084      1498999924
ddbjgss4.seq               535139       259072571      1499002742
ddbjgss5.seq               473190       248104893      1499001026
ddbjgss6.seq               440947       237156101      1499002008
ddbjgss7.seq               399860       202689503      1499003247
ddbjgss8.seq               435779       228668997      1499002571
ddbjgss9.seq               529152       307873787      1499000304
ddbjgss10.seq              548394       306208527      1499000555
ddbjgss11.seq              499697       330831634      1499001752
ddbjgss12.seq              541721       313721163      1499000097
ddbjgss13.seq              509139       386718338      1499001173
ddbjgss14.seq              543777       358233135      1499002395
ddbjgss15.seq              646746       336688058      1499001246
ddbjgss16.seq              579938       415836855      1499002184
ddbjgss17.seq              554581       283488326      1499000330
ddbjgss18.seq              506188       371776797      1499002666
ddbjgss19.seq              553861       371668173      1499001233
ddbjgss20.seq              605770       385697797      1499000090
ddbjgss21.seq              593431       409014135      1499001301
ddbjgss22.seq              509118       307254699      1499000456
ddbjgss23.seq              516157       331502624      1499000892
ddbjgss24.seq              536243       355036924      1499000032
ddbjgss25.seq              534800       345868102      1499002271
ddbjgss26.seq              537183       297484728      1499001785
ddbjgss27.seq              525313       310223244      1499001873
ddbjgss28.seq              514852       357660091      1499002617
ddbjgss29.seq              496927       372576093      1499001729
ddbjgss30.seq              599837       364565125      1499000739
ddbjgss31.seq              462359       338886032      1499000814
ddbjgss32.seq              508966       359223981      1499000022
ddbjgss33.seq              568101       356880729      1499000866
ddbjgss34.seq              463846       276738233      1499003564
ddbjgss35.seq              412445       339078336      1499002167
ddbjgss36.seq              426860       349869430      1499001836
ddbjgss37.seq              417809       333051249      1499002270
ddbjgss38.seq              426600       347911222      1499001072
ddbjgss39.seq              424359       350086600      1499000836
ddbjgss40.seq              415082       334460964      1499000156
ddbjgss41.seq              140185       113011716       464262128
ddbjhtc1.seq               274856       358166233      1499005037
ddbjhtc2.seq               181282       199724093       697969621
ddbjhtg1.seq                11402      1118404302      1499217811
ddbjhtg2.seq                 7499      1118516382      1499168396
ddbjhtg3.seq                 5877      1131171118      1499196072
ddbjhtg4.seq                 5473      1140161670      1499297881
ddbjhtg5.seq                 5292      1144438006      1499168156
ddbjhtg6.seq                 5300      1144814479      1499039235
ddbjhtg7.seq                 6530      1132679422      1499169682
ddbjhtg8.seq                 6911      1141269181      1499129375
ddbjhtg9.seq                 5853      1136310906      1499039943
ddbjhtg10.seq                6133      1129729690      1499175311
ddbjhtg11.seq                6464      1127296548      1499027262
ddbjhtg12.seq                7630      1115086606      1499120261
ddbjhtg13.seq                6929      1148019702      1499172134
ddbjhtg14.seq                7205      1141249732      1499027921
ddbjhtg15.seq                7062      1135558546      1499695522
ddbjhtg16.seq                4646       808976242      1061492450
ddbjhum1.seq                24045      1055779939      1499092369
ddbjhum2.seq                 8130      1069741093      1499006006
ddbjhum3.seq               129860       865670962      1499033665
ddbjhum4.seq                68363       981512078      1499006258
ddbjhum5.seq               179714       356903807       968697805
ddbjinv1.seq               219756       730522848      1499000109
ddbjinv2.seq               248470       311201075       945117281
ddbjmam.seq                137366       312087201       691380791
ddbjpat1.seq              1035388       520054374      1499000335
ddbjpat2.seq               776970       494162098      1499002445
ddbjpat3.seq               745212       347103335      1499000661
ddbjpat4.seq               663614       603609123      1499002162
ddbjpat5.seq               579519       386267152      1230484835
ddbjphg.seq                  3410        23534589        57870977
ddbjpln1.seq               218145       733279411      1499001201
ddbjpln2.seq               140559       725459393      1499005549
ddbjpln3.seq                75134       887708202      1499000416
ddbjpln4.seq               396580       523332232      1499001446
ddbjpln5.seq                39345        72011975       190529221
ddbjpri.seq                 58257       785140919      1141781179
ddbjrod1.seq                15451      1043061330      1499117901
ddbjrod2.seq                 5925      1095248858      1499027250
ddbjrod3.seq                41070      1051023724      1499055601
ddbjrod4.seq               162857       835919846      1499001219
ddbjrod5.seq                96671        50520542       236720952
ddbjsts1.seq               419353       209366151      1499003441
ddbjsts2.seq               340669       239544575      1499002727
ddbjsts3.seq               164268        70234399       474340866
ddbjsyn.seq                 50841        68735194       242706658
ddbjuna.seq                   215          117124          446569
ddbjvrl1.seq               401581       405137705      1499002143
ddbjvrl2.seq                59099        58111099       208833377
ddbjvrt1.seq               228860       715882789      1499022114
ddbjvrt2.seq                59828      1068578745      1499001227
ddbjvrt3.seq                78485        66277252       244134389
------------------------------------------------------------------------------
Total                    72801679     76788510646    275304349605

ddbjtpa.seq                  5442       336932512        43280383
ddbjcon1.seq               507631               0      1499188176
ddbjcon2.seq               286468               0      1499000177
ddbjcon3.seq               315166               0      1499003507
ddbjcon4.seq               299628               0      1499000392
ddbjcon5.seq               288489               0      1499004729
ddbjcon6.seq               323644               0      1499000143
ddbjcon7.seq               329329               0      1499003477
ddbjcon8.seq               277274               0      1499002818
ddbjcon9.seq               275879               0      1499002569
ddbjcon10.seq              276991               0      1499002602
ddbjcon11.seq              264037               0      1499003732
ddbjcon12.seq              252470               0      1499003335
ddbjcon13.seq              235745               0      1377590721


The entries and bases in the CON division and TPA dataset are not counted in
the numbers given at the top of the release note or in the 'Total' row of the
table above.
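
Readers who wish to check the 'Total' row can sum the columns of the table
themselves.  The following Python sketch shows one way to do so; the function
name sum_file_table is hypothetical, and it assumes that each data row
consists of a file name ending in '.seq' followed by three integers.

```python
def sum_file_table(rows):
    """Sum entries, bases and file sizes over the '.seq' rows of the table."""
    entries = bases = size = 0
    for row in rows:
        parts = row.split()
        # Skip rules, headers, blank lines and the 'Total' row itself.
        if len(parts) != 4 or not parts[0].endswith(".seq"):
            continue
        entries += int(parts[1])
        bases += int(parts[2])
        size += int(parts[3])
    return entries, bases, size


rows = [
    "ddbjphg.seq                  3410        23534589        57870977",
    "ddbjuna.seq                   215          117124          446569",
]
print(sum_file_table(rows))
```

To reproduce the 'Total' row, pass only the rows above it; the ddbjtpa.seq
and ddbjcon*.seq rows have to be left out, as noted above.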