DNA Data Bank of Japan

                              DNA Database

   Release 30, Jul. 1997, including 1,534,115 entries, 992,788,339 bases


This database may be copied and redistributed without permission on the 
condition that all the statements in this release note are reproduced in each 
copy.

The present release contains the newest data prepared by the DNA Data Bank of 
Japan (DDBJ), GenBank, and European Molecular Biology Laboratory/European 
Bioinformatics Institute (EMBL/EBI) as of Jun. 30, 1997.  This unified database 
was made possible thanks to the international collaboration among the three
data banks.  All the entries have accordingly been annotated with the feature 
keys common to them. 

All the entries designated by the accession numbers with the prefixes "C", "D", 
"E" and "AB" have been collected and processed by DDBJ, and the rest have been 
prepared by GenBank and EMBL/EBI.  Since submitters often revise a nucleotide 
sequence by correcting, adding or deleting bases, the accession number alone 
cannot always tell which version of a sequence is in question.  Thus, an 
additional identifier, called NID, was introduced to specify a particular 
sequence in a series of revised sequences.  For the same reason, PID was 
introduced for translated amino acid sequences.

A number of genome projects are under way worldwide.  Among them, human 
genome projects have probably been the most productive, yielding a large 
number of ordinary sequences and huge amounts of ESTs.  We have the human (HUM) 
division solely for human sequences and the primate (PRI) division for non-human
primate sequences.  Note that the EST division also contains human sequences.

The present release does not have the ORG division.  Thus, if you are interested
in human mitochondrial sequences, for example, you are now advised to refer to 
the HUM division.

This release also includes an independent division (PAT) for patent data.  The 
patent data are those which the Japanese Patent Office (JPO), United States 
Patent and Trademark Office (USPTO), and the European Patent Office (EPO) 
collected and processed.  The accession numbers of the patent data collected 
by the Japanese Patent Office start with the prefix "E", those collected and 
supplied by USPTO and GenBank respectively start with "I", and those collected 
and supplied by EPO and EMBL/EBI respectively start with "A".  The entries with 
the prefixes "I", "A" and "E" were allocated to a file (ddbjpat.seq) in the 
DDBJ format.  Note also that unauthorized use of the patent data may cause 
legal issues for which we have no responsibility.
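
The prefix rules stated in this release note can be sketched in Python as 
follows.  This is only an illustration: the function names are hypothetical 
and are not part of the release, and the mapping covers only the prefixes 
named in the text above.

```python
import re

# Prefixes named in this release note.  "E" appears twice in the text:
# as a DDBJ-processed prefix and as the JPO patent prefix.
DDBJ_PREFIXES = {"C", "D", "E", "AB"}
PATENT_PREFIXES = {"E": "JPO", "I": "USPTO", "A": "EPO"}


def accession_prefix(accession):
    """Return the leading letters of an accession number, in upper case."""
    match = re.match(r"[A-Za-z]+", accession)
    if match is None:
        raise ValueError(f"no letter prefix in {accession!r}")
    return match.group(0).upper()


def describe_accession(accession):
    """Summarize what the release note says about an accession prefix."""
    prefix = accession_prefix(accession)
    return {
        "prefix": prefix,
        "processed_by": "DDBJ" if prefix in DDBJ_PREFIXES else "GenBank or EMBL/EBI",
        "patent_office": PATENT_PREFIXES.get(prefix),  # None if not a patent prefix
    }


if __name__ == "__main__":
    print(describe_accession("D87069"))
```

Note that the prefix is taken as the whole run of leading letters, so "AB" 
is not confused with the single-letter patent prefix "A".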

The number of ESTs has been increasing at an enormous rate and is expected 
to grow even more rapidly in the future.  To cope with this situation 
and to handle the data files in the least possible time, we split the 
EST data into eight files: ddbjest1 contains entries whose accession numbers 
have the prefixes A to M, ddbjest2 contains those with N to S, ddbjest3 
contains those with T to Z, and ddbjest4, 5, 6, 7 and 8 contain those with 
two-letter prefixes.  Each of the last five contains 100,000 entries, except 
the final file, which holds the remainder.
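
As a rough sketch of the splitting rule just described, the following Python 
function maps an EST accession number to the file that should hold it.  The 
function name is hypothetical.  For two-letter prefixes the release note 
only says that entries are spread over ddbjest4 to ddbjest8 in blocks of 
100,000, so the sketch does not try to guess the exact file.

```python
import re


def est_file_for(accession):
    """Return the EST file name implied by the accession prefix."""
    match = re.match(r"[A-Za-z]+", accession)
    if match is None:
        raise ValueError(f"no letter prefix in {accession!r}")
    prefix = match.group(0).upper()
    if len(prefix) == 2:
        # The exact file depends on the entry's position among the
        # two-letter-prefix entries, which the prefix alone cannot tell.
        return "ddbjest4..8"
    if len(prefix) != 1:
        raise ValueError(f"unexpected prefix {prefix!r}")
    if "A" <= prefix <= "M":
        return "ddbjest1"
    if "N" <= prefix <= "S":
        return "ddbjest2"
    return "ddbjest3"  # T to Z


if __name__ == "__main__":
    # The accession numbers below are made up for illustration.
    for acc in ("H00001", "R00001", "T12345", "AA000001"):
        print(acc, est_file_for(acc))
```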

The present release includes the GSS division.  GSS stands for Genome Survey 
Sequence, which is similar to EST, except that GSS is genomic DNA whereas 
EST is cDNA.  This release also includes the High Throughput Genome Sequence 
(HTGS) as a new type of data.  HTGS comes mainly from genome project teams 
which deal with a clone as a sequencing unit.

The index files are not presented in this release except for ddbjacc.idx, 
ddbjaut.idx, ddbjgen.idx, ddbjjou.idx, and ddbjkey.idx.  Instead, we have 
included a program that generates the omitted index files.  For the use of 
the program, see the files seq2indexes.doc, seq2indexes.c, and seq2indexes.h 
in this release.

The present release contains amino acid sequences that were translated from 
the corresponding nucleotide sequences in our database.  In the translation 
we took care to use the proper codon table, because some species and 
organelles use a genetic code different from the universal one.  However, if 
you find an incorrectly translated codon in a translated sequence, please 
let us know.
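
To illustrate why the choice of codon table matters, here is a minimal 
Python sketch.  It is not the program used to build this release.  The two 
amino acid strings are the standard code and the vertebrate mitochondrial 
code written in the usual t, c, a, g codon order; the function names are 
hypothetical.  The bacterial table (/transl_table=11, as in Table 1) assigns 
the same amino acids as the standard code, so the standard string is used 
for it here.

```python
BASES = "tcag"
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
VERT_MITO = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"


def build_table(amino_acids):
    """Pair the 64 codons (ttt, ttc, tta, ...) with a 64-letter string."""
    codons = (a + b + c for a in BASES for b in BASES for c in BASES)
    return dict(zip(codons, amino_acids))


def translate(seq, table, to_stop=True):
    """Translate a nucleotide string codon by codon from position 1."""
    seq = seq.lower().replace("u", "t")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = table.get(seq[i:i + 3], "X")  # X for codons with ambiguous bases
        if aa == "*" and to_stop:
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    # First 30 bases of entry D87069 shown in Table 1.
    head = "atgagtcagaatacgctgaaagttcatgat"
    print(translate(head, build_table(STANDARD)))
    # The same codons read differently under the mitochondrial code.
    print(translate("atatga", build_table(STANDARD), to_stop=False))
    print(translate("atatga", build_table(VERT_MITO), to_stop=False))
```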

This release was published by the following DDBJ staff.

General administration
    T. Gojobori, K. Ikeo, A. Watanabe, J. Sugiyama, Y. Ueda, K. Okuda,
    M. Ogawa, M. Shimoyama, Y. Noguchi
Database construction
    Y. Tateno, K. Fukami-Kobayashi, N. Yasuda, Y. Sato, H. Tsutsui, 
    M. Hirashima, A. Hasegawa, A. Suzuki, Y. Yamamoto, A. Kawabuchi, 
    M. Ejima, M. Okaneya, C. Hamamatsu, M. Iwase, R. Suzuki, R. Uchida,
    Y. Shidahara, M. Gojobori, K. Nomura, M. Imma, J. Muroya, N. Ohkubo
Database software development and management
    H. Sugawara, S. Miyazaki, T. Tamura, K. Goto, S. Misu, T. Koike,
    M. Nishigaya, K. Mamiya, R. Tanabe, T. Okayama, Y. Kawanishi, J. Ishi-i,
    T. Mizunuma, H. Yamamoto, D. Kawasaki, T. Futatsuki, H. Hashimoto,
    T. Takakura
System management
    K. Nishikawa, M. Ota, S. Miyazawa, T. Ito, I. Mochizuki, H. Muto,
    M. Kikuchi, A. Murakami 
Editorial and public relations
    N. Saitou, T. Imanishi, Y. Daito, S. Nagira, Y. Hattori, M. Horie,
    K. Ichikawa, T. Kawamoto


DNA Data Bank of Japan
Center for Information Biology
National Institute of Genetics
Mishima 411, Japan 
Phone:  +81 559 81 6853
FAX:    +81 559 81 6849
E-mail: ddbj@ddbj.nig.ac.jp  (for general inquiry)
        ddbjsub@ddbj.nig.ac.jp  (for data submission)
        ddbjupdt@ddbj.nig.ac.jp (for updates and notification of publication)
WWW:    http://www.ddbj.nig.ac.jp (for DDBJ WWW server)
        http://sakura.ddbj.nig.ac.jp (for DDBJ sequence data submission system 
                                   SAKURA)

Acknowledgement: We are grateful to NCBI and EMBL/EBI for permitting us
to include their data in the present release.  We also thank the Japanese
Patent Office and Japan Patent Information Organization for kindly allowing us
to distribute the patent data they collected and processed.


DDBJ Database Release History

Release   Date     Entries     Bases       Comments
------------------------------------------------------------------------
 30     07/97   1,534,115  992,788,339   NID and PID started
 29     04/97   1,270,194  841,415,232   
 28     01/97   1,154,120  756,785,219   HTG division started
                                         ORG division eliminated
 27     10/96     936,697  608,103,057   GSS division started
 26     07/96     835,552  551,932,448   
 25     04/96     744,490  499,300,364   /translation started
 24     01/96     637,508  431,771,652   
 23     10/95     569,757  390,694,350   
 22     07/95     437,588  322,982,425   HUM division started
 21     04/95     274,596  250,875,023   
 20     01/95     239,689  231,299,557   
 19     10/94     204,332  205,274,131   
 18     07/94     185,230  192,473,021   
 17     04/94     169,957  179,942,209   
 16     01/94     154,626  165,017,628   
 15     10/93     131,649  147,224,690   
 14     07/93     120,350  138,686,333   
 13     04/93     112,067  129,784,445   
 12     01/93      97,683  120,815,244   EST division started
 11     07/92      65,693   84,839,075   
 10     01/92      59,317   77,805,556   GenBank/EMBL inclusion started
  9     07/91       1,130    2,002,124   
  8     01/91         879    1,573,442   
  7     07/90         681    1,154,211   
  6     01/90         496      841,236   
  5     07/89         395      679,378   
  4     01/89         302      535,985   
  3     07/88         230      345,850   
  2     01/88         142      199,392   
  1     07/87          66      108,970   Started with DDBJ only
------------------------------------------------------------------------
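
The growth between releases can be read off the table above with a short 
script.  The following Python sketch is only an illustration with 
hypothetical function names.  It assumes that each release row starts with 
the release number, a MM/YY date, and two comma-grouped counts, and it 
ignores comment continuation lines.

```python
import re

ROW = re.compile(r"^\s*(\d+)\s+(\d{2}/\d{2})\s+([\d,]+)\s+([\d,]+)")


def parse_history(text):
    """Return the release rows as dictionaries, oldest release first."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append({
                "release": int(m.group(1)),
                "date": m.group(2),
                "entries": int(m.group(3).replace(",", "")),
                "bases": int(m.group(4).replace(",", "")),
            })
    return sorted(rows, key=lambda r: r["release"])


def growth(rows):
    """Return (release, added entries, added bases) for consecutive rows."""
    return [
        (cur["release"], cur["entries"] - prev["entries"], cur["bases"] - prev["bases"])
        for prev, cur in zip(rows, rows[1:])
    ]


if __name__ == "__main__":
    sample = """\
 30     07/97   1,534,115  992,788,339   NID and PID started
 29     04/97   1,270,194  841,415,232
"""
    print(growth(parse_history(sample)))
```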


This release covers 18 categories of organisms and others as follows:
------------------------------------------------------------------------------
ddbjbct.*** Category for bacteria
ddbjest.*** Category for EST (expressed sequence tag)
ddbjhtg.*** Category for HTG (high throughput genomic sequencing)
ddbjhum.*** Category for human
ddbjgss.*** Category for GSS (Genome Survey Sequence)
ddbjinv.*** Category for invertebrates
ddbjmam.*** Category for mammals other than primates and rodents
ddbjpat.*** Category for patents
ddbjphg.*** Category for phages
ddbjpln.*** Category for plants
ddbjpri.*** Category for primates other than human
ddbjrna.*** Category for RNAs
ddbjrod.*** Category for rodents
ddbjsts.*** Category for STS (sequence tagged site)
ddbjsyn.*** Category for synthetic DNAs
ddbjuna.*** Category for unannotated sequences
ddbjvrl.*** Category for viruses
ddbjvrt.*** Category for vertebrates other than mammals
------------------------------------------------------------------------------


Each category then has the following nine files. Note that all the files 
except for ddbj***.seq and ddbj***.sdr may contain lines longer than 80 
characters. Such a line is folded at every 81st column in the file on the 
distribution tape, which has a fixed record size of 80 bytes.
------------------------------------------------------------------------------
ddbj***.seq  List of entries in DDBJ format, see Table 1.
ddbj***.acc  List of the accession numbers, see Table 2.
ddbj***.aut  List of the authors, see Table 3.
ddbj***.dir  List of the short directory in DDBJ style, see Table 4.
ddbj***.idx  List of indices, see Table 5.
ddbj***.jou  List of the journals, see Table 6.
ddbj***.key  List of the key words, see Table 7.
ddbj***.org  List of the species names, see Table 8.
ddbj***.sdr  List of the short directory in GenBank style, see Table 9.
------------------------------------------------------------------------------
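
The folding of long lines into 80-byte records, mentioned above, can be 
pictured with the following Python sketch.  The function names are 
hypothetical, and the sketch assumes that a folded line is simply cut every 
80 characters with nothing inserted at the cut.

```python
def fold_record(line, width=80):
    """Cut one logical line into pieces of at most `width` characters."""
    return [line[i:i + width] for i in range(0, len(line), width)] or [""]


def unfold_record(pieces):
    """Rebuild the logical line from its folded pieces."""
    return "".join(pieces)


if __name__ == "__main__":
    long_line = "x" * 170
    pieces = fold_record(long_line)
    print([len(p) for p in pieces])
    print(unfold_record(pieces) == long_line)
```

Unfolding needs to know which pieces belong together.  That information is 
not carried by the pieces themselves, so a reader of the tape has to infer 
it from the layout of each file.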


Table 1. Part of the contents in the file 'ddbjbct.seq'.
This shows all pieces of information on one entry in DDBJ format.
------------------------------------------------------------------------------
LOCUS       D87069        993 bp ss-mRNA            BCT       04-SEP-1996
DEFINITION  Escherichia coli mRNA for RNA polymerase sigma subunit, truncated
            form of sigma-38, complete cds.
ACCESSION   D87069
KEYWORDS    RNA polymerase sigma subunit, truncated form of sigma-38.
SOURCE      Escherichia coli (strain:W3110) cDNA to mRNA.
  ORGANISM  Escherichia coli
            Prokaryotae; Prokaryotae; Facultative anaerobic gram-negative
            rods; Enterobacteriaceae.
REFERENCE   1  (bases 1 to 993)
  AUTHORS   Jishage,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (14-AUG-1996) to the DDBJ/EMBL/GenBank databases. Miki
            Jishage, National Institute of Genetics, Molecular Genetics; Yata
            1111, Mishima, Shizuoka 411, Japan (E-mail:mjishage@lab.nig.ac.jp,
            Tel:0559-81-6742, Fax:0559-81-6746)
  STANDARD  full staff_review
REFERENCE   2  (bases 1 to 993)
  AUTHORS   Jishage,M. and Ishihama,A.
  TITLE     Variation in RNA polymerase sigma subunit composition within
            different stocks of Escherichia coli starin W3110
  JOURNAL   Unpublished (1996)
  STANDARD  full staff_review
REFERENCE   3  (sites)
  AUTHORS   Ivanova,A., Renshaw,M., Guntaka,R. and Eisenstark,A.
  TITLE     DNA base sequence variability in katF (putative sigma factor) gene
            Escherichia coli
  JOURNAL   Nucleic Acids Res. 20, 5479-5480 (1992)
  STANDARD  full staff_review
REFERENCE   4  (sites)
  AUTHORS   Takayanagi,Y., Tanaka,K. and Takahashi,H.
  TITLE     Structure of the 5' upstream region and the regulation of the rpoS
            gene of Escherichia coli
  JOURNAL   Mol Gen Genet 243, 525-531 (1994)
  STANDARD  full staff_review
COMMENT     
FEATURES             Location/Qualifiers
     source          1..993
                     /organism="Escherichia coli"
                     /sequenced_mol="cDNA to mRNA"
                     /strain="W3110"
     CDS             1..810
                     /note="the gene has four single base changes, resulting
                     in two amino acid substitutions and an amber mutation"
                     /product="RNA polymerase sigma subunit, truncated form of
                     sigma-38"
                     /translation="MSQNTLKVHDLNEDAEFDENGVEVFDEKALVEYEPSDNDLAEEE
                     LLSQGATQRVLDATQLYLGEIGYSPLLTAEEEVYFARRALRGDVASRRRMIESNLRLV
                     VKIARRYGNRGLALLDLIEEGNLGLIRAVEKFDPERGFRFSTYATWWIRQTIERAIMN
                     QTRTIRLPIHIVKELNVYLRTARELSHKLDHEPSAEEIAEQLDKPVDDVSRMLRLNER
                     ITSVDTPLGGDSEKALLDILADEKENGPEDTTQDDDMKQSIVKWLFELNAK"
                     /transl_table=11
     mutation        replace(75, "t")
                     /citation=[3]
     mutation        replace(97, "t")
                     /citation=[3]
     mutation        replace(99, "t")
                     /citation=[3]
     mutation        replace(808, "t")
                     /citation=[3]
BASE COUNT      254 a    223 c    291 g    225 t      0 others
ORIGIN      
        1 atgagtcaga atacgctgaa agttcatgat ttaaatgaag atgcggaatt tgatgagaac
       61 ggagttgagg tttttgacga aaaggcctta gtagaatatg aacccagtga taacgatttg
      121 gccgaagagg aactgttatc gcagggagcc acacagcgtg tgttggacgc gactcagctt
      181 taccttggtg agattggtta ttcaccactg ttaacggccg aagaagaagt ttattttgcg
      241 cgtcgcgcac tgcgtggaga tgtcgcctct cgccgccgga tgatcgagag taacttgcgt
      301 ctggtggtaa aaattgcccg ccgttatggc aatcgtggtc tggcgttgct ggaccttatc
      361 gaagagggca acctggggct gatccgcgcg gtagagaagt ttgacccgga acgtggtttc
      421 cgcttctcaa catacgcaac ctggtggatt cgccagacga ttgaacgggc gattatgaac
      481 caaacccgta ctattcgttt gccgattcac atcgtaaagg agctgaacgt ttacctgcga
      541 accgcacgtg agttgtccca taagctggac catgaaccaa gtgcggaaga gatcgcagag
      601 caactggata agccagttga tgacgtcagc cgtatgcttc gtcttaacga gcgcattacc
      661 tcggtagaca ccccgctggg tggtgattcc gaaaaagcgt tgctggacat cctggccgat
      721 gaaaaagaga acggtccgga agataccacg caagatgacg atatgaagca gagcatcgtc
      781 aaatggctgt tcgagctgaa cgccaaatag cgtgaagtgc tggcacgtcg attcggtttg
      841 ctggggtacg aagcggcaac actggaagat gtaggtcgtg aaattggcct cacccgtgaa
      901 cgtgttcgcc agattcaggt tgaaggcctg cgccgtttgc gcgaaatcct gcaaacgcag
      961 gggctgaata tcgaagcgct gttccgcgag taa
//
------------------------------------------------------------------------------
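
An entry in the format of Table 1 can be read with a few lines of Python.  
The following is a minimal sketch with a hypothetical function name.  It 
picks up only the LOCUS and ACCESSION lines and the sequence after ORIGIN, 
and assumes that the entry is terminated by a line beginning with "//".

```python
import re


def parse_entry(text):
    """Extract locus name, declared length, accession and sequence."""
    info = {"locus": None, "length": None, "accession": None}
    seq_parts = []
    in_origin = False
    for line in text.splitlines():
        if line.startswith("//"):          # end of entry
            break
        if in_origin:
            # Drop the position number and the blanks between 10-base groups.
            seq_parts.append(re.sub(r"[^A-Za-z]", "", line))
        elif line.startswith("LOCUS"):
            fields = line.split()
            info["locus"] = fields[1]
            info["length"] = int(fields[2])
        elif line.startswith("ACCESSION"):
            info["accession"] = line.split()[1]
        elif line.startswith("ORIGIN"):
            in_origin = True
    info["sequence"] = "".join(seq_parts)
    return info


if __name__ == "__main__":
    # Entry D87069 cut down to its first sequence line for illustration;
    # the length on the LOCUS line was changed to 60 to match.
    sample = """\
LOCUS       D87069         60 bp ss-mRNA            BCT       04-SEP-1996
ACCESSION   D87069
ORIGIN      
        1 atgagtcaga atacgctgaa agttcatgat ttaaatgaag atgcggaatt tgatgagaac
//
"""
    entry = parse_entry(sample)
    print(entry["accession"], entry["length"], len(entry["sequence"]))
```

Comparing the declared length with the number of bases actually read is a 
cheap consistency check on a copied file.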


Table 2. Part of the contents in the file 'ddbjbct.acc'.
The first column refers to the secondary accession number, second column to 
the locus name, and third to the primary accession number. The primary number 
may be the same as the secondary number. They are arranged in the ascending 
order of the secondary accession numbers.
------------------------------------------------------------------------------
D00001 -> ECOPBPAA   X04516
D00002 -> ECOPYRH    X04469
D00006 -> PNS981TET  D00006
D00020 -> COLE2LYS   D00020
D00021 -> COLE31YS   D00021
D00038 -> BRLAM330   D00038
D00066 -> BAC139AC   D00066
D00067 -> ECONANA    M20207
D00069 -> ECOUVRD2   D00069
D00087 -> BACXYNAA   D00087
------------------------------------------------------------------------------
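
A line of the form shown in Table 2 can be split at the arrow.  The Python 
sketch below, with hypothetical function names, builds a lookup from the 
secondary accession number to the locus name and primary accession number.  
It assumes one complete record per line.

```python
def parse_acc_line(line):
    """Split 'SECONDARY -> LOCUS PRIMARY' into its three fields."""
    secondary, arrow, rest = line.partition("->")
    if not arrow:
        raise ValueError(f"no arrow in {line!r}")
    locus, primary = rest.split()
    return secondary.strip(), locus, primary


def primary_lookup(lines):
    """Map each secondary accession number to (locus, primary accession)."""
    table = {}
    for line in lines:
        if "->" in line:               # skip separators and blank lines
            secondary, locus, primary = parse_acc_line(line)
            table[secondary] = (locus, primary)
    return table


if __name__ == "__main__":
    lines = [
        "D00001 -> ECOPBPAA   X04516",
        "D00006 -> PNS981TET  D00006",
    ]
    print(primary_lookup(lines))
```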


Table 3. Part of the contents in the file 'ddbjbct.aut'.
For each author name given to the left of the arrow, the corresponding locus 
name and primary accession number are respectively listed on the right. They 
are arranged in alphabetical order of the author names.
------------------------------------------------------------------------------
Aan,F. -> STYCRR     X05210
Aan,F. -> STYENZI    M76176
Aaronson,W. -> ECOKPSD    M64977
Aaronson,W. -> ECONEUA    J05023
Abad-Lapuebla,M.A. -> VIBTDHI    D90238
Abdel-Mawgood,A.L. -> CYAPSBHA   X16394
Abdel-Meguid,S.S. -> TRNGDRECM  J01843
Abdelal,A. -> STYCARA    M36540
Abdelal,A. -> STYCARAB   X13200
Abdelal,A.H. -> PSENOSA    M60717
------------------------------------------------------------------------------


Table 4. Part of the short directory in DDBJ style in the file 'ddbjbct.dir'.
For each locus name given in the first column, the corresponding primary 
accession number, molecular type, number of nucleotide pairs, and description 
for the locus are respectively listed. They are arranged in the alphabetical 
order of the locus names.
------------------------------------------------------------------------------
ABCAARAA   M34830 ds-DNA    1624 A.aceti acetic acid resistance protein (aarA)
gene, complete cds.
ABCADHCC   D00635 ds-DNA    4230 A. polyoxogenes alcohol dehydrogenase (EC 
1.1.99.8) and cytochrome c genes.
ABCALDH    D00521 ds-DNA    2683 A.polyoxogenes membrane-bound aldehyde 
dehydrogenase gene, complete cds and flanks.
ABCBCSAA   M37202 ds-DNA    9540 A.xylinum bcs B, bcs C and bcs D genes, 
complete cds and bcs A gene, partial cds.
ABCCELA    M76548 ds-DNA    1165 Acetobacter xylinum UDP pyrophosphorylase 
(celA) gene, complete cds.
ABCCELSYN  X54676 ds-DNA    5363 A. xylinum gene for cellulose biosynthesis
ABCIS1380  D10043 ds-DNA    1665 A.pasteurianus insertion sequence IS1380.
ACAADH1    D90004 ds-DNA    2467 Acetobacter aceti(K6033) alcohol dehydrogenase
subunit gene(adh1).
ACCAAC2    M62833 ds-DNA    1123 Acinetobacter baumannii aminoglycoside 
acetyltr ansferase (aac2) gene, complete cds.
ACCACEAA   M62822 ds-DNA    1874 A.baumannii chloramphenicol acetyltransferase
(cat) gene, complete cds.
------------------------------------------------------------------------------


Table 5. Part of the contents in the file 'ddbjbct.idx'.
The first column refers to the locus name, the second column to the starting 
byte offset of the entry, and the third to its ending byte offset. They are 
arranged in alphabetical order of the locus names.
------------------------------------------------------------------------------
%*****************************
#ABCAARAA       0       3211
#ABCADHCC       3212    10608
#ABCALDH        10609   15864
#ABCBCSAA       15865   29583
#ABCCELA        29584   32289
#ABCCELSYN      32290   40960
#ABCIS1380      40961   44711
#ACAADH1        44712   49357
#ACCAAC2        49358   52395
------------------------------------------------------------------------------
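
The byte offsets in Table 5 allow one entry to be read without scanning the 
whole sequence file.  The Python sketch below uses hypothetical function 
names.  It assumes that the ending offset is inclusive, which is suggested 
by each entry starting one byte after the previous one ends, and that the 
offsets refer to the matching ddbj***.seq file.

```python
import io


def parse_idx(lines):
    """Return {locus: (start, end)} from lines such as '#ABCAARAA 0 3211'."""
    offsets = {}
    for line in lines:
        if not line.startswith("#"):
            continue                   # header and separator lines
        locus, start, end = line[1:].split()
        offsets[locus] = (int(start), int(end))
    return offsets


def read_entry(seq_file, offsets, locus):
    """Read one entry from a file opened in binary mode."""
    start, end = offsets[locus]
    seq_file.seek(start)
    return seq_file.read(end - start + 1)   # end offset treated as inclusive


if __name__ == "__main__":
    # A stand-in for a sequence file: two tiny made-up "entries".
    fake_seq = io.BytesIO(b"AAAA" + b"BBBBBB")
    fake_idx = ["%*****", "#FIRST  0  3", "#SECOND 4  9"]
    print(read_entry(fake_seq, parse_idx(fake_idx), "SECOND"))
```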


Table 6. Part of the contents in the file 'ddbjbct.jou'.
This gives information on the journal in which sequence data were published.
------------------------------------------------------------------------------
(in) Chaloupka,J. and Krumphanzl,V. (Eds.); Extracellular Enzymes of 
Microorganisms:  129-137, Plenum Press, New York (1987) -> BACAMYABS  M57457
(in) Ganesan,A.T., Chang,S. and Hoch,J.A. (Eds.); Molecular Cloning and Gene 
Regulation in Bacilli:  3-10, Academic Press, New York (1982) -> BACRG16S   
M55011
(in) Ganesan,A.T., Chang,S. and Hoch,J.A. (Eds.); Molecular Cloning and Gene 
Regulation in Bacilli:  3-10, Academic Press, New York (1982) -> BACRG16SA  
M55006
(in) Ganesan,A.T., Chang,S. and Hoch,J.A. (Eds.); Molecular Cloning and Gene 
Regulation in Bacilli:  3-10, Academic Press, New York (1982) -> BACRG16SB  
M55008
(in) Hoch,J.A. and Setlow,P. (Eds.); Molecular Biology of Microbial 
Differentiation:  85-94, American Society for Microbiology, Washington, DC 
(1985) -> BACSPOII   M57606
(in) Holmgren,A. (Ed.); Thioredoxin and Glutaredoxin Systems: Structure and 
Function: 11-19, Unknown name, Unknown city (1986) -> ECOTRXA1   M54881
(in) Kjeldgaard,N.C. and Maaloe,O. (Eds.); Control of ribosome synthesis:  
138-143, Academic Press, New York (1976) -> ECOLAC     J01636
(in) Losick,R. and Chamberlin,M. (Eds.); RNA polymerase:  455-472, Cold 
Spring Harbor Laboratory, Cold Spring Harbor, NY (1976) -> ECOTGY1    K01197
(in) Sikes,C.S. and Wheeler,A.P. (Eds.); Surface reactive peptides and 
polymers. Discovery and commercialization.:  186-200, American Chemical 
Society, Washington, D.C. (1991) -> ECOTGP     J01714
(in) Sund,H. and Blauer,G. (Eds.); Protein-Ligand Interactions:  193-207, 
Walter de Gruyter, New York (1975) -> ECOLAC     J01636
(in) Wu,R. and Grossman,L. (Eds.); Methods in Enzymology, Recombinant DNA, 
part E:  In press, Academic Press, New York, N.Y. (1986) -> PLMCG      M11320
Acta Microbiol. Pol. 35, 175-190 (1986) -> ECOTGG1    M54893
Actinomycetologica 5, 14-17 (1991) -> STMARGG    D00799
Adv. Biophys. 21, 115-133 (1986) -> R10REP     M26840
Adv. Biophys. 21, 175-192 (1986) -> ECONUSAA   M26839
Adv. Enzyme Regul. 21, 225-237 (1983) -> ECOPURFA   M26893
Adv. Exp. Med. Biol. 195, 239-246 (1986) -> ECOAPT     M14040
Agric. Biol. Chem. 50, 2155-2158 (1986) -> ECONANA    M20207
Agric. Biol. Chem. 50, 2771-2778 (1986) -> BRLAM330   D00038
Agric. Biol. Chem. 51, 2019-2022 (1987) -> BACCGT     D00129
Agric. Biol. Chem. 51, 2641-2648 (1987) -> STRSAGP    D00219
Agric. Biol. Chem. 51, 2807-2809 (1987) -> BACPGECR   M35503
Agric. Biol. Chem. 51, 3133-3135 (1987) -> BACXYLAP   D00312
Agric. Biol. Chem. 51, 455-463 (1987) -> BACHDCRY   D00117
Agric. Biol. Chem. 51, 953-955 (1987) -> BACXYNAA   D00087
Agric. Biol. Chem. 52, 1565-1573 (1988) -> BACIP135   D00348
Agric. Biol. Chem. 52, 1785-1789 (1988) -> BACTMR     D00343
Agric. Biol. Chem. 52, 2243-2246 (1988) -> PSEGI      D00342
Agric. Biol. Chem. 52, 399-406 (1988) -> BACAMYEB   M35517
Agric. Biol. Chem. 52, 479-487 (1988) -> ECAPALI    D00217
------------------------------------------------------------------------------
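
Records such as those in Table 6 may run over several lines, with the locus 
name or the accession number pushed onto the next line.  The following 
Python sketch, with a hypothetical function name, joins the lines first and 
then splits on the arrow.  It assumes that continuation lines can be joined 
with a single blank and that the text before the arrow never itself 
contains "->".

```python
import re

# text, arrow, locus name, accession number
RECORD = re.compile(r"(.+?)\s*->\s*(\S+)\s+(\S+)(?:\s+|$)")


def parse_folded_records(text):
    """Return (text, locus, accession) for every record in `text`."""
    joined = " ".join(line.strip() for line in text.splitlines() if line.strip())
    return [
        (m.group(1).strip(), m.group(2), m.group(3))
        for m in RECORD.finditer(joined)
    ]


if __name__ == "__main__":
    sample = """\
(in) Ganesan,A.T., Chang,S. and Hoch,J.A. (Eds.); Molecular Cloning and Gene 
Regulation in Bacilli:  3-10, Academic Press, New York (1982) -> BACRG16S   
M55011
Acta Microbiol. Pol. 35, 175-190 (1986) -> ECOTGG1    M54893
"""
    for citation, locus, accession in parse_folded_records(sample):
        print(locus, accession, "|", citation)
```

The same approach should carry over to the keyword and species files of 
Tables 7 and 8, which use the same arrow layout.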


Table 7. Part of the contents in the file 'ddbjbct.key'.
For the locus and accession number respectively given to the right of the 
arrow, the corresponding key words are listed on the left. 
------------------------------------------------------------------------------
A.aceti acetic acid resistance protein (aarA) gene, complete cds.       -> 
ABCAARAA     M34830
acetic acid resistance protein.         -> ABCAARAA     M34830
Cloning of genes responsible for acetic acid resistance in acetobacter aceti   
-> ABCAARAA      M34830
A. polyoxogenes alcohol dehydrogenase (EC 1.1.99.8) and cytochrome c genes.    
-> ABCADHCC      D00635
alcohol dehydrogenase; cytochrome c.    -> ABCADHCC     D00635
Cloning and sequencing of the gene cluster encoding two subunits of membrane-
bound alcohol dehydrogenase from Acetobacter polyoxogenes  -> ABCADHCC     
D00635
These data kindly submitted in computer readable form by: Toshimi Tamaki 
Nakano Central Biochemical Institute 2-6 Nakamura-cho Handa-shi, Aichi-ken 
475 Japan Phone: 0569-21-3331 Fax: 0569-23-8486     -> ABCADHCC     D00635
A.polyoxogenes membrane-bound aldehyde dehydrogenase gene, complete cds and 
flanks.     -> ABCALDH      D00521
aldehyde dehydrogenase gene; ethanol oxidation; membrane-bound enzyme.  -> 
ABCALDH      D00521
Nucleotide sequence of the membrane-bound aldehyde dehydrogenase gene from 
Acetobacter polyoxogenes     -> ABCALDH      D00521
------------------------------------------------------------------------------


Table 8. Part of the contents in the file 'ddbjbct.org'.
For the locus and accession number respectively given to the right of the 
arrow, the corresponding taxonomic names are listed on the left.  They are 
arranged in alphabetical order of the species names.
------------------------------------------------------------------------------
A. nidulans 6301 DNA. Anacystis nidulans Prokaryota; Bacteria; Gracilicutes; 
Oxyphotobacteria; Cyanobacteria.   -> ANIRUBPS     X00019
A. nidulans DNA, clone pAN4. Anacystis nidulans Prokaryota; Bacteria; 
Gracilicutes; Oxyphotobacteria; Cyanobacteria.    -> ANIRGGX      X00343
A. nidulans DNA. Anacystis nidulans Prokaryota; Bacteria; Gracilicutes; 
Oxyphotobacteria; Cyanobacteria.        -> ANIRGG       X00512
A. polyoxogenes genomic DNA. Acetobacter polyoxogenes Prokaryota; Bacteria; 
Gracilicutes; Scotobacteria; Aerobic rods and cocci; Azotobacteraceae.      
-> ABCADHCC     D00635
A. quadruplicatum (strain PR-6) DNA, clone pAQPR1. Agmenellum quadruplicatum 
Prokaryota; Bacteria; Gracilicutes; Oxyphotobacteria; Cyanobacteria.       -> 
AQUPCAB      K02660
A. quadruplicatum (strain PR6) DNA. Agmenellum quadruplicatum Prokaryota; 
Bacteria; Gracilicutes; Oxyphotobacteria; Cyanobacteria.      -> AQUCPCAB     
K02659
A. vinelandii DNA. Azotobacter vinelandii Prokaryota; Bacteria; Gracilicutes; 
Scotobacteria; Aerobic rods and cocci; Azotobacteraceae.  -> AVINIFUSV    
M17349
A.aceti (strain 10-8) DNA, clone pAR1611. Acetobacter aceti Prokaryota; 
Bacteria; Gracilicutes; Scotobacteria; Aerobic rods and cocci; 
Azotobacteraceae.       -> ABCAARAA      M34830
A.actinomycetemcomitans (strain JP2) DNA, clone lambda-OP8. Actinobacillus 
actinomycetemcomitans Prokaryota; Bacteria; Gracilicutes; Scotobacteria; 
Facultatively anaerobic rods; Pasteurellaceae.      -> ACNLKTXN     M27399
A.anitratum DNA, clone pLJD1. Acinetobacter anitratum Prokaryota; Bacteria; 
Gracilicutes; Scotobacteria; Neisseriaceae.         -> ACCCITSYN    M33037
------------------------------------------------------------------------------


Table 9. Part of the short directory file in GenBank style in the file 
'ddbjbct.sdr'.
The short directory file contains brief descriptions, in GenBank style, of 
all the sequence entries. 
------------------------------------------------------------------------------
ABCAARAA    A.aceti acetic acid resistance protein (aarA) gene, complete 1624bp
ABCADHCC    A. polyoxogenes alcohol dehydrogenase (EC 1.1.99.8) and      4230bp
ABCALDH     A.polyoxogenes membrane-bound aldehyde dehydrogenase gene,   2683bp
ABCBCSABCD  A.xylinum bcs A, B, C and D genes, complete cds's.           9540bp
ABCCELA     Acetobacter xylinum UDP pyrophosphorylase (celA) gene,       1165bp
ABCCELSYN   A. xylinum gene for cellulose biosynthesis                   5363bp
ABCIS1380   A.pasteurianus insertion sequence IS1380.                    1665bp
ACAADH1     Acetobacter aceti(K6033) alcohol dehydrogenase subunit       2467bp
ACCAAC2     Acinetobacter baumannii aminoglycoside acetyltransferase     1123bp
ACCACEAA    A.baumannii chloramphenicol acetyltransferase (cat) gene,    1874bp
ACCAPHA6    Acinetobacter baumannii aphA-6 gene.                         1170bp
ACCBENABCA  A.calcoaceticus BenA, BenB, BenC, BenD, and BenE proteins   15922bp
ACCCAT      Acinetobacter calcoaceticus cat operon.                     15922bp
ACCCATAM    A.calcoaceticus catA and catM genes, encoding catechol 1,    5537bp
ACCCHMO     Acinetobacter sp. cyclohexanone monooxygenase gene, complete 2128bp
ACCCITSYN   A.anitratum citrate synthase gene, complete cds.             1895bp
------------------------------------------------------------------------------
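
Each line of Table 9 holds a locus name, a description that may be cut 
short, and the sequence length followed by "bp".  A minimal Python sketch 
for splitting such a line is given below.  The function name is 
hypothetical, and the sketch relies on the trailing "bp" rather than on 
fixed column positions.

```python
import re

SDR_LINE = re.compile(r"^(\S+)\s+(.*\S)\s+(\d+)bp\s*$")


def parse_sdr_line(line):
    """Return (locus, description, length in bp) for one directory line."""
    m = SDR_LINE.match(line)
    if m is None:
        raise ValueError(f"not a short directory line: {line!r}")
    return m.group(1), m.group(2), int(m.group(3))


if __name__ == "__main__":
    line = "ABCIS1380   A.pasteurianus insertion sequence IS1380.                    1665bp"
    print(parse_sdr_line(line))
```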


In addition to the files illustrated in Tables 1 to 9, the following five 
index files are included in this release. These files were prepared across 
all the categories listed above, irrespective of division.

 Accession number index file
 Keyword phrase index file
 Author name index file
 Journal citation index file
 Gene name index file

A brief description of each file follows.


Table 10. Part of the accession number index file 'ddbjacc.idx'.
The following excerpt from the accession number index file illustrates the 
format of the index. Note that the same accession number may be listed under 
more than one category, such as a taxonomic category together with EST or 
ORG; for example, PRI D12345 and EST D12345 both appear under the accession 
number D12345.
------------------------------------------------------------------------------
M33790       SHFINVEA   BCT M33790
M33791       BACORF2    BCT M33791
M33792       FTRCPRBCLC ORG X55829 FTRCPRBCLC PLN X55829
M33793       FTRCPPRBCL ORG X55830 FTRCPPRBCL PLN X55830
M33794       ATPCPARRBC ORG X55831 ATPCPARRBC PLN X55831 ATPCPRBCLB ORG X15925
             ATPCPRBCLB PLN X15925
M33796       NRACPNTRBC ORG X55827 NRACPNTRBC PLN X55827
M33797       NRACPRBCL  ORG X55828 NRACPRBCL  PLN X55828
M33798       ACCPCACGH  BCT M33798
M33799       PSETRPEGDC BCT M33799
------------------------------------------------------------------------------
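
Accession numbers listed under more than one category can be found 
mechanically.  The Python sketch below uses a hypothetical function name.  
It assumes that a key starts in the first column, that continuation lines 
are indented, and that the remaining tokens come in groups of locus name, 
category and accession number, as in the excerpt above.

```python
from collections import defaultdict


def divisions_by_accession(lines):
    """Map each index key to the set of category codes listed for it."""
    result = defaultdict(set)
    key = None
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue
        if not line[0].isspace():          # a new key starts in column 1
            key, tokens = tokens[0], tokens[1:]
        if key is None:
            continue                       # continuation before any key
        # Remaining tokens are (locus, category, accession) triples.
        for i in range(0, len(tokens) - 2, 3):
            result[key].add(tokens[i + 1])
    return dict(result)


if __name__ == "__main__":
    lines = [
        "M33790       SHFINVEA   BCT M33790",
        "M33794       ATPCPARRBC ORG X55831 ATPCPARRBC PLN X55831 ATPCPRBCLB ORG X15925",
        "             ATPCPRBCLB PLN X15925",
    ]
    for key, divisions in divisions_by_accession(lines).items():
        if len(divisions) > 1:
            print(key, sorted(divisions))
```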


Table 11. Part of the keyword phrase index file 'ddbjkey.idx'.
Keyword phrases consist of names for gene products and other characteristics 
of sequence entries. 
------------------------------------------------------------------------------
A CHANNEL
             DROCHA     INV M17155
A COMPONENT
             SQLCVEA    VRL M38183
A LOCUS
             GORGOGOA3  PRI X54375 GORGOGOA4  PRI X54376
A LOCUS ALLELE
             GORA0101   PRI X60258 GORA0201   PRI X60259 GORA0401   PRI X60257
             GORA0501   PRI X60256
A MULTI-GENE FAMILY
             RICGLUTE   PLN D00584
A PROTEIN
             MS2AAR     PHG M25187 ST1APCS    PHG M25396
A SEQUENCE
             HS5TOA30   VRL D00148 HS5TOA31   VRL D00147
------------------------------------------------------------------------------


Table 12. Part of the author name index file in 'ddbjaut.idx'.
The author name index file lists all of the author names that appear in the 
citations. 
------------------------------------------------------------------------------
ABE,A.
             HUMMHDRBWE PRI M27509 HUMMHDRBWF PRI M27510 HUMMHDRBWG PRI M27511
             YSCGAL11A  PLN M22481
ABE,C.
             S85445     BCT S85445
ABE,E.
             M23442     UNA M23442
ABE,H.
             CHKADF     VRT M55660 CHKCOF     VRT M55659
ABE,K.
             CHPCLAC    PRI D11383 CHPIMRF    PRI D11384 CUGCUR09   PLN X64110
             CUGCUR37   PLN X64111 HPCCEXPA   VRL M55970 HPCCPEP1   VRL D10687
             HPCCPEP2   VRL D10688 HPCHABC82  VRL X51587 HPCNS2APA  VRL M55972
             HPCNS2PA   VRL M55971 HPCNS2PB   VRL M55973 HPCNS5PA   VRL M55974
             MUSKE2     ROD M65255 MUSKE2A    ROD M65256 MZECYS     PLN D10622
             RICCPI     PLN J03469 RICGLUTE   PLN D00584 RICLNOCI   PLN J05595
             RICOCS     PLN M29259 RICORYII   PLN X57658 RICOZA     PLN D90406
             RICOZB     PLN D90407 RICOZC     PLN D90408 S54524     PLN S54524
             S54526     PLN S54526 S54530     PLN S54530 S73960     ROD S73960
------------------------------------------------------------------------------


Table 13. Part of the journal citation index file in 'ddbjjou.idx'.
The journal citation index file lists all of the citations that appear in the 
references. 
------------------------------------------------------------------------------
ACTA BIOCHIM. BIOPHYS. SIN. 23, 246-253 (1992)
             HUMPLASINS PRI M98056
ACTA BIOCHIM. POL. 24, 301-318 (1977)
             LUPTRFJ    RNA K00345 LUPTRFN    RNA K00346
ACTA BIOCHIM. POL. 26, 369-381 (1979)
             BLYTRNPHE  PLN X02683
ACTA BIOCHIM. POL. 29, 143-149 (1982)
             EMEMTA     ORG M32572 EMEMTA     PLN M32572 EMEMTB     ORG M32573
             EMEMTB     PLN M32573 EMEMTC     ORG M32574 EMEMTC     PLN M32574
             EMEMTD     ORG M32575 EMEMTD     PLN M32575 EMEMTE     ORG M32576
             EMEMTE     PLN M32576
ACTA BIOCHIM. POL. 34, 21-27 (1987)
             LUPNOSP    PLN M32571
------------------------------------------------------------------------------


Table 14. Part of the gene name index file in 'ddbjgen.idx'.
This file lists all the gene names that appear in the feature table.
------------------------------------------------------------------------------
AACC8
             STMAACC8   BCT M55426
AACC9
             MPUAACC9   BCT M55427
AACT
             HUMA1ACM   PRI K01500 HUMA1ACMA  PRI X00947 HUMA1ACMB  PRI M18035
             HUMAACT1   PRI M18906 HUMAACT2   PRI M22533 HUMAACTA   PRI J05176
AAD
             INTINTORF  BCT L06418 LMOMO229D  BCT X17478
AAD A1
             ENTAAC3VI  BCT M88012
AAD9
             ENEAAD9A   BCT M69221
AADA
             LMOMO229A  BCT X17479 S52249     BCT S52249 SYNAADA    SYN M60473
             TRNTAAB    BCT M55547 TRNTN21CAS BCT M86913
------------------------------------------------------------------------------
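
Tables 11 to 14 share one layout: a key on a line of its own, followed by 
indented lines of locus name, category and accession number.  Unlike the 
accession number index, the key may contain blanks.  A minimal Python sketch 
for reading this layout follows.  The function name is hypothetical, and the 
sketch assumes that only continuation lines begin with white space.

```python
def parse_phrase_index(lines):
    """Return {key: [(locus, category, accession), ...]}."""
    index = {}
    key = None
    for line in lines:
        if not line.strip():
            continue
        if line[0].isspace():
            if key is None:
                raise ValueError("continuation line before any key")
            tokens = line.split()
            index[key].extend(
                tuple(tokens[i:i + 3]) for i in range(0, len(tokens) - 2, 3)
            )
        else:
            key = line.strip()             # the whole line is the key
            index.setdefault(key, [])
    return index


if __name__ == "__main__":
    lines = [
        "AACC8",
        "             STMAACC8   BCT M55426",
        "AAD",
        "             INTINTORF  BCT L06418 LMOMO229D  BCT X17478",
    ]
    print(parse_phrase_index(lines))
```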


The files in this release are arranged in the following order in non-labeled 
format.

Release note
    ddbjrel.txt      694 records
Category for bacteria, 35304 entries, 82212622 bases
    ddbjbct.seq      3501749 records
Category for EST1 (expressed sequence tag), 208152 entries, 70223599 bases
    ddbjest1.seq     10504138 records
Category for EST2 (expressed sequence tag), 178570 entries, 65477061 bases
    ddbjest2.seq     9673679 records
Category for EST3 (expressed sequence tag), 221270 entries, 82157496 bases
    ddbjest3.seq    11898964 records
Category for EST4 (expressed sequence tag), 100000 entries, 37311509 bases
    ddbjest4.seq     5839522 records
Category for EST5 (expressed sequence tag), 100000 entries, 40922818 bases
    ddbjest5.seq     5727634 records
Category for EST6 (expressed sequence tag), 100000 entries, 37628655 bases
    ddbjest6.seq     5779364 records
Category for EST7 (expressed sequence tag), 100000 entries, 32381063 bases
    ddbjest7.seq     6020901 records
Category for EST8 (expressed sequence tag), 92239 entries, 35775897 bases
    ddbjest8.seq     5383612 records
Category for GSS (Genome Survey Sequence), 25588 entries, 14404703 bases
    ddbjgss.seq      1251744 records
Category for HTG (high throughput genomic sequencing), 541 entries, 44763040 
bases
    ddbjhtg.seq      777104 records
Category for human, 66627 entries, 108114157 bases
    ddbjhum.seq     4724979 records
Category for invertebrates, 27802 entries, 96357501 bases
    ddbjinv.seq     3114220 records
Category for mammals, 11668 entries, 11576779 bases
    ddbjmam.seq      677761 records
Category for patents, 65406 entries, 22742130 bases
    ddbjpat.seq     1912659 records
Category for phages, 1293 entries, 2021900 bases
    ddbjphg.seq      105006 records
Category for plants, 41339 entries, 78831515 bases
    ddbjpln.seq     3345181 records
Category for primates, 5499 entries, 3419314 bases
    ddbjpri.seq      266776 records
Category for RNAs, 5073 entries, 2585117 bases
    ddbjrna.seq      203250 records
Category for rodents, 35540 entries, 42194129 bases
    ddbjrod.seq     2256595 records
Category for STS (sequence tagged site), 49358 entries, 16932888 bases
    ddbjsts.seq     2872816 records
Category for synthetic DNAs, 2496 entries, 5418065 bases
    ddbjsyn.seq      199291 records
Category for unannotated sequences, 2409 entries, 2086320 bases
    ddbjuna.seq      126021 records
Category for viruses, 42064 entries, 41569983 bases
    ddbjvrl.seq     2548412 records
Category for vertebrates, 15877 entries, 15680078 bases
    ddbjvrt.seq      916422 records
Accession number index file
    ddbjacc.idx     1547094 records
Keyword phrase index file
    ddbjkey.idx      732443 records
Author name index file
    ddbjaut.idx     9345309 records
Journal citation index file
    ddbjjou.idx      983325 records
Gene name index file
    ddbjgen.idx      172493 records
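
The per-category counts in the list above can be added up and compared with 
the totals given at the top of this release note (1,534,115 entries, 
992,788,339 bases).  The Python sketch below uses a hypothetical function 
name.  It assumes that every category line has the form "Category for ..., 
N entries, M bases" and tolerates a line break before the word "bases".

```python
import re

CATEGORY = re.compile(r"Category for (.+?), (\d+)\s+entries, (\d+)\s+bases")


def sum_categories(text):
    """Return (number of categories, total entries, total bases)."""
    found = CATEGORY.findall(text)
    entries = sum(int(n) for _, n, _ in found)
    bases = sum(int(b) for _, _, b in found)
    return len(found), entries, bases


if __name__ == "__main__":
    sample = """\
Category for bacteria, 35304 entries, 82212622 bases
    ddbjbct.seq      3501749 records
Category for HTG (high throughput genomic sequencing), 541 entries, 44763040 
bases
    ddbjhtg.seq      777104 records
Category for human, 66627 entries, 108114157 bases
    ddbjhum.seq     4724979 records
"""
    print(sum_categories(sample))
```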