DNA Data Bank of Japan

                              DNA Database

   Release 40, Jan. 2000, including 5,388,125 entries, 4,762,696,173 bases


This database may be copied and redistributed without permission on the 
condition that all the statements in this release note are reproduced in each 
copy.

The present release contains the newest data prepared by the DNA Data Bank of 
Japan (DDBJ), GenBank, and European Molecular Biology Laboratory/European 
Bioinformatics Institute (EMBL/EBI) as of Dec. 16, 1999.  This unified database 
was made possible thanks to the international collaboration among the three
data banks.  All the entries have accordingly been annotated with the feature 
keys common to them. 

All the entries designated by the accession numbers with the prefixes "C", "D", 
"E", "AB", "AG", "AP", "AT", "AU" and "AV" have been collected and processed by 
DDBJ, and the rest have been prepared by GenBank and EMBL/EBI.

Many genome projects are under way worldwide.  Among them, the human genome 
projects have probably been the most productive, yielding a large number of 
ordinary sequences and huge amounts of ESTs.  We therefore provide the human 
(HUM) division solely for human sequences and the primate (PRI) division for 
non-human primate sequences.  Note that the EST division also contains human 
sequences.

The present release does not have the ORG division.  Thus, if you are interested
in human mitochondrial sequences, for example, you are now advised to refer to 
the HUM division.  The HUM division in this release is divided into four 
subdivisions; each of the first three contains 30,000 entries, and the last 
one contains the rest.

This release also includes an independent division (PAT) for patent data.  The 
patent data are those which the Japanese Patent Office (JPO), United States 
Patent and Trademark Office (USPTO), and the European Patent Office (EPO) 
collected and processed.  The accession numbers of the patent data collected 
by JPO start with the prefix "E"; those collected and supplied by USPTO and 
GenBank start with "I" and "AR", respectively; and those collected and 
supplied by EPO and EMBL/EBI start with "A" and "AX", respectively.  The 
entries with the prefixes "I", "AR", "A", "AX", and "E" are allocated to a 
single file (ddbjpat.seq) in the DDBJ format.  Note also that unauthorized use 
of the patent data may cause legal issues, for which we take no responsibility.
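
The prefix rules stated in this note can be collected into a small lookup.  
The following Python sketch is only an illustration: the function name and 
the returned dictionary layout are assumptions made here, and the prefix 
sets are copied from the paragraphs above, not from any DDBJ software.

```python
import re

# Prefixes this release note attributes to DDBJ.
DDBJ_PREFIXES = {"C", "D", "E", "AB", "AG", "AP", "AT", "AU", "AV"}

# Patent prefixes and the office or data bank named for each in this note.
PATENT_PREFIXES = {
    "E": "JPO",
    "I": "USPTO",
    "AR": "GenBank",
    "A": "EPO",
    "AX": "EMBL/EBI",
}


def describe_accession(accession):
    """Classify an accession number by its leading letters (hypothetical helper)."""
    match = re.match(r"[A-Za-z]+", accession.strip())
    prefix = match.group(0).upper() if match else ""
    return {
        "prefix": prefix,
        "processed_by_ddbj": prefix in DDBJ_PREFIXES,
        # None when the prefix is not one of the patent prefixes listed above.
        "patent_source": PATENT_PREFIXES.get(prefix),
    }
```

For example, describe_accession("AB123456") reports the prefix "AB" as 
processed by DDBJ with no patent source, whereas "AX000001" (a made-up 
number) is reported as patent data supplied by EMBL/EBI.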

In the present release, the SOURCE field in the flat file was reviewed and, 
where necessary, revised in accordance with the unified taxonomy database 
common to the three data banks.

The number of ESTs has been increasing at an enormous rate and is expected 
to grow even more rapidly in the future.  To cope with this situation and to 
keep the time needed to handle the data files to a minimum, we have split the 
EST data into 31 files in the present release: ddbjest1 contains entries whose 
accession numbers have prefixes A to M, ddbjest2 those with N to S, ddbjest3 
those with T to Z, and ddbjest4 to ddbjest31 those with two-letter prefixes.  
The 4th to 30th files contain 100,000 entries each, and the 31st contains the 
rest.
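
As a rough illustration of the EST file layout described above, the Python 
sketch below maps an accession number to the file expected to hold it.  The 
function name is hypothetical.  For two-letter prefixes the note gives only 
the range of files, not a rule for choosing among them, so the sketch returns 
the range as a label and does not guess a single file.

```python
import re


def est_file_for(accession):
    """Return the EST file (or file range) for an accession number."""
    match = re.match(r"([A-Za-z]+)\d", accession.strip())
    if match is None:
        raise ValueError("not an accession number: %r" % accession)
    prefix = match.group(1).upper()
    if len(prefix) == 1:
        if "A" <= prefix <= "M":
            return "ddbjest1"
        if "N" <= prefix <= "S":
            return "ddbjest2"
        return "ddbjest3"  # T to Z
    if len(prefix) == 2:
        # The note does not say which of these files holds which prefix.
        return "ddbjest4-ddbjest31"
    raise ValueError("unexpected prefix length: %r" % prefix)
```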

The present release includes the GSS division.  GSS stands for Genome 
Survey Sequence, which is similar to EST except that a GSS is genomic DNA 
whereas an EST is cDNA.  This division is divided into 12 files; each of the 
first 11 files contains 100,000 entries and the last one contains the rest.  
This release also includes High Throughput Genome Sequences (HTGS), which 
come mainly from genome project teams that treat a clone as a sequencing 
unit.  The HTGS entries in this release are distributed in seven files.  The 
first six files contain 2,000 entries each, and the last one contains the rest.
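
The "fixed number of entries per file, remainder in the last file" scheme 
used for the HUM, EST, GSS and HTGS files amounts to a single division with 
remainder.  The sketch below shows the arithmetic in Python; the function 
name is hypothetical, and the total of 13,500 entries in the usage note is an 
invented figure chosen only because it yields seven files, not a count taken 
from this release.

```python
def entries_per_file(total_entries, per_file):
    """Split a division into full files plus one final file holding the rest."""
    full_files, rest = divmod(total_entries, per_file)
    counts = [per_file] * full_files
    if rest:
        counts.append(rest)  # the last file contains the remainder
    return counts
```

With the invented total, entries_per_file(13500, 2000) gives six files of 
2,000 entries followed by one file of 1,500 entries.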

The index files are not included in this release except for ddbjacc.idx, 
ddbjgen.idx, ddbjjou.idx, and ddbjkey.idx.  Instead, we have included a program 
with which you can build the omitted index files.  For the use of the program, 
see the files seq2indexes.doc, seq2indexes.c, and seq2indexes.h in this 
release.

The present release contains amino acid sequences that were translated from 
the corresponding nucleotide sequences in our database.  In the translation 
we took care to account for the fact that some species and organelles use a 
genetic code different from the universal one, and used the appropriate codon 
table.  However, if you find an incorrectly translated codon in a translated 
sequence, please let us know.
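
To show why the choice of codon table matters, here is a minimal Python 
sketch of translation with selectable tables.  It is not the program used to 
build this release.  The table numbers follow the NCBI genetic code numbering 
that /transl_table refers to; table 11 assigns the same amino acids as the 
standard table 1, and table 2 (vertebrate mitochondrial) is included as one 
example of a deviating code.  The function names are assumptions.

```python
from itertools import product

BASES = "TCAG"
# Amino acids for codons in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
VERT_MITO = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"


def make_table(amino_acids):
    """Pair the 64 codons, in TCAG order, with their amino acids."""
    codons = ("".join(c) for c in product(BASES, repeat=3))
    return dict(zip(codons, amino_acids))


CODON_TABLES = {
    1: make_table(STANDARD),
    2: make_table(VERT_MITO),
    11: make_table(STANDARD),  # same amino acid assignments as table 1
}


def translate(sequence, table_id=1):
    """Translate from the first base and stop at the first stop codon."""
    table = CODON_TABLES[table_id]
    seq = sequence.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = table.get(seq[i:i + 3], "X")  # X for ambiguous codons
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)
```

Applied to the first 30 bases of the entry in Table 1 with table 11, the 
sketch returns MSQNTLKVHD, which agrees with the start of the /translation 
shown there.  The codon tga ends translation under table 1 but is read as 
tryptophan under table 2.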

The three data banks include the item VERSION in the flat file, which indicates 
the version of a submitted nucleotide sequence (see Table 1).  It is expressed 
like AB123456.1, in which the digit(s) after the period are the version number.  
VERSION was added because a submitted sequence is sometimes revised by the 
submitter, so that the accession number alone cannot specify the sequence in 
question, which causes trouble for the user.  The number is increased by one 
every time a revised sequence is made public.  Likewise, the translated 
protein sequence is accompanied by a /protein_id, which is expressed like 
BAA12345.1, in which the digit(s) after the period are again the version 
number.  This number is increased by one when the corresponding nucleotide 
sequence is revised, the protein sequence changes as a result, and the revised 
protein sequence is made public.
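
The "accession.version" form can be taken apart with a few lines of code.  
The Python sketch below is an illustration under the format described above; 
the function name is hypothetical, and the same routine is assumed to apply 
to /protein_id values because they use the same layout.

```python
def split_version(identifier):
    """Split 'AB123456.1' into ('AB123456', 1)."""
    accession, dot, version = identifier.strip().rpartition(".")
    if not dot or not accession or not version.isdigit():
        raise ValueError("expected ACCESSION.VERSION, got %r" % identifier)
    return accession, int(version)
```

Two identifiers refer to the same record when their accession parts are 
equal; the one with the larger version number is the more recent revision.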

Starting with the present release, the RNA division is discontinued.  The 
existing RNA data have been redistributed according to the category of the 
organism.  Therefore, you will find a human RNA sequence, for example, in the 
HUM division.

This release was published by the following DDBJ staff.

General administration
    T. Gojobori, T. Imanishi, Y. Fukuma, A. Watanabe, Y. Ueda, Y. Katsube, 
    K. Okuda, J. Sugiyama, M. Maruyama, H. Tsutsui(hold)
Database construction
    Y. Tateno, M. Ota, S. Miyazaki(hold), N. Yasuda, Y. Sato, H. Tsutsui, 
    M. Hirashima, A. Hasegawa, A. Suzuki, Y. Yamamoto, M. Ejima, N. Asakawa,
    M. Gojobori,  M. Imma, J. Mashima, M. Suzuki, M. Okaneya, A. Shimada,
    A. Hashizume, N. Mukasa, T. Umezawa
Database software development and management
    H. Sugawara, S. Miyazaki, T. Okayama, S. Misu, T. Mizunuma, Y. Kawanishi,
    K. Goto, K. Mamiya, M. Kikuchi(hold), H. Hashimoto, H. Harimoto,
    Y. Minesaki, S. Kanai, K. Suzuki, S. Sato, T. Takaki, K. Kaneda, Y. Sugiyama
System management
    K. Nishikawa, K. Ikeo, K. Yoshioka, T. Osawa, I. Mochizuki, M. Kikuchi, 
    T. Narita, M. Nagura, N. Hoshi
Editorial and public relations
    N. Saitou, T. Kawamoto, S. Nagira, H. Ichikawa, Y. Daito, K. Ichikawa


DNA Data Bank of Japan
Center for Information Biology
National Institute of Genetics
Mishima 411-8540, Japan 
Phone:  +81 559 81 6853
FAX:    +81 559 81 6849
E-mail: ddbj@ddbj.nig.ac.jp  (for general inquiry)
        ddbjsub@ddbj.nig.ac.jp  (for data submission)
        ddbjupdt@ddbj.nig.ac.jp (for updates and notification of publication)
WWW:    http://www.ddbj.nig.ac.jp (for DDBJ WWW server)
        http://sakura.ddbj.nig.ac.jp (for DDBJ sequence data submission system 
                                   SAKURA)

Acknowledgement: We are grateful to NCBI and EMBL/EBI for their firm friendship 
and excellent collaboration.  We also thank the Japanese Patent Office for its 
steady cooperation.  The operation of DDBJ is supported by the Ministry of 
Education, Science, Sports and Culture, which we gratefully acknowledge.


DDBJ Database Release History

Release  Date     Entries     Bases          Comments
------------------------------------------------------------------------------
 40     01/00   5,388,125   4,762,696,173   RNA division eliminated
 39     10/99   4,810,773   3,728,000,562   NID and PID discarded
 38     07/99   4,294,369   3,098,519,597
 37     03/99   3,311,627   2,375,261,951   VERSION, /protein_id started
 36     01/99   3,073,166   2,190,425,560
 35     10/98   2,759,261   1,957,341,169
 34     07/98   2,412,785   1,708,580,623
 33     04/98   2,174,769   1,479,303,279
 32     01/98   1,956,669   1,300,950,613
 31     10/97   1,731,532   1,139,869,464   Adoption of the unified taxonomy
                                            database
 30     07/97   1,534,115     992,788,339   NID and PID started
 29     04/97   1,270,194     841,415,232   
 28     01/97   1,154,120     756,785,219   HTG division started
                                            ORG division eliminated
 27     10/96     936,697     608,103,057   GSS division started
 26     07/96     835,552     551,932,448   
 25     04/96     744,490     499,300,364   /translation started
 24     01/96     637,508     431,771,652   
 23     10/95     569,757     390,694,350   
 22     07/95     437,588     322,982,425   HUM division started
 21     04/95     274,596     250,875,023   
 20     01/95     239,689     231,299,557   
 19     10/94     204,332     205,274,131   
 18     07/94     185,230     192,473,021   
 17     04/94     169,957     179,942,209   
 16     01/94     154,626     165,017,628   
 15     10/93     131,649     147,224,690   
 14     07/93     120,350     138,686,333   
 13     04/93     112,067     129,784,445   
 12     01/93      97,683     120,815,244   EST division started
 11     07/92      65,693      84,839,075   
 10     01/92      59,317      77,805,556   GenBank/EMBL inclusion started
  9     07/91       1,130       2,002,124   
  8     01/91         879       1,573,442   
  7     07/90         681       1,154,211   
  6     01/90         496         841,236   
  5     07/89         395         679,378   
  4     01/89         302         535,985   
  3     07/88         230         345,850   
  2     01/88         142         199,392   
  1     07/87          66         108,970   Started with DDBJ only
------------------------------------------------------------------------------
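
The rows of the release history can be read mechanically to derive figures 
such as the mean entry length.  The Python sketch below is one way to do so; 
the function names are hypothetical, and it handles only rows that begin with 
a release number, not the wrapped comment lines.

```python
def parse_history_row(line):
    """Parse one row: release, date (MM/YY), entries, bases, comment."""
    fields = line.split()
    return {
        "release": int(fields[0]),
        "date": fields[1],
        "entries": int(fields[2].replace(",", "")),
        "bases": int(fields[3].replace(",", "")),
        "comment": " ".join(fields[4:]),
    }


def mean_entry_length(row):
    """Average number of bases per entry for a parsed row."""
    return row["bases"] / row["entries"]
```

For Release 40 this gives a mean of roughly 884 bases per entry 
(4,762,696,173 bases over 5,388,125 entries).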


This release covers 17 categories of organisms and others as follows:
------------------------------------------------------------------------------
ddbjbct.*** Category for bacteria
ddbjest.*** Category for EST (expressed sequence tag)
ddbjhtg.*** Category for HTG (high throughput genomic sequencing)
ddbjhum.*** Category for human
ddbjgss.*** Category for GSS (Genome Survey Sequence)
ddbjinv.*** Category for invertebrates
ddbjmam.*** Category for mammals other than primates and rodents
ddbjpat.*** Category for patents
ddbjphg.*** Category for phages
ddbjpln.*** Category for plants
ddbjpri.*** Category for primates other than human
ddbjrod.*** Category for rodents
ddbjsts.*** Category for STS (sequence tagged site)
ddbjsyn.*** Category for synthetic DNAs
ddbjuna.*** Category for unannotated sequences
ddbjvrl.*** Category for viruses
ddbjvrt.*** Category for vertebrates other than mammals
------------------------------------------------------------------------------


Each category has the following nine files.  Note that all the files except 
for ddbj***.seq must be created by the user with seq2indexes, as mentioned 
in the release note.
------------------------------------------------------------------------------
ddbj***.seq  List of the entries in DDBJ format, see Table 1.
ddbj***.acc  List of the accession numbers, see Table 2.
ddbj***.aut  List of the authors, see Table 3.
ddbj***.dir  List of the short directory in DDBJ style, see Table 4.
ddbj***.idx  List of indices, see Table 5.
ddbj***.jou  List of the journals, see Table 6.
ddbj***.key  List of the key words, see Table 7.
ddbj***.org  List of the species names, see Table 8.
ddbj***.sdr  List of the short directory in DDBJ style, see Table 9.
------------------------------------------------------------------------------


Table 1. Part of the contents in the file 'ddbjbct.seq'.
This shows all pieces of information on one entry in DDBJ format.
------------------------------------------------------------------------------
LOCUS       D87069        993 bp    mRNA            BCT       07-FEB-1999
DEFINITION  Escherichia coli mRNA for RNA polymerase sigma subunit, truncated
            form of sigma-38, complete cds.
ACCESSION   D87069
VERSION     D87069.1
KEYWORDS    RNA polymerase sigma subunit, truncated form of sigma-38.
SOURCE      Escherichia coli (strain:W3110) cDNA to mRNA.
  ORGANISM  Escherichia coli
            Bacteria; Proteobacteria; gamma subdivision; Enterobacteriaceae;
            Escherichia.
REFERENCE   1  (bases 1 to 993)
  AUTHORS   Jishage,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (14-AUG-1996) to the DDBJ/EMBL/GenBank databases. Miki
            Jishage, National Institute of Genetics, Molecular Genetics; Yata
            1111, Mishima, Shizuoka 411, Japan (E-mail:mjishage@lab.nig.ac.jp,
            Tel:0559-81-6742, Fax:0559-81-6746)
  STANDARD  full staff_review
REFERENCE   2  (bases 1 to 993)
  AUTHORS   Jishage,M. and Ishihama,A.
  TITLE     Variation in RNA polymerase sigma subunit composition within
            different stocks of Escherichia coli starin W3110
  JOURNAL   Unpublished (1996)
  STANDARD  full staff_review
REFERENCE   3  (sites)
  AUTHORS   Ivanova,A., Renshaw,M., Guntaka,R. and Eisenstark,A.
  TITLE     DNA base sequence variability in katF (putative sigma factor) gene
            Escherichia coli
  JOURNAL   Nucleic Acids Res. 20, 5479-5480 (1992)
  STANDARD  full staff_review
REFERENCE   4  (sites)
  AUTHORS   Takayanagi,Y., Tanaka,K. and Takahashi,H.
  TITLE     Structure of the 5' upstream region and the regulation of the rpoS
            gene of Escherichia coli
  JOURNAL   Mol Gen Genet 243, 525-531 (1994)
  STANDARD  full staff_review
COMMENT     
FEATURES             Location/Qualifiers
     source          1..993
                     /organism="Escherichia coli"
                     /sequenced_mol="cDNA to mRNA"
                     /strain="W3110"
     CDS             1..810
                     /note="the gene has four single base changes, resulting
                     in two amino acid substitutions and an amber mutation"
                     /product="RNA polymerase sigma subunit, truncated form of
                     sigma-38"
                     /protein_id="BAA13238.1"
                     /translation="MSQNTLKVHDLNEDAEFDENGVEVFDEKALVEYEPSDNDLAEEE
                     LLSQGATQRVLDATQLYLGEIGYSPLLTAEEEVYFARRALRGDVASRRRMIESNLRLV
                     VKIARRYGNRGLALLDLIEEGNLGLIRAVEKFDPERGFRFSTYATWWIRQTIERAIMN
                     QTRTIRLPIHIVKELNVYLRTARELSHKLDHEPSAEEIAEQLDKPVDDVSRMLRLNER
                     ITSVDTPLGGDSEKALLDILADEKENGPEDTTQDDDMKQSIVKWLFELNAK"
                     /transl_table=11
     mutation        75
                     /citation=[3]
                     /replace="t"
     mutation        97
                     /citation=[3]
                     /replace="t"
     mutation        99
                     /citation=[3]
                     /replace="t"
     mutation        808
                     /citation=[3]
                     /replace="t"
BASE COUNT      254 a    223 c    291 g    225 t      0 others
ORIGIN      
        1 atgagtcaga atacgctgaa agttcatgat ttaaatgaag atgcggaatt tgatgagaac
       61 ggagttgagg tttttgacga aaaggcctta gtagaatatg aacccagtga taacgatttg
      121 gccgaagagg aactgttatc gcagggagcc acacagcgtg tgttggacgc gactcagctt
      181 taccttggtg agattggtta ttcaccactg ttaacggccg aagaagaagt ttattttgcg
      241 cgtcgcgcac tgcgtggaga tgtcgcctct cgccgccgga tgatcgagag taacttgcgt
      301 ctggtggtaa aaattgcccg ccgttatggc aatcgtggtc tggcgttgct ggaccttatc
      361 gaagagggca acctggggct gatccgcgcg gtagagaagt ttgacccgga acgtggtttc
      421 cgcttctcaa catacgcaac ctggtggatt cgccagacga ttgaacgggc gattatgaac
      481 caaacccgta ctattcgttt gccgattcac atcgtaaagg agctgaacgt ttacctgcga
      541 accgcacgtg agttgtccca taagctggac catgaaccaa gtgcggaaga gatcgcagag
      601 caactggata agccagttga tgacgtcagc cgtatgcttc gtcttaacga gcgcattacc
      661 tcggtagaca ccccgctggg tggtgattcc gaaaaagcgt tgctggacat cctggccgat
      721 gaaaaagaga acggtccgga agataccacg caagatgacg atatgaagca gagcatcgtc
      781 aaatggctgt tcgagctgaa cgccaaatag cgtgaagtgc tggcacgtcg attcggtttg
      841 ctggggtacg aagcggcaac actggaagat gtaggtcgtg aaattggcct cacccgtgaa
      901 cgtgttcgcc agattcaggt tgaaggcctg cgccgtttgc gcgaaatcct gcaaacgcag
      961 gggctgaata tcgaagcgct gttccgcgag taa
//
------------------------------------------------------------------------------
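
A reader for the flat-file layout in Table 1 can be quite short when only the 
LOCUS line and the sequence are needed.  The Python sketch below assumes the 
column order shown in Table 1 (locus name, length, "bp", molecule type, 
division, date) and that "//" ends an entry.  The function name and the 
sample entry EXAMPLE1 are invented for illustration.

```python
# Hypothetical miniature entry in the layout of Table 1.
SAMPLE = """\
LOCUS       EXAMPLE1       12 bp    DNA             SYN       01-JAN-2000
ORIGIN
        1 atgagtcaga at
//
"""


def parse_entry(text):
    """Extract locus name, declared length, division and sequence."""
    record = {"locus": None, "length": None, "division": None}
    chunks = []
    in_origin = False
    for line in text.splitlines():
        if line.startswith("//"):
            break  # end of entry
        if in_origin:
            # Drop the leading position number, keep the blocks of bases.
            chunks.extend(line.split()[1:])
        elif line.startswith("LOCUS"):
            fields = line.split()
            record["locus"] = fields[1]
            record["length"] = int(fields[2])
            record["division"] = fields[-2]
        elif line.startswith("ORIGIN"):
            in_origin = True
    record["sequence"] = "".join(chunks)
    return record
```

Comparing len(record["sequence"]) with record["length"] is a cheap check 
that an entry was read completely.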


Table 2. Part of the contents in the file 'ddbjbct.acc'.
The first column gives the secondary accession number, the second the locus 
name, and the third the primary accession number. The primary number may be 
the same as the secondary number. The lines are arranged in ascending order 
of the secondary accession numbers.
------------------------------------------------------------------------------
D00001 -> ECOPBPAA   X04516
D00002 -> ECOPYRH    X04469
D00006 -> PNS981TET  D00006
D00020 -> COLE2LYS   D00020
D00021 -> COLE31YS   D00021
D00038 -> BRLAM330   D00038
D00066 -> BAC139AC   D00066
D00067 -> ECONANA    M20207
D00069 -> ECOUVRD2   D00069
D00087 -> BACXYNAA   D00087
------------------------------------------------------------------------------
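
Lines in the arrow layout of Table 2 split cleanly on "->".  The following 
Python sketch is an assumption about how one might read them, not part of 
seq2indexes; the function name is hypothetical.

```python
def parse_acc_line(line):
    """Return (secondary accession, locus name, primary accession)."""
    left, arrow, right = line.rpartition("->")
    if not arrow:
        raise ValueError("no arrow in line: %r" % line)
    locus, primary = right.split()
    return left.strip(), locus, primary
```

A line whose first and last fields are equal, such as the one for D00006, 
describes an entry whose secondary and primary accession numbers coincide.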


Table 3. Part of the contents in the file 'ddbjbct.aut'.
For each author name given to the left of the arrow, the corresponding locus 
name and primary accession number are listed on the right, in that order. They 
are arranged in alphabetical order of the author names.
------------------------------------------------------------------------------
Aan,F. -> STYCRR     X05210
Aan,F. -> STYENZI    M76176
Aaronson,W. -> ECOKPSD    M64977
Aaronson,W. -> ECONEUA    J05023
Abad-Lapuebla,M.A. -> VIBTDHI    D90238
Abdel-Mawgood,A.L. -> CYAPSBHA   X16394
Abdel-Meguid,S.S. -> TRNGDRECM  J01843
Abdelal,A. -> STYCARA    M36540
Abdelal,A. -> STYCARAB   X13200
Abdelal,A.H. -> PSENOSA    M60717
------------------------------------------------------------------------------


Table 4. Part of the short directory in DDBJ style in the file 'ddbjbct.dir'.
For each locus name given in the first column, the corresponding primary 
accession number, molecular type, number of nucleotide pairs, and description 
of the locus are listed in that order. They are arranged in alphabetical 
order of the locus names.
------------------------------------------------------------------------------
ABCAARAA   M34830 ds-DNA    1624 A.aceti acetic acid resistance protein (aarA)
gene, complete cds.
ABCADHCC   D00635 ds-DNA    4230 A. polyoxogenes alcohol dehydrogenase (EC 
1.1.99.8) and cytochrome c genes.
ABCALDH    D00521 ds-DNA    2683 A.polyoxogenes membrane-bound aldehyde 
dehydrogenase gene, complete cds and flanks.
ABCBCSAA   M37202 ds-DNA    9540 A.xylinum bcs B, bcs C and bcs D genes, 
complete cds and bcs A gene, partial cds.
ABCCELA    M76548 ds-DNA    1165 Acetobacter xylinum UDP pyrophosphorylase 
(celA) gene, complete cds.
ABCCELSYN  X54676 ds-DNA    5363 A. xylinum gene for cellulose biosynthesis
ABCIS1380  D10043 ds-DNA    1665 A.pasteurianus insertion sequence IS1380.
ACAADH1    D90004 ds-DNA    2467 Acetobacter aceti(K6033) alcohol dehydrogenase
subunit gene(adh1).
ACCAAC2    M62833 ds-DNA    1123 Acinetobacter baumannii aminoglycoside 
acetyltransferase (aac2) gene, complete cds.
ACCACEAA   M62822 ds-DNA    1874 A.baumannii chloramphenicol acetyltransferase
(cat) gene, complete cds.
------------------------------------------------------------------------------


Table 5. Part of the contents in the file 'ddbjbct.idx'.
The first column gives the locus name, the second the byte offset at which 
the entry starts, and the third the byte offset at which it ends. They are 
arranged in alphabetical order of the locus names.
------------------------------------------------------------------------------
%*****************************
#ABCAARAA       0       3211
#ABCADHCC       3212    10608
#ABCALDH        10609   15864
#ABCBCSAA       15865   29583
#ABCCELA        29584   32289
#ABCCELSYN      32290   40960
#ABCIS1380      40961   44711
#ACAADH1        44712   49357
#ACCAAC2        49358   52395
------------------------------------------------------------------------------
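
Byte offsets of this kind allow one entry to be read without scanning the 
whole sequence file.  The Python sketch below assumes that the ending offset 
is inclusive, which is inferred from the excerpt (each entry starts one byte 
after the previous one ends) and not stated in this note.  The function names 
are hypothetical.

```python
def parse_idx_line(line):
    """Return (locus, start, end) for a '#' line, or None for other lines."""
    if not line.startswith("#"):
        return None  # e.g. the '%****' header line
    locus, start, end = line[1:].split()
    return locus, int(start), int(end)


def read_entry(seq_file, start, end):
    """Read bytes start..end (inclusive) from a file opened in binary mode."""
    seq_file.seek(start)
    return seq_file.read(end - start + 1)
```

For instance, after parsing "#ABCADHCC       3212    10608", calling 
read_entry on ddbjbct.seq opened with mode "rb" would return the 7,397 bytes 
from offset 3212 through offset 10608.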


Table 6. Part of the contents in the file 'ddbjbct.jou'.
This gives information on the journal in which sequence data were published.
------------------------------------------------------------------------------
(in) Chaloupka,J. and Krumphanzl,V. (Eds.); Extracellular Enzymes of 
Microorganisms:  129-137, Plenum Press, New York (1987) -> BACAMYABS  M57457
(in) Ganesan,A.T., Chang,S. and Hoch,J.A. (Eds.); Molecular Cloning and Gene 
Regulation in Bacilli:  3-10, Academic Press, New York (1982) -> BACRG16S   
M55011
(in) Ganesan,A.T., Chang,S. and Hoch,J.A. (Eds.); Molecular Cloning and Gene 
Regulation in Bacilli:  3-10, Academic Press, New York (1982) -> BACRG16SA  
M55006
(in) Ganesan,A.T., Chang,S. and Hoch,J.A. (Eds.); Molecular Cloning and Gene 
Regulation in Bacilli:  3-10, Academic Press, New York (1982) -> BACRG16SB  
M55008
(in) Hoch,J.A. and Setlow,P. (Eds.); Molecular Biology of Microbial 
Differentiation:  85-94, American Society for Microbiology, Washington, DC 
(1985) -> BACSPOII   M57606
(in) Holmgren,A. (Ed.); Thioredoxin and Glutaredoxin Systems: Structure and 
Function: 11-19, Unknown name, Unknown city (1986) -> ECOTRXA1   M54881
(in) Kjeldgaard,N.C. and Maaloe,O. (Eds.); Control of ribosome synthesis:  
138-143, Academic Press, New York (1976) -> ECOLAC     J01636
(in) Losick,R. and Chamberlin,M. (Eds.); RNA polymerase:  455-472, Cold 
Spring Harbor Laboratory, Cold Spring Harbor, NY (1976) -> ECOTGY1    K01197
(in) Sikes,C.S. and Wheeler,A.P. (Eds.); Surface reactive peptides and 
polymers. Discovery and commercialization.:  186-200, American Chemical 
Society, Washington, D.C. (1991) -> ECOTGP     J01714
(in) Sund,H. and Blauer,G. (Eds.); Protein-Ligand Interactions:  193-207, 
Walter de Gruyter, New York (1975) -> ECOLAC     J01636
(in) Wu,R. and Grossman,L. (Eds.); Methods in Enzymology, Recombinant DNA, 
part E:  In press, Academic Press, New York, N.Y. (1986) -> PLMCG      M11320
Acta Microbiol. Pol. 35, 175-190 (1986) -> ECOTGG1    M54893
Actinomycetologica 5, 14-17 (1991) -> STMARGG    D00799
Adv. Biophys. 21, 115-133 (1986) -> R10REP     M26840
Adv. Biophys. 21, 175-192 (1986) -> ECONUSAA   M26839
Adv. Enzyme Regul. 21, 225-237 (1983) -> ECOPURFA   M26893
Adv. Exp. Med. Biol. 195, 239-246 (1986) -> ECOAPT     M14040
Agric. Biol. Chem. 50, 2155-2158 (1986) -> ECONANA    M20207
Agric. Biol. Chem. 50, 2771-2778 (1986) -> BRLAM330   D00038
Agric. Biol. Chem. 51, 2019-2022 (1987) -> BACCGT     D00129
Agric. Biol. Chem. 51, 2641-2648 (1987) -> STRSAGP    D00219
Agric. Biol. Chem. 51, 2807-2809 (1987) -> BACPGECR   M35503
Agric. Biol. Chem. 51, 3133-3135 (1987) -> BACXYLAP   D00312
Agric. Biol. Chem. 51, 455-463 (1987) -> BACHDCRY   D00117
Agric. Biol. Chem. 51, 953-955 (1987) -> BACXYNAA   D00087
Agric. Biol. Chem. 52, 1565-1573 (1988) -> BACIP135   D00348
Agric. Biol. Chem. 52, 1785-1789 (1988) -> BACTMR     D00343
Agric. Biol. Chem. 52, 2243-2246 (1988) -> PSEGI      D00342
Agric. Biol. Chem. 52, 399-406 (1988) -> BACAMYEB   M35517
Agric. Biol. Chem. 52, 479-487 (1988) -> ECAPALI    D00217
------------------------------------------------------------------------------


Table 7. Part of the contents in the file 'ddbjbct.key'.
For the locus name and accession number given to the right of the arrow, the 
corresponding key words are listed on the left. 
------------------------------------------------------------------------------
A.aceti acetic acid resistance protein (aarA) gene, complete cds.       -> 
ABCAARAA     M34830
acetic acid resistance protein.         -> ABCAARAA     M34830
Cloning of genes responsible for acetic acid resistance in acetobacter aceti   
-> ABCAARAA      M34830
A. polyoxogenes alcohol dehydrogenase (EC 1.1.99.8) and cytochrome c genes.    
-> ABCADHCC      D00635
alcohol dehydrogenase; cytochrome c.    -> ABCADHCC     D00635
Cloning and sequencing of the gene cluster encoding two subunits of membrane-
bound alcohol dehydrogenase from Acetobacter polyoxogenes  -> ABCADHCC     
D00635
These data kindly submitted in computer readable form by: Toshimi Tamaki 
Nakano Central Biochemical Institute 2-6 Nakamura-cho Handa-shi, Aichi-ken 
475 Japan Phone: 0569-21-3331 Fax: 0569-23-8486     -> ABCADHCC     D00635
A.polyoxogenes membrane-bound aldehyde dehydrogenase gene, complete cds and 
flanks.     -> ABCALDH      D00521
aldehyde dehydrogenase gene; ethanol oxidation; membrane-bound enzyme.  -> 
ABCALDH      D00521
Nucleotide sequence of the membrane-bound aldehyde dehydrogenase gene from 
Acetobacter polyoxogenes     -> ABCALDH      D00521
------------------------------------------------------------------------------


Table 8. Part of the contents in the file 'ddbjbct.org'.
For the locus name and accession number given to the right of the arrow, the 
corresponding taxonomic names are listed on the left.  They are arranged in 
alphabetical order of the species names.
------------------------------------------------------------------------------
A. nidulans 6301 DNA. Anacystis nidulans Prokaryota; Bacteria; Gracilicutes; 
Oxyphotobacteria; Cyanobacteria.   -> ANIRUBPS     X00019
A. nidulans DNA, clone pAN4. Anacystis nidulans Prokaryota; Bacteria; 
Gracilicutes; Oxyphotobacteria; Cyanobacteria.    -> ANIRGGX      X00343
A. nidulans DNA. Anacystis nidulans Prokaryota; Bacteria; Gracilicutes; 
Oxyphotobacteria; Cyanobacteria.        -> ANIRGG       X00512
A. polyoxogenes genomic DNA. Acetobacter polyoxogenes Prokaryota; Bacteria; 
Gracilicutes; Scotobacteria; Aerobic rods and cocci; Azotobacteraceae.      
-> ABCADHCC     D00635
A. quadruplicatum (strain PR-6) DNA, clone pAQPR1. Agmenellum quadruplicatum 
Prokaryota; Bacteria; Gracilicutes; Oxyphotobacteria; Cyanobacteria.       -> 
AQUPCAB      K02660
A. quadruplicatum (strain PR6) DNA. Agmenellum quadruplicatum Prokaryota; 
Bacteria; Gracilicutes; Oxyphotobacteria; Cyanobacteria.      -> AQUCPCAB     
K02659
A. vinelandii DNA. Azotobacter vinelandii Prokaryota; Bacteria; Gracilicutes; 
Scotobacteria; Aerobic rods and cocci; Azotobacteraceae.  -> AVINIFUSV    
M17349
A.aceti (strain 10-8) DNA, clone pAR1611. Acetobacter aceti Prokaryota; 
Bacteria; Gracilicutes; Scotobacteria; Aerobic rods and cocci; 
Azotobacteraceae.       -> ABCAARAA      M34830
A.actinomycetemcomitans (strain JP2) DNA, clone lambda-OP8. Actinobacillus 
actinomycetemcomitans Prokaryota; Bacteria; Gracilicutes; Scotobacteria; 
Facultatively anaerobic rods; Pasteurellaceae.      -> ACNLKTXN     M27399
A.anitratum DNA, clone pLJD1. Acinetobacter anitratum Prokaryota; Bacteria; 
Gracilicutes; Scotobacteria; Neisseriaceae.         -> ACCCITSYN    M33037
------------------------------------------------------------------------------


Table 9. Part of the short directory in DDBJ style in the file 'ddbjbct.sdr'.
The short directory file contains a brief description of every sequence 
entry, in DDBJ style. 
------------------------------------------------------------------------------
ABCAARAA    A.aceti acetic acid resistance protein (aarA) gene, complete 1624bp
ABCADHCC    A. polyoxogenes alcohol dehydrogenase (EC 1.1.99.8) and      4230bp
ABCALDH     A.polyoxogenes membrane-bound aldehyde dehydrogenase gene,   2683bp
ABCBCSABCD  A.xylinum bcs A, B, C and D genes, complete cds's.           9540bp
ABCCELA     Acetobacter xylinum UDP pyrophosphorylase (celA) gene,       1165bp
ABCCELSYN   A. xylinum gene for cellulose biosynthesis                   5363bp
ABCIS1380   A.pasteurianus insertion sequence IS1380.                    1665bp
ACAADH1     Acetobacter aceti(K6033) alcohol dehydrogenase subunit       2467bp
ACCAAC2     Acinetobacter baumannii aminoglycoside acetyltransferase     1123bp
ACCACEAA    A.baumannii chloramphenicol acetyltransferase (cat) gene,    1874bp
ACCAPHA6    Acinetobacter baumannii aphA-6 gene.                         1170bp
ACCBENABCA  A.calcoaceticus BenA, BenB, BenC, BenD, and BenE proteins   15922bp
ACCCAT      Acinetobacter calcoaceticus cat operon.                     15922bp
ACCCATAM    A.calcoaceticus catA and catM genes, encoding catechol 1,    5537bp
ACCCHMO     Acinetobacter sp. cyclohexanone monooxygenase gene, complete 2128bp
ACCCITSYN   A.anitratum citrate synthase gene, complete cds.             1895bp
------------------------------------------------------------------------------


In addition to the nine files shown in Tables 1 to 9, the following four index 
files are included in this release. These files were prepared irrespective of 
the 17 categories listed above.

 Accession number index file
 Keyword phrase index file
 Journal citation index file
 Gene name index file

A brief description is given for each file in the following.


Table 10. Part of the accession number index file 'ddbjacc.idx'.
The following excerpt from the accession number index file illustrates the 
format of the index.
------------------------------------------------------------------------------
D00100    PSEASPAA   BCT D00100    
D00101    RABNP450R  MAM D00101    
D00102    HUMLTX     HUM D00102    
D00103    AFARRN5SA  BCT D00103   AFRRN5SA   BCT X05517    
D00104    AFARRN5SB  BCT D00104   AFRRN5SB   BCT X05518    
D00105    AFARRN5S   BCT D00105   ASRRN5S    BCT X05524    
D00106    ACH5SRR    BCT D00106   AXRRN5S    BCT X05522   AXRRN5SA   BCT X05523    
D00107    ACH5SRRX   BCT D00107   ACRRN5S    BCT X05521 
------------------------------------------------------------------------------
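
In the excerpt of Table 10, each line begins with an accession number and is 
followed by one or more groups of locus name, division and accession number.  
The Python sketch below reads a line on that assumption; the grouping in 
threes is inferred from the excerpt only, and the function name is 
hypothetical.

```python
def parse_acc_index_line(line):
    """Return (accession, [(locus, division, accession), ...])."""
    tokens = line.split()
    key, rest = tokens[0], tokens[1:]
    if len(rest) % 3 != 0:
        raise ValueError("fields after the key are not in groups of three")
    groups = [tuple(rest[i:i + 3]) for i in range(0, len(rest), 3)]
    return key, groups
```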


Table 11. Part of the keyword phrase index file 'ddbjkey.idx'.
Keyword phrases consist of names for gene products and other characteristics 
of sequence entries. 
------------------------------------------------------------------------------
A CHANNEL
             DROCHA     INV M17155
A COMPONENT
             SQLCVEA    VRL M38183
A LOCUS
             GORGOGOA3  PRI X54375 GORGOGOA4  PRI X54376
A LOCUS ALLELE
             GORA0101   PRI X60258 GORA0201   PRI X60259 GORA0401   PRI X60257
             GORA0501   PRI X60256
A MULTI-GENE FAMILY
             RICGLUTE   PLN D00584
A PROTEIN
             MS2AAR     PHG M25187 ST1APCS    PHG M25396
A SEQUENCE
             HS5TOA30   VRL D00148 HS5TOA31   VRL D00147
------------------------------------------------------------------------------


Table 12. Part of the author name index file in 'ddbjaut.idx'.
The author name index file lists all of the author names that appear in the 
citations. 
------------------------------------------------------------------------------
ABE,A.
             HUMMHDRBWE PRI M27509 HUMMHDRBWF PRI M27510 HUMMHDRBWG PRI M27511
             YSCGAL11A  PLN M22481
ABE,C.
             S85445     BCT S85445
ABE,E.
             M23442     UNA M23442
ABE,H.
             CHKADF     VRT M55660 CHKCOF     VRT M55659
ABE,K.
             CHPCLAC    PRI D11383 CHPIMRF    PRI D11384 CUGCUR09   PLN X64110
             CUGCUR37   PLN X64111 HPCCEXPA   VRL M55970 HPCCPEP1   VRL D10687
             HPCCPEP2   VRL D10688 HPCHABC82  VRL X51587 HPCNS2APA  VRL M55972
             HPCNS2PA   VRL M55971 HPCNS2PB   VRL M55973 HPCNS5PA   VRL M55974
             MUSKE2     ROD M65255 MUSKE2A    ROD M65256 MZECYS     PLN D10622
             RICCPI     PLN J03469 RICGLUTE   PLN D00584 RICLNOCI   PLN J05595
             RICOCS     PLN M29259 RICORYII   PLN X57658 RICOZA     PLN D90406
             RICOZB     PLN D90407 RICOZC     PLN D90408 S54524     PLN S54524
             S54526     PLN S54526 S54530     PLN S54530 S73960     ROD S73960
------------------------------------------------------------------------------


Table 13. Part of the journal citation index file in 'ddbjjou.idx'.
The journal citation index file lists all of the citations that appear in the 
references. 
------------------------------------------------------------------------------
ACTA BIOCHIM. BIOPHYS. SIN. 23, 246-253 (1992)
             HUMPLASINS HUM M98056    
ACTA BIOCHIM. BIOPHYS. SIN. 28, 233-239(1996)
             TKTII     PLN X82230    
ACTA BIOCHIM. POL. 24, 301-318 (1977)
             LUPTRFJ   PLN K00345    LUPTRFN   PLN K00346    
ACTA BIOCHIM. POL. 26, 369-381(1979)
             HVTRNPHE  PLN X02683    
ACTA BIOCHIM. POL. 29, 143-149 (1982)
             EMEMTA    PLN M32572    EMEMTB    PLN M32573    EMEMTC    PLN M32574    
             EMEMTD    PLN M32575    EMEMTE    PLN M32576    
ACTA BIOCHIM. POL. 34, 21-27 (1987)
             LUPNOSP    PLN M32571
------------------------------------------------------------------------------


Table 14. Part of the gene name index file in 'ddbjgen.idx'.
This file lists all the gene names that appear in the feature table.
------------------------------------------------------------------------------
AACC8
             STMAACC8   BCT M55426
AACC9
             MPUAACC9   BCT M55427
AACT
             HUMA1ACM   PRI K01500 HUMA1ACMA  PRI X00947 HUMA1ACMB  PRI M18035
             HUMAACT1   PRI M18906 HUMAACT2   PRI M22533 HUMAACTA   PRI J05176
AAD
             INTINTORF  BCT L06418 LMOMO229D  BCT X17478
AAD A1
             ENTAAC3VI  BCT M88012
AAD9
             ENEAAD9A   BCT M69221
AADA
             LMOMO229A  BCT X17479 S52249     BCT S52249 SYNAADA    SYN M60473
             TRNTAAB    BCT M55547 TRNTN21CAS BCT M86913
------------------------------------------------------------------------------
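
The index files sampled in Tables 12 to 14 share one layout: a key line starting
in the first column, followed by indented lines holding whitespace-separated
triples. The following is a minimal sketch of reading that layout, not a tool
shipped with this release. It assumes that each triple is an entry name, a
three-letter division code, and an accession number, as the samples suggest.
The names parse_index and SAMPLE are hypothetical.

```python
def parse_index(lines):
    """Map each index key to a list of (entry_name, division, accession)."""
    index = {}
    key = None
    for raw in lines:
        line = raw.rstrip()
        # Skip blank lines and the dashed rules that frame the tables.
        if not line or set(line) == {"-"}:
            continue
        if line[0].isspace():
            # Indented line: triples belonging to the most recent key.
            if key is None:
                continue
            tokens = line.split()
            # Assumption: tokens come in groups of three.
            for i in range(0, len(tokens) - 2, 3):
                index[key].append(tuple(tokens[i:i + 3]))
        else:
            # Unindented line: a new key (may contain spaces, e.g. "AAD A1").
            key = line
            index.setdefault(key, [])
    return index


# Lines copied from the gene name index sample in Table 14.
SAMPLE = """\
AAD
             INTINTORF  BCT L06418 LMOMO229D  BCT X17478
AAD A1
             ENTAAC3VI  BCT M88012
AADA
             LMOMO229A  BCT X17479 S52249     BCT S52249 SYNAADA    SYN M60473
             TRNTAAB    BCT M55547 TRNTN21CAS BCT M86913
"""

index = parse_index(SAMPLE.splitlines())
for gene, entries in index.items():
    print(gene, len(entries), [acc for _, _, acc in entries])
```

Splitting on whitespace rather than on fixed columns keeps the sketch tolerant
of the differing column widths visible between Table 13 and the other tables.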


The files in this release are arranged in the following order in non-labeled
format.

Release note
    ddbjrel.txt        814 records
Category for bacteria, 69623 entries, 166935959 bases
    ddbjbct.seq      7137427 records
Category for EST1 (expressed sequence tag), 295322 entries, 102604343 bases
    ddbjest1.seq    15216047 records
Category for EST2 (expressed sequence tag), 178813 entries, 65510109 bases
    ddbjest2.seq    10687263 records
Category for EST3 (expressed sequence tag), 216027 entries, 80257408 bases
    ddbjest3.seq    12370001 records
Category for EST4 (expressed sequence tag), 100000 entries, 37317616 bases
    ddbjest4.seq     6034142 records
Category for EST5 (expressed sequence tag), 100000 entries, 41136593 bases 
    ddbjest5.seq     5904250 records
Category for EST6 (expressed sequence tag), 100000 entries, 37308321 bases 
    ddbjest6.seq     5849054 records
Category for EST7 (expressed sequence tag), 100000 entries, 32463544 bases
    ddbjest7.seq     6086248 records
Category for EST8 (expressed sequence tag), 100000 entries, 38755986 bases 
    ddbjest8.seq     5803613 records
Category for EST9 (expressed sequence tag), 100000 entries, 40191045 bases 
    ddbjest9.seq      5722284 records
Category for EST10 (expressed sequence tag), 100000 entries, 38981287 bases 
    ddbjest10.seq     5722576 records
Category for EST11 (expressed sequence tag), 100000 entries, 38773506 bases 
    ddbjest11.seq     5750628 records
Category for EST12 (expressed sequence tag), 100000 entries, 39112305 bases 
    ddbjest12.seq     5732312 records
Category for EST13 (expressed sequence tag), 100000 entries, 39310393 bases 
    ddbjest13.seq     5677509 records
Category for EST14 (expressed sequence tag), 100000 entries, 42003589 bases 
    ddbjest14.seq     5757943 records
Category for EST15 (expressed sequence tag), 100000 entries, 43538104 bases 
    ddbjest15.seq     5672370 records
Category for EST16 (expressed sequence tag), 100000 entries, 40526727 bases 
    ddbjest16.seq     5317295 records
Category for EST17 (expressed sequence tag), 100000 entries, 40008979 bases 
    ddbjest17.seq     5602086 records
Category for EST18 (expressed sequence tag), 100000 entries, 41297425 bases 
    ddbjest18.seq     5908810 records
Category for EST19 (expressed sequence tag), 100000 entries, 44406442 bases 
    ddbjest19.seq     5846385 records
Category for EST20 (expressed sequence tag), 100000 entries, 41080291 bases 
    ddbjest20.seq     5676625 records
Category for EST21 (expressed sequence tag), 100000 entries, 44501547 bases 
    ddbjest21.seq     5817171 records
Category for EST22 (expressed sequence tag), 100000 entries, 42004735 bases 
    ddbjest22.seq     5911449 records
Category for EST23 (expressed sequence tag), 100000 entries, 42535554 bases 
    ddbjest23.seq     5616725 records
Category for EST24 (expressed sequence tag), 100000 entries, 44591910 bases 
    ddbjest24.seq     4660150 records
Category for EST25 (expressed sequence tag), 100000 entries, 26156226 bases 
    ddbjest25.seq     5929300 records
Category for EST26 (expressed sequence tag), 100000 entries, 27983197 bases 
    ddbjest26.seq     5650843 records
Category for EST27 (expressed sequence tag), 100000 entries, 25417568 bases 
    ddbjest27.seq     9050751 records
Category for EST28 (expressed sequence tag), 100000 entries, 30649736 bases 
    ddbjest28.seq     8035136 records
Category for EST29 (expressed sequence tag), 100000 entries, 44317058 bases 
    ddbjest29.seq     5873506 records
Category for EST30 (expressed sequence tag), 100000 entries, 43847766 bases 
    ddbjest30.seq     6167036 records
Category for EST31 (expressed sequence tag), 31424 entries, 13563321 bases 
    ddbjest31.seq     1803004 records
Category for GSS1 (Genome Survey Sequence), 100000 entries, 67399072 bases
    ddbjgss1.seq      4703199 records
Category for GSS2 (Genome Survey Sequence), 100000 entries, 46905310 bases
    ddbjgss2.seq      5022979 records
Category for GSS3 (Genome Survey Sequence), 100000 entries, 44507291 bases
    ddbjgss3.seq      5088975 records
Category for GSS4 (Genome Survey Sequence), 100000 entries, 47980369 bases
    ddbjgss4.seq      5230301 records
Category for GSS5 (Genome Survey Sequence), 100000 entries, 52895336 bases
    ddbjgss5.seq      5570780 records
Category for GSS6 (Genome Survey Sequence), 100000 entries, 49986578 bases
    ddbjgss6.seq      5418739 records
Category for GSS7 (Genome Survey Sequence), 100000 entries, 51339524 bases
    ddbjgss7.seq      5568545 records
Category for GSS8 (Genome Survey Sequence), 100000 entries, 49634545 bases
    ddbjgss8.seq      5574757 records
Category for GSS9 (Genome Survey Sequence), 100000 entries, 56523363 bases
    ddbjgss9.seq      5605831 records
Category for GSS10 (Genome Survey Sequence), 100000 entries, 51840573 bases
    ddbjgss10.seq      5872406 records
Category for GSS11 (Genome Survey Sequence), 100000 entries, 50136556 bases
    ddbjgss11.seq      5228368 records
Category for GSS12 (Genome Survey Sequence), 41485 entries, 18604449 bases
    ddbjgss12.seq      2066711 records
Category for HTG1 (high throughput genomic sequencing), 2000 entries, 253287795 
bases
    ddbjhtg1.seq      4453305 records
Category for HTG2 (high throughput genomic sequencing), 2000 entries, 213095221 
bases
    ddbjhtg2.seq      3834817 records
Category for HTG3 (high throughput genomic sequencing), 2000 entries, 247436350 
bases
    ddbjhtg3.seq      4395260 records
Category for HTG4 (high throughput genomic sequencing), 2000 entries, 98211215 
bases
    ddbjhtg4.seq      1770949 records
Category for HTG5 (high throughput genomic sequencing), 2000 entries, 161561311 
bases
    ddbjhtg5.seq      2992251 records
Category for HTG6 (high throughput genomic sequencing), 2000 entries, 125739015 
bases
    ddbjhtg6.seq      2347887 records
Category for HTG7 (high throughput genomic sequencing), 1530 entries, 238107902 
bases
    ddbjhtg7.seq      4062092 records
Category for human1, 30000 entries, 412683139 bases
    ddbjhum1.seq      9230995 records
Category for human2, 30000 entries, 138105371 bases
    ddbjhum2.seq      3927241 records
Category for human3, 30000 entries, 56859686 bases
    ddbjhum3.seq      2319796 records
Category for human4, 16426 entries, 18456742 bases
    ddbjhum4.seq      1033043 records
Category for invertebrates, 55472 entries, 186832731 bases
    ddbjinv.seq      5995879 records
Category for mammals, 22126 entries, 19847333 bases
    ddbjmam.seq       1250055 records
Category for patents, 188825 entries, 59192365 bases
    ddbjpat.seq      5073206 records
Category for phages, 1488 entries, 3856453 bases
    ddbjphg.seq       174437 records
Category for plants, 96864 entries, 234172587 bases
    ddbjpln.seq     8714034 records
Category for primates, 6612 entries, 5572945 bases
    ddbjpri.seq       362954 records
Category for rodents, 51609 entries, 76862002 bases
    ddbjrod.seq      3588742 records
Category for STS (sequence tagged site), 90314 entries, 32936947 bases
    ddbjsts.seq     5352581 records
Category for synthetic DNAs, 3551 entries, 8585775 bases
    ddbjsyn.seq       311734 records
Category for unannotated sequences, 497 entries, 334989 bases
    ddbjuna.seq      23655 records
Category for viruses, 82627 entries, 73134433 bases
    ddbjvrl.seq     5013738 records
Category for vertebrates, 35490 entries, 32982310 bases
    ddbjvrt.seq      2023915 records
Accession number index file
    ddbjacc.idx     5413039 records
Keyword phrase index file
    ddbjkey.idx      2026782 records
Journal citation index file
    ddbjjou.idx      3066840 records
Gene name index file
    ddbjgen.idx      384098 records
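
The file list above can be read mechanically, for example to total the entry
and base counts per division. The following is a rough sketch under stated
assumptions, not part of the release itself. It assumes every sequence file is
described by a "Category for ..." line followed by a file name and a record
count, and it flattens whitespace first so that wrapped "bases" lines are
handled. The names parse_listing, ENTRY, and SAMPLE are hypothetical.

```python
import re

# One category description plus the data file line that follows it.
ENTRY = re.compile(
    r"Category for (?P<name>.+?), (?P<entries>\d+) entries, "
    r"(?P<bases>\d+) bases (?P<file>\S+) (?P<records>\d+) records"
)


def parse_listing(text):
    """Return one dict per sequence file described in the listing."""
    # Collapse line breaks and runs of spaces so wrapped lines match.
    flat = " ".join(text.split())
    return [
        {
            "name": m["name"],
            "file": m["file"],
            "entries": int(m["entries"]),
            "bases": int(m["bases"]),
            "records": int(m["records"]),
        }
        for m in ENTRY.finditer(flat)
    ]


# Three items copied from the listing, including a wrapped HTG line.
SAMPLE = """\
Category for bacteria, 69623 entries, 166935959 bases
    ddbjbct.seq      7137427 records
Category for HTG1 (high throughput genomic sequencing), 2000 entries, 253287795
bases
    ddbjhtg1.seq      4453305 records
Category for GSS3 (Genome Survey Sequence), 100000  entries, 44507291 bases
    ddbjgss3.seq      5088975 records
"""

files = parse_listing(SAMPLE)
total_entries = sum(f["entries"] for f in files)
total_bases = sum(f["bases"] for f in files)
print(len(files), total_entries, total_bases)
```

Run over the complete list, the two totals could be compared with the entry and
base counts stated in the release header. A malformed item would be skipped or
merged with its neighbor by this pattern, so a mismatch is worth inspecting by
hand.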