DNA Data Bank of Japan

                              DNA Database

   Release 38, Jul. 1999, including 4,294,369 entries, 3,098,519,597 bases


This database may be copied and redistributed without permission on the 
condition that all the statements in this release note are reproduced in each 
copy.

The present release contains the newest data prepared by the DNA Data Bank of 
Japan (DDBJ), GenBank, and European Molecular Biology Laboratory/European 
Bioinformatics Institute (EMBL/EBI) as of Jun. 29, 1999.  This unified database 
was made possible thanks to the international collaboration among the three
data banks.  All the entries have accordingly been annotated with the feature 
keys common to them. 

All the entries designated by the accession numbers with the prefixes "C", "D", 
"E", "AB", "AG", "AP", "AT", "AU" and "AV" have been collected and processed by 
DDBJ, and the rest have been prepared by GenBank and EMBL/EBI.
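
As a rough illustration, the prefix rule above can be checked mechanically.
The following Python sketch is not part of the release; the function names are
hypothetical, and it assumes that an accession number is a run of letters
followed by digits.

```python
import re

# Prefixes that this release note attributes to DDBJ.
DDBJ_PREFIXES = {"C", "D", "E", "AB", "AG", "AP", "AT", "AU", "AV"}


def accession_prefix(accession):
    """Return the leading letters of an accession number, e.g. 'AB' for 'AB123456'."""
    match = re.match(r"[A-Za-z]+", accession)
    return match.group(0).upper() if match else ""


def is_ddbj_accession(accession):
    """True if the prefix is one of those listed for DDBJ in this release note."""
    return accession_prefix(accession) in DDBJ_PREFIXES
```

For example, is_ddbj_accession("D87069") is true, whereas an accession number
such as X04516 falls under "the rest" prepared by GenBank and EMBL/EBI.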

A number of genome projects are under way worldwide.  Among them, the human 
genome projects have probably been the most productive, yielding a large 
number of ordinary sequences and huge amounts of ESTs.  We therefore have the 
human (HUM) division solely for human sequences and the primate (PRI) division 
for non-human primate sequences.  Note that the EST division also contains 
human sequences.

The present release does not have the ORG division.  Thus, if you are interested
in human mitochondrial sequences, for example, you are now advised to refer to 
the HUM division.

This release also includes an independent division (PAT) for patent data.  The 
patent data are those which the Japanese Patent Office (JPO), United States 
Patent and Trademark Office (USPTO), and the European Patent Office (EPO) 
collected and processed.  The accession numbers of the patent data collected 
by the Japanese Patent Office start with the prefix "E", those collected and 
supplied by USPTO and GenBank respectively start with "I" and "AR", and those 
collected and supplied by EPO and EMBL/EBI respectively start with "A".  The 
entries with the prefixes "I", "AR", "A" and "E" were allocated to a file 
(ddbjpat.seq) in the DDBJ format.  Note also that unauthorized use of the 
patent data may cause legal issues for which we have no responsibility.

In this release, the SOURCE in the flat file was revisited and revised if 
necessary in accordance with the unified taxonomy database common to the three 
data banks.

The number of ESTs has been increasing at an enormous rate and is expected 
to grow even more rapidly in the future.  To cope with this situation 
and handle the data files in the least possible time, we have split the 
EST data into 24 files in the present release: ddbjest1 for entries whose 
accession numbers have the prefixes A to M, ddbjest2 for those with N to S, 
ddbjest3 for those with T to Z, and ddbjest4 to ddbjest24 for those with 
two-letter prefixes.  The files from the 4th to the 23rd contain 100,000 
entries each, and the 24th contains the rest.
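
The file assignment described above can be sketched in Python as follows.  The
function name is hypothetical.  Since this note does not say how the two-letter
prefixes are distributed over ddbjest4 to ddbjest24, the sketch only reports
that range for them.

```python
def est_file_for(accession):
    """Map an EST accession number to the file group described in this note."""
    # Collect the leading letters of the accession number.
    prefix = ""
    for ch in accession:
        if not ch.isalpha():
            break
        prefix += ch.upper()

    if len(prefix) == 1:
        if "A" <= prefix <= "M":
            return "ddbjest1"
        if "N" <= prefix <= "S":
            return "ddbjest2"
        return "ddbjest3"  # T to Z
    if len(prefix) == 2:
        # The exact file cannot be derived from the prefix alone.
        return "ddbjest4 to ddbjest24"
    raise ValueError("unexpected accession number: %r" % accession)
```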

The present release includes the GSS division.  GSS stands for the Genome 
Survey Sequence, which is similar to EST, except that GSS is genomic DNA 
whereas EST is cDNA.  This division is divided into nine files; each of the 
first eight files contains 100,000 entries and the last one contains the rest.  
This release also includes the High Throughput Genome Sequence (HTGS) which 
comes mainly from genome project teams which deal with a clone as a sequencing 
unit.

Apart from ddbjacc.idx, ddbjgen.idx, ddbjjou.idx, and ddbjkey.idx, the index 
files are not provided in this release.  Instead, we have included a program 
with which you can generate the index files that are not provided.  For the 
use of the program, see the files seq2indexes.doc, seq2indexes.c, and 
seq2indexes.h in this release.

The present release contains amino acid sequences that were translated from 
the corresponding nucleotide sequences in our database.  In the translation 
we paid close attention to the fact that some species and organelles use a 
genetic code different from the universal one, and used the proper codon 
table.  However, if you find an incorrectly translated codon in a translated 
sequence, please let us know.
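
To show why the choice of codon table matters, here is a minimal Python sketch
of table-driven translation.  It is not the program used to prepare this
release.  The two 64-letter strings are the standard code and the vertebrate
mitochondrial code written in t, c, a, g codon order, keyed by the numbers
commonly used with /transl_table (1 and 2); the function name and the use of
"X" for unresolvable codons are choices made for this example.

```python
BASES = "tcag"

# Amino acids for the 64 codons in t, c, a, g order (ttt, ttc, tta, ttg, tct, ...).
CODE_TABLES = {
    1: "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
    2: "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG",
}


def translate(dna, table=1):
    """Translate a nucleotide string codon by codon with the chosen table."""
    amino_acids = CODE_TABLES[table]
    dna = dna.lower().replace("u", "t")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        try:
            index = (16 * BASES.index(codon[0])
                     + 4 * BASES.index(codon[1])
                     + BASES.index(codon[2]))
        except ValueError:
            protein.append("X")  # codon contains a base other than t, c, a, g
            continue
        protein.append(amino_acids[index])
    return "".join(protein)
```

The codon tga, for instance, is a stop codon ("*") under table 1 but is read as
tryptophan ("W") under table 2, which is the kind of difference referred to in
the paragraph above.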

The three data banks include the item VERSION in the flat file, which indicates 
the version of a submitted nucleotide sequence (see Table 1).  It is expressed 
like AB123456.1, in which the digit(s) after the period are the version number.  
VERSION was added because a submitted sequence is sometimes revised by the 
submitter, so that the accession number alone cannot specify the sequence in 
question, which causes trouble for the user.  The number is increased by 
one every time a revised sequence is made public.  Likewise, the 
translated protein sequence is accompanied by a /protein_id, which is 
expressed like BAA12345.1, in which the digit(s) after the period are again a 
version number.  This number is increased by one when the corresponding 
nucleotide sequence is revised, the protein sequence changes as a result, 
and the revised protein sequence is made public.  The present NID and PID will 
no longer be used as of the next release (39th), to be made in October 1999.
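
A VERSION or /protein_id value can be separated into its two parts as in the
following Python sketch.  The function name is hypothetical, and the sketch
assumes exactly the "identifier.number" form shown above.

```python
def split_version(identifier):
    """Split 'AB123456.1' into the accession number and the integer version."""
    accession, period, version = identifier.partition(".")
    if not period or not accession or not version.isdigit():
        raise ValueError("not of the form ACCESSION.VERSION: %r" % identifier)
    return accession, int(version)
```

Comparing the integer part of two values with the same accession number then
tells which of two revisions is the newer one.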

From the 40th release, to be made in January 2000, we will terminate the RNA 
division.  The extant RNA data will be redistributed according to the category 
of the organism.  Therefore, you will find human RNA sequences, for example, 
in the human category.

This release was published by the following DDBJ staff.

General administration
    T. Gojobori, T. Imanishi, Y. Fukuma, A. Watanabe, Y. Ueda, Y. Katsube, 
    K. Okuda, J. Sugiyama, J. Bellgard, H. Tsutsui(hold), Y. Noguchi, R. Chapman
Database construction
    Y. Tateno, M. Ota, S. Miyazaki(hold), N. Yasuda, Y. Sato, H. Tsutsui, 
    M. Hirashima, A. Hasegawa, A. Suzuki, Y. Yamamoto, M. Ejima, M. Okaneya, 
    N. Endo, M. Gojobori,  M. Imma, J. Muroya, A. Shimada, S. Nomoto, 
    A. Hashizume, M. Horie
Database software development and management
    H. Sugawara, S. Miyazaki, T. Okayama, S. Misu, T. Mizunuma, Y. Kawanishi,
    K. Goto, K. Mamiya, M. Kikuchi(hold), T. Futatsuki, H. Hashimoto, 
    H. Harimoto, Y. Minesaki, T. Takaki, S. Sato, H. Ichinose, K. Kaneda
System management
    K. Nishikawa, K. Ikeo, K. Yoshioka, T. Osawa, I. Mochizuki, M. Kikuchi, 
    T. Narita, M. Nagura
Editorial and public relations
    N. Saitou, K. Fukami-Kobayashi, Y. Daito, Y. Hattori, T. Kawamoto, 
    S. Nagira, K. Ichikawa


DNA Data Bank of Japan
Center for Information Biology
National Institute of Genetics
Mishima 411-8540, Japan 
Phone:  +81 559 81 6853
FAX:    +81 559 81 6849
E-mail: ddbj@ddbj.nig.ac.jp  (for general inquiry)
        ddbjsub@ddbj.nig.ac.jp  (for data submission)
        ddbjupdt@ddbj.nig.ac.jp (for updates and notification of publication)
WWW:    http://www.ddbj.nig.ac.jp (for DDBJ WWW server)
        http://sakura.ddbj.nig.ac.jp (for DDBJ sequence data submission system 
                                   SAKURA)

Acknowledgement: We are grateful to NCBI and EMBL/EBI for permitting us
to include the data they have collected and processed in the present release.
We also thank the Japanese Patent Office for kindly allowing us to distribute 
the patent data they collected and processed.


DDBJ Database Release History

Release  Date     Entries     Bases          Comments
------------------------------------------------------------------------------
 38     07/99   4,294,369   3,098,519,597
 37     03/99   3,311,627   2,375,261,951   VERSION, /protein_id started
 36     01/99   3,073,166   2,190,425,560
 35     10/98   2,759,261   1,957,341,169
 34     07/98   2,412,785   1,708,580,623
 33     04/98   2,174,769   1,479,303,279
 32     01/98   1,956,669   1,300,950,613
 31     10/97   1,731,532   1,139,869,464   Adoption of the unified taxonomy
                                            database
 30     07/97   1,534,115     992,788,339   NID and PID started
 29     04/97   1,270,194     841,415,232   
 28     01/97   1,154,120     756,785,219   HTG division started
                                            ORG division eliminated
 27     10/96     936,697     608,103,057   GSS division started
 26     07/96     835,552     551,932,448   
 25     04/96     744,490     499,300,364   /translation started
 24     01/96     637,508     431,771,652   
 23     10/95     569,757     390,694,350   
 22     07/95     437,588     322,982,425   HUM division started
 21     04/95     274,596     250,875,023   
 20     01/95     239,689     231,299,557   
 19     10/94     204,332     205,274,131   
 18     07/94     185,230     192,473,021   
 17     04/94     169,957     179,942,209   
 16     01/94     154,626     165,017,628   
 15     10/93     131,649     147,224,690   
 14     07/93     120,350     138,686,333   
 13     04/93     112,067     129,784,445   
 12     01/93      97,683     120,815,244   EST division started
 11     07/92      65,693      84,839,075   
 10     01/92      59,317      77,805,556   GenBank/EMBL inclusion started
  9     07/91       1,130       2,002,124   
  8     01/91         879       1,573,442   
  7     07/90         681       1,154,211   
  6     01/90         496         841,236   
  5     07/89         395         679,378   
  4     01/89         302         535,985   
  3     07/88         230         345,850   
  2     01/88         142         199,392   
  1     07/87          66         108,970   Started with DDBJ only
------------------------------------------------------------------------


This release covers 18 categories of organisms and others as follows:
------------------------------------------------------------------------------
ddbjbct.*** Category for bacteria
ddbjest.*** Category for EST (expressed sequence tag)
ddbjhtg.*** Category for HTG (high throughput genomic sequencing)
ddbjhum.*** Category for human
ddbjgss.*** Category for GSS (Genome Survey Sequence)
ddbjinv.*** Category for invertebrates
ddbjmam.*** Category for mammals other than primates and rodents
ddbjpat.*** Category for patents
ddbjphg.*** Category for phages
ddbjpln.*** Category for plants
ddbjpri.*** Category for primates other than human
ddbjrna.*** Category for RNAs
ddbjrod.*** Category for rodents
ddbjsts.*** Category for STS (sequence tagged site)
ddbjsyn.*** Category for synthetic DNAs
ddbjuna.*** Category for unannotated sequences
ddbjvrl.*** Category for viruses
ddbjvrt.*** Category for vertebrates other than mammals
------------------------------------------------------------------------------


Each category then has the following nine files. Note that all the files 
except for ddbj***.seq must be created by the user with seq2indexes, as 
mentioned in the release note.
------------------------------------------------------------------------------
ddbj***.seq  List of the entries in DDBJ format, see Table 1.
ddbj***.acc  List of the accession numbers, see Table 2.
ddbj***.aut  List of the authors, see Table 3.
ddbj***.dir  List of the short directory in DDBJ style, see Table 4.
ddbj***.idx  List of indices, see Table 5.
ddbj***.jou  List of the journals, see Table 6.
ddbj***.key  List of the key words, see Table 7.
ddbj***.org  List of the species names, see Table 8.
ddbj***.sdr  List of the short directory in DDBJ style, see Table 9.
------------------------------------------------------------------------------


Table 1. Part of the contents in the file 'ddbjbct.seq'.
This shows all pieces of information on one entry in DDBJ format.
------------------------------------------------------------------------------
LOCUS       D87069        993 bp    mRNA            BCT       07-FEB-1999
DEFINITION  Escherichia coli mRNA for RNA polymerase sigma subunit, truncated
            form of sigma-38, complete cds.
ACCESSION   D87069
NID         d1070184
VERSION     D87069.1
KEYWORDS    RNA polymerase sigma subunit, truncated form of sigma-38.
SOURCE      Escherichia coli (strain:W3110) cDNA to mRNA.
  ORGANISM  Escherichia coli
            Bacteria; Proteobacteria; gamma subdivision; Enterobacteriaceae;
            Escherichia.
REFERENCE   1  (bases 1 to 993)
  AUTHORS   Jishage,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (14-AUG-1996) to the DDBJ/EMBL/GenBank databases. Miki
            Jishage, National Institute of Genetics, Molecular Genetics; Yata
            1111, Mishima, Shizuoka 411, Japan (E-mail:mjishage@lab.nig.ac.jp,
            Tel:0559-81-6742, Fax:0559-81-6746)
  STANDARD  full staff_review
REFERENCE   2  (bases 1 to 993)
  AUTHORS   Jishage,M. and Ishihama,A.
  TITLE     Variation in RNA polymerase sigma subunit composition within
            different stocks of Escherichia coli starin W3110
  JOURNAL   Unpublished (1996)
  STANDARD  full staff_review
REFERENCE   3  (sites)
  AUTHORS   Ivanova,A., Renshaw,M., Guntaka,R. and Eisenstark,A.
  TITLE     DNA base sequence variability in katF (putative sigma factor) gene
            Escherichia coli
  JOURNAL   Nucleic Acids Res. 20, 5479-5480 (1992)
  STANDARD  full staff_review
REFERENCE   4  (sites)
  AUTHORS   Takayanagi,Y., Tanaka,K. and Takahashi,H.
  TITLE     Structure of the 5' upstream region and the regulation of the rpoS
            gene of Escherichia coli
  JOURNAL   Mol Gen Genet 243, 525-531 (1994)
  STANDARD  full staff_review
COMMENT     
FEATURES             Location/Qualifiers
     source          1..993
                     /organism="Escherichia coli"
                     /sequenced_mol="cDNA to mRNA"
                     /strain="W3110"
     CDS             1..810
                     /db_xref="PID:d1013928"
                     /note="the gene has four single base changes, resulting
                     in two amino acid substitutions and an amber mutation"
                     /product="RNA polymerase sigma subunit, truncated form of
                     sigma-38"
                     /protein_id="BAA13238.1"
                     /translation="MSQNTLKVHDLNEDAEFDENGVEVFDEKALVEYEPSDNDLAEEE
                     LLSQGATQRVLDATQLYLGEIGYSPLLTAEEEVYFARRALRGDVASRRRMIESNLRLV
                     VKIARRYGNRGLALLDLIEEGNLGLIRAVEKFDPERGFRFSTYATWWIRQTIERAIMN
                     QTRTIRLPIHIVKELNVYLRTARELSHKLDHEPSAEEIAEQLDKPVDDVSRMLRLNER
                     ITSVDTPLGGDSEKALLDILADEKENGPEDTTQDDDMKQSIVKWLFELNAK"
                     /transl_table=11
     mutation        75
                     /citation=[3]
                     /replace="t"
     mutation        97
                     /citation=[3]
                     /replace="t"
     mutation        99
                     /citation=[3]
                     /replace="t"
     mutation        808
                     /citation=[3]
                     /replace="t"
BASE COUNT      254 a    223 c    291 g    225 t      0 others
ORIGIN      
        1 atgagtcaga atacgctgaa agttcatgat ttaaatgaag atgcggaatt tgatgagaac
       61 ggagttgagg tttttgacga aaaggcctta gtagaatatg aacccagtga taacgatttg
      121 gccgaagagg aactgttatc gcagggagcc acacagcgtg tgttggacgc gactcagctt
      181 taccttggtg agattggtta ttcaccactg ttaacggccg aagaagaagt ttattttgcg
      241 cgtcgcgcac tgcgtggaga tgtcgcctct cgccgccgga tgatcgagag taacttgcgt
      301 ctggtggtaa aaattgcccg ccgttatggc aatcgtggtc tggcgttgct ggaccttatc
      361 gaagagggca acctggggct gatccgcgcg gtagagaagt ttgacccgga acgtggtttc
      421 cgcttctcaa catacgcaac ctggtggatt cgccagacga ttgaacgggc gattatgaac
      481 caaacccgta ctattcgttt gccgattcac atcgtaaagg agctgaacgt ttacctgcga
      541 accgcacgtg agttgtccca taagctggac catgaaccaa gtgcggaaga gatcgcagag
      601 caactggata agccagttga tgacgtcagc cgtatgcttc gtcttaacga gcgcattacc
      661 tcggtagaca ccccgctggg tggtgattcc gaaaaagcgt tgctggacat cctggccgat
      721 gaaaaagaga acggtccgga agataccacg caagatgacg atatgaagca gagcatcgtc
      781 aaatggctgt tcgagctgaa cgccaaatag cgtgaagtgc tggcacgtcg attcggtttg
      841 ctggggtacg aagcggcaac actggaagat gtaggtcgtg aaattggcct cacccgtgaa
      901 cgtgttcgcc agattcaggt tgaaggcctg cgccgtttgc gcgaaatcct gcaaacgcag
      961 gggctgaata tcgaagcgct gttccgcgag taa
//
------------------------------------------------------------------------------
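
As a small illustration of how an entry like the one in Table 1 can be read by
a program, the Python sketch below parses the LOCUS line and collects the bases
between ORIGIN and the terminating "//".  The function names are hypothetical,
and the LOCUS parser assumes the seven whitespace-separated fields seen in this
example; other entries may be laid out differently.

```python
def parse_locus_line(line):
    """Parse a LOCUS line with the seven fields seen in Table 1."""
    fields = line.split()
    if len(fields) != 7 or fields[0] != "LOCUS" or fields[3] != "bp":
        raise ValueError("unexpected LOCUS line: %r" % line)
    return {
        "locus": fields[1],
        "length": int(fields[2]),
        "molecule": fields[4],
        "division": fields[5],
        "date": fields[6],
    }


def read_sequence(lines):
    """Concatenate the bases listed between ORIGIN and '//'."""
    bases = []
    in_origin = False
    for line in lines:
        if line.startswith("ORIGIN"):
            in_origin = True
        elif line.startswith("//"):
            break
        elif in_origin:
            # The first field on each line is the position counter; skip it.
            bases.extend(line.split()[1:])
    return "".join(bases)
```

Comparing len(read_sequence(lines)) with the length on the LOCUS line is a
simple consistency check for an entry.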


Table 2. Part of the contents in the file 'ddbjbct.acc'.
The first column refers to the secondary accession number, the second column 
to the locus name, and the third to the primary accession number. The primary 
number may be the same as the secondary number. The lines are arranged in 
ascending order of the secondary accession numbers.
------------------------------------------------------------------------------
D00001 -> ECOPBPAA   X04516
D00002 -> ECOPYRH    X04469
D00006 -> PNS981TET  D00006
D00020 -> COLE2LYS   D00020
D00021 -> COLE31YS   D00021
D00038 -> BRLAM330   D00038
D00066 -> BAC139AC   D00066
D00067 -> ECONANA    M20207
D00069 -> ECOUVRD2   D00069
D00087 -> BACXYNAA   D00087
------------------------------------------------------------------------------
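
A line of this file can be split into its three columns as in the Python sketch
below.  The function name is hypothetical, and the sketch assumes the layout of
the excerpt in Table 2: one "->" followed by the locus name and the primary
accession number.

```python
def parse_acc_line(line):
    """Return (secondary accession, locus name, primary accession) from one line."""
    secondary, arrow, rest = line.partition("->")
    fields = rest.split()
    if not arrow or len(fields) != 2:
        raise ValueError("unexpected line: %r" % line)
    locus, primary = fields
    return secondary.strip(), locus, primary
```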


Table 3. Part of the contents in the file 'ddbjbct.aut'.
For each author name given to the left of the arrow, the corresponding locus 
name and primary accession number are listed, in that order, on the right. 
They are arranged in alphabetical order of the author names.
------------------------------------------------------------------------------
Aan,F. -> STYCRR     X05210
Aan,F. -> STYENZI    M76176
Aaronson,W. -> ECOKPSD    M64977
Aaronson,W. -> ECONEUA    J05023
Abad-Lapuebla,M.A. -> VIBTDHI    D90238
Abdel-Mawgood,A.L. -> CYAPSBHA   X16394
Abdel-Meguid,S.S. -> TRNGDRECM  J01843
Abdelal,A. -> STYCARA    M36540
Abdelal,A. -> STYCARAB   X13200
Abdelal,A.H. -> PSENOSA    M60717
------------------------------------------------------------------------------


Table 4. Part of the short directory in DDBJ style in the file 'ddbjbct.dir'.
For each locus name given in the first column, the corresponding primary 
accession number, molecular type, number of nucleotide pairs, and description 
of the locus are listed in that order. They are arranged in alphabetical 
order of the locus names.
------------------------------------------------------------------------------
ABCAARAA   M34830 ds-DNA    1624 A.aceti acetic acid resistance protein (aarA)
gene, complete cds.
ABCADHCC   D00635 ds-DNA    4230 A. polyoxogenes alcohol dehydrogenase (EC 
1.1.99.8) and cytochrome c genes.
ABCALDH    D00521 ds-DNA    2683 A.polyoxogenes membrane-bound aldehyde 
dehydrogenase gene, complete cds and flanks.
ABCBCSAA   M37202 ds-DNA    9540 A.xylinum bcs B, bcs C and bcs D genes, 
complete cds and bcs A gene, partial cds.
ABCCELA    M76548 ds-DNA    1165 Acetobacter xylinum UDP pyrophosphorylase 
(celA) gene, complete cds.
ABCCELSYN  X54676 ds-DNA    5363 A. xylinum gene for cellulose biosynthesis
ABCIS1380  D10043 ds-DNA    1665 A.pasteurianus insertion sequence IS1380.
ACAADH1    D90004 ds-DNA    2467 Acetobacter aceti(K6033) alcohol dehydrogenase
subunit gene(adh1).
ACCAAC2    M62833 ds-DNA    1123 Acinetobacter baumannii aminoglycoside 
acetyltransferase (aac2) gene, complete cds.
ACCACEAA   M62822 ds-DNA    1874 A.baumannii chloramphenicol acetyltransferase
(cat) gene, complete cds.
------------------------------------------------------------------------------


Table 5. Part of the contents in the file 'ddbjbct.idx'.
The first column refers to the locus name, the second column to the starting 
position of the locus in bytes, and the third to its ending position in bytes. 
They are arranged in alphabetical order of the locus names.
------------------------------------------------------------------------------
%*****************************
#ABCAARAA       0       3211
#ABCADHCC       3212    10608
#ABCALDH        10609   15864
#ABCBCSAA       15865   29583
#ABCCELA        29584   32289
#ABCCELSYN      32290   40960
#ABCIS1380      40961   44711
#ACAADH1        44712   49357
#ACCAAC2        49358   52395
------------------------------------------------------------------------------
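
The byte positions make it possible to read a single entry from the
corresponding .seq file without scanning the whole file.  The Python sketch
below shows the idea.  The function names are hypothetical, and the sketch
assumes that the ending position is inclusive, which is consistent with the
excerpt above (one locus ends at 3211 and the next starts at 3212).  The
demonstration data at the end are made up.

```python
import io


def parse_idx(lines):
    """Build {locus: (start, end)} from the lines of a ddbj***.idx file."""
    index = {}
    for line in lines:
        if not line.startswith("#"):
            continue  # skip the '%***' header line and any blank lines
        locus, start, end = line[1:].split()
        index[locus] = (int(start), int(end))
    return index


def read_entry(seq_file, index, locus):
    """Read one entry from a .seq file opened in binary mode."""
    start, end = index[locus]
    seq_file.seek(start)
    return seq_file.read(end - start + 1)  # end position assumed inclusive


# Tiny in-memory demonstration with made-up content, not real DDBJ data.
demo_seq = io.BytesIO(b"AAAA\nBBBBBB\n")
demo_index = parse_idx(["%*****", "#X       0       4", "#Y       5       11"])
```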


Table 6. Part of the contents in the file 'ddbjbct.jou'.
This gives information on the journal in which sequence data were published.
------------------------------------------------------------------------------
(in) Chaloupka,J. and Krumphanzl,V. (Eds.); Extracellular Enzymes of 
Microorganisms:  129-137, Plenum Press, New York (1987) -> BACAMYABS  M57457
(in) Ganesan,A.T., Chang,S. and Hoch,J.A. (Eds.); Molecular Cloning and Gene 
Regulation in Bacilli:  3-10, Academic Press, New York (1982) -> BACRG16S   
M55011
(in) Ganesan,A.T., Chang,S. and Hoch,J.A. (Eds.); Molecular Cloning and Gene 
Regulation in Bacilli:  3-10, Academic Press, New York (1982) -> BACRG16SA  
M55006
(in) Ganesan,A.T., Chang,S. and Hoch,J.A. (Eds.); Molecular Cloning and Gene 
Regulation in Bacilli:  3-10, Academic Press, New York (1982) -> BACRG16SB  
M55008
(in) Hoch,J.A. and Setlow,P. (Eds.); Molecular Biology of Microbial 
Differentiation:  85-94, American Society for Microbiology, Washington, DC 
(1985) -> BACSPOII   M57606
(in) Holmgren,A. (Ed.); Thioredoxin and Glutaredoxin Systems: Structure and 
Function: 11-19, Unknown name, Unknown city (1986) -> ECOTRXA1   M54881
(in) Kjeldgaard,N.C. and Maaloe,O. (Eds.); Control of ribosome synthesis:  
138-143, Academic Press, New York (1976) -> ECOLAC     J01636
(in) Losick,R. and Chamberlin,M. (Eds.); RNA polymerase:  455-472, Cold 
Spring Harbor Laboratory, Cold Spring Harbor, NY (1976) -> ECOTGY1    K01197
(in) Sikes,C.S. and Wheeler,A.P. (Eds.); Surface reactive peptides and 
polymers. Discovery and commercialization.:  186-200, American Chemical 
Society, Washington, D.C. (1991) -> ECOTGP     J01714
(in) Sund,H. and Blauer,G. (Eds.); Protein-Ligand Interactions:  193-207, 
Walter de Gruyter, New York (1975) -> ECOLAC     J01636
(in) Wu,R. and Grossman,L. (Eds.); Methods in Enzymology, Recombinant DNA, 
part E:  In press, Academic Press, New York, N.Y. (1986) -> PLMCG      M11320
Acta Microbiol. Pol. 35, 175-190 (1986) -> ECOTGG1    M54893
Actinomycetologica 5, 14-17 (1991) -> STMARGG    D00799
Adv. Biophys. 21, 115-133 (1986) -> R10REP     M26840
Adv. Biophys. 21, 175-192 (1986) -> ECONUSAA   M26839
Adv. Enzyme Regul. 21, 225-237 (1983) -> ECOPURFA   M26893
Adv. Exp. Med. Biol. 195, 239-246 (1986) -> ECOAPT     M14040
Agric. Biol. Chem. 50, 2155-2158 (1986) -> ECONANA    M20207
Agric. Biol. Chem. 50, 2771-2778 (1986) -> BRLAM330   D00038
Agric. Biol. Chem. 51, 2019-2022 (1987) -> BACCGT     D00129
Agric. Biol. Chem. 51, 2641-2648 (1987) -> STRSAGP    D00219
Agric. Biol. Chem. 51, 2807-2809 (1987) -> BACPGECR   M35503
Agric. Biol. Chem. 51, 3133-3135 (1987) -> BACXYLAP   D00312
Agric. Biol. Chem. 51, 455-463 (1987) -> BACHDCRY   D00117
Agric. Biol. Chem. 51, 953-955 (1987) -> BACXYNAA   D00087
Agric. Biol. Chem. 52, 1565-1573 (1988) -> BACIP135   D00348
Agric. Biol. Chem. 52, 1785-1789 (1988) -> BACTMR     D00343
Agric. Biol. Chem. 52, 2243-2246 (1988) -> PSEGI      D00342
Agric. Biol. Chem. 52, 399-406 (1988) -> BACAMYEB   M35517
Agric. Biol. Chem. 52, 479-487 (1988) -> ECAPALI    D00217
------------------------------------------------------------------------------


Table 7. Part of the contents in the file 'ddbjbct.key'.
For the locus and accession number given, in that order, to the right of the 
arrow, the corresponding key words are listed on the left. 
------------------------------------------------------------------------------
A.aceti acetic acid resistance protein (aarA) gene, complete cds.       -> 
ABCAARAA     M34830
acetic acid resistance protein.         -> ABCAARAA     M34830
Cloning of genes responsible for acetic acid resistance in acetobacter aceti   
-> ABCAARAA      M34830
A. polyoxogenes alcohol dehydrogenase (EC 1.1.99.8) and cytochrome c genes.    
-> ABCADHCC      D00635
alcohol dehydrogenase; cytochrome c.    -> ABCADHCC     D00635
Cloning and sequencing of the gene cluster encoding two subunits of membrane-
bound alcohol dehydrogenase from Acetobacter polyoxogenes  -> ABCADHCC     
D00635
These data kindly submitted in computer readable form by: Toshimi Tamaki 
Nakano Central Biochemical Institute 2-6 Nakamura-cho Handa-shi, Aichi-ken 
475 Japan Phone: 0569-21-3331 Fax: 0569-23-8486     -> ABCADHCC     D00635
A.polyoxogenes membrane-bound aldehyde dehydrogenase gene, complete cds and 
flanks.     -> ABCALDH      D00521
aldehyde dehydrogenase gene; ethanol oxidation; membrane-bound enzyme.  -> 
ABCALDH      D00521
Nucleotide sequence of the membrane-bound aldehyde dehydrogenase gene from 
Acetobacter polyoxogenes     -> ABCALDH      D00521
------------------------------------------------------------------------------


Table 8. Part of the contents in the file 'ddbjbct.org'.
For the locus and accession number given, in that order, to the right of the 
arrow, the corresponding taxonomic names are listed on the left.  They are 
arranged in alphabetical order of the species names.
------------------------------------------------------------------------------
A. nidulans 6301 DNA. Anacystis nidulans Prokaryota; Bacteria; Gracilicutes; 
Oxyphotobacteria; Cyanobacteria.   -> ANIRUBPS     X00019
A. nidulans DNA, clone pAN4. Anacystis nidulans Prokaryota; Bacteria; 
Gracilicutes; Oxyphotobacteria; Cyanobacteria.    -> ANIRGGX      X00343
A. nidulans DNA. Anacystis nidulans Prokaryota; Bacteria; Gracilicutes; 
Oxyphotobacteria; Cyanobacteria.        -> ANIRGG       X00512
A. polyoxogenes genomic DNA. Acetobacter polyoxogenes Prokaryota; Bacteria; 
Gracilicutes; Scotobacteria; Aerobic rods and cocci; Azotobacteraceae.      
-> ABCADHCC     D00635
A. quadruplicatum (strain PR-6) DNA, clone pAQPR1. Agmenellum quadruplicatum 
Prokaryota; Bacteria; Gracilicutes; Oxyphotobacteria; Cyanobacteria.       -> 
AQUPCAB      K02660
A. quadruplicatum (strain PR6) DNA. Agmenellum quadruplicatum Prokaryota; 
Bacteria; Gracilicutes; Oxyphotobacteria; Cyanobacteria.      -> AQUCPCAB     
K02659
A. vinelandii DNA. Azotobacter vinelandii Prokaryota; Bacteria; Gracilicutes; 
Scotobacteria; Aerobic rods and cocci; Azotobacteraceae.  -> AVINIFUSV    
M17349
A.aceti (strain 10-8) DNA, clone pAR1611. Acetobacter aceti Prokaryota; 
Bacteria; Gracilicutes; Scotobacteria; Aerobic rods and cocci; 
Azotobacteraceae.       -> ABCAARAA      M34830
A.actinomycetemcomitans (strain JP2) DNA, clone lambda-OP8. Actinobacillus 
actinomycetemcomitans Prokaryota; Bacteria; Gracilicutes; Scotobacteria; 
Facultatively anaerobic rods; Pasteurellaceae.      -> ACNLKTXN     M27399
A.anitratum DNA, clone pLJD1. Acinetobacter anitratum Prokaryota; Bacteria; 
Gracilicutes; Scotobacteria; Neisseriaceae.         -> ACCCITSYN    M33037
------------------------------------------------------------------------------


Table 9. Part of the short directory in DDBJ style in the file 
'ddbjbct.sdr'.
The short directory file contains brief descriptions, in DDBJ style, of all 
the sequence entries. 
------------------------------------------------------------------------------
ABCAARAA    A.aceti acetic acid resistance protein (aarA) gene, complete 1624bp
ABCADHCC    A. polyoxogenes alcohol dehydrogenase (EC 1.1.99.8) and      4230bp
ABCALDH     A.polyoxogenes membrane-bound aldehyde dehydrogenase gene,   2683bp
ABCBCSABCD  A.xylinum bcs A, B, C and D genes, complete cds's.           9540bp
ABCCELA     Acetobacter xylinum UDP pyrophosphorylase (celA) gene,       1165bp
ABCCELSYN   A. xylinum gene for cellulose biosynthesis                   5363bp
ABCIS1380   A.pasteurianus insertion sequence IS1380.                    1665bp
ACAADH1     Acetobacter aceti(K6033) alcohol dehydrogenase subunit       2467bp
ACCAAC2     Acinetobacter baumannii aminoglycoside acetyltransferase     1123bp
ACCACEAA    A.baumannii chloramphenicol acetyltransferase (cat) gene,    1874bp
ACCAPHA6    Acinetobacter baumannii aphA-6 gene.                         1170bp
ACCBENABCA  A.calcoaceticus BenA, BenB, BenC, BenD, and BenE proteins   15922bp
ACCCAT      Acinetobacter calcoaceticus cat operon.                     15922bp
ACCCATAM    A.calcoaceticus catA and catM genes, encoding catechol 1,    5537bp
ACCCHMO     Acinetobacter sp. cyclohexanone monooxygenase gene, complete 2128bp
ACCCITSYN   A.anitratum citrate synthase gene, complete cds.             1895bp
------------------------------------------------------------------------------


In addition to the files shown in the 9 tables above, the following four index 
files are included in this release. These files were prepared irrespective of 
the 18 categories listed above.

 Accession number index file
 Keyword phrase index file
 Journal citation index file
 Gene name index file

A brief description is given for each file in the following.


Table 10. Part of the accession number index file 'ddbjacc.idx'.
The following excerpt from the accession number index file illustrates the 
format of the index.
------------------------------------------------------------------------------
D00100    PSEASPAA   BCT D00100    
D00101    RABNP450R  MAM D00101    
D00102    HUMLTX     HUM D00102    
D00103    AFARRN5SA  BCT D00103   AFRRN5SA   BCT X05517    
D00104    AFARRN5SB  BCT D00104   AFRRN5SB   BCT X05518    
D00105    AFARRN5S   BCT D00105   ASRRN5S    BCT X05524    
D00106    ACH5SRR    BCT D00106   AXRRN5S    BCT X05522   AXRRN5SA   BCT X05523    
D00107    ACH5SRRX   BCT D00107   ACRRN5S    BCT X05521 
------------------------------------------------------------------------------
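
Judging from the excerpt in Table 10, each line starts with an accession number
that is followed by one or more groups of locus name, division, and accession
number.  The Python sketch below parses a line on that assumption; the function
name is hypothetical.

```python
def parse_acc_idx_line(line):
    """Return (accession, [(locus, division, accession), ...]) for one line."""
    fields = line.split()
    if not fields or (len(fields) - 1) % 3 != 0 or len(fields) == 1:
        raise ValueError("unexpected line: %r" % line)
    key, rest = fields[0], fields[1:]
    records = [tuple(rest[i:i + 3]) for i in range(0, len(rest), 3)]
    return key, records
```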


Table 11. Part of the keyword phrase index file 'ddbjkey.idx'.
Keyword phrases consist of names for gene products and other characteristics 
of sequence entries. 
------------------------------------------------------------------------------
A CHANNEL
             DROCHA     INV M17155
A COMPONENT
             SQLCVEA    VRL M38183
A LOCUS
             GORGOGOA3  PRI X54375 GORGOGOA4  PRI X54376
A LOCUS ALLELE
             GORA0101   PRI X60258 GORA0201   PRI X60259 GORA0401   PRI X60257
             GORA0501   PRI X60256
A MULTI-GENE FAMILY
             RICGLUTE   PLN D00584
A PROTEIN
             MS2AAR     PHG M25187 ST1APCS    PHG M25396
A SEQUENCE
             HS5TOA30   VRL D00148 HS5TOA31   VRL D00147
------------------------------------------------------------------------------


Table 12. Part of the author name index file in 'ddbjaut.idx'.
The author name index file lists all of the author names that appear in the 
citations. 
------------------------------------------------------------------------------
ABE,A.
             HUMMHDRBWE PRI M27509 HUMMHDRBWF PRI M27510 HUMMHDRBWG PRI M27511
             YSCGAL11A  PLN M22481
ABE,C.
             S85445     BCT S85445
ABE,E.
             M23442     UNA M23442
ABE,H.
             CHKADF     VRT M55660 CHKCOF     VRT M55659
ABE,K.
             CHPCLAC    PRI D11383 CHPIMRF    PRI D11384 CUGCUR09   PLN X64110
             CUGCUR37   PLN X64111 HPCCEXPA   VRL M55970 HPCCPEP1   VRL D10687
             HPCCPEP2   VRL D10688 HPCHABC82  VRL X51587 HPCNS2APA  VRL M55972
             HPCNS2PA   VRL M55971 HPCNS2PB   VRL M55973 HPCNS5PA   VRL M55974
             MUSKE2     ROD M65255 MUSKE2A    ROD M65256 MZECYS     PLN D10622
             RICCPI     PLN J03469 RICGLUTE   PLN D00584 RICLNOCI   PLN J05595
             RICOCS     PLN M29259 RICORYII   PLN X57658 RICOZA     PLN D90406
             RICOZB     PLN D90407 RICOZC     PLN D90408 S54524     PLN S54524
             S54526     PLN S54526 S54530     PLN S54530 S73960     ROD S73960
------------------------------------------------------------------------------


Table 13. Part of the journal citation index file in 'ddbjjou.idx'.
The journal citation index file lists all of the citations that appear in the 
references. 
------------------------------------------------------------------------------
ACTA BIOCHIM. BIOPHYS. SIN. 23, 246-253 (1992)
             HUMPLASINS HUM M98056    
ACTA BIOCHIM. BIOPHYS. SIN. 28, 233-239(1996)
             TKTII     PLN X82230    
ACTA BIOCHIM. POL. 24, 301-318 (1977)
             LUPTRFJ   PLN K00345    LUPTRFN   PLN K00346    
ACTA BIOCHIM. POL. 26, 369-381(1979)
             HVTRNPHE  PLN X02683    
ACTA BIOCHIM. POL. 29, 143-149 (1982)
             EMEMTA    PLN M32572    EMEMTB    PLN M32573    EMEMTC    PLN M32574    
             EMEMTD    PLN M32575    EMEMTE    PLN M32576    
ACTA BIOCHIM. POL. 34, 21-27 (1987)
             LUPNOSP    PLN M32571
------------------------------------------------------------------------------


Table 14. Part of the gene name index file in 'ddbjgen.idx'.
This file lists all the gene names that appear in the feature table.
------------------------------------------------------------------------------
AACC8
             STMAACC8   BCT M55426
AACC9
             MPUAACC9   BCT M55427
AACT
             HUMA1ACM   PRI K01500 HUMA1ACMA  PRI X00947 HUMA1ACMB  PRI M18035
             HUMAACT1   PRI M18906 HUMAACT2   PRI M22533 HUMAACTA   PRI J05176
AAD
             INTINTORF  BCT L06418 LMOMO229D  BCT X17478
AAD A1
             ENTAAC3VI  BCT M88012
AAD9
             ENEAAD9A   BCT M69221
AADA
             LMOMO229A  BCT X17479 S52249     BCT S52249 SYNAADA    SYN M60473
             TRNTAAB    BCT M55547 TRNTN21CAS BCT M86913
------------------------------------------------------------------------------
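
The index files excerpted in Tables 12 to 14 appear to share one layout: a key
line starting in the first column, followed by indented lines that each carry
one or more (entry name, division, accession number) triples.  The following
is a minimal Python sketch of a reader for that layout.  It is inferred from
the excerpts above only; the function name parse_index and the assumption that
no field inside a triple contains whitespace are illustrative and are not part
of this release.

```python
def parse_index(lines):
    """Map each index key to a list of (entry_name, division, accession).

    Assumed layout (inferred from the table excerpts, not a formal spec):
    key lines start in column 1; continuation lines are indented and hold
    whitespace-separated triples.
    """
    index = {}
    key = None
    for raw in lines:
        line = raw.rstrip()
        # Skip blank lines and the dashed rules that frame the tables.
        if not line or set(line) == {"-"}:
            continue
        if line[0].isspace():
            if key is None:
                raise ValueError("continuation line before any key line")
            fields = line.split()
            if len(fields) % 3 != 0:
                raise ValueError("expected triples, got: %r" % line)
            for i in range(0, len(fields), 3):
                index[key].append(tuple(fields[i:i + 3]))
        else:
            # The whole line is the key; keys such as "AAD A1" contain spaces.
            key = line
            index.setdefault(key, [])
    return index


sample = """\
AAD
             INTINTORF  BCT L06418 LMOMO229D  BCT X17478
AAD A1
             ENTAAC3VI  BCT M88012
"""
genes = parse_index(sample.splitlines())
```

With the sample above, genes["AAD"] holds two triples and genes["AAD A1"]
holds one.  Splitting on whitespace also tolerates the wider column padding
seen in the journal citation index excerpt.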


The files in this release are arranged in the following order, in non-labeled
format.

Release note
    ddbjrel.txt        770 records
Category for bacteria, 62290 entries, 148851593 bases
    ddbjbct.seq      6547144 records
Category for EST1 (expressed sequence tag), 295317 entries, 102601274 bases
    ddbjest1.seq     15639728 records
Category for EST2 (expressed sequence tag), 178588 entries, 65474371 bases
    ddbjest2.seq     11080053 records
Category for EST3 (expressed sequence tag), 215941 entries, 80226795 bases
    ddbjest3.seq     12805700 records
Category for EST4 (expressed sequence tag), 100000 entries, 37323377 bases
    ddbjest4.seq      6264366 records
Category for EST5 (expressed sequence tag), 100000 entries, 41142781 bases 
    ddbjest5.seq      6133682 records
Category for EST6 (expressed sequence tag), 100000 entries, 37300150 bases 
    ddbjest6.seq     6073092 records
Category for EST7 (expressed sequence tag), 100000 entries, 32462434 bases
    ddbjest7.seq     6261910 records
Category for EST8 (expressed sequence tag), 100000 entries, 38755118 bases 
    ddbjest8.seq     6032738 records
Category for EST9 (expressed sequence tag), 100000 entries, 40193139 bases 
    ddbjest9.seq      5955270 records
Category for EST10 (expressed sequence tag), 100000 entries, 38978372 bases 
    ddbjest10.seq     5950778 records
Category for EST11 (expressed sequence tag), 100000 entries, 38770963 bases 
    ddbjest11.seq     5977736 records
Category for EST12 (expressed sequence tag), 100000 entries, 39126233 bases 
    ddbjest12.seq     5954578 records
Category for EST13 (expressed sequence tag), 100000 entries, 38932809 bases 
    ddbjest13.seq     5909065 records
Category for EST14 (expressed sequence tag), 100000 entries, 42062455 bases 
    ddbjest14.seq     5960110 records
Category for EST15 (expressed sequence tag), 100000 entries, 42916076 bases 
    ddbjest15.seq     5954524 records
Category for EST16 (expressed sequence tag), 100000 entries, 40331983 bases 
    ddbjest16.seq     5580807 records
Category for EST17 (expressed sequence tag), 100000 entries, 40844247 bases 
    ddbjest17.seq     5820295 records
Category for EST18 (expressed sequence tag), 100000 entries, 40574510 bases 
    ddbjest18.seq     6177922 records
Category for EST19 (expressed sequence tag), 100000 entries, 44485443 bases 
    ddbjest19.seq     6066843 records
Category for EST20 (expressed sequence tag), 100000 entries, 40926309 bases 
    ddbjest20.seq     5895201 records
Category for EST21 (expressed sequence tag), 100000 entries, 46209384 bases 
    ddbjest21.seq     5521176 records
Category for EST22 (expressed sequence tag), 100000 entries, 35994696 bases 
    ddbjest22.seq     5193434 records
Category for EST23 (expressed sequence tag), 100000 entries, 26005110 bases 
    ddbjest23.seq     6015232 records
Category for EST24 (expressed sequence tag), 32343 entries, 8349630 bases 
    ddbjest24.seq     1953772 records
Category for GSS1 (Genome Survey Sequence), 100000 entries, 64270929 bases
    ddbjgss1.seq      4858672 records
Category for GSS2 (Genome Survey Sequence), 100000 entries, 40516506 bases
    ddbjgss2.seq      5189702 records
Category for GSS3 (Genome Survey Sequence), 100000 entries, 46989852 bases
    ddbjgss3.seq      5185133 records
Category for GSS4 (Genome Survey Sequence), 100000 entries, 49631354 bases
    ddbjgss4.seq      5416493 records
Category for GSS5 (Genome Survey Sequence), 100000 entries, 53153875 bases
    ddbjgss5.seq      5689599 records
Category for GSS6 (Genome Survey Sequence), 100000 entries, 49977820 bases
    ddbjgss6.seq      5492511 records
Category for GSS7 (Genome Survey Sequence), 100000 entries, 50972020 bases
    ddbjgss7.seq      5578374 records
Category for GSS8 (Genome Survey Sequence), 100000 entries, 52187577 bases
    ddbjgss8.seq      5607199 records
Category for GSS9 (Genome Survey Sequence), 83358 entries, 36061615 bases
    ddbjgss9.seq      4204716 records
Category for HTG (high-throughput genomic sequencing), 2687 entries,
355929756 bases
    ddbjhtg.seq      6141217 records
Category for human, 98575 entries, 477728753 bases
    ddbjhum.seq      13467297 records
Category for invertebrates, 47464 entries, 167964786 bases
    ddbjinv.seq      5402775 records
Category for mammals, 19684 entries, 17798695 bases
    ddbjmam.seq       1148283 records
Category for patents, 137719 entries, 43520758 bases
    ddbjpat.seq      3903387 records
Category for phages, 1450 entries, 3276789 bases
    ddbjphg.seq       158776 records
Category for plants, 78187 entries, 194692943 bases
    ddbjpln.seq     7353423 records
Category for primates, 5503 entries, 4493808 bases
    ddbjpri.seq       309354 records
Category for RNAs, 809 entries, 214264 bases
    ddbjrna.seq      36724 records
Category for rodents, 48211 entries, 70128407 bases
    ddbjrod.seq      3399206 records
Category for STS (sequence tagged site), 80420 entries, 28678126 bases
    ddbjsts.seq     4799442 records
Category for synthetic DNAs, 3375 entries, 7828294 bases
    ddbjsyn.seq       294287 records
Category for unannotated sequences, 495 entries, 381934 bases
    ddbjuna.seq      26111 records
Category for viruses, 72022 entries, 64653602 bases
    ddbjvrl.seq     4525891 records
Category for vertebrates, 29931 entries, 28627882 bases
    ddbjvrt.seq      1769249 records
Accession number index file
    ddbjacc.idx     4318837 records
Keyword phrase index file
    ddbjkey.idx      1660564 records
Journal citation index file
    ddbjjou.idx      2368677 records
Gene name index file
    ddbjgen.idx      340299 records