DNA Data Bank of Japan
DNA Database Release 22, July 1995,
including 437,588 entries, 322,982,425 bases

This database may be copied and redistributed without permission on the
condition that all the statements in this release note are reproduced in
each copy.

The present release contains the newest data prepared by the DNA Data Bank
of Japan (DDBJ), GenBank, and the European Bioinformatics Institute (EBI) as
of July 1995. This unified database was made possible by the international
collaboration among the three data banks. All entries have accordingly been
annotated with the feature keys common to the three. All entries whose
accession numbers carry the prefix "D" were collected and processed by DDBJ;
the rest were prepared by GenBank and EBI.

A number of genome projects are under way worldwide. Among them, the human
genome projects have probably been the most productive, yielding a large
number of ordinary sequences and huge numbers of ESTs. We now think that a
separate division for human sequences will serve users better. For example,
it makes it easier to move between the DNA databases and the human gene
mapping database run by the Genome Data Base. Thus, from this release, we
divide the primate (PRI) division into two, the human (HUM) and PRI
divisions; HUM includes human sequences only, while the new PRI contains the
other primate sequences. Note that the organella (ORG) and EST divisions
also contain human sequences of those categories. Thus, if you are
interested in human mitochondrial sequences, you have to go to the ORG
division instead of HUM.

The present release includes entries duplicated across divisions. This
originally arose because GenBank no longer has a separate division for
organella, and data belonging to this category were allocated to the other
divisions according to the "host" species.
For example, a human mitochondrial DNA sequence now belongs to the primate
division in the GenBank database. We, however, still maintain the organella
division in this release. Restoring the organella division from the GenBank
data and incorporating it in this release was nevertheless cumbersome and
time-consuming, and the result is incomplete: the pertinent divisions other
than the organella division still include organella data. Please be
careful, therefore, when you want to retrieve results for nuclear sequences
only. We apologize for this, and are trying to resolve the problem by
working together with the US and European data banks.

This release also includes independent categories for patent, EST, and STS
data. The patent data are those collected and processed by the Japanese
Patent Office (JPO), the United States Patent and Trademark Office (USPTO),
and the European Patent Office (EPO). The accession numbers of the patent
data collected by JPO start with the prefix "E", those collected by USPTO
and supplied by GenBank start with "I", and those collected by EPO and
supplied by EBI start with "A". The entries with the prefixes "I" and "A"
were allotted together to one division, and those with "E" were allocated to
a separate file (japio.dat). Note also that unauthorized use of the patent
data may cause legal issues for which we have no responsibility.

The number of ESTs has been growing rapidly and is expected to grow even
more rapidly in the future. To cope with this situation and to handle the
data files in as little time as possible, we will split the present EST
division into several subdivisions from the next release (Release 23).

This release does not include amino acid sequence data, because the genetic
code is known not to be uniform among species and organella, and we are not
yet prepared to handle this.

Published by:

K. Ikeo, T. Imanishi, K. Goto, M. Horie, Y. Sato, H. Tsutsui, M. Iwase,
Y. Hattori, A. Hasegawa, A. Suzuki, M. Hirashima, Y. Yamamoto, M. Shimoyama,
R. Suzuki, R. Uchida, Y. Shidahara, M. Gojobori, Y. Yamaguchi, K. Nomura,
S. Nagira, Y. Daito, A. Watanabe, Y. Ueda, T. Kawamoto, N. Shirakabe,
K. Okuda, K. Ichikawa, H. Suzuki, Y. Kitahara, M. Sugizaki, C. Amano,
M. Ogawa, R. Terauchi, K. Hatakeyama, N. Nakaya, T. Ito, I. Mochizuki,
T. Okayama, T. Tamura, J. Ishi-i, M. Mizumuma, T. Koike, N. Saitou,
T. Gojobori, and Y. Tateno

DNA Data Bank of Japan
Center for Information Biology
National Institute of Genetics
Mishima 411, Japan

Phone:  +81 559 81 6853
FAX:    +81 559 81 6849
E-mail: ddbj@ddbj.nig.ac.jp      (for general inquiry)
        ddbjsub@ddbj.nig.ac.jp   (for data submission)
        ddbjupdt@ddbj.nig.ac.jp  (for updates and notification of publication)
        http://www.nig.ac.jp     (for WWW server)

Acknowledgement:

We are grateful to NCBI, EBI, and the National Center for Genome Resources
for permitting us to include their data in the present release. We also
thank the Japanese Patent Office and Japan Patent Information Organization
for kindly allowing us to distribute the patent data they collected and
processed.

DDBJ Database Release History

Release  Date   Entries        Bases  Comments
--------------------------------------------------------------------
     22  07/95  437,588  322,982,425  GenBank and EMBL included
     21  04/95  274,596  250,875,023  GenBank and EMBL included
     20  01/95  239,689  231,299,557  GenBank and EMBL included
     19  10/94  204,332  205,274,131  GenBank and EMBL included
     18  07/94  185,230  192,473,021  GenBank and EMBL included
     17  04/94  169,957  179,942,209  GenBank and EMBL included
     16  01/94  154,626  165,017,628  GenBank and EMBL included
     15  10/93  131,649  147,224,690  GenBank and EMBL included
     14  07/93  120,350  138,686,333  GenBank and EMBL included
     13  04/93  112,067  129,784,445  GenBank and EMBL included
     12  01/93   97,683  120,815,244  GenBank and EMBL included
     11  07/92   65,693   84,839,075  GenBank and EMBL included
     10  01/92   59,317   77,805,556  GenBank and EMBL included
      9  07/91    1,130    2,002,124  DDBJ only
      8  01/91      879    1,573,442  DDBJ only
      7  07/90      681    1,154,211  DDBJ only
      6  01/90      496      841,236  DDBJ only
      5  07/89      395      679,378  DDBJ only
      4  01/89      302      535,985  DDBJ only
      3  07/88      230      345,850  DDBJ only
      2  01/88      142      199,392  DDBJ only
      1  07/87       66      108,970  DDBJ only
--------------------------------------------------------------------

This release covers 17 categories of organisms and others as follows:
------------------------------------------------------------------------------
ddbjbct.***    Category for bacteria
ddbjest.***    Category for EST (expressed sequence tag)
ddbjhum.***    Category for human
ddbjinv.***    Category for invertebrates
ddbjmam.***    Category for mammals other than primates and rodents
ddbjorg.***    Category for organella
ddbjpat.***    Category for patents
ddbjphg.***    Category for phages
ddbjpln.***    Category for plants
ddbjpri.***    Category for primates other than human
ddbjrna.***    Category for RNAs
ddbjrod.***    Category for rodents
ddbjsts.***    Category for STS (sequence tagged site)
ddbjsyn.***    Category for synthetic DNAs
ddbjuna.***    Category for unannotated sequences
ddbjvrl.***    Category for viruses
ddbjvrt.***    Category for vertebrates other than mammals
------------------------------------------------------------------------------

Each category then has the following nine files. Note that all the files
except for ddbj***.seq and ddbj***.sdr may include more than 80 characters
in one line. In that case, the line is folded at every 81st column in the
file on the distribution tape, which has a fixed record size of 80 bytes.
------------------------------------------------------------------------------
ddbj***.seq    List of entries in DDBJ format, see Table 1.
ddbj***.acc    List of the accession numbers, see Table 2.
ddbj***.aut    List of the authors, see Table 3.
ddbj***.dir    List of the short directory in DDBJ style, see Table 4.
ddbj***.idx    List of indices, see Table 5.
ddbj***.jou    List of the journals, see Table 6.
ddbj***.key    List of the key words, see Table 7.
ddbj***.org    List of the species names, see Table 8.
ddbj***.sdr    List of the short directory in GenBank style, see Table 9.
------------------------------------------------------------------------------

Table 1. Part of the contents in the file 'ddbjbct.seq'. This shows all
pieces of information on one entry in DDBJ format.
------------------------------------------------------------------------------
LOCUS       ABCAARAA     1624 bp ds-DNA             BCT       15-SEP-1990
DEFINITION  A.aceti acetic acid resistance protein (aarA) gene, complete cds.
ACCESSION   M34830
KEYWORDS    acetic acid resistance protein.
SOURCE      A.aceti (strain 10-8) DNA, clone pAR1611.
  ORGANISM  Acetobacter aceti
            Prokaryota; Bacteria; Gracilicutes; Scotobacteria; Aerobic rods
            and cocci; Azotobacteraceae.
REFERENCE   1  (bases 1 to 1624)
  AUTHORS   Fukaya,M., Takemura,H., Okumura,H., Kawamura,Y., Horinouchi,S.
            and Beppu,T.
  TITLE     Cloning of genes responsible for acetic acid resistance in
            acetobacter aceti
  JOURNAL   J. Bacteriol.
            172, 2096-2104 (1990)
  STANDARD  simple staff_entry
FEATURES             Location/Qualifiers
     RBS             171..176
                     /note="ribosome binding site (put.)"
     CDS             185..1495
                     /note="acetic acid resistance protein (aarA)"
                     /codon_start=1
     misc_signal     1508..1545
                     /note="transcription termination signal"
BASE COUNT      400 a    446 c    404 g    374 t
ORIGIN
        1 gcatgcattt gcacacattc gcgcgaccct aagcccaaaa aactgtggtt ttccaagcat
       61 actcctttcc gataacgctt cgtttatcgc tggcaacctt ccggtttcct tttgaatgag
      121 tgacaaagtg tgacgagcag gccgcagcag cgaccgtggc ccaaccatgc agaaggaaac
      181 actaatgagc gcgtcgcaga aagaaggtaa gctatctacc gctaccattt cggttgatgg
      241 aaaatccgcc gaaatgcctg tgctttcagg cactctggga ccggatgtta tcgacatccg
      301 caaacttccg gcgcaactgg gcgttttcac gtttgaccca ggttacgggg aaacagcggc
      361 ctgcaacagc aaaatcacct ttattgatgg tgataaaggc gttctgctgc accgtggtta
      421 ccctattgcg cagctggacg aaaatgcttc ctacgaagaa gttatttatc tgcttttgaa
      481 tggcgaactg cccaacaagg tgcagtacga caccttcacc aacaccctta caaaccatac
      541 gctgctgcac gagcagatcc gtaacttctt taacggcttc cggcgtgatg cccacccaat
      601 ggccattctg tgtggtacgg ttggggcttt gtctgccttc tacccagatg ccaacgatat
      661 tgccattccc gccaatcggg atctggccgc catgcggctg attgccaaaa tcccaaccat
      721 tgcggcatgg gcttacaaat acacgcaggg tgaagccttt atctacccgc ggaatgatct
      781 gaactacgca gaaaacttcc tgtccatgat gttcgcgcgc atgtccgaac cttacaaggt
      841 caaccctgtt ctggcccgcg ccatgaaccg gattctgatt ctgcatgccg atcatgagca
      901 gaatgcctct acctccaccg tacgtctggc tggttctaca ggggccaatc cgtttgcctg
      961 tattgctgcg ggcattgccg ctctgtgggg acctgcacat ggtggcgcaa acgaagctgt
     1021 gctgaaaatg ctggcccgta ttggcaagaa agaaaatatt cctgccttta tcgcacaggt
     1081 gaaggacaag aacagcggcg taaagctgat gggctttggc caccgcgttt acaagaactt
     1141 cgacccacgt gcgaagatca tgcagcagac ctgccacgaa gtgctgacag aacttggcat
     1201 taaggatgat ccgctgctgg atctggcggt tgagctggaa aagattgctc tgagcgatga
     1261 ttacttcgtg cagcgcaaac tttacccgaa tgtggatttc tactctggca tcattctcaa
     1321 ggccatgggc atccccacca gtatgtttac tgtgctgttt gccgtagccc gcaccaccgg
     1381 ctgggtgagc cagtggaagg aaatgattga agaaccgggc cagcgtatca gccgccctcg
     1441 ccagctttat attggcgcac cgcagcgtga ctatgtgccg cttgccaaac gctaaaacag
     1501 actaacccaa aaagccgact tcccgtaagg aaagtcggct ttttgtttgc acgctgtttc
     1561 caaaaaaata gggcggcaga gcgaataaac gctacctagc cttcaggcat aaaaaaacgc
     1621 atgc
//
------------------------------------------------------------------------------

Table 2. Part of the contents in the file 'ddbjbct.acc'. The first column
refers to the secondary accession number, the second column to the locus
name, and the third to the primary accession number. The primary number may
be the same as the secondary number. The lines are arranged in ascending
order of the secondary accession numbers.
------------------------------------------------------------------------------
D00001 -> ECOPBPAA X04516
D00002 -> ECOPYRH X04469
D00006 -> PNS981TET D00006
D00020 -> COLE2LYS D00020
D00021 -> COLE31YS D00021
D00038 -> BRLAM330 D00038
D00066 -> BAC139AC D00066
D00067 -> ECONANA M20207
D00069 -> ECOUVRD2 D00069
D00087 -> BACXYNAA D00087
------------------------------------------------------------------------------

Table 3. Part of the contents in the file 'ddbjbct.aut'. For each author
name given to the left of the arrow, the corresponding locus name and
primary accession number are listed, in that order, on the right. The lines
are arranged in alphabetical order of the author names.
------------------------------------------------------------------------------
Aan,F. -> STYCRR X05210
Aan,F. -> STYENZI M76176
Aaronson,W. -> ECOKPSD M64977
Aaronson,W. -> ECONEUA J05023
Abad-Lapuebla,M.A. -> VIBTDHI D90238
Abdel-Mawgood,A.L. -> CYAPSBHA X16394
Abdel-Meguid,S.S. -> TRNGDRECM J01843
Abdelal,A. -> STYCARA M36540
Abdelal,A. -> STYCARAB X13200
Abdelal,A.H. -> PSENOSA M60717
------------------------------------------------------------------------------

Table 4. Part of the short directory in DDBJ style in the file 'ddbjbct.dir'.
For each locus name given in the first column, the corresponding primary
accession number, molecular type, number of nucleotide pairs, and
description of the locus are listed in that order. The lines are arranged in
alphabetical order of the locus names.
------------------------------------------------------------------------------
ABCAARAA   M34830  ds-DNA   1624  A.aceti acetic acid resistance protein (aarA)
                                  gene, complete cds.
ABCADHCC   D00635  ds-DNA   4230  A. polyoxogenes alcohol dehydrogenase
                                  (EC 1.1.99.8) and cytochrome c genes.
ABCALDH    D00521  ds-DNA   2683  A.polyoxogenes membrane-bound aldehyde
                                  dehydrogenase gene, complete cds and flanks.
ABCBCSAA   M37202  ds-DNA   9540  A.xylinum bcs B, bcs C and bcs D genes,
                                  complete cds and bcs A gene, partial cds.
ABCCELA    M76548  ds-DNA   1165  Acetobacter xylinum UDP pyrophosphorylase
                                  (celA) gene, complete cds.
ABCCELSYN  X54676  ds-DNA   5363  A. xylinum gene for cellulose biosynthesis
ABCIS1380  D10043  ds-DNA   1665  A.pasteurianus insertion sequence IS1380.
ACAADH1    D90004  ds-DNA   2467  Acetobacter aceti(K6033) alcohol dehydrogenase
                                  subunit gene(adh1).
ACCAAC2    M62833  ds-DNA   1123  Acinetobacter baumannii aminoglycoside
                                  acetyltransferase (aac2) gene, complete cds.
ACCACEAA   M62822  ds-DNA   1874  A.baumannii chloramphenicol acetyltransferase
                                  (cat) gene, complete cds.
------------------------------------------------------------------------------

Table 5. Part of the contents in the file 'ddbjbct.idx'. The first column
refers to the locus name, the second column to the starting position of the
locus in bytes, and the third to its ending position in bytes. The lines are
arranged in alphabetical order of the locus names.
------------------------------------------------------------------------------
%*****************************
#ABCAARAA   0      3211
#ABCADHCC   3212   10608
#ABCALDH    10609  15864
#ABCBCSAA   15865  29583
#ABCCELA    29584  32289
#ABCCELSYN  32290  40960
#ABCIS1380  40961  44711
#ACAADH1    44712  49357
#ACCAAC2    49358  52395
------------------------------------------------------------------------------

Table 6. Part of the contents in the file 'ddbjbct.jou'. This gives
information on the journal in which sequence data were published.
------------------------------------------------------------------------------
(in) Chaloupka,J. and Krumphanzl,V. (Eds.); Extracellular Enzymes of
     Microorganisms: 129-137, Plenum Press, New York (1987)
     -> BACAMYABS M57457
(in) Ganesan,A.T., Chang,S. and Hoch,J.A. (Eds.); Molecular Cloning and Gene
     Regulation in Bacilli: 3-10, Academic Press, New York (1982)
     -> BACRG16S M55011
(in) Ganesan,A.T., Chang,S. and Hoch,J.A. (Eds.); Molecular Cloning and Gene
     Regulation in Bacilli: 3-10, Academic Press, New York (1982)
     -> BACRG16SA M55006
(in) Ganesan,A.T., Chang,S. and Hoch,J.A. (Eds.); Molecular Cloning and Gene
     Regulation in Bacilli: 3-10, Academic Press, New York (1982)
     -> BACRG16SB M55008
(in) Hoch,J.A. and Setlow,P. (Eds.); Molecular Biology of Microbial
     Differentiation: 85-94, American Society for Microbiology,
     Washington, DC (1985) -> BACSPOII M57606
(in) Holmgren,A. (Ed.); Thioredoxin and Glutaredoxin Systems: Structure and
     Function: 11-19, Unknown name, Unknown city (1986) -> ECOTRXA1 M54881
(in) Kjeldgaard,N.C. and Maaloe,O. (Eds.); Control of ribosome synthesis:
     138-143, Academic Press, New York (1976) -> ECOLAC J01636
(in) Losick,R. and Chamberlin,M. (Eds.); RNA polymerase: 455-472, Cold
     Spring Harbor Laboratory, Cold Spring Harbor, NY (1976)
     -> ECOTGY1 K01197
(in) Sikes,C.S. and Wheeler,A.P. (Eds.); Surface reactive peptides and
     polymers. Discovery and commercialization.: 186-200, American Chemical
     Society, Washington, D.C. (1991) -> ECOTGP J01714
(in) Sund,H. and Blauer,G. (Eds.); Protein-Ligand Interactions: 193-207,
     Walter de Gruyter, New York (1975) -> ECOLAC J01636
(in) Wu,R. and Grossman,L. (Eds.); Methods in Enzymology, Recombinant DNA,
     part E: In press, Academic Press, New York, N.Y. (1986)
     -> PLMCG M11320
Acta Microbiol. Pol. 35, 175-190 (1986) -> ECOTGG1 M54893
Actinomycetologica 5, 14-17 (1991) -> STMARGG D00799
Adv. Biophys. 21, 115-133 (1986) -> R10REP M26840
Adv. Biophys. 21, 175-192 (1986) -> ECONUSAA M26839
Adv. Enzyme Regul. 21, 225-237 (1983) -> ECOPURFA M26893
Adv. Exp. Med. Biol. 195, 239-246 (1986) -> ECOAPT M14040
Agric. Biol. Chem. 50, 2155-2158 (1986) -> ECONANA M20207
Agric. Biol. Chem. 50, 2771-2778 (1986) -> BRLAM330 D00038
Agric. Biol. Chem. 51, 2019-2022 (1987) -> BACCGT D00129
Agric. Biol. Chem. 51, 2641-2648 (1987) -> STRSAGP D00219
Agric. Biol. Chem. 51, 2807-2809 (1987) -> BACPGECR M35503
Agric. Biol. Chem. 51, 3133-3135 (1987) -> BACXYLAP D00312
Agric. Biol. Chem. 51, 455-463 (1987) -> BACHDCRY D00117
Agric. Biol. Chem. 51, 953-955 (1987) -> BACXYNAA D00087
Agric. Biol. Chem. 52, 1565-1573 (1988) -> BACIP135 D00348
Agric. Biol. Chem. 52, 1785-1789 (1988) -> BACTMR D00343
Agric. Biol. Chem. 52, 2243-2246 (1988) -> PSEGI D00342
Agric. Biol. Chem. 52, 399-406 (1988) -> BACAMYEB M35517
Agric. Biol. Chem. 52, 479-487 (1988) -> ECAPALI D00217
------------------------------------------------------------------------------

Table 7. Part of the contents in the file 'ddbjbct.key'. For the locus name
and accession number given to the right of the arrow, the corresponding key
words are listed on the left.
------------------------------------------------------------------------------
A.aceti acetic acid resistance protein (aarA) gene, complete cds.
     -> ABCAARAA M34830
acetic acid resistance protein. -> ABCAARAA M34830
Cloning of genes responsible for acetic acid resistance in acetobacter aceti
     -> ABCAARAA M34830
A. polyoxogenes alcohol dehydrogenase (EC 1.1.99.8) and cytochrome c genes.
     -> ABCADHCC D00635
alcohol dehydrogenase; cytochrome c. -> ABCADHCC D00635
Cloning and sequencing of the gene cluster encoding two subunits of
     membrane-bound alcohol dehydrogenase from Acetobacter polyoxogenes
     -> ABCADHCC D00635
These data kindly submitted in computer readable form by: Toshimi Tamaki
     Nakano Central Biochemical Institute 2-6 Nakamura-cho Handa-shi,
     Aichi-ken 475 Japan Phone: 0569-21-3331 Fax: 0569-23-8486
     -> ABCADHCC D00635
A.polyoxogenes membrane-bound aldehyde dehydrogenase gene, complete cds and
     flanks. -> ABCALDH D00521
aldehyde dehydrogenase gene; ethanol oxidation; membrane-bound enzyme.
     -> ABCALDH D00521
Nucleotide sequence of the membrane-bound aldehyde dehydrogenase gene from
     Acetobacter polyoxogenes -> ABCALDH D00521
------------------------------------------------------------------------------

Table 8. Part of the contents in the file 'ddbjbct.org'. For the locus name
and accession number given to the right of the arrow, the corresponding
taxonomic names are listed on the left. The lines are arranged in
alphabetical order of the species names.
------------------------------------------------------------------------------
A. nidulans 6301 DNA.
     Anacystis nidulans Prokaryota; Bacteria; Gracilicutes;
     Oxyphotobacteria; Cyanobacteria. -> ANIRUBPS X00019
A. nidulans DNA, clone pAN4.
     Anacystis nidulans Prokaryota; Bacteria; Gracilicutes;
     Oxyphotobacteria; Cyanobacteria. -> ANIRGGX X00343
A. nidulans DNA.
     Anacystis nidulans Prokaryota; Bacteria; Gracilicutes;
     Oxyphotobacteria; Cyanobacteria. -> ANIRGG X00512
A. polyoxogenes genomic DNA.
     Acetobacter polyoxogenes Prokaryota; Bacteria; Gracilicutes;
     Scotobacteria; Aerobic rods and cocci; Azotobacteraceae.
     -> ABCADHCC D00635
A. quadruplicatum (strain PR-6) DNA, clone pAQPR1.
     Agmenellum quadruplicatum Prokaryota; Bacteria; Gracilicutes;
     Oxyphotobacteria; Cyanobacteria. -> AQUPCAB K02660
A. quadruplicatum (strain PR6) DNA.
     Agmenellum quadruplicatum Prokaryota; Bacteria; Gracilicutes;
     Oxyphotobacteria; Cyanobacteria. -> AQUCPCAB K02659
A. vinelandii DNA.
     Azotobacter vinelandii Prokaryota; Bacteria; Gracilicutes;
     Scotobacteria; Aerobic rods and cocci; Azotobacteraceae.
     -> AVINIFUSV M17349
A.aceti (strain 10-8) DNA, clone pAR1611.
     Acetobacter aceti Prokaryota; Bacteria; Gracilicutes; Scotobacteria;
     Aerobic rods and cocci; Azotobacteraceae. -> ABCAARAA M34830
A.actinomycetemcomitans (strain JP2) DNA, clone lambda-OP8.
     Actinobacillus actinomycetemcomitans Prokaryota; Bacteria;
     Gracilicutes; Scotobacteria; Facultatively anaerobic rods;
     Pasteurellaceae. -> ACNLKTXN M27399
A.anitratum DNA, clone pLJD1.
     Acinetobacter anitratum Prokaryota; Bacteria; Gracilicutes;
     Scotobacteria; Neisseriaceae. -> ACCCITSYN M33037
------------------------------------------------------------------------------

Table 9. Part of the short directory file in GenBank style in the file
'ddbjbct.sdr'. The short directory file contains brief descriptions, in
GenBank style, of all the sequence entries.
------------------------------------------------------------------------------
ABCAARAA   A.aceti acetic acid resistance protein (aarA) gene, complete  1624bp
ABCADHCC   A. polyoxogenes alcohol dehydrogenase (EC 1.1.99.8) and       4230bp
ABCALDH    A.polyoxogenes membrane-bound aldehyde dehydrogenase gene,    2683bp
ABCBCSABCD A.xylinum bcs A, B, C and D genes, complete cds's.            9540bp
ABCCELA    Acetobacter xylinum UDP pyrophosphorylase (celA) gene,        1165bp
ABCCELSYN  A. xylinum gene for cellulose biosynthesis                    5363bp
ABCIS1380  A.pasteurianus insertion sequence IS1380.                     1665bp
ACAADH1    Acetobacter aceti(K6033) alcohol dehydrogenase subunit        2467bp
ACCAAC2    Acinetobacter baumannii aminoglycoside acetyltransferase      1123bp
ACCACEAA   A.baumannii chloramphenicol acetyltransferase (cat) gene,     1874bp
ACCAPHA6   Acinetobacter baumannii aphA-6 gene.                          1170bp
ACCBENABCA A.calcoaceticus BenA, BenB, BenC, BenD, and BenE proteins    15922bp
ACCCAT     Acinetobacter calcoaceticus cat operon.                      15922bp
ACCCATAM   A.calcoaceticus catA and catM genes, encoding catechol 1,     5537bp
ACCCHMO    Acinetobacter sp. cyclohexanone monooxygenase gene, complete  2128bp
ACCCITSYN  A.anitratum citrate synthase gene, complete cds.              1895bp
------------------------------------------------------------------------------

In addition to the nine files illustrated in Tables 1 to 9, the following
five index files are included in this release. These files were prepared
irrespective of the 14 categories of taxonomic divisions.

    Accession number index file
    Keyword phrase index file
    Author name index file
    Journal citation index file
    Gene name index file

A brief description of each file is given below.

Table 10. Part of the accession number index file 'ddbjacc.idx'. The
following excerpt illustrates the format of the index. Note that, as
mentioned above, an entry may appear under the same accession number both in
a taxonomic category and in EST or ORG; for example, PRI D12345 and EST
D12345 under the same accession number D12345.
------------------------------------------------------------------------------
M33790
          SHFINVEA   BCT M33790
M33791
          BACORF2    BCT M33791
M33792
          FTRCPRBCLC ORG X55829
          FTRCPRBCLC PLN X55829
M33793
          FTRCPPRBCL ORG X55830
          FTRCPPRBCL PLN X55830
M33794
          ATPCPARRBC ORG X55831
          ATPCPARRBC PLN X55831
          ATPCPRBCLB ORG X15925
          ATPCPRBCLB PLN X15925
M33796
          NRACPNTRBC ORG X55827
          NRACPNTRBC PLN X55827
M33797
          NRACPRBCL  ORG X55828
          NRACPRBCL  PLN X55828
M33798
          ACCPCACGH  BCT M33798
M33799
          PSETRPEGDC BCT M33799
------------------------------------------------------------------------------

Table 11. Part of the keyword phrase index file 'ddbjkey.idx'. Keyword
phrases consist of names for gene products and other characteristics of
sequence entries.
------------------------------------------------------------------------------
A CHANNEL
          DROCHA     INV M17155
A COMPONENT
          SQLCVEA    VRL M38183
A LOCUS
          GORGOGOA3  PRI X54375
          GORGOGOA4  PRI X54376
A LOCUS ALLELE
          GORA0101   PRI X60258
          GORA0201   PRI X60259
          GORA0401   PRI X60257
          GORA0501   PRI X60256
A MULTI-GENE FAMILY
          RICGLUTE   PLN D00584
A PROTEIN
          MS2AAR     PHG M25187
          ST1APCS    PHG M25396
A SEQUENCE
          HS5TOA30   VRL D00148
          HS5TOA31   VRL D00147
------------------------------------------------------------------------------

Table 12. Part of the author name index file in 'ddbjaut.idx'. The author
name index file lists all of the author names that appear in the citations.
------------------------------------------------------------------------------
ABE,A.
          HUMMHDRBWE PRI M27509
          HUMMHDRBWF PRI M27510
          HUMMHDRBWG PRI M27511
          YSCGAL11A  PLN M22481
ABE,C.
          S85445     BCT S85445
ABE,E.
          M23442     UNA M23442
ABE,H.
          CHKADF     VRT M55660
          CHKCOF     VRT M55659
ABE,K.
          CHPCLAC    PRI D11383
          CHPIMRF    PRI D11384
          CUGCUR09   PLN X64110
          CUGCUR37   PLN X64111
          HPCCEXPA   VRL M55970
          HPCCPEP1   VRL D10687
          HPCCPEP2   VRL D10688
          HPCHABC82  VRL X51587
          HPCNS2APA  VRL M55972
          HPCNS2PA   VRL M55971
          HPCNS2PB   VRL M55973
          HPCNS5PA   VRL M55974
          MUSKE2     ROD M65255
          MUSKE2A    ROD M65256
          MZECYS     PLN D10622
          RICCPI     PLN J03469
          RICGLUTE   PLN D00584
          RICLNOCI   PLN J05595
          RICOCS     PLN M29259
          RICORYII   PLN X57658
          RICOZA     PLN D90406
          RICOZB     PLN D90407
          RICOZC     PLN D90408
          S54524     PLN S54524
          S54526     PLN S54526
          S54530     PLN S54530
          S73960     ROD S73960
------------------------------------------------------------------------------

Table 13. Part of the journal citation index file in 'ddbjjou.idx'. The
journal citation index file lists all of the citations that appear in the
references.
------------------------------------------------------------------------------
ACTA BIOCHIM. BIOPHYS. SIN. 23, 246-253 (1992)
          HUMPLASINS PRI M98056
ACTA BIOCHIM. POL. 24, 301-318 (1977)
          LUPTRFJ    RNA K00345
          LUPTRFN    RNA K00346
ACTA BIOCHIM. POL. 26, 369-381 (1979)
          BLYTRNPHE  PLN X02683
ACTA BIOCHIM. POL. 29, 143-149 (1982)
          EMEMTA     ORG M32572
          EMEMTA     PLN M32572
          EMEMTB     ORG M32573
          EMEMTB     PLN M32573
          EMEMTC     ORG M32574
          EMEMTC     PLN M32574
          EMEMTD     ORG M32575
          EMEMTD     PLN M32575
          EMEMTE     ORG M32576
          EMEMTE     PLN M32576
ACTA BIOCHIM. POL. 34, 21-27 (1987)
          LUPNOSP    PLN M32571
------------------------------------------------------------------------------

Table 14. Part of the gene name index file in 'ddbjgen.idx'. This file lists
all the gene names that appear in the feature table.
------------------------------------------------------------------------------
AACC8
          STMAACC8   BCT M55426
AACC9
          MPUAACC9   BCT M55427
AACT
          HUMA1ACM   PRI K01500
          HUMA1ACMA  PRI X00947
          HUMA1ACMB  PRI M18035
          HUMAACT1   PRI M18906
          HUMAACT2   PRI M22533
          HUMAACTA   PRI J05176
AAD
          INTINTORF  BCT L06418
          LMOMO229D  BCT X17478
AAD A1
          ENTAAC3VI  BCT M88012
AAD9
          ENEAAD9A   BCT M69221
AADA
          LMOMO229A  BCT X17479
          S52249     BCT S52249
          SYNAADA    SYN M60473
          TRNTAAB    BCT M55547
          TRNTN21CAS BCT M86913
------------------------------------------------------------------------------

The files in this release are arranged in the following order, in
non-labeled format.

Release note
FILE.001  ddbjrel.txt        761 records

Category for bacteria, 20417 entries, 37270887 bases
FILE.002  ddbjbct.acc      22655 records
FILE.003  ddbjbct.aut      80073 records
FILE.004  ddbjbct.dir      20417 records
FILE.005  ddbjbct.idx      20418 records
FILE.006  ddbjbct.jou      33727 records
FILE.007  ddbjbct.key      94112 records
FILE.008  ddbjbct.org      20417 records
FILE.009  ddbjbct.sdr      20417 records
FILE.010  ddbjbct.seq    1451741 records

Category for EST (expressed sequence tag), 237157 entries, 80102513 bases
FILE.011  ddbjest.acc     237191 records
FILE.012  ddbjest.aut    1120347 records
FILE.013  ddbjest.dir     237157 records
FILE.014  ddbjest.idx     237158 records
FILE.015  ddbjest.jou     296930 records
FILE.016  ddbjest.key     999370 records
FILE.017  ddbjest.org     237157 records
FILE.018  ddbjest.sdr     237157 records
FILE.019  ddbjest.seq   11850048 records

Category for human, 33168 entries, 36674453 bases
FILE.020  ddbjhum.acc      37571 records
FILE.021  ddbjhum.aut     149599 records
FILE.022  ddbjhum.dir      33168 records
FILE.023  ddbjhum.idx      33169 records
FILE.024  ddbjhum.jou      49615 records
FILE.025  ddbjhum.key     147572 records
FILE.026  ddbjhum.org      33168 records
FILE.027  ddbjhum.sdr      33168 records
FILE.028  ddbjhum.seq    1973271 records

Category for invertebrates, 15589 entries, 32537701 bases
FILE.029  ddbjinv.acc      17197 records
FILE.030  ddbjinv.aut      57247 records
FILE.031  ddbjinv.dir      15589 records
FILE.032  ddbjinv.idx      15590 records
FILE.033  ddbjinv.jou      23967 records
FILE.034  ddbjinv.key      70209 records
FILE.035  ddbjinv.org      15589 records
FILE.036  ddbjinv.sdr      15589 records
FILE.037  ddbjinv.seq    1182802 records

Category for mammals, 7317 entries, 7826254 bases
FILE.038  ddbjmam.acc       8108 records
FILE.039  ddbjmam.aut      28673 records
FILE.040  ddbjmam.dir       7317 records
FILE.041  ddbjmam.idx       7318 records
FILE.042  ddbjmam.jou      10891 records
FILE.043  ddbjmam.key      32620 records
FILE.044  ddbjmam.org       7317 records
FILE.045  ddbjmam.sdr       7317 records
FILE.046  ddbjmam.seq     408351 records

Category for organella, 46781 entries, 65228558 bases
FILE.047  ddbjorg.acc      52638 records
FILE.048  ddbjorg.aut     185078 records
FILE.049  ddbjorg.dir      46781 records
FILE.050  ddbjorg.idx      46782 records
FILE.051  ddbjorg.jou      87188 records
FILE.052  ddbjorg.key     224520 records
FILE.053  ddbjorg.org      46781 records
FILE.054  ddbjorg.sdr      46781 records
FILE.055  ddbjorg.seq    3052443 records

Category for patents, 18201 entries, 6258819 bases
FILE.056  ddbjpat.acc      18201 records
FILE.057  ddbjpat.aut      42738 records
FILE.058  ddbjpat.dir      18201 records
FILE.059  ddbjpat.idx      18202 records
FILE.060  ddbjpat.jou      18200 records
FILE.061  ddbjpat.key      71170 records
FILE.062  ddbjpat.org      18201 records
FILE.063  ddbjpat.sdr      18201 records
FILE.064  ddbjpat.seq     483823 records

Category for phages, 1062 entries, 1550432 bases
FILE.065  ddbjphg.acc       1248 records
FILE.066  ddbjphg.aut       4139 records
FILE.067  ddbjphg.dir       1062 records
FILE.068  ddbjphg.idx       1063 records
FILE.069  ddbjphg.jou       1709 records
FILE.070  ddbjphg.key       4864 records
FILE.071  ddbjphg.org       1062 records
FILE.072  ddbjphg.sdr       1062 records
FILE.073  ddbjphg.seq      68722 records

Category for plants, 22591 entries, 42715709 bases
FILE.074  ddbjpln.acc      24502 records
FILE.075  ddbjpln.aut      83922 records
FILE.076  ddbjpln.dir      22591 records
FILE.077  ddbjpln.idx      22592 records
FILE.078  ddbjpln.jou      36978 records
FILE.079  ddbjpln.key     102590 records
FILE.080  ddbjpln.org      22591 records
FILE.081  ddbjpln.sdr      22591 records
FILE.082  ddbjpln.seq    1624521 records

Category for primates, 2679 entries, 1961448 bases
FILE.083  ddbjpri.acc       2825 records
FILE.084  ddbjpri.aut      11060 records
FILE.085  ddbjpri.dir       2679 records
FILE.086  ddbjpri.idx       2680 records
FILE.087  ddbjpri.jou       3850 records
FILE.088  ddbjpri.key      11838 records
FILE.089  ddbjpri.org       2679 records
FILE.090  ddbjpri.sdr       2679 records
FILE.091  ddbjpri.seq     129602 records

Category for RNAs, 4922 entries, 2570883 bases
FILE.092  ddbjrna.acc       5239 records
FILE.093  ddbjrna.aut      20123 records
FILE.094  ddbjrna.dir       4922 records
FILE.095  ddbjrna.idx       4923 records
FILE.096  ddbjrna.jou       6203 records
FILE.097  ddbjrna.key      20621 records
FILE.098  ddbjrna.org       4922 records
FILE.099  ddbjrna.sdr       4922 records
FILE.100  ddbjrna.seq     192052 records

Category for rodents, 25446 entries, 28710534 bases
FILE.101  ddbjrod.acc      29373 records
FILE.102  ddbjrod.aut     102841 records
FILE.103  ddbjrod.dir      25446 records
FILE.104  ddbjrod.idx      25447 records
FILE.105  ddbjrod.jou      36464 records
FILE.106  ddbjrod.key     111915 records
FILE.107  ddbjrod.org      25446 records
FILE.108  ddbjrod.sdr      25446 records
FILE.109  ddbjrod.seq    1465745 records

Category for STS (sequence tagged site), 13577 entries, 4134248 bases
FILE.110  ddbjsts.acc      13606 records
FILE.111  ddbjsts.aut      46161 records
FILE.112  ddbjsts.dir      13577 records
FILE.113  ddbjsts.idx      13578 records
FILE.114  ddbjsts.jou      17169 records
FILE.115  ddbjsts.key      57592 records
FILE.116  ddbjsts.org      13577 records
FILE.117  ddbjsts.sdr      13577 records
FILE.118  ddbjsts.seq     744957 records

Category for synthetic DNAs, 2006 entries, 3501071 bases
FILE.119  ddbjsyn.acc       2144 records
FILE.120  ddbjsyn.aut       7484 records
FILE.121  ddbjsyn.dir       2006 records
FILE.122  ddbjsyn.idx       2007 records
FILE.123  ddbjsyn.jou       2918 records
FILE.124  ddbjsyn.key       8876 records
FILE.125  ddbjsyn.org       2006 records
FILE.126  ddbjsyn.sdr       2006 records
FILE.127  ddbjsyn.seq     134533 records

Category for unannotated sequences, 363 entries, 332873 bases
FILE.128  ddbjuna.acc        371 records
FILE.129  ddbjuna.aut       1320 records
FILE.130  ddbjuna.dir        363 records
FILE.131  ddbjuna.idx        364 records
FILE.132  ddbjuna.jou        385 records
FILE.133  ddbjuna.key       1472 records
FILE.134  ddbjuna.org        363 records
FILE.135  ddbjuna.sdr        363 records
FILE.136  ddbjuna.seq      15823 records

Category for viruses, 23426 entries, 26466028 bases
FILE.137  ddbjvrl.acc      25107 records
FILE.138  ddbjvrl.aut     101502 records
FILE.139  ddbjvrl.dir      23426 records
FILE.140  ddbjvrl.idx      23427 records
FILE.141  ddbjvrl.jou      35667 records
FILE.142  ddbjvrl.key     104749 records
FILE.143  ddbjvrl.org      23426 records
FILE.144  ddbjvrl.sdr      23426 records
FILE.145  ddbjvrl.seq    1293658 records

Category for vertebrates, 9308 entries, 9750092 bases
FILE.146  ddbjvrt.acc      10153 records
FILE.147  ddbjvrt.aut      34495 records
FILE.148  ddbjvrt.dir       9308 records
FILE.149  ddbjvrt.idx       9309 records
FILE.150  ddbjvrt.jou      14175 records
FILE.151  ddbjvrt.key      41725 records
FILE.152  ddbjvrt.org       9308 records
FILE.153  ddbjvrt.sdr       9308 records
FILE.154  ddbjvrt.seq     509949 records

Accession number index file
FILE.155  ddbjacc.idx     449851 records

Keyword phrase index file
FILE.156  ddbjkey.idx     314403 records

Author name index file
FILE.157  ddbjaut.idx    1977496 records

Journal citation index file
FILE.158  ddbjjou.idx     398447 records

Gene name index file
FILE.159  ddbjgen.idx      78047 records

Japan Patent Information Organization sequence file, 4551 entries, 3829577 bases
FILE.160  japio.dat       214119 records
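
The byte positions listed in a ddbj***.idx file (Table 5) suggest a simple
way to pull one entry out of a large ddbj***.seq file without scanning it
from the top. The Python sketch below illustrates the idea; it is not part
of the distribution. It assumes that the positions are zero-based and
inclusive and refer to the matching ddbj***.seq file (the excerpt in Table 5
is consistent with this, but the release note does not state it), and that
index lines begin with '#' while other lines, such as the leading '%' line,
can be skipped. The function names parse_idx and read_entry are hypothetical.

```python
import io


def parse_idx(lines):
    """Map locus name -> (start, end) byte positions from ddbj***.idx lines."""
    index = {}
    for line in lines:
        line = line.strip()
        # Assumption: entry lines begin with '#'; the '%***' header line
        # and blank lines carry no entry and are skipped.
        if not line.startswith("#"):
            continue
        locus, start, end = line[1:].split()
        index[locus] = (int(start), int(end))
    return index


def read_entry(seq_file, index, locus):
    """Return the raw bytes of one entry from an open binary .seq file."""
    start, end = index[locus]
    seq_file.seek(start)
    # Assumption: 'end' is inclusive, so the entry is end - start + 1 bytes.
    return seq_file.read(end - start + 1)


if __name__ == "__main__":
    # Tiny synthetic stand-in for a .seq file and its index (not DDBJ data).
    seq = io.BytesIO(b"LOCUS A\n//\nLOCUS B\n//\n")
    idx = parse_idx(["%*****", "#A 0 10", "#B 11 21"])
    print(read_entry(seq, idx, "B").decode("ascii"))
```

With a real release, one would open the .seq file in binary mode and pass
the lines of the matching .idx file to parse_idx. Note that if the files
were read from the fixed 80-byte tape records, the positions may need
adjusting; this sketch does not attempt that.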