DNA Data Bank of Japan
DNA Database Release 51, Sep. 2002, including
18,401,358 entries, 22,782,404,136 bases

This database may be copied and redistributed without permission on the condition that all the statements in this release note are reproduced in each copy.

The present release contains the newest data prepared by the DNA Data Bank of Japan (DDBJ), GenBank, and the European Molecular Biology Laboratory/European Bioinformatics Institute (EMBL/EBI) as of Aug. 28, 2002. This unified database was made possible by the international collaboration among the three data banks. All the entries have accordingly been annotated using the feature keys common to them. All the entries designated by accession numbers with the prefixes "C", "D", "E", "AB", "AG", "AK", "AP", "AT", "AU", "AV", "BB", "BD", "BJ" and "BP" have been collected and processed by DDBJ, and the rest have been prepared by GenBank and EMBL/EBI.

A number of genome projects are going on worldwide. Among them, the human genome projects have probably been the most productive, yielding a large number of ordinary sequences, huge amounts of ESTs and large quantities of genome sequences. Thus, we have the human (HUM) division solely for human sequences and the primate (PRI) division for non-human primate sequences. Note that the EST division also contains human sequences. The present release does not have the ORG division. Thus, if you are interested in human mitochondrial sequences, for example, you are advised to refer to the HUM division. The HUM division in this release was recorded in 18 files, each of which has a storage capacity of 300 MB. The BCT, INV, PLN, PAT, VRL and ROD divisions were recorded in 4, 4, 5, 5, 2 and 4 files, respectively.

This release also includes a division (PAT) for patent data. The patent data are those which the Japanese Patent Office (JPO), the United States Patent and Trademark Office (USPTO), and the European Patent Office (EPO) collected and processed.
The accession numbers of the patent data collected by the Japanese Patent Office start with the prefix "E" or "BD", those collected and supplied by USPTO and GenBank start with "I" and "AR", respectively, and those collected and supplied by EPO and EMBL/EBI start with "A" and "AX", respectively. The entries with the prefixes "I", "AR", "A", "AX", "E" and "BD" were allocated to five files (ddbjpat1.seq to ddbjpat5.seq) in the DDBJ format. Note also that unauthorized use of the patent data may cause legal issues for which we take no responsibility.

In the present release, the SOURCE in the flat file was reviewed and, where necessary, revised in accordance with the unified taxonomy database common to the three data banks.

The number of ESTs has been increasing at an enormous rate and is expected to grow even more rapidly in the future. Therefore, EST data were stored in 136 files, each of which has the same storage capacity as a file of the HUM division.

The present release includes the GSS division. GSS stands for Genome Survey Sequence, which is similar to EST, except that GSS is genomic DNA whereas EST is cDNA. This division was recorded in 38 files in the same manner as the HUM division.

This release also includes the High Throughput Genomic Sequence (HTGS) data, which come mainly from genome project teams that deal with a clone as a sequencing unit. HTGS in this release were recorded in 37 files in the same manner as the HUM division.

The index files are not presented in this release except for ddbjacc.idx, ddbjgen.idx, ddbjjou.idx, and ddbjkey.idx. Instead, we have included a program with which to make the index files not presented in this release. For the use of the program, see the files seq2indexes.doc, seq2indexes.c, and seq2indexes.h in this release.

The present release contains amino acid sequences that were translated from the corresponding nucleotide sequences in our database.
In the translation we paid close attention to the fact that some species or organelles use codons different from the universal ones, and used the proper codon table. If you find an incorrect codon in a translated sequence, please let us know.

The three data banks include the item VERSION in the flat file, which indicates the version of a submitted nucleotide sequence (see Table 1). It is expressed like AB123456.1, in which the digit(s) after the period is the version number. VERSION was added because a released sequence is sometimes revised by the submitter, so that the accession number alone cannot specify the sequence in question, which causes trouble for the user. The number is increased by one every time a revised sequence is made public. Accordingly, the translated protein sequence is accompanied by a /protein_id, which is expressed like BAA12345.1, in which the digit(s) after the period is again a version number. The number is increased by one when the corresponding nucleotide sequence is revised, the protein sequence is changed as a result, and the revised protein sequence is made public.

We terminated the RNA division. The RNA data were redistributed according to the category of the organism. Therefore, you will find a human RNA sequence, for example, in the HUM division.

The present release includes a division, CON. The CON division shows the order of related sequences in a genome, expressed by join and the accession numbers of the sequences. The contents of the CON division are compiled by the three data banks, not by the data submitter. The current number of entries in this division is 10,043.

The present release also includes the HTC (High Throughput cDNA) division, which is defined as follows. This division is to include unfinished high throughput cDNA sequences, each of which has 5'UTR and 3'UTR at both ends and part of a coding region. The sequence may also include introns.
When the sequence is finished later, it moves to the corresponding taxonomic division. The sequence is accompanied by a keyword, HTC (High Throughput cDNA), which is dropped when the sequence is finished and moved to a taxonomic division.

From this release, TPA (Third Party Annotation) data are available.

The format of the file list at the end of this document has been changed.

This release is published by the following DDBJ staff.

General administration
    T. Gojobori, Y. Fukuma, Y. Katsube, C. Maruyama, K. Okuda, K. Okuno,
    H. Tsutsui (hold), Y. Ueda, T. Umezawa, A. Watanabe

Database construction
    Y. Tateno, S. Miyazaki, H. Aono, M. Ejima, M. Gojobori, A. Hashizume,
    T. Kosuge, Y. Maruyama, J. Mashima, N. Murakata, M. Okaneya, T. Okido,
    K. Sakai, M. Suzuki, H. Tsutsui

Database software development and management
    H. Sugawara, S. Miyazaki (hold), Y. Suzuki, Y. Fujisawa, H. Hashimoto,
    T. Iizuka (hold), N. Ishizaka, T. Kato, Ta. Koike, To. Koike, S. Kuroda,
    K. Mamiya, M. Matsuo, K. Mimura, S. Misu, N. Nishimiya, Y. Shigemoto,
    Y. Sugiyama, K. Suzuki, T. Takaki, K. Watanabe

System management
    K. Nishikawa, K. Ikeo, N. Hoshi, T. Iizuka, A. Kusakabe, M. Nagura,
    F. Sugiyama, Y. Sugisaki, K. Yoshioka

Editorial and public relations
    N. Saitou, K. Fukami-Kobayashi, H. Ichikawa, K. Ichikawa, T. Kawamoto,
    J. Kohira, S. Nagira, Y. Yamamoto

Center for Information Biology and DNA Data Bank of Japan
National Institute of Genetics
Mishima 411-8540, Japan
Phone:  +81 55 981 6853
FAX:    +81 55 981 6849
E-mail: ddbj@ddbj.nig.ac.jp      (for general inquiry)
        ddbjsub@ddbj.nig.ac.jp   (for data submission)
        ddbjupdt@ddbj.nig.ac.jp  (for updates and notification of publication)
WWW:    http://www.ddbj.nig.ac.jp/     (for DDBJ WWW server)
        http://sakura.ddbj.nig.ac.jp/  (for DDBJ sequence data submission system SAKURA)

Acknowledgement:
We are grateful to NCBI and EMBL/EBI for their firm friendship and excellent collaboration with us. We also thank the Japanese Patent Office for its steady cooperation with us.
The operation of DDBJ is supported by the Ministry of Education, Culture, Sports, Science and Technology, and we would gratefully note this here.

DDBJ Database Release History

Release  Date      Entries           Bases  Comments
     51  09/02  18,401,358  22,782,404,136
     50  06/02  17,260,693  20,158,357,982
     49  04/02  16,503,157  18,579,627,226
     48  01/02  15,016,100  16,197,713,855
     47  10/01  13,266,610  14,145,671,645
     46  07/01  12,313,759  13,037,646,166
     45  04/01  11,434,113  12,207,092,905  HTC division started
     44  01/01  10,165,597  11,136,298,841
     43  10/00   8,666,551  10,034,532,698
     42  07/00   7,554,995   8,880,721,093
     41  04/00   5,962,608   6,409,581,885  CON division started
     40  01/00   5,388,125   4,762,696,173  RNA division terminated
     39  10/99   4,810,773   3,728,000,562  NID and PID discarded
     38  07/99   4,294,369   3,098,519,597
     37  03/99   3,311,627   2,375,261,951  VERSION, /protein_id started
     36  01/99   3,073,166   2,190,425,560
     35  10/98   2,759,261   1,957,341,169
     34  07/98   2,412,785   1,708,580,623
     33  04/98   2,174,769   1,479,303,279
     32  01/98   1,956,669   1,300,950,613
     31  10/97   1,731,532   1,139,869,464  Adoption of the unified taxonomy database
     30  07/97   1,534,115     992,788,339  NID and PID terminated
     29  04/97   1,270,194     841,415,232
     28  01/97   1,154,120     756,785,219  HTG division started
                                            ORG division terminated
     27  10/96     936,697     608,103,057  GSS division started
     26  07/96     835,552     551,932,448
     25  04/96     744,490     499,300,364  /translation started
     24  01/96     637,508     431,771,652
     23  10/95     569,757     390,694,350
     22  07/95     437,588     322,982,425  HUM division started
     21  04/95     274,596     250,875,023
     20  01/95     239,689     231,299,557
     19  10/94     204,332     205,274,131
     18  07/94     185,230     192,473,021
     17  04/94     169,957     179,942,209
     16  01/94     154,626     165,017,628
     15  10/93     131,649     147,224,690
     14  07/93     120,350     138,686,333
     13  04/93     112,067     129,784,445
     12  01/93      97,683     120,815,244  EST division started
     11  07/92      65,693      84,839,075
     10  01/92      59,317      77,805,556  GenBank/EMBL inclusion started
      9  07/91       1,130       2,002,124
      8  01/91         879       1,573,442
      7  07/90         681       1,154,211
      6  01/90         496         841,236
      5  07/89         395         679,378
      4  01/89         302         535,985
      3  07/88         230         345,850
      2  01/88         142         199,392
      1  07/87          66         108,970  Started with DDBJ only
------------------------------------------------------------------------

This release covers 20 categories of organisms and others as follows:
------------------------------------------------------------------------------
ddbjbct.***    Category for bacteria
ddbjest.***    Category for EST (expressed sequence tag)
ddbjcon.***    Category for CON (Contig sequences)
ddbjhtc.***    Category for HTC (high throughput cDNA)
ddbjhtg.***    Category for HTG (high throughput genomic sequence)
ddbjhum.***    Category for human
ddbjgss.***    Category for GSS (Genome Survey Sequence)
ddbjinv.***    Category for invertebrates
ddbjmam.***    Category for mammals other than primates and rodents
ddbjpat.***    Category for patents
ddbjphg.***    Category for phages
ddbjpln.***    Category for plants
ddbjpri.***    Category for primates other than human
ddbjrod.***    Category for rodents
ddbjsts.***    Category for STS (sequence tagged site)
ddbjsyn.***    Category for synthetic DNAs
ddbjtpa.***    Category for TPA (Third Party Annotation)
ddbjuna.***    Category for unannotated sequences
ddbjvrl.***    Category for viruses
ddbjvrt.***    Category for vertebrates other than mammals
------------------------------------------------------------------------------

Each category then has the following nine files. Note that all the files except for ddbj***.seq are created by the user by use of seq2indexes as mentioned in the release note.
------------------------------------------------------------------------------
ddbj***.seq    List of an entry in DDBJ format, see Table 1.
ddbj***.acc    List of the accession numbers, see Table 2.
ddbj***.aut    List of the authors, see Table 3.
ddbj***.dir    List of the short directory in DDBJ style, see Table 4.
ddbj***.idx    List of indices, see Table 5.
ddbj***.jou    List of the journals, see Table 6.
ddbj***.key    List of the key words, see Table 7.
ddbj***.org    List of the species names, see Table 8.
ddbj***.sdr    List of the short directory in DDBJ style, see Table 9.
------------------------------------------------------------------------------

The format of LOCUS line in the flat file is changed as shown below to adjust to the GenBank format from the present release.
------------------------------------------------------------------------------
Old (-rel. 50):
LOCUS       AB000001      660 bp    DNA             PLN       01-FEB-2001

Present (rel. 51-):
LOCUS       AB000001                 660 bp    DNA     linear   PLN 01-FEB-2001

New format specification:
---------   --------
Positions   Contents
---------   --------
01-05       'LOCUS'
06-12       spaces
13-28       Locus name
29-29       space
30-40       Length of sequence, right-justified
41-41       space
42-43       bp
44-44       space
45-47       spaces, ss- (single-stranded), ds- (double-stranded), or
            ms- (mixed-stranded)
48-53       NA, DNA, RNA, tRNA (transfer RNA), rRNA (ribosomal RNA),
            mRNA (messenger RNA), uRNA (small nuclear RNA), snRNA, snoRNA.
            Left justified.
54-55       space
56-63       'linear' followed by two spaces, or 'circular'
64-64       space
65-67       The division code
68-68       space
69-79       Date, in the form dd-MMM-yyyy (e.g., 15-MAR-1991)
------------------------------------------------------------------------------

Table 1. Part of the contents in the file 'ddbjbct.seq'. This shows all pieces of information on one entry in DDBJ format.
------------------------------------------------------------------------------
LOCUS       D87069                   993 bp    mRNA    linear   BCT 14-APR-2000
DEFINITION  Escherichia coli mRNA for RNA polymerase sigma subunit, Truncated
            form of sigma-38, complete cds.
ACCESSION   D87069
VERSION     D87069.1
KEYWORDS    RNA polymerase sigma subunit, truncated form of sigma-38.
SOURCE      Escherichia coli (strain:W3110) cDNA to mRNA.
  ORGANISM  Escherichia coli
            Bacteria; Proteobacteria; gamma subdivision; Enterobacteriaceae;
            Escherichia.
REFERENCE   1  (bases 1 to 993)
  AUTHORS   Jishage,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (14-AUG-1996) to the DDBJ/EMBL/GenBank databases.
            Miki Jishage, National Institute of Genetics, Molecular Genetics;
            Yata 1111, Mishima, Shizuoka 411, Japan
            (E-mail:mjishage@lab.nig.ac.jp, Tel:0559-81-6742,
            Fax:0559-81-6746)
REFERENCE   2  (bases 1 to 993)
  AUTHORS   Jishage,M. and Ishihama,A.
  TITLE     Variation in RNA polymerase sigma subunit composition within
            different stocks of Escherichia coli starin W3110
  JOURNAL   Unpublished (1996)
REFERENCE   3
  AUTHORS   Ivanova,A., Renshaw,M., Guntaka,R. and Eisenstark,A.
  TITLE     DNA base sequence variability in katF (putative sigma factor)
            gene Escherichia coli
  JOURNAL   Nucleic Acids Res. 20, 5479-5480 (1992)
REFERENCE   4
  AUTHORS   Takayanagi,Y., Tanaka,K. and Takahashi,H.
  TITLE     Structure of the 5' upstream region and the regulation of the
            rpoS gene of Escherichia coli
  JOURNAL   Mol. Gen. Genet. 243, 525-531 (1994)
COMMENT
FEATURES             Location/Qualifiers
     source          1..993
                     /organism="Escherichia coli"
                     /sequenced_mol="cDNA to mRNA"
                     /strain="W3110"
     CDS             1..810
                     /note="the gene has four single base changes, resulting
                     in two amino acid substitutions and an amber mutation"
                     /product="RNA polymerase sigma subunit, truncated form of
                     sigma-38"
                     /protein_id="BAA13238.1"
                     /transl_table=11
                     /translation="MSQNTLKVHDLNEDAEFDENGVEVFDEKALVEYEPSDNDLAEEE
                     LLSQGATQRVLDATQLYLGEIGYSPLLTAEEEVYFARRALRGDVASRRRMIESNLRLV
                     VKIARRYGNRGLALLDLIEEGNLGLIRAVEKFDPERGFRFSTYATWWIRQTIERAIMN
                     QTRTIRLPIHIVKELNVYLRTARELSHKLDHEPSAEEIAEQLDKPVDDVSRMLRLNER
                     ITSVDTPLGGDSEKALLDILADEKENGPEDTTQDDDMKQSIVKWLFELNAK"
     variation       75
                     /citation=[3]
                     /replace="t"
     variation       97
                     /citation=[3]
                     /replace="t"
     variation       99
                     /citation=[3]
                     /replace="t"
     variation       808
                     /citation=[3]
                     /replace="t"
BASE COUNT      254 a    223 c    291 g    225 t      0 others
ORIGIN
        1 atgagtcaga atacgctgaa agttcatgat ttaaatgaag atgcggaatt tgatgagaac
       61 ggagttgagg tttttgacga aaaggcctta gtagaatatg aacccagtga taacgatttg
      121 gccgaagagg aactgttatc gcagggagcc acacagcgtg tgttggacgc gactcagctt
      181 taccttggtg agattggtta ttcaccactg ttaacggccg aagaagaagt ttattttgcg
      241 cgtcgcgcac tgcgtggaga tgtcgcctct cgccgccgga tgatcgagag taacttgcgt
      301 ctggtggtaa aaattgcccg ccgttatggc aatcgtggtc tggcgttgct ggaccttatc
      361 gaagagggca acctggggct gatccgcgcg gtagagaagt ttgacccgga acgtggtttc
      421 cgcttctcaa catacgcaac ctggtggatt cgccagacga ttgaacgggc gattatgaac
      481 caaacccgta ctattcgttt gccgattcac atcgtaaagg agctgaacgt ttacctgcga
      541 accgcacgtg agttgtccca taagctggac catgaaccaa gtgcggaaga gatcgcagag
      601 caactggata agccagttga tgacgtcagc cgtatgcttc gtcttaacga gcgcattacc
      661 tcggtagaca ccccgctggg tggtgattcc gaaaaagcgt tgctggacat cctggccgat
      721 gaaaaagaga acggtccgga agataccacg caagatgacg atatgaagca gagcatcgtc
      781 aaatggctgt tcgagctgaa cgccaaatag cgtgaagtgc tggcacgtcg attcggtttg
      841 ctggggtacg aagcggcaac actggaagat gtaggtcgtg aaattggcct cacccgtgaa
      901 cgtgttcgcc agattcaggt tgaaggcctg cgccgtttgc gcgaaatcct gcaaacgcag
      961 gggctgaata tcgaagcgct gttccgcgag taa
//
------------------------------------------------------------------------------

Table 2. Part of the contents in the file 'ddbjbct.acc'. The first column refers to the secondary accession number, second column to the locus name, and third to the primary accession number. The primary number may be the same as the secondary number. They are arranged in the ascending order of the secondary accession numbers.
------------------------------------------------------------------------------
D00001 -> ECOPBPAA   X04516
D00002 -> ECOPYRH    X04469
D00006 -> PNS981TET  D00006
D00020 -> COLE2LYS   D00020
D00021 -> COLE31YS   D00021
D00038 -> BRLAM330   D00038
D00066 -> BAC139AC   D00066
D00067 -> ECONANA    M20207
D00069 -> ECOUVRD2   D00069
D00087 -> BACXYNAA   D00087
------------------------------------------------------------------------------

Table 3. Part of the contents in the file 'ddbjbct.aut'. For each author name given on the left to the arrow, the corresponding locus name and primary accession number are respectively listed on the right. They are arranged in the alphabetical order of the author names.
------------------------------------------------------------------------------
Aan,F.              -> STYCRR     X05210
Aan,F.              -> STYENZI    M76176
Aaronson,W.         -> ECOKPSD    M64977
Aaronson,W.         -> ECONEUA    J05023
Abad-Lapuebla,M.A.  -> VIBTDHI    D90238
Abdel-Mawgood,A.L.  -> CYAPSBHA   X16394
Abdel-Meguid,S.S.   -> TRNGDRECM  J01843
Abdelal,A.          -> STYCARA    M36540
Abdelal,A.          -> STYCARAB   X13200
Abdelal,A.H.        -> PSENOSA    M60717
------------------------------------------------------------------------------

Table 4. Part of the short directory in DDBJ style in the file 'ddbjbct.dir'. For each locus name given in the first column, the corresponding primary accession number, molecular type, number of nucleotide pairs, and description for the locus are respectively listed. They are arranged in the alphabetical order of the locus names.
------------------------------------------------------------------------------
ABCAARAA   M34830  ds-DNA  1624  A.aceti acetic acid resistance protein (aarA)
                                 gene, complete cds.
ABCADHCC   D00635  ds-DNA  4230  A. polyoxogenes alcohol dehydrogenase
                                 (EC 1.1.99.8) and cytochrome c genes.
ABCALDH    D00521  ds-DNA  2683  A.polyoxogenes membrane-bound aldehyde
                                 dehydrogenase gene, complete cds and flanks.
ABCBCSAA   M37202  ds-DNA  9540  A.xylinum bcs B, bcs C and bcs D genes,
                                 complete cds and bcs A gene, partial cds.
ABCCELA    M76548  ds-DNA  1165  Acetobacter xylinum UDP pyrophosphorylase
                                 (celA) gene, complete cds.
ABCCELSYN  X54676  ds-DNA  5363  A. xylinum gene for cellulose biosynthesis
ABCIS1380  D10043  ds-DNA  1665  A.pasteurianus insertion sequence IS1380.
ACAADH1    D90004  ds-DNA  2467  Acetobacter aceti(K6033) alcohol
                                 dehydrogenase subunit gene(adh1).
ACCAAC2    M62833  ds-DNA  1123  Acinetobacter baumannii aminoglycoside
                                 acetyltransferase (aac2) gene, complete cds.
ACCACEAA   M62822  ds-DNA  1874  A.baumannii chloramphenicol acetyltransferase
                                 (cat) gene, complete cds.
------------------------------------------------------------------------------

Table 5. Part of the contents in the file 'ddbjbct.idx'.
The first column refers to the locus name, second column to the starting site of the locus in byte, and third to its ending site in byte. They are arranged in the alphabetical order of the locus names.
------------------------------------------------------------------------------
%*****************************
#ABCAARAA       0   3211
#ABCADHCC    3212  10608
#ABCALDH    10609  15864
#ABCBCSAA   15865  29583
#ABCCELA    29584  32289
#ABCCELSYN  32290  40960
#ABCIS1380  40961  44711
#ACAADH1    44712  49357
#ACCAAC2    49358  52395
------------------------------------------------------------------------------

Table 6. Part of the contents in the file 'ddbjbct.jou'. This gives information on the journal in which sequence data were published.
------------------------------------------------------------------------------
(in) Chaloupka,J. and Krumphanzl,V. (Eds.); Extracellular Enzymes of
    Microorganisms: 129-137, Plenum Press, New York (1987) -> BACAMYABS M57457
(in) Ganesan,A.T., Chang,S. and Hoch,J.A. (Eds.); Molecular Cloning and Gene
    Regulation in Bacilli: 3-10, Academic Press, New York (1982) -> BACRG16S M55011
(in) Ganesan,A.T., Chang,S. and Hoch,J.A. (Eds.); Molecular Cloning and Gene
    Regulation in Bacilli: 3-10, Academic Press, New York (1982) -> BACRG16SA M55006
(in) Ganesan,A.T., Chang,S. and Hoch,J.A. (Eds.); Molecular Cloning and Gene
    Regulation in Bacilli: 3-10, Academic Press, New York (1982) -> BACRG16SB M55008
(in) Hoch,J.A. and Setlow,P. (Eds.); Molecular Biology of Microbial
    Differentiation: 85-94, American Society for Microbiology, Washington, DC
    (1985) -> BACSPOII M57606
(in) Holmgren,A. (Ed.); Thioredoxin and Glutaredoxin Systems: Structure and
    Function: 11-19, Unknown name, Unknown city (1986) -> ECOTRXA1 M54881
(in) Kjeldgaard,N.C. and Maaloe,O. (Eds.); Control of ribosome synthesis:
    138-143, Academic Press, New York (1976) -> ECOLAC J01636
(in) Losick,R. and Chamberlin,M. (Eds.); RNA polymerase: 455-472, Cold Spring
    Harbor Laboratory, Cold Spring Harbor, NY (1976) -> ECOTGY1 K01197
(in) Sikes,C.S. and Wheeler,A.P. (Eds.); Surface reactive peptides and
    polymers. Discovery and commercialization.: 186-200, American Chemical
    Society, Washington, D.C. (1991) -> ECOTGP J01714
(in) Sund,H. and Blauer,G. (Eds.); Protein-Ligand Interactions: 193-207,
    Walter de Gruyter, New York (1975) -> ECOLAC J01636
(in) Wu,R. and Grossman,L. (Eds.); Methods in Enzymology, Recombinant DNA,
    part E: In press, Academic Press, New York, N.Y. (1986) -> PLMCG M11320
Acta Microbiol. Pol. 35, 175-190 (1986) -> ECOTGG1 M54893
Actinomycetologica 5, 14-17 (1991) -> STMARGG D00799
Adv. Biophys. 21, 115-133 (1986) -> R10REP M26840
Adv. Biophys. 21, 175-192 (1986) -> ECONUSAA M26839
Adv. Enzyme Regul. 21, 225-237 (1983) -> ECOPURFA M26893
Adv. Exp. Med. Biol. 195, 239-246 (1986) -> ECOAPT M14040
Agric. Biol. Chem. 50, 2155-2158 (1986) -> ECONANA M20207
Agric. Biol. Chem. 50, 2771-2778 (1986) -> BRLAM330 D00038
Agric. Biol. Chem. 51, 2019-2022 (1987) -> BACCGT D00129
Agric. Biol. Chem. 51, 2641-2648 (1987) -> STRSAGP D00219
Agric. Biol. Chem. 51, 2807-2809 (1987) -> BACPGECR M35503
Agric. Biol. Chem. 51, 3133-3135 (1987) -> BACXYLAP D00312
Agric. Biol. Chem. 51, 455-463 (1987) -> BACHDCRY D00117
Agric. Biol. Chem. 51, 953-955 (1987) -> BACXYNAA D00087
Agric. Biol. Chem. 52, 1565-1573 (1988) -> BACIP135 D00348
Agric. Biol. Chem. 52, 1785-1789 (1988) -> BACTMR D00343
Agric. Biol. Chem. 52, 2243-2246 (1988) -> PSEGI D00342
Agric. Biol. Chem. 52, 399-406 (1988) -> BACAMYEB M35517
Agric. Biol. Chem. 52, 479-487 (1988) -> ECAPALI D00217
------------------------------------------------------------------------------

Table 7. Part of the contents in the file 'ddbjbct.key'. For the locus and accession number respectively given on the right to the arrow, the corresponding key words are listed on the left.
------------------------------------------------------------------------------
A.aceti acetic acid resistance protein (aarA) gene, complete cds.
    -> ABCAARAA M34830
acetic acid resistance protein.
    -> ABCAARAA M34830
Cloning of genes responsible for acetic acid resistance in acetobacter aceti
    -> ABCAARAA M34830
A. polyoxogenes alcohol dehydrogenase (EC 1.1.99.8) and cytochrome c genes.
    -> ABCADHCC D00635
alcohol dehydrogenase; cytochrome c.
    -> ABCADHCC D00635
Cloning and sequencing of the gene cluster encoding two subunits of
membrane-bound alcohol dehydrogenase from Acetobacter polyoxogenes
    -> ABCADHCC D00635
These data kindly submitted in computer readable form by:
Toshimi Tamaki
Nakano Central Biochemical Institute
2-6 Nakamura-cho
Handa-shi, Aichi-ken 475
Japan
Phone: 0569-21-3331
Fax: 0569-23-8486
    -> ABCADHCC D00635
A.polyoxogenes membrane-bound aldehyde dehydrogenase gene, complete cds and
flanks.
    -> ABCALDH D00521
aldehyde dehydrogenase gene; ethanol oxidation; membrane-bound enzyme.
    -> ABCALDH D00521
Nucleotide sequence of the membrane-bound aldehyde dehydrogenase gene from
Acetobacter polyoxogenes
    -> ABCALDH D00521
------------------------------------------------------------------------------

Table 8. Part of the contents in the file 'ddbjbct.org'. For the locus and accession number respectively given on the right to the arrow, the corresponding taxonomic names are listed on the left. They are arranged in the alphabetical order of the species names.
------------------------------------------------------------------------------
A. nidulans 6301 DNA.
Anacystis nidulans
Prokaryota; Bacteria; Gracilicutes; Oxyphotobacteria; Cyanobacteria.
    -> ANIRUBPS X00019
A. nidulans DNA, clone pAN4.
Anacystis nidulans
Prokaryota; Bacteria; Gracilicutes; Oxyphotobacteria; Cyanobacteria.
    -> ANIRGGX X00343
A. nidulans DNA.
Anacystis nidulans
Prokaryota; Bacteria; Gracilicutes; Oxyphotobacteria; Cyanobacteria.
    -> ANIRGG X00512
A. polyoxogenes genomic DNA.
Acetobacter polyoxogenes
Prokaryota; Bacteria; Gracilicutes; Scotobacteria; Aerobic rods and cocci;
Azotobacteraceae.
    -> ABCADHCC D00635
A. quadruplicatum (strain PR-6) DNA, clone pAQPR1.
Agmenellum quadruplicatum
Prokaryota; Bacteria; Gracilicutes; Oxyphotobacteria; Cyanobacteria.
    -> AQUPCAB K02660
A. quadruplicatum (strain PR6) DNA.
Agmenellum quadruplicatum
Prokaryota; Bacteria; Gracilicutes; Oxyphotobacteria; Cyanobacteria.
    -> AQUCPCAB K02659
A. vinelandii DNA.
Azotobacter vinelandii
Prokaryota; Bacteria; Gracilicutes; Scotobacteria; Aerobic rods and cocci;
Azotobacteraceae.
    -> AVINIFUSV M17349
A.aceti (strain 10-8) DNA, clone pAR1611.
Acetobacter aceti
Prokaryota; Bacteria; Gracilicutes; Scotobacteria; Aerobic rods and cocci;
Azotobacteraceae.
    -> ABCAARAA M34830
A.actinomycetemcomitans (strain JP2) DNA, clone lambda-OP8.
Actinobacillus actinomycetemcomitans
Prokaryota; Bacteria; Gracilicutes; Scotobacteria; Facultatively anaerobic
rods; Pasteurellaceae.
    -> ACNLKTXN M27399
A.anitratum DNA, clone pLJD1.
Acinetobacter anitratum
Prokaryota; Bacteria; Gracilicutes; Scotobacteria; Neisseriaceae.
    -> ACCCITSYN M33037
------------------------------------------------------------------------------

Table 9. Part of the short directory file in DDBJ style in the file 'ddbjbct.sdr'. The short directory file contains brief descriptions of all of the sequence entries contained in the DDBJ style.
------------------------------------------------------------------------------
ABCAARAA   A.aceti acetic acid resistance protein (aarA) gene, complete  1624bp
ABCADHCC   A. polyoxogenes alcohol dehydrogenase (EC 1.1.99.8) and       4230bp
ABCALDH    A.polyoxogenes membrane-bound aldehyde dehydrogenase gene,    2683bp
ABCBCSABCD A.xylinum bcs A, B, C and D genes, complete cds's.            9540bp
ABCCELA    Acetobacter xylinum UDP pyrophosphorylase (celA) gene,        1165bp
ABCCELSYN  A. xylinum gene for cellulose biosynthesis                    5363bp
ABCIS1380  A.pasteurianus insertion sequence IS1380.                     1665bp
ACAADH1    Acetobacter aceti(K6033) alcohol dehydrogenase subunit        2467bp
ACCAAC2    Acinetobacter baumannii aminoglycoside acetyltransferase      1123bp
ACCACEAA   A.baumannii chloramphenicol acetyltransferase (cat) gene,     1874bp
ACCAPHA6   Acinetobacter baumannii aphA-6 gene.                          1170bp
ACCBENABCA A.calcoaceticus BenA, BenB, BenC, BenD, and BenE proteins    15922bp
ACCCAT     Acinetobacter calcoaceticus cat operon.                      15922bp
ACCCATAM   A.calcoaceticus catA and catM genes, encoding catechol 1,     5537bp
ACCCHMO    Acinetobacter sp. cyclohexanone monooxygenase gene, complete  2128bp
ACCCITSYN  A.anitratum citrate synthase gene, complete cds.              1895bp
------------------------------------------------------------------------------

In addition to the 9 tables the four following index files are included in this release. These files were prepared irrespective of the 10 categories of taxonomic divisions.

    Accession number index file
    Keyword phrase index file
    Journal citation index file
    Gene name index file

A brief description is given for each file in the following.

Table 10. Part of the accession number index file in the 'ddbjacc.idx'. The following excerpt from the accession number index file illustrates the format of the index.
------------------------------------------------------------------------------
D00100
        PSEASPAA    BCT  D00100
D00101
        RABNP450R   MAM  D00101
D00102
        HUMLTX      HUM  D00102
D00103
        AFARRN5SA   BCT  D00103
        AFRRN5SA    BCT  X05517
D00104
        AFARRN5SB   BCT  D00104
        AFRRN5SB    BCT  X05518
D00105
        AFARRN5S    BCT  D00105
        ASRRN5S     BCT  X05524
D00106
        ACH5SRR     BCT  D00106
        AXRRN5S     BCT  X05522
        AXRRN5SA    BCT  X05523
D00107
        ACH5SRRX    BCT  D00107
        ACRRN5S     BCT  X05521
------------------------------------------------------------------------------

Table 11. Part of the keyword phrase index file in the 'ddbjkey.idx'. Keyword phrases consist of names for gene products and other characteristics of sequence entries.
------------------------------------------------------------------------------
A CHANNEL
        DROCHA      INV  M17155
A COMPONENT
        SQLCVEA     VRL  M38183
A LOCUS
        GORGOGOA3   PRI  X54375
        GORGOGOA4   PRI  X54376
A LOCUS ALLELE
        GORA0101    PRI  X60258
        GORA0201    PRI  X60259
        GORA0401    PRI  X60257
        GORA0501    PRI  X60256
A MULTI-GENE FAMILY
        RICGLUTE    PLN  D00584
A PROTEIN
        MS2AAR      PHG  M25187
        ST1APCS     PHG  M25396
A SEQUENCE
        HS5TOA30    VRL  D00148
        HS5TOA31    VRL  D00147
------------------------------------------------------------------------------

Table 12. Part of the journal citation index file in 'ddbjjou.idx'. The journal citation index file lists all of the citations that appear in the references.
------------------------------------------------------------------------------
ACTA BIOCHIM. BIOPHYS. SIN. 23, 246-253 (1992)
        HUMPLASINS  HUM  M98056
ACTA BIOCHIM. BIOPHYS. SIN. 28, 233-239(1996)
        TKTII       PLN  X82230
ACTA BIOCHIM. POL. 24, 301-318 (1977)
        LUPTRFJ     PLN  K00345
        LUPTRFN     PLN  K00346
ACTA BIOCHIM. POL. 26, 369-381(1979)
        HVTRNPHE    PLN  X02683
ACTA BIOCHIM. POL. 29, 143-149 (1982)
        EMEMTA      PLN  M32572
        EMEMTB      PLN  M32573
        EMEMTC      PLN  M32574
        EMEMTD      PLN  M32575
        EMEMTE      PLN  M32576
ACTA BIOCHIM. POL. 34, 21-27 (1987)
        LUPNOSP     PLN  M32571
------------------------------------------------------------------------------

Table 13. Part of the gene name index file in 'ddbjgen.idx'. This file lists all the gene names that appear in the feature table.
------------------------------------------------------------------------------
AACC8
        STMAACC8    BCT  M55426
AACC9
        MPUAACC9    BCT  M55427
AACT
        HUMA1ACM    PRI  K01500
        HUMA1ACMA   PRI  X00947
        HUMA1ACMB   PRI  M18035
        HUMAACT1    PRI  M18906
        HUMAACT2    PRI  M22533
        HUMAACTA    PRI  J05176
AAD
        INTINTORF   BCT  L06418
        LMOMO229D   BCT  X17478
AAD A1
        ENTAAC3VI   BCT  M88012
AAD9
        ENEAAD9A    BCT  M69221
AADA
        LMOMO229A   BCT  X17479
        S52249      BCT  S52249
        SYNAADA     SYN  M60473
        TRNTAAB     BCT  M55547
        TRNTN21CAS  BCT  M86913
------------------------------------------------------------------------------

The files in this release are arranged in the following order with non-labeled format.

Category     number of  number of  file name      number of
               entries      bases                   records
Release note                       ddbjrel.txt          978
bacteria1        20898  124093035  ddbjbct1.seq     4750893
bacteria2        57803  112582455  ddbjbct2.seq     5144634
bacteria3        26815  123583758  ddbjbct3.seq     4694903
bacteria4        49687  108888142  ddbjbct4.seq     4694334
CON              11043             ddbjcon.seq       317585
EST1             93361   34680539  ddbjest1.seq     5520042
EST2             96179   39213634  ddbjest2.seq     5560235
EST3             97573   37850545  ddbjest3.seq     5558683
EST4             90860   27855103  ddbjest4.seq     5479893
EST5             97174   38248988  ddbjest5.seq     5574051
EST6            101524   40244726  ddbjest6.seq     5623524
EST7            100285   38790723  ddbjest7.seq     5582141
EST8             99357   38413734  ddbjest8.seq     5560728
EST9            100863   39973707  ddbjest9.seq     5614091
EST10           101625   39784508  ddbjest10.seq    5588396
EST11            99014   41098307  ddbjest11.seq    5544936
EST12           101580   44262152  ddbjest12.seq    5592274
EST13           107752   43182431  ddbjest13.seq    5628886
EST14           103367   41877424  ddbjest14.seq    5587017
EST15            99357   41675486  ddbjest15.seq    5546193
EST16            95353   42169459  ddbjest16.seq    5535612
EST17           101120   41865612  ddbjest17.seq    5610824
EST18            99222   43552555  ddbjest18.seq    5582223
EST19            95230   40104166  ddbjest19.seq    5542300
EST20            99998   42227779  ddbjest20.seq    5570121
EST21           126842   59175602  ddbjest21.seq    5632453
EST22            92670   65445227  ddbjest22.seq    5258086
EST23           116844   70253201  ddbjest23.seq    5346092
EST24           126266   65737256  ddbjest24.seq    5414988
EST25           123042   57944450  ddbjest25.seq    5738517
EST26           118355   56471784  ddbjest26.seq    5683263
EST27            89491   24252959  ddbjest27.seq    5538349
EST28            93085   25172728  ddbjest28.seq    5552726
EST29            69852   20796195  ddbjest29.seq    5287413
EST30            59153   16520695  ddbjest30.seq    5206134
EST31            58899   15514896  ddbjest31.seq    5203254
EST32           113003   48705493  ddbjest32.seq    5697728
EST33           113333   53574125  ddbjest33.seq    5562247
EST34            96151   49565352  ddbjest34.seq    5305048
EST35           115824   59147871  ddbjest35.seq    5511303
EST36           115489   56951684  ddbjest36.seq    5601760
EST37            93735   39773899  ddbjest37.seq    5496788
EST38            94630   41993056  ddbjest38.seq    5483613
EST39            93435   39306529  ddbjest39.seq    5478788
EST40           108180   43130473  ddbjest40.seq    5751961
EST41            92504   37202720  ddbjest41.seq    5530819
EST42            87417   39298437  ddbjest42.seq    5417683
EST43           100995   46086222  ddbjest43.seq    5651824
EST44            98425   40667288  ddbjest44.seq    5577620
EST45           101053   37650684  ddbjest45.seq    5637942
EST46            93002   39896307  ddbjest46.seq    5496640
EST47            59232   16614393  ddbjest47.seq    5147846
EST48            58102   18044036  ddbjest48.seq    5119350
EST49            59038   17631385  ddbjest49.seq    5105674
EST50            58767   18868190  ddbjest50.seq    5110630
EST51            58709   17982957  ddbjest51.seq    5122160
EST52            58817   17832461  ddbjest52.seq    5120383
EST53            59794   17639272  ddbjest53.seq    5110145
EST54            60193   18870088  ddbjest54.seq    5097142
EST55            59763   19175149  ddbjest55.seq    5100297
EST56            60416   19081185  ddbjest56.seq    5239810
EST57            55730   34349112  ddbjest57.seq    5104634
EST58            53170   21968828  ddbjest58.seq    5057576
EST59            53070   24125678  ddbjest59.seq    5043698
EST60            53281   22242000  ddbjest60.seq    5050701
EST61            63392   25872862  ddbjest61.seq    5158694
EST62            98189   41289021  ddbjest62.seq    5636796
EST63            97423   40602489  ddbjest63.seq    5566821
EST64           102138   57376035  ddbjest64.seq    5541384
EST65           102079   56018432  ddbjest65.seq    5527955
EST66           101609   47456420  ddbjest66.seq    5592953
EST67            94928   55357530  ddbjest67.seq    5443265
EST68            96819   44016923  ddbjest68.seq    5544306
EST69            93710   53095765  ddbjest69.seq    5462763
EST70           102095   59638156  ddbjest70.seq    5563296
EST71            87178   43721953  ddbjest71.seq    5417866
EST72            96781   50001592  ddbjest72.seq    5514174
EST73            96033   61052102  ddbjest73.seq    5470757
EST74            92655   59998037  ddbjest74.seq    5398293
EST75            95306   43999724  ddbjest75.seq    5566226
EST76            91555   41471172  ddbjest76.seq    5496803
EST77            86585   50210181  ddbjest77.seq    5349835
EST78            91809   53011053  ddbjest78.seq    5453681
EST79            97293   45436050  ddbjest79.seq    5568457
EST80            97996   43350172  ddbjest80.seq    5568450
EST81            98127   42313660  ddbjest81.seq    5579207
EST82            93448   46042221  ddbjest82.seq    5458721
EST83            98722   57162576  ddbjest83.seq    5470780
EST84           104398   58733468  ddbjest84.seq    5530281
EST85            90293   58970950  ddbjest85.seq    5378682
EST86            95328   60245909  ddbjest86.seq    5479466
EST87            91424   58717813  ddbjest87.seq    5349729
EST88            97587   57784048  ddbjest88.seq    5474161
EST89            96887   61372261  ddbjest89.seq    5524763
EST90            95061   63650909  ddbjest90.seq    5410833
EST91           100658   57825660  ddbjest91.seq    5578523
EST92           100265   38632663  ddbjest92.seq    5631885
EST93           106621   61192528  ddbjest93.seq    5596211
EST94            98772   58208090  ddbjest94.seq    5544228
EST95            86527   42922324  ddbjest95.seq    5465324
EST96            92804   49952932  ddbjest96.seq    5464084
EST97            89023   50020347  ddbjest97.seq    5441854
EST98            93558   56671568  ddbjest98.seq    5504725
EST99            91379   55675303  ddbjest99.seq    5423494
EST100           94938   54335476  ddbjest100.seq   5434138
EST101           90943   56388291  ddbjest101.seq   5424080
EST102           81639   47085469  ddbjest102.seq   5272655
EST103          118087   64353087  ddbjest103.seq   5700183
EST104           96386   53536354  ddbjest104.seq   5449541
EST105          121556   64336482  ddbjest105.seq   5688348
EST106          117746   63930818  ddbjest106.seq   5634896
EST107           97306   58038420  ddbjest107.seq   5423188
EST108           85368   40595296  ddbjest108.seq   5409492
EST109           77683   38914719  ddbjest109.seq   5269575
EST110           83104   41381008  ddbjest110.seq   5366017
EST111           72181   37516038  ddbjest111.seq   5209096
EST112          104016   69047502  ddbjest112.seq   5625590
EST113           87195   56729679  ddbjest113.seq   5371790
EST114           96701   55567702  ddbjest114.seq   5658730
EST115           80942   40107817  ddbjest115.seq   5357646
EST116           86973   50373862  ddbjest116.seq   5432482
EST117 85813 55309254 ddbjest117.seq 5353538 EST118 92484 52422728 ddbjest118.seq 5411985 EST119 83999 55157150 ddbjest119.seq 5286712 EST120 89391 49192170 ddbjest120.seq 5422488 EST121 85391 52464007 ddbjest121.seq 5399860 EST122 93767 37969199 ddbjest122.seq 5535172 EST123 95148 55576434 ddbjest123.seq 5486861 EST124 87568 45681691 ddbjest124.seq 5386839 EST125 90816 58832662 ddbjest125.seq 5423619 EST126 91391 60305611 ddbjest126.seq 5397569 EST127 89695 48052191 ddbjest127.seq 5361957 EST128 91738 74040017 ddbjest128.seq 5366284 EST129 92643 46983727 ddbjest129.seq 5459882 EST130 124382 48889682 ddbjest130.seq 5755416 EST131 115340 37445818 ddbjest131.seq 5794033 EST132 94517 34382583 ddbjest132.seq 5565929 EST133 96663 34707202 ddbjest133.seq 5534557 EST134 101009 35028529 ddbjest134.seq 5668808 EST135 96964 37300831 ddbjest135.seq 5555009 EST136 84936 32413609 ddbjest136.seq 4915468 GSS1 102673 76493900 ddbjgss1.seq 5645307 GSS2 102099 72075998 ddbjgss2.seq 5645480 GSS3 105878 78510523 ddbjgss3.seq 5238925 GSS4 84944 70959633 ddbjgss4.seq 5216823 GSS5 83034 74726781 ddbjgss5.seq 5189841 GSS6 79260 65375001 ddbjgss6.seq 5122801 GSS7 116086 49995029 ddbjgss7.seq 5512884 GSS8 121988 50423339 ddbjgss8.seq 6093529 GSS9 115038 55846514 ddbjgss9.seq 5909088 GSS10 105879 54929100 ddbjgss10.seq 5826345 GSS11 101939 51750101 ddbjgss11.seq 5771218 GSS12 105364 52669327 ddbjgss12.seq 5838076 GSS13 98870 50217470 ddbjgss13.seq 5695889 GSS14 96932 54087812 ddbjgss14.seq 5634187 GSS15 96105 49625087 ddbjgss15.seq 5614591 GSS16 94865 50748377 ddbjgss16.seq 5590656 GSS17 96956 46308984 ddbjgss17.seq 5673985 GSS18 94179 45304835 ddbjgss18.seq 5643918 GSS19 100146 56091519 ddbjgss19.seq 5693629 GSS20 86851 40339182 ddbjgss20.seq 5538007 GSS21 75130 38649101 ddbjgss21.seq 5338949 GSS22 76820 34296585 ddbjgss22.seq 5370875 GSS23 84816 50553587 ddbjgss23.seq 5415052 GSS24 78154 35335392 ddbjgss24.seq 5380219 GSS25 90147 53584892 ddbjgss25.seq 5563858 GSS26 82203 32744585 
ddbjgss26.seq 5466537 GSS27 83065 40182164 ddbjgss27.seq 5481651 GSS28 84630 42980543 ddbjgss28.seq 5457054 GSS29 97127 51944450 ddbjgss29.seq 5711778 GSS30 93961 61335543 ddbjgss30.seq 5609371 GSS31 104953 51120170 ddbjgss31.seq 5835623 GSS32 101727 55262559 ddbjgss32.seq 5846040 GSS33 118720 73399913 ddbjgss33.seq 5947841 GSS34 120278 73843834 ddbjgss34.seq 5951361 GSS35 120518 70348759 ddbjgss35.seq 5929073 GSS36 113804 47722471 ddbjgss36.seq 5799435 GSS37 115829 53646382 ddbjgss37.seq 6006865 GSS38 24566 10956746 ddbjgss38.seq 1192906 HTC 37975 46911628 ddbjhtc.seq 3591485 HTG1 1574 228027671 ddbjhtg1.seq 3994603 HTG2 3380 225194996 ddbjhtg2.seq 4023263 HTG3 2481 225562192 ddbjhtg3.seq 4015417 HTG4 2435 226865319 ddbjhtg4.seq 4008100 HTG5 1548 224769976 ddbjhtg5.seq 4016716 HTG6 1486 225371631 ddbjhtg6.seq 4012405 HTG7 1486 225223142 ddbjhtg7.seq 4014559 HTG8 1451 226285948 ddbjhtg8.seq 4003355 HTG9 1459 226660565 ddbjhtg9.seq 4004179 HTG10 1741 224140474 ddbjhtg10.seq 4012824 HTG11 1446 225486425 ddbjhtg11.seq 3997642 HTG12 1497 220993310 ddbjhtg12.seq 4012251 HTG13 1496 222205667 ddbjhtg13.seq 4009220 HTG14 1760 220500203 ddbjhtg14.seq 4024743 HTG15 2110 216805875 ddbjhtg15.seq 4057696 HTG16 1709 221566753 ddbjhtg16.seq 4020122 HTG17 1660 221414302 ddbjhtg17.seq 4018081 HTG18 1576 222415155 ddbjhtg18.seq 4010332 HTG19 1532 222615022 ddbjhtg19.seq 4011961 HTG20 1554 221644309 ddbjhtg20.seq 4013173 HTG21 1461 223255310 ddbjhtg21.seq 4008665 HTG22 1522 223145062 ddbjhtg22.seq 4009428 HTG23 1602 222635867 ddbjhtg23.seq 4013335 HTG24 1666 221381834 ddbjhtg24.seq 4018824 HTG25 1732 220541660 ddbjhtg25.seq 4025091 HTG26 1392 227312725 ddbjhtg26.seq 3993900 HTG27 1526 224833329 ddbjhtg27.seq 4005230 HTG28 1392 226015943 ddbjhtg28.seq 3999727 HTG29 1392 225293983 ddbjhtg29.seq 3998841 HTG30 1461 222879158 ddbjhtg30.seq 4002759 HTG31 1626 222931833 ddbjhtg31.seq 4010684 HTG32 1782 223903646 ddbjhtg32.seq 4014595 HTG33 1336 228516080 ddbjhtg33.seq 3992591 HTG34 1257 
231863307 ddbjhtg34.seq 3969186 HTG35 1291 230537606 ddbjhtg35.seq 3983793 HTG36 1592 232052374 ddbjhtg36.seq 3968330 HTG37 178 18264762 ddbjhtg37.seq 312547 human1 10677 195797121 ddbjhum1.seq 4365739 human2 1580 213722404 ddbjhum2.seq 4205499 human3 1542 215869417 ddbjhum3.seq 4173824 human4 1357 206910652 ddbjhum4.seq 4285031 human5 1463 216834471 ddbjhum5.seq 4163456 human6 1469 203497490 ddbjhum6.seq 4313233 human7 1562 210639404 ddbjhum7.seq 4238060 human8 1605 206985685 ddbjhum8.seq 4282013 human9 1730 209667498 ddbjhum9.seq 4252919 human10 24266 180872744 ddbjhum10.seq 4558411 human11 70097 120881597 ddbjhum11.seq 5025658 human12 8769 192110144 ddbjhum12.seq 4243938 human13 3155 210994756 ddbjhum13.seq 4115571 human14 2216 217231508 ddbjhum14.seq 4064841 human15 2562 216998879 ddbjhum15.seq 4065120 human16 5010 221213149 ddbjhum16.seq 4070704 human17 31293 163499102 ddbjhum17.seq 4639035 human18 65248 105715868 ddbjhum18.seq 4685580 invertebrates1 8177 214702508 ddbjinv1.seq 4097464 invertebrates2 23398 171167477 ddbjinv2.seq 4470355 invertebrates3 73436 101908787 ddbjinv3.seq 5206448 invertebrates4 30254 99383329 ddbjinv4.seq 3380719 mammals 41792 46672218 ddbjmam.seq 2512674 patents1 243978 97212378 ddbjpat1.seq 6508128 patents2 176032 103665993 ddbjpat2.seq 5795584 patents3 152313 125641601 ddbjpat3.seq 5519171 patents4 153536 65635618 ddbjpat4.seq 6348652 patents5 70477 18695825 ddbjpat5.seq 1661770 phages 2175 7068320 ddbjphg.seq 309973 plants1 27586 164020298 ddbjpln1.seq 4529019 plants2 90131 98289872 ddbjpln2.seq 5417770 plants3 34153 155926067 ddbjpln3.seq 4622314 plants4 54766 118303439 ddbjpln4.seq 4918101 plants5 34073 64171011 ddbjpln5.seq 2744017 primates 15659 27430544 ddbjpri.seq 1158769 rodents1 23225 191078811 ddbjrod1.seq 4418327 rodents2 12528 213916381 ddbjrod2.seq 4150877 rodents3 45507 153513895 ddbjrod3.seq 4767992 rodents4 6661 7666385 ddbjrod4.seq 425203 STS 124063 49076621 ddbjsts.seq 7262424 synthetic DNAs 6986 13149626 
ddbjsyn.seq 550908 TPA 35 1591493 ddbjtpa.seq 29817 unannotated sequences 506 254752 ddbjuna.seq 22567 viruses1 90866 75496297 ddbjvrl1.seq 5637599 viruses2 70442 67943044 ddbjvrl2.seq 4433403 vertebrates 81311 96217071 ddbjvrt.seq 5108637 Accession number index file ddbjacc.idx 18435633 Keyword phrase index file ddbjkey.idx 6622856 Journal citation index file ddbjjou.idx 10556139 Gene name index file ddbjgen.idx 1008942 ------------------------------------------------------- EST: expressed sequence tag CON: contig sequences GSS: genome survey sequence HTC: high throughput cDNA HTG: high throughput genome sequence STS: sequence tagged site TPA: third party annotation
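
The file listing above can also be read by a program, for example to total the
entries and bases of each division. The following is only a minimal sketch in
Python and is not part of the release. The function names parse_listing_row and
division_totals are hypothetical. It assumes that each row is separated by
white space, that every file name has the form ddbj*.seq, ddbj*.idx or
ddbj*.txt, and that a division name is the category with its trailing file
number removed.

```python
import re
from collections import defaultdict

# Assumption: file names in the listing look like ddbjest1.seq, ddbjacc.idx
# or ddbjrel.txt, as they do in the table above.
FILE_NAME = re.compile(r"ddbj\w+\.(?:seq|idx|txt)")


def parse_listing_row(row):
    """Split one listing row into category, entries, bases, file name, records.

    Returns None for lines that are not data rows (headers, rules, legend).
    'entries' and 'bases' are None where the listing leaves them blank.
    """
    tokens = row.split()
    pos = next((i for i, tok in enumerate(tokens) if FILE_NAME.fullmatch(tok)), None)
    if pos is None or pos + 1 >= len(tokens) or not tokens[pos + 1].isdigit():
        return None
    head = tokens[:pos]
    # The (at most two) integers just before the file name are the counts;
    # whatever remains in front of them is the category, which may be
    # several words long ("synthetic DNAs").
    counts = []
    while head and head[-1].isdigit() and len(counts) < 2:
        counts.insert(0, int(head.pop()))
    return {
        "category": " ".join(head),
        "entries": counts[0] if counts else None,
        "bases": counts[1] if len(counts) == 2 else None,
        "file_name": tokens[pos],
        "records": int(tokens[pos + 1]),
    }


def division_totals(rows):
    """Sum files, entries and bases per division (EST1, EST2, ... -> EST)."""
    totals = defaultdict(lambda: {"files": 0, "entries": 0, "bases": 0})
    for row in rows:
        parsed = parse_listing_row(row)
        if parsed is None or parsed["entries"] is None:
            continue  # skip non-data lines and rows without an entry count
        division = re.sub(r"\d+$", "", parsed["category"])
        totals[division]["files"] += 1
        totals[division]["entries"] += parsed["entries"]
        totals[division]["bases"] += parsed["bases"] or 0
    return dict(totals)


if __name__ == "__main__":
    sample = [
        "EST1 93361 34680539 ddbjest1.seq 5520042",
        "EST2 96179 39213634 ddbjest2.seq 5560235",
        "CON 11043 ddbjcon.seq 317585",
    ]
    for name, total in division_totals(sample).items():
        print(name, total)
```

Locating the file name first and then reading the counts backwards from it lets
the same routine handle categories of more than one word and rows in which the
number of entries or bases is blank.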