DNA Data Bank of Japan
DNA Database
Release 44, Jan. 2001, including 10,165,597 entries, 11,136,298,841 bases

This database may be copied and redistributed without permission on the condition that all the statements in this release note are reproduced in each copy.

The present release contains the newest data prepared by the DNA Data Bank of Japan (DDBJ), GenBank, and European Molecular Biology Laboratory/European Bioinformatics Institute (EMBL/EBI) as of Dec. 26, 2000. This unified database was made possible thanks to the international collaboration among the three data banks. All the entries have accordingly been annotated with the feature keys common to them. All the entries designated by the accession numbers with the prefixes "C", "D", "E", "AB", "AG", "AK", "AP", "AT", "AU", "AV", "BA" and "BB" have been collected and processed by DDBJ, and the rest have been prepared by GenBank and EMBL/EBI.

There have been a number of genome projects going on worldwide. Among them, the human genome projects have probably been the most productive, yielding a large number of ordinary sequences, huge amounts of ESTs, and quantities of genome sequences. Thus, we have the human (HUM) division solely for human sequences and the primate (PRI) division for non-human primate sequences. Note that the EST division also contains human sequences.

The present release does not have the ORG division. Thus, if you are interested in human mitochondrial sequences, for example, you are now advised to refer to the HUM division. The HUM division in this release is divided into five subdivisions, each of which holds 30,000 entries except for the last one, which holds the rest.

This release also includes an independent division (PAT) for patent data. The patent data are those which the Japanese Patent Office (JPO), the United States Patent and Trademark Office (USPTO), and the European Patent Office (EPO) collected and processed.
The accession numbers of the patent data collected by the Japanese Patent Office start with the prefix "E", those collected and supplied by USPTO and GenBank respectively start with "I" and "AR", and those collected and supplied by EPO and EMBL/EBI respectively start with "A" and "AX". The entries with the prefixes "I", "AR", "A", "AX" and "E" were allocated to a file (ddbjpat.seq) in the DDBJ format. Note also that unauthorized use of the patent data may cause legal issues for which we take no responsibility.

In the present release, the SOURCE in the flat file was revisited and revised if necessary in accordance with the unified taxonomy database common to the three data banks.

The number of ESTs has been increasing at an enormous rate and is expected to grow even more rapidly in the future. Therefore, EST data were first sorted by accession number, and then the result was divided into sets of 100,000 entries each except for the last set. The total number of sets this time is 69.

The present release includes the GSS division. GSS stands for Genome Survey Sequence, which is similar to EST, except that GSS is genomic DNA whereas EST is cDNA. This division is divided into 22 files; each of the first 21 files contains 100,000 entries and the last one contains the rest.

This release also includes the High Throughput Genome Sequence (HTGS), which comes mainly from genome project teams that deal with a clone as a sequencing unit. HTGS in this release are distributed in 29 files. The first 28 files contain 3,000 entries each, and the last one contains the rest.

The index files are not presented in this release except for ddbjacc.idx, ddbjgen.idx, ddbjjou.idx, and ddbjkey.idx. Instead, we have included a program for generating the index files not presented in this release. For the use of the program, see the files seq2indexes.doc, seq2indexes.c, and seq2indexes.h in this release.
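
The division of EST, GSS and HTGS data into fixed-size sets described above can be illustrated with a short sketch. It is not the procedure actually used by DDBJ: the function names are hypothetical, and plain string order is assumed for sorting the accession numbers, which this note does not specify. Only the rule itself (sort by accession number, fill each set to a fixed size, put the rest in the last set) and the file name pattern ddbjest1.seq, ddbjest2.seq, ... are taken from this release note.

```python
def split_into_sets(accessions, set_size):
    """Sort accession numbers and cut them into consecutive sets.

    Every set holds `set_size` entries except the last one,
    which holds whatever remains.
    """
    ordered = sorted(accessions)  # assumption: plain string order
    return [ordered[i:i + set_size] for i in range(0, len(ordered), set_size)]


def file_names(prefix, n_sets):
    """Hypothetical helper: number the files from 1, e.g. ddbjest1.seq."""
    return [f"{prefix}{i}.seq" for i in range(1, n_sets + 1)]


if __name__ == "__main__":
    # Made-up accession numbers and a tiny set size, for illustration only.
    demo = ["AB000007", "AB000003", "AB000001", "AB000005",
            "AB000002", "AB000006", "AB000004"]
    sets = split_into_sets(demo, 3)
    for name, members in zip(file_names("ddbjest", len(sets)), sets):
        print(name, members)
```

With a set size of 100,000 for EST and GSS, or 3,000 for HTGS, the same rule gives the layout described above, in which only the last file of a division is shorter than the others.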
The present release contains amino acid sequences that were translated from the corresponding nucleotide sequences in our database. In the translation we paid much attention to the fact that some species or organelles use codons different from the universal ones, and used the proper codon table. If you find an incorrect codon in a translated sequence, please let us know.

The three data banks include the item VERSION in the flat file, which indicates a version of a submitted nucleotide sequence (see Table 1). It is expressed as AB123456.1, in which the digit(s) after the period is a version number. The reason for adding VERSION is that a released sequence is sometimes revised by the submitter, so the accession number alone cannot specify the sequence in question, which causes trouble for the user. The number is increased by one every time a revised sequence is made public. Accordingly, the translated protein sequence is accompanied by a /protein_id, which is expressed as BAA12345.1, in which the digit(s) after the period is again a version number. The number is increased by one when the corresponding nucleotide sequence is revised, the protein sequence changes as a result, and the revised protein sequence is made public.

We terminated the RNA division. The RNA data were redistributed according to the category of the organism. Therefore, you will find a human RNA sequence, for example, in the HUM division.

The present release includes a division, CON. The CON division shows the order of related sequences in a genome, and is expressed by join and the accession numbers of the sequences. The contents of the CON division are compiled by the three data banks, not by the data submitter. The current number of entries in this division is 8,456.

We plan to include a new division, HTC (High Throughput cDNA). The definition of the HTC division is as follows.
This division includes unfinished high throughput cDNA sequences, each of which has 5'UTR and 3'UTR at both ends and part of a coding region. The sequence may also include introns. When the sequence becomes finished later, it moves to the corresponding taxonomic division. The sequence is accompanied by a keyword, HTC (High Throughput cDNA), which is dropped when the sequence is moved to a taxonomic division.

This release was published by the following DDBJ staff.

General administration
    T. Gojobori, M. Ota, Y. Fukuma, Y. Katsube, M. Maruyama, K. Okuda, J. Sugiyama, H. Tsutsui(hold), Y. Ueda, A. Watanabe

Database construction
    Y. Tateno, S. Miyazaki, N. Asakawa, M. Ejima, M. Gojobori, A. Hasegawa, A. Hashizume, M. Hirashima, J. Mashima, N. Mukasa, M. Okaneya, A. Shimada, A. Suzuki, M. Suzuki, H. Tsutsui, T. Umezawa, Y. Yamamoto

Database software development and management
    H. Sugawara, T. Imanishi, S. Miyazaki(hold), M. Fumoto, H. Harimoto, H. Hashimoto, T. Iizuka(hold), N. Ishizaka, K. Kaneda, T. Kato, Y. Kawanishi, K. Mamiya, S. Misu, T. Mizunuma, T. Okayama, Y. Sugiyama, K. Suzuki, T. Takaki

System management
    K. Nishikawa, K. Ikeo, N. Hoshi, T. Iizuka, I. Mochizuki, M. Nagura, T. Narita, T. Osawa, K. Yoshioka

Editorial and public relations
    N. Saitou, K. Fukami-Kobayashi, Y. Daito, H. Ichikawa, K. Ichikawa, T. Kawamoto, S. Nagira

DNA Data Bank of Japan
Center for Information Biology
National Institute of Genetics
Mishima 411-8540, Japan

Phone:  +81 559 81 6853
FAX:    +81 559 81 6849
E-mail: ddbj@ddbj.nig.ac.jp (for general inquiry)
        ddbjsub@ddbj.nig.ac.jp (for data submission)
        ddbjupdt@ddbj.nig.ac.jp (for updates and notification of publication)
WWW:    http://www.ddbj.nig.ac.jp (for DDBJ WWW server)
        http://sakura.ddbj.nig.ac.jp (for DDBJ sequence data submission system SAKURA)

Acknowledgement:
We are grateful to NCBI and EMBL/EBI for their firm friendship and excellent collaboration with us. We also thank the Japanese Patent Office for its steady cooperation with us.
The operation of DDBJ is supported by the Ministry of Education, Science, Sports and Culture, and we would gratefully note this here.

DDBJ Database Release History

Release  Date      Entries           Bases  Comments
------------------------------------------------------------------------------
     44  01/01  10,165,597  11,136,298,841
     43  10/00   8,666,551  10,034,532,698
     42  07/00   7,554,995   8,880,721,093
     41  04/00   5,962,608   6,409,581,885  CON division started
     40  01/00   5,388,125   4,762,696,173  RNA division eliminated
     39  10/99   4,810,773   3,728,000,562  NID and PID discarded
     38  07/99   4,294,369   3,098,519,597
     37  03/99   3,311,627   2,375,261,951  VERSION, /protein_id started
     36  01/99   3,073,166   2,190,425,560
     35  10/98   2,759,261   1,957,341,169
     34  07/98   2,412,785   1,708,580,623
     33  04/98   2,174,769   1,479,303,279
     32  01/98   1,956,669   1,300,950,613
     31  10/97   1,731,532   1,139,869,464  Adoption of the unified taxonomy database
     30  07/97   1,534,115     992,788,339  NID and PID eliminated
     29  04/97   1,270,194     841,415,232
     28  01/97   1,154,120     756,785,219  HTG division started
                                            ORG division eliminated
     27  10/96     936,697     608,103,057  GSS division started
     26  07/96     835,552     551,932,448
     25  04/96     744,490     499,300,364  /translation started
     24  01/96     637,508     431,771,652
     23  10/95     569,757     390,694,350
     22  07/95     437,588     322,982,425  HUM division started
     21  04/95     274,596     250,875,023
     20  01/95     239,689     231,299,557
     19  10/94     204,332     205,274,131
     18  07/94     185,230     192,473,021
     17  04/94     169,957     179,942,209
     16  01/94     154,626     165,017,628
     15  10/93     131,649     147,224,690
     14  07/93     120,350     138,686,333
     13  04/93     112,067     129,784,445
     12  01/93      97,683     120,815,244  EST division started
     11  07/92      65,693      84,839,075
     10  01/92      59,317      77,805,556  GenBank/EMBL inclusion started
      9  07/91       1,130       2,002,124
      8  01/91         879       1,573,442
      7  07/90         681       1,154,211
      6  01/90         496         841,236
      5  07/89         395         679,378
      4  01/89         302         535,985
      3  07/88         230         345,850
      2  01/88         142         199,392
      1  07/87          66         108,970  Started with DDBJ only
------------------------------------------------------------------------------

This release covers 17 categories of
organisms and others as follows:
------------------------------------------------------------------------------
ddbjbct.***   Category for bacteria
ddbjest.***   Category for EST (expressed sequence tag)
ddbjhtg.***   Category for HTG (high throughput genomic sequencing)
ddbjhum.***   Category for human
ddbjgss.***   Category for GSS (Genome Survey Sequence)
ddbjinv.***   Category for invertebrates
ddbjmam.***   Category for mammals other than primates and rodents
ddbjpat.***   Category for patents
ddbjphg.***   Category for phages
ddbjpln.***   Category for plants
ddbjpri.***   Category for primates other than human
ddbjrod.***   Category for rodents
ddbjsts.***   Category for STS (sequence tagged site)
ddbjsyn.***   Category for synthetic DNAs
ddbjuna.***   Category for unannotated sequences
ddbjvrl.***   Category for viruses
ddbjvrt.***   Category for vertebrates other than mammals
------------------------------------------------------------------------------

Each category then has the following nine files. Note that all the files except for ddbj***.seq are created by the user by use of seq2indexes as mentioned in the release note.
------------------------------------------------------------------------------
ddbj***.seq   List of an entry in DDBJ format, see Table 1.
ddbj***.acc   List of the accession numbers, see Table 2.
ddbj***.aut   List of the authors, see Table 3.
ddbj***.dir   List of the short directory in DDBJ style, see Table 4.
ddbj***.idx   List of indices, see Table 5.
ddbj***.jou   List of the journals, see Table 6.
ddbj***.key   List of the key words, see Table 7.
ddbj***.org   List of the species names, see Table 8.
ddbj***.sdr   List of the short directory in DDBJ style, see Table 9.
------------------------------------------------------------------------------

Table 1. Part of the contents in the file 'ddbjbct.seq'. This shows all pieces of information on one entry in DDBJ format.
------------------------------------------------------------------------------ LOCUS D87069 993 bp mRNA BCT 07-FEB-1999 DEFINITION Escherichia coli mRNA for RNA polymerase sigma subunit, truncated form of sigma-38, complete cds. ACCESSION D87069 VERSION D87069.1 KEYWORDS RNA polymerase sigma subunit, truncated form of sigma-38. SOURCE Escherichia coli (strain:W3110) cDNA to mRNA. ORGANISM Escherichia coli Bacteria; Proteobacteria; gamma subdivision; Enterobacteriaceae; Escherichia. REFERENCE 1 (bases 1 to 993) AUTHORS Jishage,M. TITLE Direct Submission JOURNAL Submitted (14-AUG-1996) to the DDBJ/EMBL/GenBank databases. Miki Jishage, National Institute of Genetics, Molecular Genetics; Yata 1111, Mishima, Shizuoka 411, Japan (E-mail:mjishage@lab.nig.ac.jp, Tel:0559-81-6742, Fax:0559-81-6746) REFERENCE 2 (bases 1 to 993) AUTHORS Jishage,M. and Ishihama,A. TITLE Variation in RNA polymerase sigma subunit composition within different stocks of Escherichia coli starin W3110 JOURNAL Unpublished (1996) REFERENCE 3 (sites) AUTHORS Ivanova,A., Renshaw,M., Guntaka,R. and Eisenstark,A. TITLE DNA base sequence variability in katF (putative sigma factor) gene Escherichia coli JOURNAL Nucleic Acids Res. 20, 5479-5480 (1992) REFERENCE 4 (sites) AUTHORS Takayanagi,Y., Tanaka,K. and Takahashi,H. 
TITLE Structure of the 5' upstream region and the regulation of the rpoS gene of Escherichia coli JOURNAL Mol Gen Genet 243, 525-531 (1994) COMMENT FEATURES Location/Qualifiers source 1..993 /organism="Escherichia coli" /sequenced_mol="cDNA to mRNA" /strain="W3110" CDS 1..810 /note="the gene has four single base changes, resulting in two amino acid substitutions and an amber mutation" /product="RNA polymerase sigma subunit, truncated form of sigma-38" /protein_id="BAA13238.1" /translation="MSQNTLKVHDLNEDAEFDENGVEVFDEKALVEYEPSDNDLAEEE LLSQGATQRVLDATQLYLGEIGYSPLLTAEEEVYFARRALRGDVASRRRMIESNLRLV VKIARRYGNRGLALLDLIEEGNLGLIRAVEKFDPERGFRFSTYATWWIRQTIERAIMN QTRTIRLPIHIVKELNVYLRTARELSHKLDHEPSAEEIAEQLDKPVDDVSRMLRLNER ITSVDTPLGGDSEKALLDILADEKENGPEDTTQDDDMKQSIVKWLFELNAK" /transl_table=11 mutation 75 /citation=[3] /replace="t" mutation 97 /citation=[3] /replace="t" mutation 99 /citation=[3] /replace="t" mutation 808 /citation=[3] /replace="t" BASE COUNT 254 a 223 c 291 g 225 t 0 others ORIGIN 1 atgagtcaga atacgctgaa agttcatgat ttaaatgaag atgcggaatt tgatgagaac 61 ggagttgagg tttttgacga aaaggcctta gtagaatatg aacccagtga taacgatttg 121 gccgaagagg aactgttatc gcagggagcc acacagcgtg tgttggacgc gactcagctt 181 taccttggtg agattggtta ttcaccactg ttaacggccg aagaagaagt ttattttgcg 241 cgtcgcgcac tgcgtggaga tgtcgcctct cgccgccgga tgatcgagag taacttgcgt 301 ctggtggtaa aaattgcccg ccgttatggc aatcgtggtc tggcgttgct ggaccttatc 361 gaagagggca acctggggct gatccgcgcg gtagagaagt ttgacccgga acgtggtttc 421 cgcttctcaa catacgcaac ctggtggatt cgccagacga ttgaacgggc gattatgaac 481 caaacccgta ctattcgttt gccgattcac atcgtaaagg agctgaacgt ttacctgcga 541 accgcacgtg agttgtccca taagctggac catgaaccaa gtgcggaaga gatcgcagag 601 caactggata agccagttga tgacgtcagc cgtatgcttc gtcttaacga gcgcattacc 661 tcggtagaca ccccgctggg tggtgattcc gaaaaagcgt tgctggacat cctggccgat 721 gaaaaagaga acggtccgga agataccacg caagatgacg atatgaagca gagcatcgtc 781 aaatggctgt tcgagctgaa cgccaaatag cgtgaagtgc tggcacgtcg attcggtttg 841 ctggggtacg aagcggcaac 
actggaagat gtaggtcgtg aaattggcct cacccgtgaa 901 cgtgttcgcc agattcaggt tgaaggcctg cgccgtttgc gcgaaatcct gcaaacgcag 961 gggctgaata tcgaagcgct gttccgcgag taa // ------------------------------------------------------------------------------ Table 2. Part of the contents in the file 'ddbjbct.acc'. The first column refers to the secondary accession number, second column to the locus name, and third to the primary accession number. The primary number may be the same as the secondary number. They are arranged in the ascending order of the secondary accession numbers. ------------------------------------------------------------------------------ D00001 -> ECOPBPAA X04516 D00002 -> ECOPYRH X04469 D00006 -> PNS981TET D00006 D00020 -> COLE2LYS D00020 D00021 -> COLE31YS D00021 D00038 -> BRLAM330 D00038 D00066 -> BAC139AC D00066 D00067 -> ECONANA M20207 D00069 -> ECOUVRD2 D00069 D00087 -> BACXYNAA D00087 ------------------------------------------------------------------------------ Table 3. Part of the contents in the file 'ddbjbct.aut'. For each author name given on the left to the arrow, the corresponding locus name and primary accession number are respectively listed on the right. They are arranged in the alphabetical order of the author names. ------------------------------------------------------------------------------ Aan,F. -> STYCRR X05210 Aan,F. -> STYENZI M76176 Aaronson,W. -> ECOKPSD M64977 Aaronson,W. -> ECONEUA J05023 Abad-Lapuebla,M.A. -> VIBTDHI D90238 Abdel-Mawgood,A.L. -> CYAPSBHA X16394 Abdel-Meguid,S.S. -> TRNGDRECM J01843 Abdelal,A. -> STYCARA M36540 Abdelal,A. -> STYCARAB X13200 Abdelal,A.H. -> PSENOSA M60717 ------------------------------------------------------------------------------ Table 4. Part of the short directory in DDBJ style in the file 'ddbjbct.dir'. For each locus name given in the first column, the corresponding primary accession number, molecular type, number of nucleotide pairs, and description for the locus are respectively listed. 
They are arranged in the alphabetical order of the locus names. ------------------------------------------------------------------------------ ABCAARAA M34830 ds-DNA 1624 A.aceti acetic acid resistance protein (aarA) gene, complete cds. ABCADHCC D00635 ds-DNA 4230 A. polyoxogenes alcohol dehydrogenase (EC 1.1.99.8) and cytochrome c genes. ABCALDH D00521 ds-DNA 2683 A.polyoxogenes membrane-bound aldehyde dehydrogenase gene, complete cds and flanks. ABCBCSAA M37202 ds-DNA 9540 A.xylinum bcs B, bcs C and bcs D genes, complete cds and bcs A gene, partial cds. ABCCELA M76548 ds-DNA 1165 Acetobacter xylinum UDP pyrophosphorylase (celA) gene, complete cds. ABCCELSYN X54676 ds-DNA 5363 A. xylinum gene for cellulose biosynthesis ABCIS1380 D10043 ds-DNA 1665 A.pasteurianus insertion sequence IS1380. ACAADH1 D90004 ds-DNA 2467 Acetobacter aceti(K6033) alcohol dehydrogenase subunit gene(adh1). ACCAAC2 M62833 ds-DNA 1123 Acinetobacter baumannii aminoglycoside acetyltr ansferase (aac2) gene, complete cds. ACCACEAA M62822 ds-DNA 1874 A.baumannii chloramphenicol acetyltransferase (cat) gene, complete cds. ------------------------------------------------------------------------------ Table 5. Part of the contents in the file 'ddbjbct.idx'. The first column refers to the locus name, second column to the starting site of the locus in byte, and third to its ending site in byte. They are arranged in the alphabetical order of the locus names. ------------------------------------------------------------------------------ %***************************** #ABCAARAA 0 3211 #ABCADHCC 3212 10608 #ABCALDH 10609 15864 #ABCBCSAA 15865 29583 #ABCCELA 29584 32289 #ABCCELSYN 32290 40960 #ABCIS1380 40961 44711 #ACAADH1 44712 49357 #ACCAAC2 49358 52395 ------------------------------------------------------------------------------ Table 6. Part of the contents in the file 'ddbjbct.jou'. This gives information on the journal in which sequence data were published. 
------------------------------------------------------------------------------ (in) Chaloupka,J. and Krumphanzl,V. (Eds.); Extracellular Enzymes of Microorganisms: 129-137, Plenum Press, New York (1987) -> BACAMYABS M57457 (in) Ganesan,A.T., Chang,S. and Hoch,J.A. (Eds.); Molecular Cloning and Gene Regulation in Bacilli: 3-10, Academic Press, New York (1982) -> BACRG16S M55011 (in) Ganesan,A.T., Chang,S. and Hoch,J.A. (Eds.); Molecular Cloning and Gene Regulation in Bacilli: 3-10, Academic Press, New York (1982) -> BACRG16SA M55006 (in) Ganesan,A.T., Chang,S. and Hoch,J.A. (Eds.); Molecular Cloning and Gene Regulation in Bacilli: 3-10, Academic Press, New York (1982) -> BACRG16SB M55008 (in) Hoch,J.A. and Setlow,P. (Eds.); Molecular Biology of Microbial Differentiation: 85-94, American Society for Microbiology, Washington, DC (1985) -> BACSPOII M57606 (in) Holmgren,A. (Ed.); Thioredoxin and Glutaredoxin Systems: Structure and Function: 11-19, Unknown name, Unknown city (1986) -> ECOTRXA1 M54881 (in) Kjeldgaard,N.C. and Maaloe,O. (Eds.); Control of ribosome synthesis: 138-143, Academic Press, New York (1976) -> ECOLAC J01636 (in) Losick,R. and Chamberlin,M. (Eds.); RNA polymerase: 455-472, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1976) -> ECOTGY1 K01197 (in) Sikes,C.S. and Wheeler,A.P. (Eds.); Surface reactive peptides and polymers. Discovery and commercialization.: 186-200, American Chemical Society, Washington, D.C. (1991) -> ECOTGP J01714 (in) Sund,H. and Blauer,G. (Eds.); Protein-Ligand Interactions: 193-207, Walter de Gruyter, New York (1975) -> ECOLAC J01636 (in) Wu,R. and Grossman,L. (Eds.); Methods in Enzymology, Recombinant DNA, part E: In press, Academic Press, New York, N.Y. (1986) -> PLMCG M11320 Acta Microbiol. Pol. 35, 175-190 (1986) -> ECOTGG1 M54893 Actinomycetologica 5, 14-17 (1991) -> STMARGG D00799 Adv. Biophys. 21, 115-133 (1986) -> R10REP M26840 Adv. Biophys. 21, 175-192 (1986) -> ECONUSAA M26839 Adv. Enzyme Regul. 
21, 225-237 (1983) -> ECOPURFA M26893 Adv. Exp. Med. Biol. 195, 239-246 (1986) -> ECOAPT M14040 Agric. Biol. Chem. 50, 2155-2158 (1986) -> ECONANA M20207 Agric. Biol. Chem. 50, 2771-2778 (1986) -> BRLAM330 D00038 Agric. Biol. Chem. 51, 2019-2022 (1987) -> BACCGT D00129 Agric. Biol. Chem. 51, 2641-2648 (1987) -> STRSAGP D00219 Agric. Biol. Chem. 51, 2807-2809 (1987) -> BACPGECR M35503 Agric. Biol. Chem. 51, 3133-3135 (1987) -> BACXYLAP D00312 Agric. Biol. Chem. 51, 455-463 (1987) -> BACHDCRY D00117 Agric. Biol. Chem. 51, 953-955 (1987) -> BACXYNAA D00087 Agric. Biol. Chem. 52, 1565-1573 (1988) -> BACIP135 D00348 Agric. Biol. Chem. 52, 1785-1789 (1988) -> BACTMR D00343 Agric. Biol. Chem. 52, 2243-2246 (1988) -> PSEGI D00342 Agric. Biol. Chem. 52, 399-406 (1988) -> BACAMYEB M35517 Agric. Biol. Chem. 52, 479-487 (1988) -> ECAPALI D00217 ------------------------------------------------------------------------------ Table 7. Part of the contents in the file 'ddbjbct.key'. For the locus and accession number respectively given on the right to the arrow, the corresponding key words are listed on the left. ------------------------------------------------------------------------------ A.aceti acetic acid resistance protein (aarA) gene, complete cds. -> ABCAARAA M34830 acetic acid resistance protein. -> ABCAARAA M34830 Cloning of genes responsible for acetic acid resistance in acetobacter aceti -> ABCAARAA M34830 A. polyoxogenes alcohol dehydrogenase (EC 1.1.99.8) and cytochrome c genes. -> ABCADHCC D00635 alcohol dehydrogenase; cytochrome c. 
-> ABCADHCC D00635 Cloning and sequencing of the gene cluster encoding two subunits of membrane- bound alcohol dehydrogenase from Acetobacter polyoxogenes -> ABCADHCC D00635 These data kindly submitted in computer readable form by: Toshimi Tamaki Nakano Central Biochemical Institute 2-6 Nakamura-cho Handa-shi, Aichi-ken 475 Japan Phone: 0569-21-3331 Fax: 0569-23-8486 -> ABCADHCC D00635 A.polyoxogenes membrane-bound aldehyde dehydrogenase gene, complete cds and flanks. -> ABCALDH D00521 aldehyde dehydrogenase gene; ethanol oxidation; membrane-bound enzyme. -> ABCALDH D00521 Nucleotide sequence of the membrane-bound aldehyde dehydrogenase gene from Acetobacter polyoxogenes -> ABCALDH D00521 ------------------------------------------------------------------------------ Table 8. Part of the contents in the file 'ddbjbct.org'. For the locus and accession number respectively given on the right to the arrow, the corresponding taxonomic names are listed on the left. They are arranged in the alphabetical order of the species names. ------------------------------------------------------------------------------ A. nidulans 6301 DNA. Anacystis nidulans Prokaryota; Bacteria; Gracilicutes; Oxyphotobacteria; Cyanobacteria. -> ANIRUBPS X00019 A. nidulans DNA, clone pAN4. Anacystis nidulans Prokaryota; Bacteria; Gracilicutes; Oxyphotobacteria; Cyanobacteria. -> ANIRGGX X00343 A. nidulans DNA. Anacystis nidulans Prokaryota; Bacteria; Gracilicutes; Oxyphotobacteria; Cyanobacteria. -> ANIRGG X00512 A. polyoxogenes genomic DNA. Acetobacter polyoxogenes Prokaryota; Bacteria; Gracilicutes; Scotobacteria; Aerobic rods and cocci; Azotobacteraceae. - > ABCADHCC D00635 A. quadruplicatum (strain PR-6) DNA, clone pAQPR1. Agmenellum quadruplicatum Prokaryota; Bacteria; Gracilicutes; Oxyphotobacteria; Cyanobacteria. -> AQUPCAB K02660 A. quadruplicatum (strain PR6) DNA. Agmenellum quadruplicatum Prokaryota; Bacteria; Gracilicutes; Oxyphotobacteria; Cyanobacteria. -> AQUCPCAB K02659 A. 
vinelandii DNA. Azotobacter vinelandii Prokaryota; Bacteria; Gracilicutes; Scotobacteria; Aerobic rods and cocci; Azotobacteraceae. -> AVINIFUSV M17349 A.aceti (strain 10-8) DNA, clone pAR1611. Acetobacter aceti Prokaryota; Bacteria; Gracilicutes; Scotobacteria; Aerobic rods and cocci; Azotobacteraceae. -> ABCAARAA M34830 A.actinomycetemcomitans (strain JP2) DNA, clone lambda-OP8. Actinobacillus actinomycetemcomitans Prokaryota; Bacteria; Gracilicutes; Scotobacteria; Facultatively anaerobic rods; Pasteurellaceae. -> ACNLKTXN M27399 A.anitratum DNA, clone pLJD1. Acinetobacter anitratum Prokaryota; Bacteria; Gracilicutes; Scotobacteria; Neisseriaceae. -> ACCCITSYN M33037 ------------------------------------------------------------------------------ Table 9. Part of the short directory file in DDBJ style in the file 'ddbjbct.sdr'. The short directory file contains brief descriptions of all of the sequence entries contained in the DDBJ style. ------------------------------------------------------------------------------ ABCAARAA A.aceti acetic acid resistance protein (aarA) gene, complete 1624bp ABCADHCC A. polyoxogenes alcohol dehydrogenase (EC 1.1.99.8) and 4230bp ABCALDH A.polyoxogenes membrane-bound aldehyde dehydrogenase gene, 2683bp ABCBCSABCD A.xylinum bcs A, B, C and D genes, complete cds's. 9540bp ABCCELA Acetobacter xylinum UDP pyrophosphorylase (celA) gene, 1165bp ABCCELSYN A. xylinum gene for cellulose biosynthesis 5363bp ABCIS1380 A.pasteurianus insertion sequence IS1380. 1665bp ACAADH1 Acetobacter aceti(K6033) alcohol dehydrogenase subunit 2467bp ACCAAC2 Acinetobacter baumannii aminoglycoside acetyltransferase 1123bp ACCACEAA A.baumannii chloramphenicol acetyltransferase (cat) gene, 1874bp ACCAPHA6 Acinetobacter baumannii aphA-6 gene. 1170bp ACCBENABCA A.calcoaceticus BenA, BenB, BenC, BenD, and BenE proteins 15922bp ACCCAT Acinetobacter calcoaceticus cat operon. 
15922bp ACCCATAM A.calcoaceticus catA and catM genes, encoding catechol 1, 5537bp ACCCHMO Acinetobacter sp. cyclohexanone monooxygenase gene, complete 2128bp ACCCITSYN A.anitratum citrate synthase gene, complete cds. 1895bp ------------------------------------------------------------------------------ In addition to the 9 tables the four following index files are included in this release. These files were prepared irrespective of the 14 categories of taxonomic divisions. Accession number index file Keyword phrase index file Journal citation index file Gene name index file A brief description is given for each file in the following. Table 10. Part of the accession number index file in the 'ddbjacc.idx'. The following excerpt from the accession number index file illustrates the format of the index. ------------------------------------------------------------------------------ D00100 PSEASPAA BCT D00100 D00101 RABNP450R MAM D00101 D00102 HUMLTX HUM D00102 D00103 AFARRN5SA BCT D00103 AFRRN5SA BCT X05517 D00104 AFARRN5SB BCT D00104 AFRRN5SB BCT X05518 D00105 AFARRN5S BCT D00105 ASRRN5S BCT X05524 D00106 ACH5SRR BCT D00106 AXRRN5S BCT X05522 AXRRN5SA BCT X05523 D00107 ACH5SRRX BCT D00107 ACRRN5S BCT X05521 ------------------------------------------------------------------------------ Table 11. Part of the keyword phrase index file in the 'ddbjkey.idx'. Keyword phrases consist of names for gene products and other characteristics of sequence entries. ------------------------------------------------------------------------------ A CHANNEL DROCHA INV M17155 A COMPONENT SQLCVEA VRL M38183 A LOCUS GORGOGOA3 PRI X54375 GORGOGOA4 PRI X54376 A LOCUS ALLELE GORA0101 PRI X60258 GORA0201 PRI X60259 GORA0401 PRI X60257 GORA0501 PRI X60256 A MULTI-GENE FAMILY RICGLUTE PLN D00584 A PROTEIN MS2AAR PHG M25187 ST1APCS PHG M25396 A SEQUENCE HS5TOA30 VRL D00148 HS5TOA31 VRL D00147 ------------------------------------------------------------------------------ Table 12. 
Part of the author name index file in 'ddbjaut.idx'. The author name index file lists all of the author names that appear in the citations. ------------------------------------------------------------------------------ ABE,A. HUMMHDRBWE PRI M27509 HUMMHDRBWF PRI M27510 HUMMHDRBWG PRI M27511 YSCGAL11A PLN M22481 ABE,C. S85445 BCT S85445 ABE,E. M23442 UNA M23442 ABE,H. CHKADF VRT M55660 CHKCOF VRT M55659 ABE,K. CHPCLAC PRI D11383 CHPIMRF PRI D11384 CUGCUR09 PLN X64110 CUGCUR37 PLN X64111 HPCCEXPA VRL M55970 HPCCPEP1 VRL D10687 HPCCPEP2 VRL D10688 HPCHABC82 VRL X51587 HPCNS2APA VRL M55972 HPCNS2PA VRL M55971 HPCNS2PB VRL M55973 HPCNS5PA VRL M55974 MUSKE2 ROD M65255 MUSKE2A ROD M65256 MZECYS PLN D10622 RICCPI PLN J03469 RICGLUTE PLN D00584 RICLNOCI PLN J05595 RICOCS PLN M29259 RICORYII PLN X57658 RICOZA PLN D90406 RICOZB PLN D90407 RICOZC PLN D90408 S54524 PLN S54524 S54526 PLN S54526 S54530 PLN S54530 S73960 ROD S73960 ------------------------------------------------------------------------------ Table 13. Part of the journal citation index file in 'ddbjjou.idx'. The journal citation index file lists all of the citations that appear in the references. ------------------------------------------------------------------------------ ACTA BIOCHIM. BIOPHYS. SIN. 23, 246-253 (1992) HUMPLASINS HUM M98056 ACTA BIOCHIM. BIOPHYS. SIN. 28, 233-239(1996) TKTII PLN X82230 ACTA BIOCHIM. POL. 24, 301-318 (1977) LUPTRFJ PLN K00345 LUPTRFN PLN K00346 ACTA BIOCHIM. POL. 26, 369-381(1979) HVTRNPHE PLN X02683 ACTA BIOCHIM. POL. 29, 143-149 (1982) EMEMTA PLN M32572 EMEMTB PLN M32573 EMEMTC PLN M32574 EMEMTD PLN M32575 EMEMTE PLN M32576 ACTA BIOCHIM. POL. 34, 21-27 (1987) LUPNOSP PLN M32571 ------------------------------------------------------------------------------ Table 14. Part of the gene name index file in 'ddbjgen.idx'. This file lists all the gene names that appear in the feature table. 
------------------------------------------------------------------------------
AACC8
    STMAACC8     BCT  M55426
AACC9
    MPUAACC9     BCT  M55427
AACT
    HUMA1ACM     PRI  K01500
    HUMA1ACMA    PRI  X00947
    HUMA1ACMB    PRI  M18035
    HUMAACT1     PRI  M18906
    HUMAACT2     PRI  M22533
    HUMAACTA     PRI  J05176
AAD
    INTINTORF    BCT  L06418
    LMOMO229D    BCT  X17478
AAD A1
    ENTAAC3VI    BCT  M88012
AAD9
    ENEAAD9A     BCT  M69221
AADA
    LMOMO229A    BCT  X17479
    S52249       BCT  S52249
    SYNAADA      SYN  M60473
    TRNTAAB      BCT  M55547
    TRNTN21CAS   BCT  M86913
------------------------------------------------------------------------------

The files in this release are arranged in the following order in non-labeled
format.

Release note
    ddbjrel.txt    991 records
Category for bacteria, 95039 entries, 244140576 bases
    ddbjbct.seq    10139311 records
Category for EST1 (expressed sequence tag), 100000 entries, 37317616 bases
    ddbjest1.seq   5900162 records
Category for EST2 (expressed sequence tag), 100000 entries, 41136721 bases
    ddbjest2.seq   5782892 records
Category for EST3 (expressed sequence tag), 100000 entries, 37307947 bases
    ddbjest3.seq   5739657 records
Category for EST4 (expressed sequence tag), 100000 entries, 32463761 bases
    ddbjest4.seq   5985100 records
Category for EST5 (expressed sequence tag), 100000 entries, 38755506 bases
    ddbjest5.seq   5688599 records
Category for EST6 (expressed sequence tag), 100000 entries, 40094624 bases
    ddbjest6.seq   5576199 records
Category for EST7 (expressed sequence tag), 100000 entries, 38953341 bases
    ddbjest7.seq   5596537 records
Category for EST8 (expressed sequence tag), 100000 entries, 38761582 bases
    ddbjest8.seq   5609570 records
Category for EST9 (expressed sequence tag), 100000 entries, 39084649 bases
    ddbjest9.seq   5589342 records
Category for EST10 (expressed sequence tag), 100000 entries, 39411570 bases
    ddbjest10.seq  5554980 records
Category for EST11 (expressed sequence tag), 100000 entries, 41986426 bases
    ddbjest11.seq  5647623 records
Category for EST12 (expressed sequence tag), 100000 entries, 43561973 bases
    ddbjest12.seq  5573412 records
Category for EST13 (expressed sequence tag), 100000 entries, 40504989 bases
    ddbjest13.seq  5212251 records
Category for EST14 (expressed sequence tag), 100000 entries, 39965585 bases
    ddbjest14.seq  5483956 records
Category for EST15 (expressed sequence tag), 100000 entries, 41362141 bases
    ddbjest15.seq  5745086 records
Category for EST16 (expressed sequence tag), 100000 entries, 44389794 bases
    ddbjest16.seq  5716712 records
Category for EST17 (expressed sequence tag), 100000 entries, 41127331 bases
    ddbjest17.seq  5543683 records
Category for EST18 (expressed sequence tag), 100000 entries, 44438700 bases
    ddbjest18.seq  5716703 records
Category for EST19 (expressed sequence tag), 100000 entries, 42051745 bases
    ddbjest19.seq  5811817 records
Category for EST20 (expressed sequence tag), 100000 entries, 42596576 bases
    ddbjest20.seq  5535731 records
Category for EST21 (expressed sequence tag), 100000 entries, 46213568 bases
    ddbjest21.seq  5109719 records
Category for EST22 (expressed sequence tag), 100000 entries, 51566536 bases
    ddbjest22.seq  4664359 records
Category for EST23 (expressed sequence tag), 100000 entries, 33937603 bases
    ddbjest23.seq  5523088 records
Category for EST24 (expressed sequence tag), 100000 entries, 25967887 bases
    ddbjest24.seq  5707606 records
Category for EST25 (expressed sequence tag), 100000 entries, 27597185 bases
    ddbjest25.seq  7440623 records
Category for EST26 (expressed sequence tag), 100000 entries, 24794753 bases
    ddbjest26.seq  8796117 records
Category for EST27 (expressed sequence tag), 100000 entries, 42962996 bases
    ddbjest27.seq  5042399 records
Category for EST28 (expressed sequence tag), 100000 entries, 46354504 bases
    ddbjest28.seq  4800870 records
Category for EST29 (expressed sequence tag), 100000 entries, 52762284 bases
    ddbjest29.seq  5165091 records
Category for EST30 (expressed sequence tag), 100000 entries, 43564890 bases
    ddbjest30.seq  5674446 records
Category for EST31 (expressed sequence tag), 100000 entries, 44019658 bases
    ddbjest31.seq  6074163 records
Category for EST32 (expressed sequence tag), 100000 entries, 42794018 bases
    ddbjest32.seq  5841559 records
Category for EST33 (expressed sequence tag), 100000 entries, 40318607 bases
    ddbjest33.seq  5337617 records
Category for EST34 (expressed sequence tag), 100000 entries, 39466556 bases
    ddbjest34.seq  5917742 records
Category for EST35 (expressed sequence tag), 100000 entries, 44450350 bases
    ddbjest35.seq  6124659 records
Category for EST36 (expressed sequence tag), 100000 entries, 46770383 bases
    ddbjest36.seq  5594103 records
Category for EST37 (expressed sequence tag), 100000 entries, 42880740 bases
    ddbjest37.seq  5634087 records
Category for EST38 (expressed sequence tag), 100000 entries, 36784393 bases
    ddbjest38.seq  5635276 records
Category for EST39 (expressed sequence tag), 100000 entries, 43749228 bases
    ddbjest39.seq  5737276 records
Category for EST40 (expressed sequence tag), 100000 entries, 26869557 bases
    ddbjest40.seq  8720075 records
Category for EST41 (expressed sequence tag), 100000 entries, 27490272 bases
    ddbjest41.seq  8628085 records
Category for EST42 (expressed sequence tag), 100000 entries, 28420256 bases
    ddbjest42.seq  8681006 records
Category for EST43 (expressed sequence tag), 100000 entries, 27346079 bases
    ddbjest43.seq  8580427 records
Category for EST44 (expressed sequence tag), 100000 entries, 27579844 bases
    ddbjest44.seq  8434628 records
Category for EST45 (expressed sequence tag), 100000 entries, 26589515 bases
    ddbjest45.seq  8171649 records
Category for EST46 (expressed sequence tag), 100000 entries, 42675676 bases
    ddbjest46.seq  5790788 records
Category for EST47 (expressed sequence tag), 100000 entries, 42009519 bases
    ddbjest47.seq  5623296 records
Category for EST48 (expressed sequence tag), 100000 entries, 57139824 bases
    ddbjest48.seq  5441248 records
Category for EST49 (expressed sequence tag), 100000 entries, 52930346 bases
    ddbjest49.seq  5448411 records
Category for EST50 (expressed sequence tag), 100000 entries, 48478068 bases
    ddbjest50.seq  5489040 records
Category for EST51 (expressed sequence tag), 100000 entries, 58057194 bases
    ddbjest51.seq  5668093 records
Category for EST52 (expressed sequence tag), 100000 entries, 43392892 bases
    ddbjest52.seq  5653204 records
Category for EST53 (expressed sequence tag), 100000 entries, 58854840 bases
    ddbjest53.seq  5771261 records
Category for EST54 (expressed sequence tag), 100000 entries, 59370874 bases
    ddbjest54.seq  5581901 records
Category for EST55 (expressed sequence tag), 100000 entries, 48669602 bases
    ddbjest55.seq  6083243 records
Category for EST56 (expressed sequence tag), 100000 entries, 53121025 bases
    ddbjest56.seq  5694645 records
Category for EST57 (expressed sequence tag), 100000 entries, 65040046 bases
    ddbjest57.seq  5739131 records
Category for EST58 (expressed sequence tag), 100000 entries, 64319352 bases
    ddbjest58.seq  5624139 records
Category for EST59 (expressed sequence tag), 100000 entries, 45391182 bases
    ddbjest59.seq  5954294 records
Category for EST60 (expressed sequence tag), 100000 entries, 50461077 bases
    ddbjest60.seq  5999329 records
Category for EST61 (expressed sequence tag), 100000 entries, 56697834 bases
    ddbjest61.seq  5881716 records
Category for EST62 (expressed sequence tag), 100000 entries, 55756725 bases
    ddbjest62.seq  5226335 records
Category for EST63 (expressed sequence tag), 100000 entries, 36556280 bases
    ddbjest63.seq  4373390 records
Category for EST64 (expressed sequence tag), 100000 entries, 33329482 bases
    ddbjest64.seq  5398687 records
Category for EST65 (expressed sequence tag), 100000 entries, 36861498 bases
    ddbjest65.seq  5870412 records
Category for EST66 (expressed sequence tag), 100000 entries, 35607978 bases
    ddbjest66.seq  5721501 records
Category for EST67 (expressed sequence tag), 100000 entries, 34273026 bases
    ddbjest67.seq  5514618 records
Category for EST68 (expressed sequence tag), 100000 entries, 40252226 bases
    ddbjest68.seq  5886296 records
Category for EST69 (expressed sequence tag), 54539 entries, 19935956 bases
    ddbjest69.seq  3064485 records
Category for GSS1 (Genome Survey Sequence), 100000 entries, 64904920 bases
    ddbjgss1.seq   4721021 records
Category for GSS2 (Genome Survey Sequence), 100000 entries, 80332634 bases
    ddbjgss2.seq   5732808 records
Category for GSS3 (Genome Survey Sequence), 100000 entries, 88615461 bases
    ddbjgss3.seq   6227066 records
Category for GSS4 (Genome Survey Sequence), 100000 entries, 51403848 bases
    ddbjgss4.seq   5036810 records
Category for GSS5 (Genome Survey Sequence), 100000 entries, 41319632 bases
    ddbjgss5.seq   4988704 records
Category for GSS6 (Genome Survey Sequence), 100000 entries, 47506911 bases
    ddbjgss6.seq   5065716 records
Category for GSS7 (Genome Survey Sequence), 100000 entries, 52342347 bases
    ddbjgss7.seq   5470390 records
Category for GSS8 (Genome Survey Sequence), 100000 entries, 50917222 bases
    ddbjgss8.seq   5541912 records
Category for GSS9 (Genome Survey Sequence), 100000 entries, 50258254 bases
    ddbjgss9.seq   5650200 records
Category for GSS10 (Genome Survey Sequence), 100000 entries, 50400261 bases
    ddbjgss10.seq  5594218 records
Category for GSS11 (Genome Survey Sequence), 100000 entries, 51084750 bases
    ddbjgss11.seq  5844704 records
Category for GSS12 (Genome Survey Sequence), 100000 entries, 55724740 bases
    ddbjgss12.seq  5663184 records
Category for GSS13 (Genome Survey Sequence), 100000 entries, 52792789 bases
    ddbjgss13.seq  6001884 records
Category for GSS14 (Genome Survey Sequence), 100000 entries, 49854017 bases
    ddbjgss14.seq  5829491 records
Category for GSS15 (Genome Survey Sequence), 100000 entries, 47134194 bases
    ddbjgss15.seq  6049191 records
Category for GSS16 (Genome Survey Sequence), 100000 entries, 56734345 bases
    ddbjgss16.seq  5604254 records
Category for GSS17 (Genome Survey Sequence), 100000 entries, 46497747 bases
    ddbjgss17.seq  6339185 records
Category for GSS18 (Genome Survey Sequence), 100000 entries, 50530826 bases
    ddbjgss18.seq  7091234 records
Category for GSS19 (Genome Survey Sequence), 100000 entries, 50431105 bases
    ddbjgss19.seq  6767197 records
Category for GSS20 (Genome Survey Sequence), 100000 entries, 48721606 bases
    ddbjgss20.seq  6748284 records
Category for GSS21 (Genome Survey Sequence), 100000 entries, 58655105 bases
    ddbjgss21.seq  5639841 records
Category for GSS22 (Genome Survey Sequence), 47812 entries, 21399046 bases
    ddbjgss22.seq  2395234 records
Category for HTG1 (high throughput genomic sequencing), 3000 entries, 439950348 bases
    ddbjhtg1.seq   7731361 records
Category for HTG2 (high throughput genomic sequencing), 3000 entries, 334335636 bases
    ddbjhtg2.seq   5910933 records
Category for HTG3 (high throughput genomic sequencing), 3000 entries, 263803560 bases
    ddbjhtg3.seq   4705541 records
Category for HTG4 (high throughput genomic sequencing), 3000 entries, 206452744 bases
    ddbjhtg4.seq   3631565 records
Category for HTG5 (high throughput genomic sequencing), 3000 entries, 445555602 bases
    ddbjhtg5.seq   7903215 records
Category for HTG6 (high throughput genomic sequencing), 3000 entries, 468258335 bases
    ddbjhtg6.seq   8296130 records
Category for HTG7 (high throughput genomic sequencing), 3000 entries, 92977415 bases
    ddbjhtg7.seq   1761558 records
Category for HTG8 (high throughput genomic sequencing), 3000 entries, 13034222 bases
    ddbjhtg8.seq   354549 records
Category for HTG9 (high throughput genomic sequencing), 3000 entries, 53570089 bases
    ddbjhtg9.seq   1063535 records
Category for HTG10 (high throughput genomic sequencing), 3000 entries, 13902605 bases
    ddbjhtg10.seq  369253 records
Category for HTG11 (high throughput genomic sequencing), 3000 entries, 28228948 bases
    ddbjhtg11.seq  620872 records
Category for HTG12 (high throughput genomic sequencing), 3000 entries, 23609498 bases
    ddbjhtg12.seq  541168 records
Category for HTG13 (high throughput genomic sequencing), 3000 entries, 15424588 bases
    ddbjhtg13.seq  393904 records
Category for HTG14 (high throughput genomic sequencing), 3000 entries, 12555596 bases
    ddbjhtg14.seq  344760 records
Category for HTG15 (high throughput genomic sequencing), 3000 entries, 23717471 bases
    ddbjhtg15.seq  536557 records
Category for HTG16 (high throughput genomic sequencing), 3000 entries, 35145959 bases
    ddbjhtg16.seq  739416 records
Category for HTG17 (high throughput genomic sequencing), 3000 entries, 11566677 bases
    ddbjhtg17.seq  326172 records
Category for HTG18 (high throughput genomic sequencing), 3000 entries, 27931926 bases
    ddbjhtg18.seq  616244 records
Category for HTG19 (high throughput genomic sequencing), 3000 entries, 27313290 bases
    ddbjhtg19.seq  596635 records
Category for HTG20 (high throughput genomic sequencing), 3000 entries, 244799958 bases
    ddbjhtg20.seq  4419718 records
Category for HTG21 (high throughput genomic sequencing), 3000 entries, 36404550 bases
    ddbjhtg21.seq  758926 records
Category for HTG22 (high throughput genomic sequencing), 3000 entries, 222093582 bases
    ddbjhtg22.seq  3949473 records
Category for HTG23 (high throughput genomic sequencing), 3000 entries, 6214015 bases
    ddbjhtg23.seq  232939 records
Category for HTG24 (high throughput genomic sequencing), 3000 entries, 202960111 bases
    ddbjhtg24.seq  3655647 records
Category for HTG25 (high throughput genomic sequencing), 3000 entries, 106697987 bases
    ddbjhtg25.seq  2019861 records
Category for HTG26 (high throughput genomic sequencing), 3000 entries, 126245595 bases
    ddbjhtg26.seq  2315123 records
Category for HTG27 (high throughput genomic sequencing), 3000 entries, 489935565 bases
    ddbjhtg27.seq  8509275 records
Category for HTG28 (high throughput genomic sequencing), 3000 entries, 482045190 bases
    ddbjhtg28.seq  8406266 records
Category for HTG29 (high throughput genomic sequencing), 31 entries, 3406492 bases
    ddbjhtg29.seq  58079 records
Category for human1, 30000 entries, 607162944 bases
    ddbjhum1.seq   13076678 records
Category for human2, 30000 entries, 247114586 bases
    ddbjhum2.seq   6224904 records
Category for human3, 30000 entries, 290287763 bases
    ddbjhum3.seq   6835468 records
Category for human4, 30000 entries, 43170610 bases
    ddbjhum4.seq   2074964 records
Category for human5, 18332 entries, 45706608 bases
    ddbjhum5.seq   1585246 records
Category for invertebrates, 77952 entries, 359122493 bases
    ddbjinv.seq    10399499 records
Category for mammals, 27692 entries, 24458994 bases
    ddbjmam.seq    1555942 records
Category for patents, 257389 entries, 92015835 bases
    ddbjpat.seq    7076577 records
Category for phages, 1652 entries, 4551666 bases
    ddbjphg.seq    203766 records
Category for plants, 129229 entries, 349478461 bases
    ddbjpln.seq    12405993 records
Category for primates, 10211 entries, 9217321 bases
    ddbjpri.seq    588011 records
Category for rodents, 60764 entries, 105593989 bases
    ddbjrod.seq    4514918 records
Category for STS (sequence tagged site), 117007 entries, 51314276 bases
    ddbjsts.seq    6699852 records
Category for synthetic DNAs, 4146 entries, 10503608 bases
    ddbjsyn.seq    378918 records
Category for unannotated sequences, 486 entries, 285657 bases
    ddbjuna.seq    21782 records
Category for viruses, 110318 entries, 96136569 bases
    ddbjvrl.seq    6725614 records
Category for vertebrates, 48998 entries, 44630810 bases
    ddbjvrt.seq    2792756 records
Accession number index file
    ddbjacc.idx    10192738 records
Keyword phrase index file
    ddbjkey.idx    3735101 records
Journal citation index file
    ddbjjou.idx    5815766 records
Gene name index file
    ddbjgen.idx    556591 records
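
The per-category entry counts in the file listing add up to 10,165,597, the
total given in the release header. The following Python sketch shows one way
to repeat that kind of cross-check on a local copy of the listing. It is an
illustration only and is not part of the release: the function name
summarize and the variable names are hypothetical, and the pattern assumes
that every category is announced as "Category for <name>, <N> entries,
<M> bases", as in the listing.

```python
import re

# Assumed pattern for a category line of the file listing. Whitespace is
# matched loosely so the listing may be one item per line or run together.
CATEGORY_RE = re.compile(
    r"Category\s+for\s+(.+?),\s+(\d+)\s+entries,\s+(\d+)\s+bases",
    re.DOTALL,
)


def summarize(listing_text):
    """Return (number of categories, total entries, total bases)."""
    matches = CATEGORY_RE.findall(listing_text)
    total_entries = sum(int(entries) for _, entries, _ in matches)
    total_bases = sum(int(bases) for _, _, bases in matches)
    return len(matches), total_entries, total_bases


# Three categories copied from the listing, used here as a small sample.
sample = """\
Category for phages, 1652 entries, 4551666 bases
    ddbjphg.seq 203766 records
Category for synthetic DNAs, 4146 entries, 10503608 bases
    ddbjsyn.seq 378918 records
Category for unannotated sequences, 486 entries, 285657 bases
    ddbjuna.seq 21782 records
"""

count, entries, bases = summarize(sample)
print(count, entries, bases)  # 3 6284 15340931
```

To check a whole release, the text of the complete file listing would be
passed to summarize in place of the sample, and the returned totals compared
with the figures in the release header.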