DDBJ Amino Acid Sequence Database (DAD)
Release 1.5
July 10, 1997

This is a test release of the DDBJ Amino Acid Sequence Database (DAD).
The database was created by extracting all translated sequences from
DDBJ entries. Release 1.5 was built from DDBJ release 30 (July 1997).

1. Format of DAD Entries

DAD is divided into 15 files according to the organisms from which the
amino acid sequences are derived. The divisions are the same as those
of the DDBJ DNA Database, except that all EST data are put into a
single DAD file. Please refer to the DDBJ release note (filename:
ddbjrel.txt).

All amino acid sequences are stored in a FASTA-like format (ODEN
format). The first line of an entry consists of the character '>'
followed by the accession number of the entry, its PID (protein ID),
and the product name if it is known.

A DAD accession number consists of a DDBJ accession number and a
consecutive integer starting from 1, joined by a hyphen (-). For
example, amino acid sequences extracted from DDBJ entry D12345 have
the accession numbers D12345-1, D12345-2, etc.

The amino acid sequence begins on the line following the header line,
with up to sixty amino acids per line. The sequence is followed by a
double slash (//), which marks the end of the entry.

Below is an example of a DAD entry.

>X65727-1 PID:g825605 glutathione S-transferase
MAEKPKLHYSNTRGRMESIRWLLAAAGVEFEEKFIKSAEDLDKLRNDGYLMFQQVPMVEI
DGMKLVQTRAILNYIASKYNLYGKDIKEKALIDMYIEGIADLGEMILLLPFTQPEEQDAK
LALIQEKTKNRYFPAFEKVLKSHGQDYLVGNKLSRADIHLVELLYYVEELDSSLISSFPL
LKALKTRISNLPTVKKFLQPGSPRKPPMDEKSLEESRKIFRF
//

2. Statistics of DAD

The following are statistics of this release of DAD.

total number of entries        242,538
total length of sequences      74,829,275 aa
average length                 308.5 aa
name of longest sequence       X90568-1 PID:g1212992
length of longest sequence     26,926 aa (X90568-1)

file        entries    length (aa)
-------------------------------------
ddbjbct       61641       18384283
ddbjest         966          89694
ddbjhum       23988        7471563
ddbjinv       27150       10371035
ddbjmam        7041        1951623
ddbjpat        2048         571819
ddbjphg        2669         510928
ddbjpln       36675       14049339
ddbjpri        1942         372025
ddbjrod       22619        6721393
ddbjsts          11            869
ddbjsyn        1858         409789
ddbjuna         964         321921
ddbjvrl       43403       10997001
ddbjvrt        9563        2605993
-------------------------------------
total        242538       74829275
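
The entry format described in section 1 can be read with a short script. The following Python sketch is an illustration only and is not part of the DAD distribution: the names DadEntry, parse_header, and read_entries are hypothetical, and the code assumes that the header fields are separated by whitespace and that the PID field, when present, always carries the "PID:" prefix shown in the example entry.

```python
import io
from typing import Iterator, List, NamedTuple, Optional, TextIO


class DadEntry(NamedTuple):
    """One DAD record (hypothetical container, not an official schema)."""
    accession: str        # DAD accession, e.g. "X65727-1"
    ddbj_accession: str   # DDBJ part before the final hyphen, e.g. "X65727"
    serial: int           # consecutive integer after the hyphen, e.g. 1
    pid: Optional[str]    # protein ID without the "PID:" prefix, if present
    product: str          # product name, empty when unknown
    sequence: str         # amino acid sequence with line breaks removed


def parse_header(line: str, sequence: str = "") -> DadEntry:
    """Split a '>' header line into accession, PID, and product name."""
    tokens = line[1:].split()
    if not tokens:
        raise ValueError("header line has no accession number")
    accession, rest = tokens[0], tokens[1:]
    # The DAD accession is "<DDBJ accession>-<integer>"; split at the last hyphen.
    ddbj_accession, _, serial = accession.rpartition("-")
    pid = None
    if rest and rest[0].startswith("PID:"):  # assumption: PID is always prefixed
        pid = rest[0][len("PID:"):]
        rest = rest[1:]
    return DadEntry(accession, ddbj_accession, int(serial), pid,
                    " ".join(rest), sequence)


def read_entries(handle: TextIO) -> Iterator[DadEntry]:
    """Yield one DadEntry per record; a record ends at a '//' line."""
    header: Optional[str] = None
    chunks: List[str] = []
    for raw in handle:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            header, chunks = line, []
        elif line == "//":
            if header is not None:
                yield parse_header(header, "".join(chunks))
            header, chunks = None, []
        elif header is not None:
            chunks.append(line)


# The example entry from section 1.
SAMPLE = """\
>X65727-1 PID:g825605 glutathione S-transferase
MAEKPKLHYSNTRGRMESIRWLLAAAGVEFEEKFIKSAEDLDKLRNDGYLMFQQVPMVEI
DGMKLVQTRAILNYIASKYNLYGKDIKEKALIDMYIEGIADLGEMILLLPFTQPEEQDAK
LALIQEKTKNRYFPAFEKVLKSHGQDYLVGNKLSRADIHLVELLYYVEELDSSLISSFPL
LKALKTRISNLPTVKKFLQPGSPRKPPMDEKSLEESRKIFRF
//
"""

entries = list(read_entries(io.StringIO(SAMPLE)))
for entry in entries:
    print(entry.accession, entry.pid, entry.product, len(entry.sequence))
```

To read one of the division files, pass an open file object to read_entries in place of the io.StringIO wrapper. Because the function is a generator, it holds only one entry in memory at a time.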