===== ===== ===== ===== ===== ===== ===== ===== ===== ===== ===== =====
   THIS DATABASE MAY BE COPIED AND REDISTRIBUTED WITHOUT PERMISSION
   ON THE CONDITION THAT ALL THE STATEMENTS IN THIS RELEASE NOTE ARE
   REPRODUCED IN EACH COPY.
===== ===== ===== ===== ===== ===== ===== ===== ===== ===== ===== =====

                       DDBJ Amino Acid Sequence Database
                                     (DAD)

                                  Release 19.0
                                  Apr 23, 2002
                including 1,012,203 entries, 309,708,601 residues


    This is a release of the DDBJ Amino Acid Sequence Database (DAD).  This
database has been produced by extracting all translated sequences from
release 49 of the DDBJ/EMBL/GenBank entries (April 2002).

1. DAD Files

    DAD entries are stored in 17 separate files according to the organisms
from which the original DNA sequences are derived.  These 17 divisions are the
same as those of the DDBJ DNA Database, except that all translated sequences
from the EST sequences are put into one file of DAD.  Please refer to the
release note of the DDBJ release for details (filename: ddbjrel.txt).  Also,
there are two types of DAD files for each division: files with the suffix
".DAD" in the DAD standard format, and files with the suffix
".DAD.fasta" in a FASTA-compatible format.
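
    The naming scheme above can be sketched in Python as follows.  The
function name dad_filenames is hypothetical, and the sketch assumes that
every file name is "ddbj" plus the lowercase division name plus the suffix,
as in ddbjhtc.DAD; only the ".DAD" name for "htc" is stated explicitly in
this note.

```python
def dad_filenames(division):
    """Return the assumed (standard, FASTA) file names for one division.

    Assumption: names follow the pattern ddbj<division><suffix>,
    generalized from the single documented example "ddbjhtc.DAD".
    """
    base = "ddbj" + division.lower()
    return base + ".DAD", base + ".DAD.fasta"


standard_name, fasta_name = dad_filenames("htc")
print(standard_name)
print(fasta_name)
```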

2. Recent changes

    A new division "htc" (ddbjhtc.DAD) has been added in this release.

    Since release 11.0, the "PID" label has been replaced by "PROTEIN_ID",
and the "/protein_id" qualifier has been removed from the feature part.

    Since release 9.0, PID numbers have been replaced with Protein_ID numbers.
A Protein_ID is written like AAA12345.1, where the number after the period
is the version number.  The version number is increased by one when the
original DNA sequence is updated and the translated protein sequence
changes as a result.
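
    As an illustration, a Protein_ID can be split into its identifier and
version number as sketched below.  The function name split_protein_id is
hypothetical and is not part of any DDBJ tool; the sketch assumes only the
"identifier.version" layout described above.

```python
def split_protein_id(protein_id):
    """Split a Protein_ID such as 'AAA12345.1' into (identifier, version)."""
    identifier, dot, version = protein_id.rpartition(".")
    if not dot or not identifier or not version.isdigit():
        raise ValueError("not a versioned Protein_ID: %r" % protein_id)
    return identifier, int(version)


identifier, version = split_protein_id("BAA22986.1")
print(identifier, version)
```

    Two Protein_IDs with the same identifier but different version numbers
then refer to successive translations of the same coding region.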

3. Format of DAD Entries

    The standard format of DAD is almost the same as that of the DDBJ
nucleotide sequence database.  There are, however, notable differences,
as described below.

    Accession numbers of the DAD entries are written in the lines labeled
"ACCESSION".  A DAD accession number consists of a DDBJ accession number
and an integer starting from 1, joined by a hyphen (-).  For example, two
amino acid sequences extracted from the DDBJ entry D12345 have the accession
numbers D12345-1 and D12345-2, respectively.  This number is useful for
identifying a DAD entry.
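
    The accession number layout can be taken apart as in the following
sketch.  The function name split_dad_accession is hypothetical; the sketch
relies only on the "DDBJ accession, hyphen, integer from 1" rule given above.

```python
def split_dad_accession(accession):
    """Split a DAD accession such as 'D12345-2' into ('D12345', 2)."""
    ddbj_accession, hyphen, index = accession.rpartition("-")
    if not hyphen or not ddbj_accession or not index.isdigit():
        raise ValueError("not a DAD accession number: %r" % accession)
    number = int(index)
    if number < 1:
        # The integer part is documented to start from 1.
        raise ValueError("index must be 1 or greater: %r" % accession)
    return ddbj_accession, number


print(split_dad_accession("D12345-1"))
print(split_dad_accession("D12345-2"))
```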

    An amino acid sequence begins on the line after "BEGIN".  Up to
sixty amino acids are written on one line.  The amino acid sequence is
followed by a double slash (//), which marks the end of the entry.
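
    A minimal sketch of reading the sequence part of one entry is given
below.  The function name read_sequence is hypothetical.  The sketch assumes
that each sequence line starts with a position number followed by blocks of
residues separated by spaces, as in the sample entry of this note.

```python
def read_sequence(entry_lines):
    """Return the residues found between the BEGIN line and the // line."""
    blocks = []
    in_sequence = False
    for line in entry_lines:
        stripped = line.strip()
        if stripped == "BEGIN":
            in_sequence = True
        elif stripped == "//":
            break
        elif in_sequence:
            # Skip the leading position number; keep the residue blocks.
            blocks.extend(p for p in stripped.split() if not p.isdigit())
    return "".join(blocks)


# Sequence part of the sample entry AB000714-1 from this release note.
sample = """BEGIN
        1 MSMGLEITGT ALAVLGWLGT IVCCALPMWR VSAFIGSNII TSQNIWEGLW MNCVVQSTGQ
       61 MQCKVYDSLL ALPQDLQAAR ALIVVAILLA AFGLLVALVG AQCTNCVQDD TAKAKITIVA
      121 GVLFLLAALL TLVPVSWSAN TIIRDFYNPV VPEAQKREMG AGLYVGWAAA ALQLLGGALL
      181 CCSCPPREKK YTATKVVYSA PRSTGPGASL GTGYDRKDYV
//
""".splitlines()

sequence = read_sequence(sample)
print(len(sequence))  # 220, matching the length on the LOCUS line
```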

    The LOCUS line contains the locus name, the length of the protein, the
molecular type (always "PRT"), the division name, and the release date of
the DNA counterpart.  The DEFINITION line contains the species name and the
protein name.  The other parts of a DAD entry, including FEATURES, are almost
the same as those of the corresponding DDBJ entry.
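
    The LOCUS fields listed above can be read as in the following sketch.
The names Locus and parse_locus are hypothetical.  The sketch assumes that
the fields are separated by whitespace and that the length is followed by
the unit "aa", as in the sample entry; it does not rely on fixed column
positions, which this note does not specify.

```python
from typing import NamedTuple


class Locus(NamedTuple):
    name: str
    length: int          # length of the protein in amino acids
    molecule_type: str   # documented to be always "PRT"
    division: str
    date: str            # release date of the DNA counterpart


def parse_locus(line):
    """Parse a DAD LOCUS line into its five documented fields."""
    fields = line.split()
    # Expected layout: LOCUS <name> <length> aa <type> <division> <date>
    if len(fields) != 7 or fields[0] != "LOCUS" or fields[3] != "aa":
        raise ValueError("unexpected LOCUS line: %r" % line)
    return Locus(fields[1], int(fields[2]), fields[4], fields[5], fields[6])


locus = parse_locus(
    "LOCUS       AB000714      220 aa    PRT             HUM       27-OCT-1997"
)
print(locus)
```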

4. A Sample of DAD Entries

    Below is a typical sample DAD entry.  It may be useful for
understanding the format and contents.

----- ----- ----- ----- sample begin ----- ----- ----- -----
LOCUS       AB000714      220 aa    PRT             HUM       27-OCT-1997
DEFINITION  Homo sapiens RVP1 protein.
ACCESSION   AB000714-1
PROTEIN_ID  BAA22986.1
SOURCE      Homo sapiens tissue_lib:lung cDNA to mRNA.
  ORGANISM  Homo sapiens
            Eukaryotae; Metazoa; Chordata; Vertebrata; Mammalia; Eutheria;
            Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1
  AUTHORS   Katahira,J.
  TITLE     Direct Submission
  JOURNAL   Submitted (26-JAN-1997) to the DDBJ/EMBL/GenBank databases. Jun
            Katahira, Institute for Microbial Diseases, Osaka University,
            Department of Bacterial Toxinology; 3-1, Yamadaoka, Suita, Osaka
            565, Japan (E-mail:katahira@biken.osaka-u.ac.jp,
            Tel:81-6-879-8285, Fax:81-6-879-8283)
  STANDARD  full staff_review
REFERENCE   2
  AUTHORS   Katahira,J., Sugiyama,H., Inoue,N., Horiguchi,Y., Matsuda,M. and
            Sugimoto,N.
  TITLE     Clostridium perfringens enterotoxin utilizes two structurally
            related membrane proteins as functional receptors in vivo
  JOURNAL   J. Biol. Chem. 272, 26652-26658 (1997)
  STANDARD  full staff_review
COMMENT
FEATURES             Qualifiers
     source          /organism="Homo sapiens"
                     /sequenced_mol="cDNA to mRNA"
                     /tissue_lib="lung"
     protein         /gene="hRVP1"
                     /transl_table=1
BEGIN
        1 MSMGLEITGT ALAVLGWLGT IVCCALPMWR VSAFIGSNII TSQNIWEGLW MNCVVQSTGQ
       61 MQCKVYDSLL ALPQDLQAAR ALIVVAILLA AFGLLVALVG AQCTNCVQDD TAKAKITIVA
      121 GVLFLLAALL TLVPVSWSAN TIIRDFYNPV VPEAQKREMG AGLYVGWAAA ALQLLGGALL
      181 CCSCPPREKK YTATKVVYSA PRSTGPGASL GTGYDRKDYV
//
----- ----- ----- ----- sample end ----- ----- ----- -----

5. Statistics of DAD

    The following are statistics of this release of DAD.

total number of entries        1,012,203
total length of sequences    309,708,601 aa
average length                       305.9 aa
name of longest sequence      AJ277892-2 PID:CAD12456.1
length of longest sequence        34,350 aa (AJ277892-2)

======================================================
file       no. of entries    no. of amino acids
======================================================
ddbjbct       323,717         96,546,879
ddbjest           947             84,797
ddbjhtc        10,474          2,603,175
ddbjhtg         2,048            781,892
ddbjhum        85,978         28,300,724
ddbjinv       113,587         41,655,493
ddbjmam        20,762          5,142,223
ddbjpat        16,229          5,356,999
ddbjphg         8,750          1,794,044
ddbjpln       150,760         55,865,352
ddbjpri         8,290          1,770,283
ddbjrod        55,651         18,231,455
ddbjsts             4                314
ddbjsyn         4,870          1,308,201
ddbjuna           233             38,289
ddbjvrl       165,337         39,067,103
ddbjvrt        44,566         11,161,378
======================================================
total       1,012,203        309,708,601
======================================================
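
    The totals in the table above can be cross-checked with a short script
such as the following sketch.  The variable names are chosen for
illustration only; the figures are copied from the table.

```python
# (number of entries, number of amino acids) per file, from the table above.
divisions = {
    "ddbjbct": (323717, 96546879),
    "ddbjest": (947, 84797),
    "ddbjhtc": (10474, 2603175),
    "ddbjhtg": (2048, 781892),
    "ddbjhum": (85978, 28300724),
    "ddbjinv": (113587, 41655493),
    "ddbjmam": (20762, 5142223),
    "ddbjpat": (16229, 5356999),
    "ddbjphg": (8750, 1794044),
    "ddbjpln": (150760, 55865352),
    "ddbjpri": (8290, 1770283),
    "ddbjrod": (55651, 18231455),
    "ddbjsts": (4, 314),
    "ddbjsyn": (4870, 1308201),
    "ddbjuna": (233, 38289),
    "ddbjvrl": (165337, 39067103),
    "ddbjvrt": (44566, 11161378),
}

total_entries = sum(entries for entries, _ in divisions.values())
total_residues = sum(residues for _, residues in divisions.values())
average_length = total_residues / total_entries

print(len(divisions), total_entries, total_residues)
print("%.2f" % average_length)
```

    The sums agree with the totals of 1,012,203 entries and 309,708,601
residues.  The quotient is about 305.97, which is consistent with the listed
average of 305.9 aa if that figure was truncated rather than rounded.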

DNA Data Bank of Japan
Center for Information Biology
National Institute of Genetics
Mishima 411-8540, Japan
Phone:  +81 559 81 6853
FAX:    +81 559 81 6849
E-mail: ddbj@ddbj.nig.ac.jp  (for general inquiry)
WWW:    http://www.ddbj.nig.ac.jp (for DDBJ WWW server)