========================================
refseq/LocusLink/README
========================================
Last modified: 1 June, 2005

REMINDER:  This site is now inactive.
A synopsis of the relationship between the files LocusLink provided
and the files Entrez Gene provides is available here:
http://www.ncbi.nlm.nih.gov/entrez/query/static/help/LL2G.html#files

Previous files are now in the ARCHIVE directory.
========================================
Modifications:

5 July 1999------------------------------------------------
   addition of mim2loc file
   modification of LL_tmpl to include a line for PubMed ids

12 August 1999---------------------------------------------
   modification of LL_tmpl to add representation of PID values
   for CDS features on sequence records and links to MMDB for
   views of structures of proteins related in primary structure

10 September 1999------------------------------------------
   added report of marker data (STS)

29 November 1999-------------------------------------------
   modified display of map data to specify whether genetic (G)
   or cytogenetic (C), and to indicate the source of the map
   information

11 February 2000-------------------------------------------
   modified display in loc2ref to indicate whether the RefSeq
   record is reviewed or provisional

12 March 2000----------------------------------------------
   modified LL_tmpl to represent the source strain of sequence
   data, to report NC_ RefSeqs as a distinct category, and to
   remove a redundant tag:
   ***** strain representation:
      GenBank section:
         ACCNUM: M12266|BALB/c
                       ^separator added and strain provided when available
      RefSeq section:
         STRAIN: BALB/c
   ***** NC_ RefSeq records:
         NC: NC_001807 (paired with the protein accession)
   ***** tag removed:
         LOCUS_STRING

23 March 2000----------------------------------------------
   Sequence section: added a new protein-based link for
   Drosophila melanogaster based on the proteins predicted from
   the genomic sequence.

07 April 2000
   Began to provide species-specific reports in the LL.out
   format as well as the comprehensive one.
      LL.out:     comprehensive
      LL.out_dm   Drosophila melanogaster
      LL.out_dr   Danio rerio
      LL.out_hs   Homo sapiens
      LL.out_mm   Mus musculus
      LL.out_rn   Rattus norvegicus

26 June 2000:
   Added another element to the STS line to make the repeat set
   5 instead of 4.  The 5th element indicates whether the STS has
   been detected by electronic PCR in both sequences known to
   come from the locus and the genomic sequence.

17 July 2000:
   ACCNUM: accession|gi|strain
                     ^added
   PROT: accession|gi|structure information
         (no longer printed if there are no data)

13 Aug 2000:
   added tag LOCUS_TYPE
   added loc2cit file: (locus_id and citations)
      tab-delimited: LocusID, PubMed id, MedLine uid
      updated daily

2 November 2000:
   modified PROT reporting:
      PROT: accession|gi (structure link discontinued)
      One line per protein accession
   added structures to support NCBI's model transcripts from
      genomic analysis (MODEL, LOCUS_STRING, XM, XP)
   added domains predicted on proteins: CDD
   added links to the predicted mouse-human comparative map: COMP

5 December:
   added homol_seq_pairs
      a tab-delimited file of related mRNA sequences associated
      with current locus_ids

01 February 2001
   modified representation of STS so each marker has one line
   in LL_tmpl and so that the method of assigning the marker to
   the gene is explicit
   added functional representation
      SUMFUNC:  brief summary of the function
      GO:       Gene Ontology terms
      EXTANNOT: Proteome's BioKnowledge Library terms and other
                annotation from external sources

12-14 February 2001
   loc2ref and loc2acc: added column 5 to provide protein accession
   added section to represent related loci
      RELL: block to describe other genes (LocusID) related to
            the one being reported

12 September 2001: (first documented)
   GRIF: block to report text supplied by the public,
         describing the function of a gene or other critical
         aspect of a gene, and the PubMed id of the paper in
         which this was reported
   RELL:
         additional qualifier added to the block to support
         reporting mRNA accessions that are related to the
         LocusID, but may not necessarily instantiate that gene.
         This latter category is used only in those records
         generated through the genome annotation pipeline.

October, 2001:
   COMP: additional values to support linking to more than one
         comparative map
   XR:   a new type of RefSeq from the NCBI Annotation Project
         RNA only (no protein translation product)

November, 2001: (first documented)
   LocusID_history: added documentation for the file and changed
         the date column from the date the first LocusID was
         created to the date of the merge of LocusIDs.

April 26, 2002:
   modified the loc2acc and loc2ref files by:
   --adding version to the accessions
   --adding tax_id to support subdivision by species

June 24, 2002:
   --modified the CONTIG: line to provide information about the
     position of the gene on the contig
     the contig now defines a larger set, since a gene may be
     annotated on more than one contig, and the annotation on
     each contig may include more than one mRNA/protein pair

July 31, 2002:
   --changed the content of LL.out and LL.out_rn
     --the last column is now the id from RGD rather than RATMAP
     --the RATMAP id now occupies the 4th column position

November, 2002:
   --With the addition of the bovine genome, added a new file
     LL.out_bt.gz to summarize the attributes of cow genes
   --added the file XM_XP_history to represent which curated
     RefSeq accession (NM_/NP_) probably replaced a model
     RefSeq (XM_/XP_)
     NOTE: As of April 1, 2004, this file is archival only.
     The function of representing the relationship between
     XM_/XP_ and NM_/NP_ accessions is now being met by making
     XM/XP accessions secondary to NM/NP accessions.

April, 2003:
   --With the replacement of GO annotation from Proteome, Inc.
     with that from GOA, data in the EXTANNOT tag were no longer
     supplied.
   --The file loc2go was added to facilitate extraction of the
     relationship between LocusIDs and the GO identifiers
     assigned by FlyBase, GOA, MGD, and RGD.

September, 2003:
   --modified loc2go to add a column to display the evidence code
   --modified the scope of loc2acc to report protein-only
     accessions.  In that case, the nucleotide accession is
     reported as 'none' and the sequence type is 'p'

December, 2003:
   --changed the content of loc2cit to discontinue reporting
     MedLine UIDs.
     the files for converting MedLine UIs to PubMed IDs are
     available at:
        ftp://ftp.ncbi.nlm.nih.gov/pubmed/
     see also:
        http://www.nlm.nih.gov/bsd/revup/archive_aug03.html

January, 2004:   ======> moved to ARCHIVE subdirectory
   --homol_seq_pairs.gz is being phased out.  The contents had
     not been updated for some time.  Those interested in
     homology data are encouraged to use the HomoloGene ftp site:
        ftp://ftp.ncbi.nlm.nih.gov/pub/HomoloGene/

January 15, 2004:
   --REL2 is added as a new tag in LL_tmpl to support more
     explicit representation of protein-protein interactions
     the set of data currently populated for this function is
     described in
        http://www.ncbi.nlm.nih.gov/LocusLink/HIVinteractions.html

April 14, 2004
   --COMP data are changed to support a link to Map Viewer.
     LocusLink should no longer be used as the primary source of
     data for putative orthologs; HomoloGene should be used
     instead.
        ftp://ftp.ncbi.nlm.nih.gov/pub/HomoloGene/

June 18, 2004
   --note is added to loc2ref to indicate that the reporting of
     secondary accessions will be discontinued when LocusLink is
     no longer being updated

July 23, 2004
   --GRIF lines are removed from the LL_tmpl display because
     they are out of date
     Current information on GeneRIFs is available from
        ftp://ftp.ncbi.nlm.nih.gov/gene/GeneRIF/generifs_basic.gz
     Documented in
        ftp://ftp.ncbi.nlm.nih.gov/gene/README

March 1, 2005
   --With the phasing out of LocusLink, all files were moved to
     the ARCHIVE directory to provide a self-consistent set of
     records.
     LL_tmpl was copied as LL_tmpl_050301.gz
     It will continue to be refreshed temporarily.
========================================
This directory contains the following files:

README  -- this file
           an explanation of the LL_tmpl file

LL_tmpl -- a tag:value file containing complete information for
           each LocusID as detailed below

LL.out  -- a tab-delimited file containing summary information
           for each LocusID, one line per combination of LocusID
           and an identifier from the genome-specific database.
           Values:
              LocusID,
              official symbol,
              interim symbol,
              MIM number (if applicable),
              chromosome,
              cytogenetic or genetic localization,
              default gene name (whether the defaults are defined
                 by Human Nomenclature Committee is defined in
                 LL_tmpl),
              NCBI's tax_id,
              the identifier of the genome-specific database
                 (GDB, MGD, RGD).
           The tax_id can be used to identify the species of
           origin as follows:
              cow:             9913  (bt)
              human:           9606  (hs)
              mouse:           10090 (mm)
              rat:             10116 (rn)
              zebrafish:       7955  (dr)
              D. melanogaster: 7227  (dm)

loc2acc -- a tab-delimited file containing:
              LocusID
              nucleotide accession.version
              NCBI gi for the protein record associated with the
                 CDS annotated on the record (the number you see
                 in a GenBank-formatted record as
                 /db_xref="PID:g000000"),
              the type of sequence record
                 (m = mRNA, g = genomic, u = undefined, p = protein)
              protein accession.version
              tax_id (Same usage as in LL.out, namely the tax_id
                 used to indicate the genome, NOT NECESSARILY the
                 tax_id of the accession)
           NOTES: this file is updated daily; as sequences
           representing loci are refined, these associations may
           change
           if the accession is protein only, nucleotide
           accession.version is reported as 'none'

loc2go  -- a tab-delimited file containing:
              LocusID
              GO identifier
              GO evidence code

loc2ref -- a tab-delimited file containing:
              LocusID
              RefSeq accession.version
              the NCBI gi for the protein record associated with
                 the CDS annotated on the record (the number you
                 see in a GenBank-formatted RefSeq nucleotide
                 record as /db_xref="PID:g000000")
              the review status of
                 the RefSeq (reviewed or provisional or predicted
                 or model)
              protein accession.version
              tax_id (see LL.out documentation)
           NOTE: this file is updated daily; as provisional
           sequences representing loci are refined, these
           associations may change.
           NOTE: if one RefSeq becomes secondary to another, the
           value in column 4 will be 'secondary'.  This accession
           is retained in the report to make it easier to identify
           possible changes in LocusID-to-RefSeq sequence
           associations.
           NOTE: (6/18/2004) Reports of accessions that are
           secondary are provided in each RefSeq release.
           Information about the file is provided in:
              ftp://ftp.ncbi.nih.gov/refseq/release/release-notes/RefSeq-release#.txt
           The file that contains the information is:
              ftp://ftp.ncbi.nih.gov/refseq/release/release-catalog/release#.removed-records
           The replacement for this file from the Gene ftp site
              ftp://ftp.ncbi.nlm.nih.gov/gene/
           is:
              DATA/gene2refseq.gz

mim2loc -- a tab-delimited file containing:
              MIM number, LocusID
           It is valid for there to be more than one MIM number
           per locus.  One may be associated with a gene; others
           may be associated with distinct phenotypes resulting
           from different mutations in the gene.

loc2UG  -- a tab-delimited file containing the LocusID and the
           current UniGene cluster id.
           IMPORTANT: the correspondence between the LocusID and
           a cluster id may change after a UniGene build.
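
As an illustration of the loc2ref layout described above (six tab-delimited columns, with 'secondary' appearing in column 4 for superseded accessions), the following is a minimal Python sketch, not an NCBI-supplied tool. The names Loc2RefRow and read_loc2ref are hypothetical, and the sample rows, including the accession versions and the exact capitalization of the status values, are made up for the demonstration.

```python
import io
from typing import Iterable, Iterator, NamedTuple


class Loc2RefRow(NamedTuple):
    """One loc2ref line, following the column order given in this README."""
    locus_id: int
    refseq_accession: str   # RefSeq accession.version
    protein_gi: str         # kept as text; may not always be numeric
    status: str             # reviewed, provisional, predicted, model, or secondary
    protein_accession: str  # protein accession.version
    tax_id: int


def read_loc2ref(lines: Iterable[str]) -> Iterator[Loc2RefRow]:
    """Yield one row per well-formed line; lines without 6 columns are skipped."""
    for raw in lines:
        fields = raw.rstrip("\r\n").split("\t")
        if len(fields) != 6:
            continue
        locus_id, accession, gi, status, protein, tax_id = fields
        yield Loc2RefRow(int(locus_id), accession, gi, status, protein, int(tax_id))


# Hypothetical sample data (assumed values, not taken from a real loc2ref file).
SAMPLE = (
    "5076\tNM_000278.1\t4557821\tREVIEWED\tNP_000269.1\t9606\n"
    "5076\tNM_003987.1\t4557823\tsecondary\tNP_003978.1\t9606\n"
)

rows = list(read_loc2ref(io.StringIO(SAMPLE)))
# Drop accessions that have become secondary to another RefSeq.
current = [row for row in rows if row.status.lower() != "secondary"]
human = [row for row in rows if row.tax_id == 9606]
```

The status comparison is done in lower case because this README does not state how the review status is capitalized in the file.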

homol_seq_pairs:   NOTE: ARCHIVE ONLY
           tab-delimited file reporting sequence homologies based
           on BLAST analysis of mRNA records
              species1
              LocusID for species 1
              symbol for species 1
              sequence for species 1
              length of sequence for species 1
              species2
              LocusID for species 2
              symbol for species 2
              sequence for species 2
              length of sequence for species 2
              max % identity
              av % identity
              length of match sequence

LocusID_history
           tab-delimited file reporting changes in LocusIDs
              previous LocusID
              symbol at the time it became inactive
              current LocusID
              current symbol
              date the change was made

XM_XP_history   =======> moved to ARCHIVE DIRECTORY
           tab-delimited file reporting replacement of model
           accessions (XM_/XP_) by curated accessions (NM_/NP_)
           NOTE: this report is limited to the most direct
           replacement.  The NM_/NP_ accession is subject to
           additional curation and may itself be replaced or
           updated.
              LocusID:        LocusID assigned to the NM_/NP_ pair
                              at the time of the replacement
              NM_ accession:  mRNA accession and version replacing
                              the XM_
              nucleotide gi:  gi for the above mRNA (XM_)
              NP_ accession:  protein accession and version
                              replacing the XP_
              protein gi:     gi for the above protein (XP_)
              LocusID_model:  last LocusID annotated for the model
                              sequence
              XM_ accession:  mRNA accession that was replaced
              XP_ accession:  protein accession that was replaced

loc2sts
           bar-delimited file reporting correspondences between
           ids for STS as used by UniSTS and LocusLink IDs
              sts_uid
              preferred symbol
              preferred name
              map location
              LocusID
              taxonomy identifier (tax_id)

loc2cit   CHANGED Dec 12, 2003
           tab-delimited file reporting citations attached to
           LocusIDs
              LocusID
              PubMed id
           scope:  all citations connected to a LocusID
           source: GeneRIF, genome-specific databases, GenBank curation

========================================
The LL_tmpl file uses the following conventions:

[x|y|z]      x or y or z at that position
[required]   the tag is required, but a value may be null
[unique]     tag appears only once per record
[multiple]   tag may appear more than once
             per record
[SET]        indicates the beginning of a possible repeat unit
             for hierarchical tag:value pairs
[/SET]       end of repeat unit
[optional]   the tag may not appear in all records

Description of LL_tmpl:

>>[numeric]
   record separator; the number equals the LocusID

LOCUSID: [numeric] [unique] [required]
   the unique integer id for a locus

CURRENT_LOCUSID: [numeric] [unique] [optional]
   If a LocusID has been merged with another, the current
   LOCUSID corresponding to the value on the previous LOCUSID
   line is provided here.

LOCUS_CONFIRMED: [alphanumeric][yes|no]
   The LOCUSID has been assigned to a confirmed locus and can be
   treated as an identifier that will be tracked.

LOCUS_TYPE: [alphanumeric]
   description of the type of locus

ORGANISM: [alphanumeric] [unique] [required]
   source species (Homo sapiens, Rattus norvegicus, etc.), based
   on NCBI's Taxonomy

RELL: [set][optional][alphanumeric][multiple]
   description|id|id type|print representation[/set]
   brief text summarizing the relationship, the other id, the
   type of id, and the display for that second id.
   At present the id types are of 2 classes:
      l for locus_id,
      n for nucleotide accession
   official/default symbol for the other locus being described

STATUS: [alphanumeric] [optional] (only if a reference sequence exists)
   [REVIEWED|PROVISIONAL|PREDICTED|MODEL]
   type of reference sequence record
      PROVISIONAL: generated automatically from an existing
                   GenBank record and information stored in the
                   LocusLink database; no curation
      REVIEWED:    generated from the most representative,
                   complete GenBank sequence or merge of GenBank
                   sequences and from information stored in the
                   LocusLink database
      PREDICTED:   mRNA from a large-scale sequencing project
                   the CDS has been predicted from the nucleotide
                   sequence, but usually has not been verified
      MODEL:       a model based on NCBI's genomic sequence
                   assembly

NG: the RefSeq accession for genomic region (nucleotide) records

NR: the RefSeq accession for a non-messenger RNA.
    (mRNAs are part of a nucleotide/protein set NM/NP below)

[SET]
NM: the RefSeq accession for a mRNA record
   [alphanumeric] [optional] (only if a mRNA reference sequence exists)
   the accession for the mRNA, followed by the gi and the
   strain, if applicable

NC: the accession for chromosome RefSeq records
   [alphanumeric] [optional] (only if a reference sequence exists)
   the RefSeq accession for a genomic record, followed by the gi
   and strain, if applicable.

NP: the RefSeq accession for a protein record
   [alphanumeric] [optional] (only if a reference sequence exists)
   the RefSeq accession number for a protein record, followed by
   the PID for that protein and either MMDB or CBLASTP or na
   (values separated by |).
   MMDB indicates structure data are available for a protein
   related to the protein referenced by the PID.
   CBLASTP indicates that related proteins identified by BLASTP
   can be reviewed from the WWW site.

PRODUCT: [alphanumeric] [optional] (only if a reference sequence exists)
   the name of the product of this transcript

TRANSVAR: [alphanumeric] [optional] (only if a reference sequence exists)
   a variant-specific description

ASSEMBLY: [alphanumeric] [optional][multiple]
   (only if a reference sequence exists)[/SET]

CONTIG: [SET][alphanumeric][optional][multiple]
   the accession.version of the RefSeq contig, the nucleotide
   gi, the strain, the position of the gene
   (from|to|orientation), the chromosome, and an indicator of
   whether this is on the reference assembly or a
   strain|haplotype

XG: [alphanumeric][optional] (only if an NG accession was used in
   the annotation process to define position of features on the
   contig)
   NG accession, nucleotide gi, strain

[SET]
XR: [alphanumeric][optional] (only if a model exists)
   the RefSeq accession of a model RNA, not associated with a
   protein product

EVID: [alphanumeric] [optional] (only if a model exists)
   text summary of the evidence for this model

XM: [alphanumeric] [optional] (only if a model exists)
   the accession for the mRNA, followed by the gi
   and the strain, if applicable

XP: the RefSeq accession for a model protein record
   [alphanumeric] [optional] (only if an XM exists)
   the RefSeq accession of a model protein, followed by the PID
   for that protein and either MMDB or CBLASTP or na
   (values separated by |).
   MMDB indicates structure data are available for a protein
   related to the protein referenced by the PID.
   CBLASTP indicates that related proteins identified by BLASTP
   can be reviewed from the WWW site.

CDD: [alphanumeric][multiple][optional]
   name|key|score|e_value|bit_score
[/SET]
[/SET]

ACCNUM: GenBank accession used to assemble the RefSeq record
   [SET][alphanumeric] [optional] [multiple]
   nucleotide sequence accession number (no version),
   nucleotide gi, strain (if applicable), 5' end of the gene in
   the sequence, 3' end of the gene in the sequence
   one accession number per line

TYPE: [e|m|g]
   refers to type of nucleotide sequence:
      e=EST
      m=mRNA
      g=genomic

PROT: [SET][multiple][optional]
   A potentially repeating set of two values: accession and
   identifier (PID value) for the coding region or regions
   annotated on the associated nucleotide record, one line for
   each accession
   If no data are available, na is supplied.
   The delimiter is |.
[/SET][/SET]

[OFFICIAL|PREFERRED]_SYMBOL: [alphanumeric] [unique] [required]
   the symbol used for gene reports
      OFFICIAL:  validated by the appropriate nomenclature committee
      PREFERRED: interim option selected for display
   na is used for models without evidence

[OFFICIAL|PREFERRED]_GENE_NAME: [alphanumeric] [unique]
   [required (but may be null)]
   the gene description used for gene reports
      OFFICIAL:  validated by the appropriate nomenclature committee
      PREFERRED: interim option selected for display

   [NOTES--If the symbol is official, the gene_name will be
   official.  No record will have both official AND interim
   nomenclature.]

PREFERRED_PRODUCT: [alphanumeric] [unique] [optional]
   the name of the product used in the RefSeq record

ALIAS_SYMBOL: [alphanumeric][multiple]
   other symbols associated with this gene

ALIAS_PROT: [alphanumeric][multiple]
   other protein names associated with this gene

REL2: [set][optional][alphanumeric][multiple]
   LocusID of the interacting protein|
   RefSeq accession of the interacting protein|
   name of the interacting protein|
   keyword for the type of interaction|
   accession of the RefSeq protein associated with this locus|
   name of the RefSeq protein at this locus|
   a description of the interaction|
   PubMed id(s) describing the interaction
   [/set]

PHENOTYPE: [SET][alphanumeric][multiple]
   a phenotype associated with a mutation in this gene

PHENOTYPE_ID: [/SET]
   an ID used for this phenotype.  For humans, this is the MIM
   number

SUMMARY: [alphanumeric][optional]
   a summary description of the gene, its products, its
   significance, and mutant phenotypes

UNIGENE: [alphanumeric][multiple]
   UniGene cluster id(s) associated with this gene

OMIM: [numeric][optional][multiple]
   MIM number

CHR: [alphanumeric][optional][multiple]
   the chromosome assignment

MAP: [alphanumeric][optional][multiple]
   One line, consisting of a repeating set of 3 data elements,
   each element separated by |
   the first element is the location; the second is the source
   (as a URL when appropriate), and the third element is the
   type of map information (G = genetic, C = cytogenetic)

STS: set of STS markers
   [SET][alphanumeric][optional][multiple]
   multiline set, one marker per line
   marker name|chromosome|sts_id|D segment|seq_known|evidence[/SET]
   evidence types are currently either epcr, or PubMed id(s)

COMP: link to comparative maps in Map Viewer
   provides the URL and enumerates the species for which
   comparative views are available
   one line per record

ECNUM: [alphanumeric][optional][multiple]

BUTTON: [SET][alphanumeric][optional]
   a web resource accessed by a button, as well as or in
   addition to text

LINK: [/SET][alphanumeric]
   the url
   underlying the button
   (note: if there are variation data for this locus at NCBI,
   the line "BUTTON: snp.gif" will be present)

DB_DESCR: [SET][alphanumeric][optional][multiple]
   The name of an external web site with more information about
   this locus

DB_LINK: [/SET][alphanumeric]
   the URL

PMID: [numeric][multiple]
   a subset of publications associated with this locus with the
   link being the PubMed unique identifier
   comma separated

GRIF: [SET][alphanumeric][optional][multiple][/SET]
   PubMed unique identifier|comment

SUMFUNC: [alphanumeric][optional]
   a brief summary of the function of the products of this locus

GO: [SET][alphanumeric][optional][/SET]
   category of term|the term itself|evidence code|GO identifier|
   source of annotation|PubMed id(s)

EXTANNOT: [SET][alphanumeric][optional][/SET]
   category of term|the term itself|evidence code|
   source of annotation|PubMed id(s)

EXAMPLE (hypothetical only)

>>5076
LOCUSID: 5076
LOCUS_CONFIRMED: yes
LOCUS_TYPE: gene with protein product, function known or inferred
ORGANISM: Homo sapiens
STATUS: REVIEWED
NM: NM_000278|4557820|na
NP: NP_000269|4557821
CDD: Paired Box domain|PAX|517|na|205.797
CDD: 'Paired box' domain|pfam00292|540|na|214.756
PRODUCT: paired box protein 2 isoform b
TRANSVAR: Transcript Variant: This splice variant (b) does not contain the alternate exons (6 and 10), and utilizes the normal exon 12 splice junction.
ASSEMBLY: M89470
NM: NM_003987|4557822|na
NP: NP_003978|4557823
CDD: Paired Box domain|PAX|527|na|209.692
CDD: 'Paired box' domain|pfam00292|540|na|214.756
PRODUCT: paired box protein 2, isoform a
TRANSVAR: Transcript Variant: This splice variant (a) includes the alternate exon 6 but not exon 10, and utilizes the normal exon 12 splice junction.
ASSEMBLY: AH006910,M89470
NM: NM_003988|4557824|na
NP: NP_003979|4557825
CDD: Paired Box domain|PAX|517|na|205.797
CDD: 'Paired box' domain|pfam00292|540|na|214.756
PRODUCT: paired box protein 2, isoform c
TRANSVAR: Transcript Variant: This splice variant (c) includes alternate exon 10 but not exon 6, and utilizes the normal exon 12 splice junction.
ASSEMBLY: L25597,M89470
NM: NM_003989|4557826|na
NP: NP_003980|4557827
CDD: Paired Box domain|PAX|527|na|209.692
CDD: 'Paired box' domain|pfam00292|540|na|214.756
PRODUCT: paired box protein 2, isoform d
TRANSVAR: Transcript Variant: This splice variant (d) includes the alternate exon 6 but not exon 10, and also utilizes an alternate exon 12 splice junction that results in a different COOH-terminus.
ASSEMBLY: AH006910,M89470
NM: NM_003990|4557828|na
NP: NP_003981|4557829
CDD: Paired Box domain|PAX|527|na|209.692
CDD: 'Paired box' domain|pfam00292|540|na|214.756
PRODUCT: paired box protein 2, isoform e
TRANSVAR: Transcript Variant: This splice variant (e) includes the alternate exon 6, lacks alternate exon 10, and uses an alternate exon 12 splice junction that results in a different COOH-terminus.
ASSEMBLY: M89470
CONTIG: NT_008874
EVID: supported by alignment with mRNA
XM: XM_005943|11432394|na
XP: XP_005943|11432395|na
EVID: supported by alignment with mRNA
XM: XM_005944|11432408|na
XP: XP_005944|11432409|na
EVID: supported by alignment with mRNA
XM: XM_005945|11432403|na
XP: XP_005945|11432404|na
EVID: supported by alignment with mRNA
XM: XM_005946|11432412|na
XP: XP_005946|11432413|na
EVID: supported by alignment with mRNA
XM: XM_005947|11432398|na
XP: XP_005947|11432399|na
ACCNUM: L09747|292380|na
TYPE: g
ACCNUM: U45245|3649601|na
TYPE: g
PROT: AAC63385|1469415
ACCNUM: U45247|1469405|na
TYPE: g
PROT: AAC63385|1469415
ACCNUM: U45248|1469406|na
TYPE: g
PROT: AAC63385|1469415
ACCNUM: U45249|1469407|na
TYPE: g
PROT: AAC63385|1469415
ACCNUM: U45250|1469408|na
TYPE: g
PROT: AAC63385|1469415
ACCNUM: U45251|1469409|na
TYPE: g
PROT: AAC63385|1469415
ACCNUM: U45252|1469410|na
TYPE: g
PROT: AAC63385|1469415
ACCNUM: U45253|1469411|na
TYPE: g
PROT: AAC63385|1469415
ACCNUM: U45254|1469412|na
TYPE: g
PROT: AAC63385|1469415
ACCNUM: U45255|1469413|na
TYPE: g
PROT: AAC63385|1469415
ACCNUM: L25597|438649|na
TYPE: m
PROT: AAA36417|438650
ACCNUM: M89470|409138|na
TYPE: m
PROT: AAA60024|409139
OFFICIAL_SYMBOL: PAX2
OFFICIAL_GENE_NAME: paired box gene 2
PREFERRED_PRODUCT: paired box protein 2, isoform d
PREFERRED_PRODUCT: paired box protein 2, isoform e
PREFERRED_PRODUCT: paired box protein 2 isoform b
PREFERRED_PRODUCT: paired box protein 2, isoform a
PREFERRED_PRODUCT: paired box protein 2, isoform c
SUMMARY: Summary: PAX2 encodes paired box gene 2, one of many human homologues of the Drosophila melanogaster gene prd. The central feature of this transcription factor gene family is the conserved DNA-binding paired box domain. PAX2 is believed to be a target of transcriptional suppression by the tumor suppressor gene WT1. Mutations within PAX2 have been shown to result in optic nerve colobomas and renal hypoplasia.
PAX2 undergoes alternative splicing that results in 5 transcripts, splice variants a-e.
CHR: 10
RELL: gene|51441|l|HGRG8
RELL: related mRNA|BC002559|n|XM_033717--BC002559
RELL: related mRNA|NM_016258|n|XM_001812--NM_016258
STS: CHLC.UTR_04354_M89470|10|74159|D10S2478|seq_map|epcr
STS: sts-M89470|10|88437|na|seq_map|epcr
COMP: Pax2|10|19|19 43.0 cM|18504
ALIAS_PROT: paired box homeotic gene 2
UNIGENE: Hs.155644
BUTTON: unigene.gif
LINK: http://www.ncbi.nlm.nih.gov/UniGene/clust.cgi?ORG=Hs&CID=155644
OMIM: 167409
MAP: 10q22.1-q24.3|RefSeq|C|
MAPLINK: http://www.ncbi.nlm.nih.gov/cgi-bin/Entrez/maps.cgi?ORG=hum&chr=10&maps=morbid,gene,loc&query=PAX2&VERBOSE=ON&ZOOM=1
PHENOTYPE: Optic nerve coloboma with renal disease
PHENOTYPE_ID: 120330
BUTTON: snp.gif
LINK: http://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?locusId=5076
BUTTON: homol.gif
LINK: http://www.ncbi.nlm.nih.gov/HomoloGene/homolquery.cgi?TEXT=5076[loc]
BUTTON: gdb.gif
LINK: http://gdbwww.gdb.org/gdb-bin/genera/accno?GDB:138771
DB_DESCR: GeneCard for PAX2
DB_LINK: http://bioinformatics.weizmann.ac.il/cards-bin/carddisp?PAX2
DB_DESCR: Human PAX2 Allelic Variant Database
DB_LINK: http://www.hgu.mrc.ac.uk/Softdata/PAX2/
PMID: 9439670,9297966,8661132,8431641,8241771,7981748,7819127,7795640,1378753
GRIF: 10958699|Murine orthologue of the human retinal glycoprotein IPM 150 (IMPG 1), involved in retinal adhesion and photoreceptor cell survival. Analyses of IPM 150 and IPM 200 core proteins reveals the presence of multiple conserved domain of unknown function.
SUMFUNC: Member of the paired domain family of nuclear transcription activators; stimulates transcription of Wilms tumor suppressor gene (WT1)|Proteome
GO: molecular function|transcription activating factor|E|GO:0003710|Proteome|8760285
GO: molecular function|DNA binding|P|GO:0003677|Proteome|9106533
GO: biological process|transcription from Pol II promoter|E|GO:0006366|Proteome|8760285
GO: biological process|axonogenesis|P|GO:0007409|Proteome|9106533
GO: biological process|vision|P|GO:0007601|Proteome|9106533
GO: biological process|histogenesis and organogenesis|NR|GO:0007397|Proteome|na
EXTANNOT: cellular role|Pol II transcription|NR|Proteome|8760285
EXTANNOT: biochemical function|DNA-binding protein|NR|Proteome|9106533
EXTANNOT: biochemical function|Activator|NR|Proteome|8760285
EXTANNOT: organismal role|Osmoregulation and Excretion|NR|Proteome|9106533
EXTANNOT: organismal role|Photoreception|NR|Proteome|9106533
EXTANNOT: molecular localization|DNA-associated (direct or indirect)|NR|Proteome|9106533
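
For readers who want to process LL_tmpl programmatically, the tag:value conventions above (a ">>" record separator followed by "TAG: value" lines, with | separating sub-fields) can be read with a short script. The following Python sketch is illustrative only and is not an NCBI-supplied parser. The function name iter_records and the regular expression are hypothetical choices, and the handling of wrapped continuation lines is an assumption, since this README does not say whether long values are ever wrapped in the distributed file.

```python
import re
from typing import Dict, Iterable, Iterator, List, Tuple

# Assumption: a tag is an upper-case token (letters, digits, underscore)
# immediately followed by a colon, e.g. "LOCUSID:" or "OFFICIAL_SYMBOL:".
TAG_LINE = re.compile(r"^([A-Z][A-Z0-9_]*):\s?(.*)$")

Record = Dict[str, List[str]]


def iter_records(lines: Iterable[str]) -> Iterator[Tuple[int, Record]]:
    """Yield (LocusID, {tag: [values...]}) for each >>-separated record."""
    locus_id = None
    tags: Record = {}
    last_tag = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue
        if line.startswith(">>"):
            if locus_id is not None:
                yield locus_id, tags
            locus_id, tags, last_tag = int(line[2:].strip()), {}, None
            continue
        if locus_id is None:
            continue  # ignore anything before the first record separator
        match = TAG_LINE.match(line)
        if match:
            last_tag = match.group(1)
            tags.setdefault(last_tag, []).append(match.group(2))
        elif last_tag is not None:
            # Assumed behavior: an untagged line continues the previous value.
            tags[last_tag][-1] += " " + line.strip()
    if locus_id is not None:
        yield locus_id, tags


# A shortened excerpt of the hypothetical PAX2 example shown in this README.
SAMPLE = """\
>>5076
LOCUSID: 5076
ORGANISM: Homo sapiens
NM: NM_000278|4557820|na
NM: NM_003987|4557822|na
OFFICIAL_SYMBOL: PAX2
SUMMARY: Summary: PAX2 encodes paired box gene 2,
one of many human homologues of the Drosophila melanogaster gene prd.
GO: molecular function|DNA binding|P|GO:0003677|Proteome|9106533
"""

records = list(iter_records(SAMPLE.splitlines()))
locus, record = records[0]
nm_accessions = [value.split("|")[0] for value in record["NM"]]
go_identifier = record["GO"][0].split("|")[3]
```

Because tags marked [multiple] can repeat, every tag maps to a list of values in file order. The sketch does not reconstruct the nested [SET] groupings (for example, which NP line belongs to which NM line); that would require tracking the order of tags within each repeat unit.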