Protein sequence databases in the context of genome projects

Amos Bairoch (bairoch@cmu.unige.ch)

Medical Biochemistry Department, University of Geneva
1211 Geneva 4, Switzerland


Abstract

Recent developments concerning the SWISS-PROT and PROSITE databases are discussed in the context of genome projects and of new network access tools.

SWISS-PROT[1] is a curated protein sequence database which strives to provide a high level of annotation (such as the description of the function of a protein, its domain structure, post-translational modifications, variants, etc.), a minimal level of redundancy and a high level of integration with other databases. In recent months we have developed the database in the following directions:

PROSITE[2] is a compilation of sites and patterns found in protein sequences; it can be used to help determine the function of uncharacterized proteins translated from genomic or cDNA sequences. Recent developments include:
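As a separate illustration of how a PROSITE-style pattern can be scanned against a translated sequence, the following minimal sketch converts a pattern into a regular expression and reports every match. It assumes the usual pattern notation (elements separated by "-", "x" for any residue, "[...]" for allowed residues, "{...}" for excluded residues, "(n)" or "(n,m)" for repetition, "<" and ">" for sequence ends). The function names and the test sequence are hypothetical; the N-glycosylation pattern is used only as a familiar example. This is not the software used to build or search PROSITE.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    pattern = pattern.strip().rstrip(".")  # patterns are terminated by a period
    pieces = []
    for element in pattern.split("-"):
        anchored_start = element.startswith("<")
        anchored_end = element.endswith(">")
        element = element.strip("<>")
        # Separate the residue specification from an optional (n) or (n,m).
        match = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        core, count = match.group(1), match.group(2)
        if core == "x":
            core = "."                       # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"   # excluded residues
        piece = core + ("{" + count + "}" if count else "")
        if anchored_start:
            piece = "^" + piece
        if anchored_end:
            piece = piece + "$"
        pieces.append(piece)
    return "".join(pieces)


def scan(sequence, pattern):
    """Return (1-based position, matched segment) for every match, overlaps included."""
    # A lookahead with a capturing group lets finditer report overlapping hits.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    # Hypothetical sequence fragment, invented for this example.
    print(scan("MKNGSALNPTQ", "N-{P}-[ST]-{P}."))
```

A short pattern such as this one will match many unrelated sequences by chance, so a hit is a hint for further analysis rather than proof of function.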

Both SWISS-PROT and PROSITE are available through the ExPASy World-Wide Web (WWW) server[3]. WWW is a powerful global information system merging networked information retrieval and hypertext. The ExPASy server allows access to the SWISS-PROT, PROSITE, SWISS-2DPAGE and SWISS-3DIMAGE databases and, through any SWISS-PROT protein sequence entry, to other databases such as EMBL, REBASE, FlyBase, GCRDb, MaizeDB, OMIM, PDB and Medline.
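The links from a SWISS-PROT entry to other databases are carried by the cross-reference (DR) lines of the entry. The sketch below shows one way such lines could be collected from an entry in flat-file form. It assumes that each DR line starts with the two-letter line code followed by three spaces, and that its fields are separated by semicolons with the database name first and the primary identifier second. The function name, the sample entry and all identifiers in it are invented for illustration; none of this describes how the ExPASy server itself is implemented.

```python
from collections import defaultdict


def cross_references(entry_text):
    """Collect primary identifiers from DR lines, grouped by database name."""
    refs = defaultdict(list)
    for line in entry_text.splitlines():
        if not line.startswith("DR   "):
            continue  # only cross-reference lines are of interest here
        # Drop the line code, the trailing period, then split the fields.
        fields = [f.strip() for f in line[5:].rstrip().rstrip(".").split(";")]
        if len(fields) < 2:
            continue  # malformed line: no identifier present
        database, primary_id = fields[0], fields[1]
        refs[database].append(primary_id)
    return dict(refs)


# Invented entry fragment; the identifiers do not refer to real records.
SAMPLE_ENTRY = """\
ID   EXAMPLE_HUMAN   STANDARD;   PRT;   120 AA.
AC   P00000;
DE   HYPOTHETICAL EXAMPLE PROTEIN.
DR   EMBL; X00000; -; -.
DR   PDB; 0XXX; -.
DR   PROSITE; PS00001; -; 1.
//
"""

if __name__ == "__main__":
    for database, identifiers in cross_references(SAMPLE_ENTRY).items():
        print(database, identifiers)
```

A hypertext server can turn each (database, identifier) pair obtained this way into a link to the corresponding record in the remote database.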