Indexing protein sequences with MINOS

H. Ripoche[1] (hr@lirmm.fr)
E. Mephu Nguifo[2] (mephu@lens.lifl.fr)
J. Sallantin[1] (js@lirmm.fr)

[1] LIRMM
UMR 9928 CNRS - Montpellier II
161 rue Ada
F-34392 Montpellier
[2] Université d'Artois - IUT de LENS
Département d'Informatique
Rue de l'Université - SP 16
62307 LENS cedex


Abstract

This paper concerns the use of an object-oriented database for the analysis of protein sequences. We describe proteins either by bibliographic information or by prediction functions such as Prosite patterns [2, 5]. We propose to use concept lattices, a tool used in information retrieval to build thesauruses, to classify protein sequences. This classification may help in finding sequence alignments or in discussing them. Conversely, sequence alignments can be used to critique the structuring of the sequences.

Keywords: Knowledge Discovery in Databases, Concept Lattices, Object-Oriented Databases, Sequence Alignments, Protein Data Bank, Prosite.