Protein Sequence Motif Extraction with a Probabilistic Logic Neural Network: Motif Evaluation on a 3-D Structure

Kazuhiro Iida[1] (iida@exp.cl.nec.co.jp)
Hiroshi Mamitsuka[2] (mami@ibl.cl.nec.co.jp)

[1]Fundamental Research Laboratories, NEC Corporation
34 Miyuki-ga-oka, Tsukuba, Ibaraki, 305 Japan
[2]C&C Research Laboratories, NEC Corporation
4-1-1 Miyazaki, Miyamae, Kawasaki, Kanagawa, 216 Japan


Abstract

A probabilistic logic neural network, the mSDN, reveals multiple biochemical rules hidden in a protein amino-acid sequence. Two motifs are extracted from a 16-residue hemoglobin alpha-helix region. The motifs, each containing only 3 amino-acid residues, correctly classify new data with 96% accuracy. Evaluating the motifs on a hemoglobin 3-D structure suggests that one motif represents a local alpha-helix determinant, and the other explains long-range interactions that are important for hemoglobin tertiary structure. The findings indicate that the mSDN extracts region-specific and biochemically significant motifs from an amino-acid sequence, and suggest that the network separates heterogeneous biochemical rules in a sequence into corresponding motifs. Motifs extracted by the mSDN will help us to analyze and predict protein conformations and their functions.