Classification of Proteins via Successive State Splitting Algorithm of Hidden Markov Network

Hidetoshi Tanaka [1] (htanaka@icot.or.jp)
Kentaro Onizuka [1] (onizuka@icot.or.jp)
Kiyoshi Asai [2] (asai@etl.go.jp)

[1] Institute for New Generation Computer Technology (ICOT)
1-4-28 Mita, Minato-ku, Tokyo 108
[2] Electrotechnical Laboratory (ETL)
1-1-4 Umezono, Tsukuba 305


Abstract

Hidden Markov Models (HMMs) provide a stochastic approach to protein representation and motif abstraction. A stochastic classification method that integrates seamlessly with HMM-based representation and abstraction is therefore needed. The Successive State Splitting (SSS) algorithm classifies proteins represented by HMMs without using any prior knowledge of the proteins. SSS was originally developed for allophone modeling and is based on continuous distributions of phoneme data. It automatically yields an appropriate Hidden Markov Network and, at the same time, the corresponding HMMs. To apply SSS to protein sequences, we map amino acids onto a continuous space using a quantification based on the PAM-250 matrix.
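
The abstract does not spell out the quantification procedure, so the following is only a minimal sketch of one plausible reading: symbols are placed on a continuous axis by taking the principal coordinate of a double-centered similarity matrix. The four symbols and the matrix values are hypothetical toy numbers, not PAM-250 entries, and the function names (double_center, principal_axis) are illustrative assumptions rather than anything defined by the authors.

```python
import math

def double_center(sim):
    """Subtract row and column means and add back the grand mean.

    Assumes `sim` is a symmetric square matrix (list of lists).
    """
    n = len(sim)
    row_mean = [sum(r) / n for r in sim]
    grand = sum(row_mean) / n
    return [[sim[i][j] - row_mean[i] - row_mean[j] + grand for j in range(n)]
            for i in range(n)]

def principal_axis(mat, iters=500):
    """Power iteration for the dominant eigenpair of a symmetric matrix."""
    n = len(mat)
    v = [float(i + 1) for i in range(n)]  # deterministic, non-constant start
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit-length vector gives the eigenvalue estimate.
    lam = sum(v[i] * sum(mat[i][j] * v[j] for j in range(n)) for i in range(n))
    return v, lam

# Hypothetical toy similarity scores (NOT PAM-250 values):
# A and B are mutually similar, as are C and D.
symbols = ["A", "B", "C", "D"]
toy_similarity = [
    [ 4,  2, -2, -3],
    [ 2,  4, -1, -2],
    [-2, -1,  4,  2],
    [-3, -2,  2,  4],
]

centered = double_center(toy_similarity)
vec, lam = principal_axis(centered)

# Scale the eigenvector by sqrt(eigenvalue) to obtain 1-D coordinates.
scale = math.sqrt(max(lam, 0.0))
coords = {s: scale * x for s, x in zip(symbols, vec)}

for s in symbols:
    print(s, round(coords[s], 3))
```

In this sketch, similar symbols end up close together on the axis and dissimilar ones far apart, which is the property a continuous-density HMM needs from its input space. With a real substitution matrix, more than one axis would likely be retained; the choice of dimensionality is not stated in the abstract.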