Protein alpha-Helix Region Prediction Using Stochastic-Rule Learning

Hiroshi Mamitsuka[1]
Kenji Yamanishi[2]

[1]C&C Information Technology Research Labs., NEC Corp.
[2]NEC Research Institute, Inc.


Abstract

In this paper, we apply Mamitsuka and Yamanishi's method (the MY method, for short) to predicting alpha-helix regions in alpha-domain-type (alpha/alpha) proteins. The MY method produces a stochastic rule that assigns, to any region in an amino acid sequence, a probability that the region is an alpha-helix. Further, on the basis of the minimum description length (MDL) principle, the MY method optimally categorizes the 20 types of amino acids into fewer than 20 groups using their numerical attributes (e.g., molecular weight and hydrophobicity). Our experimental results show that, when a variety of proteins is used to obtain examples of alpha-helices, the MY method achieves average prediction rates of more than 80% and 70% for training and test examples, respectively. These results are significantly better than those of conventional methods, such as those of Chou and Fasman, Garnier et al., and Qian and Sejnowski.
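
To make the MDL-based categorization concrete, the following is a minimal sketch and not the MY method itself. It assumes a single numerical attribute per amino acid, groups that are contiguous along that attribute, per-amino-acid counts of helix and non-helix occurrences, one Bernoulli parameter per group, and a penalty of (k/2) log2 N bits for k groups and N examples. The names mdl_groups, attribute and counts are hypothetical and chosen only for illustration.

```python
import math
from typing import Dict, List, Tuple


def neg_log_lik(helix: int, total: int) -> float:
    """Bits needed to encode `helix` positives among `total` labels
    using the maximum-likelihood Bernoulli parameter helix/total."""
    bits = 0.0
    for c in (helix, total - helix):
        if c > 0:
            bits -= c * math.log2(c / total)
    return bits


def mdl_groups(attribute: Dict[str, float],
               counts: Dict[str, Tuple[int, int]]) -> List[List[str]]:
    """Group symbols (assumed: amino acids) that are adjacent in attribute
    value, choosing the partition with the smallest description length.

    counts[s] = (helix occurrences, non-helix occurrences); at least one
    occurrence overall is assumed. This is an illustrative simplification."""
    order = sorted(attribute, key=attribute.get)
    n = len(order)
    n_examples = sum(h + m for h, m in counts.values())

    # Prefix sums of helix counts and total counts along the sorted order.
    ph, pt = [0], [0]
    for s in order:
        h, m = counts[s]
        ph.append(ph[-1] + h)
        pt.append(pt[-1] + h + m)

    def seg(i: int, j: int) -> float:
        # Data code length when order[i:j] forms one group.
        return neg_log_lik(ph[j] - ph[i], pt[j] - pt[i])

    inf = float("inf")
    # best[k][j]: minimum data code length for order[:j] split into k groups.
    best = [[inf] * (n + 1) for _ in range(n + 1)]
    back = [[0] * (n + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    for k in range(1, n + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = best[k - 1][i] + seg(i, j)
                if c < best[k][j]:
                    best[k][j], back[k][j] = c, i

    def description_length(k: int) -> float:
        # Data cost plus the assumed model cost of k parameters.
        return best[k][n] + 0.5 * k * math.log2(n_examples)

    k_best = min(range(1, n + 1), key=description_length)

    # Walk the back-pointers to recover the chosen groups.
    groups, j = [], n
    for k in range(k_best, 0, -1):
        i = back[k][j]
        groups.append(order[i:j])
        j = i
    groups.reverse()
    return groups
```

In this sketch, symbols whose helix frequencies are similar are merged because the bits saved by fitting them separately do not cover the cost of an extra parameter. That trade-off is the general reason an MDL criterion can yield fewer than 20 groups.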