New Hashing Techniques for Three-Dimensional Protein Structures

Tatsuya Akutsu[1] (akutsu@cs.gunma-u.ac.jp)
Kentaro Onizuka[2] (onizuka@mrit.mei.co.jp)
Masato Ishikawa[3] (ishikawa@icot.or.jp)

[1] Department of Computer Science, Gunma University
1-5-1 Tenjin, Kiryu, Gunma 376 Japan
[2] Human Interface Research Laboratory, Matsushita Research Institute Tokyo, Inc.
3-10-1 Higashimita, Tama-ku, Kawasaki 214 Japan
[3] Institute for New Generation Computer Technology
1-4-28 Mita, Minato-ku, Tokyo 108 Japan

Abstract

This paper describes new methods for evaluating the structural similarity of proteins. In each method, a hash vector is associated with each fixed-length fragment of a protein structure, and the following desirable property is proved theoretically: if the root mean square deviation (RMSD) between two fragments is small, then the distance between their hash vectors is small. Using the hash vectors, similar protein structures can be searched for quickly. The methods were compared with previous methods on PDB data and were shown to be much faster.
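
To make the stated property concrete, the following minimal Python sketch uses one simple hash vector that is an assumption chosen for illustration and is not necessarily one of the constructions in this paper: the distances from each atom of a fragment to the fragment's centroid. For two fragments of n atoms each, the reverse triangle inequality gives (Euclidean distance between hash vectors) <= sqrt(n) * RMSD, so a small RMSD forces a small hash distance. All function names are hypothetical. To stay short, the sketch does not compute the optimal rotation; it computes the RMS deviation after aligning centroids only, which is an upper bound on the RMSD. The inequality holds for every rotation, and therefore also for the optimal one.

```python
import math

def centroid(points):
    """Mean position of a list of 3D points."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))

def hash_vector(fragment):
    """Illustrative hash: distance of each atom from the fragment centroid.

    The result is unchanged by rotating or translating the fragment.
    """
    c = centroid(fragment)
    return [math.dist(p, c) for p in fragment]

def hash_distance(u, v):
    """Euclidean distance between two hash vectors of equal length."""
    return math.dist(u, v)

def rms_after_centering(a, b):
    """RMS deviation after aligning centroids only (no rotation).

    This is an upper bound on the RMSD under optimal superposition.
    """
    ca, cb = centroid(a), centroid(b)
    total = 0.0
    for p, q in zip(a, b):
        total += sum(((p[k] - ca[k]) - (q[k] - cb[k])) ** 2 for k in range(3))
    return math.sqrt(total / len(a))

def rotate_z_and_shift(points, angle, shift):
    """Rigid motion: rotate about the z axis by `angle`, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
            for x, y, z in points]

if __name__ == "__main__":
    # Made-up coordinates for a 5-atom fragment (not taken from PDB).
    frag = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.0, 3.6, 0.0),
            (8.1, 4.0, 2.0), (9.0, 7.5, 3.0)]
    moved = rotate_z_and_shift(frag, 1.1, (4.0, -2.0, 7.0))
    noisy = [(x + 0.1, y - 0.2, z + 0.05 * i) for i, (x, y, z) in enumerate(frag)]

    print("hash distance, rigidly moved copy:",
          hash_distance(hash_vector(frag), hash_vector(moved)))
    print("hash distance, perturbed copy:",
          hash_distance(hash_vector(frag), hash_vector(noisy)))
    print("sqrt(n) * RMS deviation bound:",
          math.sqrt(len(frag)) * rms_after_centering(frag, noisy))
```

A search built on such vectors would compare hash vectors first and run the more expensive superposition only on fragment pairs whose hash distance is below a threshold; the bound guarantees that no pair with a small RMSD is discarded by that filter.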