Rapid similarity search algorithm of nucleic acid database

S. Hiraoka (hiraoka@crl.hitachi.co.jp)
K. Nagai (k-nagai@crl.hitachi.co.jp)

Central Research Laboratory, Hitachi, Ltd
1-280 Higashi-koigakubo, Kokubunji-shi, Tokyo 185, Japan


Abstract

Smith and Waterman dynamic programming algorithm is too computer intensive to be used for database search. Most of the common database search programs use approximate scores to reduce search time in stead of Smith and Waterman score. They do not consider all possible alignments and may miss some sequence similarities. We present an algorithm for database search which uses score table of k-tuples. The method calculates the upper limit of Smith and Waterman score for each database sequence in nearly interactive time. Calculated each upper limit of Smith and Waterman score prevents us from missing high sequence similarities.