Fast A* Algorithms for Multiple Sequence Alignment

Takahiro Ikeda (ike@is.s.u-tokyo.ac.jp)
Hiroshi Imai (imai@is.s.u-tokyo.ac.jp)

Department of Information Science, Faculty of Science,
University of Tokyo
7-3-1 Hongo, Bunkyo-ku, Tokyo 113, Japan


Abstract

The multiple alignment of the sequences of DNA and proteins is applicable to various important fields in molecular biology. Although the approach based on Dynamic Programming is well-known for this problem, it requires enormous time and space to obtain the optimal alignment. On the other hand, this problem corresponds to the shortest path problem and the A* algorithm, which can efficiently find the shortest path with an estimator, is usable.
This paper directly applies the A* algorithm to multiple sequence alignment problem with more powerful estimator in more than two dimensional case and discusses the improvement of this approach utilizing an upper bound of the shortest path length. The algorithm to provide the upper bound is also proposed in this paper.