Extraction of Conserved or Variable Regions from a Multiple Sequence Alignment

Osamu Gotoh (ogotoh@gan.ncc.ac.jp)

Department of Biochemistry
Saitama Cancer Center Research Institute
818 Komuro, Ina-machi, Saitama 362, Japan


Abstract

Given a multiple sequence alignment of a family of protein or nucleotide sequences, conserved and highly variable regions are valuable landmarks for gaining insight into the functional and structural roles of individual regions. Conserved regions can also serve as anchor points when the given alignment is refined further. Two different approaches were taken to extract conserved regions, one based on the principle of consistency and the other on high scores. The latter approach is easily adapted to extract highly variable regions by reversing the scoring scheme. Results of applying these approaches to a few protein families are discussed.
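
As a rough illustration of the high-score idea, the sketch below scores each alignment column and then finds the contiguous run of columns with the highest total score. Negating the scores turns the same search into one for variable regions. The abstract does not specify the scoring scheme, so everything here is an assumption made for illustration: the column score (fraction of sequences sharing the most common residue, minus a threshold), the `threshold` parameter, and the function names `column_scores` and `best_segment` are hypothetical and are not the paper's method.

```python
from collections import Counter


def column_scores(alignment, threshold=0.5):
    """Score each column of an alignment (assumed scheme, not the paper's).

    The score is the fraction of sequences that share the most common
    non-gap residue, minus `threshold`, so that well conserved columns
    score positive and poorly conserved columns score negative.
    """
    ncols = len(alignment[0])
    scores = []
    for j in range(ncols):
        column = [seq[j] for seq in alignment]
        residues = [c for c in column if c != "-"]
        top = Counter(residues).most_common(1)[0][1] if residues else 0
        scores.append(top / len(column) - threshold)
    return scores


def best_segment(scores):
    """Return (start, end, total) of the highest-scoring contiguous run.

    `end` is exclusive. This is a standard maximum-sum segment scan;
    if no column scores above zero it returns (0, 0, 0.0).
    """
    best = (0, 0, 0.0)
    run_start, run_total = 0, 0.0
    for j, s in enumerate(scores):
        if run_total <= 0:
            # A non-positive running total cannot help; restart here.
            run_start, run_total = j, s
        else:
            run_total += s
        if run_total > best[2]:
            best = (run_start, j + 1, run_total)
    return best


# Toy alignment: first three columns identical, last three all different.
alignment = ["ACDEFG", "ACDKLM", "ACDPQR"]
scores = column_scores(alignment)

conserved = best_segment(scores)                   # columns 0..2
variable = best_segment([-s for s in scores])      # reversed scoring: columns 3..5
print(conserved[:2], variable[:2])                 # prints (0, 3) (3, 6)
```

A single best segment is returned to keep the sketch short; reporting several non-overlapping regions would need a repeated or more elaborate search.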