Speed-up of Information Retrieval by Data Compression

Takeshi SHINOHARA
Shuichi FUKAMACHI

Kyushu Inst. Tech.


Abstract

This paper describes a method for improving the efficiency of sequential text search by data compression, and its application to the database system SIGMA. The most expensive task in searching text is usually data transfer from storage devices, and data compression reduces the amount of data that must be transferred. We have designed a string pattern matching algorithm that scans the compressed data without decoding it, and we use this algorithm to revise the search command in SIGMA. Although the effect of the method depends on the characteristics of the data, we observe that in general the data size is reduced to 60% and the search response time to 70% of their original values.
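
The idea of matching a pattern against compressed data without decoding can be illustrated with a minimal sketch. The abstract does not specify the code or the matching machine, so everything here is an assumption chosen for illustration: a Huffman (prefix-free) code, a Knuth-Morris-Pratt scan over the bit stream, and the hypothetical function names build_prefix_code, encode, and search_compressed. The pattern is encoded with the same code as the text, and a bit-level match is accepted only if it starts on a codeword boundary, which the scan tracks without reconstructing any character. For readability the bits are kept as a string of '0' and '1'; a real system would pack them into bytes.

```python
import heapq
from itertools import count


def build_prefix_code(text):
    """Build a Huffman (prefix-free) code: char -> string of '0'/'1'."""
    freq = {}
    for ch in text:
        freq[ch] = freq.get(ch, 0) + 1
    if len(freq) == 1:                  # degenerate one-symbol alphabet
        return {ch: "0" for ch in freq}
    tie = count()                       # tie-breaker so the heap never compares dicts
    heap = [(f, next(tie), {ch: ""}) for ch, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f0, _, c0 = heapq.heappop(heap)
        f1, _, c1 = heapq.heappop(heap)
        merged = {ch: "0" + w for ch, w in c0.items()}
        merged.update({ch: "1" + w for ch, w in c1.items()})
        heapq.heappush(heap, (f0 + f1, next(tie), merged))
    return heap[0][2]


def encode(text, code):
    """Concatenate the codewords of all characters (bits as a '0'/'1' string)."""
    return "".join(code[ch] for ch in text)


def kmp_failure(p):
    """Standard KMP failure table for the bit pattern p."""
    fail = [0] * len(p)
    k = 0
    for i in range(1, len(p)):
        while k and p[i] != p[k]:
            k = fail[k - 1]
        if p[i] == p[k]:
            k += 1
        fail[i] = k
    return fail


def search_compressed(bits, pattern, code):
    """Return character offsets of pattern in the original text, reading only bits."""
    if not pattern or any(ch not in code for ch in pattern):
        return []
    p = encode(pattern, code)
    fail = kmp_failure(p)
    codewords = set(code.values())
    # symbols_before[i]: number of codewords completed before bit i,
    # or -1 if bit i is not the start of a codeword.
    symbols_before = [-1] * (len(bits) + 1)
    symbols_before[0] = 0
    partial, seen, k, hits = "", 0, 0, []
    for j, b in enumerate(bits):
        partial += b
        if partial in codewords:        # a codeword ends at bit j
            partial, seen = "", seen + 1
            symbols_before[j + 1] = seen
        while k and b != p[k]:
            k = fail[k - 1]
        if b == p[k]:
            k += 1
        if k == len(p):                 # bit-level match ending at bit j
            start = j + 1 - len(p)
            if symbols_before[start] >= 0:   # reject misaligned (false) matches
                hits.append(symbols_before[start])
            k = fail[k - 1]
    return hits


text = "abracadabra abracadabra"
code = build_prefix_code(text)
bits = encode(text, code)
print(search_compressed(bits, "abra", code))  # [0, 7, 12, 19]
```

The boundary check is what makes the sketch correct: because the code is prefix-free, a bit-level match that begins at the start of a codeword can only parse into the codewords of the pattern itself, while a match that begins inside a codeword is a coincidence of bits and is discarded.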