Running Learning Systems in Parallel for Machine Discovery from Sequences

Ayumi Shinohara[1] (ayumi@rifis.sci.kyushu-u.ac.jp)
Shinichi Shimozono[2] (sir,@ces.kyutech.ac.jp)
Tomoyuki Uchida[3] (uchida@rifis.sci.kyushu-u.ac.jp)
Satoru Miyano[1] (miyano@rifis.sci.kyushu-u.ac.jp)
Satoru Kuhara[4] (kuhara@grt.kyushu-u.ac.jp)
Setsuo Arikawa[1] (arikawa@rifis.sci.kyushu-u.ac.jp)

[1] Research Institute of Fundamental Information Science, Kyushu University
[2] Department of Control Engineering and Science, Kyushu Institute of Technology
[3] Department of Information Systems, Kyushu University
[4] Graduate School of Genetic Resources Technology, Kyushu University


Abstract

We have developed a machine learning system, BONSAI, which takes positive and negative examples as input and produces as its hypothesis a pair consisting of a decision tree over regular patterns and an alphabet indexing. This paper proposes two applications of BONSAI that become possible when multiple BONSAI systems can be run in parallel.
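
To make the shape of such a hypothesis concrete, the following minimal Python sketch classifies a sequence with an alphabet indexing and a decision tree. It is an illustration under stated assumptions, not the BONSAI implementation: the names `Leaf`, `Node`, `apply_indexing` and `classify` are hypothetical, each node pattern is assumed to be a plain substring test on the indexed sequence, and the indexing values are invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, Union

# Assumption: an alphabet indexing is a map from the original alphabet
# onto a smaller one. The concrete values below are illustrative only.
Indexing = Dict[str, str]


@dataclass
class Leaf:
    label: bool  # True = positive class, False = negative class


@dataclass
class Node:
    pattern: str           # assumed: a substring pattern over the indexed alphabet
    if_match: "Tree"       # subtree followed when the pattern occurs
    if_no_match: "Tree"    # subtree followed otherwise


Tree = Union[Leaf, Node]


def apply_indexing(indexing: Indexing, sequence: str) -> str:
    """Rewrite a sequence symbol by symbol into the smaller alphabet."""
    return "".join(indexing[ch] for ch in sequence)


def classify(tree: Tree, indexing: Indexing, sequence: str) -> bool:
    """Index the sequence, then walk the tree until a leaf is reached."""
    indexed = apply_indexing(indexing, sequence)
    node = tree
    while isinstance(node, Node):
        node = node.if_match if node.pattern in indexed else node.if_no_match
    return node.label


# Hypothetical hypothesis: a two-symbol indexing and a one-node tree.
indexing = {"A": "0", "C": "0", "G": "1", "T": "1"}
tree = Node(pattern="01", if_match=Leaf(True), if_no_match=Leaf(False))

print(classify(tree, indexing, "ACGT"))  # indexed as "0011" -> True
print(classify(tree, indexing, "GGAA"))  # indexed as "1100" -> False
```

In this sketch the indexing is applied before any pattern is tested, so the patterns stored in the tree are written over the reduced alphabet.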

The first is to classify given examples that come from several different unknown classes. The problem is solved by multiple spawned BONSAI systems running in parallel, each of which tries to find a decision tree, an alphabet indexing, and a group of examples. The process eventually partitions a hodgepodge of sequences into a small number of disjoint classes, together with hypotheses that explain these classes accurately.
The second is to find a good sample of a concept. Although the main interest in applying the BONSAI system is to discover good hypotheses, it is equally interesting to find a small set of examples from which a good hypothesis can be made. We present a method for solving this problem that combines a genetic-algorithm strategy with multiple BONSAI systems running in parallel.
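
As a rough illustration of the sample-finding idea, the sketch below evolves subsets of the examples and evaluates one learner run per candidate subset concurrently. Everything specific in it is an assumption made for the example: `toy_learner` is a stand-in for a BONSAI run (it only picks one separating character), and the fitness function, size penalty, crossover, and mutation are generic genetic-algorithm choices, not the strategy used in the paper.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def toy_learner(sample):
    """Stand-in for one learner run (NOT BONSAI): choose the single character
    whose presence best agrees with the labels in the sample."""
    chars = sorted({ch for seq, _ in sample for ch in seq})
    best, best_score = None, -1
    for ch in chars:
        score = sum((ch in seq) == label for seq, label in sample)
        if score > best_score:
            best, best_score = ch, score
    return lambda seq: best is not None and best in seq


def fitness(mask, examples):
    """Accuracy on all examples of the hypothesis learned from the selected
    subset, minus a small (assumed) penalty per selected example."""
    sample = [ex for ex, keep in zip(examples, mask) if keep]
    if not sample:
        return 0.0
    hypothesis = toy_learner(sample)
    accuracy = sum(hypothesis(seq) == label for seq, label in examples) / len(examples)
    return accuracy - 0.01 * len(sample)


def evolve(examples, pop_size=8, generations=10, seed=0):
    """Evolve boolean masks over the examples; each generation scores all
    masks concurrently, one learner run per mask."""
    rng = random.Random(seed)
    n = len(examples)
    population = [tuple(rng.random() < 0.5 for _ in range(n)) for _ in range(pop_size)]
    with ThreadPoolExecutor() as pool:
        for _ in range(generations):
            scores = list(pool.map(lambda m: fitness(m, examples), population))
            ranked = [m for _, m in sorted(zip(scores, population),
                                           key=lambda p: p[0], reverse=True)]
            parents = ranked[: pop_size // 2]      # keep the better half
            children = []
            while len(parents) + len(children) < pop_size:
                a, b = rng.sample(parents, 2)
                cut = rng.randrange(1, n)          # one-point crossover
                child = list(a[:cut] + b[cut:])
                i = rng.randrange(n)               # flip one bit as mutation
                child[i] = not child[i]
                children.append(tuple(child))
            population = parents + children
        scores = list(pool.map(lambda m: fitness(m, examples), population))
    best_mask = max(zip(scores, population), key=lambda p: p[0])[1]
    return [ex for ex, keep in zip(examples, best_mask) if keep]


# Hypothetical labelled sequences: (sequence, is_positive).
EXAMPLES = [("ACGT", True), ("AGGT", True), ("TTGA", True),
            ("CCCT", False), ("TCCA", False), ("ACCA", False)]

small_sample = evolve(EXAMPLES)
print(len(small_sample), "of", len(EXAMPLES), "examples selected")
```

A thread pool is used here only to keep the sketch self-contained; separate processes or machines would play the same role when each learner run is expensive.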