Development of New DDBJ DNA Sequence Database with Data Annotation Tool Yamato II

T. Koike [3] (tkoike@genes.nig.ac.jp)
T. Okayama [3] (tokayama@genes.nig.ac.jp)
J. Ishii [3] (jishii@genes.nig.ac.jp)
T. Mizunuma [3] (tmizunum@genes.nig.ac.jp)
T. Tamura [2] (tatamura@genes.nig.ac.jp)
Y. Tateno [1] (ytateno@genes.nig.ac.jp)
H. Sugawara [1] (hsugawar@genes.nig.ac.jp)
K. Nishikawa [1] (knishika@genes.nig.ac.jp)
T. Imanishi [1] (timanish@genes.nig.ac.jp)
K. Fukami-Kobayashi [1] (kfukami@genes.nig.ac.jp)
K. Ikeo [1] (kikeo@genes.nig.ac.jp)
T. Gojobori [1] (tgojobor@genes.nig.ac.jp)

[1] Center for Information Biology, National Institute of Genetics
1111 Yata, Mishima, 411 Japan
[2] Association for Propagation of the Knowledge of Genetics
1171-195 Sakuragaoka, Yata, Mishima, 411 Japan
[3] Hitachi Software Engineering Co., Ltd
5-79 Onoe-cho, Naka-ku, Yokohama, 231 Japan

Abstract

As molecular biology has progressed rapidly in recent years, the methodology for maintaining and utilizing DNA sequence data has had to change in many ways. For example, the annotation of sequences has become complex and extensive. Recognizing these impending requirements, DDBJ decided to develop a new DNA sequence database system in 1995. To tolerate frequent changes in the data structures and significant growth of the data in both quality and quantity, we designed a completely new database schema. In the new system, physical changes to the data structure do not affect applications such as the annotation tool. We also designed a new annotation tool, named YAMATO II, based on an object-oriented concept that allows us to handle DNA sequence data in computers as intuitively as in the real world. The new system also addresses the needs of DDBJ itself. In particular, we analyzed data traffic and the security of database access, so that reviewers of data for DDBJ who work at a distance from DDBJ can now process the data safely and comfortably. The new system also provides more robust and effective data exchange with our partners in the international nucleotide sequence banks, EMBL and GenBank.