Learning from Crystallographic Databases. Suzanne Fortier, Shishan Guo and Janice Glasgow, Depts. of Chemistry and Computing and Information Science, Queen's University, Kingston, Canada, K7L 3N6
A fundamental aspect of learning is the ability to recognize patterns and rules and to form new concepts. It has long been acknowledged that the rapidly growing wealth of information on crystallographic structures offers an opportunity to learn from the databases, so as to discover new knowledge on molecular structures and molecular structure organization. It has also been recognized that computing approaches and tools are needed to assist in the learning process and, indeed, numerous algorithms have been implemented to help search, screen, analyse and classify the databases.
Despite these advances, most learning from the databases has been achieved through the intervention of expert users who had considerable prior knowledge of the structural area under study. Many of the studies have served to confirm hypotheses and to add a quantitative description to an already existing qualitative model of the dataset. In other words, many of the studies have been undertaken after intuition, insight and manual exploration of the dataset had already led to knowledge discovery. To fully achieve the objective of learning from the databases, further approaches and tools are needed to assist in the early exploration of the databases and in the evaluation and interpretation of results derived from classification exercises. These challenges will be the subject of this presentation.