E297

Intelligent Computational Aids for Crystal Growth

John M. Rosenberg (1), Patricia A. Wilkosz (1), K. Chandrasekhar (1), Devika Subramanian (2), Daniel Hennessy (3) and Bruce Buchanan (3). (1) Depts. of Biological Sciences & Crystallography, University of Pittsburgh. (2) Dept. of Computer Science, Rice University. (3) Intelligent System Laboratory, University of Pittsburgh.

During the course of crystallization experiments, substantial data accumulate on the conditions that lead to unsuccessful, partially successful and successful crystallizations. The project described here seeks to provide computational tools for the collection and interpretation of those data.

Specifically, the goals of this project are to design, implement, and test an intelligent, interactive, electronic assistant for crystallographers that will facilitate the trial-and-error process of growing diffraction-quality crystals of biological macromolecules, based on: 1. Archiving crystallization experiments. 2. Accessing the database's crystallization trials, including generation of new experimental protocols. 3. Inducing empirical theories that capture regularities in the data. 4. Suggesting plausible next steps in a series of crystallization trials.

The initial "front end" will be demonstrated. We also invite volunteers to test the software and to contribute data for the next stage of the project, which is the application of artificial intelligence methods.

Our initial analytical efforts used the data in the Biological Macromolecule Crystallization Database (BMCD). We found that significant improvements in the statistical interpretation of the BMCD required classifying macromolecules according to a hierarchical scheme we developed. We then applied standard statistical analyses, including Student's t-test, to the BMCD data. As one representative example, we asked whether the distribution of macromolecular concentration was systematically different for the protein subclasses. We found that the heme-containing proteins and the membrane proteins stand out as significantly different from the rest of the protein families. Additional data will be reported.
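
As an illustration of the kind of comparison described above, the following is a minimal sketch of a two-sample Student's t-test with pooled variance, written with the Python standard library only. The function name `student_t` and the concentration values are hypothetical and are not taken from the BMCD; the abstract does not state which variant of the test or which software was used.

```python
import math
import statistics


def student_t(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for the pooled-variance two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = statistics.mean(sample_a)
    mean_b = statistics.mean(sample_b)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    df = n_a + n_b - 2
    # Pooled variance assumes both groups share a common variance.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean_a - mean_b) / std_err, df


# Made-up macromolecular concentrations (mg/ml), for illustration only.
heme_proteins = [20.0, 25.0, 30.0, 22.0, 28.0]
other_proteins = [8.0, 10.0, 12.0, 9.0, 11.0]

t_value, dof = student_t(heme_proteins, other_proteins)
print(f"t = {t_value:.3f} with {dof} degrees of freedom")
```

The resulting t value would then be compared against the t distribution with the stated degrees of freedom to judge whether the two subclasses differ significantly.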

How can statistical results like this be incorporated into the design of crystallization experiments? We have developed software that calculates Bayesian probabilities for any combination of crystallization parameters, using data retrieved from the archive, currently the BMCD. The calculated probabilities are used to bias the selection of data from an incomplete factorial design such that the more probable combinations are sampled more densely than the less probable ones. This feature has been included in the software to be demonstrated and made available, as described above.
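
One way such probability-weighted sampling might work is sketched below. This is not the authors' software: it assumes that the factors are independent (a naive Bayes simplification), that every combination is equally likely a priori, and that add-one smoothing is applied to the counts. The names `level_weights` and `weighted_design`, the record layout, and the example records are all hypothetical.

```python
import itertools
import random
from collections import Counter


def level_weights(records, factor, levels):
    """Smoothed estimate of P(level | success) for one factor."""
    counts = Counter(r[factor] for r in records if r["success"])
    total = sum(counts.values()) + len(levels)  # add-one smoothing
    return {level: (counts[level] + 1) / total for level in levels}


def weighted_design(records, factors, n_trials, seed=0):
    """Pick n_trials distinct combinations, favoring the more probable ones."""
    rng = random.Random(seed)
    names = list(factors)
    weights = {f: level_weights(records, f, factors[f]) for f in names}
    scored = []
    for combo in itertools.product(*(factors[f] for f in names)):
        # With independent factors and a uniform prior, the posterior for a
        # combination is proportional to the product of its level weights.
        score = 1.0
        for f, level in zip(names, combo):
            score *= weights[f][level]
        scored.append((combo, score))
    # Weighted sampling without replacement: draw key u ** (1 / w) for each
    # combination and keep the largest keys (Efraimidis-Spirakis scheme).
    keyed = sorted(scored, key=lambda cs: rng.random() ** (1.0 / cs[1]), reverse=True)
    return [dict(zip(names, combo)) for combo, _ in keyed[:n_trials]]


# Made-up archive records, for illustration only.
archive = [
    {"precipitant": "PEG", "pH": 7, "success": True},
    {"precipitant": "PEG", "pH": 7, "success": True},
    {"precipitant": "PEG", "pH": 5, "success": True},
    {"precipitant": "ammonium sulfate", "pH": 5, "success": True},
    {"precipitant": "ammonium sulfate", "pH": 7, "success": False},
]
design_space = {"precipitant": ["PEG", "ammonium sulfate"], "pH": [5, 7]}

for trial in weighted_design(archive, design_space, n_trials=3):
    print(trial)
```

Because the selection is random but weighted, less probable combinations still appear occasionally, which keeps the design from collapsing onto conditions that have already worked.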