SHAKE'N'BAKE: A Direct Methods Phasing Technique for Structure Determination

Russ Miller


Scientific Significance:

Molecular structure determination via x-ray diffraction analysis is an important step in rational drug design, a process used to develop more effective drugs for the cure and prevention of disease, as well as improved fertilizers, repellents, and other substances that are used to improve the general quality of life. The three stages of an x-ray diffraction evaluation are:

  1. the growth of suitable single crystals of the substance to be studied,
  2. the measurement of x-ray diffraction data, and
  3. the unraveling of the molecular structure so that it agrees with the diffraction data.

While the experiment yields the amplitudes and positions of the Fourier components, it does not produce their phases. It is the determination of these missing phases that constitutes the phase problem of x-ray crystallography.
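The role of the missing phases can be illustrated with a toy calculation. The sketch below is not part of SnB; it assumes a hypothetical one-dimensional cell containing three equal point atoms, and all names in it (`structure_factors`, `density`, `positions`, `H_MAX`) are chosen only for illustration. It computes the Fourier components of the toy structure, keeps only their amplitudes as an experiment would, and compares a Fourier synthesis that uses the true phases with one that sets every phase to zero.

```python
import cmath
import math


def structure_factors(positions, h_max):
    """F(h) for h = -h_max..h_max, for equal point atoms at fractional
    coordinates in a one-dimensional unit cell (toy model)."""
    return {
        h: sum(cmath.exp(2j * math.pi * h * x) for x in positions)
        for h in range(-h_max, h_max + 1)
    }


def density(factors, x):
    """Fourier synthesis: rho(x) = sum over h of F(h) * exp(-2*pi*i*h*x)."""
    total = sum(f * cmath.exp(-2j * math.pi * h * x) for h, f in factors.items())
    return total.real


# Hypothetical toy structure (fractional coordinates), assumed for illustration.
positions = [0.10, 0.35, 0.70]
H_MAX = 10

true_f = structure_factors(positions, H_MAX)

# What the diffraction experiment records: amplitudes only, no phases.
amplitudes = {h: abs(f) for h, f in true_f.items()}

# Synthesis from amplitudes with every phase set to zero.
zero_phase_f = {h: complex(a, 0.0) for h, a in amplitudes.items()}

grid = [i / 200 for i in range(200)]
peak_true = max(grid, key=lambda x: density(true_f, x))
peak_zero = max(grid, key=lambda x: density(zero_phase_f, x))

print("highest peak with true phases:", peak_true)
print("highest peak with zero phases:", peak_zero)
```

With the true phases the highest peak of the synthesis falls at or next to one of the atomic positions, while the zero-phase synthesis peaks at the origin regardless of where the atoms are. The amplitudes alone therefore do not locate the atoms.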

A team of researchers at the Medical Foundation of Buffalo Research Institute and the State University of New York (SUNY) at Buffalo has developed a new technique for determining the structures of moderate-size molecules from x-ray diffraction data. The team is headed by Nobel laureate Herbert A. Hauptman, President of the Medical Foundation. The so-called Shake-and-Bake method of structure determination is a direct methods phasing algorithm based on Hauptman's minimal principle. The combination of our new formulation of the problem, based on a minimal principle, and advances in massively parallel computing has led to the development of our SnB program, which is currently being prepared for general distribution by Molecular Structures Corporation, MacScience, and the Pittsburgh Supercomputing Center. SnB has been developed over the past three years and has been used successfully to solve more than three dozen structures in a variety of space groups. Highlights of SnB include solving (on other platforms) two important, previously unknown 100-atom structures (Ternatin_E and Ternatin_D), which had escaped solution for over a decade, and re-solving Crambin, a 400-atom structure. Crambin is now the largest structure solved by direct methods. Previous attempts considered more than 500,000 ambiguities without success, while SnB obtained a success rate of approximately 4%.


Numerical Approach and Performance:

A single-processor version is available on the C90. The multiprocessor version currently runs entirely on the T3D and scales very well with the number of PEs. Single-PE performance on the T3D and single-processor performance on the C90 vary with the number of atoms in the structure. This may be due to poor cache alignment in the T3D version.

Number of Atoms    T3D Performance / PE (MFLOPS)
317                6.2
98                 9.8
74                 9.0

The scaling is indistinguishable from linear.

At 9.8 MFLOPS per PE, 32 PEs are equivalent to 2.89 C90 CPUs.
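The equivalence quoted above can be checked with simple arithmetic. The sketch below assumes that the 9.8 MFLOPS per-PE rate is the one used in the comparison and that the equivalence is a plain ratio of sustained rates; the resulting per-CPU C90 figure is implied by these numbers, not stated in this report, and the variable names are illustrative.

```python
# Figures quoted in the report.
MFLOPS_PER_PE = 9.8        # sustained rate on one T3D PE
NUM_PES = 32               # T3D partition size
C90_CPU_EQUIVALENT = 2.89  # stated equivalence for 32 PEs

# Aggregate T3D rate, assuming the linear scaling reported above.
aggregate_mflops = MFLOPS_PER_PE * NUM_PES

# Implied sustained rate of one C90 CPU (an inference, not a reported value).
implied_c90_mflops = aggregate_mflops / C90_CPU_EQUIVALENT

print(f"aggregate T3D rate on {NUM_PES} PEs: {aggregate_mflops:.1f} MFLOPS")
print(f"implied rate of one C90 CPU: {implied_c90_mflops:.1f} MFLOPS")
```

Under these assumptions the 32-PE partition sustains about 313.6 MFLOPS, which corresponds to roughly 108.5 MFLOPS per C90 CPU.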