Evaluation of the Potential of MOZYME for the Study of Enzyme Reaction Mechanisms

*Developments in applications software, coupled with increased HPC power that makes very large calculations feasible, have created the expectation of a new era of large-scale biomolecular computation. One such development is MOZYME, a unique program module within the computational chemistry program MOPAC, which allows calculations at an approximate quantum mechanical (semiempirical QM) level on molecules with thousands of atoms. MOZYME is one of a few recently introduced linear-scaling (LC) QM methods, which achieve this capability by overcoming the bottleneck of standard QM methods: the N³ (or higher; N = number of orbitals) computational dependence. MOZYME LC is based on a localised molecular orbital method developed by J.J.P. Stewart, the originator of MOPAC. For conventional QM methods, the N³ dependence limits calculations to 50-100 atoms (*ab initio* QM) or a few hundred atoms (semiempirical QM, as in MOPAC and MOZYME). This new capability to compute molecular properties and energies for such large molecules at an all-QM level provides the opportunity to test the usefulness of MOZYME in real applications. We started with a problem for which MOZYME had been proposed, namely studying enzyme reaction mechanisms, and compared its performance with another type of new method, hybrid quantum mechanical/molecular mechanical (QM/MM) methods, which overcome the QM computational bottleneck in a completely different way. We also employed MOZYME to study protein properties in other contexts where an all-QM calculation would seem advantageous.*

Principal Investigator: Dr Jill E. Gready, Computational Proteomics Group, John Curtin School of Medical Research, ANU
Project: x11
Facilities Used: SC, VPP, PC

Co-Investigators: Mr Stephen J. Titmuss, Computational Proteomics Group,
John Curtin School of Medical Research, ANU
Dept of Computer Science, FEIT, ANU
RFCD Codes: 250106, 250601, 250699, 270108

Significant Achievements, Anticipated Outcomes and Future Work

Comparison of results from MOZYME and QM/MM calculations on the reaction mechanism of the enzyme dihydrofolate reductase (DHFR) showed major and complex differences in components of the energies, and in some cases in the total energies. These differences reflect the different theoretical treatments of the interaction of the reactant subsystem with the rest of the enzyme, and they have major theoretical ramifications (see below). In particular, the reaction energy profiles showed close agreement in the total energies for fixed MM coordinates, but diverged when the MM geometry was allowed to relax during the reaction, largely because of the MM electrostatic energy [1]. After energy decomposition was implemented in MOZYME to allow direct comparison with QM/MM components, the results showed a large change (~16 kcal/mol) in the MOZYME MM component of the fixed-geometry model. This change is due to polarization of the MM region surrounding the active-site QM region, which is not explicitly modelled by QM/MM. For the variable-geometry model the differences are even greater, with a 52 kcal/mol discrepancy in the relative reaction energies (product - reactant), largely due to MM energy differences [2]. Results so far have provided many surprises, not all of which have yet been explained. Future work will focus on further analysis of the different descriptions of the MM or MM-equivalent (i.e. in MOZYME) region, specifically polarization and other electrostatic contributions, which are especially large for the variable-geometry model. This will include investigation of relative energies as a function of distance from the reaction centre, and study of energy contributions for selected residues or residue-residue pairs.
Although MOZYME was initially proposed specifically as a method for studying enzyme reactions (hence the name), our studies suggested several other potential uses where an all-QM calculation of protein properties is desirable. We have used it for studying electrostatic potentials (see x04/d55), and others at ANU have used it for a similar purpose (Bliznyuk et al. (2002). J Phys Chem B 105, 12673).
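The component-wise comparison described above can be illustrated with a minimal Python sketch. It assumes a three-term partition (QM, QM/MM interaction, MM or MM-equivalent) that is a common QM/MM convention and is not taken from the MOZYME implementation. The class and function names are hypothetical, and every number in the demo is a made-up placeholder chosen for easy arithmetic, not a result from this project.

```python
from dataclasses import dataclass


@dataclass
class EnergyComponents:
    """Energy terms in kcal/mol. The three-way split is an assumed convention."""
    qm: float      # QM-region (or QM-equivalent) energy
    qm_mm: float   # interaction between the QM and MM regions
    mm: float      # MM (or MM-equivalent) energy

    def total(self) -> float:
        return self.qm + self.qm_mm + self.mm

    def __sub__(self, other: "EnergyComponents") -> "EnergyComponents":
        # Component-wise difference, used for both comparisons below.
        return EnergyComponents(
            self.qm - other.qm,
            self.qm_mm - other.qm_mm,
            self.mm - other.mm,
        )


def reaction_energy(reactant: EnergyComponents,
                    product: EnergyComponents) -> EnergyComponents:
    """Relative reaction energy per component: product - reactant."""
    return product - reactant


def discrepancy(method_a: EnergyComponents,
                method_b: EnergyComponents) -> EnergyComponents:
    """Difference between two methods' reaction energies, per component."""
    return method_a - method_b


if __name__ == "__main__":
    # Placeholder values only (hypothetical, NOT data from the study).
    all_qm = reaction_energy(
        EnergyComponents(qm=-10.0, qm_mm=-4.0, mm=-20.0),
        EnergyComponents(qm=-12.0, qm_mm=-3.0, mm=-25.0),
    )
    hybrid = reaction_energy(
        EnergyComponents(qm=-10.0, qm_mm=-4.0, mm=-20.0),
        EnergyComponents(qm=-12.0, qm_mm=-3.0, mm=-21.0),
    )
    diff = discrepancy(all_qm, hybrid)
    print("per-component discrepancy:", diff)
    print("total discrepancy:", diff.total())
```

Keeping the components separate until the final step makes it visible when two methods agree on one term but differ on another, which is the kind of question the decomposition analysis was introduced to answer.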

**Other outcomes.** Apart from these specific results, the project, and other concurrent work with MOZYME, provided several other significant insights; these findings could only be made once it became possible to do all-QM calculations on such large molecules. One unexpected outcome was the value of MOZYME as a quality control on QM/MM enzymic calculations, both on the quality of the parameterization of the QM/MM coupling terms, and on the risks of using static (i.e. without MD) QM/MM with optimization of MM-atom positions during the reaction, as is commonly done in the literature. Secondly, the results for the treatment of both short- and long-range electrostatics by MM suggest that merely extending the size of the QM region in QM/MM calculations is not a universal solution to the differences between the MOZYME and QM/MM methods. This is an important and timely result which conflicts with expectations in the literature, and it may provide a means for judicious choice of the QM boundary for the large QM-region calculations now becoming practicable (e.g. see w05, u51/d52, u53/d55). Thirdly, results from project x04/d55, which used both MOZYME and an MM-based method to calculate electrostatic potentials (EPs) of prion protein, showed fundamental problems with the MM-based results. Finally, fundamental issues in the theoretical foundation of semiempirical theory and the meaning of energy components emerged; we have addressed one aspect so far [3], with other studies planned.

**In summary**, the project offers an important conclusion. There has been a common assumption that application of current computational chemistry (CC) methods is limited largely by computing power and, consequently, that once such power is available the methods will deliver reliable results at the level now achievable by small-molecule CC. The MOZYME results demonstrate that this assumption is flawed. It cannot be assumed that the theoretical performance of current methods (including semiempirical QM theory), developed and extensively tested for small-molecule chemistry, will carry over to large systems.

Computational Techniques Used

MOZYME is incorporated into MOPAC (Fujitsu Ltd Japan). Our application to real problems uncovered major deficiencies in the implementation and overall capability of the program. The project worked closely with MOPAC developer Dr A. Bliznyuk in ANUSF to remedy these. Major refinements, which also generated new R&D contracts with Fujitsu, were implementation of linear-scaling direct SCF to reduce memory limitations, and implementation of linear-scaling COSMO to provide implicit solvation functionality. Within the project, an energy decomposition analysis routine (to complement that in the non-LC part of MOPAC) was implemented in the MOZYME module. The improved MOPAC program was tuned for and made available on the VPP300 and PC, and is now available on the APAC National Facility Compaq SC. The QM/MM calculations were performed using the locally developed program MOPS (see u51/d52). MOPS was vectorised for the VPP300 and tuned for the PC; extensive parallelization efforts for the National Facility Compaq SC are documented in the report for u51/d52. *Ab initio* QM calculations required for parts of the study were done with GAUSSIAN98.

The MOZYME calculations were done on a DHFR model containing ~3200 atoms (3047 protein, 97 substrate + cofactor, 17 water) requiring ~8000 basis functions for the PM3 model. The QM/MM calculations were done with a QM region containing 48 atoms, with the remainder (i.e. ~3200 atoms) in the MM region. Several coordinate sets (e.g. with fixed or various relaxed MM regions) generated along a QM/MM reaction path were used as input to the MOZYME calculations. The main details of the QM/MM model were taken from ongoing QM/MM work (see u51/d52, u53/d55).
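To give a feel for the system sizes quoted above, the following Python sketch works through the arithmetic. The atom counts and the ~8000 basis-function figure come from the text. The memory estimate assumes that a conventional (non-linear-scaling) SCF code holds at least one dense N x N double-precision matrix, which is a simplifying assumption rather than a description of how MOPAC allocates memory. The function names are illustrative only.

```python
# Figures quoted in the text for the DHFR model.
PROTEIN_ATOMS = 3047
SUBSTRATE_COFACTOR_ATOMS = 97
WATER_ATOMS = 17
QM_REGION_ATOMS = 48       # QM region in the QM/MM calculations
BASIS_FUNCTIONS = 8000     # approximate PM3 basis size quoted in the text


def total_atoms() -> int:
    """Sum of the itemised atom counts."""
    return PROTEIN_ATOMS + SUBSTRATE_COFACTOR_ATOMS + WATER_ATOMS


def mm_region_atoms() -> int:
    """Atoms left for the MM region once the QM region is removed."""
    return total_atoms() - QM_REGION_ATOMS


def dense_matrix_megabytes(n_basis: int, bytes_per_element: int = 8) -> float:
    """Memory for one dense n x n matrix in MB (1 MB = 10**6 bytes).

    Assumes 8-byte double-precision elements and no use of symmetry or sparsity.
    """
    return n_basis * n_basis * bytes_per_element / 1e6


def cost_ratio(n_large: float, n_small: float, power: int) -> float:
    """Relative cost of two problem sizes for a method scaling as N**power."""
    return (n_large / n_small) ** power


if __name__ == "__main__":
    print("total atoms:", total_atoms())
    print("MM-region atoms:", mm_region_atoms())
    print("one dense matrix (MB):", dense_matrix_megabytes(BASIS_FUNCTIONS))
    # Doubling the number of orbitals: cubic versus linear scaling.
    print("cubic cost ratio:", cost_ratio(2000, 1000, 3))
    print("linear cost ratio:", cost_ratio(2000, 1000, 1))
```

Taken at face value, the itemised counts sum to 3161 atoms, consistent with the rounded ~3200 figure. Doubling the number of orbitals multiplies the cost of a cubic-scaling step by eight but that of a linear-scaling step by only two, which is why the N³ dependence becomes prohibitive at this size.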

Publications, Awards and External Funding

ARC/DETYA SPIRT (APAI) Grant (1998-2001) with industry partner Fujitsu Ltd Japan to JE Gready, AP Rendell and R Nobes. "Application of linear scaling semi-empirical quantum chemical methods to the study of enzyme reaction mechanisms."

1. SJ Titmuss, PL Cummins, AA Bliznyuk, AP Rendell, JE Gready. Comparison of linear-scaling semiempirical methods and combined quantum mechanical/molecular mechanical methods applied to enzyme reactions. Chem Phys Letters 320, 2000, 169-176.

2. SJ Titmuss, PL Cummins, AP Rendell, AA Bliznyuk, JE Gready. Comparison of linear-scaling semiempirical methods and combined quantum mechanical/molecular mechanical methods for enzyme reactions II: an energy decomposition analysis. J Comput Chem, 2002, in press.

3. PL Cummins, SJ Titmuss, D Jayatilaka, AA Bliznyuk, AP Rendell, JE Gready. Comparison of semiempirical and ab initio QM decomposition analyses for the interaction energy between molecules. Chem Phys Letters 352, 2002, 245-251.