J. Phys. Chem. B 2009, 113, 4508-4510

Prediction of the Free Energy of Hydration of a Challenging Set of Pesticide-Like Compounds
Andreas Klamt,* Frank Eckert, and Michael Diedenhofen
COSMOlogic GmbH&CoKG, Burscheider Str. 515, 51381 Leverkusen, Germany, and Institute of Physical and
Theoretical Chemistry, University of Regensburg, Germany
Received: July 3, 2008; Revised Manuscript Received: October 10, 2008

In a blind validation test, the COSMO-RS method, a combination of the quantum chemical dielectric continuum
solvation model COSMO with a statistical thermodynamics treatment for more realistic solvation (RS)
simulations, has been used for the direct prediction of the transfer free energies of 55 demanding pesticide-like
compounds. Comparison with experimental data yields an rms deviation of 2 kcal/mol, which is of the
order of the estimated inaccuracy of the experimental data. A detailed comparison reveals experimental and
computational pitfalls for conformationally flexible, multifunctional, polar compounds.
Introduction
The free energy of hydration, $\Delta G_{\mathrm{hydr}}^{X}$, or equivalently the logarithmic aqueous
Henry's law constant, of a compound X is an important physicochemical property, especially for estimating
the volatility of compounds from aqueous solution. Because of the overwhelming importance of water, and
because of its high polarity, $\Delta G_{\mathrm{hydr}}^{X}$ is also considered the most important
descriptor of solvation strength, and it is widely used to parametrize and validate solvation models in
computational chemistry. Such models now claim to be able to predict $\Delta G_{\mathrm{hydr}}^{X}$ for
small and medium-sized organic compounds with an accuracy of 1 kcal/mol (RMSE), down to 0.5 kcal/mol for
SMx (ref 1) or COSMO-RS (refs 2-4). Nevertheless, in some of the models many of the generally available
experimental data for $\Delta G_{\mathrm{hydr}}^{X}$ have been used to adjust the sometimes large number
of parameters, and hence the predictive power of solvation models can hardly be judged on publicly
available data for such simple organic compounds. Despite the large amount of work spent on the
development and refinement of solvation models, the proper prediction of the free energy of hydration of
large, flexible, multifunctional compounds with less common functional groups is still a challenge. Hence,
for us as developers of the rather fundamental COSMO-RS solvation model, the SAMPL blind test with 63
complex pesticide-like compounds provided a welcome test of our ability to treat such compounds.
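The equivalence between the hydration free energy and the logarithmic Henry's law constant mentioned above can be made concrete with a short Python sketch. It assumes the same molar-concentration standard state in the gas phase and in solution, so that the free energy equals RT ln(K_aw) with K_aw = c_gas/c_aq; the paper does not state which standard state underlies its data, and the function names are purely illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def henry_constant_from_dg(dg_hydr, temperature_k=298.15):
    """Dimensionless air-water partition constant K_aw = c_gas / c_aq.

    Assumption: equal molar-concentration standard states in both phases,
    so that dG_hydr = R T ln(K_aw).
    """
    return math.exp(dg_hydr / (R_KCAL * temperature_k))


def dg_from_henry_constant(k_aw, temperature_k=298.15):
    """Inverse relation: hydration free energy in kcal/mol from K_aw."""
    return R_KCAL * temperature_k * math.log(k_aw)


if __name__ == "__main__":
    dg = -5.73  # kcal/mol, experimental value listed for nitroglycol in Table 1
    k_aw = henry_constant_from_dg(dg)
    print(f"log10(K_aw) = {math.log10(k_aw):.2f}")
```

Under this convention one decadic log unit of K_aw corresponds to RT ln 10, about 1.36 kcal/mol at 298.15 K, so an error of 2 kcal/mol in the free energy is roughly a factor of 30 in the Henry's law constant.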
Part of the special section "Calculation of Aqueous Solvation Energies of Drug-Like Molecules: A Blind Challenge".
* Corresponding author. Phone: +49-2171-731681. E-mail: klamt@cosmologic.de.

Data Set
The SAMPL data set for transfer energies consists mainly of multifunctional, flexible compounds, most of
which originate from pesticide chemistry (ref 5). The original data set had 63 compounds. Unfortunately,
for eight compounds the chemical structures communicated in the SAMPL test were not consistent with the
structures for which experimental data were reported later. For the purpose of this paper we focus on the
remaining 55 compounds. The experimental data were collected from various literature sources by the
contest organizers. Many of the experimental values are derived by temperature extrapolation from
high-temperature solubility measurements because the extremely low room-temperature solubilities are
below typical experimental detection limits. The organizers also gave an estimated error range for each
experimental value. Many of these error bars are almost 2 kcal/mol.
For two compounds, carbofuran and nitralin, the organizers corrected the experimental value based on an
analysis of our COSMO-RS predictions. In this context two more incorrect data points were corrected. In
this paper we report only comparisons with the finally accepted experimental data set.
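To illustrate why temperature-extrapolated values carry large error bars, the following Python sketch fits a van't Hoff type line, ln S = a + b/T, to high-temperature solubilities and evaluates it at room temperature. This functional form is an assumption made here for illustration; the paper does not say how the organizers performed the extrapolation, and the function names and numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def extrapolate_ln_solubility(temps_k, ln_solubility, target_k=298.15):
    """Fit ln S = a + b / T to the measurements and evaluate it at target_k."""
    slope, intercept = fit_line([1.0 / t for t in temps_k], ln_solubility)
    return intercept + slope / target_k


if __name__ == "__main__":
    # Hypothetical measurements at elevated temperatures (not from the paper).
    temps = [350.0, 375.0, 400.0]
    ln_s = [2.0 - 3000.0 / t for t in temps]
    print(extrapolate_ln_solubility(temps, ln_s))
```

Because the target temperature lies well outside the measured range, small errors in the fitted slope are amplified, which is consistent with error bars approaching 2 kcal/mol.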
Methods
As the initial step of the conductor-like screening model for realistic solvation (COSMO-RS), quantum
chemical calculations are performed for a compound X in two reference states: in vacuum, which is the
standard reference state of quantum chemistry, and in the presence of a virtual perfect conductor
surrounding the solute outside a solvent-accessible cavity. The latter state is called the COSMO (ref 6)
reference state. Both calculations are done with a Becke-Perdew (BP) density functional (refs 7, 8) and a
TZVP (refs 9, 10) basis set within the TURBOMOLE program package (ref 11). The key step of the COSMO-RS
method is the conversion of the COSMO polarization surface charge densities of solute and solvent
molecules, which arise from the DFT/COSMO calculations, into a statistical thermodynamics treatment of the
resulting fluid ensemble. For this purpose all interactions in the liquid system are considered as local
pair interactions of surfaces, quantified by the polarization charge densities $\sigma$ and $\sigma'$ of
the surfaces, using the functionals $E_{\mathrm{misfit}}(\sigma,\sigma')$ for the electrostatic
interactions and $E_{\mathrm{hb}}(\sigma,\sigma')$ for hydrogen bond interactions. In addition, van der
Waals interactions are taken into account by a simple surface-proportional approach. Finally, the
statistical thermodynamics of these pairwise interacting surfaces is solved by a thermodynamically
rigorous and highly efficient recursive expression, which yields the total free energies of the solutes in
the liquid, in this case in the aqueous phase. For small and relatively inflexible compounds COSMO-RS thus
yields an accuracy of 0.4 kcal/mol (RMSE) for free energies of phase transfer, including the free energy
of transfer between the gas phase and water. For a more detailed description of the COSMO-RS methodology
see refs 3 and 4.

J. Phys. Chem. B, Vol. 113, No. 14, 2009. 10.1021/jp805853y CCC: $40.75 © 2009 American Chemical Society. Published on Web 03/10/2009.

TABLE 1: Predicted and Experimental Free Energies of Transfer (in kcal/mol) of Pesticide-Like Compounds
                                          prediction  prediction after
compound                   internal_name       orig.     cross-merging  experiment
nitroglycol                CUP08001            -1.75             -1.76       -5.73
1,2-dinitroxypropane       CUP08002            -1.23             -1.33       -4.95
butyl_nitrate              CUP08003             0.02              0.02       -2.09
2-butyl_nitrate            CUP08004             0.43              0.02       -1.82
isobutyl_nitrate           CUP08005             0.13              0.13       -1.88
ethylenglycol_mononitrate  CUP08006            -5.70             -5.73       -8.18
alachlor                   CUP08007            -7.66             -8.02       -8.21
aldicarb                   CUP08008            -9.56             -9.94       -9.84
Ametryn                    CUP08009            -8.32             -8.32       -7.65
Azinphosmethyl             CUP08010           -14.30            -12.32      -10.03
benefin                    CUP08011            -3.69             -3.70       -3.51
bensulfuron                CUP08012           -23.12            -18.44      -17.17
bromacil                   CUP08013           -12.38            -12.38       -9.73
captan                     CUP08014            -9.11             -9.11       -9.01
carbaryl                   CUP08015            -9.57             -9.56       -9.45
carbofuran                 CUP08016           -10.97            -11.15       -9.61
carbophenothion            CUP08017            -6.92             -7.63       -6.50
chlorfenvinphos            CUP08019            -8.63             -9.21       -7.07
chlorimuronethyl           CUP08020           -17.61            -17.59      -14.01
chloropicrin               CUP08021             0.22              0.22       -1.45
chlorpyrifos               CUP08022            -5.38             -5.39       -5.04
dialifor                   CUP08023           -10.58            -10.58       -5.74
diazinon                   CUP08024            -6.22             -6.22       -6.48
dicamba                    CUP08025            -9.46             -9.46       -9.86
dichlobenil                CUP08026            -5.03             -5.03       -4.71
dinitramine                CUP08027            -7.44             -7.43       -5.66
dinoseb                    CUP08028            -4.54             -4.54       -6.23
endosulfan,alpha           CUP08029            -8.84             -8.84       -4.23
endrin                     CUP08030            -7.34             -7.34       -4.82
ethion                     CUP08031            -9.00             -9.00       -6.10
Fenuron                    CUP08032            -9.62             -9.62       -9.13
heptachlor                 CUP08033            -5.91             -5.91       -2.55
isophorone                 CUP08034            -7.23             -7.23       -5.18
Lindane                    CUP08035            -7.04             -7.04       -5.44
malathion                  CUP08036           -10.27            -10.27       -8.15
methyparathion             CUP08038            -7.85             -7.85       -7.19
metsulfuronmethyl          CUP08039           -17.49            -17.92      -15.54
nitralin                   CUP08040           -11.87            -12.00       -7.98
nitroxyacetone             CUP08041            -3.81             -3.97       -5.99
parathion                  CUP08043            -7.65             -7.65       -6.74
pebulate                   CUP08044            -3.57             -3.57       -3.64
phorate                    CUP08045            -4.71             -4.71       -4.37
profluralin                CUP08046            -4.36             -4.36       -2.45
prometryn                  CUP08047            -8.15             -8.17       -8.43
propanil                   CUP08048            -8.94             -8.94       -7.78
pyrazon                    CUP08049           -16.00            -16.00      -16.43
simazine                   CUP08050            -9.74             -9.74      -10.22
Sulfometuronmethyl         CUP08051           -16.63            -17.04      -20.25
terbacil                   CUP08052           -11.27            -11.27      -11.14
terbutryn                  CUP08053            -7.38             -7.63       -6.68
thifensulfuron             CUP08054           -25.62            -19.01      -16.23
trichlorfon                CUP08055           -14.90            -11.98      -12.74
trifluralin                CUP08056            -2.94             -2.89       -3.25
vernolate                  CUP08057            -4.63             -3.54       -4.13
pirimor(pirimicarb)        CUP08063            -8.48             -8.45       -9.41

While the COSMO-RS workflow described above is rather straightforward for conformationally simple
compounds, it becomes considerably more complicated for molecules with a larger number of rotatable bonds.
In such cases, care must be taken that all relevant conformations of a compound are covered by DFT/COSMO
calculations, so that the COSMOtherm program (ref 12), i.e., the program for the COSMO-RS calculations,
can perform a consistent thermodynamic averaging of conformations based on the individual free energies of
the conformers in the gas phase and in solution. While the averaging itself is automated, the search for
the relevant conformations in the gas phase and in solution, which may differ, has to be done externally,
and this usually is the bottleneck of COSMO-RS calculations for flexible multifunctional molecules such as
those in the SAMPL transfer energy contest. For such compounds we have developed a conformational search
algorithm, COSMOconf, which is especially focused on finding the low-lying conformations of flexible
molecules in the gas phase and in a very polar solvent, e.g., in a conductor. COSMOconf is a sophisticated
workflow combining the following computational steps (see Figure 1):
(1) Automated perception of the relevant rotatable bonds and of potential intramolecular hydrogen bonds
using functionality of COSMOfrag (ref 13), resulting in a MOPAC (ref 14) input containing a large number
of constrained MOPAC AM1 (ref 15) optimization jobs related to potentially important conformations.
(2) Constrained and subsequent free AM1 gas-phase optimizations using a locally modified version of
MOPAC7, which is able to treat geometry constraints by penalty potentials.
(3) Clustering of the resulting optimized geometries, yielding an initial set of gas-phase conformations.
(4) In the case of the COSMO conformational search, geometry optimization of the initial conformations
with AM1/COSMO and subsequent clustering of the final geometries with respect to their total energies and
core-core interaction energies, yielding an initial set of COSMO conformations.
(5) Single-point BP/SVP DFT calculations for all conformers below a certain threshold (20 kcal/mol, 50
structures maximum) above the AM1 minimum energy conformer of the initial conformer set (gas phase or
COSMO).
(6) BP/TZVP DFT geometry optimization for all conformers below a certain threshold (3 kcal/mol) above the
BP/SVP minimum energy conformer of the initial conformer set (gas phase or COSMO).
(7) Final clustering, for the case that conformations have merged into each other during the final
optimization.
(8) [Cross-merging of the gas-phase and COSMO conformational sets. This step was added to the workflow
based on the results of the SAMPL blind test, as discussed in the Results section.]

Figure 1. Flow diagram of COSMOconf.

This procedure is run for the gas phase and for the COSMO state. Thus we end up with two final sets of
conformations, energies, and COSMO files, where the COSMO files contain the complete information about the
screening charge densities required for the COSMO-RS postprocessing.
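The energy-window selection (steps 5 and 6) and the clustering (steps 3, 4, and 7) of the workflow can be sketched in a few lines of Python. This is a simplified illustration, not the COSMOconf implementation: the function names, the greedy clustering scheme, and the tolerance values are assumptions; only the 20 kcal/mol, 3 kcal/mol, and 50-structure limits are taken from the text.

```python
def select_within_window(energies, window, max_count=None):
    """Indices of conformers within `window` of the lowest energy, lowest first.

    energies are in kcal/mol; max_count optionally caps the number kept.
    """
    if not energies:
        return []
    e_min = min(energies)
    ranked = sorted(range(len(energies)), key=lambda i: energies[i])
    kept = [i for i in ranked if energies[i] - e_min <= window]
    return kept if max_count is None else kept[:max_count]


def cluster_conformers(descriptors, tolerances):
    """Greedy duplicate removal on (total energy, core-core energy) pairs.

    Two conformers count as the same cluster when every descriptor agrees
    within its tolerance. The tolerance values are hypothetical; the paper
    names the descriptors but gives no thresholds.
    """
    representatives = []
    for i in sorted(range(len(descriptors)), key=lambda i: descriptors[i][0]):
        is_duplicate = any(
            all(abs(a - b) <= tol
                for a, b, tol in zip(descriptors[i], descriptors[j], tolerances))
            for j in representatives
        )
        if not is_duplicate:
            representatives.append(i)
    return representatives


if __name__ == "__main__":
    # Hypothetical relative AM1 energies in kcal/mol.
    am1 = [0.0, 1.2, 19.5, 25.0, 3.3]
    for_single_points = select_within_window(am1, window=20.0, max_count=50)
    print(for_single_points)

    # Hypothetical relative BP/SVP energies for the selected conformers.
    bp_svp = [0.4, 0.0, 2.9, 6.1]
    for_optimization = select_within_window(bp_svp, window=3.0)
    print([for_single_points[i] for i in for_optimization])
```

A tight window keeps the expensive BP/TZVP optimizations affordable, but any conformer missed at the cheaper level is lost for good, which is the kind of failure that the later cross-merging step is meant to catch.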

Finally, the individual gas-phase energies of the conformations are converted into a Boltzmann-averaged
gas-phase free energy

$$ G_{\mathrm{gas}}^{X} = -kT \ln\left[ \sum_i \exp\left( \frac{-E_{\mathrm{gas}}^{X,i}}{kT} \right) \right] \qquad (1) $$

From the COSMO files of the final COSMO conformer set the individual COSMO-RS free energies
$G_{\mathrm{aq}}^{X,i}$ of the conformers in the aqueous phase are calculated, and a Boltzmann average is
calculated as

$$ G_{\mathrm{aq}}^{X} = -kT \ln\left[ \sum_i \exp\left( \frac{-G_{\mathrm{aq}}^{X,i}}{kT} \right) \right] \qquad (2) $$

The last two steps are part of the standard functionality of the COSMOtherm program.
Results
The results for the calculated free energies of transfer together with the experimental data are given in
Table 1 and Figure 2. The statistical analysis of the results yields an RMSE of 2.5 kcal/mol and a mean
unsigned error of 1.8 kcal/mol, in combination with an average underestimation of 0.76 kcal/mol. The
original calculations clearly show two dramatically underestimated hydration free energies, namely for the
compounds bensulfuron and thifensulfuron, which both belong to the same compound class.

Figure 2. Experimental vs predicted free energies of transfer of pesticide-like compounds.

Analysis of the conformations of these compounds immediately showed that our COSMOconf algorithm had
successfully generated the energetically favorable cis-trans conformations, which are stabilized by an
internal hydrogen bond (see Figure 3), in the COSMO reference state, but had failed to generate these
conformations in the gas phase, although they would also have been the most favorable in the gas phase. As
a result, the energy difference between the gas-phase conformational set and the aqueous-phase
conformational set was calculated too strongly in favor of the aqueous phase.

Figure 3. (left) Sulfuron substructure with the two urethane hydrogen atoms in cis position to the
carbonyl group; (right) sulfuron substructure in a cis-trans conformation with internal hydrogen bond.

In order to overcome this apparent weakness of our COSMOconf algorithm, we modified the algorithm such
that in a final cross-check the conformations of each subset (COSMO and gas phase) are mutually merged
into the other subset after reoptimization in the respective other phase. If a conformation is more than
3 kcal/mol higher than the minimum conformer in the other ensemble, or if it coincides with an existing
conformer, it is deleted. The results of the COSMOtherm prediction after cross-merging are also shown in
Table 1 and Figure 2. The cross-merging decreased the error in 10 cases, while small error increases
occurred in 6 cases. The root-mean-squared error (RMSE) decreased to 2.0 kcal/mol (mean unsigned error =
1.6 kcal/mol) at an average underestimation of 0.5 kcal/mol.
Conclusions
The quantum chemically based method COSMO-RS in its COSMOtherm implementation was able to predict the free
energy of transfer of 55 demanding pesticide-like compounds with an accuracy of 2 kcal/mol. Considering
the estimated experimental error of up to 2 kcal/mol, this result can be considered satisfactory, although
it is much worse than the accuracy of 0.5 kcal/mol RMSE that COSMOtherm usually achieves on smaller and
conformationally less demanding compounds, for which the experimental data are usually also known with
much higher accuracy. This project clearly showed the mutual benefits of comparing experimental and
predicted results. While the COSMOtherm results led to the detection of at least two errors in the
reported experimental data, the analysis of two large prediction outliers uncovered a weakness of the
conformational search tool COSMOconf, which was fixed in the course of the study.
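The numerical post-processing used in this study, namely the Boltzmann averaging of conformer energies (eqs 1 and 2) and the error measures quoted in the Results, can be illustrated with a short Python sketch. It is not the COSMOtherm implementation. The function names and the conformer energies are hypothetical, energies are per mole so the gas constant R takes the place of k, and taking the transfer free energy as the plain difference of the two averages (ignoring any reference-state terms) is an assumption. The four predicted and experimental values are the bensulfuron and thifensulfuron entries of Table 1.

```python
import math

R_KCAL = 1.987204e-3  # kcal/(mol K); molar energies, so R replaces k


def boltzmann_free_energy(energies, temperature_k=298.15):
    """G = -kT ln sum_i exp(-E_i / kT), with the minimum factored out
    of the sum for numerical stability."""
    kt = R_KCAL * temperature_k
    e_min = min(energies)
    partition = sum(math.exp(-(e - e_min) / kt) for e in energies)
    return e_min - kt * math.log(partition)


def error_statistics(predicted, experimental):
    """Return (RMSE, mean unsigned error, mean signed error).

    Errors are predicted minus experimental, so a negative mean signed
    error means the predictions are too negative on average.
    """
    errors = [p - x for p, x in zip(predicted, experimental)]
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mue = sum(abs(e) for e in errors) / n
    mse = sum(errors) / n
    return rmse, mue, mse


if __name__ == "__main__":
    # Hypothetical conformer energies in kcal/mol on a common scale.
    aq = [-18.0, -17.0]
    gas_complete = [0.0, 5.0]
    gas_incomplete = [5.0]  # the lowest gas-phase conformer was not found
    g_aq = boltzmann_free_energy(aq)
    print(g_aq - boltzmann_free_energy(gas_complete))
    print(g_aq - boltzmann_free_energy(gas_incomplete))

    # Bensulfuron and thifensulfuron from Table 1 (kcal/mol).
    experiment = [-17.17, -16.23]
    print(error_statistics([-23.12, -25.62], experiment))  # original
    print(error_statistics([-18.44, -19.01], experiment))  # cross-merged
```

The first pair of printed values shows the failure mode discussed in the Results: if the most stable gas-phase conformer is missing, the gas-phase free energy comes out too high and the transfer free energy too negative by roughly the missing energy gap. For the two sulfuron outliers alone, the RMSE drops from about 7.9 to about 2.2 kcal/mol after cross-merging.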
References and Notes
(1) Marenich, A. V.; Olson, R. M.; Kelly, C. P.; Cramer, C. J.; Truhlar, D. G. J. Chem. Theory Comput.
2007, 3, 2011-2033.
(2) Klamt, A. J. Phys. Chem. 1995, 99, 2224.
(3) Klamt, A.; Jonas, V.; Bürger, T.; Lohrenz, J. C. W. J. Phys. Chem. 1998, 102, 5074.
(4) Klamt, A. COSMO-RS: From Quantum Chemistry to Fluid Phase Thermodynamics and Drug Design; Elsevier:
Amsterdam, 2005.
(5) Guthrie, P. University of Western Ontario, private communication.
(6) Klamt, A.; Schüürmann, G. J. Chem. Soc., Perkin Trans. 2 1993, 799.
(7) Becke, A. D. Phys. Rev. A 1988, 38, 3098.
(8) Perdew, J. P. Phys. Rev. B 1986, 33, 8822.
(9) Schäfer, A.; Huber, C.; Ahlrichs, R. J. Chem. Phys. 1994, 100, 5829. Eichkorn, K.; Weigend, F.;
Treutler, O.; Ahlrichs, R. Theor. Chem. Acc. 1997, 97, 119.
(10) Eichkorn, K.; Treutler, O.; Öhm, H.; Häser, M.; Ahlrichs, R. Chem. Phys. Lett. 1995, 242, 652.
(11) Eckert, F.; Klamt, A. COSMOtherm, Version C2.1-Revision 01.07; COSMOlogic GmbH&CoKG, Leverkusen,
Germany (2007); see also URL: http://www.cosmologic.de.
(12) TURBOMOLE, a development of University of Karlsruhe and Forschungszentrum Karlsruhe GmbH, 1989-2007,
TURBOMOLE GmbH, since 2007; see also URL: http://www.turbomole.com.
(13) COSMOfrag, COSMOlogic GmbH&CoKG, Leverkusen, Germany (2008).
(14) Locally modified version of: Stewart, J. J. P. MOPAC7, Version 2; QCPE: Bloomington, 1993.
(15) Dewar, M. J. S.; Zoebisch, E. G.; Healy, E. F.; Stewart, J. J. P. J. Am. Chem. Soc. 1985, 107,
3902-3909.
