
ANALYTICAL SCIENCES NOVEMBER 2017, VOL. 33 1211
2017 © The Japan Society for Analytical Chemistry

Original Papers

Estimation of Retention Time in GC/MS of Volatile Metabolites in Fragrant Rice Using Principle
Components of Molecular Descriptors
Nataporn WIJIT,*1 Sukon PRASITWATTANASEREE,*2 Sugunya MAHATHEERANONT,*1
Peter WOLSCHANN,*3,*4 Supat JIRANUSORNKUL,*5 and Piyarat NIMMANPIPUG*1†

*1 Department of Chemistry and Center of Excellence for Innovation in Chemistry, Faculty of Science and
Graduate School, Chiang Mai University, Chiang Mai 50200, Thailand
*2 Department of Statistics, Faculty of Science, Chiang Mai University, Chiang Mai 50200, Thailand
*3 Department of Pharmaceutical Technology and Biopharmaceutics, University of Vienna, Vienna 1090, Austria
*4 Institute of Theoretical Chemistry, University of Vienna, Vienna 1090, Austria
*5 Department of Pharmaceutical Sciences, Chiang Mai University, Chiang Mai 50200, Thailand

A quantitative structure–retention relationship (QSRR) study was carried out to estimate the retention
times of secondary volatile metabolites in Thai jasmine rice. Chemical components in rice seed were
extracted using solvent extraction, then separated and identified by gas chromatography–mass
spectrometry (GC-MS). A set of molecular descriptors was generated for the substances identified by
GC-MS to represent their molecular structures numerically. Principal component analysis (PCA) and
principal component regression analysis (PCR) were used to model the retention times of these
compounds as a function of the theoretically derived descriptors. The best-fitted regression model was
obtained with an R-squared of 0.900. The informative chemical properties related to retention time
were elucidated. The results of this study demonstrate clearly that the combination of molecular
weight, molecular polarizability and autocorrelation functions of two-dimensional interatomic
distance weighted by atom identity, sigma charge, sigma electronegativity and polarizability can be
considered as comprehensive factors for predicting the retention times of volatile compounds in rice.

Keywords Fragrant rice, GC-MS, QSRR, molecular descriptor

(Received February 27, 2017; Accepted July 4, 2017; Published November 10, 2017)

† To whom correspondence should be addressed.
E-mail: piyaratn@gmail.com

Introduction

World rice production and consumption have been steadily rising since the 1980 crop year. Rice is
one of the major dietary components for many people in the world, especially in the Asia-Pacific
region; about 60% of the world's population relies on rice as a main staple food, and 90% of rice is
produced and consumed in Asia. Rice is the seed of a monocot plant, usually described as a
semiaquatic, annual grass. This carbohydrate-rich plant food is not only consumed as a source of
energy, but also contains a number of beneficial bioactive compounds, e.g. the phenolic acids
ferulic, coumaric and caffeic acid.1–5 One of the famous rice varieties in the world, which has unique
features such as aroma and texture, is Thai jasmine rice (Oryza sativa; Khao Dawk Mali 105 or
KDML105).6–9
Determinations and classifications of chemical components in fragrant rice have been carried out to
interpret the generation of aromatic compounds, to identify the essential chemical structures related
to olfactory properties, and to transfer structure information to property domains. Several methods
can be used to determine volatile compounds from plant materials; they consist of four major parts:
collection, concentration, separation, and quantification. Conventional methods for extraction
(collection and concentration)10 include direct solvent extraction, steam distillation–solvent
extraction, solid–liquid extraction (SLE), supercritical fluid extraction (SFE), purge and trap, static
headspace and solid phase micro-extraction (SPME).10–12 Afterwards, gas chromatography (GC)13 with
a flame ionization detector or a mass spectrometer (MS) as the detector system is used for separation
and analysis of the volatile compounds. For the analysis of volatile components in complicated
samples, gas chromatography–mass spectrometry (GC-MS) is the most popular technique that has been
employed to identify the chemical composition of various plant sample extracts.14–18 From GC
analysis, different rice genetics can be expected to have different volatile profiles: aldehydes and
aromatic compounds were the most abundant odor-active compounds in the aromatic rice types, whereas
in nonaromatic rice aldehydes were the most abundant odor-active compounds.19 KDML 105 jasmine
rice contains a substance called 2-acetyl-1-pyrroline (2-AP), and 2-AP is an important factor for the
formation of the fragrance.6 A second rice volatile compound, 2-acetyl-2-thiazoline, was considered
a specific characteristic of jasmine rice.20 Different
substances were found in a comparison of volatile compounds in a California long-grain rice
cultivar21 and Thai rice (KDML 105)22 using GC-MS. The number of alcohol compounds in Thai rice is
higher than that in California long-grain rice; however, the numbers of aldehyde, ketone, aromatic,
acid, ester and nitrogenous compounds in Thai rice are lower than in California long-grain rice.
A quantitative description of molecular structures provided through parameters and descriptors is a
prerequisite for quantitative structure–property relationship (QSPR) studies.23 Molecular
descriptors can be defined as the outcome of a logical and mathematical procedure that changes
chemical information encoded within a symbolic representation of a molecule into a practicable
number.24 QSPR studies have been utilized to investigate the relationship of a property to the
relevant part of the structures. Consequently, if essential information of each structure can be
extracted and screened properly, a rational predictive model can be constructed. There are several
ways to generate molecular descriptors containing topological, geometric and electronic features to
project all dimensions and information of each structure. The technique is associated with several
advantages and applications, such as estimation of physicochemical properties using substituent
constants, reduction of the number of compounds to be synthesized, and faster detection and
identification of the most favorable compounds.
There are numerous statistical techniques for extended QSPR modeling.25 Principal component analysis
(PCA) and principal component regression analysis (PCR) are used for classification in linear models
and built with the help of a training set and validation using an external prediction set.
Statistical evaluation has been suggested to produce an appropriate predictive model.25 Molecular
descriptors have been designed for encoding structural and physicochemical features and
fingerprints. They can be applied in various fields of structural design and property prediction,
such as analysis of high throughput screening (HTS) results, finding new lead structures and lead
hopping, modeling biological activities26 and in the study of chromatographic retention.27
Retention time in gas chromatographic analysis is the most important criterion to separate and
identify the composition of substances. It is commonly used to identify the type of substance by
comparison with authentic compounds. Nevertheless, most samples are not pure and are sometimes
complex mixtures; therefore, the development of a theoretical model for estimating retention times
seems to be useful for reducing the time spent on analysis. This quantitative determination of the
retention time in chromatographic studies can be defined as a quantitative structure–retention
relationship (QSRR). QSRR has been demonstrated to be a powerful tool in chromatographic studies for
estimation of retention data of novel compounds provided through their molecular descriptors. The
models have been successfully elaborated for many types of chromatography, including gas
chromatography, planar chromatography, column liquid chromatography, micellar liquid chromatography
and affinity chromatography.28
According to a number of previous studies, the superiority of the QSRR model has been shown in
describing retention data using physicochemical properties and Moreau–Broto autocorrelation
topological descriptors. Physicochemical properties are an important factor, as molecules having a
similar structure will also have similar physicochemical properties.29–31 Moreau–Broto
autocorrelations are 2D descriptors derived from the molecular graph weighted by atom
physicochemical properties based on spatial autocorrelation; they contain encoded information on
structural fragments and therefore seem to be particularly suitable for describing differences in
congeneric series of molecules.32,33
In this study, retention times of volatile organic compounds from Thai jasmine rice obtained from
GC-MS were used for QSRR modelling. PCA and PCR were applied to predict retention times of these
compounds as a function of the theoretically derived descriptors.

Experimental

Materials
Thai jasmine rice (Oryza sativa) variety Khao Dawk Mali 105 (KDML105) used in this study was
obtained from the Agricultural Technology Research Institute, Lampang, Thailand. After harvesting,
the rice seeds were kept in cool conditions (–20 °C) until further use in experiments.
For use as an internal standard, 2,4,6-trimethylpyridine (TMP), 99% purity, was purchased from
Aldrich Chemical Co., Milwaukee, WI. The internal standard solution containing 0.25 ppm of TMP was
prepared by dissolving an exactly weighed amount of it in a volume of 0.1 M HCl.

Preparation of rice grain extracts
First, 50 grams of the rice seeds were extracted with 0.1 M hydrochloric acid. TMP (0.25 ppm) was
used as the internal standard. The extracted solution was made alkaline with 0.1 M NaOH and then
extracted with dichloromethane. The organic phase was dried by anhydrous sodium prior to removal of
solvent using a rotary evaporator with reduced pressure. Finally, the residue was subjected to
analysis by GC-MS.

GC-MS analysis
The profile of volatile compounds from the rice extract was determined using GC-MS (Model
6890N/5973, Agilent, Palo Alto, CA). The GC-MS temperature program ran from 45 to 250 °C at a rate
of increase of 3 °C/min, and was then held for 30 min. A capillary column HP-5MS with dimensions of
30 m × 0.25 mm i.d. and 0.5 μm film thickness was used. The injection port temperature was set at
250 °C. Purified helium gas was used as the carrier gas with a flow rate of 1.3 mL/min. The GC
injector was in split mode with a 1:10 split ratio. The MS was operated in the electron impact (EI)
mode with an ionization voltage of 70 eV, and the ion source temperature was set to 230 °C. The MS
quadrupole temperature was 150 °C and the mass scan was in the range of 45 – 550 amu. The volatile
compounds were tentatively identified by matching their mass spectra with reference spectra compiled
in the NIST05 and Wiley7n mass spectral libraries. The structures of these volatile compounds were
confirmed by linear retention index (RI) using n-alkanes (Supelco) as the reference. This experiment
was done in triplicate.

Data set
The retention times of volatile compounds in extracts of KDML105 were obtained from GC-MS analysis.
The data set was divided into two subsets, a training set and a test set. It is difficult to give a
general rule on how to choose the number of observations in each of the two parts. A typical split
might be 20 – 25% for the test set; therefore, 35 compounds were used for the training set and 10
compounds for the test set in this case. The training set was used to model the retention times of
these compounds as a function of chemical descriptors, and the test set was used to evaluate the
predictive ability of the regression model. The structures obtained from GC-MS
Table 1 Structural assignments and EI mass spectral data of volatile compounds in KDML 105 rice analyzed by GC-MS

No  Retention time/min  Structure  m/z (% relative abundance)  MW  Match, %

1 3.92 Octane 43(100), 57(57), 71(40), 85(68), 114(12) 114 91
2 4.75 2,4-Dimethyl-1-heptene 43(89), 55(57), 70(100), 83(28), 111(3), 126(19) 126 91
3 8.39 4-Methylnonane 43(54), 57(100), 71(37), 98(30), 142(5) 142 87
4 9.81 Hexanoic acid 41(24), 60(100), 73(61), 87(19), 99(7), 111(3) 116 90
5 9.89 Decane 43(96), 57(100), 71(59), 85(45), 99(16), 113(6), 142(14) 142 91
6 10.16 2,6-Dimethylnonane 43(67), 57(64), 71(100), 85(27), 113(25), 141(2), 156(3) 156 83
7 11.1 D-Limonene 41(22), 53(20), 68(100), 79(39), 93(83), 107(27), 121(30), 136(29) 136 94
8 12.61 2,5-Dimethyl-2,5-hexanediol 43(96), 55(41), 59(100), 70(81), 77(4), 85(2), 95(37), 105(3), 113(62) 146 83
9 18.68 Dodecane 43(65), 57(100), 71(65), 85(42), 98(9), 112(5), 127(4), 141(3), 170(10) 170 94
10 18.98 Decanal 43(69), 57(100), 70(58), 82(48), 95(28), 112(34), 128(2) 156 91
11 19.25 4,8-Dimethyl-undecane 43(77), 57(66), 71(100), 85(68), 98(5), 113(9), 141(38) 184 90
12 20.07 2,6-Dimethyl-dodecane 43(57), 57(80), 71(100), 85(46), 99(9), 113(21), 127(5), 155(14), 198(2) 198 94
13 20.46 4,6-Dimethyl-dodecane 43(58), 57(76), 71(100), 85(41), 99(10), 113(22), 127(6), 155(11), 183(2), 198(2) 198 94
14 20.77 1,3-Bis(1,1-dimethylethyl)-benzene 41(6), 57(21), 91(5), 115(4), 147(5), 175(100), 190(18) 190 95
15 23.78 2-Methoxy-4-vinylphenol 39(3), 51(5), 63(4), 77(23), 89(4), 107(26), 135(75),150(100) 150 91
16 27.5 Tetradecane 43(63), 57(100), 71(72), 85(52), 99(12), 198(9) 198 98
17 27.66 Vanillin 72(16), 81(20), 109(16), 123(15), 144(21), 152(100) 152 94
18 30.66 Cyclododecane 43(77), 55(100), 69(96), 83(79), 97(56), 111(34), 125(11), 140(14), 168(4) 168 94
19 31.68 Butylated hydroxytoluene 41(7), 57(14), 81(4), 105(5), 145(9), 177(6), 205(100), 220(26) 220 98
20 32.12 2,4-Bis(1,1-dimethylethyl)-phenol 41(4), 57(10), 163(5), 175(3), 191(100), 206(17) 206 95
21 35.54 Hexadecane 43(62), 57(100), 71(78), 85(59), 99(17), 113(10), 127(6), 141(5), 226(8) 226 99
22 38.5 Cyclotetradecane 43(61), 55(89), 69(85), 83(100), 97(76), 111(39), 125(17), 140(7), 168(10), 196(2) 196 91
23 39.86 2-(Dodecyloxy)-ethanol 43(55), 57(100), 71(72), 85(53), 97(34), 111(26), 140(15), 168(10), 199(9) 230 91
24 41.93 Tetradecanoic acid 43(72), 60(76), 73(100), 185(51), 228(34) 228 95
25 42.82 Octadecane 43(65), 57(100), 71(78), 85(62), 99(20), 254(8) 254 97
26 44.21 6,10,14-Trimethyl-2-pentadecanone 43(100), 58(96), 71(63), 85(42), 95(28), 109(33), 123(17), 210(6), 250(14) 268 90
27 45.68 Cyclohexadecane 43(59), 55(80), 69(82), 83(100), 97(85), 111(48), 125(22), 196(8), 224(3) 224 91
28 46.89 2-(Hexadecyloxy)-ethanol 43(59), 57(100), 71(81), 85(59), 97(40), 111(25), 125(14), 168(13), 196(10), 227(9) 286 91
29 40.91 Dibutyl phthalate 57(14), 149(100), 223(6), 278(1) 278 94
30 49.01 n-Hexadecanoic acid 43(61), 60(73), 73(100), 129(58), 213(44), 256(63) 256 99
31 51.36 Oleyl alcohol 41(41), 55(79), 67(65), 82(100), 96(79), 109(35), 123(20), 250(15) 268 91
32 52.15 1-Octadecene 43(59), 55(76), 69(79), 83(100), 97(88), 111(51), 125(25), 224(7), 252(3) 252 99
33 54.29 9,12-Octadecadienoic acid 41(54), 55(77), 67(100), 81(93), 95(69), 109(34), 280(30) 280 99
34 54.47 (E)-9-Octadecenoic acid 41(50), 55(97), 69(94), 83(97), 97(100), 111(42), 125(25), 222(18), 264(55), 282(9) 282 91
35 54.96 Octadecanoic acid 43(75), 60(74), 73(96), 129(77), 185(44), 241(52), 284(100) 284 99

analysis were used to generate molecular descriptors. Molecular descriptors were generated by the
Adriana.Code 2.0 program. The 2D and 3D structures were obtained from ChemSpider, the free chemical
database from the Royal Society of Chemistry. Statistical evaluations of data analyses were
performed mainly by using the SPSS Statistics 17.0 statistical package program.
Physicochemical properties are molecular descriptors, mainly describing transport phenomena.
Two-dimensional descriptors describe how the atoms are connected in terms of chemical bonds and atom
pair properties. The 2D autocorrelation of interatomic distance descriptor was calculated by the
Moreau–Broto equation.34 Topological autocorrelation simultaneously encodes the constitution of a
molecule and the distribution of atom pair properties depending on a distance function, as defined
in Eq. (1):

$$ATS_k = \frac{1}{2}\sum_{i=1}^{A}\sum_{j=1}^{A} w_i w_j \,\delta(d_{ij}; k) \qquad (1)$$

where $ATS_k$ is the autocorrelation coefficient for a certain topological distance $k$ (number of
bonds between two atoms), $w$ is any atomic property, $A$ is the number of atoms in a molecule,
$d_{ij}$ is the topological distance between the $i$th and $j$th atoms, and $\delta(d_{ij}; k)$ is a
distance function.

Results and Discussion

GC-MS analysis, descriptor generation and selection
By using capillary GC-MS, volatile organic compounds in the jasmine rice cultivar KDML105 were
identified. The structure of each volatile in the rice extract obtained by GC-MS is presented in
Table 1. Up to 100 components were separated, and 35 volatile constituents were identified based on
a comparison of their mass spectra with the reference spectra of the libraries. Compounds with less
than 80% matching quality were defined as unknowns and have not been included in Table 1.
All calculated molecular descriptors, including physicochemical properties and autocorrelation of 2D
interatomic distance descriptors, were analyzed to discard descriptors that had a zero value in more
than 50% of the cases. A calculated standard score (z-score), based on the mean and standard
deviation, was used to make norm-referenced interpretations simpler. A screened subset of 76
descriptors was selected from the 96 generated descriptors; the 20 descriptors that were constant or
almost constant were eliminated. The names of the selected molecular descriptors are listed in
Table 2. All 76 descriptors calculated with Adriana.Code 2.0 were used to generate a 35 × 76 data
matrix. A statistical technique was used for determination of the relationship
Table 2 Physicochemical properties and autocorrelation of 2D interatomic distance descriptor

Descriptor name Abbreviation Description


Physicochemical properties
Molecular weight Weight Molecular weight of compound
Number of H-bond acceptors HAcc Number of hydrogen bonding acceptors derived from the sum of nitrogen and
oxygen atoms in the molecule
Octanol/water distribution coefficient XlogP Octanol/water partition coefficient in [log units] of the molecule following the
XlogP approach
Topological polar surface area TPSA Topological polar surface area in [Å²] of the molecule derived from polar 2D
fragments
Mean molecular polarizability Polariz Mean molecular polarizability in [Å³] of the molecule
Molecular dipole moment Dipole Dipole moment in [Debye] of the molecule
Aqueous solubility LogS Solubility of the molecule in water in [log units]
Autocorrelation of 2D interatomic distance
Atom identity Ident 2D autocorrelation weighted by atom identities, i.e., “1” for an atom
Sigma charge SigChg 2D autocorrelation weighted by σ atom charges
Pi charge PiChg 2D autocorrelation weighted by π atom charges
Total charge TotChg 2D autocorrelation weighted by total atom charges (sum of σ and π charges)
Sigma electronegativity SigEN 2D autocorrelation weighted by σ atom electronegativities
Pi electronegativity PiEN 2D autocorrelation weighted by π atom electronegativities
Lone-pair electronegativity LpEN 2D autocorrelation weighted by lone pair electronegativities
Polarizability Polrz 2D autocorrelation weighted by effective atom polarizabilities
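
The autocorrelation descriptors in the second half of Table 2 all follow Eq. (1) and differ only in the atomic property used as the weight w. The following is a minimal sketch of that calculation, not the Adriana.Code implementation: it assumes that the distance function δ equals 1 when the topological distance between two atoms is exactly k and 0 otherwise (the usual convention), and the function names and the hydrogen-suppressed n-butane example are illustrative assumptions.

```python
from collections import deque


def topological_distances(adjacency):
    """All-pairs shortest path lengths (number of bonds) by breadth-first search."""
    n = len(adjacency)
    dist = [[None] * n for _ in range(n)]
    for start in range(n):
        dist[start][start] = 0
        queue = deque([start])
        while queue:
            atom = queue.popleft()
            for neighbor in adjacency[atom]:
                if dist[start][neighbor] is None:
                    dist[start][neighbor] = dist[start][atom] + 1
                    queue.append(neighbor)
    return dist


def autocorrelation(adjacency, weights, k):
    """Eq. (1): half the double sum of w_i * w_j over atom pairs with d_ij == k.

    Assumption: delta(d_ij; k) is 1 if d_ij == k and 0 otherwise.
    """
    dist = topological_distances(adjacency)
    n = len(adjacency)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if dist[i][j] == k:
                total += weights[i] * weights[j]
    return 0.5 * total


# Hydrogen-suppressed graph of n-butane: C0-C1-C2-C3 (illustrative molecule).
butane = [[1], [0, 2], [1, 3], [2]]
# "Atom identity" weighting: every atom contributes the value 1.
identity = [1.0, 1.0, 1.0, 1.0]

ats = [autocorrelation(butane, identity, k) for k in (1, 2, 3)]
print(ats)  # number of atom pairs separated by 1, 2 and 3 bonds
```

With identity weights the coefficient simply counts atom pairs at each distance; replacing `identity` with per-atom σ charges, electronegativities or polarizabilities would give the other descriptor families of Table 2.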

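Table 3 Spearman rank correlation coefficients between molecular descriptors and retention times of volatile compounds in rice (each block lists the descriptor, the correlation coefficient, the 2-tailed p-value, and the number of compounds N)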
Weight HAcc XlogP TPSA Polariz Dipole LogS Ident1 Ident2 Ident3
0.937a 0.317 0.663a 0.347a 0.898a 0.264 –0.626a 0.952a 0.963a 0.838a
0 0.064 0 0.041 0 0.125 0 0 0 0
35 35 35 35 35 35 35 35 35 35
Ident4 Ident5 Ident6 Ident7 Ident8 Ident9 Ident10 Ident11 SigChg1 SigChg2
0.858a 0.881a 0.906a 0.869a 0.794a 0.777a 0.746a 0.757a 0.575a –0.06
0 0 0 0 0 0 0 0 0 0.731
35 35 35 35 35 35 35 35 35 35
SigChg3 SigChg4 SigChg5 SigChg6 SigChg7 SigChg8 SigChg9 SigChg10 SigChg11 PiChg1
0.541a 0.685a 0.711a 0.790a 0.795a 0.786a 0.731a 0.729a 0.751a 0.286
0.001 0 0 0 0 0 0 0 0 0.095
35 35 35 35 35 35 35 35 35 35
PiChg2 PiChg3 PiChg4 PiChg5 PiChg6 PiChg7 PiChg8 PiChg9 PiChg10 PiChg11
–0.26 –0.146 –0.319 –0.166 0.091 –0.105 0.126 0.022 0.105 0.287
0.131 0.402 0.061 0.342 0.604 0.546 0.469 0.902 0.547 0.095
35 35 35 35 35 35 35 35 35 35
TotChg1 TotChg2 TotChg3 TotChg4 TotChg5 TotChg6 TotChg7 TotChg8 TotChg9 TotChg10
0.564a –0.045 0.482a 0.702a 0.707a 0.778a 0.778a 0.786a 0.697a 0.716a
0 0.798 0.003 0 0 0 0 0 0 0
35 35 35 35 35 35 35 35 35 35
TotChg11 SigEN1 SigEN2 SigEN3 SigEN4 SigEN5 SigEN6 SigEN7 SigEN8 SigEN9
0.743a 0.924a 0.908a 0.701a 0.785a 0.889a 0.929a 0.905a 0.800a 0.779a
0 0 0 0 0 0 0 0 0 0
35 35 35 35 35 35 35 35 35 35
SigEN10 SigEN11 PiEN1 PiEN2 LpEN1 Polrz1 Polrz2 Polrz3 Polrz4 Polrz5
0.747a 0.760a 0.235 0.172 0.326 0.848a 0.841a 0.752a 0.799a 0.825a
0 0 0.174 0.323 0.056 0 0 0 0 0
35 35 35 35 35 35 35 35 35 35
Polrz6 Polrz7 Polrz8 Polrz9 Polrz10 Polrz11
0.816a 0.859a 0.791a 0.771a 0.745a 0.758a
0 0 0 0 0 0
35 35 35 35 35 35
a. Correlation is significant at the 0.01 level (2-tailed).
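
The coefficients in Table 3 are Spearman rank correlations, i.e. Pearson correlations computed on the ranks of the two variables. As a hedged illustration of how one such entry is obtained, the sketch below ranks molecular weight against retention time for ten compounds taken from Table 1. The subset and the helper names are choices made for this example; the values in Table 3 were computed with SPSS on all 35 training compounds and will differ.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of ranks pos+1 .. end+1
        for idx in order[pos:end + 1]:
            ranks[idx] = shared
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Molecular weight and retention time (min) of ten compounds from Table 1:
# octane, hexanoic acid, decane, D-limonene, dodecane, decanal,
# tetradecane, vanillin, hexadecane, octadecane.
mol_weight = [114, 116, 142, 136, 170, 156, 198, 152, 226, 254]
ret_time = [3.92, 9.81, 9.89, 11.10, 18.68, 18.98, 27.50, 27.66, 35.54, 42.82]

rho = spearman(mol_weight, ret_time)
print(round(rho, 3))
```

For this subset the coefficient is about 0.90: the n-alkanes are perfectly ordered by weight, while the oxygenated compounds such as vanillin elute later than their weight alone would suggest.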
between the selected molecular descriptors and the retention times of volatile compounds in rice.
The Spearman rank correlation coefficient was utilized to identify the physicochemical properties
and autocorrelation of 2D interatomic distance descriptors associated with the retention time
(Table 3); the descriptors are abbreviated as in Table 2, with the subscript indicating the
topological distance k in Eq. (1).

QSRR model and informative descriptors elucidation
PCA is a mathematical procedure that uses an orthogonal transformation to convert a set of
observations of possibly correlated variables into a set of values of linearly uncorrelated
variables called principal components. PCA was applied for reduction of the molecular descriptor
dimension. The selected descriptors obtained from these structures were used for PCA to extract the
relevant elements, which could be reduced to eight components accounting for 94.64% of the total
variance, as shown in Table 4; the molecular descriptors obtained in each component are given in
Table 5.
Modeling of retention times as a function of theoretically derived descriptors of each chemical
structure was established by PCA and PCR. The eight components from PCA of molecular descriptors
were selected to build an appropriate model to determine the relationship between the retention time
of a compound and its chemical structure. PCR was carried out using the retention time as the
dependent variable and the eight major components of the molecular descriptor variables as
independent variables. All-enter regression, forward selection, backward elimination and stepwise
regression were applied. All techniques give the same best-fitted model, with an R-squared of 0.900,
as shown in Table 6, and the equation for prediction of retention time in this model is defined as
Eq. (2):

$$y = -0.023 + 0.945x_1 \qquad (2)$$

where y is the standardized (z-score) retention time and x1 is the score on the first principal
component, PCA1 (Tables 6 and 7).
The discriminating power of the variables affecting retention time was in the order of molecular
polarizability of the molecule, molecular weight of the compound, σ atom electronegativities, σ atom
charges, and total atom charges (Ident1, Weight, Ident2, Polariz, SigEN7, SigEN1, SigEN2, SigEN9,
Polrz9, Ident9, SigEN6, Ident7, SigEN11, Ident11, SigEN10, Polrz11, Polrz10, SigEN8, Ident10,
TotChg8, Ident6, SigChg8, Ident5, Polrz7, Ident8, SigChg11, Polrz8, Polrz1, SigEN5, TotChg11,
Polrz6, SigChg10, Ident4, SigChg9, TotChg10, Polrz2, Ident3, TotChg9, Polrz5, SigEN4, SigEN3,
SigChg7, TotChg7, SigChg5).

Table 4 Cumulative variation and eigenvalue in each principal component of chemical structure

Total variance explained
Component  Initial eigenvalue  % of Variance  Cumulative, %
1  36.175  47.599  47.599
2  13.469  17.723  65.322
3  8.171  10.751  76.073
4  4.850  6.382  82.455
5  2.882  3.793  86.248
6  2.687  3.535  89.783
7  2.000  2.632  92.415
8  1.693  2.228  94.643

Table 5 Molecular descriptors in each component from PCA

Component 1: Ident1, Weight, Ident2, Polariz, SigEN7, SigEN1, SigEN2, SigEN9, Polrz9, Ident9, SigEN6,
Ident7, SigEN11, Ident11, SigEN10, Polrz11, Polrz10, SigEN8, Ident10, TotChg8, Ident6, SigChg8,
Ident5, Polrz7, Ident8, SigChg11, Polrz8, Polrz1, SigEN5, TotChg11, Polrz6, SigChg10, Ident4,
SigChg9, TotChg10, Polrz2, Ident3, TotChg9, Polrz5, SigEN4, SigEN3, SigChg7, TotChg7, SigChg5
Component 2: HAcc, LpEN1, SigChg2, TPSA, SigChg1, TotChg1, PiEN1, TotChg2, Dipole, PiEN2, LogS, XlogP,
PiChg1, PiChg2, PiChg6, PiChg7, PiChg5
Component 3: Polrz3, Polrz4, SigChg3, SigChg4, TotChg5
Component 4: TotChg3, TotChg4
Component 5: SigChg6, TotChg6
Component 6: PiChg9, PiChg11, PiChg10, PiChg8
Component 7: PiChg4
Component 8: PiChg3

Table 6 Statistical parameters of PCR model

All enter regression
Variable  b (Unstandardized coefficient)  SEb (Standard error)  β (Standardized coefficient)  t  p-value
PCA1(x1)  0.947  0.057  0.95  16.72  0
PCA2(x2)  0.037  0.056  0.037  0.657  0.517
PCA3(x3)  0.043  0.057  0.043  0.757  0.456
PCA4(x4)  0.076  0.056  0.078  1.364  0.184
PCA5(x5)  0.019  0.056  0.02  0.344  0.733
PCA6(x6)  0.02  0.056  0.02  0.351  0.728
PCA7(x7)  –0.046  0.056  –0.047  –0.82  0.42
PCA8(x8)  –0.061  0.056  –0.061  –1.08  0.29
Constant –0.025; SEest (Standard error of the estimate) = ±0.33; R = 0.957; R2 = 0.916; F = 35.464; p-value <0.001.

Stepwise regression
Variable  b  SEb  β  t  p-value
PCA1(x1)  0.945  0.055  0.949  17.249  0
Constant –0.023; SEest (Standard error of the estimate) = ±0.32; R = 0.949; R2 = 0.900; F = 297.513; p-value <0.001.
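
The PCA and PCR steps described in this section (standardize the descriptors, project them onto principal components, regress the standardized retention time on the component scores) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the SPSS procedure used in the study: it keeps only the first principal component, obtains it by power iteration, and uses two descriptors for ten compounds. Molecular weight and retention time are taken from Table 1; the heavy-atom count is an illustrative second descriptor that is not part of the paper's descriptor set, and all function names are hypothetical.

```python
import math

# (name, molecular weight, heavy-atom count, retention time/min).
# MW and RT come from Table 1; the heavy-atom count is an illustrative
# stand-in descriptor (assumption), not one of the 76 used in the paper.
COMPOUNDS = [
    ("Octane", 114, 8, 3.92),
    ("Hexanoic acid", 116, 8, 9.81),
    ("Decane", 142, 10, 9.89),
    ("D-Limonene", 136, 10, 11.10),
    ("Dodecane", 170, 12, 18.68),
    ("Decanal", 156, 11, 18.98),
    ("Tetradecane", 198, 14, 27.50),
    ("Vanillin", 152, 11, 27.66),
    ("Hexadecane", 226, 16, 35.54),
    ("Octadecane", 254, 18, 42.82),
]


def zscores(values):
    """Standard scores using the sample standard deviation (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def first_principal_component(columns, iterations=200):
    """Unit-length PC1 loadings by power iteration on X^T X, and the PC1 scores."""
    rows = list(zip(*columns))
    p = len(columns)
    loadings = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        scores = [sum(x * w for x, w in zip(row, loadings)) for row in rows]
        update = [sum(s * col[i] for i, s in enumerate(scores)) for col in columns]
        norm = math.sqrt(sum(u * u for u in update))
        loadings = [u / norm for u in update]
    scores = [sum(x * w for x, w in zip(row, loadings)) for row in rows]
    return loadings, scores


def fit_line(x, y):
    """Ordinary least squares y = a + b*x; returns (a, b, R squared)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return intercept, slope, 1.0 - ss_res / ss_tot


z_columns = [zscores([c[1] for c in COMPOUNDS]), zscores([c[2] for c in COMPOUNDS])]
z_rt = zscores([c[3] for c in COMPOUNDS])

loadings, pc1 = first_principal_component(z_columns)
intercept, slope, r_squared = fit_line(pc1, z_rt)
print(loadings, intercept, slope, r_squared)
```

Because both the scores and the response are standardized, the fitted intercept is essentially zero, which mirrors the small constant in Eq. (2). A full treatment would extract all eight components and compare selection strategies as in Table 6.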
Table 7 An external test set of compounds used to test the performance of the QSRR model

No Structure PCA1(x1) ZRTesta RTestb RTexpc

1 1,2,3,3,4-Pentamethyl-cyclopentene –1.05597 –1.02089 11.41566 10.86
2 2,4,6-Trimethyloctane –0.81923 –0.79718 15.16913 15.24
3 2,4,6-Trimethylpyridine –1.15378 –1.11332 9.864775 9.77
4 2,6-bis(1,1-Dimethyl-ethyl)-4-ethylphenol 1.114564 1.030263 45.83022 44.74
5 2,6-Dimethylundecane –0.28521 –0.29253 23.63624 22.89
6 Hexanal –1.42969 –1.37406 5.490108 5.58
7 Nonanal –0.85934 –0.83508 14.53326 14.95
8 Pentylcyclopropane –1.37257 –1.32008 6.395817 6.47
9 Tetracosane 2.233823 2.087963 63.5765 62.47
10 Tricosane 1.999677 1.866695 59.86402 58.21
PRESS 6.215594
RMSE 0.831036

a. Z-score of predicted retention time values. b. Predicted retention time values. c. Experimental retention time values.
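
The values in Table 7 can be checked directly. The sketch below applies Eq. (2) to the PCA1 scores to recover the z-scores of the predicted retention times, then computes PRESS as the sum of squared differences between experimental and predicted retention times and RMSE as the square root of PRESS divided by N − 1. The variable names are illustrative; all numbers are copied from Table 7.

```python
import math

# Columns of Table 7 for the ten external test compounds.
pca1 = [-1.05597, -0.81923, -1.15378, 1.114564, -0.28521,
        -1.42969, -0.85934, -1.37257, 2.233823, 1.999677]
z_rt_table = [-1.02089, -0.79718, -1.11332, 1.030263, -0.29253,
              -1.37406, -0.83508, -1.32008, 2.087963, 1.866695]
rt_pred = [11.41566, 15.16913, 9.864775, 45.83022, 23.63624,
           5.490108, 14.53326, 6.395817, 63.5765, 59.86402]
rt_exp = [10.86, 15.24, 9.77, 44.74, 22.89, 5.58, 14.95, 6.47, 62.47, 58.21]

# Eq. (2): standardized retention time from the first principal component score.
z_rt = [-0.023 + 0.945 * x1 for x1 in pca1]

# PRESS: sum of squared prediction errors on the external set.
press = sum((obs - pred) ** 2 for obs, pred in zip(rt_exp, rt_pred))
# RMSE with N - 1 in the denominator.
rmse = math.sqrt(press / (len(rt_exp) - 1))

print(round(press, 4), round(rmse, 4))
```

The computed PRESS and RMSE agree with the tabulated 6.2156 and 0.8310, and the z-scores from Eq. (2) match the ZRTest column to the reported precision.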

In this case, 2D autocorrelations weighted by atom identities, effective atom polarizabilities,
σ atom charges, σ atom electronegativities, and total atom charges are the crucial factors in
retention time prediction. Among the Moreau–Broto autocorrelation coefficients for particular
topological distances, the 2D autocorrelation weighted by atom identities for each atom itself and
for adjacent pairs, and the products of atom-pair σ atom electronegativities summed for seven bonds
between atoms, contributed most to this model. Efficient separation of compounds in GC depends on
the different movement rates of compounds in the column. An important factor affecting retention
time is the polarizability of the molecule. If the polarities of the stationary phase and the
compound are similar, the retention time increases because the compound interacts strongly with the
stationary phase. As a result, polar compounds have long retention times on polar stationary phases
and shorter retention times on non-polar columns under the same conditions.
The other factors that influence retention time are sigma charge, total charge and sigma
electronegativity, which are responsible for polarizability because the distribution of charge leads
to the dipole of molecules related to polarizability. Molecular weight is a factor that influences
the separation time of the components and is often related to the boiling point of a compound,
because low-boiling (volatile) components will move faster through the column than high-boiling
components.
The validation of the model was performed by the prediction error sum of squares (PRESS) and the
root mean squared error (RMSE). The predicted values of the external samples were compared with the
observed values using PRESS, Eq. (3), the sum of squared residuals computed in validation, and RMSE,
Eq. (4). The external test set gave a PRESS of 6.2156 and an RMSE of 0.8310, as shown in Table 7.
The retention times predicted by the regression equation for the compounds in the training and test
sets were plotted against the experimental retention time values in Fig. 1.

$$PRESS = \sum_{i=1}^{N} (y_i - \hat{y}_{i,cv})^2 \qquad (3)$$

$$RMSE = \sqrt{\frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{N-1}} \qquad (4)$$

In this case, the relevant components regarding PC1 were mainly contributed by molecular weight,
molecular polarizability, atom identity, sigma charge, sigma electronegativity and polarizability
(Table 5).

Fig. 1 The predicted versus the experimental retention time (RT) by PCR.

Conclusions

This study revealed the potential of GC-MS for the separation and identification of secondary
volatile metabolites in Thai jasmine rice extract. Altogether, up to 100 volatile components were
separated and 35 compounds were identified. Using input from the GC-MS experiment, the volatile
metabolites were identified. The PCA results show that two principal components (PC1 and PC2)
describe 65.32% of the overall variance and eight principal components describe 94.64% of the
overall variance. PCR was used to simulate retention time patterns, and a QSRR model was proposed.
Modeling of the retention times of the rice volatile compounds as a function of theoretical
descriptors gave a best-fitted model with an R-squared of 0.900, and the external test set of
volatile compounds gave a PRESS of 6.2156 and an RMSE of 0.8310.
The results of this study demonstrate that the molecular information in terms of molecular weight,
molecular polarizability, atom identity, sigma charge, sigma electronegativity and polarizability
can be considered as comprehensive descriptors for predicting the retention times of volatile
compounds in rice. This information supports the fact that retention time in gas chromatography is
the result of the distribution of the solute between the mobile and stationary phases, while the
molecular structure and chemical properties of the solute determine the type and extent of the
interactions of the solute with these phases. The differences between these properties govern the
retention behavior through the column.
The ultimate goal of this study has been accomplished: QSRR models for the prediction of GC
retention times of various volatile components from Thai rice can be developed successfully. The
proposed models have good predictive ability and are of high statistical significance. The models
are helpful for the discovery of new components in Thai rice using retention time projected to
molecular descriptors of the compounds, which can be used as fragment information for structural
elucidation of unknown components, and the PCA is useful for highlighting the key molecular
descriptors for explaining chromatographic mechanisms.

Acknowledgements

We gratefully acknowledge the Center of Excellence for Innovation in Chemistry (PERCH-CIC) and the
Graduate School, Chiang Mai University, for financial support. P. N. and N. W. acknowledge partial
financial support from CMU-IC research project for Asean+3 Cross Border Research, the Center of
Excellence for Innovation in Analytical Science, CMU and Standardization and Development of Miang
Extract and Chemical Analysis Methodology Project, ARDA & NRCT, Thailand.

References

1. W. E. Marshall and J. I. Wadsworth, “Rice Science and Technology”, 1993, Taylor & Francis, New York.
2. B. O. Juliano, “Rice: Chemistry and Technology”, 1985, American Association of Cereal Chemists, Minnesota.
3. V. Leardkamolkarn, W. Thongthep, P. Suttiarporn, R. Kongkachuichai, S. Wongpornchai, and A. Wanavijitr, Food Chem., 2011, 125, 978.
4. B. M. Rao, U. V. R. V. Saradhi, N. S. Rani, S. Prabhakar, G. S. V. Prasad, G. S. Ramanjaneyulu, and M. Vairamani, Food Chem., 2007, 105, 736.
5. T. Sriseadka, S. Wongpornchai, and P. Kitsawatpaiboon, J. Agric. Food Chem., 2006, 54, 8183.
6. R. G. Buttery, L. C. Ling, B. O. Juliano, and J. G. Turnbaugh, J. Agric. Food Chem., 1983, 31, 823.
7. R. J. Bryant and A. M. McClung, Food Chem., 2011, 124, 501.
8. N. J. N. Yau and T. T. Liu, J. Sens. Stud., 1999, 14, 209.
9. A. M. D. Mundo and B. O. Juliano, J. Texture Stud., 1981, 12, 107.
10. G. Reineccius, “Flavor Chemistry and Technology”, 2nd ed., 2005, CRC Press, New York.
11. X. Yang and T. Peppard, J. Agric. Food Chem., 1994, 42, 1925.
12. A. Steffen and J. Pawliszyn, J. Agric. Food Chem., 1996, 44, 2187.
13. G. B. Lockwood, J. Chromatogr. A, 2001, 936, 23.
14. C. C. Grimm, C. Bergman, J. T. Delgado, and R. Bryant, J. Agric. Food Chem., 2001, 49, 245.
15. H. S. Lam and A. Proctor, J. Food Sci., 2003, 68, 2676.
16. E. T. Champagne, J. F. Thompson, K. L. Bett-Garber, R. Mutters, J. A. Miller, and E. Tan, Cereal Chem., 2004, 81, 444.
17. S. Wongpornchai, K. Dumri, S. Jongkaewwattana, and B. Siri, Food Chem., 2004, 87, 407.
18. Z. Zeng, H. Zhang, J. Y. Chen, T. Zhang, and R. Matsunaga, Cereal Chem., 2007, 84, 423.
19. D. S. Yang, R. L. Shewfelt, K. S. Lee, and S. J. Kays, J. Agric. Food Chem., 2008, 56, 2780.
20. K. Mahattanatawee and R. L. Rouseff, Food Chem., 2014, 154, 1.
21. R. G. Buttery, J. G. Turnbaugh, and L. C. Ling, J. Agric. Food Chem., 1988, 36, 1006.
22. S. Mahatheeranont, S. Promdang, and A. Chiampiriyakul, Kasetsart J. Nat. Sci., 1995, 29, 508.
23. M. Grover, B. Singh, M. Bakshi, and S. Singh, Pharm. Sci. Technol. Today, 2000, 3, 28.
24. Z. Garkani-Nejad, M. Karlovits, W. Demuth, T. Stimpfl, W. Vycudilik, M. Jalali-Heravi, and K. Varmuza, J. Chromatogr. A, 2004, 1028, 287.
25. L. Xu and W.-J. Zhang, Anal. Chim. Acta, 2001, 446, 475.
26. M. Wagener, J. Sadowski, and J. Gasteiger, J. Am. Chem. Soc., 1995, 117, 7769.
27. T. Gobbo-Neto, J. Schmidt, and F. B. Da Costa, J. Chem. Inf. Model., 2015, 55, 26.
28. K. Héberger, J. Chromatogr. A, 2007, 1158, 273.
29. S. Z. Kovacevic, S. O. Podunavac-Kuzmanovic, L. R. Jevric, P. T. Jovanov, E. A. Djurendic, and J. J. Ajdukovic, Eur. J. Pharm. Sci., 2016, 93, 1.
30. M. M. Talmaciu, E. Bodoki, J. Platts, and R. Oprean, Stud. Ubb. Che., 2016, 4, 99.
31. L. T. Qin, S. S. Liu, F. Chen, Q. F. Xiao, and Q. S. Wu, Chemosphere, 2013, 90, 300.
32. T. B. Oliveira, L. Gobbo-Neto, T. J. Schmidt, and F. B. Da Costa, J. Chem. Inf. Model., 2015, 55, 26.
33. M. H. Fatemi and H. Malekzadeh, J. Iran. Chem. Soc., 2015, 12, 405.
34. T. Puzyn, J. Leszczynski, and M. T. Cronin, “Recent Advances in QSAR Studies: Methods and Applications”, 2010, Springer Netherlands, Dordrecht.
