Moral Geosta IJC 2010

INTERNATIONAL JOURNAL OF CLIMATOLOGY Int. J. Climatol. 30: 620–631 (2010) Published online 9 April 2009 in Wiley InterScience (www.interscience.wiley.com) DOI: 10.1002/joc.1913
Comparison of different geostatistical approaches to map climate variables: application to precipitation

Francisco J. Moral*
Department of Graphic Representation, University of Extremadura, 06071 Badajoz, Spain
ABSTRACT: The benefits of integrating a geographical information system (GIS) with a geostatistical approach to accurately model the spatial distribution pattern of precipitation are well known. However, the determination of the most appropriate geostatistical algorithm for each case is usually neglected, even though the best interpolation technique must be selected for each study area to obtain accurate results. In this work, the ordinary kriging (OK), simple kriging (SK) and universal kriging (UK) methods are compared with three multivariate algorithms which take into account the altitude: collocated ordinary cokriging (OCK), simple kriging with varying local means (SKV) and regression-kriging (RK). The different techniques are applied to monthly and annual precipitation data measured at 136 meteorological stations in a region of southwestern Spain (Extremadura). After carrying out cross-validation, the smallest prediction errors are obtained for the three multivariate algorithms; in particular, SKV and RK outperform collocated OCK, which needs a more demanding variogram analysis. These algorithms are easily implemented in a GIS, requiring only the residual estimates and map algebra capability to generate the final maps. Results show the necessity of accounting for spatially dependent precipitation data and the collocated altitude to accurately define monthly and annual precipitation maps. Copyright 2009 Royal Meteorological Society
KEY WORDS
kriging; precipitation; altitude; geographical information system; regression
Received 11 July 2008; Revised 22 January 2009; Accepted 8 March 2009
1. Introduction
There are many different areas of research (e.g. climatology, agriculture, ecological modelling, hydrology) that require interpolated surfaces or gridded datasets of climate variables. Consequently, numerous attempts have been made at spatial interpolation using a variety of methods. Surfaces of climate variables have been interpolated, using point data, for areas ranging from a few thousand square kilometres (Ninyerola et al., 2000; Vicente-Serrano et al., 2003) to the continental scale (Hulme et al., 1995, 1996) and even for the entire globe (Willmott and Robeson, 1995). The main problem, prior to the selection of the most appropriate estimation technique, is the availability of climatic data. Sometimes data are recorded at permanent but widely dispersed weather stations, especially in mountainous areas, where climatic values are more difficult to predict due to the complex topography. Even in flatter areas, weather stations should be properly distributed to detect the influences of air flows, surrounding mountains, thermal inversions and other phenomena that could affect the climatic patterns.
* Correspondence to: Francisco J. Moral, Department of Graphic Representation, University of Extremadura, 06071 Badajoz, Spain. E-mail: fjmoral@unex.es
The spatial interpolation methods differ in their assumptions, deterministic or statistical nature, and local (they use the data of the nearest sampling points to estimate at unsampled locations) or global (they use the data of all sampling points to estimate at unsampled sites) perspective (Burrough and McDonnell, 1998). Some examples of deterministic techniques can be found in the works of Legates and Willmott (1990), inverse distance weighting; Hutchinson and Gessler (1994), splines; and Agnew and Palutikof (2000) or Vicente-Serrano et al. (2003), empirical multiple regressions. However, it is recognized that the statistical approach (geostatistical methods, or kriging) has several advantages over the deterministic techniques (Isaaks and Srivastava, 1989; Goovaerts, 1997). Nowadays, geostatistics is widely used in climate mapping (Atkinson, 1997; Goovaerts, 1997). An important advantage of kriging is that it gives unbiased predictions with minimum variance while taking into account the spatial correlation between the data recorded at different weather stations. Some studies have shown that kriging provides better estimates than other techniques (e.g. Phillips et al., 1992; Goovaerts, 2000), but other authors have found that results depend on the sampling density (Dirks et al., 1998). A major advantage of kriging over simpler methods, besides providing a measure of prediction error (kriging variance), is the possibility of complementing the sample data, when they are sparse, with secondary or auxiliary information which can help with interpolation.
Those sources of knowledge are: (1) data from a cheap-to-measure covariable which is known at many more points, and (2) an empirical spatial model of a driving process. For precipitation, weather-radar observations can be the secondary data, as Azimi-Zonooz et al. (1989) and Raspa et al. (1997) considered when estimating precipitation fields using multivariate extensions of kriging (cokriging and kriging with an external drift, respectively). However, Goovaerts (2000) suggested the use of altitude from a digital elevation model (DEM) as another valuable and cheaper source of auxiliary data. It is known that precipitation tends to increase with elevation, due to the orographic effect of mountainous areas, where the air is lifted vertically and condensation occurs because of adiabatic cooling. Goovaerts (2000) showed that geostatistical algorithms outperform deterministic techniques and that, in particular, multivariate extensions of kriging that consider altitude generate the best results. More recently, Diodato (2005) also found better estimates when ordinary cokriging (OCK), with altitude as the auxiliary data, was compared with ordinary kriging (OK). In recent years, some mixed interpolation techniques have been developed, combining kriging and the secondary information. According to Hengl et al. (2003), these methods can be classified depending on the properties of the input data. When the number of secondary variables is low and these auxiliary data are not available at all grid nodes, cokriging is the most appropriate interpolation technique. If auxiliary data are available at all grid nodes and correlated with the primary or target variable, kriging with a trend model or external drift (Hudson and Wackernagel, 1994; Bourennane et al., 2000) is the appropriate interpolation method. This non-stationary geostatistical technique has three different approaches from a computational point of view.
In the first, called universal kriging (UK), the trend is modelled as a function of coordinates (Deutsch and Journel, 1992; Wackernagel, 1998). If the trend is defined externally, with some secondary variables, the term kriging with external drift (or trend) is used (Wackernagel, 1998; Chilès and Delfiner, 1999). The third approach consists of regression modelling: the trend is modelled outside the kriging algorithm, followed by kriging of residuals. This was called regression-kriging (RK) by Odeh et al. (1994, 1995), while Goovaerts (1999) employed the term kriging after detrending. Another multivariate extension of kriging is the simple kriging (SK) with varying local means algorithm. It is similar to the kriging with external drift method, but has some advantages over it (Goovaerts, 1997). Besides the previously cited references, there are others on the use of different geostatistical techniques to interpolate precipitation data. Martínez-Cob (1996) obtained the best estimates using cokriging, including topography to improve predictions. Pardo-Igúzquiza (1998) found the best results for the prediction of precipitation by means of kriging with an external drift. However, according to Goovaerts (1999), RK has
proven to be superior to simpler geostatistical methods. Therefore, several methods must be compared to establish the best technique to estimate precipitation in a particular area or region. There is more agreement when deterministic and geostatistical algorithms are compared: the great majority of works show better results when kriging, or any multivariate extension of it, is used. There are also some references on modelling climatological variables using geographical information systems (GIS). For example, Ninyerola et al. (2000) and Agnew and Palutikof (2000) integrate statistical and GIS techniques to make climatic maps. The linkage of GIS, statistics and geostatistics provides a complementary set of tools for spatial analysis (Burrough, 2001). In this paper, monthly and annual precipitation data from the Extremadura region (Spain) are interpolated to generate high-resolution maps, using two types of geostatistical methods: (1) algorithms that use only precipitation data recorded at the meteorological stations (ordinary kriging, OK, and simple kriging, SK); (2) algorithms that combine precipitation data with auxiliary information (universal kriging, UK; ordinary cokriging, OCK; simple kriging with varying local means, SKV; and regression-kriging, RK). Prediction performances of the algorithms are compared using cross-validation, and the one with the highest accuracy is selected to map precipitation. Thirteen maps were the outcome of this work: 12 maps of mean monthly precipitation and 1 of mean annual precipitation. The reasons for the different performance of the approaches are also investigated.
2. Site description
This work is centred on Extremadura (latitude between 37°57′ and 40°29′ N, longitude between 4°39′ and 7°33′ W). The region is located in the southwest of Spain on the Portuguese border. It is one of the largest regions in Europe, with a surface area of approximately 41 600 km², the size of Belgium. Extremadura shows a great contrast, with wide agricultural and forest areas, and is considered to be one of the most important ecological enclaves in Europe. In the north lie districts with gentle wooded hills, the Sierra de Gata and Hurdes, that through the fertile valleys of the Alagón, Jerte and La Vera link with the high Gredos mountains. In the east lie the irrigated lands of the river Tagus, the rugged Villuercas, and the areas of Los Montes and La Serena, with the longest interior coast in the Iberian Peninsula, that descend further to the south, to the agricultural areas of La Campiña. In the west the great plains of Brozas and Alcántara drop to the San Pedro mountains and lead into the rich plain of the Guadiana river. In the south lie the great pasture lands and the mountains of Jerez, Tentudía and Hornachos. The maximum and minimum altitudes in the region are 2091 m a.s.l., in the Gredos mountains, and 116 m a.s.l., in the Guadiana valley (near the border between Spain and Portugal), respectively. The mean altitude is about 425 m
Figure 1. Location and topography of the study area.
a.s.l. Figure 1 shows the DEM, with a spatial resolution of 1 km, used in this research. The climate of Extremadura is characterized by a variation in both temperature and precipitation typical of a Mediterranean climate. However, this feature is modified by the interior location of the region and by oceanic influences that penetrate the peninsula due to its proximity to the Atlantic. Mean annual precipitation is below 600 mm in most of the region, and even below 400 mm in the centre of the Guadiana valley, but it can far exceed 1000 mm in the northern (Gredos) and eastern (Guadalupe) mountainous areas. One of the most important characteristics of the precipitation is its interannual variability. There is a dry season, from June to September, and a wet season, from October to May (80% of the precipitation falls between these months). Periodic droughts lasting 2 or more years, occurring about once every 8–9 years, are frequent (Almarza, 1984). Extremadura is a semiarid region, where the water balance is negative.
3. Material and methods
3.1. Data
Although active meteorological stations are numerous in Extremadura, daily precipitation data were obtained only from those with series of more than 40 years, at least between 1961 and 2000. Thus, the final spatial database contains 136 precipitation stations, which were georeferenced in UTM coordinates. The World Meteorological Organization (WMO) gives 30 years as the optimal length for a series with the aim of obtaining reliable climate data to make predictions (WMO, 1967).
Figure 2 shows the spatial distribution of the weather stations used in this work. The locations of the stations are not ideal: their density varies considerably across the region. Many stations are placed in flat areas, so mountainous areas are poorly covered, and even in some flat areas stations are unevenly distributed. The daily observations made at all stations have passed through a rigorous quality control procedure. Consistency checks were applied to the data, and series shorter than 40 years were filtered out, to avoid possible errors during the analysis of the information. Data selection was very strict: only weather stations with complete years were kept. After assuring the raw data quality, monthly precipitation averages were calculated. This information was loaded into the spatial database and used as the source of input data for the gridding process. Some basic sample statistics were also determined (Table I). Fluctuations of monthly and annual precipitation from one year to another were not investigated. Elevation is another source of information. The altitude of each meteorological station was included in the spatial database. This auxiliary information can help to improve the estimates of precipitation.
3.2. Geostatistical interpolation techniques
In this section, the estimators used in the case study are briefly introduced. More information about them can be found in, for instance, Isaaks and Srivastava (1989) and Goovaerts (1997). Each precipitation datum is regarded as one drawn at random, according to some law, from some probability distribution. This point of view, in which the studied variable (precipitation) is considered random and distributed continuously (regionalized variable), i.e. the phenomenon
Figure 2. (a) Locations of the 136 meteorological stations and (b) annual precipitation map obtained after considering the linear regression between precipitation and altitude data from these stations.
Table I. Descriptive statistics of the monthly and annual precipitation (mm) data for the 136 rain gauge stations.
Period     Mean    Median  SD      MXV      MNV     SK    KU     Cor
January    80.03   73.23   29.84   215.93   46.41   2.09  5.34   0.57
February   71.96   67.22   25.37   189.55   40.73   2.02  5.02   0.55
March      49.47   45.97   15.15   125.10   29.74   2.20  6.20   0.59
April      59.81   55.61   16.16   142.72   39.62   2.19  6.21   0.64
May        49.67   44.29   17.15   132.18   29.17   2.50  6.82   0.51
June       29.47   27.64   7.60    70.66    18.83   2.22  7.57   0.67
July       6.67    5.97    3.84    38.82    2.43    4.80  36.03  0.40
August     6.26    5.58    3.69    38.40    2.42    5.29  42.64  0.33
September  32.83   29.84   10.12   78.84    20.09   2.26  6.19   0.52
October    65.91   59.04   22.56   164.19   39.18   1.99  4.56   0.50
November   85.19   78.33   27.72   231.30   53.02   2.34  7.13   0.57
December   87.89   81.58   29.13   212.83   51.22   1.94  4.52   0.57
Annual     625.18  580.14  198.83  1580.49  414.32  2.23  5.85   0.58
SD = Standard deviation; MXV = Maximum value; MNV = Minimum value; SK = Coefficient of skewness; KU = Kurtosis; Cor = Linear correlation coefficient between precipitation and altitude.
being studied takes values (not necessarily measured) everywhere within the study area, is adopted in order to use geostatistics as an estimation technique. In this study, the three phases of any geostatistical work were completed (e.g. Isaaks and Srivastava, 1989): (1) Exploratory analysis of data. Data were studied without considering their geographical distribution. Statistics were applied to check data consistency, remove outliers and identify the statistical distribution from which the data came. (2) Structural analysis of data. The spatial distribution of the variable was analysed. Spatial correlation or dependence can be quantified with semivariograms, or simply variograms, which can also characterize distribution patterns such as randomness, uniformity and spatial trend. The variogram function relates the semivariance, half the expected squared difference between paired data values $Z(x_i)$ and $Z(x_i + h)$, to the lag distance, $h$, by which sample points are separated. For discrete sampling locations, the function is estimated as:
$$\gamma(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} \{Z(x_i) - Z(x_i + h)\}^2 \qquad (1)$$
where $\gamma(h)$ is the experimental semivariance value at distance interval $h$, $Z(x_i)$ are the measured sample values at the sample points $x_i$ for which there are data at both $x_i$ and $x_i + h$, and $N(h)$ is the total number of sample pairs within the distance interval $h$. The variogram shows the decay of spatial correlation between two points in space as the separation distance increases. This
function has two components: (i) the nugget effect, which characterizes the discontinuity observed at the origin and quantifies the short-range, erratic variations of the studied phenomenon plus measurement and data errors; (ii) the increasing part of the variogram, which may reach the sill (the theoretical sample variance) and level off at a distance called the range, or keep increasing continuously with distance. The non-nugget part of the variogram measures the non-random part of the phenomenon and models its average medium-scale behaviour in space. The variogram is a function of both distance and direction, so direction-dependent variability can be accounted for. Because of the lack of data, only omnidirectional variograms were computed in this study; the spatial variability is therefore assumed to be identical in all directions. When an experimental variogram is defined, i.e. some points of a variogram plot are determined by calculating the variogram at different lags, a model (theoretical variogram) should be fitted to the points. Although there are some statistical techniques to justify the choice of a theoretical variogram (Cressie, 1985), subjective criteria and previous experience are the main tools for choosing one. (3) Predictions. Geostatistics offers a great variety of methods that provide estimates for unsampled locations. These methods are known as kriging, in honour of Danie Krige, who first formulated this form of interpolation in 1951 (Krige, 1951). Kriging is regarded as the best linear unbiased estimator (BLUE). Weights for sample values are calculated based on the parameters of the variogram model. All geostatistical estimators are variants of the linear regression estimator $Z^*(x)$:
$$Z^*(x) - m(x) = \sum_{i=1}^{n} w_i(x)\,[Z(x_i) - m(x_i)] \qquad (2)$$
where each datum, $Z(x_i)$, has an associated weight, $w_i(x)$, and $m(x)$ and $m(x_i)$ are the expected values of $Z(x)$ and $Z(x_i)$, respectively. The kriging weights must be determined to minimize the estimation variance, $\mathrm{Var}[Z^*(x) - Z(x)]$, while ensuring the unbiasedness of the estimator, $E[Z^*(x) - Z(x)] = 0$. The different types of kriging are distinguished by the model chosen for the trend, $m(x)$, of the random function $Z(x)$ (e.g. Goovaerts, 1997). Thus, SK considers $m(x)$ to be known and constant, $m$, all over the study area; in OK, by contrast, $m(x)$ is unknown and is allowed to fluctuate locally, stationarity being maintained within the local neighbourhood; UK considers that $m(x)$ varies smoothly within each local neighbourhood and models it as a linear combination of functions of the spatial coordinates. The weights, $w_i(x)$, are obtained by solving the corresponding system of linear equations, which depends on the type of kriging [see Isaaks and Srivastava (1989) or Goovaerts (1997) for a detailed presentation of the kriging algorithms]. The univariate algorithms, OK and SK, estimate the precipitation at an unsampled location using only precipitation data. When UK is used, a model of the trend, as a function of the spatial coordinates, has to be selected beforehand. If auxiliary information, mainly altitude data, is considered together with the primary data, precipitation, some multivariate extension of kriging can result in better estimates. In this case, the simplest approach is to use a linear relation between the precipitation and the collocated altitude:
$$Z^*(x) = a + b\,H(x) \qquad (3)$$
where the two regression coefficients, $a$ and $b$, are estimated from the set of collocated precipitation and altitude data, $Z(x_i)$ and $H(x_i)$, respectively. Using map algebra techniques, all cells of the raster DEM were multiplied by the regression coefficient, $b$, and the intercept, $a$, was then added. With this method, the so-called regression maps were obtained (Figure 2). This methodology has been applied in some works (e.g. Vicente-Serrano et al., 2003). However, from a spatial point of view, the linear regression is an inexact interpolator. Only by adding the estimated residuals at each point can an exact interpolator be obtained, i.e. one for which the estimated values at each sample point (meteorological station) equal the observed values. In mountainous areas with sparse rainfall measurements, regressions may capture the orographic effect on precipitation distribution, generating the best estimates (e.g. Daly et al., 1994), but if the spatial correlation of the precipitation data is taken into account, estimates are improved (e.g. Guan et al., 2005). SKV accounts for the secondary data by replacing the known stationary mean, $m$, with known varying means, $m_{SKV}(x)$, which are usually relations similar to (3). In this case, the weights, $w_i(x)$, are generated by solving an SK system that involves the covariance function of the residuals, $r(x) = Z(x) - m_{SKV}(x)$ (Goovaerts, 1997). The estimate at a location $x$, $Z^*_{SKV}(x)$, is:
$$Z^*_{SKV}(x) = f(H(x)) + \sum_{i=1}^{n} w_i(x)\, r(x_i) \qquad (4)$$
where $f(H(x))$ is the regression estimate. An alternative to using the secondary data and performing SK on the corresponding residuals is kriging with an external drift, KE. This is a variant of UK in which the trend is modelled as a linear function of the auxiliary information instead of as a function of the coordinates. The KE estimator is similar to the SKV estimator. However, the definition of the trend differs between the two methods: whereas the trend coefficients, $a$ and $b$, are unique and generated without considering the kriging system in the SKV approach, these coefficients are implicitly estimated within each search neighbourhood
by the kriging system in the KE algorithm. Thus, KE extrapolates the linear trend model to the last data, which can sometimes be unrealistic. SKV is more robust in that it extrapolates a relation that is fitted to all data (a more exhaustive discussion can be found in Goovaerts, 1997). For this reason, the KE algorithm was not employed in this study. The cokriging approach is another way to incorporate secondary data. Although it is indicated when the secondary information is not exhaustive, i.e. auxiliary data are not available at all grid nodes, if this information is known everywhere and changes smoothly across the study area, the cokriging system can retain only the secondary datum collocated with the location being estimated (Goovaerts, 1997). The collocated cokriging estimate, $Z^*_{CK}(x)$, is:
$$Z^*_{CK}(x) = \sum_{i=1}^{n} w_i(x)\, Z(x_i) + w_{n+1}\,[H(x) - m_2 + m_1] \qquad (5)$$
where $m_1$ and $m_2$ are the global means of the precipitation and altitude data, respectively. The weights, $w_i(x)$ and $w_{n+1}$, are solutions of the cokriging system. It is now necessary to calculate and model one variogram for the precipitation data, another for the altitude, and their cross variogram, $\gamma_{ZH}(h)$, which is computed as:
$$\gamma_{ZH}(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} [Z(x_i) - Z(x_i + h)]\,[H(x_i) - H(x_i + h)] \qquad (6)$$
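As an illustration of Equations (1) and (6), the following minimal NumPy sketch computes an omnidirectional experimental semivariance, or cross-semivariance, per lag bin. It is not the procedure used in the study, which relied on GIS software; the function name `cross_semivariogram`, the lag binning by `lag_edges` and the toy data are assumptions made only for this example.

```python
import numpy as np

def cross_semivariogram(coords, z, h_vals, lag_edges):
    """Experimental (cross-)semivariance per lag bin, after Eqs. (1) and (6).

    coords: (n, 2) station coordinates; z, h_vals: (n,) values of two variables.
    Passing the same array for z and h_vals gives the semivariogram of Eq. (1).
    """
    coords = np.asarray(coords, dtype=float)
    z = np.asarray(z, dtype=float)
    h_vals = np.asarray(h_vals, dtype=float)
    i, j = np.triu_indices(len(z), k=1)            # each station pair once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    prod = (z[i] - z[j]) * (h_vals[i] - h_vals[j])
    n_bins = len(lag_edges) - 1
    gamma = np.full(n_bins, np.nan)                # NaN where a bin has no pairs
    counts = np.zeros(n_bins, dtype=int)
    for k in range(n_bins):
        in_bin = (dist > lag_edges[k]) & (dist <= lag_edges[k + 1])
        counts[k] = in_bin.sum()                   # N(h) for this lag interval
        if counts[k] > 0:
            gamma[k] = prod[in_bin].sum() / (2.0 * counts[k])
    return gamma, counts

# Toy data (hypothetical): four stations on a line, precipitation and altitude.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
precip = np.array([1.0, 2.0, 4.0, 7.0])
altitude = 10.0 * precip
print(cross_semivariogram(coords, precip, precip, [0.5, 1.5, 2.5]))
print(cross_semivariogram(coords, precip, altitude, [0.5, 1.5, 2.5]))
```

The pair counts returned alongside the semivariances help to judge how reliable each lag estimate is, which matters when only a limited number of stations is available.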
Altitude data are treated differently in cokriging and SKV. Whereas the altitude datum provides information about the trend in the SKV approach, it directly influences the cokriging estimate. If the same assumptions about the mean as in the OK approach are made, the OCK method is defined analogously. When RK is used, predictions are made separately for the trend and the residuals and then added back together. Thus, the precipitation at a new unsampled point, $x$, is estimated using RK as follows:
$$Z^*_{RK}(x) = m(x) + r(x) \qquad (7)$$
where the trend, $m(x)$, is fitted using linear regression analysis and the residuals, $r(x)$, are estimated using OK. If $c_j$ are the coefficients of the estimated trend model, $v_j(x)$ is the $j$th predictor at location $x$, $p$ is the number of predictors, and $w_i(x)$ are the weights determined by solving the OK system of the regression residuals, $r(x_i)$, for the $n$ sample points, the prediction, $Z^*_{RK}(x)$, is made by:
$$Z^*_{RK}(x) = \sum_{j=0}^{p} c_j\, v_j(x) + \sum_{i=1}^{n} w_i(x)\, r(x_i), \qquad v_0(x) = 1 \qquad (8)$$
The trend model coefficients can be solved using a weighted linear regression, where the covariance matrix, i.e. covariances between sample point pairs, is employed as the matrix of weights (Cressie, 1993, p. 166). If there is no significant spatial clustering of the sample points, regression coefficients from an ordinary least squares estimation are similar to those obtained with a generalized least squares estimation based on the spatial covariance matrix of the residuals. According to Goovaerts (1997), before applying RK two general requirements need to be fulfilled: (1) the relation between the target and the predictors must be linear; (2) the values of the predictors must be known at all primary data locations and at all new locations where predictions are made. The choice of independent variables to model the trend should be based on the best-known factors that influence precipitation. One of the most important is the altitude, $H$ (e.g. Agnew and Palutikof, 2000). In this study, it is the only predictor used. $H$ is the nominal altitude, in metres, of the stations, derived from the 1 km resolution DEM. It provides information about the variability due to the relief. Although there may be potential for improving the predictions, e.g. by including further factors, only $H$ was considered, in order to compare RK estimates with the other multivariate extensions of kriging. All geostatistical analyses were conducted using the Geostatistical Analyst extension of the GIS software ArcGIS (version 9.1, ESRI Inc., Redlands, California, USA). After modelling the annual and monthly precipitation with the selected algorithms, a set of map layers in raster format was generated. These layers were based on grids in which each point (datum) represents the centre of a square with 1000 m sides. All maps were produced with the ArcMap module of ArcGIS.
4. Results: assessment of the geostatistical interpolators
During the exploratory analysis of precipitation data, the first phase of any geostatistical study, the data distribution was described using classical descriptive statistics. It was observed that monthly and annual mean and median values were appreciably different and, moreover, the coefficients of skewness were high (Table I). After the logarithmic transformation of the data, normality was apparent, i.e. mean and median values were similar and the coefficients of skewness were lower and nearer to zero (Table II). This means that monthly and annual precipitation data fit lognormal distributions. Although normality is not a prerequisite for kriging, it is a desirable property. Kriging will only generate the best absolute estimate if the random function fits a normal distribution. The correlations between precipitation and altitude for each month and for the whole year were analysed. Linear correlation coefficients (Table I) ranging from 0.33 (August) to 0.67 (June) indicate that the secondary data, altitude, can be
Table II. Descriptive statistics of the monthly and annual precipitation (mm) data, transformed to their corresponding natural logarithms, for the 136 rain gauge stations.
Period     Mean  Median  SD    MXV   MNV   SK    KU    CorL
January    4.33  4.29    0.31  5.38  3.84  1.04  4.03  0.55
February   4.23  4.21    0.30  5.24  3.71  0.98  4.01  0.51
March      3.87  3.83    0.26  4.83  3.39  1.18  4.76  0.56
April      4.06  4.02    0.23  4.96  3.68  1.29  4.95  0.62
May        3.86  3.79    0.27  4.88  3.37  1.59  5.65  0.48
June       3.36  3.32    0.22  4.26  2.94  1.12  4.95  0.70
July       1.80  1.79    0.41  3.66  0.89  0.86  5.22  0.44
August     1.74  1.72    0.41  3.65  0.89  0.88  5.62  0.30
September  3.46  3.40    0.26  4.37  3.00  1.27  5.00  0.50
October    4.14  4.08    0.29  5.10  3.67  1.06  4.11  0.46
November   4.41  4.36    0.27  5.44  3.97  1.25  4.92  0.55
December   4.43  4.40    0.28  5.36  3.93  0.98  3.99  0.54
Annual     6.40  6.36    0.26  7.37  6.03  1.29  4.81  0.56
SD = Standard deviation; MXV = Maximum value; MNV = Minimum value; SK = Coefficient of skewness; KU = Kurtosis; CorL = Linear correlation coefficient between natural logarithm of precipitation and altitude.
Figure 3. (a) Nugget effect, sill and (b) relative nugget effect values of the monthly and annual variograms for the precipitation data.
worth incorporating into the mapping of precipitation. Prior transformation of the precipitation data using the natural logarithm did not, in general, generate better regressions (Table II), so it was not considered. The structural analysis was carried out with the omnidirectional variogram. Experimental variograms were computed for monthly and annual data. The fact that two precipitation data close to each other are more alike than those further apart is apparent in the variogram values, which increase with the separation distance. A linear combination of two models, one spherical and one linear, plus a nugget effect provided the best fit in all cases. This indicates that the spatial dependence occurs at two different scales, which are represented in the variogram as two spatial components. The short-range component
is described by the spherical model and the long-range component by the linear model. Monthly and annual variograms have a similar shape, due to the control of the relief on the spatial distribution of precipitation, although the range and nugget effect fluctuate from one month to another (Figure 3). The choice of a particular variogram model depends on the expected spatial variability. A variable like precipitation can be distributed erratically over short distances, and exponential or spherical models are the most suitable (Isaaks and Srivastava, 1989). The spatial dependence of the monthly precipitation data, as determined by the variogram analyses, displayed some differences: variogram parameters varied between months. The nugget effect was estimated by extending the variogram to the
vertical axis. The behaviour near the origin is very important because of its influence on the interpolation process. In this work, variograms showed a considerable nugget effect during the driest months, July and August, when the spatial variability of precipitation is higher at a scale smaller than the minimum lag distance (Figure 3). However, the relative nugget effect, the ratio of the nugget discontinuity to the sill value, has its maximum in May and remains high from June to September, which includes the driest months (Figure 3). During this period, an important part of the variance is due to the nugget effect. The high nugget effect could be reduced by considering more closely spaced precipitation data, which will only be possible in the future, when information from new weather stations is available. The sills (Figure 3) depend on the sample variances. These results are in accordance with the standard deviations shown in Table II. The linear model of the long-range component implies that if the surveyed region were enlarged, ever more variation would be encountered. Probably, the sill of the long-range component is reached at a distance greater than the limits of the experimental area. The third phase of the geostatistical study, the prediction at all unsampled locations, was carried out using univariate and multivariate interpolators, which integrate the spatial correlation structures described by the variograms.
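To make the link between the fitted variogram and the prediction step concrete, the sketch below combines a nested model (nugget, spherical and linear components) with a regression-kriging predictor in the spirit of Equations (7) and (8): an ordinary least squares trend on altitude plus ordinary kriging of the residuals. It is only an illustration written with NumPy, not the Geostatistical Analyst implementation used in the study. The function names, the global (rather than local) search neighbourhood and every parameter value are assumptions, and for brevity the variogram parameters are supplied directly instead of being fitted to the residuals.

```python
import numpy as np

def nested_variogram(h, nugget, sph_sill, sph_range, lin_slope):
    """Nugget + spherical (short range) + linear (long range); gamma(0) = 0."""
    h = np.asarray(h, dtype=float)
    ratio = np.minimum(h / sph_range, 1.0)
    spherical = sph_sill * (1.5 * ratio - 0.5 * ratio**3)
    return np.where(h > 0, nugget + spherical + lin_slope * h, 0.0)

def regression_kriging(coords, z, alt, coords0, alt0, **vg):
    """Linear trend on altitude (OLS) plus ordinary kriging of the residuals."""
    coords, z, alt = (np.asarray(a, dtype=float) for a in (coords, z, alt))
    coords0 = np.atleast_2d(np.asarray(coords0, dtype=float))
    alt0 = np.atleast_1d(np.asarray(alt0, dtype=float))
    n = len(z)
    # Trend Z = a + b * H, with coefficients from ordinary least squares.
    design = np.column_stack([np.ones(n), alt])
    coef, *_ = np.linalg.lstsq(design, z, rcond=None)
    resid = z - design @ coef
    # Ordinary kriging system in variogram form; the extra row and column
    # carry the Lagrange multiplier that forces the weights to sum to one.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    lhs = np.ones((n + 1, n + 1))
    lhs[:n, :n] = nested_variogram(d, **vg)
    lhs[n, n] = 0.0
    d0 = np.linalg.norm(coords[:, None, :] - coords0[None, :, :], axis=2)
    rhs = np.ones((n + 1, coords0.shape[0]))
    rhs[:n, :] = nested_variogram(d0, **vg)
    weights = np.linalg.solve(lhs, rhs)[:n, :]     # drop the multiplier row
    return coef[0] + coef[1] * alt0 + weights.T @ resid

# Hypothetical stations (km), altitudes (m) and precipitation (mm).
rng = np.random.default_rng(0)
xy = rng.uniform(0.0, 100.0, size=(12, 2))
H = rng.uniform(200.0, 1500.0, size=12)
Z = 400.0 + 0.3 * H + rng.normal(0.0, 20.0, size=12)
params = dict(nugget=10.0, sph_sill=50.0, sph_range=30.0, lin_slope=0.5)
print(regression_kriging(xy, Z, H, [[50.0, 50.0]], [800.0], **params))
```

Because the linear component has no sill, the kriging system is written with variogram values rather than covariances. With the convention that the variogram is zero at zero distance, predicting at a station location with its own altitude returns the observed value, which matches the exact-interpolator behaviour described for the residual-corrected regression.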
The visual comparison of the maps obtained with the different geostatistical interpolation algorithms is highly informative. Because annual and monthly maps show the same characteristics, the annual maps are used here to illustrate the effect of each estimator (Figure 4). The OK, SK and UK methods generate similar spatial distributions of precipitation and, moreover, these graphic representations are crude. This indicates the need for more densely sampled information. As no further precipitation data are available, the altitudes, through the DEM, are the most immediate data to consider. If altitude is taken into account, the simplest approach is to estimate precipitation as a function of the collocated elevation, usually with a linear relation such as (3). The result is a precipitation map that resembles the DEM (regression map); in fact, it is a rescaled map of the DEM (Figure 2). The SKV and RK methods generate precipitation maps in which the impact of altitude is less marked. The spatial distribution of precipitation is influenced by the auxiliary information, elevation, but the spatial correlation of the samples is also considered and, in consequence, the maps are not as crude as those obtained by the OK and SK techniques (Figure 5). When the SKV and RK algorithms are used, the maps of kriged residuals can provide a visual representation
Figure 4. Annual precipitation maps obtained by (a) ordinary kriging, (b) simple kriging and (c) universal kriging.
Figure 5. Annual precipitation maps obtained by (a) simple kriging with varying local means, (b) regression-kriging and (c) collocated ordinary cokriging.
of the arrangement of the trend correctors. Maps of this type can be called corrector maps or anomaly maps (Ninyerola et al., 2000), and they are important because they reveal the singularities of precipitation at the local scale. Generally, the corrector maps show maximum variability in the least predictable locations, which are usually the most rugged terrain, and minimum variability in the predictable areas, the flattest zones (Figure 6). Unlike the two previous algorithms, the maps of OCK precipitation estimates lack the detail of the elevation (Figure 5). In fact, the OCK maps are more similar to those generated with the univariate methods, OK and SK, or with UK. The seven interpolators were evaluated using cross-validation (e.g. Isaaks and Srivastava, 1989, pp. 351–368). One precipitation observation is removed at a time from the database and estimated from the remaining data using each estimator. The mean square error (MSE), i.e. the average squared difference between the measured precipitation and its prediction over all 136 precipitation stations, is then calculated and used as the comparison criterion. The most accurate algorithm has the MSE closest to zero. It should be noted that all geostatistical methods provide an estimate of the error variance, but this value has not been retained as a performance criterion because
it is not adequate to delimit the reliability of the kriging estimate (Armstrong, 1994). The MSE of prediction calculated using each of the seven estimators for the monthly and annual precipitation data is shown in Figure 7. Some results are evident. Firstly, the largest prediction errors are always obtained for the linear regression of precipitation against altitude. This indicates how important it is, when choosing an interpolation technique, to consider the spatial distribution of the data and the information provided by surrounding precipitation stations. Secondly, the univariate geostatistical algorithms (OK and SK), together with UK, yield similar prediction errors. Because precipitation stations are sparse in some areas, these estimates, which consider only the primary data, are practically identical. Thirdly, algorithms that account for altitude perform better than the univariate techniques, i.e. they have lower MSE values. According to Goovaerts (2000), as the correlation between precipitation and altitude increases, so does the gain of OCK over OK. Moreover, the same author (Goovaerts, 1997, pp. 217–221) states that the contribution of the auxiliary data to the OCK predictions depends not only on the correlation between primary and secondary variables, but also on their patterns of spatial continuity. If the spatial dependence between samples weakens, i.e. the precipitation semivariogram has a larger
Figure 6. Annual corrector maps obtained by interpolation of the precipitation residuals (measured value minus estimated value from the linear regression between precipitation and altitude, at each meteorological station) using (a) the simple kriging and (b) ordinary kriging algorithms.
relative nugget effect, the OCK estimates are more accurate than the OK ones. In this study, when the correlation coefficients are lower, for instance in July and August, the performances of OCK and OK are very similar; conversely, when the correlation coefficient is highest, for instance in June, OCK performs better than OK (see Figure 7 and Table I). The influence of the relative nugget effect is less apparent, perhaps because of the clear spatial dependence between precipitation data. However, SKV and RK yield the best results. Between them, RK has lower MSE values for 8 months (March, June, July, August, September, October, November and December) and for the annual estimates, whereas SKV is the most accurate interpolator for 4 months (January, February, April and May). For the annual case, there is practically no difference between the RK and SKV approaches (Figure 7). Consequently, these algorithms, RK and SKV, were selected to predict at unsampled locations and generate the final monthly and annual precipitation maps. To clarify the influence of elevation on the RK prediction performance, some scattergrams have been plotted. The scattergram of Figure 8(a) shows that precipitation-elevation regressions perform similarly when their correlation coefficients are high, e.g. 0.67, 0.64 and 0.58 for June, April and November, respectively, but high correlations are not sufficient to ensure that RK will improve to the same degree for all months; in the scattergram shown in Figure 8(b), April and June have similar results but November has the second-worst performance. Moreover, December, November, January and March have practically the same correlation coefficient, yet their RK results are very different. As previously stated for OCK, the same is apparent for RK and SKV, i.e. RK and SKV predictions depend not only on the correlation between precipitation and elevation but also on the pattern of spatial continuity. A higher correlation coefficient implies that elevation brings more information on precipitation, but this does not necessarily make RK perform better.
Figure 7. Mean square errors of estimation generated by each of the seven interpolation algorithms for monthly and annual precipitation; the values are expressed as proportions of the prediction error of the ordinary kriging approach.
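The leave-one-out cross-validation behind Figure 7 can be sketched as follows, assuming each interpolator is available as a function taking the remaining stations and one target location. The two predictors shown (`mean_predictor` and `idw_predictor`) are deliberately simple stand-ins, a global mean and inverse distance weighting; they are not the kriging estimators compared in the paper, and all names and data are hypothetical.

```python
import numpy as np


def loo_mse(coords, values, predict):
    """Leave-one-out MSE: drop each station in turn and predict it from the rest."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    n = len(values)
    sq_err = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        estimate = predict(coords[keep], values[keep], coords[i])
        sq_err[i] = (values[i] - estimate) ** 2
    return sq_err.mean()


def mean_predictor(xy, z, target):
    """Stand-in predictor: ignores location and returns the mean of the data."""
    return z.mean()


def idw_predictor(xy, z, target, power=2.0):
    """Stand-in predictor: inverse distance weighting (not a kriging method)."""
    d = np.linalg.norm(xy - target, axis=1)
    if np.any(d == 0):
        return z[d == 0].mean()
    w = 1.0 / d**power
    return np.sum(w * z) / np.sum(w)


def relative_mse(mse, mse_baseline):
    """MSE expressed as a proportion of a baseline method's MSE."""
    return mse / mse_baseline


# Synthetic stations (coordinates in km, precipitation in mm); not real data.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
precip = np.array([400.0, 450.0, 520.0, 610.0, 500.0])

mse_mean = loo_mse(coords, precip, mean_predictor)
mse_idw = loo_mse(coords, precip, idw_predictor)
print(relative_mse(mse_idw, mse_mean))
```

Passing the interpolator as a callable keeps the cross-validation loop identical for every method, so differences in MSE reflect the estimators only; in Figure 7 the baseline is the ordinary kriging error.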
Additionally, the benefit of using RK instead of univariate methods, e.g. OK, increases as the spatial dependence between observations weakens. This is shown in Figure 9: RK performs better than OK for those months in which the relative nugget effect of the precipitation variogram is larger, and the points fit a straight line moderately well. The final maps can be automatically updated with new precipitation data and easily managed in a GIS environment. It is worth noting that only a DEM and precipitation data are needed to apply the same methodology in other regions or areas. However, some limitations have been detected that could be addressed. Kriging is the best linear unbiased estimator (BLUE) only if the assumptions required to krige a surface are fully met, and the assumption of stationarity is rarely met in reality. Moreover, the accuracy of the kriged surface depends on the density of
Figure 8. Scattergrams between (a) the estimates using the precipitation-elevation regression and observations (correlation coefficients are 0.97, 0.96 and 0.94 for June, April and November respectively) for some selected months, and between (b) the mean square error of prediction for the regression-kriging approach and the correlation coefficients of precipitation-elevation regressions for all months.
Figure 9. Scattergram between the ratio of mean square error of prediction for the OK and RK techniques and the relative nugget effect values of the monthly and annual variograms for the precipitation data.
the network of meteorological stations. These problems are partially overcome when SKV and RK are employed, because the assumption of a stationary mean is replaced by known, spatially varying means derived from the secondary data. Despite the limitations outlined above, a GIS-based approach has considerable potential, given, for instance, the range of topography in the study area, Extremadura. The results of the cross-validation provide clear evidence of the usefulness of kriging and the use of auxiliary
information, mainly altitude, in the spatial interpolation of precipitation. Future research will consider new independent variables, e.g. proximity to large bodies of water and land cover, and the support of remote sensing. Using average synoptic circulation patterns as predictors in the regression model, in addition to the terrain-based variables, could also help to refine the estimates.
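The regression-kriging procedure discussed in this section (a linear regression of precipitation on elevation, kriging of the regression residuals, and a map-algebra sum of the two) can be sketched as below. This is a minimal illustration under stated assumptions, not the author's implementation: it assumes, as the Figure 6 caption suggests, that the residuals are interpolated by ordinary kriging, and it uses a single spherical variogram with hypothetical parameters and synthetic station data.

```python
import numpy as np


def spherical_variogram(h, nugget, sill, a):
    """Spherical semivariogram with nugget; zero at lag zero, `sill` beyond range `a`."""
    h = np.asarray(h, dtype=float)
    r = np.minimum(h / a, 1.0)
    gamma = nugget + (sill - nugget) * (1.5 * r - 0.5 * r**3)
    return np.where(h > 0, gamma, 0.0)


def ordinary_kriging(xy, z, target, variogram):
    """Ordinary kriging estimate at a single target location."""
    n = len(z)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    # Kriging system in variogram form, with the unbiasedness constraint
    # (weights sum to one) enforced through a Lagrange multiplier.
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(d)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = variogram(np.linalg.norm(xy - target, axis=1))
    weights = np.linalg.solve(A, b)[:n]  # drop the Lagrange multiplier
    return float(weights @ z)


def regression_kriging(xy, precip, elev, target_xy, target_elev, variogram):
    """Regression on elevation plus ordinary kriging of the residuals."""
    slope, intercept = np.polyfit(elev, precip, 1)
    residuals = precip - (intercept + slope * elev)
    trend = intercept + slope * target_elev
    target = np.asarray(target_xy, dtype=float)
    return trend + ordinary_kriging(xy, residuals, target, variogram)


def residual_variogram(h):
    # Hypothetical residual variogram parameters (mm^2 and km).
    return spherical_variogram(h, nugget=50.0, sill=400.0, a=15.0)


# Synthetic stations: coordinates (km), elevation (m), precipitation (mm).
xy = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0], [5.0, 5.0]])
elev = np.array([200.0, 350.0, 500.0, 800.0, 450.0])
precip = np.array([480.0, 530.0, 610.0, 790.0, 560.0])

print(regression_kriging(xy, precip, elev, [7.0, 3.0], 400.0, residual_variogram))
```

In a GIS the same sum would be done with map algebra: the regression map computed from the DEM plus the corrector map of kriged residuals. Replacing the ordinary kriging step with simple kriging of the residuals around a zero mean would give a sketch closer to SKV.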
5. Conclusions
Highly accurate annual and monthly precipitation maps have been generated by following a methodology that combines geostatistical and GIS techniques. Univariate and multivariate geostatistical algorithms were applied to estimate precipitation patterns from data recorded at 136 meteorological stations in the Extremadura region of southwestern Spain. The results confirm previous findings (e.g. Goovaerts, 2000; Diodato, 2005) that, in general, geostatistical interpolation is more accurate than prediction with deterministic techniques. Moreover, if correlated secondary information, such as altitude, is considered, the estimates can be improved considerably further. In this work, different geostatistical methods of incorporating the secondary data have been examined. All multivariate techniques (OCK, SKV and RK) outperform the univariate methods (OK, SK and UK); however, cross-validation has shown that prediction performance varies among algorithms. SKV and RK yield the smallest MSE and so perform better than OCK, which is more demanding because three semivariograms have to be modelled. The SKV and RK maps are influenced by the DEM pattern and show more detail than the cokriged maps. Therefore, although these non-stationary interpolators have not been used extensively in climatology, they are potentially more appropriate for modelling climate variables than any form of cokriging.
Acknowledgements
The author thanks the reviewers of this paper for their constructive comments, which have helped to improve the final version.
References
Agnew MD, Palutikof JP. 2000. GIS-based construction of baseline climatologies for the Mediterranean using terrain variables. Climate Research 14: 115–127.
Almarza C. 1984. Fichas hídricas normalizadas y otros parámetros hidrometeorológicos, tomo II. National Meteorological Institute INM: Madrid, Spain.
Atkinson PM. 1997. Geographical information science. Progress in Physical Geography 21: 573–582.
Armstrong M. 1994. Is research in mining geostats as dead as a dodo? In: Geostatistics for The Next Century, Dimitrakopoulos R (ed). Kluwer Academic: Dordrecht; 303–312.
Azimi-Zonooz A, Krajewski WF, Bowles DS, Seo DJ. 1989. Spatial rainfall estimation by linear and non-linear co-kriging of radar-rainfall and raingage data. Stochastic Hydrology and Hydraulics 3: 51–67.
Burrough PA. 2001. GIS and geostatistics: essential partners for spatial analysis. Environmental and Ecological Statistics 8: 361–377.
Burrough PA, McDonnell RA. 1998. Principles of Geographical Information Systems. Oxford University Press: Oxford.
Bourennane H, King D, Couturier A. 2000. Comparison of kriging with external drift and simple linear regression for predicting soil horizon thickness with different sample densities. Geoderma 97(3–4): 255–271.
Chiles J, Delfiner P. 1999. Geostatistics: Modeling Spatial Uncertainty. John Wiley & Sons: New York.
Cressie N. 1985. Fitting variogram models by weighted least squares. Mathematical Geology 17(5): 563–586.
Cressie N. 1993. Statistics for Spatial Data. John Wiley & Sons: New York.
Daly C, Neilson RP, Phillips DL. 1994. A statistical topographic model for mapping climatological precipitation over mountain terrain. Journal of Applied Meteorology 33: 140–158.
Deutsch C, Journel A. 1992. Geostatistical Software Library and User's Guide. Oxford University Press: New York.
Diodato N. 2005. The influence of topographic co-variables on the spatial variability of precipitation over small regions of complex terrain. International Journal of Climatology 25: 351–363.
Dirks KN, Hay JE, Stow CD, Harris D. 1998. High-resolution studies of rainfall on Norfolk Island, Part II: interpolation of rainfall data. Journal of Hydrology 208(3–4): 187–193.
Goovaerts P. 1997. Geostatistics for Natural Resources Evaluation. Oxford University Press: New York.
Goovaerts P. 1999. Using elevation to aid the geostatistical mapping of rainfall erosivity. Catena 34(3–4): 227–242.
Goovaerts P. 2000. Geostatistical approaches for incorporating elevation into the spatial interpolation of rainfall. Journal of Hydrology 228: 113–129.
Guan H, Wilson JL, Makhnin O. 2005. Geostatistical mapping of mountain precipitation incorporating auto-searched effects of terrain and climatic characteristics. Journal of Hydrometeorology 6(6): 1018–1031.
Hengl T, Heuvelink GBM, Stein A. 2003. Comparison of kriging with external drift and regression-kriging. Technical Note, International Institute for Geo-information Science and Earth Observation (ITC), Enschede, http://www.itc.nl/library/Academic output..
Hudson G, Wackernagel H. 1994. Mapping temperature using kriging with external drift: theory and an example from Scotland. International Journal of Climatology 14(1): 77–91.
Hulme M, Conway D, Jones PD, Jiang T, Barrow EM, Turney C. 1995. Construction of a 1961–1990 European climatology for climate change modelling and impact applications. International Journal of Climatology 15: 1333–1363.
Hulme M, Conway D, Joyce A, Mulenga H. 1996. A 1961–90 climatology for Africa south of the equator and a comparison of potential evapotranspiration estimates. South Africa Journal of Science 92(7): 334–343.
Hutchinson MF, Gessler PE. 1994. Splines – more than just a smooth interpolator. Geoderma 62: 45–67.
Isaaks EH, Srivastava RM. 1989. An Introduction to Applied Geostatistics. Oxford University Press: Oxford.
Krige DG. 1951. A Statistical Approach to Some Mine Valuations and Allied Problems at the Witwatersrand. Master's Thesis, University of Witwatersrand, South Africa.
Legates DR, Willmott CJ. 1990. Mean seasonal and spatial variability in global surface air temperature. Theoretical and Applied Climatology 41: 11–21.
Martínez-Cob A. 1996. Multivariate geostatistical analysis of evapotranspiration and precipitation in mountainous terrain. Journal of Hydrology 174: 19–35.
Ninyerola M, Pons X, Roure JM. 2000. A methodological approach of climatological modelling of air temperature and precipitation through GIS techniques. International Journal of Climatology 20: 1823–1841.
Odeh I, McBratney A, Chittleborough D. 1994. Spatial prediction of soil properties from landform attributes derived from a digital elevation model. Geoderma 63(3–4): 197–214.
Odeh I, McBratney A, Chittleborough D. 1995. Further results on prediction of soil properties from terrain attributes: heterotopic cokriging and regression-kriging. Geoderma 67(3–4): 215–226.
Pardo-Igúzquiza E. 1998. Comparison of geostatistical methods for estimating the areal average climatological rainfall mean using data of precipitation and topography. International Journal of Climatology 18: 1031–1047.
Phillips DL, Dolph J, Marks D. 1992. A comparison of geostatistical procedures for spatial analysis of precipitation in mountainous terrain. Agricultural and Forest Meteorology 58: 119–141.
Raspa G, Tucci M, Bruno R. 1997. Reconstruction of rainfall fields by combining ground raingauges data with radar maps using external drift method. In: Geostatistics Wollongong '96, Baafi EY, Schofield NA (eds). Kluwer Academic: Dordrecht, 941–950.
Vicente-Serrano SM, Saz-Sánchez MA, Cuadrat JM. 2003. Comparative analysis of interpolation methods in the middle Ebro Valley (Spain): application to annual precipitation and temperature. Climate Research 24: 161–180.
Wackernagel H. 1998. Multivariate Geostatistics. Springer: Berlin.
Willmott CJ, Robeson SM. 1995. Climatologically aided interpolation (CAI) of terrestrial air temperature. International Journal of Climatology 15: 221–229.
