
Statistical modelling of spatial data

Andreas Papritz
ETH Zurich
Institute of Terrestrial Ecosystems
Institute of Biogeochemistry and Pollutant Dynamics
Chairs of Soil Physics and Soil Protection
andreas.papritz@env.ethz.ch

Contents

1 geostatistics: data sets and objectives of analyses
  1.1 spatial statistics
  1.2 objectives of geostatistical analyses
    1.2.1 spatial prediction: Wolfcamp aquifer data set
    1.2.2 spatial prediction: Dornach data set
    1.2.3 parameter estimation (and prediction): forest soil SOC stock data
    1.2.4 parameter estimation and inference: wheat yield experiment

2 spatial data: R classes and methods
  2.1 package sp: methods and classes for spatial data

3 an example of a geostatistical analysis with R
  3.1 exploratory analysis
  3.2 trend modelling
  3.3 estimating and modelling auto-correlation
  3.4 fitting spatial model by maximum likelihood
  3.5 inference, model building and assessment
  3.6 computing spatial predictions by kriging

4 some theory on stochastic processes
  4.1 stochastic processes
  4.2 stationary and isotropic stochastic processes
  4.3 Gaussian stochastic processes
  4.4 properties of covariance functions and variograms
  4.5 smoothness of stochastic processes
  4.6 examples of isotropic covariance functions
  4.7 anisotropic covariance functions

5 ad-hoc estimation of parameters of model for spatial data
  5.1 trend modelling
  5.2 estimating and modelling auto-correlation

6 maximum likelihood (ML) estimation of parameters of Gaussian model for spatial data
  6.1 ML estimation for Gaussian spatial model
  6.2 restricted maximum likelihood estimation (REML)
  6.3 statistical inference and model building

7 spatial prediction by kriging
  7.1 mean square prediction
  7.2 mean square prediction for Gaussian process
  7.3 universal/external drift kriging
  7.4 lognormal universal kriging

8 model assessment by cross-validation
  8.1 criteria to assess precision of predictions
  8.2 criteria to assess MSEP[$\hat{Y}_k(x_i)$]
  8.3 criteria to assess probabilistic predictions
  8.4 criteria to assess “sharpness” of predictive distributions
1 geostatistics: data sets and objectives of analyses

1.1 spatial statistics

• spatial statistics: statistical analysis of data for which spatial (or
spatio-temporal) position where attribute was recorded is known

• depending on type of spatial data one distinguishes:

– geostatistics: analysis of spatial data that refers to a very
small area (volume) and that can in principle be recorded at
any point in a study domain (⇒ infinite number of locations in
study domain where measurements can be taken)
– point pattern analysis: analysis of spatial positions of
“points” (or other objects) in a study domain
– statistical image analysis: analysis of spatial data that typically
refer to larger areas (volumes) and that are arranged
on regular or irregular lattices (with finite number of “cells”)

⇒ geostatistics is just one (important) branch of spatial statistics

a geostatistical data set

[Figure: log10(Cu content) topsoil (0–20 cm), metal smelter Dornach BL; axes: easting [m], northing [m]; symbols: pollution levels A to F]

a point pattern

[Figure not recoverable from the source]
a lattice data set

[Figure: rate of sudden infant deaths, 1974, North Carolina; axes: longitude, latitude]

1.2 objectives of geostatistical analyses

1. computing spatial predictions

• pressure head of Wolfcamp aquifer
• topsoil heavy metal content in vicinity of metal smelter

2. parameter estimation, significance testing in controlled experiments

• stocks of organic carbon stored in Swiss forest soils
• field trial on yield of 56 wheat varieties

1.2.1 Wolfcamp aquifer data set

• data on hydraulic head (pressure) in a confined brine aquifer in NW Texas
• hydro-geological study part of evaluation whether region suited to
host nuclear waste repository
• measurement of hydraulic head at 85 locations
• geostatistical analysis by Cressie (1993)
• data set part of R package geoR

> library(geoR)
> str(wolfcamp)

List of 2
 $ NULL: num [1:85, 1:2] 68.85 -44.09 -1.87 -29.96 155.24 ..
 $ data: num [1:85] 446 778 658 748 535 ...
 - attr(*, "class")= chr "geodata"

> ## convert to ordinary dataframe
> d.w <- as.data.frame(wolfcamp)
> class(d.w) <- "data.frame"
> colnames(d.w) <- c("x", "y", "pressure")
> str(d.w)

'data.frame': 85 obs. of 3 variables:
 $ x       : num 68.85 -44.09 -1.87 -29.96 155.24 ...
 $ y       : num 44.5 -14.8 -24.3 -37.9 -57 ...
 $ pressure: num 446 778 658 748 535 ...
 - attr(*, "ncol.data")= int 3
> summary(d.w)

       x                 y               pressure
 Min.   :-233.72   Min.   :-145.79   Min.   : 312.1
 1st Qu.: -34.26   1st Qu.:-106.73   1st Qu.: 471.8
 Median :  19.57   Median : -65.74   Median : 547.7
 Mean   :  27.63   Mean   : -33.23   Mean   : 610.3
 3rd Qu.: 114.10   3rd Qu.:  51.21   3rd Qu.: 774.2
 Max.   : 181.53   Max.   : 136.41   Max.   :1088.4

> ## bubble plot: symbol area linearly related to data
> plot(y~x, d.w, asp=1, cex=sqrt(pressure-300)/5)

[Figure: bubble plot of pressure at the data locations; axes: x, y]

[Figures: four perspective plots of pressure over the x- and y-coordinates (north arrow N): pressure head data; trend surface prediction; kriging prediction; kriging standard error]
geostatistical data sets

• set of observations $(y_i, x_i)$ where $y_i$ is a datum of a response
variable and $x_i$ is a spatial location in a study domain $D$
• optional: spatial covariates, say $d_k(x_i)$, used to “explain” the
spatial pattern of the response variable
• geostatistical data often show (gradual) large-scale spatial variation
(trend) and small-scale local fluctuations
– trend: commonly modelled as low-order polynomial function of spatial
coordinates (⇒ trend surface) or as function of
external spatial covariates (⇒ external trend)
– local fluctuations usually spatially structured (values at pairs
of nearby locations “more similar” than for pairs farther apart):
⇒ auto-correlation
• spatial data sometimes distorted by independent measurement errors

terminology and model notation

• model for data: $Y_i = S(x_i) + Z_i$ where
  $Y_i$     ith datum
  $S(x_i)$  “signal” (= true quantity) at location $x_i$
  $Z_i$     iid random measurement error
• decomposition of signal into trend $\mu(x_i)$ and stochastic fluctuation:

  $S(x_i) = \mu(x_i) + E(x_i)$

where commonly a linear model is used for $\mu(x_i)$

  $\mu(x_i) = \sum_k d_k(x_i)\,\beta_k = d(x_i)^\mathrm{T} \beta$

with $d_k(x_i)$ denoting (spatial) covariates and $\{E(x_i)\}$ a zero mean
stochastic process (random field)
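
The model notation can be made concrete by simulating a small data set that follows Y_i = µ(x_i) + E(x_i) + Z_i. The sketch below is illustrative only: the exponential covariance function, all parameter values and the object names (`beta`, `sill`, `range.par`, ...) are assumptions and do not come from the slides.

```r
## minimal simulation sketch of Y_i = mu(x_i) + E(x_i) + Z_i (all settings assumed)
set.seed(1)
n <- 100
x <- cbind(x1 = runif(n), x2 = runif(n))   # locations in the unit square
D <- cbind(1, x)                           # rows d(x_i)^T: intercept + coordinates
beta <- c(2, 1, -0.5)                      # assumed trend coefficients
mu <- drop(D %*% beta)                     # linear trend surface mu(x_i)

## zero-mean Gaussian process E(x) with exponential covariance (assumed model)
sill <- 0.5; range.par <- 0.2
h <- as.matrix(dist(x))                    # inter-point distances
Sigma <- sill * exp(-h / range.par)
E <- drop(t(chol(Sigma)) %*% rnorm(n))     # E = L z with Sigma = L L^T

Z <- rnorm(n, sd = 0.1)                    # iid measurement error
S <- mu + E                                # signal
Y <- S + Z                                 # observed data

## ordinary least squares fit of the trend surface (ignores auto-correlation)
fit <- lm(Y ~ x1 + x2, data = data.frame(Y = Y, x))
coef(fit)
```

The least squares fit recovers the trend coefficients only approximately, because the auto-correlated component E(x) is left in the residuals.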
1.2.2 Dornach data: soil pollution by heavy metals

• site: Swissmetal smelter in Dornach (SO), Switzerland
• emission of dust and fumes containing Cu, Cd and Zn from 1870
to 1980s severely contaminated soils in vicinity of smelter
• comprehensive survey 2003–2005 with objective to spatially delimit
zones that require mitigating measures
details → website BBG Dornach
• more details in Hofer et al. (2013)
[Figure: Cu content topsoil (0–20 cm), survey status summer 2003 (“Untersuchungen Stand Sommer 2003”); classes: Cu ≤ guide value, guide < Cu ≤ trigger value, trigger < Cu ≤ clean-up value, Cu > clean-up value; municipalities: Arlesheim, Reinach, Dornach, Aesch; axes: easting [m], northing [m]]

[Figure: Cu: expectations of predictive distributions of parcel means; pollution levels A to F; axes: easting [m], northing [m]]

[Figure: Cu: 95%-percentiles of predictive distributions of parcel means; pollution levels A to F; axes: easting [m], northing [m]]

[Figure: Cu: 5%-percentiles of predictive distributions of parcel means; pollution levels A to F; axes: easting [m], northing [m]]

support of data and predictions

• Dornach: metal content of bulked samples obtained by combining
soil material of several borings collected over a small plot
⇒ support: area (or volume) to which a datum relates
(Dornach: support of majority of measurements: 100 m²)
• support of data usually small compared to size of study domain
⇒ approximation: infinitesimal support
⇒ “regionalized variable”: signal $S(x)$ exists for any point
$x \in$ study domain
• prediction targets are sometimes spatial means with non-vanishing
support $|B|$ (Dornach: mean concentration of heavy metals on parcels)

  $S(B) = \frac{1}{|B|} \int_B S(x)\,dx$

⇒ change of support: predictions of spatial means from data with
(approximately) infinitesimal support

1.2.3 SOC stocks of Swiss forest soils

• soil organic carbon (SOC) stocks reported in national greenhouse
gas inventory (UNFCCC, Kyoto Protocol)
• SOC stock data available for ≈ 1000 forest sites in CH
• objectives
⇒ develop regression model to estimate SOC stock from (spatial)
environmental covariates
⇒ predict mean SOC stocks at regional and national scales
• full details in Nussbaum et al. (2014)
[Figure: map of Switzerland with the soil profile sites on forest area, shown above a stack of covariate layers.
Legend: WSL soil profiles (total 1033 sites): calibration set (858 sites), validation set (175 sites); monitoring network soil profiles (total 76 sites): validation set NABO (28 sites), validation set KABO (48 sites); sites with litter data used for comparison with Yasso07 (149 sites); sites without litter data (26 sites); forest.
Covariate layers: mean annual precipitation, mean annual temperature, pH (based on soil map units), near infrared reflectance, vegetation height, height above sea level, topographic wetness index, topographic position index.
Data source: Soil profiles calibration/validation © 2011 WSL, Soil Science Group; Soil profiles validation NABO / KABO © 2012, Agroscope / ALN Zurich / AFU St. Gallen; Forest: SilvaProtect-CH © 2008 BAFU, Giamboni; Lakes: Vector 200 © 2007 swisstopo (DV033492.2); Relief 1:1'000'000: K606-01 © 2004 swisstopo; Swiss Boundary: BFS GEOSTAT, swisstopo]
> summary(non.spatial.model)
...
                           Estimate Std. Error t value Pr(>|t|)
(Intercept)               4.025e+00  1.834e-01  21.951  < 2e-16
sqrt(precipitation)       1.065e-02  1.118e-03   9.525  < 2e-16
nir                      -9.666e-03  2.841e-03  -3.402 0.000695
position_index_clay_poor  2.082e-03  8.547e-04   2.436 0.015002
position_index_clay_rich -1.688e-03  4.547e-04  -3.713 0.000216
stock_soil_mass          -2.506e-04  5.601e-05  -4.474 8.53e-06
soil_map_units_B         -3.740e-01  4.938e-02  -7.573 8.18e-14
soil_map_units_C         -4.679e-01  4.580e-02 -10.216  < 2e-16
soil_map_units_D         -2.512e-01  4.373e-02  -5.744 1.22e-08
soil_map_units_E         -1.179e+00  1.258e-01  -9.375  < 2e-16

Residual standard error: 0.4283 on 1012 degrees of freedom
...

> summary(spatial.model)
...
Variogram: exponential
         Estimate
variance    0.096
snugget     0.000
nugget      0.096
scale     329.205

Fixed effects coefficients:
                           Estimate Std. Error t value Pr(>|t|)
(Intercept)               3.980e+00  1.979e-01  20.113  < 2e-16
sqrt(precipitation)       1.098e-02  1.245e-03   8.824  < 2e-16
nir                      -8.389e-03  2.988e-03  -2.807 0.005090
position_index_clay_poor  2.303e-03  8.739e-04   2.636 0.008525
position_index_clay_rich -1.658e-03  4.806e-04  -3.450 0.000584
stock_soil_mass          -2.667e-04  5.975e-05  -4.463 8.97e-06
soil_map_units_B         -3.712e-01  5.398e-02  -6.877 1.07e-11
soil_map_units_C         -4.715e-01  4.965e-02  -9.495  < 2e-16
soil_map_units_D         -2.531e-01  4.728e-02  -5.353 1.07e-07
soil_map_units_E         -1.142e+00  1.452e-01  -7.866 9.34e-15

Residual standard error (sqrt(nugget)): 0.3095
...
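
The spatial model reports larger standard errors than the non-spatial fit. One way to see why ignoring auto-correlation tends to understate uncertainty is to compare, for the ordinary least squares estimator, the covariance computed under independence, σ²(XᵀX)⁻¹, with its actual covariance (XᵀX)⁻¹XᵀΣX(XᵀX)⁻¹ when the errors have covariance matrix Σ. The sketch below uses a made-up transect, a made-up covariate and an assumed exponential covariance; it is not the SOC analysis.

```r
## sketch: variance of the OLS estimator under auto-correlated errors (settings assumed)
n <- 50
s <- seq_len(n)                                  # locations on a transect
X <- cbind(1, s / n)                             # intercept and one smooth covariate
sigma2 <- 1; range.par <- 5                      # assumed variance and range parameter
Sigma <- sigma2 * exp(-as.matrix(dist(s)) / range.par)  # exponential covariance

XtXi <- solve(crossprod(X))                      # (X'X)^{-1}
V.naive <- sigma2 * XtXi                         # valid for independent errors only
V.true  <- XtXi %*% t(X) %*% Sigma %*% X %*% XtXi   # actual covariance of OLS estimates

## ratio of actual to naive standard errors (> 1: naive values are too small)
se.ratio <- sqrt(diag(V.true) / diag(V.naive))
se.ratio
```

The SOC output above compares generalized with ordinary least squares standard errors, which is a related but not identical comparison.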
[Figure: parameter estimates and standard errors of the regression coefficients ((Intercept), sqrt(precipitation), nir, position_index_clay_poor, position_index_clay_rich, stock_soil_mass, soil_map_units_B to soil_map_units_E) for the non-spatial model and the spatial model]

[Figure: map “Prediction of soil organic carbon in forests”, SOC stock 0–30 cm [t ha⁻¹], colour scale 0 to 270. Data source: own data; Lakes: Vector 200 © 2007 swisstopo (DV033492.2); Relief 1:1'000'000: K606-01 © 2004 swisstopo; Swiss Boundary: BFS GEOSTAT, swisstopo]

[Figure: carbon stock 0–30 cm depth, C stock [t/ha], by region (Jura, Plateau, Northern Prealps, Alps, Southern Alps, CH) and altitude class (<600 m, 600–1200 m, >1200 m)]
spatial regression analyses

• regression analyses with spatially referenced data
• auto-correlation of errors frequently due to effects of unaccounted
covariates (imperfect model)
⇒ assumption of independent errors often untenable
⇒ falsely ignoring auto-correlation biases statistical inference
(p-values usually too small!)

parameter estimation vs. (spatial) prediction

• parameter estimation: inferring values of parameters of a
stochastic model from data (e.g. coefficients and variance of error
term of a regression model)
• prediction: inference about unobserved values of random variables
(e.g. prediction of SOC stocks for a location without measurement)
• prediction target sometimes large (national SOC stock estimate:
spatial mean over whole study domain!)

  $S(D) = \frac{1}{|D|} \int_D S(x)\,dx$
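
A spatial mean such as S(D) is an integral of the signal over the domain and can be approximated numerically by averaging over a fine grid. The sketch below assumes a hypothetical, fully known surface `S()` on the unit square so that the approximation can be checked against the exact value. In a real analysis S(x) is unknown and would be replaced by predictions.

```r
## sketch: approximating S(D) = 1/|D| * integral of S(x) over D by a grid average
## S() is a hypothetical, known surface on the unit square D = [0, 1]^2
S <- function(x1, x2) x1 + x2^2

m <- 100                                   # grid cells per axis
mid <- (seq_len(m) - 0.5) / m              # cell midpoints
grid <- expand.grid(x1 = mid, x2 = mid)    # fine discretisation of D

S.D <- mean(S(grid$x1, grid$x2))           # midpoint rule: average over grid nodes
S.D                                        # exact value for this surface: 1/2 + 1/3
```

Averaging point predictions over the nodes gives a prediction of the mean, but the uncertainty of that mean cannot be obtained by averaging the point prediction variances, because prediction errors at nearby nodes are correlated.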

1.2.4 field experiment on yield of wheat varieties

• field experiment to compare the yield of 56 varieties of wheat
• design of experiment: randomized complete block design with 4 blocks
• locations of the centres of the 224 experimental plots were recorded
• more details in Pinheiro and Bates (2000, pp. 260)

[Figure: field layout showing the variety number (1 to 56) of each plot; axes: latitude, longitude]
> library(nlme)
> # analysis as a randomized complete block experiment
> r.lme.means <- lme(yield~variety-1, Wheat2,
+                    random=~1|Block)
> summary(r.lme.means)

Linear mixed-effects model fit by REML
 Data: Wheat2
       AIC      BIC    logLik
  1333.702 1514.891 -608.8508

Random effects:
 Formula: ~1 | Block
        (Intercept) Residual
StdDev:     3.14371 7.041475

Fixed effects: yield ~ variety - 1
                  Value Std.Error  DF  t-value p-value
varietyARAPAHOE 29.4375  3.855687 165 7.634827       0
varietyBRULE    26.0750  3.855687 165 6.762738       0
varietyBUCKSKIN 25.5625  3.855687 165 6.629818       0
...

> # global F-Test: testing for any significant
> # treatment effects
> anova(update(r.lme.means, .~. + 1))

            numDF denDF   F-value p-value
(Intercept)     1   165 242.05402  <.0001
variety        55   165   0.87549  0.7119

> # testing particular treatment contrast
> # (BUCKSKIN vs. ARAPAHOE)
> anova(r.lme.means, L=c(1, 0, -1))

F-test for linear combination(s)
varietyARAPAHOE varietyBUCKSKIN
              1              -1
  numDF denDF   F-value p-value
1     1   165 0.6056841  0.4375

[Figure: map of the residuals; axes: latitude, longitude]

lagged scatterplots of the residuals (res): correlation r of residual pairs by lag class

along direction S −> N (latitude)
  lag class     r
  (0,4.3]       0.407
  (4.3,8.6]     0.343
  (8.6,12.9]    0.204
  (12.9,17.2]   0.0281
  (17.2,21.5]   0.192
  (21.5,25.8]   0.235

along direction W −> E (longitude)
  lag class     r
  (0,1.2]       0.456
  (1.2,2.4]     0.4
  (2.4,3.6]     0.307
  (3.6,4.8]     0.315
  (4.8,6]       0.252
  (6,7.2]       0.269
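
Lagged scatterplots pair each residual with the residuals found at a given separation and report the correlation of those pairs per lag class. The sketch below shows the idea for the simpler case of a one-dimensional transect. The helper `lagged_cor()` and the simulated toy series are hypothetical and are not the code used for the figures.

```r
## sketch: correlation of residual pairs by lag class along one direction
## (hypothetical helper; 'res' are residuals, 'coord' their positions on a transect)
lagged_cor <- function(res, coord, breaks) {
  d <- abs(outer(coord, coord, "-"))       # separation of all pairs
  ut <- upper.tri(d)                       # use each pair only once
  i <- row(d)[ut]; j <- col(d)[ut]
  cls <- cut(d[ut], breaks)                # lag class of each pair (NA if outside breaks)
  sapply(split(seq_along(cls), cls),
         function(k) cor(res[i[k]], res[j[k]]))
}

set.seed(1)
## auto-correlated toy series standing in for residuals
res <- as.numeric(arima.sim(model = list(ar = 0.8), n = 500))
r <- lagged_cor(res, coord = seq_along(res), breaks = 0:6)
round(r, 2)
```

For data on a two-dimensional layout, as in the wheat trial, the pairs would additionally be restricted to a direction (S to N or W to E) before forming lag classes.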

> # analysis using a geostatistical spatial model
> r.gls.means <- gls(yield~variety-1, Wheat2,
+                    corr=corRatio(form=~latitude+longitude,
+                                  nugget=TRUE))
> summary(r.gls.means)

Generalized least squares fit by REML
  Model: yield ~ variety
  Data: Wheat2
       AIC      BIC    logLik
  1183.278 1367.592 -532.6389

Correlation Structure: Rational quadratic spatial correlation
 Formula: ~latitude + longitude
 Parameter estimate(s):
     range     nugget
13.4613358  0.1935803

Coefficients:
                   Value Std.Error  t-value p-value
varietyARAPAHOE 26.54597  4.970942 5.340229   0e+00
varietyBRULE    26.28374  4.984883 5.272690   0e+00
varietyBUCKSKIN 35.03727  5.007094 6.997526   0e+00
...

> # global F-Test: testing for any significant
> # treatment effects
> anova(update(r.gls.means, .~.+1))

Denom. DF: 168
            numDF  F-value p-value
(Intercept)     1 30.39940  <.0001
variety        55  1.85094  0.0015

> # testing particular treatment contrast
> # (BUCKSKIN vs. ARAPAHOE)
> anova(r.gls.means, L=c(1, 0, -1))

Denom. DF: 168
F-test for linear combination(s)
varietyARAPAHOE varietyBUCKSKIN
              1              -1
  numDF F-value p-value
1     1 7.69673  0.0062
> # block design: testing BUCKSKIN vs. ARAPAHOE
>
> # treatment means
> round(fixef(r.lme.means)[c(1, 3)], 3)

varietyARAPAHOE varietyBUCKSKIN
         29.437          25.562

> # block design: (co-)variances treatment means
> round(vcov(r.lme.means)[c(1, 3),c(1, 3)], 3)

                varietyARAPAHOE varietyBUCKSKIN
varietyARAPAHOE          14.866           2.471
varietyBUCKSKIN           2.471          14.866

> # block design: t-value treatment contrast
> (29.438-25.562) / sqrt(2*(14.866-2.471))

[1] 0.7784765

> # spatial model: testing BUCKSKIN vs. ARAPAHOE
>
> # treatment means
> round(coef(r.gls.means)[c(1, 3)], 3)

varietyARAPAHOE varietyBUCKSKIN
         26.546          35.037

> # spatial model: (co-)variances treatment means
> round(vcov(r.gls.means)[c(1, 3), c(1, 3)], 3)

                varietyARAPAHOE varietyBUCKSKIN
varietyARAPAHOE          24.710          20.207
varietyBUCKSKIN          20.207          25.071

> # spatial model: t-value treatment contrast
> (26.546-35.037) / sqrt(24.710+25.071-2*20.207)

[1] -2.774333
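
Both hand calculations are instances of the same formula: for a contrast vector L, the t-value is Lᵀb / sqrt(LᵀVL), where b holds the estimated treatment means and V their covariance matrix. The sketch below wraps this in a small helper, `contrast_t()`, which is a hypothetical name and not part of nlme, and applies it to the rounded numbers printed in the output.

```r
## sketch: t-value of a linear contrast L'b from estimates b and their covariance V
contrast_t <- function(b, V, L) {
  est <- sum(L * b)                         # contrast estimate L'b
  se  <- sqrt(drop(t(L) %*% V %*% L))       # standard error sqrt(L' V L)
  est / se
}

L <- c(1, -1)                               # ARAPAHOE minus BUCKSKIN

## block design (values copied from the printed output)
b.lme <- c(29.4375, 25.5625)
V.lme <- matrix(c(14.866, 2.471, 2.471, 14.866), 2, 2)
t.lme <- contrast_t(b.lme, V.lme, L)

## spatial model (values copied from the printed output)
b.gls <- c(26.546, 35.037)
V.gls <- matrix(c(24.710, 20.207, 20.207, 25.071), 2, 2)
t.gls <- contrast_t(b.gls, V.gls, L)

c(block = t.lme, spatial = t.gls)
```

The squared t-values agree, up to rounding, with the F-values 0.6057 and 7.6967 reported by `anova(..., L=c(1, 0, -1))`. The large positive covariance of the two means in the spatial model is what shrinks LᵀVL and makes the contrast significant.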

geostatistical analyses of controlled experiments

• field experiments in ecology, agriculture, forestry, … often give rise
to spatial data
• classical analysis of variance of spatial experimental data ignores
spatial structure of the data
• blocking and randomization sometimes not effective to account for
natural heterogeneity within experimental site
• residuals often violate independence assumption of classical analysis
of variance methods
• explicit consideration of auto-correlation by generalized least
squares estimation:
⇒ increased estimation variance for treatment means
⇒ decreased estimation variance for treatment contrasts due to
strong positive correlation of treatment mean estimates

summary section 1

• geostatistical data $(y_i, x_i, d_k(x_i))$:
1. response $y_i$
2. location $x_i$
3. covariates $d_k(x_i)$
4. often approximately infinitesimal support
• models for geostatistical data decompose spatial variation
(non-uniquely) into “large-scale” trend and local auto-correlated
fluctuations
• trend modelled by linear regression model
• local fluctuations modelled by auto-correlated stochastic process
summary section 1 (cont.)

• objectives of geostatistical analyses:
1. prediction of response variable
– at location without measurement
– for finite support targets (spatial means)
2. estimation of parameters of (regression) models fitted to
geostatistical data
3. (likely more powerful) analyses of spatial experimental data
(than classical ANOVA)

References

Cressie, N. A. C. (1993). Statistics for Spatial Data. John Wiley & Sons,
New York, revised edition.

Hofer, C., Borer, F., Bono, R., Kayser, A., and Papritz, A. (2013).
Predicting topsoil heavy metal content of parcels of land: An empirical
validation of customary and constrained lognormal block kriging and
conditional simulations. Geoderma, 193–194, 200–212.

Nussbaum, M., Papritz, A., Baltensweiler, A., and Walthert, L. (2014).
Estimating soil organic carbon stocks of Swiss forest soils by robust
external-drift kriging. Geoscientific Model Development, 7(3), 1197–1210.

Pinheiro, J. C. and Bates, D. M. (2000). Mixed-Effects Models in S and
S-PLUS. Springer Verlag.

2 spatial data: R classes and methods

characteristics of spatial data

• geostatistical data set: measurements of response variable and
coordinates of data locations (and optionally covariates)
⇒ data.frame sufficient for storing, analysing and predicting
geostatistical data
• prediction target sometimes defined by polygons (cf. example Dornach)
• data set with (point) prediction target sometimes very large
• requirement for linking R and geographical information systems (GIS)
⇒ incentive for creating customized classes (and methods!) for
handling and analysing spatial data
2.1 package sp: methods & classes for spatial data

• package sp provides more advanced S4 formal classes and methods
for analysing spatial data
⇒ suite of classes and methods for SpatialPoints, SpatialLines,
SpatialPolygons, SpatialGrid and SpatialPixels objects
⇒ focus here on SpatialPointsDataFrame class and associated methods
• more information in Bivand et al. (2013, chap. 2–3)

SpatialPointsDataFrames: basics

> library(sp)
> d.dornach <- read.table("dornach.txt", header=TRUE)
> d.dornach$dist <- with(d.dornach, sqrt(x^2+y^2))
> str(d.dornach)

'data.frame': 181 obs. of 10 variables:
 $ x       : num 562 361.3 1003.5 341.2 40.1 ...
 $ y       : num 241 140 442 -381 -201 ...
 $ survey  : Factor w/ 2 levels "a","b": 2 2 2 2 2 2 2 2 2..
 $ forest  : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 ..
 $ built.up: Factor w/ 3 levels "after.1969","before.1960"..
 $ geology : Factor w/ 4 levels "limestone.a",..: 3 3 3 3 ..
 $ cu      : num 39 51 267 278 1401 ...
 $ cd      : num 0.6 0.48 1.18 1.58 4.04 1.08 2.68 0.57 0..
 $ zn      : num NA NA NA NA NA NA NA NA NA NA ...
 $ dist    : num 611 388 1096 512 205 ...

> # generate SpatialPointsDataFrame
>
> spdf.dornach <- d.dornach
> coordinates(spdf.dornach) <- ~x+y
> str(spdf.dornach, max=3)

Formal class 'SpatialPointsDataFrame' [package "sp"] with ..
 ..@ data :'data.frame': 181 obs. of 8 variables:
 .. ..$ survey  : Factor w/ 2 levels "a","b": 2 2 2 2 2 2..
 .. ..$ forest  : Factor w/ 2 levels "no","yes": 1 1 1 1 ..
 .. ..$ built.up: Factor w/ 3 levels "after.1969","befor"..
 .. ..$ geology : Factor w/ 4 levels "limestone.a",..: 3 ..
 .. ..$ cu      : num [1:181] 39 51 267 278 1401 ...
 .. ..$ cd      : num [1:181] 0.6 0.48 1.18 1.58 4.04 1.0..
 .. ..$ zn      : num [1:181] NA NA NA NA NA NA NA NA NA ..
 .. ..$ dist    : num [1:181] 611 388 1096 512 205 ...
 ..@ coords.nrs : int [1:2] 1 2
 ..@ coords     : num [1:181, 1:2] 562 361.3 1003.5 341.2..
 .. ..- attr(*, "dimnames")=List of 2
 ..@ bbox       : num [1:2, 1:2] -1846 -1204 3051 1108
 .. ..- attr(*, "dimnames")=List of 2
 ..@ proj4string:Formal class 'CRS' [package "sp"] with 1..

> slotNames(spdf.dornach)

[1] "data"        "coords.nrs"  "coords"      "bbox"
[5] "proj4string"

> slot(spdf.dornach, "bbox")

       min     max
x -1846.50 3050.74
y -1204.24 1107.90

> spdf.dornach@bbox

       min     max
x -1846.50 3050.74
y -1204.24 1107.90
> summary(spdf.dornach)

Object of class SpatialPointsDataFrame
Coordinates:
       min     max
x -1846.50 3050.74
y -1204.24 1107.90
Is projected: NA
proj4string : [NA]
Number of points: 181
Data attributes:
 survey  forest         built.up          geology
 a: 66   no :159   after.1969 :46   limestone.a:  7
 b:115   yes: 22   before.1960:66   limestone.b: 12
                   not        :69   other      :151
                                    tertiary   : 11

       cu               cd               zn
 Min.   :   5.0   Min.   : 0.110   Min.   :  37.0
 1st Qu.:  56.0   1st Qu.: 0.810   1st Qu.: 158.2
 Median : 117.0   Median : 1.320   Median : 251.0
 Mean   : 381.5   Mean   : 1.768   Mean   : 508.9
 3rd Qu.: 398.0   3rd Qu.: 2.100   3rd Qu.: 557.0
 Max.   :3881.0   Max.   :23.300   Max.   :4955.0
                                   NA's   :23
      dist
 Min.   :  97.67
 1st Qu.: 306.70
 Median : 567.79
 Mean   : 716.25
 3rd Qu.: 942.37
 Max.   :3159.78

> # selecting elements from SpatialPointsDataFrame
> spdf.dornach$cu

[1] 39.0 51.0 267.0 278.0 1401.0 358.0 3881.0
.......

> spdf.dornach[1:2, 2:4]

       coordinates forest    built.up geology
1 (561.98, 240.85)     no  after.1969   other
2 (361.27, 140.49)     no before.1960   other

> spdf.dornach[1:2, c("survey", "dist")]

       coordinates survey     dist
1 (561.98, 240.85)      b 611.4166
2 (361.27, 140.49)      b 387.6254

> coordinates(spdf.dornach[1:2,])

       x      y
1 561.98 240.85
2 361.27 140.49

> # convert SpatialPointsDataFrame
> str(as(spdf.dornach, "data.frame"))

'data.frame': 181 obs. of 10 variables:
 $ x       : num 562 361.3 1003.5 341.2 40.1 ...
 $ y       : num 241 140 442 -381 -201 ...
 $ survey  : Factor w/ 2 levels "a","b": 2 2 2 2 2 2 2 2 2..
 $ forest  : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 ..
 $ built.up: Factor w/ 3 levels "after.1969","before.1960"..
 $ geology : Factor w/ 4 levels "limestone.a",..: 3 3 3 3 ..
 $ cu      : num 39 51 267 278 1401 ...
 $ cd      : num 0.6 0.48 1.18 1.58 4.04 1.08 2.68 0.57 0..
 $ zn      : num NA NA NA NA NA NA NA NA NA NA ...
 $ dist    : num 611 388 1096 512 205 ...

> str(as(spdf.dornach, "SpatialPoints"))

Formal class 'SpatialPoints' [package "sp"] with 3 slots
 ..@ coords     : num [1:181, 1:2] 562 361.3 1003.5 341.2..
 .. ..- attr(*, "dimnames")=List of 2
 .. .. ..$ : chr [1:181] "1" "2" "3" "4" ...
 .. .. ..$ : chr [1:2] "x" "y"
 ..@ bbox       : num [1:2, 1:2] -1846 -1204 3051 1108
 .. ..- attr(*, "dimnames")=List of 2
 .. .. ..$ : chr [1:2] "x" "y"
 .. .. ..$ : chr [1:2] "min" "max"
 ..@ proj4string:Formal class 'CRS' [package "sp"] with 1..
 .. .. ..@ projargs: chr NA

> # SpatialPointsDataFrame 'behaves' as ordinary data.frame
> lm(log(cu)~dist, data=d.dornach)

Call:
lm(formula = log(cu) ~ dist, data = d.dornach)

Coefficients:
(Intercept)         dist
   6.277739    -0.001738

> lm(log(cu)~dist, data=spdf.dornach)

Call:
lm(formula = log(cu) ~ dist, data = spdf.dornach)

Coefficients:
(Intercept)         dist
   6.277739    -0.001738

> plot(cu~dist, data=spdf.dornach)

[figure: scatter plot of cu against dist]

> # overview of S4 methods for class SpatialPointsDataFrame
> showMethods(classes="SpatialPointsDataFrame")

Function: $ (package base)
x="SpatialPointsDataFrame"
(inherited from: x="SpatialPoints")

Function: [ (package base)
x="SpatialPointsDataFrame"

Function: bbox (package sp)
obj="SpatialPointsDataFrame"
(inherited from: obj="Spatial")

Function: coerce (package methods)
from="SpatialGridDataFrame", to="SpatialPointsDataFrame"
from="SpatialLines", to="SpatialPointsDataFrame"
from="SpatialLinesDataFrame", to="SpatialPointsDataFrame"
from="SpatialMultiPointsDataFrame", to="SpatialPointsDataFrame"
from="SpatialPixelsDataFrame", to="SpatialPointsDataFrame"
from="SpatialPointsDataFrame", to="Spatial"
from="SpatialPointsDataFrame", to="SpatialPixelsDataFrame"
from="SpatialPointsDataFrame", to="SpatialPoints"
.......
> # overview of S3 methods for class SpatialPointsDataFrame
> methods(class="SpatialPointsDataFrame")

[1] $ $<- [ [<-
[5] [[ [[<- addAttrToGeom as.data.frame
[9] bbox coerce coerce<- coordinates
[13] coordinates<- coordnames coordnames<- dim
[17] dimensions fullgrid geometry geometry<-
[21] gridded gridded<- is.projected length
[25] merge names names<- over
[29] plot points polygons print
[33] proj4string proj4string<- rbind row.names
[37] row.names<- show spChFIDs<- spTransform
[41] split sppanel spplot spsample
[45] stack summary text
see '?methods' for accessing help and source code

SpatialPointsDataFrames: plotting

• using traditional graphics system of R:
⇒ plot(), points(), legend(), etc.

• using Trellis graphics system provided by package lattice:
⇒ spplot()

> plot(spdf.dornach)
> plot(spdf.dornach, axes=TRUE, pch=1, col="blue",
+ cex=sqrt(spdf.dornach$cu/max(spdf.dornach$cu))*5)
> points(spdf.dornach, pch=1, col="orange",
+ cex=sqrt(spdf.dornach$cd/max(spdf.dornach$cd))*5)
> legend("bottomleft", pch=1, col=c("blue", "orange"),
+ legend=c("cu", "cd"), bty="n")

[figure: bubble plot of cu (blue) and cd (orange) at the sampling locations]

> spplot(spdf.dornach, zcol="cu", key.space="left")

[figure: spplot of cu with classes [5,780.2], (780.2,1555], (1555,2331], (2331,3106], (3106,3881]]

SpatialGridDataFrame: outlook

> x.nodes <- with(d.dornach,
+ seq(min(x), max(x), length=ceiling(diff(range(x))/20)))
> y.nodes <- with(d.dornach,
+ seq(min(y), max(y), length=ceiling(diff(range(y))/20)))
> d.grid.dornach <- expand.grid(x=x.nodes, y=y.nodes)
> d.grid.dornach$dist <- with(d.grid.dornach,
+ sqrt(x^2+y^2))
> str(d.grid.dornach)

'data.frame': 28420 obs. of 3 variables:
$ x : num -1846 -1826 -1806 -1786 -1766 ...
$ y : num -1204 -1204 -1204 -1204 -1204 ...
$ dist: num 2204 2188 2171 2154 2138 ...
- attr(*, "out.attrs")=List of 2
..$ dim : Named int 245 116
.. ..- attr(*, "names")= chr "x" "y"
..$ dimnames:List of 2
.. ..$ x: chr "x=-1.846500e+03" "x=-1.826429e+03" "x=-"..
.. ..$ y: chr "y=-1204.240000" "y=-1184.134435" "y=-11"..

> sgdf.dornach <- d.grid.dornach
> coordinates(sgdf.dornach) <- ~x+y
> gridded(sgdf.dornach) <- TRUE
> fullgrid(sgdf.dornach) <- TRUE
> str(sgdf.dornach)

Formal class 'SpatialGridDataFrame' [package "sp"] with 4 ..
..@ data :'data.frame': 28420 obs. of 1 variable:
.. ..$ dist: num [1:28420] 2153 2136 2119 2102 2085 ...
..@ grid :Formal class 'GridTopology' [package "s"..
.. .. ..@ cellcentre.offset: Named num [1:2] -1846 -1204
.. .. .. ..- attr(*, "names")= chr [1:2] "x" "y"
.. .. ..@ cellsize : Named num [1:2] 20.1 20.1
.. .. .. ..- attr(*, "names")= chr [1:2] "x" "y"
.. .. ..@ cells.dim : Named int [1:2] 245 116
.. .. .. ..- attr(*, "names")= chr [1:2] "x" "y"
..@ bbox : num [1:2, 1:2] -1857 -1214 3061 1118
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:2] "x" "y"
.. .. ..$ : chr [1:2] "min" "max"
..@ proj4string:Formal class 'CRS' [package "sp"] with 1..
.. .. ..@ projargs: chr NA

> c(object.size(d.grid.dornach), object.size(sgdf.dornach))

[1] 707232 232080

> image(sgdf.dornach)
> spplot(sgdf.dornach, zcol="dist")

[figure: spplot of dist on the grid, colour key from 0 to 3500]

SpatialPolygonsDataFrame: outlook

> library(constrainedKriging)
> str(meuse.blocks, max=2)

Formal class 'SpatialPolygonsDataFrame' [package "sp"] wit..
..@ data :'data.frame': 259 obs. of 2 variables:
..@ polygons :List of 259
.. .. [list output truncated]
..@ plotOrder : int [1:259] 177 179 180 178 188 182 181..
..@ bbox : num [1:2, 1:2] 178438 329598 181562 333..
.. ..- attr(*, "dimnames")=List of 2
..@ proj4string:Formal class 'CRS' [package "sp"] with 1..

> str(meuse.blocks@data)

'data.frame': 259 obs. of 2 variables:
$ dist: num 0 0.003056 0 0.000453 0.019181 ...
$ M : num 0 0.001358 0 0.000574 0.009329 ...

> str(meuse.blocks@polygons, max=2)

List of 259
$ :Formal class 'Polygons' [package "sp"] with 5 slots
$ :Formal class 'Polygons' [package "sp"] with 5 slots
$ :Formal class 'Polygons' [package "sp"] with 5 slots
$ :Formal class 'Polygons' [package "sp"] with 5 slots
.......
.......

> str(meuse.blocks@polygons[[1]], max=2)

Formal class 'Polygons' [package "sp"] with 5 slots
..@ Polygons :List of 1
..@ plotOrder: int 1
..@ labpt : num [1:2] 178907 329627
..@ ID : chr "block4"
..@ area : num 5065

> str(meuse.blocks@polygons[[1]]@Polygons[[1]], max=2)

Formal class 'Polygon' [package "sp"] with 5 slots
..@ labpt : num [1:2] 178907 329627
..@ area : num 5065
..@ hole : logi FALSE
..@ ringDir: int 1
..@ coords : num [1:7, 1:2] 178960 178878 178878 178810 ..
.. ..- attr(*, "dimnames")=List of 2

⇒ see section 2.6 of Bivand et al. (2013) for details
> spplot(meuse.blocks, zcol="dist")

[figure: spplot of dist for the polygons of meuse.blocks, colour key from 0.0 to 1.0]

summary section 2

⇒ classes and methods essential part of R software environment

⇒ acquire basic understanding how this works
– first argument of the call of a generic function determines choice of specific method
– use methods(generic-function) to see what S3 methods exist for a generic-function
– use methods(class="cl") to see all S3 methods defined for a class cl
– analogous queries for S4 methods: showMethods(generic-function) and showMethods(classes="cl")
– for S3 methods look at help page of generic-function.cl(), e.g. ?points.geodata

⇒ classes defined by package sp have become quasi standard for analyzing spatial data
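
The dispatch rule summarized above can be tried out with a toy S3 class. This is a minimal sketch in base R; the class name `toy.points`, its constructor and its `summary` method are hypothetical and only illustrate how the class of the first argument selects the method.

```r
# constructor for a hypothetical S3 class "toy.points"
toy.points <- function(x, y) {
  structure(list(x = x, y = y), class = "toy.points")
}

# S3 method: summary() dispatches on the class of its first argument
summary.toy.points <- function(object, ...) {
  c(n = length(object$x), mean.x = mean(object$x), mean.y = mean(object$y))
}

p <- toy.points(x = c(0, 1, 2), y = c(1, 3, 5))
r.sum <- summary(p)              # generic summary() calls summary.toy.points()
print(r.sum)

# list the S3 methods that are defined for the class
methods(class = "toy.points")
```

Calling `summary()` on an object of another class, e.g. a data.frame, selects a different method although the call looks the same.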

References

Bivand, R. S., Pebesma, E. J., and Gómez-Rubio, V. (2013). Applied Spatial Data Analysis with R. Springer, New York, second edition.

3 an example of a geostatistical analysis with R
typical steps of a geostatistical analysis

exploratory analysis
⇓
trend modelling
⇓
modelling residual auto-correlation
⇓ ⇑
statistical inference
⇓
model assessment by cross-validation
⇓
computing spatial predictions

example data set: Wolfcamp aquifer data

> library(sp)
> library(geoR)
> library(gstat)
> d.w <- as.data.frame(wolfcamp)
> class(d.w) <- "data.frame"
> colnames(d.w) <- c("x", "y", "pressure")
> coordinates(d.w) <- ~x+y
> summary(d.w)

Object of class SpatialPointsDataFrame
Coordinates:
min max
x -233.7217 181.5314
y -145.7884 136.4061
Is projected: NA
proj4string : [NA]
Number of points: 85
Data attributes:
pressure
Min. : 312.1
1st Qu.: 471.8
Median : 547.7
Mean : 610.3
3rd Qu.: 774.2
Max. :1088.4

3.1 exploratory analysis

> plot(y~x, d.w, asp=1, cex=sqrt(d.w$pressure-300)/5)

[figure: bubble plot of pressure at the locations (x, y)]
interactive data inspection by R package rgl

> library(rgl)
> open3d()
> plot3d(x=d.w$x, y=d.w$y,
+ z=d.w$pressure/3,
+ type="s", radius=7, col="red", aspect="iso",
+ xlab="x", ylab="y", zlab="pressure")
> clear3d()

3.2 trend modelling

> r.lm.1 <- lm(pressure~x+y, d.w)
> summary(r.lm.1)

Call:
lm(formula = pressure ~ x + y, data = d.w)

Residuals:
Min 1Q Median 3Q Max
-111.989 -50.297 -9.326 48.510 197.986

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 607.77066 7.52219 80.80 <2e-16
x -1.27844 0.06552 -19.51 <2e-16
y -1.13874 0.07739 -14.71 <2e-16

Residual standard error: 62.29 on 82 degrees of freedom
Multiple R-squared: 0.8909, Adjusted R-squared: 0.8882
F-statistic: 334.8 on 2 and 82 DF, p-value: < 2.2e-16

> op <- par(mfrow=c(2, 2)); plot(r.lm.1); par(op)

[figure: residual diagnostic plots of r.lm.1 with panels Residuals vs Fitted, Normal Q-Q, Scale-Location, Residuals vs Leverage]

> op <- par(mfrow=c(1, 2))
> scatter.smooth(d.w$x, residuals(r.lm.1), xlab="x",
+ main="residuals vs x")
> scatter.smooth(d.w$y, residuals(r.lm.1), xlab="y",
+ main="residuals vs y")
> par(op)

[figure: residuals(r.lm.1) plotted against x (panel "residuals vs x") and against y (panel "residuals vs y") with smoother]
> r.lm.2 <- update(r.lm.1, .~.+I(x^2)+I(y^2)+x:y)
> summary(r.lm.2)

Call:
lm(formula = pressure ~ x + y + I(x^2) + I(y^2) + x:y, data = d.w)

Residuals:
Min 1Q Median 3Q Max
-124.405 -43.662 -2.337 39.017 199.198

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.203e+02 1.295e+01 47.902 < 2e-16
x -1.075e+00 8.191e-02 -13.128 < 2e-16
y -1.330e+00 8.861e-02 -15.008 < 2e-16
I(x^2) 8.994e-05 5.908e-04 0.152 0.879388
I(y^2) -2.929e-03 1.101e-03 -2.659 0.009486
x:y 3.184e-03 8.790e-04 3.622 0.000515

Residual standard error: 57.02 on 79 degrees of freedom
Multiple R-squared: 0.9119, Adjusted R-squared: 0.9063
F-statistic: 163.6 on 5 and 79 DF, p-value: < 2.2e-16

> anova(r.lm.1, r.lm.2)

Analysis of Variance Table

Model 1: pressure ~ x + y
Model 2: pressure ~ x + y + I(x^2) + I(y^2) + x:y
Res.Df RSS Df Sum of Sq F Pr(>F)
1 82 318200
2 79 256887 3 61313 6.2852 0.000702

> plot(y~x, d.w, asp=1, cex=sqrt(abs(residuals(r.lm.1)))/2,
+ xlim=c(200, -250), ylim=c(150, -150),
+ col=c("blue", NA, "orange")[sign(residuals(r.lm.1))+2],
+ main = "bubble plot residuals linear trend surface")
> legend("bottomright", pch=1, col=c("blue", "orange"),
+ legend=c("< 0", "> 0"), bty="n")

[figure: bubble plot residuals linear trend surface (blue: residuals < 0, orange: residuals > 0)]

[figure: perspective plot of the trend surface, pressure against x-coordinate and y-coordinate]
3.3 estimating and modelling auto-correlation

an example with simulated time series data (AR(1) process)

> set.seed(20)
> d.ar <- arima.sim(list(ar=0.9), n=50)
> plot(d.ar, main="simulated AR(1) process")

[figure: simulated AR(1) process, d.ar against Time]

auto-correlation: lag-scatter plots

> plot(d.ar[1:49], d.ar[2:50], asp=1, xlab="y(x)",
+ ylab="y(x+1)", main="lag-scatter plot for lag=1")

[figure: lag-scatter plot for lag=1, y(x+1) against y(x)]

[figure: lag-scatter plots of d.ar with correlation 0.82 for lag 1, 0.75 for lag 2, 0.57 for lag 3 and 0.52 for lag 5]

pro memoria: sample covariance and correlation

• data: measurements $(y_{1,i}, y_{2,i})$, $i = 1, 2, \ldots, n$ of 2 response variables

• sample covariance
\[ s_{1,2} = \frac{1}{n-1} \sum_{i=1}^{n} (y_{1,i} - \bar{y}_1)(y_{2,i} - \bar{y}_2) \]
where $\bar{y}_1$ and $\bar{y}_2$ are the (arithmetic) sample means

• (Pearson) correlation coefficient
\[ \hat{\rho} = \frac{s_{1,2}}{s_1 s_2} \]
where $s_1$ and $s_2$ are the sample standard deviations

• "plug-in" estimator for auto-correlogram of time series
\[ \hat{\rho}(h) = \frac{\sum_{i=1}^{n-h} (y_{i+h} - \bar{y})(y_i - \bar{y})}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \]
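
The plug-in estimator of the auto-correlogram given above can be coded in a few lines of base R. The sketch below uses a hypothetical helper `acf.plugin()` and compares its result with `acf()` of package stats, which uses the same estimator by default; the simulation settings repeat those of the AR(1) example.

```r
# plug-in estimator of the auto-correlation at lag h (hypothetical helper)
acf.plugin <- function(y, h) {
  n <- length(y)
  y.bar <- mean(y)
  num <- sum((y[(1 + h):n] - y.bar) * (y[1:(n - h)] - y.bar))
  den <- sum((y - y.bar)^2)
  num / den
}

set.seed(20)
d.ar <- arima.sim(list(ar = 0.9), n = 50)   # same simulation as in the example
y <- as.numeric(d.ar)

rho.hat <- sapply(0:5, function(h) acf.plugin(y, h))
rho.acf <- as.numeric(acf(y, lag.max = 5, plot = FALSE)$acf)

# both vectors should agree up to rounding error
print(cbind(lag = 0:5, rho.hat, rho.acf))
```

Note that the plug-in estimator uses the overall mean and the overall sum of squares, so its values differ slightly from the Pearson correlations of the lag-scatter plots.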
pro memoria: sample covariance and correlation

[figure: scatter plot of y(x+1) against y(x) with correlation 0.41]

auto-correlation: correlogram of time series

> acf(d.ar, main="auto-correlogram AR(1) process")

[figure: auto-correlogram AR(1) process, ACF against Lag]

defining lags for irregular sampling grids

[figure: definition of lag class h_kl for locations x_i, x_j, x_k by distance tolerance dh and angular tolerance dφ_l around direction φ_l]

lag-scatter plots of trend surface residuals

> d.w$res.1 <- residuals(r.lm.1)
> hscat(res.1~1, d.w, breaks=seq(0, 80, by=20))

[figure: lagged scatterplots of res.1 for the lag classes (0,20], (20,40], (40,60], (60,80]; correlations reported in the panels: r = 0.272, r = 0.0436, r = 0.333, r = 0.406]
auto-correlation: (co-)variogram of spatial data

• $(k, l)$th lag class, $h_{kl}$, characterized by distance class, $(h_k - dh, h_k + dh]$, and angular class, $(\varphi_l - d\varphi, \varphi_l + d\varphi]$

• $N_{kl}$: number of pairs of locations $(x_i, x_j)$ with $x_j - x_i \approx h_{kl}$

• estimator for covariance for lag class $h_{kl}$:
\[ \hat{\gamma}(h_{kl}) = \frac{1}{N_{kl}} \sum_{(i,j) \in h_{kl}} [y(x_i) - \bar{y}][y(x_j) - \bar{y}] \]
⇒ covariogram

• estimator for (semi-)variance for lag class $h_{kl}$:
\[ \hat{V}(h_{kl}) = \frac{1}{2 N_{kl}} \sum_{(i,j) \in h_{kl}} [y(x_i) - y(x_j)]^2 \]
⇒ (semi-)variogram

auto-correlation: semi-variance

[figure: lag-scatter plot of y(x+1) against y(x); the semi-variance is based on the differences y(x_{i+1}) − y(x_i)]
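
The estimator of the (semi-)variance given above can be sketched in base R for isotropic lag classes (distance classes only, no angular classes). The function name `matheron.variogram()` and the toy data are hypothetical; this is an illustration of the formula, not the implementation used by any package.

```r
# method-of-moments estimator of the semivariance for distance classes;
# coords: matrix of coordinates (one row per location), y: observations,
# breaks: boundaries of the distance classes (h_k - dh, h_k + dh]
matheron.variogram <- function(coords, y, breaks) {
  d <- as.matrix(dist(coords))           # pairwise distances
  sel <- upper.tri(d)                    # use each pair of locations once
  h <- d[sel]
  sq.diff <- outer(y, y, "-")[sel]^2     # [y(x_i) - y(x_j)]^2
  cl <- cut(h, breaks = breaks)          # assign pairs to lag classes
  npairs <- as.vector(table(cl))         # N_k
  data.frame(
    lag.class = levels(cl),
    npairs = npairs,
    lag.dist = as.vector(tapply(h, cl, mean)),
    gamma = as.vector(tapply(sq.diff, cl, sum)) / (2 * npairs)
  )
}

# toy example: 4 equidistant locations on a transect
coords <- cbind(x = c(0, 1, 2, 3), y = 0)
obs <- c(1, 3, 2, 5)
r.toy <- matheron.variogram(coords, obs, breaks = c(0.5, 1.5, 2.5, 3.5))
print(r.toy)
```

For the toy data the lag classes contain 3, 2 and 1 pairs, and the semivariances are 14/6, 5/4 and 16/2, which can be checked by hand.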

variogram of trend surface residuals

> library(georob)
> plot(sample.variogram(residuals(r.lm.1),
+ locations=coordinates(d.w), lag.dist.def=20,
+ estimator="matheron"))

[figure: sample variogram of the residuals, semivariance against lag distance (0 to 400), with annotations sill (variance), nugget and range (scale)]

> r.v <- sample.variogram(residuals(r.lm.1),
+ locations=coordinates(d.w), lag.dist.def=20,
+ max.lag=200, estimator="matheron")
> plot(r.v)
> text(gamma~lag.dist, r.v, labels=npairs, pos=1)

[figure: sample variogram of the residuals up to lag distance 200, semivariance against lag distance, points labelled with the number of pairs per lag class]
> r.v.sph <- fit.variogram.model(r.v,
+ variogram.model="RMspheric",
+ param=c(variance=3000, nugget=1000, scale=100))
> lines(r.v.sph)

[figure: sample variogram with fitted spherical model, semivariance against lag distance, with annotations sill (variance), nugget and range (scale)]

3.4 fitting spatial model by maximum likelihood

> r.georob.1 <- georob(pressure~x+y, d.w,
+ locations=~x+y, variogram.model="RMspheric",
+ param=c(variance=3000, nugget=1000, scale=100),
+ tuning.psi=1000, control=control.georob(ml.method="ML"))
> summary(r.georob.1)
...
Maximized log-likelihood: -458.367
...
Variogram: RMspheric
Estimate Lower Upper
variance 3328.80 1453.89 7621.6
snugget(fixed) 0.00 NA NA
nugget 1236.23 615.98 2481.0
scale 122.95 95.12 158.9

Fixed effects coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 620.3545 17.0634 36.356 < 2e-16
x -1.3256 0.1360 -9.750 2.33e-15
y -1.2061 0.1793 -6.728 2.15e-09
...

> summary(r.lm.1)

Call:
lm(formula = pressure ~ x + y, data = d.w)

Residuals:
Min 1Q Median 3Q Max
-111.989 -50.297 -9.326 48.510 197.986

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 607.77066 7.52219 80.80 <2e-16
x -1.27844 0.06552 -19.51 <2e-16
y -1.13874 0.07739 -14.71 <2e-16

Residual standard error: 62.29 on 82 degrees of freedom
Multiple R-squared: 0.8909, Adjusted R-squared: 0.8882
F-statistic: 334.8 on 2 and 82 DF, p-value: < 2.2e-16

> plot(r.georob.1, lag.dist.def=20, max.lag=200)
> lines(r.v.sph, col="orange")
> legend("topleft", lty=1, col=c("black", "orange"), bty="n",
+ legend=c("ML estimate",
+ "fit of sample variogram"))

[figure: sample variogram with ML estimate (black) and fit of sample variogram (orange), semivariance against lag distance]
> op <- par(mfrow=c(1,2))
> plot(fitted(r.georob.1), residuals(r.georob.1),
+ main="Tukey-Anscombe plot")
> qqnorm(rstandard(r.georob.1, level=0),
+ main="QQnorm regression residuals")
> par(op)

[figure: Tukey-Anscombe plot (residuals(r.georob.1) against fitted(r.georob.1)) and QQnorm regression residuals (Sample Quantiles against Theoretical Quantiles)]

3.5 inference, model building and assessment

• data analysis often leads to a set of equally plausible candidate models that use different sets of covariates and different variograms

⇒ compare fit of candidate models by hypothesis tests taking auto-correlation properly into account

⇒ use established goodness-of-fit criteria (AIC, BIC) to select a "best" model, again taking auto-correlation into account

⇒ use cross-validation to compare the power of candidate models to predict new data

ML fit quadratic trend surface model

> r.georob.2 <- update(r.georob.1, .~.+I(x^2)+I(y^2)+x:y)
> summary(r.georob.2)
...
Maximized log-likelihood: -455.0775
...
Variogram: RMspheric
Estimate Lower Upper
variance 2061.7 786.0 5408.2
snugget(fixed) 0.0 NA NA
nugget 1398.6 694.1 2817.8
scale 103.4 44.6 239.6

Fixed effects coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.111e+02 2.160e+01 28.291 < 2e-16
x -1.168e+00 1.315e-01 -8.883 1.65e-13
y -1.269e+00 1.564e-01 -8.117 5.16e-12
I(x^2) 1.293e-03 9.407e-04 1.375 0.173
I(y^2) -2.324e-03 1.650e-03 -1.408 0.163
x:y 2.293e-03 1.499e-03 1.530 0.130
...

> waldtest(r.georob.2, r.georob.1, test="F")

Wald test

Model 1: pressure ~ x + y + I(x^2) + I(y^2) + x:y
Model 2: pressure ~ x + y
Res.Df Df F Pr(>F)
1 79
2 82 -3 2.7684 0.04713
> step(r.georob.2)

Start: AIC=922.16
pressure ~ x + y + I(x^2) + I(y^2) + x:y

Df AIC Converged
- I(x^2) 1 922.05 1
- I(y^2) 1 922.13 1
<none> 922.16
- x:y 1 922.49 1

Step: AIC=922.05
pressure ~ x + y + I(y^2) + x:y

Df AIC Converged
<none> 922.05
- I(y^2) 1 922.54 1
- x:y 1 924.61 1

Tuning constant: 1000

Fixed effects coefficients:
(Intercept) x y I(y^2)
627.526464 -1.148338 -1.347008 -0.002587
x:y
0.003005

Variogram: RMspheric
variance(fixed) snugget(fixed) nugget(fixed)
2060.4 0.0 1402.1
scale(fixed)
103.8

cross-validating trend surface models

> r.cv.1 <- cv(r.georob.1, seed=5426)
> r.cv.2 <- cv(r.georob.2, seed=5426)
> summary(r.cv.1)

Statistics of cross-validation prediction errors
me mede rmse made qne msse
-11.5675 -14.5255 59.9980 56.5949 60.3714 0.8451
medsse crps
0.3500 33.9145

> summary(r.cv.2)

Statistics of cross-validation prediction errors
me mede rmse made qne msse
-3.7322 6.1160 78.9419 68.2495 73.3545 1.4344
medsse crps
0.6257 43.0219

> op <- palette(rainbow(12))
> plot(y~x, r.cv.1$pred, asp=1, col=subset,
+ main = "cross-validation subsets")
> legend("topleft", pch=1, col=1:10,
+ title="subsets", legend=1:10, bty="n")
> palette(op)

[figure: cross-validation subsets, locations (x, y) coloured by subset 1 to 10]
> op <- par(mfrow=c(2, 2))
> plot(r.cv.1, type="sc"); plot(r.cv.1, type="ta")
> plot(r.cv.1, type="qq"); plot(r.cv.1, type="hist.pit")
> par(op)

[figure: cross-validation diagnostics of r.cv.1 with panels data vs. predictions, Tukey-Anscombe plot (standardized prediction errors against predictions), normal-QQ-plot of standardized prediction errors, histogram PIT-values]

3.6 computing kriging predictions

• workhorse: ⇒ universal (or external-drift) kriging

• prediction of signal $S(x_0)$ at location $x_0$ without measurement
\[ \hat{S}(x_0) = \sum_{i=1}^{n} \kappa_i(x_0) \, y(x_i) \]
where weights $\kappa_i(x_0)$ depend on trend model and variogram

• kriging provides in addition an estimate of the variance of the prediction error $S(x_0) - \hat{S}(x_0)$

> d.w.grid <- expand.grid(
+ x = seq(-240, 190, by= 2.5),
+ y = seq(-150, 140, by= 2.5)
+ )
> r.uk <- predict(r.georob.1, newdata=d.w.grid)
> coordinates(r.uk) <- ~x+y
> gridded(r.uk) <- TRUE
> fullgrid(r.uk) <- TRUE
> summary(r.uk)

Object of class SpatialGridDataFrame
Coordinates:
min max
x -241.25 191.25
y -151.25 141.25
Is projected: NA
proj4string : [NA]
Grid attributes:
cellcentre.offset cellsize cells.dim
x -240 2.5 173
y -150 2.5 117
Data attributes:
pred se lower
Min. : 220.8 Min. :18.95 Min. : 92.89
1st Qu.: 497.2 1st Qu.:33.21 1st Qu.: 426.74
Median : 663.5 Median :38.20 Median : 569.44
Mean : 656.5 Mean :41.49 Mean : 575.19
3rd Qu.: 789.3 3rd Qu.:46.03 3rd Qu.: 709.57
Max. :1119.7 Max. :77.22 Max. :1012.69
upper
Min. : 348.7
1st Qu.: 563.0
Median : 750.3
Mean : 737.8
3rd Qu.: 878.7
Max. :1226.8
> spplot(r.uk, zcol="pred", main="UK prediction")
> spplot(r.uk, zcol="se", main="UK standard error")

[figure: maps "UK prediction" (colour key 200 to 1000) and "UK standard error" (colour key 20 to 80)]

summary section 3

• modelling trend by linear regression model based on insights from exploratory analysis

• modelling residual auto-correlation by variogram

• simultaneous estimation of regression coefficients of trend model and parameters of variogram by (restricted) maximum likelihood

• hypothesis tests for spatial data should take auto-correlation into account

4 some theory on stochastic processes
4.1 stochastic process {S(x, ω)}

• spatial stochastic process $\{S(x, \omega)\}$: collection (= set) of random variables $S(x, \omega)$: $x \in D \subset \mathbb{R}^d$, $\omega \in$ sample space $\Omega$, with a well defined joint distribution

• for a single location, say $x_i \in D$, the value $s_i = s(x_i)$ of a response variable is modelled as particular outcome (= realization = elementary event) $\omega'$ of the random variable $S(x_i, \omega)$, i.e. $s_i = S(x_i, \omega')$

• if we consider all locations $x \in D$ but only a particular outcome $\omega'$ then $s(x) = S(x, \omega')$ is a deterministic function of $x$

4.2 stationary and isotropic stochastic process

• strictly stationary process: joint distributions of arbitrary collections of random variables $\{S(x_1), \ldots, S(x_n)\}$ are invariant to translations by vector $h \in \mathbb{R}^d$ ⇒ $\{S(x_1), \ldots, S(x_n)\}$ and $\{S(x_1 + h), \ldots, S(x_n + h)\}$ have same joint distribution
\[ F(s_1, \ldots, s_n; x_1, \ldots, x_n) = F(s_1, \ldots, s_n; x_1 + h, \ldots, x_n + h) \]

• weakly or second-order stationary process: distributions of arbitrary pairs of random variables $(S(x), S(x + h))$ satisfy
1. $\mathrm{E}[S(x)]$ = constant (independent of $x$)
2. $\mathrm{Cov}[S(x + h), S(x)] = \gamma(h)$ (independent of $x$)
3. $\mathrm{Var}[S(x)]$ = constant (independent of $x$)

⇒ strict stationarity implies weak stationarity

⇒ stationarity required for estimation/prediction with single realization of stochastic process

isotropic stochastic process

• weakly stationary process that is invariant to rotations
\[ \mathrm{Cov}[S(x), S(x + \mathbf{h})] = \gamma(h) \quad \text{with} \quad h = \|\mathbf{h}\| = \sqrt{\sum_{i=1}^{d} h_i^2} \]

⇒ unless stated otherwise, only isotropic and weakly stationary processes are considered

4.3 Gaussian stochastic process

• all finite-dimensional joint distributions are multivariate normal
\[ F(s_1, \ldots, s_n; x_1, \ldots, x_n) \sim \mathcal{N}(\mu, \Sigma) \]
with mean vector $\mu^T = (\mu_1, \ldots, \mu_n)$; $\mu_i = \mathrm{E}[S(x_i)]$
and covariance matrix with elements $[\Sigma]_{ij} = \mathrm{Cov}[S(x_i), S(x_j)]$

• joint multivariate normal density ($s^T = (s_1, \ldots, s_n)$)
\[ f(s) = (2\pi)^{-n/2} \det(\Sigma)^{-1/2} \exp\left(-\frac{1}{2} (s - \mu)^T \Sigma^{-1} (s - \mu)\right) \]

• weakly stationary Gaussian process is also strictly stationary


4.4 covariance function and variogram

• definition of variogram $V(h)$ and covariance function $\gamma(h)$
\[ V(h) = \frac{1}{2} \mathrm{Var}[S(x + h) - S(x)] \qquad \gamma(h) = \mathrm{Cov}[S(x + h), S(x)] \]

• relation between variogram and covariance function
\[ V(h) = \gamma(0) - \gamma(h) \quad \text{with} \quad \gamma(0) = \mathrm{Var}[S(x)] \]

• relation between correlogram and covariance function
\[ \rho(h) = \frac{\gamma(h)}{\gamma(0)} \]

• relation between variogram and correlogram
\[ V(h) = \gamma(0) \, (1 - \rho(h)) \]

• symmetry
\[ V(h) = V(-h) \qquad \gamma(h) = \gamma(-h) \qquad \rho(h) = \rho(-h) \]

[figure: covariance function, correlogram and variogram (co- or semivariance against distance) with sill = γ(0) and range marked]
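
The relation between variogram and covariance function stated above follows directly from the definitions, assuming weak stationarity so that $\mathrm{Var}[S(x + h)] = \mathrm{Var}[S(x)] = \gamma(0)$. A short derivation:

```latex
\begin{align*}
V(h) &= \tfrac{1}{2} \mathrm{Var}[S(x + h) - S(x)] \\
     &= \tfrac{1}{2} \left\{ \mathrm{Var}[S(x + h)] + \mathrm{Var}[S(x)]
        - 2\, \mathrm{Cov}[S(x + h), S(x)] \right\} \\
     &= \tfrac{1}{2} \left\{ \gamma(0) + \gamma(0) - 2 \gamma(h) \right\}
      = \gamma(0) - \gamma(h), \\
\frac{V(h)}{\gamma(0)} &= 1 - \frac{\gamma(h)}{\gamma(0)} = 1 - \rho(h).
\end{align*}
```

The last line is the relation between variogram and correlogram after multiplication by $\gamma(0)$.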

covariance functions must be positive definite

• consider weighted sum $\sum_{i=1}^{n} a_i S(x_i) = a^T S$ of arbitrary set of random variables $S^T = (S(x_1), \ldots, S(x_n))$ with $a^T = (a_1, \ldots, a_n)$ arbitrary real weights

• clearly
\[ \mathrm{Var}[a^T S] = a^T \, \mathrm{Cov}[S, S^T] \, a = a^T \Sigma a \ge 0 \]

⇒ covariance matrix $\Sigma$ must be positive definite (more exactly: non-negative definite)

⇒ condition imposes restrictions on parametric covariance functions $\gamma(h)$ (and variograms $V(h)$)
\[ [\Sigma]_{ij} = \mathrm{Cov}[S(x_i), S(x_j)] = \gamma(x_i - x_j) \]

⇒ $\gamma(h)$ must be a positive definite function

4.5 smoothness of stochastic processes

• $\{S(x)\}$ is mean square continuous
\[ \mathrm{E}\left[ (S(x + h) - S(x))^2 \right] \to 0 \quad \text{as} \quad h \to 0 \]
if $V(h) \to 0$ and $\gamma(h) \to \gamma(0)$ for $h = \|\mathbf{h}\| \to 0$

• $\{S(x)\}$ is (once) mean square differentiable with 1st derivative process $\{S'(x)\}$
\[ \mathrm{E}\left[ \left( \frac{S(x + h) - S(x)}{h} - S'(x) \right)^2 \right] \to 0 \quad \text{as} \quad h \to 0 \]
if $\gamma(h)$ (and $V(h)$) is twice differentiable at $h = 0$

• higher order mean square derivatives analogously defined
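
The requirement that a covariance function be positive definite can be checked numerically for a given set of locations by inspecting the eigenvalues of the covariance matrix. The sketch below (base R, hypothetical function names) contrasts the spherical covariance with an indicator function that is deliberately chosen as an invalid model.

```r
# spherical covariance function with gamma(0) = 1 (valid for d <= 3)
cov.spheric <- function(h, alpha) {
  ifelse(h <= alpha, 1 - 3 * h / (2 * alpha) + h^3 / (2 * alpha^3), 0)
}
# indicator function: NOT a valid covariance function (illustration only)
cov.indicator <- function(h, alpha) ifelse(h <= alpha, 1, 0)

x <- c(0, 0.6, 1.2)                  # three locations on a transect
h <- as.matrix(dist(x))              # matrix of lag distances

Sigma.sph <- cov.spheric(h, alpha = 1)
Sigma.ind <- cov.indicator(h, alpha = 1)

ev.sph <- eigen(Sigma.sph, symmetric = TRUE, only.values = TRUE)$values
ev.ind <- eigen(Sigma.ind, symmetric = TRUE, only.values = TRUE)$values
print(ev.sph)                        # all eigenvalues positive
print(ev.ind)                        # smallest eigenvalue negative

# weights that give a negative "variance" a' Sigma a for the invalid model
a <- c(1, -1, 1)
q.ind <- drop(t(a) %*% Sigma.ind %*% a)
print(q.ind)
```

For the indicator function the weights a = (1, -1, 1) yield a' Σ a = 3 - 4 = -1, which cannot be a variance.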
smoothness of stochastic processes

• mean square continuity and differentiability in general not sufficient for continuity and differentiability of single realizations $s(x)$ of $\{S(x)\}$

• only for Gaussian processes conditions for mean square continuity/differentiability and continuity and differentiability of realizations equivalent

• if $V(h)$ is $2m$ times differentiable at $h = 0$ then the realizations of the associated Gaussian process have derivatives up to order $m$

• example: Taylor series of exponential and Gaussian variograms
exponential: $1 - \exp(-h) = h - h^2/2 + h^3/6 - h^4/24 + \ldots$
Gaussian: $1 - \exp(-h^2) = h^2 - h^4/2 + h^6/6 - h^8/24 + \ldots$

⇒ $V(h)$ not differentiable at $h = 0$ for exponential (leading term linear in $h = \|\mathbf{h}\|$) but infinitely many times for Gaussian variogram

4.6 examples of isotropic covariance functions

preliminary remark: all models can be used for variograms as well by the relation $V(h) = \gamma(0) - \gamma(h)$; ⇒ in the sequel $\gamma(0) = 1$

nugget effect covariance: models absence of auto-correlation
\[ \gamma(h) = \begin{cases} 1 & \text{if } h = 0 \\ 0 & \text{otherwise} \end{cases} \qquad V(h) = \begin{cases} 0 & \text{if } h = 0 \\ 1 & \text{otherwise} \end{cases} \]

⇒ $\{S(x)\}$ spatial white noise process ($p(u)$ = constant)

• mechanism: measurement error and small-scale spatial variation

• valid for all dimensions $d$ of study region

• Gaussian $\{S(x)\}$ with nugget effect covariance function has non-continuous realizations

• see ?RMnugget (package RandomFields)

> library(RandomFields); RFoptions(spConform=FALSE)
> x1 <- seq(0, 15, length=301)
> set.seed(1)
> plot(x1, RFsimulate(RMnugget(var=1), x=x1), type="l")

[figure: covariance function and variogram of nugget effect model (co- or semivariance against lag h) and simulated realization s(x_1) against x_1]
> x2 <- seq(0, 7.5, length=151)
> RFoptions(spConform=TRUE)
> plot(RFsimulate(RMnugget(var=1), x=x1, y=x2, grid=TRUE))

[figure: simulated realization of nugget effect model on the grid (x_1, x_2)]

examples of isotropic covariance functions

Whittle-Matérn covariance
\[ \gamma(h) = \frac{2^{1-\nu}}{\Gamma(\nu)} \left( \sqrt{2\nu} \, \frac{h}{\alpha} \right)^{\nu} K_\nu\!\left( \sqrt{2\nu} \, \frac{h}{\alpha} \right) \]
where $\alpha > 0$ is the range (scale) and $\nu > 0$ the smoothness parameter, $K_\nu$ is the modified Bessel function of the second kind of order $\nu$

• valid for all $d$

• Gaussian $\{S(x)\}$ with Whittle-Matérn covariance have $m$ times differentiable realizations where $m$ is largest integer with $m < \nu$

• special cases
$\nu = 1/2$: $\gamma(h) = \exp(-h/\alpha)$, $m = 0$
$\nu = 3/2$: $\gamma(h) = (1 + \sqrt{3}\, h/\alpha) \exp(-\sqrt{3}\, h/\alpha)$, $m = 1$
$\nu = 5/2$: $\gamma(h) = (1 + \sqrt{5}\, h/\alpha + 5 h^2/(3\alpha^2)) \exp(-\sqrt{5}\, h/\alpha)$, $m = 2$

• see ?RMmatern (package RandomFields)
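
The general Whittle-Matérn formula with the factor $\sqrt{2\nu}$ can be evaluated in base R with `besselK()`. The sketch below uses a hypothetical function `cov.matern()` and compares it with the closed forms that follow from this formula for ν = 1/2, 3/2 and 5/2; the lag values and α = 2 are arbitrary choices for illustration.

```r
# Whittle-Matern covariance with gamma(0) = 1, factor sqrt(2 * nu) included
cov.matern <- function(h, alpha, nu) {
  u <- sqrt(2 * nu) * h / alpha
  g <- 2^(1 - nu) / gamma(nu) * u^nu * besselK(u, nu)
  g[h == 0] <- 1                     # limit of the expression for h -> 0
  g
}

h <- seq(0, 10, by = 0.5)
alpha <- 2

# closed forms for half-integer smoothness
g.05 <- exp(-h / alpha)
g.15 <- (1 + sqrt(3) * h / alpha) * exp(-sqrt(3) * h / alpha)
g.25 <- (1 + sqrt(5) * h / alpha + 5 * h^2 / (3 * alpha^2)) *
  exp(-sqrt(5) * h / alpha)

print(all.equal(cov.matern(h, alpha, nu = 0.5), g.05))
print(all.equal(cov.matern(h, alpha, nu = 1.5), g.15))
print(all.equal(cov.matern(h, alpha, nu = 2.5), g.25))
```

Other parametrizations of the model omit the factor $\sqrt{2\nu}$, in which case the closed forms change accordingly.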

> RFoptions(spConform=FALSE)
> plot(x1, RFsimulate(RMmatern(var=1, scale=3, nu=0.5),
+ x=x1), type="l")
> lines(x1, RFsimulate(RMmatern(var=1, scale=2.7,
+ nu=1.5), x=x1), col=2)
> lines(x1, RFsimulate(RMmatern(var=1, scale=2.6,
+ nu=2.5), x=x1), col=3)

[figure: Whittle-Matérn covariance functions and variograms for smoothness ν = 1/2, ν = 3/2, ν = 5/2 (co- or semivariance against lag h) and simulated realizations s(x_1) against x_1 for ν = 1/2, ν = 3/2, ν = 5/2]
> RFoptions(spConform=TRUE)
> plot(RFsimulate(RMmatern(var=1, scale=3,
+ nu=0.5), x=x1, y=x2, grid=TRUE))

[figure: simulated realization of Whittle-Matérn model with ν = 1/2 on the grid (x_1, x_2)]

> RFoptions(spConform=TRUE)
> plot(RFsimulate(RMmatern(var=1, scale=2.7,
+ nu=1.5), x=x1, y=x2, grid=TRUE))

[figure: simulated realization of Whittle-Matérn model with ν = 3/2 on the grid (x_1, x_2)]

> RFoptions(spConform=TRUE)
> plot(RFsimulate(RMmatern(var=1, scale=2.6,
+ nu=2.5), x=x1, y=x2, grid=TRUE))

[figure: simulated realization of Whittle-Matérn model with ν = 5/2 on the grid (x_1, x_2)]

examples of isotropic covariance functions

powered exponential or stable covariance
\[ \gamma(h) = \exp(-(h/\alpha)^{\nu}) \]
where $\alpha > 0$ is the range (scale) and $0 < \nu \le 2$ the smoothness parameter

• valid for all $d$

• Gaussian $\{S(x)\}$ with stable covariance with $\nu = 2$ (Gaussian covariance) have realizations that are an infinite number of times differentiable but non-differentiable for $\nu < 2$

• see ?RMstable (package RandomFields; the smoothness parameter $\nu$ is passed as argument alpha)
> RFoptions(spConform=FALSE)
> plot(x1, RFsimulate(RMstable(var=1, scale=1.6, alpha=0.7),
+ x=x1), type="l")
> lines(x1, RFsimulate(RMstable(var=1, scale=2.5, alpha=1),
+ x=x1), col=2)
> lines(x1, RFsimulate(RMstable(var=1, scale=4.3, alpha=2),
+ x=x1), col=3)

[figure: stable covariance functions and variograms for smoothness ν = 0.7, ν = 1, ν = 2 (co- or semivariance against lag h) and simulated realizations s(x_1) against x_1 for ν = 0.7, ν = 1, ν = 2]

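examples of isotropic covariance functions

spherical covariance family with compact support: $\gamma(h) = 0$ for $h > \alpha$

• generated by computing average of spatial white noise within a moving ball with radius $\alpha/2$ in $\mathbb{R}^d$

• models for $d \le 3$

$d = 1$, triangle or tent:
\[ \gamma(h) = \begin{cases} 1 - \dfrac{h}{\alpha} & \text{if } h \le \alpha \\ 0 & \text{otherwise} \end{cases} \]

$d = 2$, circular:
\[ \gamma(h) = \begin{cases} \dfrac{2}{\pi} \left( \arccos\!\left(\dfrac{h}{\alpha}\right) - \dfrac{h}{\alpha} \sqrt{1 - \dfrac{h^2}{\alpha^2}} \right) & \text{if } h \le \alpha \\ 0 & \text{otherwise} \end{cases} \]

$d = 3$, spherical:
\[ \gamma(h) = \begin{cases} 1 - \dfrac{3h}{2\alpha} + \dfrac{h^3}{2\alpha^3} & \text{if } h \le \alpha \\ 0 & \text{otherwise} \end{cases} \]

• all models with non-differentiable Gaussian realizations

• "moving average" covariance functions exist also for $d > 3$

• see ?RMtent, ?RMcircular, ?RMspheric (package RandomFields)

[figure: tent (triangle), circular and spherical covariance functions, co- or semivariance against lag h]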
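
The three members of the spherical family given above are easy to code directly. The following base R sketch (hypothetical function names, α = 5 chosen as in the simulation example) evaluates them on a common set of lags and checks a few values that can be verified by hand.

```r
# tent (d = 1), circular (d = 2) and spherical (d = 3) covariance, gamma(0) = 1
cov.tent <- function(h, alpha) ifelse(h <= alpha, 1 - h / alpha, 0)

cov.circular <- function(h, alpha) {
  u <- pmin(h / alpha, 1)            # keeps acos() and sqrt() inside their domain
  ifelse(h <= alpha, 2 / pi * (acos(u) - u * sqrt(1 - u^2)), 0)
}

cov.spheric <- function(h, alpha) {
  ifelse(h <= alpha, 1 - 3 * h / (2 * alpha) + h^3 / (2 * alpha^3), 0)
}

alpha <- 5
h <- seq(0, 15, by = 0.5)
r.cov <- data.frame(
  h = h,
  tent = cov.tent(h, alpha),
  circular = cov.circular(h, alpha),
  spherical = cov.spheric(h, alpha)
)
print(head(r.cov, 12))
```

All three functions equal 1 at h = 0 and vanish for h ≥ α; at h = α/2 the tent model gives 0.5 and the spherical model 1 - 3/4 + 1/16 = 0.3125.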
> RFoptions(spConform=FALSE)
> plot(x1, RFsimulate(RMtent(var=1, scale=5), x=x1),
+ type="l")
> lines(x1, RFsimulate(RMcircular(var=1, scale=5), x=x1),
+ col=2)
> lines(x1, RFsimulate(RMspheric(var=1, scale=5), x=x1),
+ col=3)

[figure: simulated realizations s(x_1) against x_1 for tent, circular and spherical covariance]

examples of isotropic covariance functions

compact support covariance functions with differentiable Gaussian realizations

• cubic covariance: see ?RMcubic (package RandomFields); valid for d ≤ 3; twice differentiable

• penta covariance: see ?RMpenta (package RandomFields); valid for d ≤ 3; 4 times differentiable

• Gneiting covariance: see ?RMgneiting (package RandomFields); valid for d ≤ 3; 6 times differentiable

> RFoptions(spConform=FALSE)
> plot(x1, RFsimulate(RMcubic(var=1, scale=5),
+ x=x1), type="l")
> lines(x1, RFsimulate(RMpenta(var=1, scale=5),
+ x=x1), col=2)
> lines(x1, RFsimulate(RMgneiting(var=1, scale=3.01),
+ x=x1), col=3)

[figure: cubic, penta and Gneiting covariance functions and variograms (co- or semivariance against lag h) and simulated realizations s(x_1) against x_1 for cubic, penta and Gneiting covariance]
4.7 anisotropic covariance functions ↑↓ 161

• for an anisotropic covariance function in general

$$\gamma(h) \neq \gamma(\|h\|)$$

• particular application: covariance functions for weakly stationary space-time data s(x_i, t_j) (e.g. Gneiting and Guttorp, 2010b)

$$\mathrm{Cov}[S(x + h, t + u), S(x, t)] = \gamma(h, u)$$

• valid weakly stationary covariance functions for such zonally anisotropic stochastic processes difficult to construct in a general manner

• approaches

1. geometrically anisotropic covariance function
2. product-sum covariance function

geometrically anisotropic covariance function ↑↓ 162

• idea: rotate and stretch/shrink components of x such that the stochastic process is isotropic in the transformed coordinate system

$$\gamma(h^{*}) = \gamma\left(\sqrt{(A h)^T A h}\right)$$

⇒ iso-covariance contours are ellipsoids in space of untransformed coordinates and are mapped to unit sphere in R^d by transformation

• example geometrically anisotropic covariance in R^2

$$A = \begin{pmatrix} 1/\alpha & 0 \\ 0 & 1/(f_1 \alpha) \end{pmatrix} \begin{pmatrix} \cos(\omega) & \sin(\omega) \\ -\sin(\omega) & \cos(\omega) \end{pmatrix}, \qquad 0 < f_1 \le 1$$

[Figure: elliptical iso-covariance contour in the (h_1, h_2) plane with semi-axes α and f_1 α, rotated by the angle ω]
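As a small numerical check of the transformation for R^2, the base R sketch below builds A from α, f_1 and ω exactly as in the matrix product above and verifies that the end points of both semi-axes of the ellipse are mapped to lag distance 1. The helper names `aniso_matrix` and `iso_lag` and the parameter values are hypothetical; the angle convention is simply the one implied by the matrix as written here and need not match the convention of any particular package.

```r
# Coordinate transformation for geometric anisotropy in R^2
# alpha: range along the major axis, f1: ratio of minor to major axis,
# omega: rotation angle in degrees (convention of the matrix on the slide)
aniso_matrix <- function(alpha, f1, omega) {
  w <- omega * pi / 180
  scaling  <- diag(c(1 / alpha, 1 / (f1 * alpha)))
  rotation <- matrix(c(cos(w), sin(w), -sin(w), cos(w)), 2, 2, byrow = TRUE)
  scaling %*% rotation
}

# isotropic lag distance sqrt((A h)^T (A h)) of a lag vector h
iso_lag <- function(h, A) sqrt(sum((A %*% h)^2))

A <- aniso_matrix(alpha = 4, f1 = 0.5, omega = 30)

# end points of the semi-axes of the ellipse with semi-axes 4 and 0.5 * 4
w <- 30 * pi / 180
major <- 4 * c(cos(w), sin(w))
minor <- 2 * c(-sin(w), cos(w))

# both are mapped onto the unit circle
c(major = iso_lag(major, A), minor = iso_lag(minor, A))
```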

> library(geoR)
> library(georob)
> d.elevation <- as.data.frame(elevation)
> class(d.elevation) <- "data.frame"
> d.elevation$res <- residuals(lm(data~y, d.elevation))
> plot(y~x, d.elevation, cex=sqrt(abs(res)), asp=1,
+   col=c("blue", NA, "orange")[sign(res)+2])

[Figure: bubble plot of the residuals at the sampling locations, y vs. x]

> r.sv <- sample.variogram(d.elevation$res,
+   locations=as.matrix(d.elevation[, c("x","y")]),
+   lag.dist.def=0.5, xy.angle.def=c(0, 45, 135, 180))
> summary(r.sv)

Sample variogram estimator: qn

Summary of lag distances
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.3812  2.2240  4.2280  4.1290  6.1760  8.2760

Summary of number of pairs per lag and distance classes
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   1.00   11.00   45.00   40.18   63.00   73.00

Angle classes in xy-plane:  (-45,45] (45,135]
Angle classes in xz-plane:  [0,180]

> plot(r.sv, type="l")
> (r.exp <- fit.variogram.model(r.sv,
+   variogram.model="RMexp",
+   param=c(variance=1500, nugget=10, scale=4),
+   aniso=c(f1=0.4, f2=1, omega=0, phi=90, zeta=0),
+   fit.aniso=c(f1=TRUE, f2=FALSE, omega=FALSE, phi=FALSE,
+   zeta=FALSE), min.npairs=1))
> lines(r.exp, xy.angle=c(0, 90))

Variogram: RMexp
      variance snugget(fixed)         nugget          scale
     1.861e+03      0.000e+00      7.240e-05      3.176e+00

            f1      f2(fixed)   omega(fixed)     phi(fixed)    zeta(fixed)
        0.3151         1.0000         0.0000        90.0000         0.0000

[Figure: directional sample variogram, semivariance vs. lag distance, for the xy.angle classes (−45,45] and (45,135]]

[Figure: directional sample variogram with fitted anisotropic model, semivariance vs. lag distance, for the xy.angle classes (−45,45] and (45,135]]

an example for zonal anisotropy ↑↓ 168

temporal change of soil water storage in a forest (Jost et al., 2005)

measurement of soil water storage biweekly during growing season 2000 and 2001
an example for zonal anisotropy ↑↓ 169

zonal space-time sample variogram

[Figure: space-time sample variogram surface; semivariance [mm2] vs. space lag [m] and time lag [days]]

an example for zonal anisotropy ↑↓ 170

product-sum covariance model fitted to space-time sample variogram

$$\gamma(h, u) = a_0 \, \gamma_x(h) \, \gamma_t(u) + a_1 \, \gamma_x(h) + a_2 \, \gamma_t(u)$$

[Figure: fitted product-sum model; semivariance vs. space lag [m] for different time lags]

summary section 4 ↑↓ 171

• stochastic process: generalization of multidimensional random variable

• stationarity assumption required for estimation from single realization of stochastic process

• in practice assumption of weak stationarity:

1. constant mean
2. constant variance
3. covariance and semivariance depend only on lag distance but not on location

• often additional assumption of isotropic auto-correlation

• Gaussian stochastic process: all joint and conditional distributions are normal

summary section 4 ↑↓ 172

• relation between covariance and semi-variance for weakly stationary processes

$$V(h) = \gamma(0) - \gamma(h)$$

• valid covariance functions must be positive definite (some models are valid only for one- or two-dimensional space)

• variogram generally preferred over covariance function

• shape of variogram close to origin controls smoothness of realizations of Gaussian processes:

1. variogram with nugget: realizations non-continuous
2. variogram grows linearly at origin: realizations continuous but not everywhere differentiable
3. variogram grows at least quadratically at origin: realizations everywhere at least once differentiable
summary section 4 ↑↓ 173

• geometrically anisotropic auto-correlation:

1. iso-semivariance surfaces are ellipsoids
2. variogram has same sill and nugget for all directions but direction-dependent range
3. modelled by linear transformation of spatial coordinates (stretching and rotation)

• zonal anisotropy:

1. also nugget and sill depend on direction
2. modelling more demanding

References

Chilès, J.-P. and Delfiner, P. (1999). Geostatistics: Modeling Spatial Uncertainty. John Wiley & Sons, New York.

Diggle, P. J. and Ribeiro, Jr., P. J. (2007). Model-based Geostatistics. Springer, New York.

Gneiting, T. and Guttorp, P. (2010a). Continuous parameter stochastic process theory. In A. E. Gelfand, P. J. Diggle, M. Fuentes, and P. Guttorp, editors, Handbook of Spatial Statistics, pages 17–28. CRC Press.

Gneiting, T. and Guttorp, P. (2010b). Continuous parameter spatio-temporal processes. In A. E. Gelfand, P. J. Diggle, M. Fuentes, and P. Guttorp, editors, Handbook of Spatial Statistics, pages 427–436. CRC Press.

Jost, G., Heuvelink, G. B. M., and Papritz, A. (2005). Analysing the space-time distribution of soil water storage of a forest ecosystem using spatio-temporal kriging. Geoderma, 128(3–4), 258–273.

5 ad-hoc estimation of parameters of model for spatial data
pro memoria: model for Gaussian spatial data ↑↓ 177

• model for data: $Y_i = S(x_i) + Z_i = \mu(x_i) + E(x_i) + Z_i$ where

Y_i        i-th datum
S(x_i)     “signal” (= true quantity) at location x_i
µ(x_i)     trend
{E(x_i)}   a zero mean Gaussian process, parametrized by covariance function γ(h; θ) or variogram V(h; θ)
Z_i        iid Gaussian measurement error with variance τ^2

• trend µ(x_i) modelled by linear regression model

$$\mu(x_i) = \sum_k d_k(x_i) \beta_k = d(x_i)^T \beta$$

with d_k(x_i) denoting (spatial) covariates

• unknown elements of model:

1. structure and parameters β of trend model
2. covariance (or variogram) parameters θ
3. nugget variance τ^2

pro memoria: steps of a geostatistical analysis ↑↓ 178

exploratory analysis
⇓
trend estimation by linear regression analysis
⇒ estimate of trend parameters β
⇓
modelling residual auto-correlation:
computing sample variogram of residuals; fitting model function to it
⇒ estimate of variogram parameters θ and of nugget variance τ^2
⇓
statistical inference
⇓
model assessment by cross-validation
⇓
computing spatial predictions

5.1 ordinary least squares trend estimation ↑↓ 179

• Gaussian model in vector notation $Y = X\beta + E + Z$

• estimation of trend parameters β by ordinary least squares

$$\hat{\beta}_{\mathrm{OLS}} = (X^T X)^{-1} X^T Y$$

• for spatially uncorrelated data ($E = 0$; $\mathrm{Cov}[Y, Y^T] = \tau^2 I$)

$$\hat{\beta}_{\mathrm{OLS}} \sim N\left(\beta, \; \tau^2 (X^T X)^{-1}\right)$$

• for spatially auto-correlated data ($\mathrm{Cov}[Y, Y^T] = \mathrm{Cov}[Z, Z^T] + \mathrm{Cov}[E, E^T] = \Gamma_\theta = \tau^2 I + \Sigma_\theta$)

$$\hat{\beta}_{\mathrm{OLS}} \sim N\left(\beta, \; \tau^2 (X^T X)^{-1} + (X^T X)^{-1} X^T \Sigma_\theta X (X^T X)^{-1}\right)$$

⇒ ignoring auto-correlation: $\hat{\beta}_{\mathrm{OLS}}$ unbiased, but standard errors too small ⇒ tests based on OLS fit biased!

generalized least squares trend estimation ↑↓ 180

• generalized least squares estimates

$$\hat{\beta}_{\mathrm{GLS}} = (X^T \Gamma_\theta^{-1} X)^{-1} X^T \Gamma_\theta^{-1} Y$$

• GLS = OLS with “orthogonalized” data $\tilde{Y} = L^{-1} Y$ and design matrix $\tilde{X} = L^{-1} X$ where $\Gamma_\theta = L L^T$

$$\hat{\beta}_{\mathrm{GLS}} = (\tilde{X}^T \tilde{X})^{-1} \tilde{X}^T \tilde{Y}$$

• sampling distribution

$$\hat{\beta}_{\mathrm{GLS}} \sim N\left(\beta, \; (X^T \Gamma_\theta^{-1} X)^{-1}\right)$$

• for spatially uncorrelated data ($\Gamma_\theta = \tau^2 I$) $\hat{\beta}_{\mathrm{GLS}} = \hat{\beta}_{\mathrm{OLS}}$

• $\hat{\beta}_{\mathrm{GLS}}$ has among all linear unbiased estimators smallest standard errors (Gauss-Markov theorem)
⇒ BLUE (Best Linear Unbiased Estimator)

• $\hat{\beta}_{\mathrm{GLS}}$ is maximum likelihood estimate for Gaussian Y
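The relations between OLS and GLS can be checked numerically. The base R sketch below simulates data with an exponential covariance plus nugget (all numerical values are arbitrary choices for the illustration, not taken from the course data), computes the naive and the auto-correlation-adjusted standard errors of the OLS estimate, and confirms that the direct GLS formula and OLS on the "orthogonalized" data give the same coefficients.

```r
set.seed(1)
n <- 50
coords <- cbind(x = runif(n, 0, 10), y = runif(n, 0, 10))
X <- cbind(1, coords)                       # design matrix with intercept
beta <- c(10, 1, -0.5)

# Gamma = tau^2 I + Sigma with an exponential covariance (illustrative values)
D <- as.matrix(dist(coords))
Sigma <- 4 * exp(-D / 3)
Gamma <- Sigma + 1 * diag(n)

# simulate Y = X beta + E + Z
L <- t(chol(Gamma))                         # lower triangular, Gamma = L L^T
Y <- drop(X %*% beta + L %*% rnorm(n))

# OLS estimate, naive standard errors and standard errors that
# account for the auto-correlation
XtX.inv <- solve(crossprod(X))
beta.ols <- drop(XtX.inv %*% crossprod(X, Y))
res <- Y - X %*% beta.ols
tau2.hat <- sum(res^2) / (n - ncol(X))
se.naive <- sqrt(diag(tau2.hat * XtX.inv))
P <- XtX.inv %*% t(X)
se.adjusted <- sqrt(diag(P %*% Gamma %*% t(P)))

# GLS: direct formula ...
Gamma.inv <- solve(Gamma)
cov.gls <- solve(t(X) %*% Gamma.inv %*% X)
beta.gls <- drop(cov.gls %*% t(X) %*% Gamma.inv %*% Y)
# ... and OLS on orthogonalized data
Y.tilde <- forwardsolve(L, Y)
X.tilde <- forwardsolve(L, X)
beta.gls.chol <- unname(coef(lm(Y.tilde ~ X.tilde - 1)))

rbind(ols = beta.ols, gls = beta.gls, gls.chol = beta.gls.chol)
rbind(se.naive = se.naive, se.adjusted = se.adjusted,
      se.gls = sqrt(diag(cov.gls)))
```

In line with the Gauss-Markov theorem, the GLS standard errors cannot exceed the adjusted OLS standard errors when the same Γ is used for both.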
example: trend estimation Wolfcamp data ↑↓ 181

customary OLS fit (ignoring auto-correlation)

> library(sp)
> library(geoR)
> library(gstat)
> d.w <- as.data.frame(wolfcamp)
> class(d.w) <- "data.frame"
> colnames(d.w) <- c("x", "y", "pressure")
> coordinates(d.w) <- ~x+y
> r.ols <- lm(pressure~x+y, d.w)
> (se.ols.1 <- sqrt(diag(vcov(r.ols))))

(Intercept)           x           y
 7.52218651  0.06552440  0.07738785

> # t-values
> coef(r.ols) / se.ols.1

(Intercept)           x           y
   80.79707   -19.51093   -14.71473

example: trend estimation Wolfcamp data ↑↓ 181

standard errors of $\hat{\beta}_{\mathrm{OLS}}$ when auto-correlation is taken into account (using REML estimates of θ and τ^2)

> library(RandomFields)
> Gamma <- RFcovmatrix(
+   RMspheric(var=4370, scale=139)+RMnugget(var=1153),
+   x=coordinates(d.w))
> X <- model.matrix(r.ols)
> tmp <- solve(crossprod(X)) %*% t(X)
> (se.ols.2 <- sqrt(diag(tmp %*% Gamma %*% t(tmp))))

(Intercept)           x           y
 22.5365079   0.1950825   0.2411611

> # t-values
> coef(r.ols) / se.ols.2

(Intercept)           x           y
  26.968271   -6.553340   -4.721909

example: trend estimation Wolfcamp data ↑↓ 183

generalized least squares estimate and standard errors

> L.inv <- solve(t(chol(Gamma)))
> y.tilde <- L.inv %*% d.w$pressure
> X.tilde <- L.inv %*% X
> r.gls <- lm(y.tilde~X.tilde-1)
> rbind(ols=coef(r.ols), gls=coef(r.gls))

    (Intercept)         x         y
ols    607.7707 -1.278442 -1.138741
gls    624.3471 -1.329099 -1.180178

> rbind(se.ols.1=se.ols.1, se.ols.2=se.ols.2,
+   se.gls=sqrt(diag(vcov(r.gls))))

         (Intercept)         x          y
se.ols.1    7.522187 0.0655244 0.07738785
se.ols.2   22.536508 0.1950825 0.24116113
se.gls     20.812750 0.1612431 0.21161369

5.2 computing sample variogram of residuals ↑↓ 184

• extract residuals $R = Y - X\hat{\beta}$ of fitted linear model (or use data Y if model has constant µ(x))

• choose bin width dh (and width of angular classes dφ) to define (k, l)th lag class, h_kl, characterized by distance, (h_k − dh, h_k + dh] (and angular class, (φ_l − dφ, φ_l + dφ])

[Figure: definition of the lag class h_kl around location x_i by a distance bin of half-width dh and an angular class of half-width dφ_l around direction φ_l; x_j falls into the class, x_k does not]
computing sample variogram of residuals ↑↓ 185

• form all N_kl pairs (i, j) with x_i − x_j ≈ h_kl and compute for each lag class h_kl the semivariance

$$\hat{V}(h_{kl}) = \frac{1}{2 N_{kl}} \sum_{(i,j) \in h_{kl}} \left[ R(x_i) - R(x_j) \right]^2$$

• sample variogram plot of $\hat{V}(h_{kl})$ vs. h_kl

• rules of thumb:

1. choose dh (and dφ) such that N_kl > 30 − 50
2. largest h_kl ≤ 0.5 max(||x_i − x_j||)

example: sample variogram Wolfcamp data ↑↓ 186

> library(georob)
> r.sv.5 <- sample.variogram(residuals(r.ols),
+   locations=coordinates(d.w), lag.dist.def=5,
+   max.lag=200, estimator="matheron")
> plot(r.sv.5, main="lag class width=5 km")
> text(gamma~lag.dist, r.sv.5, labels=npairs, pos=3)
> ...
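To make the estimator concrete, here is a minimal base R sketch of the method-of-moments (Matheron) sample variogram for the isotropic case, i.e. without angular classes. The function `sample_variogram` is a hypothetical helper written for this illustration; it is not the implementation used by georob, and `max.lag` is assumed to be a multiple of `lag.width`.

```r
# Matheron sample variogram, isotropic case
# res: residuals, coords: matrix of coordinates,
# lag classes are the intervals (0, w], (w, 2w], ..., up to max.lag
sample_variogram <- function(res, coords, lag.width, max.lag) {
  d <- as.matrix(dist(coords))
  # each pair (i, j) once, restricted to lags up to max.lag
  pairs <- which(upper.tri(d) & d <= max.lag, arr.ind = TRUE)
  h <- d[pairs]
  sq.diff <- (res[pairs[, 1]] - res[pairs[, 2]])^2
  bin <- cut(h, breaks = seq(0, max.lag, by = lag.width))
  npairs <- as.vector(table(bin))
  data.frame(
    lag.dist = as.vector(tapply(h, bin, mean)),      # mean lag per class
    gamma    = as.vector(tapply(sq.diff, bin, sum)) / (2 * npairs),
    npairs   = npairs
  )
}

# illustration with uncorrelated residuals of variance 1:
# the semivariance should fluctuate around 1 for all lag classes
set.seed(2)
coords <- cbind(runif(200, 0, 100), runif(200, 0, 100))
res <- rnorm(200)
sv <- sample_variogram(res, coords, lag.width = 10, max.lag = 50)
sv
```

The column `npairs` makes it easy to check the first rule of thumb before interpreting the estimates.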

[Figure: sample variograms of the OLS residuals for lag class widths of 5, 10, 20 and 30 km; semivariance vs. lag distance, points labelled with the number of pairs]

fitting variogram model to sample variogram ↑↓ 188

• semivariance required for arbitrary lag distances when computing predictions
⇒ smoothing sample variogram by fitting a parametric variogram function V(h, θ)

• choose a variogram function that approximates shape of sample variogram well (in particular close to origin)

• fit parameters θ by (weighted) non-linear least squares

$$\hat{\theta} = \underset{\theta}{\mathrm{argmin}} \sum_{kl} w(h_{kl}) \left( \hat{V}(h_{kl}) - V(h_{kl}, \theta) \right)^2$$

• options for weighting

1. equal weights: $w(h_{kl}) = 1$
2. by number of pairs: $w(h_{kl}) = N_{kl}$
3. Cressie’s weights: $w(h_{kl}) = N_{kl} / V(h_{kl}, \theta)^2$
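The weighted least squares criterion and the three weighting options can be sketched in base R as follows. The spherical variogram function, the synthetic "sample variogram" and all starting values are assumptions made for this illustration; `vgm_spheric` and `wls_criterion` are hypothetical helpers, and the general-purpose optimizer `optim()` merely stands in for whatever routine a package uses internally.

```r
# spherical variogram with nugget
vgm_spheric <- function(h, variance, nugget, scale) {
  r <- pmin(h / scale, 1)
  nugget + variance * (1.5 * r - 0.5 * r^3)
}

# weighted least squares criterion; parameters are optimized on the
# log scale so that they stay positive
wls_criterion <- function(log.theta, sv, weighting) {
  theta <- exp(log.theta)
  fitted <- vgm_spheric(sv$lag.dist, theta[1], theta[2], theta[3])
  w <- switch(weighting,
    equal   = rep(1, nrow(sv)),
    npairs  = sv$npairs,
    cressie = sv$npairs / fitted^2
  )
  sum(w * (sv$gamma - fitted)^2)
}

# synthetic sample variogram: known model times multiplicative noise
set.seed(3)
sv <- data.frame(lag.dist = seq(10, 190, by = 20),
                 npairs   = seq(80, 440, by = 40))
sv$gamma <- vgm_spheric(sv$lag.dist, 3000, 1000, 120) *
  exp(rnorm(nrow(sv), sd = 0.05))

# fit with each weighting scheme from the same initial values
fits <- sapply(c(equal = "equal", npairs = "npairs", cressie = "cressie"),
  function(wgt) {
    opt <- optim(log(c(3000, 100, 100)), wls_criterion,
                 sv = sv, weighting = wgt)
    setNames(exp(opt$par), c("variance", "nugget", "scale"))
  })
round(fits, 1)
```

Comparing the columns of `fits` shows how much the estimates depend on the weighting, which is one of the subjective choices of the ad-hoc approach.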
example: fitting variogram model Wolfcamp data ↑↓ 189

> r.sph.e <- fit.variogram.model(r.sv.20,
+   variogram.model="RMspheric",
+   param=c(variance=3000, nugget=100, scale=100),
+   weighting="equal")
> plot(r.sv.20)
> lines(r.sph.e)
> ...

[Figure: sample variogram (lag class width 20 km) with spherical models fitted using equal, npairs and Cressie weighting; semivariance vs. lag distance]

[Figure: spherical variogram models fitted to sample variograms with lag class widths of 5, 10, 20 and 30 km; semivariance vs. lag distance]

example: directional variogram Wolfcamp data ↑↓ 192

computing sample variogram of residuals in N-S, NE-SW, E-W and SE-NW direction

> r.sv.20.aniso <- sample.variogram(residuals(r.ols),
+   locations=coordinates(d.w), lag.dist.def=20,
+   max.lag=200, estimator="matheron",
+   xy.angle.def=c(0, 22.5, 67.5, 112.5, 157.5, 180))
> plot(r.sv.20.aniso, type="l")
[Figure: directional sample variogram of residuals; semivariance vs. lag distance for the xy.angle classes (−22.5,22.5], (22.5,67.5], (67.5,112] and (112,158]]

computing sample variogram of pressure data in N-S, NE-SW, E-W and SE-NW direction

> r.sv.20.data.aniso <- sample.variogram(
+   pressure~1, d.w, locations=~x+y, lag.dist.def=20,
+   max.lag=200, estimator="matheron",
+   xy.angle.def=c(0, 22.5, 67.5, 112.5, 157.5, 180))
> plot(r.sv.20.data.aniso, type="l")

[Figure: directional sample variogram of pressure data; semivariance vs. lag distance for the xy.angle classes (−22.5,22.5], (22.5,67.5], (67.5,112] and (112,158]]

problems with ad-hoc model estimation ↑↓ 196

• subjective choice of lag class width and weighting method for model fitting

• estimates of semivariance for different lag classes mutually correlated; choice of variogram function based on sample variogram problematic

• auto-correlation of residuals

$$\hat{R} = Y - X\hat{\beta}_{\mathrm{OLS}} = Y - \underbrace{X (X^T X)^{-1} X^T}_{H} Y = (I - H) Y$$

differs from auto-correlation of underlying stochastic process

$$\mathrm{Cov}[\hat{R}, \hat{R}^T] = (I - H) \, \mathrm{Cov}[Y, Y^T] \, (I - H) = (I - H) \Gamma_\theta (I - H) \neq \Gamma_\theta$$

⇒ estimate of variogram based on sample variogram of OLS residuals biased

⇒ estimate trend and variogram parameters simultaneously by maximum likelihood
summary section 5 ↑↓ 197

• generalized least squares (GLS) method of choice for estimating coefficients of trend model (BLUE)

• GLS requires knowledge of variogram

• subjective choice of lag class definition for computing sample variogram

• sample variogram susceptible to outliers ⇒ robust estimators

• fitting model function to sample variogram requires further subjective choices

• ad-hoc approach provides biased estimates of variogram of underlying stochastic process if trend is modelled

6 maximum likelihood (ML) estimation of parameters of Gaussian model for spatial data

pro memoria: maximum likelihood estimation ↑↓ 199

• principle of ML estimation: find parameters that maximize joint probability for observed data

• properties of ML estimates: asymptotically unbiased and fully efficient; asymptotically normally distributed

• bias matters for estimating variance parameters from small samples

• profile likelihood useful for exploring shape of likelihood surface and for computing confidence intervals based on likelihood ratio test

6.1 ML estimation for Gaussian spatial model ↑↓ 200

• consider now a Gaussian stochastic process {Y(x)} with a linear trend function

• any arbitrary set of random variables $Y = (Y(x_1), \ldots, Y(x_n))$ has a multivariate Gaussian distribution with expectation

$$E[Y] = X\beta$$

and covariance matrix

$$\mathrm{Cov}[Y, Y^T] = \Gamma_\theta$$

• joint probability density for Y given by

$$f(y; \beta, \theta) = (2\pi)^{-\frac{n}{2}} \, |\Gamma_\theta|^{-\frac{1}{2}} \exp\left( -\frac{1}{2} \{y - X\beta\}^T \Gamma_\theta^{-1} \{y - X\beta\} \right)$$
ML estimation for Gaussian spatial model ↑↓ 201

• unknown model parameters:

1. regression coefficients β
2. covariance (or variogram) parameters θ

• log-likelihood function (up to a constant) given by

$$L(\beta, \theta; y) = -\frac{1}{2} \log(|\Gamma_\theta|) - \frac{1}{2} \{y - X\beta\}^T \Gamma_\theta^{-1} \{y - X\beta\}$$

• for known θ MLE for β equal to GLS estimator

$$\hat{\beta}_{\mathrm{GLS}} = (X^T \Gamma_\theta^{-1} X)^{-1} X^T \Gamma_\theta^{-1} Y$$

• plugging $\hat{\beta}_{\mathrm{GLS}}$ for β into L(β, θ; y) gives profile likelihood function for θ

$$L_p(\theta; y) = -\frac{1}{2} \log(|\Gamma_\theta|) - \frac{1}{2} \{y - X\hat{\beta}_{\mathrm{GLS}}\}^T \Gamma_\theta^{-1} \{y - X\hat{\beta}_{\mathrm{GLS}}\}$$

ML estimation for Gaussian spatial model ↑↓ 202

• set of estimating equations

$$\frac{\partial L_p}{\partial \theta} = 0$$

forms a system of non-linear equations ⇒ in general difficult to solve

⇒ maximize $L_p(\theta; y)$ numerically by a non-linear optimization method to find MLE $\hat{\theta}$

⇒ numerical optimization requires initial values of $\hat{\theta}$
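A minimal base R sketch of this procedure is given below, assuming an exponential covariance with nugget and simulated data (the model, the parameter values and the starting values are arbitrary choices for the illustration). The function `neg_profile_loglik` is a hypothetical helper that evaluates −L_p(θ; y) via the Cholesky factor of Γ_θ; the general-purpose optimizer `optim()` is used here only as a stand-in for the optimizer of a dedicated package.

```r
set.seed(4)
n <- 60
coords <- cbind(x = runif(n, 0, 100), y = runif(n, 0, 100))
X <- cbind(1, coords)
D <- as.matrix(dist(coords))

# Gamma_theta for theta = (variance, nugget, scale), exponential model
cov_matrix <- function(theta) {
  theta[1] * exp(-D / theta[3]) + theta[2] * diag(n)
}

# simulated data with known parameters
y <- drop(X %*% c(600, -1.3, -1.2) +
          t(chol(cov_matrix(c(3000, 1000, 30)))) %*% rnorm(n))

# negative profile log-likelihood -L_p(theta; y), up to a constant
neg_profile_loglik <- function(log.theta) {
  L <- t(chol(cov_matrix(exp(log.theta))))   # Gamma = L L^T
  X.t <- forwardsolve(L, X)
  y.t <- forwardsolve(L, y)
  beta.gls <- qr.solve(X.t, y.t)             # GLS = OLS on orthogonalized data
  r.t <- y.t - X.t %*% beta.gls
  # 0.5 * log|Gamma| equals the sum of the log diagonal elements of L
  sum(log(diag(L))) + 0.5 * sum(r.t^2)
}

# numerical maximization of L_p; initial values are required
opt <- optim(log(c(2000, 500, 20)), neg_profile_loglik)
theta.ml <- setNames(exp(opt$par), c("variance", "nugget", "scale"))
theta.ml
```

Optimizing on the log scale is a design choice that keeps variance, nugget and scale positive without constrained optimization.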

example: ML estimates Wolfcamp data ↑↓ 203

> library(geoR)
> library(gstat)
> library(georob)
> d.w <- as.data.frame(wolfcamp)
> class(d.w) <- "data.frame"
> colnames(d.w) <- c("x", "y", "pressure")
> coordinates(d.w) <- c("x", "y")
> r.georob.ml <- georob(pressure~x+y, d.w,
+   locations=~x+y, variogram.model="RMspheric",
+   param=c(variance=3000, nugget=1000, scale=100),
+   tuning.psi=1000, control=control.georob(ml.method="ML"))
> summary(r.georob.ml)

Call:georob(formula = pressure ~ x + y, data = d.w, locations = ~x +
    y, variogram.model = "RMspheric", param = c(variance = 3000,
    nugget = 1000, scale = 100), tuning.psi = 1000, control = control.georob(ml.method = "ML"))

Tuning constant:  1000

Convergence in 12 function and 7 Jacobian/gradient evaluations

Estimating equations (gradient)
                     eta         scale
Gradient : -8.880367e-05  7.654524e-04

Maximized log-likelihood: -458.3671

Predicted latent variable (B):
    Min      1Q  Median      3Q     Max
 -89.37  -54.98  -16.64   24.43  111.41

Residuals (epsilon):
    Min      1Q  Median      3Q     Max
-62.789 -19.775   6.311  18.032  62.776

Standardized residuals:
    Min      1Q  Median      3Q     Max
-2.3778 -0.7413  0.2249  0.6534  3.2793

Gaussian ML estimates

Variogram:  RMspheric
               Estimate    Lower   Upper
variance        3328.90  1453.95  7621.7
snugget(fixed)     0.00       NA      NA
nugget          1236.27   616.01  2481.1
scale            122.95    95.13   158.9

Fixed effects coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  620.3550    17.0641  36.354  < 2e-16
x             -1.3256     0.1360  -9.750 2.33e-15
y             -1.2061     0.1793  -6.727 2.16e-09

Residual standard error (sqrt(nugget)): 35.16

Robustness weights:
All 85 weights are ~= 1.

> plot(r.georob.ml, lag.dist.def=20, max.lag=200)

[Figure: sample variogram of the residuals with the ML-fitted spherical variogram; semivariance vs. lag distance]

example: profile likelihood Wolfcamp data ↑↓ 207

computing profile log-likelihood for range parameter along with 95% confidence interval

> r.proflik.ml <- profilelogLik(r.georob.ml,
+   values=data.frame(scale=seq(50, 500, by=5)))
> str(r.proflik.ml)

'data.frame': 91 obs. of 9 variables:
 $ scale          : num 50 55 60 65 70 75 80 85 90 95 ...
 $ loglik         : num -461 -461 -461 -460 -460 ...
 $ variance       : num 3641 3508 3393 3315 3240 ...
 $ nugget         : num 535 668 780 860 933 ...
 $ (Intercept)    : num 617 618 618 618 618 ...
 $ x              : num -1.29 -1.29 -1.3 -1.3 -1.31 ...
 $ y              : num -1.24 -1.25 -1.25 -1.25 -1.26 ...
 $ gradient.nugget: num 0.018148 -0.006713 -0.00423 -0.00..
 $ converged      : num 1 1 1 1 1 1 1 1 1 1 ...

> plot(loglik~scale, r.proflik.ml, type="l",
+   main="ML profile likelihood for scale")
> abline(v=r.georob.ml$param["scale"])
> abline(h=r.georob.ml$loglik - qchisq(0.95, 1)/2)

[Figure: ML profile likelihood for scale; loglik vs. scale, with a vertical line at the ML estimate and a horizontal line marking the 95% confidence limit]
equivalent number of independent observations ↑↓ 209

• for small sample size MLEs of variogram parameters often negatively biased when trend is simultaneously estimated

• for auto-correlated data this problem is more severe because effective sample size usually much smaller than nominal sample size n

• effective sample size given by equivalent number of independent observations

$$n_{\mathrm{eq}} = \frac{\mathrm{Var}[Y(x)]}{\mathrm{Var}[\bar{Y}]} \le n$$

where

$$\mathrm{Var}[\bar{Y}] = \frac{\mathrm{Var}[Y(x)]}{n} + \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1; j \neq i}^{n} \mathrm{Cov}[Y(x_i), Y(x_j)]$$

example: neq Wolfcamp data ↑↓ 210

> library(RandomFields)
> Gamma <- RFcovmatrix(
+   RMspheric(var=3329, scale=123)+RMnugget(var=1236),
+   x=coordinates(d.w))
> var.y <- sum(c(variance=3329, nugget=1236))
> (n <- nrow(d.w))

[1] 85

> var.ybar <- sum(Gamma)/n^2
> var.y/var.ybar

[1] 13.67365

6.2 restricted maximum likelihood estimation ↑↓ 211

• bias of MLEs of variogram parameters θ can be reduced by restricted maximum likelihood estimation

• principle of restricted maximum likelihood estimation (REML)

1. form linear combinations Z = AY of data Y that have zero expectation (and no longer depend on β)

$$E[Z] = A X \beta = 0$$

⇒ matrix A must satisfy condition AX = 0

⇒ A non-unique; many possibilities, e.g.

$$A = I - H_{\mathrm{OLS}} = I - X (X^T X)^{-1} X^T$$

⇒ Z is an error contrast or a generalized increment

restricted maximum likelihood estimation (REML) ↑↓ 212

• principle of REML (continued)

2. estimate θ by maximizing likelihood function for n − p elements of Z

⇒ this is equivalent to maximizing the restricted log-likelihood function

$$L_r(\theta; y) = -\frac{1}{2} \log(|\Gamma_\theta|) - \frac{1}{2} \log(|X^T \Gamma_\theta^{-1} X|) - \frac{1}{2} \{y - X\hat{\beta}_{\mathrm{GLS}}\}^T \Gamma_\theta^{-1} \{y - X\hat{\beta}_{\mathrm{GLS}}\}$$

⇒ REML estimate $\hat{\theta}_{\mathrm{REML}}$ has same properties (asymptotic normal distribution, likelihood ratio statistic) as ML estimate

3. given $\hat{\theta}_{\mathrm{REML}}$ compute $\hat{\beta}_{\mathrm{GLS}}$ and $\mathrm{Cov}[\hat{\beta}_{\mathrm{GLS}}, \hat{\beta}_{\mathrm{GLS}}^T] = (X^T \Gamma_\theta^{-1} X)^{-1}$
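The stated equivalence can be illustrated numerically: the log-likelihood of n − p error contrasts and the restricted log-likelihood L_r differ only by a constant that does not depend on θ, so differences between two parameter values must agree. The base R sketch below checks this for simulated data; the exponential covariance, the parameter values and the helper names (`restricted_loglik`, `contrast_loglik`) are assumptions for the illustration, and the contrasts are built from an orthonormal basis of the orthogonal complement of the column space of X, which is one of the many admissible choices of A.

```r
set.seed(5)
n <- 40
p <- 3
coords <- cbind(x = runif(n), y = runif(n))
X <- cbind(1, coords)
D <- as.matrix(dist(coords))

# Gamma_theta for theta = (variance, nugget, scale), exponential model
Gamma <- function(theta) theta[1] * exp(-D / theta[3]) + theta[2] * diag(n)
y <- drop(X %*% c(5, 2, -1) + t(chol(Gamma(c(2, 0.5, 0.3)))) %*% rnorm(n))

# restricted log-likelihood L_r(theta; y), up to a constant
restricted_loglik <- function(theta) {
  G <- Gamma(theta)
  G.inv <- solve(G)
  XtGiX <- t(X) %*% G.inv %*% X
  beta.gls <- solve(XtGiX, t(X) %*% G.inv %*% y)
  r <- y - X %*% beta.gls
  -0.5 * as.numeric(determinant(G)$modulus) -
    0.5 * as.numeric(determinant(XtGiX)$modulus) -
    0.5 * drop(t(r) %*% G.inv %*% r)
}

# n - p error contrasts Z = A y with A X = 0
A <- t(qr.Q(qr(X), complete = TRUE)[, (p + 1):n])

# Gaussian log-likelihood of the contrasts, up to a constant
contrast_loglik <- function(theta) {
  z <- drop(A %*% y)
  G.z <- A %*% Gamma(theta) %*% t(A)
  -0.5 * as.numeric(determinant(G.z)$modulus) -
    0.5 * drop(z %*% solve(G.z, z))
}

# differences between two parameter vectors agree
theta.a <- c(2, 0.5, 0.3)
theta.b <- c(1, 1, 0.1)
c(restricted = restricted_loglik(theta.a) - restricted_loglik(theta.b),
  contrasts  = contrast_loglik(theta.a) - contrast_loglik(theta.b))
```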
example: REML estimates Wolfcamp data ↑↓ 213

> r.georob.reml <- georob(pressure~x+y, d.w,
+   locations=~x+y, variogram.model="RMspheric",
+   param=c(variance=3000, nugget=1000, scale=100),
+   tuning.psi=1000)
> summary(r.georob.reml)

Call:georob(formula = pressure ~ x + y, data = d.w, locations = ~x +
    y, variogram.model = "RMspheric", param = c(variance = 3000,
    nugget = 1000, scale = 100), tuning.psi = 1000)

Tuning constant:  1000

Convergence in 6 function and 5 Jacobian/gradient evaluations

Estimating equations (gradient)
                     eta         scale
Gradient : -2.248651e-04 -1.070402e-01

Maximized restricted log-likelihood: -456.3802

Predicted latent variable (B):
    Min      1Q  Median      3Q     Max
 -94.58  -60.99  -17.59   23.10  115.72

Residuals (epsilon):
    Min      1Q  Median      3Q     Max
-59.148 -18.009   6.251  15.982  54.620

Standardized residuals:
    Min      1Q  Median      3Q     Max
-2.4030 -0.7131  0.2282  0.6937  3.1932

Gaussian REML estimates

Variogram:  RMspheric
               Estimate    Lower    Upper
variance        4358.84  1810.40  10494.6
snugget(fixed)     0.00       NA       NA
nugget          1151.28   540.41   2452.7
scale            138.91    82.62    233.6

Fixed effects coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  624.3287    20.7961  30.021  < 2e-16
x             -1.3291     0.1611  -8.248 2.25e-12
y             -1.1804     0.2115  -5.581 3.00e-07

Residual standard error (sqrt(nugget)): 33.93

Robustness weights:
All 85 weights are ~= 1.

> plot(r.sv.20, ylim=c(0, 6000))
> lines(r.sph.e)
> lines(r.georob.ml, col=2)
> lines(r.georob.reml, col=3)
> legend("bottomright", lty=1, col=1:3,
+   legend=c("fitted sample variogram", "ML estimate",
+   "REML estimate"), bty="n")

[Figure: sample variogram with the variogram fitted to the sample variogram, the ML estimate and the REML estimate; semivariance vs. lag distance]
6.3 testing hypotheses about trend coefficients ↑↓ 217

• likelihood ratio test can only be used to test hypotheses and build confidence regions for θ

• LRT for regression coefficients β in general biased (too small p-values; Pinheiro and Bates, 2000, pp. 87)

⇒ use conditional F-tests for testing hypotheses about β:

1. fit covariance parameters of “largest” regression model ⇒ $\hat{\theta}$
2. compute covariance matrix ⇒ $\Gamma_{\hat{\theta}}$
3. compute ⇒ $L_{\hat{\theta}}$ by Cholesky decomposition of $\Gamma_{\hat{\theta}}$
4. orthogonalize response vector and design matrix ⇒ $\tilde{Y} = L_{\hat{\theta}}^{-1} Y$, $\tilde{X} = L_{\hat{\theta}}^{-1} X$
5. conventional F-test with orthogonalized items $\tilde{Y}$ and $\tilde{X}$

example: hypothesis tests trend Wolfcamp data ↑↓ 218

> d.w$xs <- d.w$x - mean(d.w$x)
> d.w$ys <- d.w$y - mean(d.w$y)
> r.georob.full <- georob(
+   pressure~xs+ys+I(xs^2)+I(ys^2)+xs:ys, d.w,
+   locations=~x+y, variogram.model="RMspheric",
+   param=c(variance=3000, nugget=1000, scale=100),
+   tuning.psi=1000)
> summary(r.georob.full)

Call:georob(formula = pressure ~ xs + ys + I(xs^2) + I(ys^2) + xs:ys,
    data = d.w, locations = ~x + y, variogram.model = "RMspheric",
    param = c(variance = 3000, nugget = 1000, scale = 100), tuning.psi = 1000)

Tuning constant:  1000

Convergence in 10 function and 8 Jacobian/gradient evaluations

Estimating equations (gradient)
                     eta         scale
Gradient :  3.590344e-04 -4.553394e-03

Maximized restricted log-likelihood: -470.3894

Predicted latent variable (B):
    Min      1Q  Median      3Q     Max
 -89.22  -46.81  -11.06   20.80   94.07

Residuals (epsilon):
    Min      1Q  Median      3Q     Max
-59.664 -18.086   6.783  16.245  49.986

Standardized residuals:
    Min      1Q  Median      3Q     Max
-2.4096 -0.7213  0.2500  0.6830  3.1042

Gaussian REML estimates

Variogram:  RMspheric
               Estimate    Lower   Upper
variance        3740.02  1469.60  9518.1
snugget(fixed)     0.00       NA      NA
nugget          1152.79   532.29  2496.6
scale            123.93    93.96   163.5

Fixed effects coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) 613.105482  30.364674  20.191  < 2e-16
xs           -1.161184   0.167717  -6.923 1.05e-09
ys           -1.040147   0.214430  -4.851 6.06e-06
I(xs^2)       0.001592   0.001230   1.294    0.199
I(ys^2)      -0.001731   0.002133  -0.811    0.420
xs:ys         0.002067   0.001968   1.050    0.297

Residual standard error (sqrt(nugget)): 33.95

Robustness weights:
All 85 weights are ~= 1.

> waldtest(r.georob.full, .~.-xs:ys, test="F")

Wald test

Model 1: pressure ~ xs + ys + I(xs^2) + I(ys^2) + xs:ys
Model 2: pressure ~ xs + ys + I(xs^2) + I(ys^2)
  Res.Df Df      F Pr(>F)
1     79
2     80 -1 1.1032 0.2968
> waldtest(r.georob.full, .~.-I(xs^2)-I(ys^2)-xs:ys,
+   test="F")

Wald test

Model 1: pressure ~ xs + ys + I(xs^2) + I(ys^2) + xs:ys
Model 2: pressure ~ xs + ys
  Res.Df Df      F Pr(>F)
1     79
2     82 -3 1.6284 0.1895

model building: automatic covariate selection ↑↓ 222

• given estimates of covariance parameters $\hat{\theta}$ and keeping them fixed, the usual stepwise procedures for selecting covariates can be used

• selecting models based on AIC and BIC

> # model building based on AIC
> step(r.georob.full)

Start:  AIC=922.16
pressure ~ xs + ys + I(xs^2) + I(ys^2) + xs:ys

          Df    AIC Converged
- I(xs^2)  1 922.05         1
- I(ys^2)  1 922.13         1
<none>       922.16
- xs:ys    1 922.49         1

Step:  AIC=922.05
pressure ~ xs + ys + I(ys^2) + xs:ys

          Df    AIC Converged
<none>       922.05
- I(ys^2)  1 922.54         1
- xs:ys    1 924.61         1

Tuning constant:  1000

Fixed effects coefficients:
(Intercept)          xs          ys     I(ys^2)       xs:ys
 634.939862   -1.248212   -1.092032   -0.002587    0.003005

Variogram:  RMspheric
variance(fixed)  snugget(fixed)   nugget(fixed)    scale(fixed)
         2060.4             0.0          1402.1           103.8

> # model building based on BIC
> step(r.georob.full, k=log(nrow(d.w)))

Start:  AIC=936.81
pressure ~ xs + ys + I(xs^2) + I(ys^2) + xs:ys

          Df    AIC Converged
- I(xs^2)  1 934.27         1
- I(ys^2)  1 934.34         1
- xs:ys    1 934.70         1
<none>       936.81

Step:  AIC=934.27
pressure ~ xs + ys + I(ys^2) + xs:ys

          Df    AIC Converged
- I(ys^2)  1 932.31         1
<none>       934.27
- xs:ys    1 934.38         1

Step:  AIC=932.31
pressure ~ xs + ys + xs:ys

          Df    AIC Converged
- xs:ys    1 931.79         1
<none>       932.31

Step:  AIC=931.79
pressure ~ xs + ys

          Df     AIC Converged
<none>        931.79
- ys       1 1006.98         1
- xs       1 1085.42         1
Tuning constant:  1000

Fixed effects coefficients:
(Intercept)          xs          ys
    620.412      -1.314      -1.226

Variogram:  RMspheric
variance(fixed)  snugget(fixed)   nugget(fixed)    scale(fixed)
         2060.4             0.0          1402.1           103.8

summary section 6 ↑↓ 226

• no closed form expressions for ML estimates of parameters of Gaussian model for spatial data; estimates obtained by numerically maximizing likelihood function

• equivalent number of independent observations of a sample of spatial data often much smaller than nominal sample size:
⇒ bias of ML estimates of variance parameters important
⇒ restricted maximum likelihood estimation (REML) method of choice

• use of conditional F-tests for testing hypotheses about trend function coefficients

• use of standard stepwise model building procedures for finding structure of trend function

References

Gneiting, T. and Guttorp, P. (2010). Continuous parameter spatio-temporal processes. In A. E. Gelfand, P. J. Diggle, M. Fuentes, and P. Guttorp, editors, Handbook of Spatial Statistics, pages 427–436. CRC Press.

Jost, G., Heuvelink, G. B. M., and Papritz, A. (2005). Analysing the space-time distribution of soil water storage of a forest ecosystem using spatio-temporal kriging. Geoderma, 128(3–4), 258–273.

Pinheiro, J. C. and Bates, D. M. (2000). Mixed-Effects Models in S and S-PLUS. Springer Verlag.

7 spatial prediction by kriging
stating prediction problem ↑↓ 229

• observations $y^T = (y_1, \ldots, y_n)$ available for a set of n locations x_i

• y considered as realization of the multivariate random variable $Y^T = (Y_1, \ldots, Y_n)$

• model: $Y_i = S(x_i) + Z_i$ with

Y_i        ith datum
S(x_i)     “signal” (= true quantity) at location x_i
{S(x_i)}   Gaussian process, parametrized by trend $\mu(x_i) = \sum_k d_k(x_i) \beta_k = d(x_i)^T \beta$ and covariance function γ(h; θ) or variogram V(h; θ)
Z_i        iid Gaussian measurement error with variance τ^2

• predictions, say $\hat{S}$, of $S^T = (S(x'_1), \ldots, S(x'_m))$ required for set of m locations x'_j without data; $\hat{S}$ computed from Y ⇒ $\hat{S} = \hat{S}(Y)$

7.1 mean square prediction ↑↓ 230

• consider for simplicity case m = 1, i.e. $S = S(x'_1)$ and $\hat{S} = \hat{S}(x'_1; Y)$

• criterion for optimality of prediction
⇒ mean squared prediction error

$$\mathrm{MSEP}[\hat{S}] = E\left[ \{\hat{S} - S\}^2 \right]$$

(expectation taken with respect to joint distribution of Y and S)

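mean square prediction ↑↓ 231

• MSEP can alternatively be written as (e.g. ?, p. 135)

$$\mathrm{MSEP}[\hat{S}] = E_Y\left[ E_{S|Y}[\{\hat{S} - S\}^2] \right] = \ldots = E_Y\left[ \mathrm{Var}_{S|Y}[S] \right] + E_Y\left[ \left\{ E_{S|Y}[S] - \hat{S} \right\}^2 \right]$$

⇒ conditional expectation $\hat{S}_{\mathrm{opt}} = E_{S|Y}[S]$ minimizes MSEP

• MSEP of $\hat{S}_{\mathrm{opt}}$ equal to expectation of conditional variance

$$\mathrm{MSEP}[\hat{S}_{\mathrm{opt}}] = E_Y\left[ \mathrm{Var}_{S|Y}[S] \right]$$

• evaluation of $\hat{S}_{\mathrm{opt}}$ and $\mathrm{MSEP}[\hat{S}_{\mathrm{opt}}]$ requires fully specified parametric model for joint distribution of S and Y

7.2 mean square prediction for Gaussian process ↑↓ 232

• standard results from theory about multivariate normal distributions apply

• joint distribution of $(S^T, Y^T)$ multivariate normal with mean vector

$$\mu = \begin{pmatrix} \mu_S \\ \mu_Y \end{pmatrix} = \begin{pmatrix} X_S \beta \\ X_Y \beta \end{pmatrix}$$

and covariance matrix

$$\Sigma = \begin{pmatrix} \mathrm{Cov}[S, S^T] & \mathrm{Cov}[S, Y^T] \\ \mathrm{Cov}[Y, S^T] & \mathrm{Cov}[Y, Y^T] \end{pmatrix} = \begin{pmatrix} \Sigma_{SS} & \Sigma_{SY} \\ \Sigma_{SY}^T & \Gamma_{YY} \end{pmatrix}$$

• note that

1. Σ depends on covariance parameters θ, τ^2 and
2. again $\Gamma_{YY} = \Sigma_{SS} + \tau^2 I$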
mean square prediction for Gaussian processes ↑↓ 233

• conditional distribution of S given Y = y normal with mean

$$E_{S|Y}[S] = X_S \beta + \Sigma_{SY} \Gamma_{YY}^{-1} (y - X_Y \beta)$$

and covariance matrix

$$\mathrm{Cov}_{S|Y}[S, S^T] = \Sigma_{SS} - \Sigma_{SY} \Gamma_{YY}^{-1} \Sigma_{SY}^T$$

(conditional covariance matrix independent of y)

• simple kriging predictor (= optimal predictor) equal to conditional expectation

$$\hat{S}_{\mathrm{opt}} = E_{S|Y}[S]$$

• MSEP of simple kriging predictor equal to

$$\mathrm{MSEP}[\hat{S}_{\mathrm{opt}}] = \mathrm{Cov}_{S|Y}[S, S^T]$$

properties of simple kriging predictor ↑↓ 234

• write $\Lambda = \Sigma_{SY} \Gamma_{YY}^{-1}$ (m × n-matrix) and $\lambda = (X_S - \Lambda X_Y) \beta$ (m-vector) then

$$\hat{S}_{\mathrm{opt}} = \lambda + \Lambda y$$

⇒ simple kriging predictor is a heterogeneous linear predictor (Λ: matrix with simple kriging weights; λ: vector with “intercepts”)

• one may show that Λ and λ minimize for any heterogeneous linear predictor

$$\mathrm{trace}\left( E\left[ \{\hat{S} - S\} \{\hat{S} - S\}^T \right] \right)$$

subject to constraint $E[\hat{S} - S] = 0$

⇒ simple kriging: BLUP (Best Linear Unbiased Predictor)
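The simple kriging formulas translate almost literally into matrix code. The base R sketch below uses a one-dimensional example with a known constant mean, an exponential signal covariance and a nugget; all numerical values and the name `cov_fun` are arbitrary choices for the illustration. It computes the predictor once with the weight matrix Λ and once as a weighted sum of covariance terms at the data locations, which must give identical results.

```r
set.seed(6)
n <- 30
x.data <- sort(runif(n))                       # data locations in [0, 1]
x.pred <- seq(0, 1, by = 0.05)                 # prediction locations
cov_fun <- function(h) exp(-abs(h) / 0.2)      # signal covariance, variance 1
tau2 <- 0.1                                    # measurement error variance
beta <- 2                                      # known constant mean

Gamma.YY <- cov_fun(outer(x.data, x.data, "-")) + tau2 * diag(n)
Sigma.SY <- cov_fun(outer(x.pred, x.data, "-"))
Sigma.SS <- cov_fun(outer(x.pred, x.pred, "-"))
X.Y <- matrix(1, n, 1)
X.S <- matrix(1, length(x.pred), 1)

# simulated observations y = mean + signal + measurement error
y <- drop(X.Y * beta + t(chol(Gamma.YY)) %*% rnorm(n))

# simple kriging weights, predictions and mean squared prediction errors
Lambda <- Sigma.SY %*% solve(Gamma.YY)
s.opt <- drop(X.S * beta + Lambda %*% (y - X.Y * beta))
msep <- diag(Sigma.SS - Lambda %*% t(Sigma.SY))

# same predictions as weighted sum of covariance terms
# "pinned down" at the data locations
M <- solve(Gamma.YY, y - X.Y * beta)
s.dual <- drop(X.S * beta + Sigma.SY %*% M)

head(cbind(x.pred, s.opt, s.dual, se = sqrt(msep)))
```

The mean squared prediction errors depend only on the configuration of data and prediction locations, not on the observed values y.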

properties of simple kriging predictor ↑↓ 235

• write $\mu = X_S \beta$ and $M = \Gamma_{YY}^{-1} (y - X_Y \beta)$, which both do not depend on the prediction locations x'_j; then for each x'_j

$$\hat{S}_{\mathrm{opt}}(x'_j) = \mu_j + \sum_{i=1}^{n} M_i \, \gamma(x'_j - x_i)$$

⇒ simple kriging predictor for x'_j: weighted sum of covariance terms “pinned down” at data locations x_i (dual form of kriging)

⇒ shape of covariance function (or variogram) close to origin determines shape of prediction surface near data locations

⇒ continuity and differentiability of variogram at origin control geometrical properties of simple kriging prediction surface

[Figure: simple kriging predictions s.opt vs. x.prime for the models RMexp, RMexp+RMnugget and RMmatern]
7.3 universal/external drift kriging ↑↓ 237

• evaluating $\hat{S}_{\mathrm{opt}}$ requires a fully specified weakly stationary model:

1. structure of trend function known
2. regression coefficients β known
3. type of parametric covariance (variogram) function known
4. parameters θ, τ^2 of covariance function known

• relax assumptions: only 1, 3, 4 assumed to be known, β implicitly estimated from data by generalized least squares

⇒ universal (UK) (external drift, EDK) plug-in kriging predictor

$$\hat{S}_k = X_S \hat{\beta}_{\mathrm{GLS}} + \Lambda (y - X_Y \hat{\beta}_{\mathrm{GLS}}) \quad \text{with} \quad \Lambda = \Sigma_{SY} \Gamma_{YY}^{-1}$$

⇒ MSEP of UK plug-in predictor

$$\mathrm{MSEP}[\hat{S}_k] = \mathrm{MSEP}[\hat{S}_{\mathrm{opt}}] + (X_S - \Lambda X_Y) \, \mathrm{Cov}[\hat{\beta}_{\mathrm{GLS}}, \hat{\beta}_{\mathrm{GLS}}^T] \, (X_S - \Lambda X_Y)^T$$

properties of universal kriging predictor ↑↓ 238

• substituting $\hat{\beta}_{\mathrm{GLS}} = (X^T \Gamma_{YY}^{-1} X)^{-1} X^T \Gamma_{YY}^{-1} y$ in expression for UK predictor reveals that

$$\hat{S}_k = K Y$$

is a homogeneous linear predictor (K: m × n-matrix with UK weights)

• one may show that K minimizes for any homogeneous linear predictor

$$\mathrm{trace}\left( E\left[ \{\hat{S} - S\} \{\hat{S} - S\}^T \right] \right)$$

subject to constraint $E[\hat{S} - S] = 0$

⇒ universal kriging: eBLUP (empirical Best Linear Unbiased [homogeneous] Predictor)
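As a sketch of how the universal kriging weights arise, the base R code below builds K explicitly for simulated two-dimensional data with a linear trend (exponential covariance, parameter values and helper names such as `dist_cross` are assumptions for the illustration). It checks that the homogeneous form K y reproduces the plug-in predictor and evaluates the MSEP formula given above.

```r
set.seed(7)
n <- 25
coords <- cbind(x = runif(n), y = runif(n))
grid <- as.matrix(expand.grid(x = c(0.25, 0.75), y = c(0.25, 0.75)))
cov_fun <- function(h) exp(-h / 0.3)           # signal covariance, variance 1
tau2 <- 0.2                                    # measurement error variance

# Euclidean distances between two sets of locations
dist_cross <- function(a, b) {
  sqrt(outer(a[, 1], b[, 1], "-")^2 + outer(a[, 2], b[, 2], "-")^2)
}
Gamma.YY <- cov_fun(dist_cross(coords, coords)) + tau2 * diag(n)
Sigma.SY <- cov_fun(dist_cross(grid, coords))
Sigma.SS <- cov_fun(dist_cross(grid, grid))
X.Y <- cbind(1, coords)                        # trend: intercept + x + y
X.S <- cbind(1, grid)
y <- drop(X.Y %*% c(1, 2, -1) + t(chol(Gamma.YY)) %*% rnorm(n))

G.inv <- solve(Gamma.YY)
cov.beta <- solve(t(X.Y) %*% G.inv %*% X.Y)    # Cov of GLS estimate
B <- cov.beta %*% t(X.Y) %*% G.inv             # beta.gls = B y
Lambda <- Sigma.SY %*% G.inv

# UK predictor as homogeneous linear predictor K y
K <- X.S %*% B + Lambda %*% (diag(n) - X.Y %*% B)
s.k <- drop(K %*% y)

# plug-in form and MSEP of the UK predictor
beta.gls <- B %*% y
s.plugin <- drop(X.S %*% beta.gls + Lambda %*% (y - X.Y %*% beta.gls))
R <- X.S - Lambda %*% X.Y
msep.uk <- diag(Sigma.SS - Lambda %*% t(Sigma.SY) + R %*% cov.beta %*% t(R))

cbind(grid, pred = s.k, se = sqrt(msep.uk), sum.weights = rowSums(K))
```

Because the trend model contains an intercept, the weights in each row of K sum to one, which is the unbiasedness constraint in its most familiar form.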

properties of universal kriging predictor ↑↓ 239

• condition for unbiasedness of $\hat{S}_k$ implies

$$(X_S - K X_Y) = 0$$

⇒ if trend model includes an intercept then for each x'_j

$$\sum_{i=1}^{n} K_{ji} = 1$$

• dual form: UK predictor can again be written as a weighted sum of covariance terms “pinned down” at data locations

• for special case µ(x) = const. UK is denoted as ordinary kriging (OK)

example: UK predictions Wolfcamp data ↑↓ 240

fitting the spatial model by REML

> library(geoR)
> library(gstat)
> library(georob)
> library(lattice)
> d.w <- as.data.frame(wolfcamp)
> class(d.w) <- "data.frame"
> colnames(d.w) <- c("x", "y", "pressure")
> coordinates(d.w) <- c("x", "y")
> r.georob <- georob(pressure~x+y, d.w,
+   locations=~x+y, variogram.model="RMspheric",
+   param=c(variance=3000, nugget=1000, scale=100),
+   tuning.psi=1000)
example: UK predictions Wolfcamp data

computing UK predictions of signal S(x′)

> d.w.grid <- expand.grid(
+   x = seq(-240, 190, by= 2.5),
+   y = seq(-150, 140, by= 2.5)
+ )
> r.uk.signal <- predict(r.georob, newdata=d.w.grid)
> str(r.uk.signal)
'data.frame': 20241 obs. of 6 variables:
 $ x    : num -240 -238 -235 -232 -230 ...
 $ y    : num -150 -150 -150 -150 -150 -150 -150 -150 -15..
 $ pred : num 1117 1115 1113 1110 1108 ...
 $ se   : num 58.9 58.3 57.8 57.3 56.8 ...
 $ lower: num 1002 1001 999 998 997 ...
 $ upper: num 1233 1229 1226 1222 1219 ...
 - attr(*, "variogram.object")=List of 1
 ...

> # for plotting convert predictions to SpatialGridDataFrame
> r.uk <- r.uk.signal
> coordinates(r.uk) <- c("x", "y")
> gridded(r.uk) <- TRUE
> fullgrid(r.uk) <- TRUE
> str(r.uk)
Formal class 'SpatialGridDataFrame' [package "sp"] with 4 ..
  ..@ data :'data.frame': 20241 obs. of 4 variables:
  .. ..$ pred : num [1:20241] 778 775 771 768 765 ...
  .. ..$ se   : num [1:20241] 89.7 89.5 89.3 89.1 88.9 ...
  .. ..$ lower: num [1:20241] 602 599 596 594 591 ...
  .. ..$ upper: num [1:20241] 954 950 946 943 939 ...
  ..@ grid :Formal class 'GridTopology' [package "s"..
  .. .. ..@ cellcentre.offset: Named num [1:2] -240 -150
  .. .. .. ..- attr(*, "names")= chr [1:2] "x" "y"
  .. .. ..@ cellsize : Named num [1:2] 2.5 2.5
  .. .. .. ..- attr(*, "names")= chr [1:2] "x" "y"
  .. .. ..@ cells.dim : Named int [1:2] 173 117
  .. .. .. ..- attr(*, "names")= chr [1:2] "x" "y"
  ..@ bbox : num [1:2, 1:2] -241 -151 191 141
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : chr [1:2] "x" "y"
  .. .. ..$ : chr [1:2] "min" "max"
  ..@ proj4string:Formal class 'CRS' [package "sp"] with 1..
  .. .. ..@ projargs: chr NA
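The columns lower and upper are consistent with symmetric Gaussian 95% prediction intervals. The sketch below assumes that the limits are pred ± qnorm(0.975) · se; this is an assumption about how the limits were obtained, not a statement about the internals of the predict method. The numbers are the first three rows of r.uk.signal, copied from the printed output.

```r
# Sketch: Gaussian 95% prediction limits from prediction and standard error.
# Assumption: limits = pred -/+ qnorm(0.975) * se (not checked against georob source).
pred <- c(1117.333, 1114.903, 1112.538)   # first rows of r.uk.signal$pred
se   <- c(58.92759, 58.31249, 57.75401)   # first rows of r.uk.signal$se

q <- qnorm(0.975)                         # approx. 1.96
lower <- pred - q * se
upper <- pred + q * se

print(round(cbind(lower, upper)))         # compare with str() output: 1002 1001 999 / 1233 1229 1226
```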

> # plot UK predictions
> breaks <- seq(50, 1250, by=50)
> spplot(r.uk, zcol="pred", at=breaks,
+   main="UK prediction")

[figure: map "UK prediction", colour key from 200 to 1200]

> # plot UK prediction standard errors and data locations
> spplot(r.uk, zcol="se", main="UK standard error")
> trellis.focus("panel", row=1, column=1)
> panel.points(x=d.w$x, y=d.w$y)
NULL
> trellis.unfocus()

[figure: map "UK standard error" with data locations, colour key from 20 to 90]

> # plot lower limits of 95% prediction intervals
> spplot(r.uk, zcol="lower", at=breaks,
+   main="lower limit 95% prediction interval")

[figure: map "lower limit 95% prediction interval", colour key from 200 to 1200]

> # plot upper limits of 95% prediction intervals
> spplot(r.uk, zcol="upper", at=breaks,
+   main="upper limit 95% prediction interval")

[figure: map "upper limit 95% prediction interval", colour key from 200 to 1200]

computing predictions of signal and observations

• in most cases the interest is in predicting the signal $S(x)$ (= "error-free" version of data)

• occasionally predictions for ("contaminated") observations $Y_i = S(x_i) + Z_i$ are required (e.g. for cross-validation)

• consider in sequel separately cases where predictions are computed for

1. data locations $x_i$ or
2. prediction locations $x'_i$ (without data)

example: UK predictions Wolfcamp data

UK predictions of signal S(x) and observations Y(x) at data locations

> head(predict(r.georob)[, 1:4]) # UK prediction of signal
           x         y     pred       se
1  68.851186  44.45399 446.2163 26.93364
2 -44.090428 -14.82616 753.7656 26.48702
3  -1.871464 -24.30719 676.2432 24.50395
4 -29.962712 -37.89631 751.8351 26.33710
5 155.243957 -57.00122 523.9964 23.04018
6 174.711819 -27.48198 504.1202 28.26008

> head(tmp <- predict(r.georob, type="response")[, 1:4])
           x         y     pred se
1  68.851186  44.45399 446.2190  0
2 -44.090428 -14.82616 778.1401  0
3  -1.871464 -24.30719 657.7464  0
4 -29.962712 -37.89631 748.2703  0
5 155.243957 -57.00122 535.2190  0
6 174.711819 -27.48198 518.7601  0

> summary(d.w@data[, "pressure"] - tmp$pred)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      0       0       0       0       0       0

example: UK predictions Wolfcamp data

UK predictions of signal S(x) and observations Y(x) at prediction locations

> head(r.uk.signal[, 1:4])
x y pred se
1 -240.0 -150 1117.333 58.92759
2 -237.5 -150 1114.903 58.31249
3 -235.0 -150 1112.538 57.75401
4 -232.5 -150 1110.235 57.25163
5 -230.0 -150 1107.991 56.80230
6 -227.5 -150 1105.801 56.40054

> head(r.uk.resp <- predict(r.georob, newdata=d.w.grid,
+   type="response")[, 1:4])

x y pred se
1 -240.0 -150 1117.333 67.99812
2 -237.5 -150 1114.903 67.46577
3 -235.0 -150 1112.538 66.98365
4 -232.5 -150 1110.235 66.55098
5 -230.0 -150 1107.991 66.16483
6 -227.5 -150 1105.801 65.82024
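At prediction locations without data the two tables share the same predictions and differ only in the standard errors. A short sketch of the arithmetic, using the first three printed rows: the squared standard errors differ by a constant, which is what one expects if the MSEP for an observation equals the MSEP for the signal plus the nugget variance (stated here as an interpretation of the printed numbers).

```r
# Sketch: compare squared standard errors for signal and response,
# using the first three rows of the printed tables.
se.signal <- c(58.92759, 58.31249, 57.75401)
se.resp   <- c(67.99812, 67.46577, 66.98365)

d <- se.resp^2 - se.signal^2
print(d)          # roughly constant, about 1151
```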

> summary(r.uk.resp$pred - r.uk.signal$pred)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      0       0       0       0       0       0

> summary(r.uk.resp$se^2 - r.uk.signal$se^2)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   1151    1151    1151    1151    1151    1151

> r.georob$param["nugget"]
NULL

7.4 lognormal universal kriging

• Gaussian model fitted to log-transformed response variable $Y(x) = \log(U(x))$ (e.g. Cu content Dornach data set)

⇒ computing UK predictions for log-transformed response

⇒ how should we back-transform to original scale of response?

• lognormal distribution

$$Y = \log(U) \sim N(\mu_Y, \sigma_Y^2)$$

expectation and variance of $U$

$$E[U] = \mu_U = \exp(\mu_Y + 0.5 \, \sigma_Y^2)$$
$$\mathrm{Var}[U] = \mu_U^2 \, (\exp(\sigma_Y^2) - 1)$$
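The two moment formulas of the lognormal distribution can be verified by numerical integration. A minimal base R sketch; the values of `mu.Y` and `sigma.Y` are arbitrary illustrations, and the integration range of ±12 standard deviations is a pragmatic choice.

```r
# Sketch: check E[U] and Var[U] of a lognormal variable by numerical integration.
# mu.Y and sigma.Y are arbitrary illustrative values.
mu.Y <- 7
sigma.Y <- 0.4
lo <- mu.Y - 12 * sigma.Y
hi <- mu.Y + 12 * sigma.Y

# first and second moment of U = exp(Y), Y ~ N(mu.Y, sigma.Y^2)
m1 <- integrate(function(y) exp(y) * dnorm(y, mu.Y, sigma.Y), lo, hi)$value
m2 <- integrate(function(y) exp(2 * y) * dnorm(y, mu.Y, sigma.Y), lo, hi)$value

mu.U <- exp(mu.Y + 0.5 * sigma.Y^2)          # closed-form expectation
var.U <- mu.U^2 * (exp(sigma.Y^2) - 1)       # closed-form variance

print(c(numeric = m1, closed.form = mu.U))
print(c(numeric = m2 - m1^2, closed.form = var.U))
print(exp(mu.Y))                             # naive back-transform is smaller than E[U]
```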
lognormal universal kriging

• $\exp(\widehat{S}_k(x'))$ is a biased predictor of $U(x')$

• unbiased back-transformation

$$\widehat{U}_{lk}(x') = \exp\left( \widehat{S}_k(x') + 0.5 \left\{ \mathrm{Var}[S(x')] - \mathrm{Var}[\widehat{S}_k(x')] \right\} \right)$$

• limits of prediction intervals can be back-transformed directly by exp()

• back-transformation implemented in function lgnpp of package georob

example: lognormal UK Meuse zinc data

> library(georob)
> data(meuse); data(meuse.grid)
> coordinates(meuse.grid) <- ~x+y
> meuse.grid <- as(meuse.grid, "SpatialPixelsDataFrame")
> r.logzn <- georob(log(zinc)~sqrt(dist), meuse,
+   locations=~x+y, variogram.model="RMexp",
+   param=c(variance=0.15, nugget=0.05, scale=200),
+   tuning.psi=1000, control=control.georob(
+   cov.bhat=TRUE, cov.bhat.betahat=TRUE,
+   aux.cov.pred.target=TRUE))
> r.logzn

Tuning constant: 1000

Fixed effects coefficients:
(Intercept)  sqrt(dist)
      6.985      -2.567

Variogram: RMexp
      variance snugget(fixed)         nugget          scale
       0.14910        0.00000        0.04867      192.52854

> r.luk <- predict(r.logzn, newdata=meuse.grid,
+   control=control.predict.georob(extended.output=TRUE))
> str(r.luk@data)
'data.frame': 3103 obs. of 8 variables:
 $ pred           : num 7.03 7.05 6.75 6.49 7.07 ...
 $ se             : num 0.362 0.335 0.342 0.349 0.288 ...
 $ lower          : num 6.32 6.39 6.08 5.8 6.5 ...
 $ upper          : num 7.73 7.7 7.42 7.17 7.63 ...
 $ trend          : num 6.99 6.99 6.7 6.45 6.99 ...
 $ var.pred       : num 0.0388 0.0539 0.0437 0.0357 0.078..
 $ cov.pred.target: num 0.0285 0.0453 0.0379 0.0315 0.072..
 $ var.target     : num 0.149 0.149 0.149 0.149 0.149 ...
 ...

> r.luk <- lgnpp(r.luk)
> str(r.luk@data)
'data.frame': 3103 obs. of 12 variables:
 $ pred           : num 7.03 7.05 6.75 6.49 7.07 ...
 $ se             : num 0.362 0.335 0.342 0.349 0.288 ...
 $ lower          : num 6.32 6.39 6.08 5.8 6.5 ...
 $ upper          : num 7.73 7.7 7.42 7.17 7.63 ...
 $ trend          : num 6.99 6.99 6.7 6.45 6.99 ...
 $ var.pred       : num 0.0388 0.0539 0.0437 0.0357 0.078..
 $ cov.pred.target: num 0.0285 0.0453 0.0379 0.0315 0.072..
 $ var.target     : num 0.149 0.149 0.149 0.149 0.149 ...
 $ lgn.pred       : num 1189 1204 902 694 1215 ...
 $ lgn.se         : num 440 409 314 249 354 ...
 $ lgn.lower      : num 554 595 438 331 667 ...
 $ lgn.upper      : num 2286 2215 1673 1299 2064 ...
 ...
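The unbiased back-transformation can be applied by hand to the printed columns. The sketch below assumes that var.target plays the role of Var[S(x′)] and var.pred the role of Var[Ŝ_k(x′)] in the back-transformation formula; this mapping is an assumption made for illustration. Since the inputs are the rounded values from the str() output, the results only agree approximately with lgn.pred.

```r
# Sketch: unbiased back-transformation applied to the (rounded) printed values.
# Assumption: var.target ~ Var[S(x')], var.pred ~ Var[S.hat_k(x')].
pred       <- c(7.03, 7.05, 6.75, 6.49, 7.07)
var.pred   <- c(0.0388, 0.0539, 0.0437, 0.0357, 0.078)
var.target <- rep(0.149, 5)

naive    <- exp(pred)                                      # biased predictor
unbiased <- exp(pred + 0.5 * (var.target - var.pred))      # back-transformation formula

lgn.pred.printed <- c(1189, 1204, 902, 694, 1215)          # copied from str() output
print(round(cbind(naive, unbiased, lgn.pred.printed)))
```

The hand-computed values differ from the printed lgn.pred by well under one percent, which is the order of the rounding error in pred; the naive back-transform exp(pred) is systematically smaller.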
> print(spplot(r.luk, zcol="pred",
+   main="UK prediction log(zn)"),
+   position=c(0, 0, 0.5, 1), more=TRUE)
> print(spplot(r.luk, zcol="lgn.pred",
+   main="LUK prediction zn"),
+   position=c(0.5, 0, 1, 1))

[figure: left map "UK prediction log(zn)", colour key from 4.5 to 7.5; right map "LUK prediction zn", colour key from 0 to 2000]

> print(spplot(r.luk, zcol="se",
+   main="UK standard error log(zn)"),
+   position=c(0, 0, 0.5, 1), more=TRUE)
> print(spplot(r.luk, zcol="lgn.se",
+   main="LUK standard error zn"),
+   position=c(0.5, 0, 1, 1))

[figure: left map "UK standard error log(zn)", colour key from 0.20 to 0.40; right map "LUK standard error zn", colour key from 0 to 500]

summary section 7

• mean squared error (MSE) captures bias and random variation

• mean squared prediction error (MSEP) usual criterion for optimality of predictions

• optimal predictor (which minimizes MSEP): conditional expectation of prediction target, given observations

• Gaussian random processes: optimal predictor ≡ simple kriging predictor

• simple kriging predictor: weighted sum of observations with weights equal to $\Sigma_{SY} \Gamma_{YY}^{-1}$; $\Sigma_{SY}$ accounts for auto-correlation between target and observations and $\Gamma_{YY}^{-1}$ for auto-correlation between observations

• simple kriging predictor: weighted sum of covariance (variogram) terms "pinned-down" at observation locations (dual form)

• universal kriging predictor: approximation of simple kriging predictor where $\beta$ is estimated by $\widehat{\beta}_{\mathrm{GLS}}$

• MSEP of universal kriging predictor equal to MSEP of simple kriging predictor plus a term that accounts for the estimation of $\beta$

• computing universal kriging predictor requires:

1. known structure of trend function
2. known structure and parameters $\theta$, $\tau^2$ of covariance function or variogram

⇒ "plug-in" predictor: uncertainty of variogram is ignored when computing predictions

8 model assessment by cross-validation

problem

• data analysis often leads to a set of equally plausible candidate models that use different sets of covariates and different variograms

• covariate selection by step() does not lead to a unique set of covariates but depends on the search strategy and on the criterion (AIC/BIC) used for model selection

• likelihood ratio tests cannot be used to compare the goodness of fit of models that have different variograms

• goodness-of-fit not necessarily a good criterion for judging quality of predictions

⇒ models should be assessed by their precision to predict new data


cross-validation

• general strategy to assess precision of predictions of new data by a statistical model (Hastie et al., 2009, chap. 7)

• recipe:

1. split data set (randomly) into $K$ subsets (typically $K = 5$ or $K = 10$)
2. for each $k = 1, \ldots, K$
   (a) exclude observations of $k$th subset and fit model to remaining data
   (b) predict with this model (and excluding again the data of the $k$th subset) all observations $Y(x_i)$ of the $k$th subset and compute prediction errors $\widehat{Y}_k(x_i) - y_i$
3. pool prediction errors for all subsets and compute statistics of $\widehat{Y}_k(x_i) - y_i$ for evaluating prediction precision and the accuracy of modelling prediction uncertainty (e.g. $\mathrm{MSEP}[\widehat{Y}_k(x_i)]$)

8.1 criteria to assess precision of predictions

• root mean square error RMSE

$$\mathrm{RMSE} = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} \left( \widehat{Y}_k(x_i) - y_i \right)^2 }$$

⇒ overall measure of precision (bias and random variation)

• bias

$$\mathrm{BIAS} = \frac{1}{n} \sum_{i=1}^{n} \left( \widehat{Y}_k(x_i) - y_i \right)$$

• robust variants:

$$\mathrm{robBIAS} = \mathrm{median}_i \left( \widehat{Y}_k(x_i) - y_i \right)$$
$$\mathrm{robRMSE} = \mathrm{MAD}_i \left( \widehat{Y}_k(x_i) - y_i \right) = 1.4826 \; \mathrm{median}_i \left( | \widehat{Y}_k(x_i) - y_i | \right)$$

• $R^2$ measures strength of linear dependence between $y_i$ and $\widehat{Y}_k(x_i)$ and is not a measure of precision
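The recipe and the precision criteria can be sketched in base R. To keep the example self-contained it uses simulated, non-spatial data and an ordinary linear model fitted by lm() as a stand-in for the spatial model; the data, the model and the seed are assumptions made purely for illustration.

```r
# Sketch: K-fold cross-validation with lm() as a stand-in for a spatial model.
# Data are simulated; nothing here is specific to kriging.
set.seed(1)
n <- 100
d <- data.frame(x = runif(n))
d$y <- 1 + 2 * d$x + rnorm(n, sd = 0.5)

K <- 10
fold <- sample(rep(seq_len(K), length.out = n))   # 1. random split into K subsets

err <- numeric(n)
for (k in seq_len(K)) {                           # 2. loop over subsets
  test <- fold == k
  fit <- lm(y ~ x, data = d[!test, ])             # (a) fit without k-th subset
  err[test] <- predict(fit, newdata = d[test, ]) - d$y[test]  # (b) prediction errors
}

# 3. pooled statistics of prediction errors
rmse     <- sqrt(mean(err^2))
bias     <- mean(err)
rob.bias <- median(err)
rob.rmse <- 1.4826 * median(abs(err))

print(round(c(rmse = rmse, bias = bias, robBIAS = rob.bias, robRMSE = rob.rmse), 3))
```

For a geostatistical model the same loop structure applies, but both the trend and the variogram parameters would be re-estimated in step (a) and the predictions in step (b) would come from kriging.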

8.2 criteria to assess $\mathrm{MSEP}[\widehat{Y}_k(x_i)]$

• mean of squared standardized prediction errors

$$\mathrm{MSSE} = \frac{1}{n} \sum_{i=1}^{n} \frac{ \{ \widehat{Y}_k(x_i) - y_i \}^2 }{ \mathrm{MSEP}[\widehat{Y}_k(x_i)] } \quad \text{should match 1}$$

• robust variant of MSSE for normally distributed prediction errors

$$\mathrm{MEDSSE} = \mathrm{median}_i \left\{ \frac{ \{ \widehat{Y}_k(x_i) - y_i \}^2 }{ \mathrm{MSEP}[\widehat{Y}_k(x_i)] } \right\} \quad \text{should match 0.455}$$

8.3 criteria to assess probabilistic predictions

• for Gaussian stochastic processes kriging provides estimates of mean and variance of conditional distribution of target $Y(x'_j)$ given the data $Y$

$$Y(x'_j) \mid Y \sim N\left( \widehat{Y}_k(x'_j), \mathrm{MSEP}[\widehat{Y}_k(x'_j)] \right)$$

• denote cdf of predictive distribution by $\widehat{F}_{Y(x'_j)|Y}(y)$

• probability integral transform PIT (Gneiting et al., 2007)

$$\mathrm{PIT}_j = \widehat{F}_{Y(x'_j)|Y}(y_j)$$

• PIT has a uniform distribution on interval $[0, 1]$ if predictive distribution is ok

⇒ histogram of $\mathrm{PIT}_j$ should be flat

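A minimal base R sketch of these criteria, using simulated predictions whose standard errors are correct by construction; the vectors `pred`, `se` and `y` are hypothetical. The reference value 0.455 for MEDSSE is the median of a chi-squared distribution with one degree of freedom.

```r
# Sketch: MSSE, MEDSSE and PIT for simulated Gaussian predictions.
# pred, se and y are simulated; se is correct by construction.
set.seed(2)
n <- 500
pred <- rnorm(n, mean = 10, sd = 2)       # hypothetical predictions
se   <- rep(1.5, n)                       # hypothetical prediction standard errors
y    <- pred + rnorm(n, sd = se)          # observations consistent with se

z2 <- (pred - y)^2 / se^2                 # squared standardized prediction errors
msse   <- mean(z2)                        # should be close to 1
medsse <- median(z2)                      # should be close to qchisq(0.5, 1)

pit <- pnorm(y, mean = pred, sd = se)     # probability integral transform
h <- hist(pit, breaks = seq(0, 1, by = 0.1), plot = FALSE)

print(round(c(msse = msse, medsse = medsse, reference = qchisq(0.5, df = 1)), 3))
print(round(h$density, 2))                # roughly flat around 1
```

Multiplying `se` by a factor smaller than one before computing `z2` and `pit` mimics prediction intervals that are too narrow: MSSE then exceeds 1 and the PIT histogram becomes U-shaped.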


[figure: histograms of PIT (frequency versus PIT) for the panels "prediction intervals ok", "prediction intervals too narrow" and "prediction intervals too wide", and panel "CDF: prediction intervals ok" (Fn(x) versus x)]

8.4 criteria to assess "sharpness" of $\widehat{F}_{Y(x')|Y}(y)$

• overall criterion to assess quality of probabilistic predictions

• predictive distribution is "sharp" if it is narrow (small variance) and is centred on true value (no bias)

[figure: predictive densities (pdf versus y) with observations yi and yj marked]


criteria to assess "sharpness" of $\widehat{F}_{Y(x')|Y}(y)$

• measure for sharpness of predictive distribution for single prediction site $x'_j$

$$\int \{ \widehat{F}_{Y(x'_j)|Y}(y) - I(y_j \le y) \}^2 \, dy$$

where $I(A)$ is indicator function with value equal to 1 if $A$ is true and zero otherwise

[figure: cdf versus y]

continuous ranked probability score

• continuous ranked probability score (CRPS) measures average sharpness of predictive distributions for all sites of a data set

$$\mathrm{CRPS} = \frac{1}{n} \sum_{j=1}^{n} \int_{-\infty}^{\infty} \{ \widehat{F}_{Y(x'_j)|Y}(y) - I(y_j \le y) \}^2 \, dy$$

• CRPS equal to integral over Brier score (BS = averaged MSEP for predicting that observations $y_j$ do not exceed cutoff $y$)

$$\mathrm{BS}(y) = \frac{1}{n} \sum_{j=1}^{n} \{ \widehat{F}_{Y(x'_j)|Y}(y) - I(y_j \le y) \}^2$$

⇒ CRPS criterion of choice for assessing quality of probabilistic predictions (strictly proper scoring rule, cf. Gneiting et al., 2007)
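For Gaussian predictive distributions the CRPS integral can be evaluated numerically and compared with the well-known closed-form expression for the normal case. The sketch below is base R only; the function names `crps.numeric` and `crps.closed` and all numerical values are hypothetical, and the integration range of ±10 standard deviations is a pragmatic truncation.

```r
# Sketch: CRPS for Gaussian predictive distributions N(m, s^2).
# Function names and example values are hypothetical.
crps.numeric <- function(y.obs, m, s) {
  f <- function(y) (pnorm(y, m, s) - (y.obs <= y))^2
  lo <- min(m, y.obs) - 10 * s
  hi <- max(m, y.obs) + 10 * s
  # integrate separately on both sides of the jump of the indicator at y.obs
  integrate(f, lo, y.obs)$value + integrate(f, y.obs, hi)$value
}

crps.closed <- function(y.obs, m, s) {
  z <- (y.obs - m) / s
  s * (z * (2 * pnorm(z) - 1) + 2 * dnorm(z) - 1 / sqrt(pi))
}

y.obs <- c(1.0, 2.5, 0.2)                 # observations y_j
m     <- c(1.2, 2.0, 1.0)                 # predictive means
s     <- c(0.5, 0.5, 0.3)                 # predictive standard deviations

site.num <- mapply(crps.numeric, y.obs, m, s)   # per-site sharpness measure
site.cf  <- crps.closed(y.obs, m, s)

print(cbind(numeric = site.num, closed.form = site.cf))
print(mean(site.cf))                      # CRPS = average over sites
```

The third site, where the observation lies far out in the tail of a narrow predictive distribution, contributes most to the score: a sharp but badly centred prediction is penalized.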
example: cross-validation Wolfcamp data

> library(geoR)
> library(georob)
> d.w <- as.data.frame(wolfcamp)
> class(d.w) <- "data.frame"
> colnames(d.w) <- c("x", "y", "pressure")
> coordinates(d.w) <- c("x", "y")
> d.w$xs <- d.w$x - mean(d.w$x)
> d.w$ys <- d.w$y - mean(d.w$y)
> r.georob.full <- georob(
+   pressure~xs+ys+I(xs^2)+I(ys^2)+xs:ys, d.w,
+   locations=~x+y, variogram.model="RMspheric",
+   param=c(variance=3000, nugget=1000, scale=100),
+   tuning.psi=1000)
> r.georob.bic <- update(r.georob.full,
+   .~xs+ys)
> r.cv.full <- cv(r.georob.full, seed=30)
> r.cv.bic <- cv(r.georob.bic, seed=30)

> logLik(r.georob.full)
'log Lik.' -455.9609 (df=9)

> logLik(r.georob.bic)
'log Lik.' -458.7583 (df=6)

> extractAIC(r.georob.full)
[1]   9.0000 929.9218

> extractAIC(r.georob.bic)
[1]   6.0000 929.5165

> # note: the BIC proper uses the penalty k=log(n); k=nrow(d.w) penalizes far more heavily
> extractAIC(r.georob.full, k=nrow(d.w))
[1]    9.000 1676.922

> extractAIC(r.georob.bic, k=nrow(d.w))
[1]    6.000 1427.517
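The printed criteria can be reproduced from the log-likelihoods as −2 logLik + k · df. The sketch below takes the log-likelihoods and degrees of freedom from the output above; n = 85 is an assumption inferred from the printed values (the penalized criteria differ from −2 logLik by exactly 85 · df). It also evaluates the conventional BIC penalty k = log(n) for comparison.

```r
# Sketch: reproduce the printed information criteria from the log-likelihoods.
# n = 85 is inferred from the printed output and is an assumption.
ll <- c(full = -455.9609, bic = -458.7583)   # printed log-likelihoods
df <- c(full = 9, bic = 6)                   # printed degrees of freedom
n  <- 85

aic   <- -2 * ll + 2 * df                    # extractAIC(..., k = 2), the default
pen.n <- -2 * ll + n * df                    # extractAIC(..., k = nrow(d.w))
bic   <- -2 * ll + log(n) * df               # conventional BIC penalty

print(round(rbind(aic, pen.n, bic), 3))
```

Under all three penalties the reduced model has the smaller criterion value, so the ranking of the two models does not depend on this choice here.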

> summary(r.cv.full, se=TRUE)

Statistics of cross-validation prediction errors
         me     mede     rmse     made      qne
   -19.4686   4.7745 115.0815  71.6833  88.4307
se  27.3092  26.7779  20.9183  28.4798   6.2635
       msse   medsse     crps
     1.2169   0.5443  56.8032
se   0.2820   0.2880  13.8699

> summary(r.cv.bic, se=TRUE)

Statistics of cross-validation prediction errors
         me     mede     rmse     made      qne     msse
    -1.6055  -3.0263  64.3792  63.9558  62.0317   0.7129
se  12.2338  12.9879   6.4457   7.0945   6.2314   0.1142
     medsse     crps
     0.4157  35.5114
se   0.0657   3.4191

> op <- par(mfrow=c(1,2))
> plot(r.cv.full, type="ta")
> plot(r.cv.bic, add=TRUE, col=2, type="ta")
> abline(h=0, lty="dotted")
> legend("topright", pch=1, col=1:2,
+   legend=c("full", "bic"), bty="n")
> plot(r.cv.full, type="qq",
+   ylab="standardized prediction errors")
> plot(r.cv.bic, add=TRUE, col=2, type="qq", xlab="")
> abline(0, 1, lty="dotted")
> legend("topleft", pch=1, col=1:2,
+   legend=c("full", "bic"), bty="n")
> par(op)
[figure: left panel "Tukey-Anscombe plot" (standardized prediction errors versus predictions), right panel "normal-QQ-plot of standardized prediction errors" (versus quantile N(0,1)); models full and bic]

> op <- par(mfrow=c(1,2))
> plot(r.cv.full, type="hist.pit")
> plot(r.cv.bic, col=2, type="hist.pit")
> par(op)

[figure: two panels "histogram PIT-values" (density versus PIT)]


> plot(r.cv.full, type="bs", ylim=c(0, 0.1))
> plot(r.cv.bic, add=TRUE, col=2, type="bs")
> legend("topright", pch=1, col=1:2,
+   legend=c("full", "bic"), bty="n")

[figure: "Brier score vs. cutoff" (Brier score versus cutoff) for models full and bic]

References

Diggle, P. J. and Ribeiro, Jr., P. J. (2007). Model-based Geostatistics. Springer, New York.

Gneiting, T., Balabdaoui, F., and Raftery, A. E. (2007). Probabilistic forecasts, calibration and sharpness. Journal of the Royal Statistical Society Series B, 69(2), 243–268.

Hastie, T., Tibshirani, R., and Friedman, J. (2009). The Elements of Statistical Learning; Data Mining, Inference and Prediction. Springer, New York, second edition.
