# Spatial Interpolation
Given a distribution of point meteorological stations showing precipitation values, how can I estimate the precipitation values where data were not observed?
```{r f14-precip-map, echo=FALSE, results='hide', fig.height=3, fig.cap="Average yearly precipitation (reported in inches) for several meteorological sites in Texas."}
library(terra)
library(tmap)
library(sf)
# Load precipitation point data
z <- gzcon(url("https://github.com/mgimond/Spatial/raw/main/Data/precip.rds"))
P <- readRDS(z)
P_crs <- crs(P)
# Load extent data
z <- gzcon(url("https://github.com/mgimond/Spatial/raw/main/Data/texas.rds"))
W <- readRDS(z)
# Force the point object extent to that of Texas
P@bbox <- W@bbox
# Plot the data
tm_shape(W) +tm_polygons() +
tm_shape(P) +
tm_dots(col="Precip_in", size=0.5, palette = "RdBu", title = "Precipitation (in)") +
tm_text("Precip_in", just="left", xmod=.5, size = 0.7) +
tm_legend(legend.outside=TRUE)
```
To help answer this question, we need to clearly define the nature of our point dataset. We've already encountered point data earlier in the course, where our interest was in creating point density maps using different kernel windows. However, the point data used there represented a complete enumeration of discrete events or observations--i.e. the entity of interest only occurred at discrete locations within the study area and therefore could only be *measured* at those locations. Here, our point data represent **sampled** observations of an entity that can be measured *anywhere* within our study area. So creating a point density raster from these data would only make sense if we were addressing questions like "*where are the meteorological stations concentrated within the state of Texas?*".
Another class of techniques used with points that represent samples of a **continuous field** is **interpolation**. Many interpolation tools are available, but these tools can usually be grouped into two categories: **deterministic** and **statistical** interpolation methods.
## Deterministic Approach to Interpolation
We will explore two deterministic methods: **proximity** (aka **Thiessen**) techniques and **inverse distance weighted** techniques (**IDW** for short).
### Proximity interpolation
This is probably the simplest (and possibly one of the oldest) interpolation methods. It was introduced by Alfred H. Thiessen more than a century ago. The goal is simple: assign to all unsampled locations the value of the closest sampled location. This generates a *tessellated* surface whereby lines drawn midway between neighboring sample points are connected to enclose an area around each sample point. Each area inherits the value of the sample point it encloses.
```{r f14-proximity, echo=FALSE, message=FALSE, results='hide', fig.height=3.3, results='hold', fig.cap="Tessellated surface generated from discrete point samples. This is also known as a *Thiessen interpolation*."}
library(spatstat) # Used for the dirichlet tesselation function
library(sf)
library(sp)
# Create a tessellated surface
P.ppp <- st_as_sf(P) |> st_geometry() |> as.ppp()
Window(P.ppp) <- as.owin(st_geometry(st_as_sf(W)))
th <- dirichlet(P.ppp) |> st_as_sfc()
st_crs(th) <- P_crs
th.z <- over(as_Spatial(th), P, fn=mean, na.rm = TRUE)
th.spdf <- SpatialPolygonsDataFrame(as_Spatial(th), th.z)
th.clp <- terra::intersect(W,th.spdf)
tm_shape(th.clp) + tm_polygons(col="Precip_in", palette="RdBu", auto.palette.mapping=FALSE) +
tm_legend(legend.outside=TRUE)
```
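If you prefer to stay entirely within the `sf` ecosystem, a similar tessellation can be built with `st_voronoi()`. The following is a rough sketch (not the workflow used to generate the figure above), assuming `P` (the sample points) and `W` (the Texas boundary) have been converted to `sf` objects:

```{r voronoi-sf-sketch, eval=FALSE}
library(sf)
P_sf <- st_as_sf(P)   # sample points with the Precip_in attribute
W_sf <- st_as_sf(W)   # study area polygon
# Tessellate the sample points, then clip the polygons to the study area
th_sf <- st_voronoi(st_union(st_geometry(P_sf))) |>
  st_collection_extract("POLYGON") |>
  st_set_crs(st_crs(P_sf)) |>
  st_intersection(st_geometry(W_sf)) |>
  st_sf()
# Each polygon inherits the precipitation value of the sample point it encloses
th_sf <- st_join(th_sf, P_sf["Precip_in"])
```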
One problem with this approach is that the surface values change abruptly across the tessellated boundaries. This is not representative of most surfaces in nature.
Thiessen's method was very practical in his day, when computers did not exist. But today, computers afford us more advanced methods of interpolation, as we will see next.
### Inverse Distance Weighted (IDW)
The IDW technique computes an average value for each unsampled location using weighted values from nearby sampled locations. The weights are inversely related to the distance between the sampled points and the unsampled location, with the rate of decay controlled by the IDW power coefficient. The larger the power coefficient, the stronger the weight given to nearby points, as can be gleaned from the following equation that estimates the value $Z$ at an unsampled location $j$:
$$
\hat{Z_j} = \frac{\sum_i{Z_i/d^n_{ij}}}{\sum_i{1/d^n_{ij}}}
$$
The caret $\hat{}$ above the variable $Z$ reminds us that we are estimating the value at $j$. The parameter $n$ is the weight (power) parameter applied as an exponent to the distance, thus diminishing the influence of a point at location $i$ as its distance to $j$ increases. A large $n$ results in nearby points wielding much greater influence on the unsampled location than points farther away, producing an interpolated output that resembles a Thiessen interpolation. On the other hand, a very small $n$ gives all points within the search radius nearly equal weight, so that unsampled locations end up representing little more than the mean of all sampled points within the search radius.
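To make the formula concrete, here is a minimal sketch that computes $\hat{Z_j}$ for a single unsampled location from three hypothetical sample values and distances (the numbers are made up for illustration):

```{r idw-by-hand, eval=FALSE}
Z_i  <- c(20, 35, 12)          # precipitation (in) at three sampled locations i
d_ij <- c(5000, 12000, 20000)  # distances (m) from each sample to location j
n    <- 2                      # IDW power coefficient
# Weighted average: weights are 1/d^n, so nearby points dominate as n grows
Z_j  <- sum(Z_i / d_ij^n) / sum(1 / d_ij^n)
Z_j
```

Increasing `n` in this sketch pulls the estimate toward the value of the closest sample point; decreasing it pulls the estimate toward the simple mean of the three values.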
In the following figure, the sampled points and values are superimposed on an IDW-interpolated raster generated with an $n$ value of `2`.
```{r f14-idw1,echo=FALSE, message=FALSE, results='hide', fig.height=3.3, cache=FALSE, fig.cap="An IDW interpolation of the average yearly precipitation (reported in inches) for several meteorological sites in Texas. An IDW power coefficient of 2 was used in this example."}
library(gstat) # Use gstat's idw routine
library(sp) # Used for the spsample function
library(terra)
library(sf)
library(tmap)
# Create an empty grid where n is the total number of cells
grd <- as.data.frame(spsample(W, "regular", n=50000))
names(grd) <- c("X", "Y")
coordinates(grd) <- c("X", "Y")
gridded(grd) <- TRUE # Create SpatialPixel object
fullgrid(grd) <- TRUE # Create SpatialGrid object
# Add P's projection information to the empty grid
proj4string(P) <- proj4string(P) # Temp fix until new proj env is adopted
proj4string(grd) <- proj4string(P)
# Interpolate the surface using a power value of 2 (idp=2.0)
P.idw <- gstat::idw(Precip_in ~ 1, P, newdata=grd,idp=2.0)
P.idw@data <- P.idw@data[1] # Remove variance layer
# Convert to raster object then clip to Texas
r <- rast(P.idw)
r.m <- mask(r, st_as_sf(W))
# Plot
tm_shape(r.m) +
tm_raster(n=10, palette="RdBu", auto.palette.mapping=FALSE, title="Predicted precip") +
tm_shape(P) + tm_dots(size=0.2) +
tm_legend(legend.outside=TRUE)
```
In the following example, an $n$ value of `15` is used to interpolate precipitation. This results in nearby points having greater influence on the unsampled locations. Note the similarity in output to the proximity (Thiessen) interpolation.
```{r f14-idw2, echo=FALSE, message=FALSE, results='hide', fig.height=3.3, cache=FALSE, fig.cap="An IDW interpolation of the average yearly precipitation (reported in inches) for several meteorological sites in Texas. An IDW power coefficient of 15 was used in this example."}
# Create an empty grid where n is the total number of cells
grd <- as.data.frame(spsample(W, "regular", n=50000))
names(grd) <- c("X", "Y")
coordinates(grd) <- c("X", "Y")
gridded(grd) <- TRUE # Create SpatialPixel object
fullgrid(grd) <- TRUE # Create SpatialGrid object
# Add P's projection information to the empty grid
proj4string(P) <- proj4string(P) # Temp fix until new proj env is adopted
proj4string(grd) <- proj4string(P)
# Interpolate the surface using a power value of 15 (idp=15)
P.idw <- gstat::idw(Precip_in ~ 1, P, newdata=grd, idp=15)
P.idw@data <- P.idw@data[1] # Remove variance layer
# Convert to raster object then clip to Texas
r <- rast(P.idw)
r.m <- mask(r, st_as_sf(W))
# Plot
tm_shape(r.m) + tm_raster(n=10, palette="RdBu", auto.palette.mapping=FALSE, title="Predicted precip") +
tm_shape(P) + tm_dots(size=0.2) +
tm_legend(legend.outside=TRUE)
```
### Fine tuning the interpolation parameters
Finding the best set of input parameters to create an interpolated surface can be a subjective proposition. Other than *eyeballing* the results, how can you quantify the *accuracy* of the estimated values? One option is to split the points into two sets: the points used in the interpolation operation and the points used to validate the results. While this method is easily implemented (even with pen and paper), it suffers from a significant loss of power--i.e. we are using only half of the information to estimate the unsampled locations.
A better approach (and one easily implemented in a computing environment) is to remove one data point from the dataset, interpolate its value using *all other* points in the dataset, then repeat this process for each point in the dataset (while making sure that the interpolator parameters remain constant across each interpolation). The interpolated values are then compared with the actual values from the omitted points. This method is sometimes referred to as **jackknifing** or **leave-one-out cross-validation**.
The performance of the interpolator can be summarized by computing the root mean squared error (RMSE) of the residuals as follows:
$$
RMSE = \sqrt{\frac{\sum_{i=1}^n (\hat {Z_{i}} - Z_i)^2}{n}}
$$
where $\hat {Z_{i}}$ is the interpolated value at the unsampled location *i* (i.e. location where the sample point was removed), $Z_i$ is the true value at location *i* and $n$ is the number of points in the dataset.
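In R, this reduces to a single line once the predicted and observed values are in hand. A minimal sketch, assuming vectors named `predicted` and `observed` (these names are placeholders, not objects defined elsewhere in this chapter):

```{r rmse-sketch, eval=FALSE}
# Root-mean-square error between predicted and observed values
RMSE <- sqrt(mean((predicted - observed)^2))
```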
We can create a scatterplot of the predicted vs. expected precipitation values from our dataset. The solid diagonal line represents the one-to-one slope (i.e. if the predicted values matched the true values exactly, then the points would fall on this line). The red dashed line is a linear fit to the points which is here to help *guide* our eyes along the pattern generated by these points.
```{r f14-rmse-plot, echo=FALSE, results='hide', fig.height=3.3, fig.cap="Scatter plot pitting predicted values vs. the observed values at each sampled location following a leave-one-out cross validation analysis."}
# Leave-one-out validation routine
IDW.out <- vector(length = length(P))
for (i in 1:length(P)) {
IDW.out[i] <- gstat::idw(Precip_in ~ 1, P[-i,], P[i,], idp=2.0)$var1.pred
}
# Plot differences
OP <- par(pty="s", mar=c(4,3,0,0))
plot(IDW.out ~ P$Precip_in, asp=1, xlab="Observed", ylab="Predicted", pch=16,
col=rgb(0,0,0,0.5))
abline(lm(IDW.out ~ P$Precip_in), col="red", lw=2,lty=2)
abline(0,1)
```
```{r echo=FALSE, message=FALSE, results='hide'}
# Compute RMSE
RMSE <- sqrt( sum((IDW.out - P$Precip_in)^2) / length(P))
```
The computed RMSE from the above working example is `r round(RMSE,3)` inches. We can extend our exploration of the interpolator's accuracy by creating a map of the confidence intervals. This involves *layering* all $n$ interpolated surfaces from the aforementioned jackknife technique, then computing the confidence interval for each location (pixel) in the output map (raster). If the range of interpolated values from the jackknife technique for an unsampled location $i$ is high, this implies that the location is highly sensitive to the presence or absence of a single sample point, producing a large confidence interval (i.e. we can't be very confident of the predicted value). Conversely, if the range of values estimated for location $i$ is low, then a small confidence interval is computed (giving us greater confidence in the interpolated value). The following map shows the 95% confidence interval for each unsampled location (pixel) in the study extent.
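For reference, the interval mapped below mirrors the classical jackknife calculation used in the code that generates the figure. Removing sample point $i$ yields an estimate $\hat{Z}_{-i}$ at each pixel, from which a pseudo-value is formed:

$$
\hat{Z}_i^* = n\hat{Z} - (n-1)\hat{Z}_{-i}
$$

where $\hat{Z}$ is the estimate computed from all $n$ points. The jackknife estimate at that pixel is the mean of the pseudo-values, $\bar{Z}^* = \sum_{i=1}^n \hat{Z}_i^*/n$, and the interval is derived from

$$
CI = \sqrt{\frac{1}{n(n-1)}\sum_{i=1}^n \left(\hat{Z}_i^* - \bar{Z}^*\right)^2}
$$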
```{r f13-confidence-map, echo=FALSE, results='hide', fig.height=3.3, cache=TRUE, fig.cap="In this example an IDW power coefficient of 2 was used and the search parameters was confined to a minimum number of points of 10 and a maximum number of points of 15. The search window was isotropic. Each pixel represents the range of precipitation values (in inches) around the expected value given a 95% confidence interval."}
# #######################################
# Use the jackknife technique to estimate
# a confidence interval at each
# unsampled point.
# #######################################
img <- gstat::idw(Precip_in~1, P, newdata=grd, idp=2.0, nmin=10,nmax=15)
n <- length(P)
Zi <- matrix(nrow = length(img$var1.pred), ncol = n)
# Remove a point then interpolate (do this n times for each point)
#st <- stack()
st <- rast()
for (i in 1:n){
Z1 <- gstat::idw(Precip_in~1, P[-i,], newdata=grd, idp=2.0,nmax=15)
#st <- addLayer(st,raster(Z1,layer=1))
st <- c(st, rast(Z1)$var1.pred )
# Calculated pseudo-value Z at j
Zi[,i] <- n * img$var1.pred - (n-1) * Z1$var1.pred
}
# Jackknife estimator of parameter Z at location j
Zj <- as.matrix(apply(Zi, 1, sum, na.rm=T) / n )
# Compute (Zi* - Zj)^2
c1 <- apply(Zi,2,'-',Zj) # Compute the subtraction
c1 <- apply(c1^2, 1, sum, na.rm=T ) # sum the square of the subtraction
# Compute the confidence interval
CI <- sqrt( 1/(n*(n-1)) * c1)
# Create (CI / interpolated value) raster
img.sig <- img
img.sig$v <- CI /img$var1.pred
# Plot the data
r <- rast(img.sig)$v
r.m <- mask(r, st_as_sf(W))
tm_shape(r.m) + tm_raster(n=7, title="95% confidence interval (in inches)") +tm_shape(P) + tm_dots(size=0.2) +
tm_legend(legend.outside=TRUE)
```
IDW interpolation is probably one of the most widely used interpolators because of its simplicity. In many cases, it can do an adequate job. However, the choice of power remains subjective. There is another class of interpolators that makes use of the information provided to us by the sample points--more specifically, information pertaining to 1st and 2nd order behavior. These interpolators are covered next.
## Statistical Approach to Interpolation
Statistical interpolation methods include **trend surfaces** and **kriging**.
### Trend Surfaces
It may help to think of trend surface modeling as a regression on the spatial coordinates, where the coefficients apply to those coordinate values and (for more complicated trend surfaces) to the interaction of the coordinate values. We will explore 0^th^ order, 1^st^ order and 2^nd^ order trend surfaces in the following sub-sections.
#### 0^th^ Order Trend Surface
The first (and simplest) model is the 0^th^ order model, which takes on the following expression: `Z = a`, where the intercept `a` is the mean precipitation value of all sample points (`r round(mean(P$Precip_in),1)` in our working example). This is simply a level (horizontal) surface whose cell values all equal `r round(mean(P$Precip_in),1)`.
```{r f13-poly0, echo=FALSE, results='hide', fig.height=3.3, cache=TRUE, fig.cap="The simplest model where all interpolated surface values are equal to the mean precipitation.", fig.align='center'}
# Define the polynomial equation
f.0 <- as.formula(Precip_in ~ 1)
# Run the regression model
lm.0 <- lm( f.0 , data=P)
# Use the regression model output to interpolate the surface
dat.0th <- SpatialGridDataFrame(grd, data.frame(var1.pred = predict(lm.0, newdata=grd)))
# Convert to raster object to take advantage of rasterVis' imaging
# environment
r <- rast(dat.0th)
r.m <- mask(r, st_as_sf(W))
tm_shape(r.m) + tm_raster( title="Predicted precip") +tm_shape(P) + tm_dots(size=0.2) +
tm_legend(legend.outside=TRUE)
```
This makes for an uninformative map. A more interesting surface trend map is one where the surface trend has a slope other than 0 as highlighted in the next subsection.
#### 1^st^ Order Trend Surface
The first order surface polynomial is a slanted plane whose formula is given by `Z = a + bX + cY`, where `X` and `Y` are the coordinates.
```{r f13-poly1, echo=FALSE, message=FALSE, results='hide', fig.height=3.3, cache=TRUE, fig.cap="Result of a first order interpolation.", fig.align='center'}
# Define the 1st order polynomial equation
f.1 <- as.formula(Precip_in ~ X + Y)
# Add X and Y to P
P$X <- coordinates(P)[,1]
P$Y <- coordinates(P)[,2]
# Run the regression model
lm.1 <- lm( f.1, data=P)
# Use the regression model output to interpolate the surface
dat.1st <- SpatialGridDataFrame(grd, data.frame(var1.pred = predict(lm.1, newdata=grd)))
r <- rast(dat.1st)
r.m <- mask(r, st_as_sf(W))
tm_shape(r.m) + tm_raster(n=10, palette="RdBu", auto.palette.mapping=FALSE,title="Predicted precip") +tm_shape(P) + tm_dots(size=0.2) +
tm_legend(legend.outside=TRUE)
```
The 1^st^ order trend surface does a good job of highlighting the prominent east-west trend. But is the trend truly uniform along the X axis? Let's explore a more complicated surface: the quadratic polynomial.
#### 2^nd^ Order Trend Surface
The second order surface polynomial (aka quadratic polynomial) is a parabolic surface whose formula is given by: $Z = a + bX + cY + dX^2 + eY^2 + fXY$
```{r f13-poly2, echo=FALSE, results='hide', fig.height=3.3, cache=TRUE, fig.cap="Result of a second order interpolation", fig.align='center'}
# Define the 2nd order polynomial equation
f.2 <- as.formula(Precip_in ~ X + Y + I(X*X)+I(Y*Y) + I(X*Y))
# Run the regression model
lm.2 <- lm( f.2, data=P)
# Use the regression model output to interpolate the surface
dat.2nd <- SpatialGridDataFrame(grd, data.frame(var1.pred = predict(lm.2, newdata=grd)))
r <- rast(dat.2nd)
r.m <- mask(r, st_as_sf(W))
tm_shape(r.m) + tm_raster(n=10, palette="RdBu", auto.palette.mapping=FALSE, title="Predicted precip") +tm_shape(P) + tm_dots(size=0.2) +
tm_legend(legend.outside=TRUE)
```
This interpolation picks up a slight curvature in the east-west trend. But it's not a significant improvement over the 1st order trend.
### Ordinary Kriging
Several forms of kriging interpolators exist: ordinary, universal and simple, just to name a few. This section will focus on **ordinary kriging** (**OK**) interpolation. This form of kriging usually involves the following steps:
+ Removing any **spatial trend** in the data (if present).
+ Computing the **experimental variogram**, $\gamma$, which is a measure of spatial autocorrelation.
+ Defining an **experimental variogram model** that best characterizes the spatial autocorrelation in the data.
+ Interpolating the surface using the fitted variogram model.
+ Adding the kriged interpolated surface to the trend interpolated surface to produce the final output.
These steps are outlined in the following subsections.
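As a roadmap, the following sketch shows how these steps map onto `gstat` calls for our working example. It assumes the point object `P` (with `Precip_in`, `X` and `Y` fields) and the prediction grid `grd` created earlier; the variogram model and its parameter values are placeholders that we will settle on in the subsections that follow.

```{r ok-roadmap, eval=FALSE}
library(gstat)
# 1. Trend model used to de-trend the data (first order polynomial)
f.1 <- as.formula(Precip_in ~ X + Y)
# 2-3. Experimental variogram of the de-trended values, and a fitted model
var.smpl <- variogram(f.1, P)
dat.fit  <- fit.variogram(var.smpl,
                          vgm(psill = 18, model = "Sph",
                              range = 500000, nugget = 0))
# 4-5. Krige; passing the trend formula adds the trend back automatically
dat.krg <- krige(f.1, P, grd, dat.fit)
```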
#### De-trending the data
One assumption that needs to be met in ordinary kriging is that the mean and the variation of the entity being studied are constant across the study area. In other words, there should be no global trend in the data (the term *drift* is sometimes used to describe this trend in other texts). This assumption is clearly not met with our Texas precipitation dataset, where a prominent east-west gradient is observed. This requires that we remove the trend from the data before proceeding with the kriging operations.
Many pieces of software will accept a trend model (usually a first, second or third order polynomial). In the steps that follow, we will use the first order fit computed earlier to de-trend our point values (recall that the second order fit provided very little improvement over the first order fit).
```{r echo=FALSE}
# Define the 1st order polynomial equation
f.1 <- as.formula(Precip_in ~ X + Y)
# Run the regression model
lm.1 <- lm( f.1, data=P)
# Copy the residuals to the point object
P$res <- lm.1$residuals
```
Removing the trend leaves us with the residuals that will be used in kriging interpolation. Note that the modeled trend will be added to the kriged interpolated surface at the end of the workflow.
```{r f13-detrend-map, echo=FALSE, results='hide', fig.height=3.3, cache=TRUE, fig.cap= "Map showing de-trended precipitation values (aka residuals). These detrended values are then passed to the ordinary kriging interpolation operations. You can think of these residuals as representing variability in the data not explained by the global trend. If variability is present in the residuals then it is best characterized as a distance based measure of variability (as opposed to a location based measure)."}
OP <- par(mar=c(0,0,0,0))
plot(W, col=rgb(0,0,0,0.05), border="grey")
plot(P, pch=20, cex=(P$res/61 +0.5), col=rgb(0,0,0,0.5), add=T)
text(P, sprintf("%.1f",P$res), pos=4, col="grey40", cex=0.5)
par(OP)
```
#### Experimental Variogram
In Kriging interpolation, we focus on the spatial relationship between location attribute values. More specifically, we are interested in how these attribute values (precipitation residuals in our working example) vary as the distance between location point pairs increases. We can compute the difference, $\gamma$, in precipitation values by squaring their differences then dividing by 2.
For example, if we take two meteorological stations (one whose de-trended precipitation value is -1.2 and the other whose value is 1.6),
```{r f13-two-sites, echo=FALSE, results='hide', fig.height=2.5, fig.cap="Locations of two sample sites used to demonstrate the calculation of *gamma*.", cache=TRUE}
OP <- par(mar=c(0,0,0,0))
plot(W, col=rgb(0,0,0,0.05), border="grey")
plot(P, pch=20, cex=( (P$res/61 +1)), col=rgb(0,0,0,0.5), add=T)
plot(P[c(2,21), ], pch=20, cex=(P$res/61 +3.5), col=rgb(1,0,0), add=T)
text(P[c(2,21), ], sprintf("%.1f",P$res[c(2,21)]), pos=4, col="grey40", cex=0.8)
par(OP)
```
we can compute their difference ($\gamma$) as follows:
$$
\gamma = \frac{(Z_2 - Z_1)^2}{2} = \frac{(-1.2 - (1.6))^2}{2} = 3.92
$$
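The same calculation can be done directly in R (a trivial sketch using the two residual values shown above):

```{r gamma-sketch, eval=FALSE}
z1 <- -1.2              # de-trended value at the first station
z2 <-  1.6              # de-trended value at the second station
gamma <- (z2 - z1)^2 / 2
gamma                   # 3.92
```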
We can compute $\gamma$ for *all* point pairs then plot these values as a function of the distances that separate these points:
```{r f14-variogram, echo=FALSE, fig.show='hold', fig.height=3.3, results='hide', cache=TRUE, fig.cap="Experimental variogram plot of precipitation residual values."}
var.cld <- gstat::variogram(res ~ 1, P, cloud = TRUE, cutoff=1000000)
var.df <- as.data.frame(var.cld)
index1 <- which(with(var.df, left==21 & right==2))
OP <- par( mar=c(4,6,1,1))
plot(var.cld$dist/1000 , var.cld$gamma, col="grey",
xlab = "Distance between point pairs (km)",
ylab = expression( frac((res[2] - res[1])^2 , 2)) )
points(var.cld$dist[index1]/1000 , var.cld$gamma[index1], col="red", pch=16)
par(OP)
```
The red point in figure \@ref(fig:f14-variogram) is the value computed in the above example; the distance separating those two points is about `r round(var.df[index1,]$dist/1000)` km.
The above plot is called an **experimental semivariogram** cloud plot (also referred to as an experimental **variogram** cloud plot). The terms semivariogram and variogram are often used interchangeably in geostatistics (we'll use the term variogram henceforth since this seems to be the term of choice in current literature). Also note that the word *experimental* is sometimes dropped when describing these plots, but its use in our terminology is an important reminder that the points we are working with are just samples of some continuous field whose spatial variation we are attempting to model.
#### Sample Experimental Variogram
Cloud points can be difficult to interpret due to the sheer number of point pairs (we have 465 point pairs from just 50 sample points, and this for just 1/3 of the maximum distance lag!). A common approach to resolving this issue is to "bin" the cloud points into distance intervals called **lags** and to summarize the points within each interval. In the following plot, we split the data into 15 bins, then compute the average $\gamma$ value for each bin (displayed as red points in the plot). The red points that summarize the cloud are the **sample experimental variogram estimates** for each of the 15 distance bands, and the plot is referred to as the **sample experimental variogram** plot.
```{r f14-sample-variogram, echo=FALSE, fig.show='hold', fig.height=3.3, results='hide', cache=TRUE, fig.cap="Sample experimental variogram plot of precipitation residual values."}
# Compute the sample experimental variogram
var.smpl <- gstat::variogram(f.1, P, cloud = FALSE, cutoff=1000000, width=89900)
bins.ct <- c(0, var.smpl$dist , max(var.cld$dist) )
bins <- vector()
for (i in 1: (length(bins.ct) - 1) ){
bins[i] <- mean(bins.ct[ seq(i,i+1, length.out=2)] )
}
bins[length(bins)] <- max(var.cld$dist)
var.bins <- findInterval(var.cld$dist, bins)
# Point data cloud with bin boundaries
OP <- par( mar = c(5,6,1,1))
plot(var.cld$gamma ~ eval(var.cld$dist/1000), col=rgb(0,0,0,0.2), pch=16, cex=0.7,
xlab = "Distance between point pairs (km)",
ylab = expression( gamma ) )
points( var.smpl$dist/1000, var.smpl$gamma, pch=21, col="black", bg="red", cex=1.3)
abline(v=bins/1000, col="red", lty=2)
par(OP)
```
#### Experimental Variogram Model
The next step is to fit a mathematical model to our *sample experimental variogram*. Different mathematical models can be used; their availability is software dependent. Examples of mathematical models are shown below:
```{r f14-variogram-models, echo=FALSE, fig.height=5, results='hide', fig.cap="A subset of variogram models available in R's `gstat` package."}
library(stringr)
library(lattice)
kk <- str_extract_all(gstat::vgm()$long, "\\([^()]+\\)")
kk <- substring(kk, 2, nchar(kk)-1)
gstat::show.vgms(strip=strip.custom(factor.levels=kk))
```
The goal is to apply the model that best fits our sample experimental variogram. This requires picking the proper model, then tweaking the **partial sill**, **range**, and **nugget** parameters (where appropriate). The following figure illustrates a model with a nonzero intercept, where the *nugget* is the vertical distance between $0$ variance and the variogram model's intercept with the $y$ axis. The *partial sill* is the vertical distance between the nugget and the part of the curve that levels off. If the variogram model intercepts the $y$ axis at $0$, then the nugget is $0$ and the *partial sill* is simply referred to as the **sill**. The distance along the $x$ axis at which the curve levels off is referred to as the *range*.
```{r f14-model-explained, echo=FALSE, fig.show='hold', fig.height=3.3, results='hide', cache=TRUE, fig.cap="Graphical description of the range, sill and nugget parameters in a variogram model."}
# Following plot is a modified version of Bivand et al.'s fig. 8.6
library(lattice)
library(gstat)
ccol <- 'blue'
data(meuse)
coordinates(meuse) <- c("x","y")
v <- variogram(log(zinc) ~ 1, meuse, width = 300)
v.fit <- fit.variogram(v, vgm(psill=10, "Sph", range=600, nugget=0.5))
plot(v, v.fit, pch = 16,col="grey", panel = function(x,y,subscripts,...) {
larrows(0,v.fit$psill[1], v.fit$range[2], v.fit$psill[1],
col=ccol, ends = 'both', length=.1, angle=15)
larrows(v.fit$range[2],0, v.fit$range[2], v.fit$psill[1],
col=ccol, ends = 'both', length=.1, angle=20)
larrows(v.fit$range[2],v.fit$psill[1], v.fit$range[2],
sum(v.fit$psill),
col=ccol, ends = 'both', length=.1, angle=15)
ltext(v.fit$rang[2]/2, 1.2*v.fit$psill[1], "range", col=ccol,
adj = c(.5, 0), cex=.9)
ltext(1.02 * v.fit$rang[2], 0.5 *v.fit$psill[1], "nugget", col=ccol,
adj = c(0, 0.5), cex=.9)
ltext(1.02 * v.fit$rang[2], v.fit$psill[1] + 0.5 * v.fit$psill[2],
"partial sill", col=ccol, adj = c(0, 0.5), cex=.9)
vgm.panel.xyplot(x,y,subscripts,...)
}
)
```
In our working example, we will try to fit the *Spherical* function to our sample experimental variogram. This is one of three popular models (the other two being the *linear* and *Gaussian* models).
```{r f14-spherical-model, echo=FALSE, tidy=FALSE, fig.height=3.0,results='hide', cache=TRUE, fig.cap="A spherical model fit to our residual variogram."}
# Compute the sample variogram, note the f.1 trend model is one of the parameters
# passed to variogram(). This tells the function to create the variogram on
# the de-trended data
var.smpl <- variogram(f.1, P, cloud = FALSE, cutoff=1000000, width=89900)
# Compute the variogram model by passing the nugget, sill and range values
# to fit.variogram() via the vgm() function.
dat.fit <- fit.variogram(var.smpl, fit.ranges = FALSE, fit.sills = FALSE,
vgm(psill=18, model="Sph", range=500000, nugget=0))
# The following plot allows us to gauge the fit
plot(var.smpl, dat.fit, xlim=c(0,1000000))
```
#### Kriging Interpolation
The variogram model is used by the kriging interpolator to provide *localized* weighting parameters. Recall that with IDW, the interpolated value at an unsampled site is determined from weighted neighboring points, where the weighting parameter (the power coefficient) is defined by the user and applied uniformly across the entire study extent. Kriging instead uses the variogram model to compute the weights of neighboring points based on the distribution of their values--in essence, kriging lets the localized pattern produced by the sample points define the weights (in a systematic way). The exact mathematical implementation will not be covered here (it's quite involved), but the resulting output is shown in the following figure:
```{r f14-krige01, echo=FALSE, results='hide', fig.height=3.3, cache=TRUE, fig.cap="Krige interpolation of the residual (detrended) precipitation values.", fig.align='center'}
dat.krg <- krige( res~1, P, grd, dat.fit)
r <- rast(dat.krg)$var1.pred
r.m <- mask(r, st_as_sf(W))
# Plot the raster and the sampled points
tm_shape(r.m) + tm_raster(n=10, palette="RdBu", auto.palette.mapping=FALSE,title="Predicted residual \nprecip") +tm_shape(P) + tm_dots(size=0.2) +
tm_legend(legend.outside=TRUE)
```
Recall that the kriging interpolation was performed on the de-trended data. In essence, we predicted the precipitation values based on localized factors. We now need to combine this interpolated surface with the trend surface interpolated earlier to produce the following output:
```{r f14-krige02, echo=FALSE, results='hide', fig.height=3.3, cache=TRUE, fig.cap="The final kriged surface.", fig.align='center'}
# gstat's krige() function allows us to include the trend model thus saving us from
# having to detrend the data, krige the residuals, then combine the rasters. Instead,
# all we need to do is pass krige() the trend formula
# Define trend model
f.1 <- as.formula(Precip_in ~ X + Y)
# Perform the krige interpolation (note the use of the variogram model
# created in an earlier step)
dat.krg <- krige( f.1, P, grd, dat.fit)
# Convert kriged surface to a raster object
r <- rast(dat.krg)$var1.pred
r.m <- mask(r, st_as_sf(W))
tm_shape(r.m) + tm_raster(n=10, palette="RdBu", auto.palette.mapping=FALSE, title="Predicted precip") +tm_shape(P) + tm_dots(size=0.2) +
tm_legend(legend.outside=TRUE)
```
A valuable by-product of the kriging operation is the variance map which gives us a measure of uncertainty in the interpolated values. The smaller the variance, the better (note that the variance values are in squared units).
```{r f14-krige03, echo=FALSE, results='hide', fig.height=3.3, cache=TRUE, fig.cap="Variance map resulting from the Kriging analysis.", fig.align='center'}
# The dat.krg object stores not just the interpolated values, but the
# variance values as well. These can be passed to the raster object
# instead of the interpolated values as follows
r <- rast(dat.krg)$var1.var
r.m <- mask(r, st_as_sf(W))
tm_shape(r.m) + tm_raster(n=7, palette="Reds", title="Variance map \n(in squared inches)") + tm_shape(P) + tm_dots(size=0.2) +
tm_legend(legend.outside=TRUE)
```