In this practicum we guide you through the fitting of thin-plate spline based deformation models for nonstationary spatial covariance structure using software for

  1. bending energy penalized, weighted least squares fitting available in the EnviroStat package, and

  2. L1 regularized maximum likelihood fitting using a partial warps parameterization of the thin-plate splines.

The R code uses as an example a French 10-day rainfall dataset for 39 sites in the Languedoc-Roussillon region of southern France.

Fig 1. L-R site map

Fig 1. L-R site map

Sites shown in Fig 1 belong to three French departments: Aude, Herault, and Gard, and cover and area from the Mediterranean coast to the southeast mountain range of the Massif Central.

Fig 2. L-R tourist map

Fig 2. L-R tourist map

Preliminary analysis and transformation of the time series was necessary in order to obtain site means and variances that were reasonably homogeneous over time and over the whole region. First, the time series were deseasonalized by subtraction of a periodic seasonal term. The analystist considered both local seasonal terms for each site as well as a common seasonal abjustment for all sites in the region. The differences among sites were small and so they chose a common seasonal adjustment for all sites. This simplifies reconstruction of the series after prediction of rainfall at unmoitored sites.

Second, modeling an altitude effect was necessary to correct for local variation in site means and variances. After deseasonalization, the cumulated rains over ten-day periods showed site means clearly linearly proportional to the corresponding standard errors, suggesting a multiplicative correction term. For details, see

Monestiez, P., Sampson, P.D., Guttorp, P. (1993). Modeling of Heterogeneous Spatial Correlation Structure by Spatial Deformation, Cahiers de Géostatistique - Fascicule 3, Compte Rendu des Journées de Géostatistique, 25-26 Mai 93, Fontainebleau, Ecole Nationale Supérieure des Mines de Paris.

The model fitting requires only a sample site-site covariance matrix from a space-time dataset and rectangular (Euclidean) coordinates of the sites at which data are observed. Note that there are varying amounts of missing data in the dataset. The current fitting assumes the calculation of a covariance matrix computed from the available monitoring data, dealing with, but not accounting for, the missing data. For example, covariances may be computed from all pairwise-complete data.

We start by loading packages and functions.

# Apply the EnviroStat package version of the early WLS deformation
# algorithm
library(EnviroStat)
library(MASS)

source("http://www.stat.washington.edu/peter/PASI/deformloglik.l1.R")
source("http://www.stat.washington.edu/peter/591/labs/Lab2/visualize.tps.warps.R")
source("http://www.stat.washington.edu/peter/591/labs/Lab2/draw2.R")  # to replace a function in EnviroStat

Next we read in data.

# For reading the French precipitation dataset
plot.title <- "Pluie.LanguedocRoussillon"
Pluie <- read.csv("http://www.stat.washington.edu/peter/PASI/pluieNA.novdec.csv")
Pluie <- Pluie[, -1]
PluieStations <- read.csv("http://www.stat.washington.edu/peter/PASI/pluieStations.Practicum.csv")
X <- as.matrix(PluieStations[, c("x", "y")])
dimnames(X) <- list(PluieStations$numero, c("x", "y"))
# Feel free to look at the other columns of PluieStations to see, for
# example, site names.  Take a quick look at the configuration of the
# monitoring sites
plot(X[, 1], X[, 2], asp = 1, type = "n")
text(X[, 1], X[, 2], dimnames(X)[[1]])
S <- cov(log(Pluie + 1), use = "pair")  # Pairwise covariances for log precip
t <- nrow(Pluie)

Now standardize the coordinate configuration by centering to mean zero, scaling to “centroid size” one, and rotating to the principal axes of the configuration. The centering and scaling are primarily for purposes of numerical stability and interpretation of model parameters across various applications.

Xs <- scale(X,scale=FALSE)  
Xs <- Xs/sqrt(sum(Xs^2))
Xsvd <- svd(Xs)
# Here we make sure diaonal elements of rotation matrix are positive
# by scaling columns by +/- 1.  This is to keep the rotation from flipping
# the orientation entirely.
Xsvd$v[,1] <- Xsvd$v[,1] * sign(Xsvd$v[1,1]) # 
Xsvd$v[,2] <- Xsvd$v[,2] * sign(Xsvd$v[2,2])
Xt <- Xs %*% Xsvd$v
xscaling <- list(mean=attr(Xs,"scaled:center"),
                 #scale=attr(Xs,"scaled:scale")*sqrt(2*(n-1)),
                 scale=1,
                 rotation=Xsvd$v)
X0 <- X   # Save initial coordinates
X <- Xt   # Replace X with center, scaled, rotated coordinates
n <- nrow(X)
# Take a quick look at the standardized configuration of the monitoring sites
# for comparison to what you looked at above.
plot(X[,1],X[,2],asp=1,type="n")
text(X[,1],X[,2],dimnames(X)[[1]])

Part 1. Fitting using functions from “EnviroStat” package

We first compute something we called “dispersion”, derived from the correlation matrix. This is a matrix of the variances of squared differences, essentially empirical variogram values from a spatiotemporal dataset.

Cor <- S/sqrt(diag(S) %o% diag(S))
Disp <- 2 - 2 * Cor
Gdist <- as.matrix(dist(X, upper = T, diag = T))
plot(Gdist, Disp)

We fit an exponential variogram to the dispersion data

h.lt <- Gdist[row(Gdist) < col(Gdist)]
disp.lt <- Disp[row(Disp) < col(Disp)]
variogfit <- EnviroStat::Fvariogfit3(disp.lt, h.lt, a0 = 0.1, t0 = 1, verbose = T)
x <- seq(min(h.lt), max(h.lt), 0.01)
a0 <- variogfit$a[1]
t0 <- variogfit$t0

plot(-0.2, 0, xlim = c(0, max(h.lt)), ylim = c(0, 2), xlab = "Dist", ylab = "Dispersion", 
    type = "n", main = c("Intersite dispersion vs intersite distance"))
for (i in 1:(n - 1)) for (j in (i + 1):n) points(Gdist[i, j], Disp[i, j])
lines(x, a0 + (2 - a0) * (1 - exp(-(t0 * x))), col = "red")

There are three steps in this first exercise. 1. You will first fit the deformation model without any constraints on the deformation. 2. Second, you will fit a deformation model interactively by specifying values of the parameter \(\lambda\) that penalizes the measure of “non-smoothness” (technically, the “bending energy”) of the thin-plate spline deformation. 3. Finally, you will visualize the final deformation using biorthogonal grids.

Step 1. Alternating estimation of variogram and deformation – first with no constraints on the deformation. The algorithm alternatives between fitting a variogram for a given set of (deformed) site coordinates and fitting (deforming) the coordinates for a given variogram model. This will produce a long sequence of plots over iterations. The left panel depicts the displacements of monitoring sites under the given deformation. (Note: the lack of constraints on the deformation is indicated by the parameter \(\lambda = 0\) in the subtitle.)

source("http://www.stat.washington.edu/peter/591/labs/Lab2/plotCoordChange.R")
source("http://www.stat.washington.edu/peter/591/labs/Lab2/plotDispersionDist.R")
assignInNamespace("plotCoordChange", plotCoordChange, ns = "EnviroStat")
assignInNamespace("plotDispersionDist", plotDispersionDist, ns = "EnviroStat")
sg.est <- Falternate3(Disp, X, max.iter = 100, alter.lim = 25, model = 1, lambda = 0)
# You can ignore the notice of 'warnings' printed out in this execution.

Step 2a. Select a smoothing parameter interactively. When you execute the Ftransdraw function, it will prompt you for values of the parameter lambda. Choose one that results in a deformation that does not fold.

source("http://www.stat.washington.edu/peter/591/labs/Lab2/Ftransdraw.R")
assignInNamespace("Ftransdraw", Ftransdraw, ns = "EnviroStat")
coords.grid <- Fmgrid(range(X[, 1]), range(X[, 2]))
par(mfrow = c(1, 2))
# temp <- setplot(X, axes = TRUE)
plot(X, type = "n", xlab = "", ylab = "", asp = T)
deform <- Ftransdraw(disp = Disp, Gcrds = X, MDScrds = sg.est$ncoords, gridstr = coords.grid)

Step 2b. Re-estimate variogram and deformation with the degree of smoothing you decided on above. To do this, replace “x.xx” in the code below with the value you chose.

lam <- x.xx
sg.est <- Falternate3(Disp, X, max.iter = 100, alter.lim = 25, model = 1, lambda = lam)

Step 3. Compute the thin-plate spline from the final result of Ftransdraw. The specification of “lam = 0” is to apply no further smoothing to the “deform” coordinates computed above in step 2b.

Tspline <- sinterp(X, deform$Dcrds, lam = 0)
temp <- Ftransdraw(disp = Disp, Gcrds = X, MDScrds = Tspline$y, gridstr = coords.grid, 
    eye = F)

Draw a “biorthogonal grid” illustrating the spatially varying affine derivative (local anisotropy) of the deformation or correlation structure.

par(mfrow = c(1, 1), mgp = c(2, 0.75, 0))
Tgrid <- bgrid(start = c(0, 0), xmat = X, coef = Tspline$sol)

plot(X, xlab = "", ylab = "", type = "n", asp = TRUE)
text(X, labels = 1:nrow(X), cex = 0.75)
Bgrid.title <- paste("WLS Bgrid: ", plot.title, "\n lambda=", lam, sep = "")
drawout <- draw2(Tgrid, fs = TRUE, lwidth = c(2, 2), lcolor = topo.colors(5), 
    legend = TRUE)
title(main = Bgrid.title)
title(sub = "Lines are colored by directional magnitude of the
      principal axes of the deformation")
# You might wish to set legend=FALSE if the legend obscures part of the plot
# in an Rstudio plot window.

Questions for part 1: Write a description of the fitted nonstationary spatial correlation structure addressing the following questions:

  • In what area(s) of the map is spatial correlation strongest and what area(s) is it weakest?
  • Where is local anisotropy greatest?
  • Are there regions where the local correlation structure appears nearly isotropic?
  • Can you relate any of this estimated structure to the topography in the tourist map shown above?

Part 2. Switch to fitting with partial warps

Here we will quite different notation from that used above. First we’ll specify some initial values for parameters of an exponential correlation model

theta <- log(0.2)  # range parameter
sigmaSqrd <- log(0.05)  # scale parameter
eta <- log(0.1)  # nugget parameter
spatialParams <- c(theta, sigmaSqrd, eta)
n <- nrow(X)

Initialize matrix of partial warp coefficients:

beta0 <- matrix(0, nrow = 2, ncol = n - 2)

Calculate the bending energy matrix B (see the texts by Bookstein (1991) or Dryden and Mardia (1998)).

h <- as.matrix(dist(X, diag = TRUE, upper = TRUE))
K <- h^2 * log(h)
diag(K) <- 0
one <- rep(1, n)
Gamma <- cbind(K, one, X)
Gamma <- rbind(Gamma, c(one, 0, 0, 0))
Gamma <- rbind(Gamma, cbind(t(X), matrix(0, 2, 3)))
Ginv <- solve(Gamma)
Ginv <- (Ginv + t(Ginv))/2  # make exactly symmetric prior to eigen

B <- Ginv[1:n, 1:n]
Beig <- eigen(B)
g <- Beig$vectors
l <- Beig$values

We need two versions of deformloglik: (1) one for estimation of correlation parameters given the deformation parameters, “beta”, and (2) one for estimation of deformation parameters given the correlation parameters.

In addition to specifying which are the parameters (“par”) to be optimized, the ‘deformloglik.beta’ function has arguments to allow specification of subset of the partial warps to be included in the fitting. ‘jWarp’ is a vector specifying which of the partial warps to include. Generally initialize all the partial warp parameters to zero, meaning that we start with an identity mapping.

# jWarp <- 1:(n-3)

Or select a subset of warps, throwing out those with high bending energy. E.g.,

jWarp <- 1:10
lambda <- 1.5  # L1 constraint

beta0 <- matrix(0, nrow = 2, ncol = 1 + length(jWarp))
beta <- as.vector(beta0)

After executing the following code with the parameter lambda = 1.5 on the L1 constraint, try some other values if you have time. You may also try fitting more warps, but this may be time consuming.

We alternate optimization of variogram parameters and coefficients for the partial warps. This version runs a specified number of iterations; it is not set up to assess convergence. Estimated values of the variogram parameters and the betas are printed at each iteration, along with the likelihood fit criterion, so you can see how these are changing.

# nrep <- 25
nrep <- 5

for (i in 1:nrep) {
    # (1) Optimize variogram parameters
    beta <- matrix(beta, nrow = 2, ncol = 1 + length(jWarp))
    betafull <- matrix(0, nrow = 2, ncol = (n - 2))
    betafull[, c(1, jWarp + 1)] <- beta
    likfit.variog <- optim(par = spatialParams, deformloglik.variog.lambda, 
        hessian = T, B = B, X = X, beta = betafull, S = S, t = t, lambda = lambda)  #,
    # method='BFGS')
    spatialParams <- likfit.variog$par
    print(round(exp(spatialParams), 4))
    print(paste("convergence", likfit.variog$convergence))
    # (2) Optimize deformation parameters
    beta <- as.vector(beta)
    # likfit.beta <- optim(par=beta,deformloglik.beta,hessian=T,
    # control=list(maxit=500),B=B,X=X,
    # spatialParams=spatialParams,S=S,t=t,jWarp=jWarp, method='BFGS')
    likfit.beta <- optim(par = beta, deformloglik.beta.lambda, hessian = T, 
        control = list(maxit = 500, trace = 1), B = B, X = X, spatialParams = spatialParams, 
        S = S, t = t, jWarp = jWarp, lambda = lambda, method = "BFGS")
    beta <- likfit.beta$par
    
    betafit <- matrix(likfit.beta$par, nrow = 2)
    print(round(betafit, 2))
    print(paste("convergence", likfit.beta$convergence))
}

# Wait for the iterations of the above loop to complete.  DID ANY OF THE
# PARAMETER ESTIMATES COME OUT TO BE (CLOSE TO) ZERO?  Save estimated
# parameters.

betafitfull <- matrix(0, nrow = 2, ncol = (n - 2))
betafitfull[, c(1, jWarp + 1)] <- betafit

# The code carried out optimization of log-transformations of the parameters
# theta, sigmaSqrd, and eta, as they must be constrained to be positive.  We
# exponentiate to get them on the original scales.
theta <- exp(spatialParams[1])
sigmaSqrd <- exp(spatialParams[2])
eta <- exp(spatialParams[3])

Now we plot results. We start with (a) The deformation and plots of covariance and correlation vs distance

par(mfrow = c(1, 2))
Fdraw_deform(X, beta = betafitfull, xscaling = xscaling, colpts = "black", cexpts = 0.5)
Y <- Fcomp_deform(X, beta = betafitfull, X)

fitted.warps <- "all"  # or, for example,
# fitted.warps <- '1:10'
plot.title <- "Pluie"
# pdftitle1 <- paste(plot.title,', lambda=',lambda,', warps ',
# fitted.warps,' .pdf',sep='') pdf(pdftitle1)

# Original coordinates
par(mgp = c(2, 0.75, 0), mar = c(3, 2, 2, 1))
eqscplot(X0[, 1], X0[, 2], type = "n", tol = 0.25)
text(X0[, 1], X0[, 2], cex = 0.75)
title(main = "Lambert projection coords")

# The deformation
par(mfrow = c(1, 2))
Fdraw_deform(X, beta = betafitfull, xscaling = xscaling, cexpts = 0.75)

par(mfrow = c(2, 2), mar = c(4, 4, 3, 1))
# Covariance vs Geographic distances
Gdist <- as.matrix(dist(X, upper = T, diag = T))
plot(Gdist, S, ylab = "Covariance")
Dsort <- sort(Gdist)
Esort <- sigmaSqrd * exp((-1/theta) * Dsort)
Esort[Dsort == 0] <- Esort[Dsort == 0] + eta
lines(Dsort, Esort, col = "red")

# Covariance vs Deformed Coordinate distances
Ddist <- as.matrix(dist(Y, upper = T, diag = T))
plot(Ddist, S, ylab = "Covariance")
Dsort <- sort(Ddist)
Esort <- sigmaSqrd * exp((-1/theta) * Dsort)
Esort[Dsort == 0] <- Esort[Dsort == 0] + eta
lines(Dsort, Esort, col = "red")

# Similarly, correlations vs Geographic and Deformed distances
Cor <- S/sqrt(diag(S) %o% diag(S))
plot(Gdist, Cor, ylab = "Correlation")
Ecorsort <- sigmaSqrd * exp((-1/theta) * Dsort)/(eta + sigmaSqrd)
lines(Dsort, Ecorsort, col = "red")
plot(Ddist, Cor, ylab = "Correlation")
lines(Dsort, Ecorsort, col = "red")

# dev.off()
  1. Visualization of contribution of each of the partial warp components. Note that this generates many pages of plots (depending on how many principal warps are used in the fitting). You’ll need to click on the “back arrow” to see the earlier pages (or save this in a pdf by uncommenting the pdf lines below).
# pdftitle2 <- paste(plot.title,', lambda=',lambda,', by warp ',
# fitted.warps,' .pdf',sep='') pdf(pdftitle2)

par(mfrow = c(2, 2))
# Affine
beta <- betafitfull
beta[, 2:(n - 2)] <- 0
Fdraw_deform(Xt, beta, cexpts = 0.75, drawsq = F)
title(main = paste("Affine, coefs = ", paste(round(beta[, 1], 2), collapse = ",")))

for (j in (1 + jWarp)) {
    beta <- betafitfull
    beta[, (1:(n - 2)) != j] <- 0
    Fdraw_deform(Xt, beta, cexpts = 0.75, drawsq = F)
    title(main = paste("PW", j, paste(round(beta[, j], 2), collapse = ",")))
}
# dev.off()
  1. Biorthogonal grid visualization of each of the composite deformation. You might want to save this plot in a pdf for comparision with the biorthogonal grid computed in Part 1.
# pdf(paste(plot.title,', lambda=',lambda,', warps ', fitted.warps,'
# bgrid.pdf',sep=''))
par(mgp = c(2, 0.75, 0))
Tspline <- sinterp(X, Y, lam = 0)
Tgrid <- bgrid(start = c(0, 0), xmat = X, coef = Tspline$sol, perpc = 10)


plot(X, xlab = "", ylab = "", type = "n", asp = TRUE)
text(X, labels = 1:nrow(X), cex = 0.75)
Bgrid.title <- paste("Bgrid: ", plot.title, "\n lambda=", lambda, ", warps ", 
    fitted.warps, sep = "")
temp <- draw2(Tgrid, fs = TRUE, lwidth = c(2, 2), lcolor = topo.colors(5), legend = TRUE)  #,lwidth=c(1,3))
title(main = Bgrid.title)
title(sub = "Lines are colored by directional magnitude of the
      principal axes of the deformation")
# dev.off()

Questions for part 2.

  • The first question is the obvious one: which fitting method — the weighted least squares or the L1 penalized principal warps fitting — found a better fitting model (without folding)? Do the differences in results look to be substantial?

Then answer the same additional questions as raised for part 1.

  • In what area(s) of the map is spatial correlation strongest and what area(s) is it weakest?
  • Where is local anisotropy greatest?
  • Are there regions where the local correlation structure appears nearly isotropic?
  • Can you relate any of this estimated structure to the topography in the tourist map?