GRASS GIS manual: r.gwr

NAME

r.gwr - Calculates geographically weighted regression from raster maps.

KEYWORDS

SYNOPSIS

r.gwr

r.gwr --help

r.gwr [-ge] mapx=name[,name,...] mapy=name [mask=name] [residuals=name] [estimates=name] [coefficients=string] [output=name] [kernel=string] [bandwidth=integer] [vf=integer] [npoints=integer] [memory=integer] [--overwrite] [--help] [--verbose] [--quiet] [--ui]

Flags:

-g: Print in shell script style
-e: Estimate optimal bandwidth
--overwrite: Allow output files to overwrite existing files
--help: Print usage summary
--verbose: Verbose module output
--quiet: Quiet module output
--ui: Force launching GUI dialog

Parameters:

mapx=name[,name,...] [required]: Map(s) with X variables
mapy=name [required]: Map with Y variable
mask=name: Raster map to use for masking; Only cells that are not NULL and not zero are processed
residuals=name: Map to store residuals
estimates=name: Map to store estimates
coefficients=string: Prefix for maps to store coefficients
output=name: ASCII file for storing regression coefficients (output to screen if file not specified).
kernel=string: Weighting kernel function; Options: gauss, epanechnikov, bisquare, tricubic; Default: gauss
bandwidth=integer: Bandwidth of the weighting kernel; Default: 10
vf=integer: Variance factor for Gaussian kernel: variance = bandwidth / factor; Options: 1, 2, 4, 8; Default: 1
npoints=integer: Number of points for adaptive bandwidth; If 0, fixed bandwidth is used; Default: 0
memory=integer: Memory in MB for adaptive bandwidth; Default: 300

DESCRIPTION

r.gwr calculates a geographically weighted multiple linear regression from raster maps, according to the formula

Y = b0 + sum(bi*Xi) + E

where

X = {X1, X2, ..., Xm}
m = number of explaining variables
Y = {y1, y2, ..., yn}
Xi = {xi1, xi2, ..., xin}
E = {e1, e2, ..., en}
n = number of observations (cases)

In R notation:

Y ~ sum(bi*Xi)
b0 is the intercept, X0 is set to 1

The β coefficients are localized, i.e. determined for each cell individually. These β coefficients are the most important output of r.gwr. Spatial patterns and localized outliers in these coefficients can reveal details of the relation of Y to X. Outliers in the β coefficients can also be caused by a small bandwidth and can be removed by increasing the bandwidth.
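The idea of localized coefficients can be illustrated with a minimal sketch. It is not the r.gwr implementation: it assumes a single explaining variable, a Gaussian kernel, distances measured in cells, and a hypothetical input layout of (row, col, x, y) tuples. The function names gauss_weight and local_regression are made up for this illustration.

```python
import math


def gauss_weight(d, bw):
    """Gaussian kernel: w = exp(-0.5 * (d / bw)^2)."""
    return math.exp(-0.5 * (d / bw) ** 2)


def local_regression(cells, center, bw):
    """Weighted least squares fit of y = b0 + b1 * x around `center`.

    cells  : list of (row, col, x, y) tuples (hypothetical layout)
    center : (row, col) of the cell whose coefficients are estimated
    bw     : kernel bandwidth, in cells
    """
    sw = swx = swy = swxx = swxy = 0.0
    for row, col, x, y in cells:
        d = math.hypot(row - center[0], col - center[1])
        w = gauss_weight(d, bw)
        sw += w
        swx += w * x
        swy += w * y
        swxx += w * x * x
        swxy += w * x * y
    # Closed-form solution of the weighted normal equations (one predictor)
    denom = sw * swxx - swx * swx
    b1 = (sw * swxy - swx * swy) / denom
    b0 = (swy - b1 * swx) / sw
    return b0, b1


# Hypothetical 10 x 10 grid: Y = 1 * X in the western half, Y = 5 * X in the east
cells = []
for row in range(10):
    for col in range(10):
        x = float((row * 7 + col * 3) % 11)
        slope = 1.0 if col < 5 else 5.0
        cells.append((row, col, x, slope * x))

west = local_regression(cells, (5, 0), bw=1.0)
east = local_regression(cells, (5, 9), bw=1.0)
print("west b0, b1:", west)
print("east b0, b1:", east)
```

A global regression over this grid would report one compromise slope, whereas the local fits recover the two different slopes on either side.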

Geographically weighted regressions should be used as a diagnostic tool and not as an interpolation method. If a geographically weighted regression provides a higher R squared than the corresponding global regression, then a crucial predictor is missing in the model. If that missing predictor cannot be estimated or is supposed to behave randomly, a geographically weighted regression might be used for interpolation, but the result, in particular the variation of the β coefficients, needs to be judged according to prior assumptions. See also the manual and the examples of the R package spgwr.

r.gwr is designed for large datasets that cannot be processed in R. A p value is therefore not provided, because even very small, meaningless effects will become significant with a large number of cells. Instead, it is recommended to judge whether the inclusion of a given explaining variable in the model is justified by the amount of variance explained (R squared for a given variable) and the gain in AIC (AIC without a given variable minus AIC global must be positive).

The explaining variables

R squared for each explaining variable represents the additional amount of explained variance when including this variable compared to when excluding this variable, that is, this amount of variance is explained by the current explaining variable after taking into consideration all the other explaining variables.

The F score for each explaining variable makes it possible to test whether the inclusion of this variable significantly increases the explaining power of the model, relative to the global model excluding this explaining variable. This means that the F value for a given explaining variable is only identical to the F value of the R function summary.aov if the given explaining variable is the last variable in the R formula. R successively includes one variable after another in the order specified by the formula, and at each step calculates the F value expressing the gain by including the current variable in addition to the previous variables. In contrast, r.gwr calculates the F value expressing the gain by including the current variable in addition to all other variables, not only the previous variables.
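The two per-variable statistics described above can be sketched with the textbook formulas for dropping a single variable from a linear model. This is an illustration only: the function name variable_gain, its arguments, and the choice of n - m - 1 residual degrees of freedom are assumptions and are not taken from the r.gwr source.

```python
def variable_gain(rss_full, rss_reduced, sst, n, m):
    """Gain from including one explaining variable in addition to all others.

    rss_full    : residual sum of squares of the model with all m variables
    rss_reduced : residual sum of squares of the model without the variable
    sst         : total sum of squares of Y
    n           : number of observations (cells)
    m           : number of explaining variables in the full model
    """
    # Additional share of variance explained by this variable
    r2_gain = (rss_reduced - rss_full) / sst
    # Partial F for one dropped variable (1 numerator degree of freedom)
    f_value = (rss_reduced - rss_full) / (rss_full / (n - m - 1))
    return r2_gain, f_value


# Hypothetical numbers, for illustration only
r2_gain, f_value = variable_gain(
    rss_full=20.0, rss_reduced=50.0, sst=100.0, n=103, m=2
)
print(r2_gain, f_value)
```

Because the reduced model contains all other variables, the result does not depend on the order in which the variables are listed.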

Bandwidth

The bandwidth is the crucial parameter for geographically weighted regressions. With a bandwidth that is too small, the result is essentially the weighted average of Y and the predictors are mostly ignored. With a bandwidth that is too large, the results are similar to those of a global regression and spatial non-stationarity cannot be explored.

Adaptive bandwidth

Instead of using a fixed bandwidth (search radius for each cell), an adaptive bandwidth can be used by specifying the number of points to be used for each local regression with the npoints option. The module will find the nearest npoints points for each cell, adapt the bandwidth accordingly and then calculate a local weighted regression.
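The adaptive search can be sketched as follows. The text above does not state how the bandwidth is derived from the nearest points, so this sketch assumes that it is set to the distance of the farthest of the npoints nearest points. The function name adaptive_bandwidth and the (row, col) input layout are hypothetical.

```python
import math


def adaptive_bandwidth(points, center, npoints):
    """Distance from `center` to the farthest of its `npoints` nearest points.

    points  : list of (row, col) positions of non-NULL cells
    center  : (row, col) of the cell being estimated
    npoints : number of points to use for the local regression
    """
    dists = sorted(
        math.hypot(row - center[0], col - center[1]) for row, col in points
    )
    nearest = dists[:npoints]
    # Assumption: the bandwidth just covers the selected points
    return nearest[-1]


# Hypothetical points: dense near column 0, sparse near column 9
points = [(0, 0), (0, 1), (0, 2), (0, 5), (0, 9)]
bw_dense = adaptive_bandwidth(points, (0, 0), npoints=3)
bw_sparse = adaptive_bandwidth(points, (0, 9), npoints=3)
print(bw_dense, bw_sparse)
```

In sparsely populated areas the bandwidth grows so that each local regression is still based on the same number of points.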

Kernel functions

The kernel function has little influence on the result; the bandwidth is much more important. The available kernel functions to calculate weights are

Epanechnikov: w = 1 - d / bw
Bisquare (Quartic): w = (1 - (d / bw)²)²
Tricubic: w = (1 - (d / bw)³)³
Gaussian: w = exp(-0.5 * (d / bw)²)

with
w = weight for current cell
d = distance to the current cell
bw = bandwidth
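The weight formulas above can be written as a small function for experimenting with kernels and bandwidths. This is a sketch and not the module's code: the function name kernel_weight is made up, and setting the weight to zero for d >= bw in the three compact kernels is an assumption based on the usual definition of these kernels.

```python
import math


def kernel_weight(kernel, d, bw):
    """Weight for a cell at distance d, using the formulas listed above."""
    u = d / bw
    if kernel == "gauss":
        # The Gaussian kernel never reaches zero
        return math.exp(-0.5 * u ** 2)
    if u >= 1.0:
        # Assumption: compact kernels give no weight beyond the bandwidth
        return 0.0
    if kernel == "epanechnikov":
        return 1.0 - u
    if kernel == "bisquare":
        return (1.0 - u ** 2) ** 2
    if kernel == "tricubic":
        return (1.0 - u ** 3) ** 3
    raise ValueError("unknown kernel: " + kernel)


# Weights at the center, at half the bandwidth and at the bandwidth
for name in ("gauss", "epanechnikov", "bisquare", "tricubic"):
    print(name, [kernel_weight(name, d, 10.0) for d in (0.0, 5.0, 10.0)])
```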

Masking

A mask map can be provided to restrict the geographically weighted regression to those cells where the mask map is not NULL and not 0 (zero).
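The masking rule can be expressed as a simple predicate. In this sketch NULL is represented by Python's None, and the function name is_processed is hypothetical.

```python
def is_processed(mask_value):
    """True if a cell passes the mask: not NULL (None here) and not zero."""
    return mask_value is not None and mask_value != 0


# Hypothetical row of mask values; None stands for NULL
mask_row = [1, 0, None, 5, -2]
selected = [i for i, value in enumerate(mask_row) if is_processed(value)]
print(selected)
```

Any non-zero value, including negative values, keeps a cell in the analysis.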

REFERENCES

Brunsdon, C., Fotheringham, A.S., and Charlton, M.E., 1996, Geographically Weighted Regression: A Method for Exploring Spatial Nonstationarity, Geographical Analysis, 28(4), 281-298.
Fotheringham, A.S., Brunsdon, C., and Charlton, M.E., 2002, Geographically Weighted Regression: The Analysis of Spatially Varying Relationships, Chichester: Wiley.

AUTHOR

Markus Metz

Last changed: $Date: 2016-11-21 21:59:01 +0100 (Mon, 21 Nov 2016) $

SOURCE CODE

Available at: r.gwr source code (history)