fitGLS {remotePARTS}R Documentation

Fit a PARTS GLS model.

Description

Fit a PARTS GLS model.

Usage

fitGLS(
  formula,
  data,
  V,
  nugget = 0,
  formula0 = NULL,
  save.xx = FALSE,
  save.invchol = FALSE,
  logLik.only = FALSE,
  no.F = FALSE,
  coords,
  distm_FUN,
  covar_FUN,
  covar.pars,
  invCholV,
  ncores = NA,
  suppress_compare_warning = FALSE,
  ...
)

Arguments

formula

a model formula

data

an optional data frame environment in which to search for variables given by formula

V

a covariance matrix, which must be positive definitive. This argument is optional if coords, distm_FUN, covar_FUN, and covar.pars are given instead.

nugget

an optional numeric nugget, must be positive

formula0

an optional formula for the null model to be compared with formula by an F-test

save.xx

logical: should information needed for cross-partition comparisons be returned?

save.invchol

logical: should the inverse of the Cholesky matrix be returned?

logLik.only

logical: should calculations stop after calculating parital log-likelihood?

no.F

logical: should F-test calculations be made?

coords

optional coordinate matrix for calculating V internally

distm_FUN

optional function for calculating a distance matrix from coords, when calculating V internally

covar_FUN

optional distance-based covariance function for calculating V internally

covar.pars

an optional named list of parameters passed to covar_FUN when calculating V internally

invCholV

optional pre-calculated inverse cholesky matrix to use in place of V

ncores

an optional integer indicating how many CPU threads to use for matrix calculations.

suppress_compare_warning

an optional variable to suppress warning that arises from identical formula and formula0.

...

additional arguments passed to optimize_nugget, which are only used if if nugget = NA

Details

conduct generalized least-squares regression of spatiotemporal trends

fitGLS fits a GLS model, using terms specified in formula. In the PARTS method, generally the left side of formula should be pixel-level trend estimates and the right side should be spatial predictors. The errors of the GLS are correlated according to covariance matrix V.

If nugget = NA, an ML nugget is estimated from the data using the optimize_nugget() function. Arguments additional arguments (...) are passed to optimize_nugget in this case. V must be provided for nugget optimization.

If formula0 is not specified, the default is to fit an intercept-only null model.

save.xx is included to allow for manually conducting a partitioned GLS analyses. Because most users will not need this feature, opting instead to use fitGLS_parition(), save.xx = FALSE by default.

Similarly, save.invchol is included to allow for recycling of the inverse cholesky matrix. Often, inverting the large cholesky matrix (i.e., invert_chol(V)) is the slowest part of GLS. This argument exists to allow users to recycle this process, though no remotePARTS function currently exists that can use invert_chol(V) to fit the GLS.

logLik.only = TRUE will return only the partial log-likelihood, which can minimized to obtain the maximum likelihood for a given set of data.

If no.F = TRUE, then the model given by formula is not compared to the model given by formula0.

If V is not provided, it can be fit internally by specifying all of coords, distm_FUN, covar_FUN, and covar.pars. The function given by distm_FUN will calculate a distance matrix from coords, which is then transformed into a distance-based covariance matrix with covar_FUN and parameters given by covar.pars.

This function uses C++ code that uses the Eigen matrix library (RcppEigen package) to fit models as efficiently as possible. As such, all available CPU cores are used for matrix calculations on systems with OpenMP support.

ncores is passed to the C++ code Eigen::setNpThreads() which sets the number of cores used for compatible Eigen matrix operations (when OpenMP is used).

Value

fitGLS returns a list object of class "remoteGLS", if logLik.only = FALSE. The list contains at least the following elements:

coefficients

coefficient estimates for predictor variables

SSE

sum of squares error

MSE

mean squared error

SE

standard errors

df_t

degrees of freedom for the t-test

logDetV

log-determinant of V

tstat

t-test statistic

pval_t

p-value of the t-statistic

logLik

the Log-likelihood of the model

nugget

the nugget used in fitting

covar_coef

the covariance matrix of the coefficients

If no.F = FALSE, the following elements, corresponding to the null model and F-test are also calculated:

coefficients0

coefficient estimates for the null model

SSE0

sum of squares error for the null model

MSE0

mean squared error for the null model

SE0

the standard errors for null coefficients

MSR

the regression mean square

df0

the null model F-test degrees of freedom

LL0

the log-likelihood of the null model

df_F

the F-test degrees of freedom, for the main model

Fstat

the F-statistic

pval_F

the F-test p-value

formula

the alternate formula used

formula0

the null formula used

An attribute called also set to "no.F" is set to the value of argument no.F, which signals to generic methods how to handle the output.

If save.invchol = TRUE, output also includes

invcholV

the inverse of the Cholesky decomposition of the covariance matrix obtained with invert_chol(V, nugget)

If save.xx = TRUE, output also includes the following elements

xx

the predictor variables X, from the right side of formula, transformed by the inverse cholesky matrix: xx = invcholV %*% X

xx0

the predictor variables X0, from the right side of formula0, transformed by the inverse cholesky matrix: xx0 = invcholV %*% X0

The primary use of xx and xx0 is for use with fitGLS_partition().

If logLik.only = TRUE, a single numeric output containing the log-likelihood is returned.

Examples


## read data
data(ndvi_AK10000)
df = ndvi_AK10000[seq_len(200), ] # first 200 rows

## fit covariance matrix
V = covar_exp(distm_scaled(cbind(df$lng, df$lat)), range = .01)

## run GLS
(GLS = fitGLS(CLS_coef ~ 0 + land, data = df, V = V))

## with F-test calculations to compare with the NULL model
(GLS.F = fitGLS(CLS_coef ~ 0 + land, data = df, V = V, no.F = FALSE))

## find ML nugget
fitGLS(CLS_coef ~ 0 + land, data = df, V = V, no.F = FALSE, nugget = NA)

## calculate V internally
coords = cbind(df$lng, df$lat)
fitGLS(CLS_coef ~ 0 + land, data = df, logLik.only = FALSE, coords = coords,
       distm_FUN = "distm_scaled", covar_FUN = "covar_exp", covar.pars = list(range = .01))

## use inverse cholesky
fitGLS(CLS_coef ~ 0 + land, data = df, invCholV = invert_chol(V))

## save inverse cholesky matrix
invchol = fitGLS(CLS_coef ~ 0 + land, data = df, V = V, save.invchol = TRUE)$invcholV

## re-use inverse cholesky instead of V
fitGLS(CLS_coef ~ 0 + land, data = df, invCholV = invchol)

## Log-likelihood (fast)
fitGLS(CLS_coef ~ 0 + land, data = df, V = V, logLik.only = TRUE)


[Package remotePARTS version 1.0.4 Index]