phenopar {sephora}R Documentation

Phenological parameters estimation

Description

Estimation of 6 phenological parameters from a numeric vector. The estimated parameters are: green up, start of season, maturity, senescence, end of season and dormancy. These parameters are critical points of some derivatives of an idealized curve which, in turn, is obtained through a functional principal component analysis (FPCA)-based regression model.

Usage

phenopar(
  x,
  startYear,
  endYear,
  frequency = 23,
  method = c("OLS", "WLS"),
  sigma = NULL,
  numFreq,
  delta = 0,
  distance,
  samples,
  basis,
  corr = NULL,
  k,
  trace = FALSE
)

Arguments

x

a numeric vector.

startYear

integer, time series initial year

endYear

integer, time series final year

frequency

integer giving number of observations per season. Default, 23.

method

character. Should OLS or WLS be used for smoothing x through a harmonic regression model. See Details.

sigma

numeric vector of length equal to frequency. Each entry gives the standard deviation of observations acquired at same day of the year. Pertinent when method=WLS only.

numFreq

integer specifying number of frequencies used in harmonic regression model.

delta

numeric. Default, 0. When harmonic regression problem is ill-posed, this parameter allows a simple regularization. See Details.

distance

character indicating what distance to use in hierarchical clustering. All distances in tsclust are allowed. See Details.

samples

integer with number of samples to draw from smoothed version of x. Used exclusively in Functional Principal Components Analysis (FPCA)-based regression. See Details.

basis

list giving numeric basis used in FPCA-based regression. See Details.

corr

Default NULL. Object defining correlation structure, can be numeric vector, matrix or function.

k

integer, number of principal components used in FPCA-based regression.

trace

logical. If TRUE, progress on the hierarchical clustering is printed on console. Default, FALSE.

Details

In order to estimate the phenological parameters, first x is assembled as a matrix. This matrix has as many rows as years (length(startYear:endYear)) in the studied period and as many columns as observations (frequency) per year. Then, each vector row is smoothed through the harmonic regression model haRmonics. This function allows for homogeneous (OLS) and heterogeneous (WLS) errors in the model. When method=WLS, sigma must be provided, hetervar is recommended for such a purpose. Additional parameters for haRmonics are numFreq and delta.

Next, equally spaced samples are drawn from each harmonic regression fit, the resulting observations are stored in the matrix m_aug_smooth. tsclust is applied to m_aug_smooth in order to obtain clusters of years sharing similar characteristics; 2 clusters are produced. The next step is applied to the dominating cluster (the one with the majority of years, >=10), or to the whole of columns of m_aug_smooth when no dominating cluster can be determined.

Based on the observations produced in the hierarchical clustering step, a regression model with the following representation is applied:

f_i(t) = \tau(t) + \sum_{j=1}^{k} \varepsilon_j(t) \nu_{ij} + \epsilon_i,

where f_i(t) is substituted by the vector of sample observations of the i-th year; \varepsilon_j(t) is the j-th functional principal component (FPC); \nu_{ij} is the score associated with the j-th FPC and the i-th vector of sampled observations; and \epsilon_i is a normally distributed random variable with variance \sigma^2, see Krivobokova et al. (2022) for further details. From this step, an estimate of \tau is produced -fpca- this is an idealized version of the original observations contained in x.

Parameter basis can be supplied through a call to drbasis with parameters nn=samples and qq=2. Parameter corr indicates whether correlation between annual curves must be considered; the current implementation does not incorporate correlation. The number of principal components is controlled by k.

Next, a harmonic regression is fitted to fpca (a numeric vector of length equal to samples) with the parameters provided above (method, sigma, numFreq, delta). Based on the estimated parameters of this fit (fpca_harmfit_params) a R function is calculated along with its first, second, third and fourth derivatives. These derivatives are used in establishing the phenological parameters (phenoparams) utilizing basic calculus criteria similar to what Baumann et al. (2017) have proposed.

Finally, when 6 phenoparams are found status=Success, otherwise status=Partial.

Value

A sephora-class object containing 14 elements

x

numeric vector

startYear

integer, time series initial year

endYear

integer, time series final year

freq

numeric giving number of observations per season. Default is 23.

sigma

when method="OLS", numeric of length one (standard deviation); when method="WLS", numeric vector of length equal to freq

m_aug_smooth

matrix with nrow=samples and ncol=(length(x)/freq) containing sampled observations

clustering

Formal class HierarchicalTSClusters with 20 slots. Output from a call to tsclust with parameters series=m_aug_smooth, type='h', distance=distance

fpca

numeric vector of length equal to samples

fpca_harmfit_params

list of 4: a.coef, b.coef, amplitude and phase as in haRmonics output.

fpca_fun_0der

function, harmonic fit for x

fpca_fun_1der

function, first derivative of harmonic fit for x

fpca_fun_2der

function, second derivative of harmonic fit for x

fpca_fun_3der

function, third derivative of harmonic fit for x

fpca_fun_4der

function, fourth derivative of harmonic fit for x

phenoparams

named numeric vector of length 6

status

character, specifying whether FPCA model was inverted successfully (Success) or partially ("Partial"). In other words, Success and Partial mean that 6 or less than 6 parameters were estimated, respectively.

References

Krivobokova, T. and Serra, P. and Rosales, F. and Klockmann, K. (2022). Joint non-parametric estimation of mean and auto-covariances for Gaussian processes. Computational Statistics & Data Analysis, 173, 107519.

Baumann, M. and Ozdogan, M. and Richardson, A. and Radeloff, V. (2017). Phenology from Landsat when data is scarce: Using MODIS and Dynamic Time-Warping to combine multi-year Landsat imagery to derive annual phenology curves. International Journal of Applied Earth Observation and Geoinformation, 54, 72–83

See Also

haRmonics, hetervar, tsclust, drbasis.

Examples

# --- Load dataset for testing
data("deciduous_polygon")

# --- Extracting first pixel of deciduous_polygon
pixel_deciduous <- vecFromData(data=deciduous_polygon, numRow=3)

# --- Following objects are used in this example
# --- for CRAN testing purposes only. In real life examples
# --- there is no need to shorten time series length

EndYear <- 2010
number_observations <- 23*11

# --- needed parameter
BASIS <- drbasis(n=50, q=2) 

# --- testing phenopar
sephora_deciduous <- phenopar(x=pixel_deciduous$vec[1:number_observations],
                              startYear=2000, endYear=EndYear,
                              numFreq=3, distance="dtw2",
                              samples=50, basis=BASIS, k=3)
                              
# --- testing ndvi_derivatives
f <- ndvi_derivatives(amp = sephora_deciduous$fpca_harmfit_params$amplitude,
                      pha = sephora_deciduous$fpca_harmfit_params$phase,
                      degree = 0, L = 365)
fprime <- ndvi_derivatives(amp = sephora_deciduous$fpca_harmfit_params$amplitude,
                           pha = sephora_deciduous$fpca_harmfit_params$phase,
                           degree = 1, L = 365)
fbiprime <- ndvi_derivatives(amp = sephora_deciduous$fpca_harmfit_params$amplitude,
                             pha = sephora_deciduous$fpca_harmfit_params$phase,
                             degree = 2, L = 365)
f3prime <- ndvi_derivatives(amp = sephora_deciduous$fpca_harmfit_params$amplitude,
                            pha = sephora_deciduous$fpca_harmfit_params$phase,
                            degree = 3, L = 365)
f4prime <- ndvi_derivatives(amp = sephora_deciduous$fpca_harmfit_params$amplitude,
                            pha = sephora_deciduous$fpca_harmfit_params$phase,
                            degree = 4, L = 365)
                            
# --- testing global_min_max and local_min_max
intervalo <- seq(1,365, length=365)
GU_Mat <- global_min_max(f=fbiprime, f1der=f3prime, f2der=f4prime, D=intervalo)
Sen <- local_min_max(f=fbiprime, f1der=f3prime, f2der=f4prime, 
                     what="min", x0=GU_Mat$min, D=intervalo)
SoS_EoS <- global_min_max(f=fprime, f1der=fbiprime, f2der=f3prime, D=intervalo)
Dor <- local_min_max(f=fbiprime, f1der=f3prime, f2der=f4prime, 
                     what="max", x0=GU_Mat$max, D=intervalo)
                      
# --- phenological dates (rough estimates)
c(GU=GU_Mat$max, SoS=SoS_EoS$max, Mat=GU_Mat$min,
  Sen=Sen$x_opt, EoS=SoS_EoS$min, Dor=Dor$x_opt)
# --- phenological dates provided by sephora
sephora_deciduous$phenoparams

# --- testing plotting methods
plot(x=sephora_deciduous, yLab="NDVI (no rescaled)")
plot(x=sephora_deciduous, type="profiles", 
     xLab="DoY", yLab="NDVI (no rescaled)")
     
# --- 2015 forms Cluster 2
plot(x=sephora_deciduous, type="ms")      

# --- graphical definition of phenological dates
plot(x=sephora_deciduous, type="derivatives")

# --- Overlapping FPCA fit to original time series
gg <- plot(x=sephora_deciduous, type="profiles", 
           xLab="DoY", yLab="NDVI (no rescaled)")
x_axis <- get_metadata_years(x=pixel_deciduous$vec, 
                             startYear=2000, endYear=EndYear, frequency=23)  
DoY <- seq(1,365, by=16)
fpca_DoY <- sephora_deciduous$fpca_fun_0der(t=DoY)
COLORS <- unique( ggplot_build(gg)$data[1][[1]]$colour )
df <- data.frame(values=c(sephora_deciduous$x, fpca_DoY),
                 years=as.factor(rep(c(x_axis$xLabels,"FPCA"), each=23)),
                 DoY=factor(DoY, levels=DoY), class=c(rep(1,number_observations), rep(2,23))) 
gg_fpca <- ggplot(data=df, 
                  aes(x=DoY, y=values, group=years, colour=years)) +
ggplot2::geom_line(linewidth = c(rep(1,number_observations), rep(4,23))) + 
ggplot2::labs(y="NDVI", x="DoY", color="years+FPCA") + 
ggplot2::scale_color_manual(values = c(COLORS, "#FF4500")) +
ggplot2::theme(legend.position = "right")
gg_fpca


[Package sephora version 0.1.31 Index]