
splitSplines {growthPheno}

R Documentation

Adds the fits, and optionally growth rates computed from derivatives, after fitting splines to a response for an individual stored in a `data.frame` in long format

Description

Uses fitSpline to fit a spline to a subset of the values of response and stores the fitted values in data. The subsets are those values with the same combinations of the levels of the factors listed in individuals. The degree of smoothing is controlled by the tuning parameters df and lambda, related to the penalty, and by npspline.segments. The smoothing.method provides for direct and logarithmic smoothing.
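The per-individual fitting described above can be sketched in base R. This is an illustration only and not the package's implementation: the data, the column names `id`, `x` and `y`, and the helper `fit_one` are hypothetical, and `stats::smooth.spline` stands in for `fitSpline`.

```r
# Hypothetical long-format data: two individuals, eight time points each.
set.seed(1)
dat <- data.frame(
  id = factor(rep(c("plant1", "plant2"), each = 8)),
  x  = rep(1:8, times = 2)
)
dat$y <- exp(0.3 * dat$x) + rnorm(nrow(dat), sd = 0.1)

# Fit a natural cubic smoothing spline to one individual's rows and
# return the fitted values at that individual's x-values.
# (fit_one is a hypothetical helper, not a growthPheno function.)
fit_one <- function(d, df = 4, method = c("direct", "logarithmic")) {
  method <- match.arg(method)
  y <- if (method == "logarithmic") log(d$y) else d$y
  fit <- smooth.spline(d$x, y, df = df)
  fitted <- predict(fit, x = d$x)$y
  # Logarithmic smoothing back-transforms the fit with exp().
  if (method == "logarithmic") exp(fitted) else fitted
}

# split() forms one subset per individual; unsplit() restores the row order.
pieces <- split(dat, dat$id)
dat$y.smooth <- unsplit(lapply(pieces, fit_one, df = 4), dat$id)
```

The new column `y.smooth` mirrors the default naming rule in which `.smooth` is added to the response name.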

The derivatives of the fitted spline can also be obtained, and the Absolute and Relative Growth Rates (AGR and RGR) computed using them, provided correctBoundaries is FALSE. Otherwise, growth rates can be obtained by difference using splitContGRdiff.
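As a rough illustration of how growth rates follow from the derivatives, the sketch below uses `stats::smooth.spline` and its `predict` method on hypothetical data for a single individual. It assumes the relationships stated under `extra.rate` (RGR = AGR / smoothed response for direct smoothing, AGR = RGR * smoothed response for logarithmic smoothing) and is not the package's own code.

```r
# Hypothetical observations for a single individual.
x <- 1:10
y <- c(2.1, 2.9, 4.2, 5.8, 8.1, 11.0, 15.2, 20.9, 28.5, 39.0)

# Direct smoothing: the first derivative of the fit is the AGR,
# and the RGR is the AGR divided by the smoothed response.
fit.dir    <- smooth.spline(x, y, df = 5)
smooth.dir <- predict(fit.dir, x)$y
AGR.dir    <- predict(fit.dir, x, deriv = 1)$y
RGR.dir    <- AGR.dir / smooth.dir

# Logarithmic smoothing: the first derivative of the fit to log(y) is the RGR,
# and the AGR is the RGR times the back-transformed smooth.
fit.log    <- smooth.spline(x, log(y), df = 5)
smooth.log <- exp(predict(fit.log, x)$y)
RGR.log    <- predict(fit.log, x, deriv = 1)$y
AGR.log    <- RGR.log * smooth.log
```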

The handling of missing values in the observations is controlled via na.x.action and na.y.action. If there are not at least four distinct, nonmissing x-values, a warning is issued and all smoothed values and derivatives are set to NA.
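The four-value rule can be pictured with a small guard around `stats::smooth.spline`. This is a sketch under assumptions: `safe_smooth` is a hypothetical helper, it counts only x-values paired with a nonmissing response, and it keeps missing positions in the output in the manner of `exclude`.

```r
# Return smoothed values, or all NA with a warning when too few x-values remain.
# (safe_smooth is a hypothetical helper, not a growthPheno function.)
safe_smooth <- function(x, y, df = 4) {
  ok  <- !is.na(x) & !is.na(y)       # rows usable for fitting
  out <- rep(NA_real_, length(x))    # same length as the input
  if (length(unique(x[ok])) < 4) {
    warning("fewer than 4 distinct nonmissing x-values; returning NA")
    return(out)
  }
  fit <- smooth.spline(x[ok], y[ok], df = df)
  out[ok] <- predict(fit, x[ok])$y   # missing rows stay NA
  out
}

# Only two distinct nonmissing x-values: a warning is issued and all values are NA.
too.few <- suppressWarnings(safe_smooth(c(1, 2, 2, NA), c(1, 2, 3, 4)))
```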

The function probeSmoothing can be used to investigate the effect of the smoothing parameters (smoothing.method, df or lambda) on the smooth that results.

Note: this function is soft deprecated and may be removed in future versions.
Use byIndv4Times_SplinesGRs.

Usage

splitSplines(data, response, response.smoothed = NULL, x, 
             individuals = "Snapshot.ID.Tag", INDICES = NULL,
             smoothing.method = "direct", smoothing.segments = NULL, 
             spline.type = "NCSS", df=NULL, lambda = NULL, 
             npspline.segments = NULL, 
             correctBoundaries = FALSE, 
             deriv = NULL, suffices.deriv = NULL, extra.rate = NULL, 
             sep = ".", 
             na.x.action="exclude", na.y.action = "exclude", ...)

Arguments

`data`	A `data.frame` containing the column to be smoothed.
`response`	A `character` giving the name of the column in `data` that is to be smoothed.
`response.smoothed`	A `character` specifying the name of the column containing the values of the smoothed response variable, corresponding to `response`. If `response.smoothed` is `NULL`, then `response.smoothed` is set to the `response` to which `.smooth` is added.
`x`	A `character` giving the name of the column in `data` that contains the values of the predictor variable.
`individuals`	A `character` giving the name(s) of the `factor`(s) that define the subsets of `response` that correspond to the `response` values for an individual (e.g. plant, pot, cart, plot or unit) that are to be smoothed separately. If the columns corresponding to `individuals` are not `factor`(s) then they will be coerced to `factor`(s). The subsets are formed using `split`.
`INDICES`	A pseudonym for `individuals`.
`smoothing.method`	A `character` giving the smoothing method to use. The two possibilities are (i) `"direct"`, for directly smoothing the observed `response`, and (ii) `"logarithmic"`, for smoothing the `log`-transformed `response` and then back-transforming by taking the exponential of the fitted values.
`smoothing.segments`	A named `list`, each of whose components is a numeric pair specifying the first and last values of an `x`-interval whose data is to be subjected as an entity to smoothing using splines. The separate smooths will be combined to form a whole smooth for each individual. If `smoothing.segments` is `NULL`, the data is not segmented for smoothing.
`spline.type`	A `character` giving the type of spline to use. Currently, the possibilities are (i) `"NCSS"`, for natural cubic smoothing splines, and (ii) `"PS"`, for P-splines.
`df`	A `numeric` specifying, for natural cubic smoothing splines (`NCSS`), the desired equivalent number of degrees of freedom of the smooth (trace of the smoother matrix). Lower values result in more smoothing. If `df = NULL`, the amount of smoothing can be controlled by setting `lambda`. If both `df` and `lambda` are `NULL`, smoothing is controlled by the default arguments for `smooth.spline`, and any that you supply via the ellipsis (...) argument.
`lambda`	A `numeric` specifying the positive penalty to apply. The amount of smoothing decreases as `lambda` decreases.
`npspline.segments`	A `numeric` specifying, for P-splines (`PS`), the number of equally spaced segments between `min(x)` and `max(x)`, excluding missing values, to use in constructing the B-spline basis for the spline fitting. If `npspline.segments` is NULL, `npspline.segments` is set to the maximum of 10 and `ceiling((nrow(data)-1)/2)` i.e. there will be at least 10 segments and, for more than 22 `x` values, there will be half as many segments as there are `x` values. The amount of smoothing decreases as `npspline.segments` increases. When the data has been segmented for smoothing (`smoothing.segments` is not `NULL`), an `npspline.segments` value can be supplied for each segment.
`correctBoundaries`	A `logical` indicating whether the fitted spline values are to have the method of Huang (2001) applied to them to correct for estimation bias at the end-points. Note that `spline.type` must be `NCSS` and `lambda` and `deriv` must be `NULL` for `correctBoundaries` to be set to `TRUE`.
`deriv`	A `numeric` specifying one or more orders of derivatives that are required.
`suffices.deriv`	A `character` giving the characters to be appended to the names of the derivatives. If `NULL` and the derivative is to be retained then `smooth.dv` is appended.
`extra.rate`	A named `character` nominating a single growth rate (`AGR` or `RGR`) to be computed using the first derivative, which one being dependent on the `smoothing.method`. The name of this element will be used as a suffix to be appended to the response when naming the resulting growth rate (see Examples). If unnamed, `AGR` or `RGR` will be used, as appropriate. Note that, for the `smoothing.method` set to `direct`, the first derivative is the AGR and so `extra.rate` must be set to `RGR`, which is computed as the `AGR / smoothed response`. For the `smoothing.method` set to `logarithmic`, the first derivative is the RGR and so `extra.rate` must be set to `AGR`, which is computed as the `RGR * smoothed response`. Make sure that `deriv` includes one so that the first derivative is available for calculating the `extra.rate`.
`sep`	A `character` giving the separator to use when the levels of `individuals` are combined. This is needed to avoid using a `character` that occurs in a `factor` to delimit levels when the levels of `individuals` are combined to identify subsets.
`na.x.action`	A `character` string that specifies the action to be taken when values of `x` are `NA`. The possible values are `fail`, `exclude` or `omit`. For `exclude` and `omit`, predictions and derivatives will only be obtained for nonmissing values of `x`. The difference between these two codes is that for `exclude` the returned `data.frame` will have as many rows as `data`, the missing values having been incorporated.
`na.y.action`	A `character` string that specifies the action to be taken when values of `y`, or the `response`, are `NA`. The possible values are `fail`, `exclude`, `omit`, `allx`, `trimx`, `ltrimx` or `rtrimx`. For all options, except `fail`, missing values in `y` will be removed before smoothing. For `exclude` and `omit`, predictions and derivatives will be obtained only for nonmissing values of `x` that do not have missing `y` values. Again, the difference between these two is that only for `exclude` will the missing values be incorporated into the returned `data.frame`. For `allx`, predictions and derivatives will be obtained for all nonmissing `x`. For `trimx`, they will be obtained for all nonmissing `x` between the first and last nonmissing `y` values that have been ordered for `x`; for `ltrimx` and `rtrimx` either the lower or upper missing `y` values, respectively, are trimmed.
`...`	allows for arguments to be passed to `smooth.spline`.
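The default for `npspline.segments` described above is simple arithmetic. It is worked through here with a hypothetical helper, `default_segments`, whose argument `n` stands in for `nrow(data)`; the formula is taken from the argument description, and the helper itself is not part of growthPheno.

```r
# Default number of P-spline segments: the maximum of 10 and ceiling((n - 1) / 2).
# (default_segments is a hypothetical helper for illustration.)
default_segments <- function(n) max(10, ceiling((n - 1) / 2))

segs <- sapply(c(8, 21, 22, 41), default_segments)
segs
# 10 10 11 20
```

For small data sets the floor of 10 segments applies; once `ceiling((n - 1) / 2)` exceeds 10, the number of segments grows to roughly half the number of rows.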

Value

A data.frame containing data to which has been added a column with the fitted smooth, the name of the column being response.smoothed. If deriv is not NULL, columns containing the values of the derivative(s) will be added to data; the name of each of these columns will be the value of response.smoothed with .dvf appended, where f is the order of the derivative, or the value of response.smoothed with the corresponding element of suffices.deriv appended. If an RGR is requested via extra.rate, it is calculated as the ratio of the value of the first derivative of the fitted spline to the fitted value for the spline. Any pre-existing smoothed and derivative columns in data will be replaced. The ordering of the data.frame for the x values will be preserved as far as is possible; the main difficulty is with the handling of missing values by the function merge. Thus, if missing values in x are retained, they will occur at the bottom of each subset of individuals and the order will be problematic when there are missing values in y and na.y.action is set to omit.

Author(s)

Chris Brien

References

Eilers, P.H.C. and Marx, B.D. (2021) Practical smoothing: the joys of P-splines. Cambridge University Press, Cambridge.

Huang, C. (2001) Boundary corrected cubic smoothing splines. Journal of Statistical Computation and Simulation, 70, 107-121.

Examples

data(exampleData)
#smoothing with growth rates calculated using derivatives
longi.dat <- splitSplines(longi.dat, response="PSA", x="xDAP", 
                          individuals = "Snapshot.ID.Tag", 
                          df = 4, deriv=1, suffices.deriv="AGRdv", 
                          extra.rate = c(RGRdv = "RGR"))
#Use P-splines
longi.dat <- splitSplines(longi.dat, response="PSA", x="xDAP", 
                          individuals = "Snapshot.ID.Tag", 
                          spline.type = "PS", lambda = 0.1, npspline.segments = 10, 
                          deriv=1, suffices.deriv="AGRdv", 
                          extra.rate = c(RGRdv = "RGR"))
#with segmented smoothing
longi.dat <- splitSplines(longi.dat, response="PSA", x="xDAP", 
                          individuals = "Snapshot.ID.Tag", 
                          smoothing.segments = list(c(28,34), c(35,42)), df = 5)

[Package growthPheno version 2.1.25 Index]
