splitSplines {growthPheno} | R Documentation |
Adds the fits, and optionally growth rates computed from derivatives, after
fitting splines to a response for an individual stored in a data.frame
in long format
Description
Uses fitSpline to fit a spline to a subset of the values of response
and stores the fitted values in data. The subsets are those values with
the same combination of levels of the factors listed in individuals.
The degree of smoothing is controlled by the tuning parameters df and
lambda, related to the penalty, and by npspline.segments. The
smoothing.method provides for direct and logarithmic smoothing.
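The split-then-smooth idea described above can be sketched in base R. This is only an illustration under stated assumptions, not the code of splitSplines or fitSpline: the data, the column names (id, x, y, y.smooth) and the helper smooth_one are all made up, and only smooth.spline from the stats package is used.

```r
# Toy long-format data: two individuals measured on six days (made-up values)
dat <- data.frame(
  id = factor(rep(c("A", "B"), each = 6)),
  x  = rep(1:6, times = 2),
  y  = c(1.0, 1.8, 3.1, 4.9, 7.2, 9.8,
         0.8, 1.5, 2.4, 3.9, 5.5, 8.1)
)

# Hypothetical helper: smooth y against x for one individual.
# "direct" smooths y itself; "logarithmic" smooths log(y) and
# back-transforms the fitted values with exp().
smooth_one <- function(d, df = 4, method = c("direct", "logarithmic")) {
  method <- match.arg(method)
  yy <- if (method == "logarithmic") log(d$y) else d$y
  fit <- smooth.spline(d$x, yy, df = df)
  fitted <- predict(fit, d$x)$y
  d$y.smooth <- if (method == "logarithmic") exp(fitted) else fitted
  d
}

# Split by individual, smooth each subset separately, then recombine
pieces <- lapply(split(dat, dat$id), smooth_one, df = 4)
smoothed <- do.call(rbind, pieces)
rownames(smoothed) <- NULL
```

Passing method = "logarithmic" to the hypothetical smooth_one gives the log-scale variant for the same subsets.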
The derivatives of the fitted spline can also be obtained, and the
Absolute and Relative Growth Rates (AGR and RGR) computed using them,
provided correctBoundaries is FALSE. Otherwise, growth rates can be
obtained by difference using splitContGRdiff.
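As a rough base-R sketch of how growth rates follow from the first derivative of a fitted spline: with direct smoothing the derivative is the AGR and the RGR is the AGR divided by the smoothed response; with logarithmic smoothing the derivative is the RGR and the AGR is the RGR multiplied by the back-transformed smooth. The data and object names below are invented for illustration, and the code calls smooth.spline directly rather than reproducing what splitSplines does internally.

```r
# Made-up, noise-free exponential growth: log(y) is linear in x with slope 0.3
x <- 1:10
y <- 2 * exp(0.3 * x)

# Direct smoothing: the first derivative of the smooth is the AGR
fit <- smooth.spline(x, y, df = 5)
smoothed <- predict(fit, x)$y
agr <- predict(fit, x, deriv = 1)$y
rgr <- agr / smoothed            # RGR = AGR / smoothed response

# Logarithmic smoothing: the first derivative of the smooth of log(y) is the RGR
fit_log <- smooth.spline(x, log(y), df = 5)
smoothed_log <- exp(predict(fit_log, x)$y)
rgr_log <- predict(fit_log, x, deriv = 1)$y
agr_log <- rgr_log * smoothed_log  # AGR = RGR * smoothed response

# For this exactly log-linear toy response, rgr_log should sit close to 0.3
```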
The handling of missing values in the observations is controlled via
na.x.action and na.y.action. If there are not at least four distinct,
nonmissing x-values, a warning is issued and all smoothed values and
derivatives are set to NA.
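The four-distinct-values rule can be mimicked with a small guard around smooth.spline. This is a sketch under assumptions, not the package's implementation: smooth_or_na is a hypothetical helper, and dropping rows whose y is missing before counting x-values is an assumption made here for simplicity.

```r
# Hypothetical helper: return NA for every observation, with a warning,
# unless there are at least four distinct, nonmissing x-values to fit.
smooth_or_na <- function(x, y, df) {
  ok <- !is.na(x) & !is.na(y)   # assumption: rows with missing y are dropped too
  if (length(unique(x[ok])) < 4) {
    warning("Fewer than four distinct, nonmissing x-values; returning NA")
    return(rep(NA_real_, length(x)))
  }
  fit <- smooth.spline(x[ok], y[ok], df = df)
  out <- rep(NA_real_, length(x))
  out[ok] <- predict(fit, x[ok])$y
  out
}

# Only two distinct nonmissing x-values: everything is set to NA
too_few <- suppressWarnings(smooth_or_na(c(1, 2, 2, NA), c(3, 4, 5, 6), df = 3))

# Five distinct nonmissing x-values: smoothed values, with NA where x is missing
enough <- smooth_or_na(c(1, 2, 3, 4, 5, NA),
                       c(1.1, 2.3, 2.9, 4.2, 5.1, 6.0), df = 3)
```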
The function probeSmoothing can be used to investigate the effect of
the smoothing parameters (smoothing.method, df or lambda) on the smooth
that results.
Note: this function is soft deprecated and may be removed in
future versions. Use byIndv4Times_SplinesGRs instead.
Usage
splitSplines(data, response, response.smoothed = NULL, x,
individuals = "Snapshot.ID.Tag", INDICES = NULL,
smoothing.method = "direct", smoothing.segments = NULL,
spline.type = "NCSS", df=NULL, lambda = NULL,
npspline.segments = NULL,
correctBoundaries = FALSE,
deriv = NULL, suffices.deriv = NULL, extra.rate = NULL,
sep = ".",
na.x.action="exclude", na.y.action = "exclude", ...)
Arguments
data |
A data.frame containing the column to be smoothed.
|
response |
A character giving the name of the column in
data that is to be smoothed.
|
response.smoothed |
A character specifying the name of the column
containing the values of the smoothed response variable, corresponding
to response. If response.smoothed is NULL, then
response.smoothed is set to the response to which
.smooth is added.
|
x |
A character giving the name of the column in
data that contains the values of the predictor variable.
|
individuals |
A character giving the name(s) of the
factor(s) that define the subsets of response
that correspond to the response values for an individual
(e.g. plant, pot, cart, plot or unit) that are to be smoothed
separately. If the columns corresponding to individuals
are not factor(s) then they will be coerced to
factor(s). The subsets are formed
using split.
|
INDICES |
A pseudonym for individuals .
|
smoothing.method |
A character giving the smoothing method
to use. The two possibilities are (i) "direct", for directly
smoothing the observed response, and (ii) "logarithmic", for
smoothing the log-transformed response and then
back-transforming by taking the exponential of the fitted values.
|
smoothing.segments |
A named list, each of whose components
is a numeric pair specifying the first and last values of an
x-interval whose data is to be subjected as an entity to smoothing
using splines. The separate smooths will be combined to form a whole
smooth for each individual. If smoothing.segments is NULL,
the data is not segmented for smoothing.
|
spline.type |
A character giving the type of spline
to use. Currently, the possibilities are (i) "NCSS", for natural
cubic smoothing splines, and (ii) "PS", for P-splines.
|
df |
A numeric specifying, for natural cubic smoothing splines
(NCSS), the desired equivalent number of degrees of freedom of the
smooth (trace of the smoother matrix). Lower values result in more smoothing.
If df = NULL, the amount of smoothing can be controlled by setting
lambda. If both df and lambda are NULL, smoothing
is controlled by the default arguments for smooth.spline, and any
that you supply via the ellipsis (...) argument.
|
lambda |
A numeric specifying the positive penalty to apply.
The amount of smoothing decreases as lambda decreases.
|
npspline.segments |
A numeric specifying, for P-splines (PS),
the number of equally spaced segments between min(x) and max(x),
excluding missing values, to use in constructing the B-spline basis for the
spline fitting. If npspline.segments is NULL, npspline.segments
is set to the maximum of 10 and ceiling((nrow(data)-1)/2), i.e. there will
be at least 10 segments and, for more than 22 x values, there will be
half as many segments as there are x values. The amount of smoothing
decreases as npspline.segments increases. When the data has been
segmented for smoothing (smoothing.segments is not NULL),
an npspline.segments value can be supplied for each segment.
|
correctBoundaries |
A logical indicating whether the fitted spline
values are to have the method of Huang (2001) applied
to them to correct for estimation bias at the end-points. Note that
spline.type must be NCSS and lambda and deriv
must be NULL for correctBoundaries to be set to TRUE.
|
deriv |
A numeric specifying one or more orders of derivatives
that are required.
|
suffices.deriv |
A character giving the characters to be
appended to the names of the derivatives. If NULL
and the derivative is to be retained then smooth.dv
is appended.
|
extra.rate |
A named character nominating a single growth
rate (AGR or RGR) to be computed using the first
derivative, which one being dependent on the smoothing.method.
The name of this element will be used as a suffix to be appended to
the response when naming the resulting growth rate (see Examples).
If unnamed, AGR or RGR will be used, as appropriate.
Note that, for the smoothing.method set to direct,
the first derivative is the AGR and so extra.rate must be set
to RGR, which is computed as the AGR / smoothed response.
For the smoothing.method set to logarithmic,
the first derivative is the RGR and so extra.rate must be set
to AGR, which is computed as the RGR * smoothed response.
Make sure that deriv includes one so that the first derivative
is available for calculating the extra.rate.
|
sep |
A character giving the separator to use when the
levels of individuals are combined. This is needed to avoid
using a character that occurs in a factor to delimit
levels when the levels of individuals are combined to identify
subsets.
|
na.x.action |
A character string that specifies the action to
be taken when values of x are NA. The possible
values are fail, exclude or omit.
For exclude and omit, predictions and derivatives
will only be obtained for nonmissing values of x.
The difference between these two codes is that, for exclude, the returned
data.frame will have as many rows as data, the
missing values having been incorporated.
|
na.y.action |
A character string that specifies the action to
be taken when values of y, or the response, are
NA. The possible values are fail, exclude,
omit, allx, trimx, ltrimx or
rtrimx. For all options, except fail, missing
values in y will be removed before smoothing.
For exclude and omit, predictions
and derivatives will be obtained only for nonmissing values of
x that do not have missing y values. Again, the
difference between these two is that, only for exclude
will the missing values be incorporated into the
returned data.frame. For allx, predictions and
derivatives will be obtained for all nonmissing x.
For trimx, they will be obtained for all nonmissing
x between the first and last nonmissing y values
that have been ordered for x; for ltrimx and
rtrimx either the lower or upper missing y
values, respectively, are trimmed.
|
... |
Allows for arguments to be passed to smooth.spline.
|
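The default for npspline.segments described above is plain arithmetic and can be checked directly. The function name default_segments is hypothetical and only restates the formula max(10, ceiling((n - 1) / 2)) from the argument description, with n standing for the number of rows.

```r
# Hypothetical restatement of the documented default for npspline.segments
default_segments <- function(n) max(10, ceiling((n - 1) / 2))

n_rows <- c(10, 21, 22, 50)
segments <- sapply(n_rows, default_segments)
segments
# 10 10 11 25
```

With few rows the floor of 10 segments applies; once ceiling((n - 1) / 2) exceeds 10, the number of segments grows to roughly half the number of rows.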
Value
A data.frame containing data to which has been added a column with the
fitted smooth, the name of the column being response.smoothed. If deriv
is not NULL, columns containing the values of the derivative(s) will be
added to data; the name of each of these columns will be the value of
response.smoothed with .dvf appended, where f is the order of the
derivative, or the value of response.smoothed with the corresponding
element of suffices.deriv appended.
If RGR is not NULL, the RGR is calculated as the ratio of the value of
the first derivative of the fitted spline to the fitted value for the
spline.
Any pre-existing smoothed and derivative columns in data will be
replaced. The ordering of the data.frame for the x values will be
preserved as far as is possible; the main difficulty is with the
handling of missing values by the function merge. Thus, if missing
values in x are retained, they will occur at the bottom of each subset
of individuals and the order will be problematic when there are missing
values in y and na.y.action is set to omit.
Author(s)
Chris Brien
References
Eilers, P.H.C. and Marx, B.D. (2021) Practical smoothing: the joys of P-splines. Cambridge University Press, Cambridge.
Huang, C. (2001) Boundary corrected cubic smoothing splines. Journal of Statistical Computation and Simulation, 70, 107-121.
See Also
fitSpline, probeSmoothing, splitContGRdiff, smooth.spline,
predict.smooth.spline, split
Examples
data(exampleData)
#smoothing with growth rates calculated using derivatives
longi.dat <- splitSplines(longi.dat, response="PSA", x="xDAP",
                          individuals = "Snapshot.ID.Tag",
                          df = 4, deriv=1, suffices.deriv="AGRdv",
                          extra.rate = c(RGRdv = "RGR"))
#Use P-splines
longi.dat <- splitSplines(longi.dat, response="PSA", x="xDAP",
                          individuals = "Snapshot.ID.Tag",
                          spline.type = "PS", lambda = 0.1, npspline.segments = 10,
                          deriv=1, suffices.deriv="AGRdv",
                          extra.rate = c(RGRdv = "RGR"))
#with segmented smoothing
longi.dat <- splitSplines(longi.dat, response="PSA", x="xDAP",
                          individuals = "Snapshot.ID.Tag",
                          smoothing.segments = list(c(28,34), c(35,42)), df = 5)
[Package growthPheno version 2.1.25 Index]