changeSupport {timetools} | R Documentation |
Function to change time support of TimeIntervalDataFrame
Description
Methods that allows to agregate AND disagregate homogeneous AND heterogeneous time data.
Usage
changeSupport(from, to, min.coverage, FUN = NULL,
weights.arg = NULL, split.from = FALSE,
merge.from = TRUE, ...)
## S4 method for signature 'TimeIntervalDataFrame,POSIXctp,numeric'
changeSupport(from, to, min.coverage, FUN=NULL,
weights.arg=NULL, split.from=FALSE,
merge.from=TRUE, ...)
## S4 method for signature
## 'TimeIntervalDataFrame,TimeIntervalDataFrame,numeric'
changeSupport(from, to, min.coverage,
FUN=NULL, weights.arg=NULL,
split.from=FALSE, merge.from=TRUE, ...)
## S4 method for signature 'TimeIntervalDataFrame,character,numeric'
changeSupport(from, to, min.coverage, FUN=NULL,
weights.arg=NULL, split.from=FALSE,
merge.from=TRUE, ...)
Arguments
from |
|
to |
an object indicating the new support, see specific sections |
min.coverage |
a numeric between 0 and 1 indicating the percentage of valid values over each interval to allow an aggregation. NA is returned if the percentage is not reach. In changeSupport, when values are aggregated, intervals are not allowed to overlap. When a function (FUN) has a na.rm argument, the na.rm=TRUE behaviour is met if na.rm is set to TRUE and min.coverage to 0 (zero) ; the na.rm=FALSE behaviour is met if na.rm is set to FALSE whatever is the value of min.coverage. If min.coverage si as.numeric(NA), the function FUN is apply on all data within the interval, without checking if there is any overlapping. In this case, the result of the transformation must be analysed carefully. |
FUN |
function use to agregate data of from. By
default |
weights.arg |
if FUN has a ‘weight’ argument,
this parameter must be a character naming the weight
argument. For instance, if FUN is
|
... |
arguments for FUN or for other methods |
split.from |
logical indicating if data in ‘from’ can be used for several intervals of the new time support (see ‘details’). |
merge.from |
logical indicating if data in ‘from’ can be merged over interval of the new time support. |
Details
Agregating homogeneous data is for example to calculate daily means of time series from hourly time series.
Agregating heterogeneous data is for example to calculate annual means of time series from monthly time series (because each month doesn(t have identical weight).
In above cases, the min.coverage
allows to control
if means should be calculated or not : for the monthly
case, if there are NA
values and the time coverage
of ‘not NA’ values is lower min.coverage
the result will be NA
; if time coverage is higher
than min.coverage
, the annual mean will be
‘estimated’ by the mean of available data.
Disagregating data is more ‘articficial’ and is
disabled by default (with the split.from
argument). This argument is also used to precise if one
value can be use for agregation in more than one interval
in the resulting TimeIntervalDataFrame (for sliding
intervals for instance). Here are some examples of time
disagregration :
A weekly mean can be dispatched over the days of the week. By default, the value attribuated to each day is the value of the week, but this can be changed by using a special function (
FUN
argument).The value of a variable is known from monday at 15 hours to tuesday at 15 hours and from tuesday at 15 hours to wednesday at 15 hours. To ‘evaluate’ the value of the variable for tuesday can be estimated by doing a weigthed mean between the two values. Weights are determined by the intersection between each interval and tuesday. Here weights will be
0.625
(15/24) and0.375
(9/24) (In this case, disagration is combined with a ‘reagregation’).
These are ‘trivial’ examples but many other usage
can be found for these methods. Other functions than
weighted.mean
or mean
can be used. The Qair
package (in its legislative part) gives several examples
of usage (this package is not availables on CRAN but see
‘references’ to know where you can find it).
Value
from=TimeIntervalDataFrame, to=TimeIntervalDataFrame
to
is a TimeIntervalDataFrame. The method will try
to adapt data of from
over interval of to
.
The returned object is the to
TimeIntervalDataFrame with new columns corresponding of
those of from
.
If merge.from is TRUE, values affected for each interval
of to
will be calculated with all data in the
interval. If split.from is TRUE, values partially in the
interval will also be used for calculation.
If merge.from is FALSE, values affected for each interval
of to
will be the one inside this interval. If
several values are inside the interval, NA
will be
affected. If split.from is TRUE, a value partially inside
the interval is considered as being inside it. So if
there is no other values in the interval, this value will
be affected, else NA
will be affected.
from=TimeIntervalDataFrame, to=character
to
is one of 'year', 'month', 'day', 'hour',
'minute' or 'second'. It defines the period
(POSIXctp
) to use to build the new
TimeIntervalDataFrame on which from
will be
agregated (or disagregated).
So first, an ‘empty’ (no data) TimeIntervalDataFrame is created, and then, the agregation is done accordingly to the ‘from=TimeIntervalDataFrame, to=TimeIntervalDataFrame’ section.
from=TimeIntervalDataFrame, to=POSIXctp
to
is period (see POSIXctp
). It
defines the base of the new TimeIntervalDataFrame on
which from
will be agregated (or disagregated).
So first, an ‘empty’ (no data) TimeIntervalDataFrame is created, and then, the agregation is done accordingly to the ‘from=TimeIntervalDataFrame, to=TimeIntervalDataFrame’ section.
References
Qair-package : https://sourceforge.net/projects/packagerqair/
See Also
TimeIntervalDataFrame
,
POSIXcti
Examples
ti3 <- TimeIntervalDataFrame (
c('2010-01-01', '2010-01-02', '2010-01-04'), NULL,
'UTC', data.frame(ex3=c(6, 1.5)))
# weighted mean over a period of 3 days with at least 75% of
# coverage (NA is retunr if not)
ti3
d <- POSIXctp(unit='day')
changeSupport (ti3, 3L*d, 0.75)
ti4 <- TimeIntervalDataFrame (
c('2010-01-01', '2010-01-02', '2010-01-04',
'2010-01-07', '2010-01-09', '2010-01-10'), NULL,
'UTC', data.frame(ex4=c(6, 1.5, 5, 3, NA)))
# weighted mean over a period of 3 days with at least 75% of
# coverage (NA is retunr if not) or 50%
ti4
changeSupport (ti4, 3L*d, 0.75)
changeSupport (ti4, 3L*d, 0.5)
# use of split.from
ti1 <- RegularTimeIntervalDataFrame('2011-01-01', '2011-02-01', 'hour')
ti1$value <- 1:nrow(ti1)
# we can calculate sliding mean over periods of 24 hours.
# first lets build the corresponding TimeIntervalDataFrame
ti2 <- RegularTimeIntervalDataFrame('2011-01-01', '2011-02-01', 'hour', 'day')
# if we try to 'project' ti1 over ti2 it won't work :
summary (changeSupport (ti1[1:200,], ti2[1:200,], 0))
# all data are NA because 'spliting' is not enabled. Let's enable it :
summary (changeSupport (ti1[1:200,], ti2[1:200,], 0, split.from=TRUE))