mts_sample {MazamaTimeSeries} | R Documentation |
Sample time series for an mts time series object
Description
Reduce the number of records (timesteps) in the data dataframe of the incoming mts through random sampling.
Usage
mts_sample(
mts = NULL,
sampleSize = 5000,
seed = NULL,
keepOutliers = FALSE,
width = 5,
thresholdMin = 3
)
Arguments
mts |
mts object. |
sampleSize |
Non-negative integer giving the number of rows to choose. |
seed |
Integer passed to |
keepOutliers |
Logical specifying a graphics-focused sampling algorithm that retains outliers (see Details). |
width |
Integer width of the rolling window used for outlier detection. |
thresholdMin |
Numeric threshold for outlier detection. |
Details
When keepOutliers = FALSE, random sampling is used to provide a statistically relevant subsample of the data.
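The default behavior can be illustrated with a minimal base R sketch. It is not the package's implementation: the `data` dataframe below is a hypothetical stand-in for the data dataframe of an mts object, and re-sorting the sampled rows to keep them in time order is an assumption.

```r
# Hypothetical stand-in for the 'data' dataframe of an mts object:
# one datetime column plus one column of hourly values.
set.seed(123)  # fixed seed so the sample is reproducible
n <- 1000
data <- data.frame(
  datetime = seq(as.POSIXct("2024-01-01", tz = "UTC"), by = "hour", length.out = n),
  id1 = rnorm(n)
)

sampleSize <- 100

# Randomly choose row indices without replacement, never asking for more
# rows than exist, then sort them so the result stays in time order.
keep <- sort(sample(seq_len(nrow(data)), size = min(sampleSize, nrow(data))))
sampled <- data[keep, ]

nrow(sampled)
```

Setting the seed before sampling is what makes repeated runs return the same rows.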
Value
A subset of the given mts object: an mts time series object with fewer timesteps.
(A list with meta and data dataframes.)
Outlier Detection
When keepOutliers = TRUE, a customized sampling algorithm is used. It attempts to create subsets whose plots are visually identical to plots of the full dataset. This is accomplished by preserving outliers and only sampling data in regions where overplotting is expected.
The process is as follows:
1. find outliers using MazamaRollUtils::findOutliers()
2. create a subset consisting of only outliers
3. sample the remaining data
4. merge the outliers and sampled data
This algorithm works best when the mts object has only one or two timeseries.
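The four steps above can be sketched in base R for a single series. This is an illustration under stated assumptions, not the package's code: a simple rolling-median rule built on stats::runmed() stands in for MazamaRollUtils::findOutliers(), so `width` and `thresholdMin` here only mimic the roles of the real parameters, and sampling `sampleSize` rows from the non-outlier remainder is also an assumption.

```r
set.seed(42)

# Synthetic series: a smooth signal with noise and three injected spikes.
n <- 2000
x <- sin(seq_len(n) / 50) + rnorm(n, sd = 0.1)
spikes <- c(250, 900, 1500)
x[spikes] <- x[spikes] + 5

width <- 5         # rolling window width (must be odd for runmed)
thresholdMin <- 3  # flag points this many robust SDs from the rolling median
sampleSize <- 200

# Step 1: find outliers (stand-in detector, NOT findOutliers()).
residual <- x - stats::runmed(x, k = width)
outlierIndices <- which(abs(residual) / stats::mad(residual) > thresholdMin)

# Step 2: the outlier subset is every flagged index.
# Step 3: sample the remaining, non-outlier indices.
remaining <- setdiff(seq_len(n), outlierIndices)
sampledIndices <- sample(remaining, size = min(sampleSize, length(remaining)))

# Step 4: merge outliers and sampled data, restoring time order.
keep <- sort(union(outlierIndices, sampledIndices))

length(keep)
```

Plotting `x[keep]` against `keep` alongside the full series is a quick way to check that the spikes survive the reduction.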
The width and thresholdMin parameters determine the number of outliers detected. For hourly data, a width of 5 and a thresholdMin of 3 or 4 seem to find many visually obvious outliers.
Users attempting to optimize plotting speed for lengthy time series are encouraged to experiment with these two parameters along with sampleSize and review the results visually.
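One way to organize that experimentation is a small helper that runs mts_sample() over a grid of settings and reports how many rows each one retains. The helper name `compareOutlierSettings` and its default grid are hypothetical; only the mts_sample() arguments shown in Usage are taken from this page, and the MazamaTimeSeries package must be installed for the helper to run.

```r
# Hypothetical helper: for each (width, thresholdMin) pair, sample the
# given mts with keepOutliers = TRUE and record the number of rows kept.
compareOutlierSettings <- function(
  mts,
  sampleSize = 5000,
  widths = c(5, 7),
  thresholds = c(3, 4),
  seed = 123
) {
  grid <- expand.grid(width = widths, thresholdMin = thresholds)
  grid$rows <- mapply(
    function(w, t) {
      sampled <- MazamaTimeSeries::mts_sample(
        mts,
        sampleSize = sampleSize,
        seed = seed,
        keepOutliers = TRUE,
        width = w,
        thresholdMin = t
      )
      nrow(sampled$data)  # the returned object is a list with meta and data
    },
    grid$width,
    grid$thresholdMin
  )
  grid
}
```

Row counts only show how aggressive each setting is; the sampled plots still need to be compared visually against a plot of the full data.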
See MazamaRollUtils::findOutliers().