summarize_skewness {sparklyr.flint} | R Documentation |
Skewness summarizer
Description
Compute the skewness (third standardized moment) of 'column' and store the result in a new column named '<column>_skewness'.
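For reference, the third standardized moment mentioned above is conventionally defined as shown below, where \mu and \sigma denote the mean and standard deviation of the values in 'column'. This page does not say whether the underlying implementation uses this population form or a bias-corrected sample estimator, so read it as the textbook definition rather than a description of the implementation.

```latex
\[
\operatorname{skewness}(X)
  = \operatorname{E}\!\left[\left(\frac{X - \mu}{\sigma}\right)^{3}\right]
  = \frac{\operatorname{E}\!\left[(X - \mu)^{3}\right]}{\sigma^{3}}
\]
```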
Usage
summarize_skewness(ts_rdd, column, key_columns = list(), incremental = FALSE)
Arguments
ts_rdd | Timeseries RDD being summarized |
column | Column to be summarized |
key_columns | Optional list of columns that will form an equivalence relation associating each record with the time series it belongs to (i.e., any 2 records having equal values in those columns will be associated with the same time series, and any 2 records having differing values in those columns are considered to be from 2 separate time series and will therefore be summarized separately). By default, 'key_columns' is empty and all records are considered to be part of a single time series. |
incremental | If FALSE and 'key_columns' is empty, then apply the summarizer to all records of 'ts_rdd'. If FALSE and 'key_columns' is non-empty, then apply the summarizer to all records within each group determined by 'key_columns'. If TRUE and 'key_columns' is empty, then for each record in 'ts_rdd', the summarizer is applied to that record and all records preceding it, and the summarized result is associated with the timestamp of that record. If TRUE and 'key_columns' is non-empty, then for each record within a group of records determined by 1 or more key columns, the summarizer is applied to that record and all records preceding it within its group, and the summarized result is associated with the timestamp of that record. |
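The four combinations of 'key_columns' and 'incremental' described above can be emulated in plain R, which may help when checking what a given call is expected to return. This is only a sketch of the semantics, not of how the summarizer is implemented: the helper `skewness()` is hypothetical, it assumes the population (biased) form of the third standardized moment, and the `price` values are made-up data chosen so that the skewness is non-zero.

```r
# Hypothetical helper: population (biased) third standardized moment.
# Assumption: the estimator actually used by the summarizer is not stated
# in this documentation and may differ (e.g., by a bias correction).
skewness <- function(x) {
  m <- mean(x)
  m2 <- mean((x - m)^2)
  m3 <- mean((x - m)^3)
  m3 / m2^1.5
}

# Illustrative data with the same columns as the Examples section,
# already sorted by time.
df <- data.frame(
  time = ceiling(seq(12) / 2),
  price = seq(12)^2,
  id = rep(c(3L, 7L), 6)
)

# incremental = FALSE, empty key_columns: one value over all records.
overall <- skewness(df$price)

# incremental = FALSE, key_columns = list("id"): one value per group.
per_id <- tapply(df$price, df$id, skewness)

# incremental = TRUE, empty key_columns: each record is summarized together
# with all records preceding it. In this emulation the first value is NaN,
# because a single record has zero variance.
running <- sapply(
  seq_len(nrow(df)),
  function(i) skewness(df$price[seq_len(i)])
)

# incremental = TRUE, key_columns = list("id"): the same running computation,
# restricted to the records of each group.
running_per_id <- ave(df$price, df$id, FUN = function(x) {
  sapply(seq_along(x), function(i) skewness(x[seq_len(i)]))
})
```

In this emulation the last running value equals the non-incremental result, and the last running value within each group equals that group's non-incremental result, which mirrors the relationship between the TRUE and FALSE settings of 'incremental' described above.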
See Also
Other summarizers: ols_regression(), summarize_avg(), summarize_corr2(), summarize_corr(), summarize_count(), summarize_covar(), summarize_dot_product(), summarize_ema_half_life(), summarize_ewma(), summarize_geometric_mean(), summarize_kurtosis(), summarize_max(), summarize_min(), summarize_nth_central_moment(), summarize_nth_moment(), summarize_product(), summarize_quantile(), summarize_stddev(), summarize_sum(), summarize_var(), summarize_weighted_avg(), summarize_weighted_corr(), summarize_weighted_covar(), summarize_z_score()
Examples
library(sparklyr)
library(sparklyr.flint)

sc <- try_spark_connect(master = "local")

if (!is.null(sc)) {
  price_sdf <- copy_to(
    sc,
    data.frame(
      time = ceiling(seq(12) / 2),
      price = seq(12) / 2,
      id = rep(c(3L, 7L), 6)
    )
  )
  ts <- fromSDF(price_sdf, is_sorted = TRUE, time_unit = "DAYS")
  ts_skewness <- summarize_skewness(ts, column = "price")
} else {
  message("Unable to establish a Spark connection!")
}