simu_db {REDI} | R Documentation |
Generate a synthetic dataset tailored for REDI computations
Description
Simulate a complete training dataset, which may be representative of various
applications. Several flexible arguments allow adjustment of the range of
observed days, the distribution and the mean of Output values, as well as
the ratio of missing data.
Usage
simu_db(
start_date = "2022-01-01",
end_date = "2023-01-01",
by = "day",
output_distrib = "Gaussian",
ratio_missing = 0.5,
mean = 50,
var = 10,
range_unif = c(0, 100)
)
Arguments
start_date |
A date, indicating the starting time of observations. Default is '2022-01-01'. |
end_date |
A date, indicating the ending time of observations. Default is '2023-01-01'. |
by |
A number or a character string, indicating the reference time period between two observations. Possible values are 'day', 'week', 'month', 'year', or any arbitrary number. See the documentation of 'seq()' for additional information if necessary. Default is 'day'. |
output_distrib |
A character string, indicating the distribution of Output values, for example 'Gaussian' or 'Uniform'. Default is 'Gaussian'. |
ratio_missing |
A number, between 0 and 1, indicating the ratio of missing values in the dataset. Default is 0.5. |
mean |
A number, indicating the mean value of the Gaussian distribution. Default is 50. |
var |
A number, indicating the variance of the Gaussian distribution. Default is 10. |
range_unif |
A vector, indicating the range of values for the Uniform distribution. Default is c(0,100). |
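To make the interplay of these arguments concrete, here is a minimal base R sketch of how such a dataset could be assembled. It is not the REDI implementation: the function name sketch_simu, the column names Date and Output, and the choice to mark missing values as NA at random positions are all assumptions made for illustration. It does show two points that follow from the argument descriptions: 'by' is handed to seq() to build the date grid, and 'var' is a variance, so it has to be square-rooted before being passed to rnorm(), which expects a standard deviation.

```r
# Hypothetical stand-in for illustration only; not the REDI implementation.
sketch_simu <- function(start_date = "2022-01-01",
                        end_date = "2023-01-01",
                        by = "day",
                        output_distrib = "Gaussian",
                        ratio_missing = 0.5,
                        mean = 50,
                        var = 10,
                        range_unif = c(0, 100)) {
  # Regular grid of observation dates; 'by' is passed straight to seq().
  dates <- seq(as.Date(start_date), as.Date(end_date), by = by)
  n <- length(dates)

  # rnorm() expects a standard deviation, so the variance is square-rooted.
  output <- if (output_distrib == "Gaussian") {
    rnorm(n, mean = mean, sd = sqrt(var))
  } else {
    runif(n, min = range_unif[1], max = range_unif[2])
  }

  # Assumption: missing values are marked as NA at randomly chosen positions.
  n_missing <- round(ratio_missing * n)
  output[sample.int(n, n_missing)] <- NA

  # Assumption: column names 'Date' and 'Output' are chosen for this sketch.
  data.frame(Date = dates, Output = output)
}

set.seed(42)
db <- sketch_simu(output_distrib = "Uniform", ratio_missing = 0.3)
nrow(db)               # 366 dates: 2022-01-01 through 2023-01-01 inclusive
mean(is.na(db$Output)) # share of missing values, close to 0.3
```

With the default dates and by = 'day', both endpoints are included in the grid, which is why the sketch yields 366 rows rather than 365.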
Value
A full dataset of synthetic data.
Examples
## Generate a dataset with Gaussian measurements
data = simu_db(output_distrib = 'Gaussian')
## Generate a dataset with Uniform measurements and 30% of missing data.
data = simu_db(output_distrib = 'Uniform', ratio_missing = 0.3)