percdata {dataprep} | R Documentation |
Calculate the top and bottom percentiles of each selected variable
Description
The top and bottom percentiles of a variable give a preliminary check for outliers. This function calculates these percentiles for the selected variables of a data frame using only functions from the packages in R's system library, so no other packages need to be called, which saves time.
Usage
percdata(data, start = NULL, end = NULL, group = NULL, diff = 0.1, part = 'both')
Arguments
data |
A data frame for which percentiles are calculated, from the column start to the column end. |
start |
The column number of the first variable to calculate percentiles for. |
end |
The column number of the last variable to calculate percentiles for. |
group |
The column number of the grouping variable, used when the data need to be processed in groups. If grouping is not required, leave it at the default (NULL); if grouping is required, set group to the column number of the grouping variable. |
diff |
The common difference between successive percentiles, in percent. Default is 0.1. |
part |
Which percentiles (parts) to calculate: bottom, top, or both. The default, 'both' (or 2), calculates both the bottom and top parts. Set it to 'bottom' (or 0) for the bottom part only, or to 'top' (or 1) for the top part only. |
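To illustrate what processing in groups means here, the following sketch computes a few bottom and top percentiles separately for each level of a grouping column, using base R only. The data frame df and its values are hypothetical, and this is not necessarily how percdata works internally.

```r
# Hypothetical data: a grouping column followed by one variable.
set.seed(42)
df <- data.frame(
  monthyear = rep(c("2020-01", "2020-02"), each = 500),
  conc      = c(rnorm(500, mean = 10), rnorm(500, mean = 20))
)

# Split the variable by group and compute the same percentiles per group.
# quantile() takes probabilities in [0, 1], so 0.1% is written as 0.001.
probs  <- c(0, 0.001, 0.999, 1)
by_grp <- sapply(split(df$conc, df$monthyear), quantile, probs = probs)
by_grp
```

The result has one column per group and one row per percentile, which makes it easy to compare the extremes across groups.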
Details
The data to be processed range from the column start to the column end. The column numbers of these two columns are passed as arguments, so the variables to be processed must occupy a continuous range of columns in the data frame. If they do not, move the columns in advance so that they are contiguous.
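If the variables of interest are not adjacent, the columns can be rearranged with base R before calling percdata. The sketch below assumes a small hypothetical data frame; df, its column names, and the helper variable vars are illustrative only and are not part of dataprep.

```r
# Hypothetical data frame: the variables of interest (a, b, c) are
# separated by an unrelated column (note).
df <- data.frame(
  a    = c(1.2, 3.4, 5.6),
  note = c("x", "y", "z"),
  b    = c(10, 20, 30),
  c    = c(0.1, 0.2, 0.3)
)

# Move the variables of interest to the front so that they occupy a
# continuous range of columns; the remaining columns follow.
vars <- c("a", "b", "c")
df2  <- df[, c(vars, setdiff(names(df), vars))]

# Column numbers to pass as start and end.
start <- match(vars[1], names(df2))
end   <- match(vars[length(vars)], names(df2))
```

With this layout, start and end delimit exactly the columns a, b and c, and can be passed to percdata together with df2.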
Value
Top (highest or greatest) and bottom (lowest or smallest) percentiles are calculated. With the default diff (= 0.1), the calculated values are as follows.
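0th |
Quantile with probability 0 |
0.1th |
Quantile with probability 0.001 |
0.2th |
Quantile with probability 0.002 |
0.3th |
Quantile with probability 0.003 |
0.4th |
Quantile with probability 0.004 |
0.5th |
Quantile with probability 0.005 |
99.5th |
Quantile with probability 0.995 |
99.6th |
Quantile with probability 0.996 |
99.7th |
Quantile with probability 0.997 |
99.8th |
Quantile with probability 0.998 |
99.9th |
Quantile with probability 0.999 |
100th |
Quantile with probability 1 |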
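For orientation, the percentiles listed above can be reproduced for a single variable with quantile() from the base stats package. This is only a sketch of the idea under the default diff, not necessarily how percdata is implemented; the vector x is simulated, and the layout of percdata's actual output may differ.

```r
# Simulated values standing in for one variable (an assumption for illustration).
set.seed(1)
x <- rnorm(10000)

# Percentile levels for the default diff = 0.1: 0, 0.1, ..., 0.5 (bottom)
# and 99.5, 99.6, ..., 100 (top), expressed in percent.
diff   <- 0.1
bottom <- (0:5) * diff
top    <- 100 - (5:0) * diff

# quantile() expects probabilities between 0 and 1, so divide by 100.
probs <- c(bottom, top) / 100
res   <- quantile(x, probs = probs, na.rm = TRUE, names = FALSE)
names(res) <- paste0(c(bottom, top), "th")
res
```

The 0th and 100th percentiles are simply the minimum and maximum of x; values far from their neighbouring percentiles are candidates for a closer look as outliers.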
Author(s)
Chun-Sheng Liang <liangchunsheng@lzu.edu.cn>
References
1. The example data are from https://smear.avaa.csc.fi/download. They include particle number concentrations measured at SMEAR I in Varrio forest.
See Also
dataprep::percplot
Examples
# Keep the grouping variable (column 4, 'monthyear') and the variables that
# remain after deletion by varidele with a fraction of 0.25 (columns 27 to 61).
# In the extracted data frame the group is in the first column and the
# variables are in columns 2 to 36.
percdata(data[, c(4, 27:61)], start = 2, end = 36, group = 1)