interpollen {AeRobiology} | R Documentation |
Function to simultaneously replace all missing data of an historical database of several pollen types by using different methods of interpolation.
interpollen(data, method = "lineal", maxdays = 30, plot = TRUE, factor = 2, ndays = 3, spar = 0.5, data2 = NULL, data3 = NULL, data4 = NULL, data5 = NULL, mincorr = 0.6, result = "wide")
data |
A |
method |
A |
maxdays |
A |
plot |
A |
factor |
A |
ndays |
A |
spar |
A |
data2, data3, data4, data5 |
A |
mincorr |
A |
result |
A |
This function interpolates missing data in a pollen database using one of five methods, which are described below. For each pollen type, gaps smaller than the maxdays
argument are interpolated automatically.
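As a rough illustration of the gap-size rule, the base R sketch below locates runs of missing values and flags those shorter than a threshold. It is not the package's internal code; the vector conc and the threshold value are invented for the example.

```r
# Hypothetical daily series with two gaps: one of 2 days, one of 5 days
conc <- c(3, 5, NA, NA, 8, 10, NA, NA, NA, NA, NA, 4, 2)
maxdays <- 4  # illustrative threshold, not the package default

# Run-length encode the NA mask to find where each gap starts and ends
runs <- rle(is.na(conc))
ends <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1

gaps <- data.frame(start = starts[runs$values],
                   end = ends[runs$values],
                   length = runs$lengths[runs$values])

# Only gaps strictly smaller than maxdays would be interpolated
gaps$interpolate <- gaps$length < maxdays
gaps
```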
"lineal"
method. The interpolation will be carried out by tracing a straight line between the gap extremes.
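The idea behind the "lineal" method can be sketched with approx() from base R, which joins the last value before a gap to the first value after it with a straight line. This is an illustration on an invented vector, not necessarily how the package computes it.

```r
# Hypothetical daily concentrations with a three-day gap
conc <- c(10, 12, NA, NA, NA, 20, 18)
day <- seq_along(conc)

# approx() ignores the NA points and interpolates linearly at every day
filled <- approx(x = day, y = conc, xout = day)$y
filled
```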
"movingmean"
method. It calculates the moving mean of the daily pollen concentrations, using a window size equal to the gap size multiplied by the factor
argument, and replaces the missing data with the moving mean for those days. The window is dynamic: for each gap in the database, the window size of the moving mean changes depending on the size of that gap.
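A minimal sketch of a gap-dependent moving mean follows. It assumes a centred window and ignores missing values inside the window; both choices are assumptions made for the illustration, and moving_mean_fill is a hypothetical helper, not a function of the package.

```r
# Hypothetical helper: fill each gap with a centred moving mean whose
# width is (gap length * factor). Centring and NA handling are assumptions.
moving_mean_fill <- function(conc, factor = 2) {
  runs <- rle(is.na(conc))
  ends <- cumsum(runs$lengths)
  starts <- ends - runs$lengths + 1
  filled <- conc
  for (k in which(runs$values)) {
    # Half-width of the window for this particular gap
    half <- ceiling(runs$lengths[k] * factor / 2)
    for (i in starts[k]:ends[k]) {
      idx <- max(1, i - half):min(length(conc), i + half)
      filled[i] <- mean(conc[idx], na.rm = TRUE)
    }
  }
  filled
}

conc <- c(4, 6, NA, NA, 10, 12, 8)
filled <- moving_mean_fill(conc, factor = 2)
filled
```

With a two-day gap and factor = 2, the window spans two days on each side of every missing day, so a longer gap automatically gets a wider window.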
"spline"
method. The interpolation is carried out by fitting a spline regression to the days immediately before and after the gap. The number of days on each side of the gap used to fit the spline is specified by the ndays
argument. The smoothness of the spline fit can be specified by the spar
argument.
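The roles of ndays and spar can be illustrated with smooth.spline() from base R. Whether the package uses this exact function internally is an assumption; the data are invented.

```r
# Hypothetical daily concentrations with a two-day gap (days 4 and 5)
conc <- c(5, 9, 14, NA, NA, 22, 19, 15)
day <- seq_along(conc)
ndays <- 3   # days used on each side of the gap
spar <- 0.5  # smoothing parameter passed to smooth.spline()

gap <- which(is.na(conc))
# Indices of the ndays before and the ndays after the gap
around <- c((min(gap) - ndays):(min(gap) - 1),
            (max(gap) + 1):(max(gap) + ndays))

# Fit the smoothing spline on the surrounding days only
fit <- smooth.spline(x = day[around], y = conc[around], spar = spar)

# Replace the missing days with the spline prediction
filled <- conc
filled[gap] <- predict(fit, x = gap)$y
filled
```

Larger spar values give a smoother, flatter curve; smaller values follow the surrounding points more closely.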
"tseries"
method. The interpolation is carried out by analysing the time series of the pollen database. It performs a seasonal-trend decomposition based on LOESS (Cleveland et al., 1990). The seasonality of the historical database is extracted and used to predict the missing data by performing a linear regression with the target year.
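The two steps (extract the seasonality, then regress the target year on it) can be sketched with stl() and lm() from base R. The series is synthetic, and the provisional linear bridging of the gap before calling stl(), which does not accept missing values, is an assumption of this sketch rather than documented package behaviour.

```r
set.seed(1)
n_years <- 4
doy <- rep(1:365, times = n_years)
year <- rep(seq_len(n_years), each = 365)

# Synthetic pollen-like season peaking around day 120, plus noise
conc <- pmax(0, 100 * exp(-((doy - 120) / 15)^2) + rnorm(length(doy), sd = 5))

# Create a ten-day gap in the last (target) year
gap <- which(year == n_years & doy %in% 115:124)
observed <- conc
observed[gap] <- NA

# stl() needs a complete series, so bridge the gap provisionally (assumption)
idx <- seq_along(observed)
provisional <- approx(idx, observed, xout = idx)$y

# Seasonal-trend decomposition based on LOESS; keep the seasonal component
decomp <- stl(ts(provisional, frequency = 365), s.window = "periodic")
seasonal <- as.numeric(decomp$time.series[, "seasonal"])

# Linear regression of the target year on the seasonality (NA rows are dropped)
target <- data.frame(obs = observed, seas = seasonal)[year == n_years, ]
fit <- lm(obs ~ seas, data = target)

# Predict the missing days from the seasonal component
filled <- observed
filled[gap] <- predict(fit, newdata = data.frame(seas = seasonal[gap]))
```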
"neighbour"
method. Nearby stations provided by the user are used to interpolate the missing data of the target station. First, a Spearman correlation is calculated between the target station and each neighbour station, and neighbour stations with a correlation coefficient smaller than the mincorr
value are discarded. For each gap, a linear regression is performed between the neighbour stations and the target station to determine the equation that converts the pollen concentrations of the neighbour stations into the pollen concentration of the target station. Only neighbour stations without any missing data during the gap period are taken into account for each gap.
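The screening and regression steps of the "neighbour" method can be sketched as follows with cor() and lm(). The stations, their values and the object names are all invented for the illustration; the package may organise the computation differently.

```r
set.seed(42)
n <- 60
neighbour_a <- runif(n, 0, 100)  # station that tracks the target closely
neighbour_b <- runif(n, 0, 100)  # unrelated station
target <- 0.8 * neighbour_a + 5 + rnorm(n, sd = 3)
target[21:25] <- NA              # five-day gap in the target station
mincorr <- 0.6

neighbours <- data.frame(a = neighbour_a, b = neighbour_b)

# Step 1: Spearman correlation with the target; drop weakly correlated stations
rho <- sapply(neighbours, function(s) {
  cor(target, s, method = "spearman", use = "complete.obs")
})
keep <- names(rho)[rho >= mincorr]

# Step 2: keep only neighbours with no missing data during the gap
gap <- which(is.na(target))
complete_in_gap <- sapply(neighbours[keep], function(s) !anyNA(s[gap]))
keep <- keep[complete_in_gap]

# Step 3: regress the target on the retained neighbours, then predict the gap
train <- data.frame(target = target, neighbours[keep])
fit <- lm(target ~ ., data = train)
filled <- target
filled[gap] <- predict(fit, newdata = neighbours[gap, keep, drop = FALSE])
```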
This function returns different results:
If result = "wide"
, returns a data.frame
including the original data and completed with the interpolated data.
If result = "long"
, returns a data.frame
containing your data in long format (the first column for date, the second for pollen type, the third for concentration and an additional fourth column with 1
if this data has been interpolated or 0
if not).
If plot = TRUE
, the daily values for each year and pollen type are plotted in the active graphics window. Interpolated values are marked in red. If the method
argument is "tseries"
, the seasonality is also represented in grey.
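To make the difference between the two result formats concrete, the sketch below builds a small wide table, fills it linearly, and reshapes it into the four-column long layout described above. The column names (type, concentration, interpolated) and the data are assumptions for the illustration; the names used by the package may differ.

```r
# Hypothetical wide-format data: one date column, one column per pollen type
original <- data.frame(date = as.Date("2020-04-01") + 0:3,
                       Betula = c(12, NA, NA, 30),
                       Poaceae = c(1, 2, NA, 4))
types <- names(original)[-1]

# "wide" style result: same layout, gaps filled (linear fill for simplicity)
wide <- original
rows <- seq_len(nrow(wide))
for (type in types) {
  wide[[type]] <- approx(rows, wide[[type]], xout = rows)$y
}

# "long" style result: date, pollen type, concentration, interpolation flag
long <- data.frame(
  date = rep(wide$date, times = length(types)),
  type = rep(types, each = nrow(wide)),
  concentration = unlist(wide[types], use.names = FALSE),
  # 1 where the original value was missing, 0 otherwise
  interpolated = as.integer(unlist(lapply(original[types], is.na),
                                   use.names = FALSE))
)
long
```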
Cleveland RB, Cleveland WS, McRae JE, Terpenning I (1990) STL: a seasonal-trend decomposition procedure based on loess. J Off Stat 6(1):3-33.
data("munich_pollen")
interpollen(munich_pollen, method = "lineal", plot = FALSE)