interpollen {AeRobiology} | R Documentation |
Interpolation of Missing Data in a Pollen Database by Different Methods
Description
Function to simultaneously replace all missing data of an historical database of several pollen types by using different methods of interpolation.
Usage
interpollen(data, method = "lineal", maxdays = 30, plot = TRUE,
factor = 2, ndays = 3, spar = 0.5, data2 = NULL, data3 = NULL,
data4 = NULL, data5 = NULL, mincorr = 0.6, result = "wide")
Arguments
data |
A |
method |
A |
maxdays |
A |
plot |
A |
factor |
A |
ndays |
A |
spar |
A |
data2 , data3 , data4 , data5 |
A |
mincorr |
A |
result |
A |
Details
This function interpolates missing data in a pollen database using one of five methods, which are described below. Interpolation for each pollen type is done automatically for gaps smaller than the maxdays argument.
- "lineal" method. The interpolation is carried out by tracing a straight line between the gap extremes.
- "movingmean" method. It calculates the moving mean of the daily pollen concentrations, with a window size equal to the gap size multiplied by the factor argument, and replaces the missing data with the moving mean for those days. The window is dynamic: for each gap in the database, the window size of the moving mean changes according to the size of that gap.
- "spline" method. The interpolation is carried out by fitting a spline regression to the days before and after the gap. The number of days on each side of the gap that are taken into account for the spline regression is specified by the ndays argument. The smoothness of the spline fit can be specified by the spar argument.
- "tseries" method. The interpolation is carried out by analysing the time series of the pollen database. It performs a seasonal-trend decomposition based on LOESS (Cleveland et al., 1990). The seasonality of the historical database is extracted and used to predict the missing data by performing a linear regression with the target year.
- "neighbour" method. Other nearby stations provided by the user are used to interpolate the missing data of the target station. First, a Spearman correlation is computed between the target station and each neighbour station, and neighbour stations with a correlation coefficient smaller than the mincorr value are discarded. For each gap, a linear regression is then fitted between the neighbour stations and the target station to determine the equation that converts the pollen concentrations of the neighbour stations into the pollen concentration of the target station. Only neighbour stations without any missing data during the gap period are taken into account for each gap.
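To make the first and third ideas concrete, the following is a minimal base R sketch of a straight-line fill and a smoothing-spline fill on a single gap. It is not the package's internal code: the toy vector conc, the helper names (day, gap, side, lineal_fill, spline_fill) and the use of approx() and smooth.spline() are assumptions chosen for illustration only.

```r
# Toy daily series with one three-day gap (made-up values, for illustration only)
conc <- c(10, 12, 15, NA, NA, NA, 27, 30, 28, 25)
day  <- seq_along(conc)
gap  <- which(is.na(conc))

# "lineal"-style fill: straight line between the observed values at the gap extremes
lineal_fill <- conc
lineal_fill[gap] <- approx(x = day[-gap], y = conc[-gap], xout = day[gap])$y

# "spline"-style fill: smoothing spline fitted to ndays observed values on each
# side of the gap, with smoothness controlled by spar (assumed analogue of the
# ndays and spar arguments described above)
ndays <- 3
spar  <- 0.5
side  <- c((min(gap) - ndays):(min(gap) - 1), (max(gap) + 1):(max(gap) + ndays))
fit   <- smooth.spline(x = day[side], y = conc[side], spar = spar)
spline_fill <- conc
spline_fill[gap] <- predict(fit, x = day[gap])$y

print(lineal_fill[gap])  # straight-line values for days 4 to 6
```

In this sketch the straight-line fill depends only on the two values bordering the gap, while the spline fill also uses the shape of the series on both sides, which is why ndays and spar matter only for the latter.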
Value
This function returns different results:
- If result = "wide", returns a data.frame including the original data, completed with the interpolated data.
- If result = "long", returns a data.frame containing your data in long format (the first column for date, the second for pollen type, the third for concentration, and an additional fourth column with 1 if the value has been interpolated or 0 if not).
- If plot = TRUE, plots with daily values for each year and pollen type are drawn in the active graphics window. Interpolated values are marked in red. If the method argument is "tseries", the seasonality is also drawn in grey.
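The relationship between the wide and long layouts can be sketched in base R as follows. This is only an illustration of the four-column layout described above, not the package's own code: the toy tables original and filled, the pollen types, and the column names type, concentration and interpolated are hypothetical.

```r
# Toy wide table before interpolation (made-up values; NA marks missing days)
original <- data.frame(
  date   = as.Date("2020-04-01") + 0:3,
  Betula = c(5, NA, NA, 20),
  Pinus  = c(1, 2, NA, 4)
)

# The same table after a hypothetical interpolation step (wide layout)
filled <- original
filled$Betula <- c(5, 10, 15, 20)
filled$Pinus  <- c(1, 2, 3, 4)

# Long layout: date, pollen type, concentration, and a 0/1 flag that is 1
# where the value was missing in the original table
types <- setdiff(names(filled), "date")
long <- do.call(rbind, lapply(types, function(type) {
  data.frame(
    date          = filled$date,
    type          = type,
    concentration = filled[[type]],
    interpolated  = as.integer(is.na(original[[type]]))
  )
}))

print(long)
```

The long layout keeps the interpolation flag next to every value, which makes it easy to filter out or highlight interpolated days in later analyses.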
References
Cleveland RB, Cleveland WS, McRae JE, Terpenning I (1990) STL: a seasonal-trend decomposition procedure based on loess. J Off Stat 6(1):3-33.
See Also
Examples
data("munich_pollen")
interpollen(munich_pollen, method = "lineal", plot = FALSE)