add_gaps {mpathsenser} | R Documentation |
Add gap periods to sensor data
Description
Since there may be many gaps in mobile sensing data, it is pivotal to pay attention to them in
the analysis. This function adds known gaps to data as "measurements", which makes calculations
such as finding durations easier. For instance, consider a participant who spent 30 minutes
walking. If it is known that there is a gap of 15 minutes in this interval, we should somehow
account for it. add_gaps() does so by adding the gap data to the sensor data, splitting
intervals where gaps occur.
Usage
add_gaps(data, gaps, by = NULL, continue = FALSE, fill = NULL)
Arguments
data |
A data frame containing the data. See |
gaps |
A data frame (extension) containing the gap data. See |
by |
A character vector indicating the variable(s) to match by, typically the participant
IDs. If NULL, the default, |
continue |
Whether to continue the measurement(s) prior to the gap once the gap ends. |
fill |
A named list of the columns to fill with default values for the extra measurements that are added because of the gaps. |
Details
In the example of 30 minutes of walking where a 15-minute gap occurred (say after 5
minutes), add_gaps() adds two rows: one 5 minutes after the start of the interval,
indicating the start of the gap (if needed, containing values from fill), and one 20
minutes after the start of the interval, signalling the walking activity. Then, when calculating
time differences between subsequent measurements, the gap period is appropriately accounted
for. Note that if multiple measurements occurred before the gap, they will all be continued
after the gap.
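The time-difference calculation mentioned above can be sketched in base R. The data frame below is a hand-built stand-in for the rows described in this section, not actual add_gaps() output; the "GAP" label assumes something like fill = list(type = "GAP") was used, and the final "STILL" row is a hypothetical next measurement that closes the interval.

```r
# Hand-built stand-in for the walking example (assumption, not add_gaps() output):
# walking starts at 10:00, a gap starts at 10:05, walking continues at 10:20,
# and a hypothetical next measurement arrives at 10:30.
dat <- data.frame(
  time = as.POSIXct(
    c("2022-05-10 10:00:00", "2022-05-10 10:05:00",
      "2022-05-10 10:20:00", "2022-05-10 10:30:00"),
    tz = "UTC"
  ),
  type = c("WALKING", "GAP", "WALKING", "STILL")
)

# Duration of each row in minutes: time until the next measurement.
# The last row has no successor, so its duration is NA.
dat$duration <- c(as.numeric(diff(dat$time), units = "mins"), NA)

# Total walking time, excluding the gap
walking_minutes <- sum(dat$duration[dat$type == "WALKING"], na.rm = TRUE)
walking_minutes
```

Without the gap rows, the same calculation would attribute the full 30 minutes to walking.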
Value
A tibble containing the data and the added gaps.
Warning
Depending on the sensor that is used to identify the gaps (typically the highest frequency
sensor, such as the accelerometer or gyroscope), there may be a small delay between the
recorded start of a gap and its actual start. For example, if the accelerometer samples every
5 seconds, the app may have been killed 4.99 seconds after the last accelerometer measurement
(so just before the next measurement). However, within that time other measurements may still
have taken place, thereby technically occurring "within" the gap. This is especially important
if you want to use these gaps in add_gaps(), since this issue may lead to erroneous results.
An easy way to solve this problem is to take all sensors of interest into account when identifying the gaps, thereby ensuring that there are no measurements of these sensors within a gap. One way to account for this is to (as in the example above) search for gaps that are 5 seconds longer than you want and afterwards move the start time of each gap 5 seconds later.
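A minimal base R sketch of this adjustment follows. It assumes the gaps are already available as a data frame with from and to columns; the values, the 5-second sampling_interval, and the 60-second min_gap are all hypothetical and chosen only for illustration.

```r
# Hypothetical gap data with `from` and `to` columns (illustrative values)
gaps <- data.frame(
  participant_id = "12345",
  from = as.POSIXct(
    c("2022-05-10 10:05:00", "2022-05-10 10:50:00", "2022-05-10 11:15:00"),
    tz = "UTC"
  ),
  to = as.POSIXct(
    c("2022-05-10 10:20:00", "2022-05-10 11:10:00", "2022-05-10 11:16:03"),
    tz = "UTC"
  )
)

sampling_interval <- 5 # seconds between samples of the gap-defining sensor (assumed)
min_gap <- 60          # minimum gap duration of interest, in seconds (assumed)

# Keep only gaps that are at least `sampling_interval` seconds longer than wanted
gap_length <- as.numeric(difftime(gaps$to, gaps$from, units = "secs"))
gaps <- gaps[gap_length >= min_gap + sampling_interval, ]

# Move the start of each remaining gap later by one sampling interval
# (adding a number to a POSIXct value adds seconds)
gaps$from <- gaps$from + sampling_interval
gaps
```

Here the third gap (63 seconds) is dropped because it is shorter than 65 seconds, and the remaining gaps start 5 seconds later than recorded.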
See Also
identify_gaps() for finding gaps in the sampling; link_gaps() for linking gaps to
ESM data, analogous to link().
Examples
# Define some data
dat <- data.frame(
  participant_id = "12345",
  time = as.POSIXct(c("2022-05-10 10:00:00", "2022-05-10 10:30:00", "2022-05-10 11:30:00")),
  type = c("WALKING", "STILL", "RUNNING"),
  confidence = c(80, 100, 20)
)
# Gaps would normally come from identify_gaps(), but in this example we define them ourselves
gaps <- data.frame(
  participant_id = "12345",
  from = as.POSIXct(c("2022-05-10 10:05:00", "2022-05-10 10:50:00")),
  to = as.POSIXct(c("2022-05-10 10:20:00", "2022-05-10 11:10:00"))
)
# Now add the gaps to the data
add_gaps(
  data = dat,
  gaps = gaps,
  by = "participant_id"
)
# You can use fill if you want to get rid of those pesky NAs
add_gaps(
  data = dat,
  gaps = gaps,
  by = "participant_id",
  fill = list(type = "GAP", confidence = 100)
)