R: Add gap periods to sensor data

add_gaps {mpathsenser}

R Documentation

Add gap periods to sensor data

Description

Mobile sensing data often contains many gaps, so it is important to account for them in the analysis. This function adds known gaps to the data as "measurements", which makes calculations such as durations easier. For instance, consider a participant who spent 30 minutes walking. If it is known that there is a gap of 15 minutes within this interval, we should account for it. add_gaps does so by adding the gap data to the sensor data, splitting intervals where gaps occur.

Usage

add_gaps(data, gaps, by = NULL, continue = FALSE, fill = NULL)

Arguments

`data`	A data frame containing the data. See `get_data()` for retrieving data from an mpathsenser database.
`gaps`	A data frame (extension) containing the gap data. See `identify_gaps()` for retrieving gap data from an mpathsenser database. It should at least contain the columns `from` and `to` (both in a date-time format), as well as any specified columns in `by`.
`by`	A character vector indicating the variable(s) to match by, typically the participant IDs. If NULL, the default, `*_join()` will perform a natural join, using all variables in common across `x` and `y`.
`continue`	Whether to continue the measurement(s) prior to the gap once the gap ends.
`fill`	A named list of the columns to fill with default values for the extra measurements that are added because of the gaps.

Details

In the example of 30 minutes of walking in which a 15-minute gap occurred (say, after 5 minutes), add_gaps() adds two rows: one 5 minutes after the start of the interval, indicating the start of the gap (if needed, containing values from fill), and one 20 minutes after the start of the interval, signalling the walking activity. Then, when calculating time differences between subsequent measurements, the gap period is appropriately accounted for. Note that if multiple measurements occurred before the gap, they will all be continued after the gap.
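The duration calculation described above can be sketched in base R. The table below is built by hand to mimic the rows described in this section; it is not actual add_gaps() output, and the names `walk`, `duration` and `totals`, the "GAP" label and the closing "STILL" row are assumptions made for illustration.

```r
# Hand-built table mimicking the rows described above (NOT real add_gaps() output).
# The final "STILL" row is an assumed next measurement that ends the walking interval.
walk <- data.frame(
  time = as.POSIXct(
    c("2022-05-10 10:00:00", "2022-05-10 10:05:00",
      "2022-05-10 10:20:00", "2022-05-10 10:30:00"),
    tz = "UTC"
  ),
  type = c("WALKING", "GAP", "WALKING", "STILL")
)

# Duration of each row: time until the next measurement, in minutes.
# The last row has no successor, so its duration is NA.
walk$duration <- c(as.numeric(diff(walk$time), units = "mins"), NA)

# Total minutes per type, ignoring the open-ended last row
totals <- tapply(walk$duration, walk$type, sum, na.rm = TRUE)
totals
```

Because the gap is a row of its own, the walking time sums to the 15 minutes that were actually observed rather than the full 30-minute interval.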

Value

A tibble containing the data and the added gaps.

Warning

Depending on the sensor that is used to identify the gaps (typically the highest-frequency sensor, such as the accelerometer or gyroscope), there may be a small delay between the identified start of the gap and its actual start. For example, if the accelerometer samples every 5 seconds, the app may have been killed 4.99 seconds after the last accelerometer measurement (so just before the next measurement). Within that time, other measurements may still have taken place, thereby technically occurring "within" the gap. This is especially important if you want to use these gaps in add_gaps, since this issue may lead to erroneous results.
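As a rough illustration of this issue, the following base R sketch flags measurements that fall strictly inside an identified gap. All data and names (`measurements`, `gap`, `within_gap`) are made up for the example and are not part of the package.

```r
# Hypothetical measurements and one identified gap (values invented for illustration)
measurements <- data.frame(
  sensor = c("accelerometer", "activity", "accelerometer"),
  time = as.POSIXct(
    c("2022-05-10 10:05:00", "2022-05-10 10:05:03", "2022-05-10 10:20:00"),
    tz = "UTC"
  )
)
gap <- data.frame(
  from = as.POSIXct("2022-05-10 10:05:00", tz = "UTC"),
  to = as.POSIXct("2022-05-10 10:20:00", tz = "UTC")
)

# A measurement lies "within" a gap if it is strictly between from and to
within_gap <- vapply(
  seq_len(nrow(measurements)),
  function(i) any(measurements$time[i] > gap$from & measurements$time[i] < gap$to),
  logical(1)
)

# Rows that technically occurred inside the gap
measurements[within_gap, ]
```

Here the activity measurement, taken 3 seconds after the last accelerometer sample, is flagged even though the gap was identified from the accelerometer alone.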

An easy way to solve this problem is to take all sensors of interest into account when identifying the gaps, thereby ensuring that there are no measurements of these sensors within the gap. Another way, following the 5-second example above, is to search for gaps 5 seconds longer than you want and then increase the start time of each gap by 5 seconds.
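The start-time adjustment can be sketched in base R as follows. This assumes a gap table with `from` and `to` columns, as described under Arguments; `sampling_interval` and `gaps_adjusted` are hypothetical names, and the 5-second value is taken from the accelerometer example rather than from any package default.

```r
# Assumed sampling period of the sensor used to identify gaps, in seconds
sampling_interval <- 5

# Hypothetical gap table; in practice this would come from identify_gaps()
gaps <- data.frame(
  participant_id = "12345",
  from = as.POSIXct("2022-05-10 10:05:00", tz = "UTC"),
  to = as.POSIXct("2022-05-10 10:20:00", tz = "UTC")
)

# Move each gap's start forward by one sampling interval
gaps_adjusted <- gaps
gaps_adjusted$from <- gaps$from + sampling_interval

# Drop any gap that became empty after the shift
gaps_adjusted <- gaps_adjusted[gaps_adjusted$from < gaps_adjusted$to, ]
gaps_adjusted
```

Shifting the start only helps if the gaps were identified with a correspondingly longer minimum duration; otherwise short gaps may shrink to nothing and be dropped.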

Examples

# Define some data
dat <- data.frame(
  participant_id = "12345",
  time = as.POSIXct(c("2022-05-10 10:00:00", "2022-05-10 10:30:00", "2022-05-10 11:30:00")),
  type = c("WALKING", "STILL", "RUNNING"),
  confidence = c(80, 100, 20)
)

# Gaps would normally come from identify_gaps(), but in this example we define them ourselves
gaps <- data.frame(
  participant_id = "12345",
  from = as.POSIXct(c("2022-05-10 10:05:00", "2022-05-10 10:50:00")),
  to = as.POSIXct(c("2022-05-10 10:20:00", "2022-05-10 11:10:00"))
)

# Now add the gaps to the data
add_gaps(
  data = dat,
  gaps = gaps,
  by = "participant_id"
)

# You can use fill if you want to get rid of those pesky NAs
add_gaps(
  data = dat,
  gaps = gaps,
  by = "participant_id",
  fill = list(type = "GAP", confidence = 100)
)

[Package mpathsenser version 1.2.3 Index]