deduplicate {webtrackR} | R Documentation |
Deduplicate visits
Description
deduplicate() flags, drops or aggregates duplicates, which are defined as
consecutive visits to the same URL within a certain time frame.
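The definition above can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy data, the `within_secs` variable, the `duplicate` column name, and the choice of `<=` for the time comparison are all assumptions made for illustration.

```r
# Toy data: one panelist, visits already sorted by timestamp (hypothetical values)
visits <- data.frame(
  panelist_id = "p1",
  url = c("a.com/x", "a.com/x", "b.com/y", "a.com/x", "a.com/x"),
  timestamp = as.POSIXct("2024-01-01 12:00:00", tz = "UTC") + c(0, 1, 10, 20, 30),
  stringsAsFactors = FALSE
)

within_secs <- 1  # assumed to play the role of the `within` argument (seconds)

# URL of the previous visit and seconds elapsed since it (NA for the first row)
prev_url <- c(NA, head(visits$url, -1))
gap <- c(NA, diff(as.numeric(visits$timestamp)))

# Flag a visit when it repeats the previous URL within `within_secs` seconds
# (using <= here is an assumption about how the boundary is treated)
visits$duplicate <- !is.na(prev_url) & visits$url == prev_url & gap <= within_secs

# Dropping then amounts to keeping only the unflagged rows
visits_dropped <- visits[!visits$duplicate, ]
```

In this sketch only the second visit is flagged: the last visit repeats the previous URL, but 10 seconds later, which is outside the time frame.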
Usage
deduplicate(
wt,
method = "aggregate",
within = 1,
duration_var = "duration",
keep_nvisits = FALSE,
same_day = TRUE,
add_grpvars = NULL
)
Arguments
wt |
webtrack data object. |
method |
character. One of "aggregate", "flag" or "drop". Defaults to "aggregate". |
within |
numeric (seconds). If |
duration_var |
character. Name of duration variable. Defaults to "duration". |
keep_nvisits |
boolean. If method set to |
same_day |
boolean. If method set to |
add_grpvars |
vector. If method set to |
Value
webtrack data.frame with the same columns as wt with updated duration
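As a rough illustration of what an "updated duration" could look like after aggregation, the base R sketch below collapses runs of consecutive visits to the same URL and sums their durations. Summing is an assumption, as are the toy data and the `visits` count column; the sketch does not reproduce the package's internals or its grouping by panelist and day.

```r
# Toy data: visits in chronological order with precomputed durations (hypothetical)
visits <- data.frame(
  url = c("a.com/x", "a.com/x", "b.com/y", "a.com/x"),
  duration = c(5, 10, 30, 2),
  stringsAsFactors = FALSE
)

# Label runs of consecutive identical URLs: 1, 1, 2, 3
runs <- rle(visits$url)
run_id <- rep(seq_along(runs$lengths), times = runs$lengths)

# One row per run: the run's URL, its summed duration, and its visit count
aggregated <- data.frame(
  url = runs$values,
  duration = as.numeric(tapply(visits$duration, run_id, sum)),
  visits = runs$lengths,  # hypothetical column, loosely analogous to keep_nvisits = TRUE
  stringsAsFactors = FALSE
)
```

Note that the final visit to "a.com/x" stays a separate row because it is not consecutive with the first two.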
Examples
## Not run:
data("testdt_tracking")
wt <- as.wt_dt(testdt_tracking)
wt <- add_duration(wt, cutoff = 300, replace_by = 300)
# Dropping duplicates with one-second default
wt_dedup <- deduplicate(wt, method = "drop")
# Flagging duplicates with one-second default
wt_dedup <- deduplicate(wt, method = "flag")
# Aggregating duplicates
wt_dedup <- deduplicate(wt[1:1000], method = "aggregate")
# Aggregating duplicates and keeping number of visits for aggregated visits
wt_dedup <- deduplicate(wt[1:1000], method = "aggregate", keep_nvisits = TRUE)
# Aggregating duplicates and keeping "domain" variable despite grouping
wt <- extract_domain(wt)
wt_dedup <- deduplicate(wt, method = "aggregate", add_grpvars = "domain")
## End(Not run)