slide {DataCombine}    R Documentation

A function for creating lag and lead variables, including for time-series cross-sectional data.

Description

The function slides a column up or down to create lag or lead variables. If GroupVar is specified, Var is slid separately within each group, which is important for time-series cross-sectional data. The slid data is placed in a new variable in the original data frame. Note: your data needs to be sorted by date in ascending order (i.e. increasing as it moves down the rows). The time difference between rows should also be constant, e.g. days, months, or years.
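To illustrate why grouping matters, here is a minimal base-R sketch of a one-period lag computed within groups. It does not call slide and is not the package's implementation; the data frame panel and the helper lag_one are hypothetical names introduced only for this illustration.

```r
# Toy panel: two units observed over three consecutive years, already
# sorted by unit and then by ascending year.
panel <- data.frame(
  unit = c("a", "a", "a", "b", "b", "b"),
  year = c(2001, 2002, 2003, 2001, 2002, 2003),
  x    = c(10, 11, 12, 20, 21, 22)
)

# lag_one() is a hypothetical helper, not part of DataCombine:
# shift a vector down one position and pad the top with NA.
lag_one <- function(v) c(NA, v[-length(v)])

# ave() applies lag_one() separately within each unit, so the first
# row of unit "b" gets NA rather than the last value of unit "a".
panel$x_lag <- ave(panel$x, panel$unit, FUN = lag_one)

print(panel)
```

Without the grouping step, the first observation of unit "b" would wrongly inherit the last value of unit "a".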

Usage

slide(data, Var, TimeVar, GroupVar, NewVar, slideBy = -1,
  keepInvalid = FALSE, reminder = TRUE)

Arguments

data

a data frame object.

Var

a character string naming the variable you would like to slide (create lag or lead).

TimeVar

optional character string naming the time variable. If specified then the data is ordered by GroupVar-TimeVar before sliding.

GroupVar

a character string naming the variable grouping the units within which Var will be slid. If GroupVar is missing then the whole variable is slid up or down. This is similar to shift, though shift returns the slid data to a new vector rather than the original data frame.

NewVar

a character string specifying the name for the new variable to place the slid data in.

slideBy

numeric value specifying how many rows (time units) to shift the data by. Negative values slide the data down (lag the data). Positive values slide the data up (lead the data).
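The sign convention can be sketched on a plain vector. The helper shift_by below is hypothetical and written in base R purely for illustration; it mimics the direction of the shift described above and should not be read as the code slide runs internally.

```r
# shift_by() is a hypothetical base-R helper written for illustration;
# it is not DataCombine's implementation.
shift_by <- function(v, by) {
  n <- length(v)
  k <- abs(by)
  if (k == 0) return(v)
  if (k >= n) return(rep(NA, n))        # nothing valid left after the shift
  if (by < 0) {
    c(rep(NA, k), v[seq_len(n - k)])    # lag: values move down the rows
  } else {
    c(v[(k + 1):n], rep(NA, k))         # lead: values move up the rows
  }
}

x <- c(1, 2, 3, 4, 5)
shift_by(x, -1)  # lag by one row:   NA  1  2  3  4
shift_by(x, 2)   # lead by two rows:  3  4  5 NA NA
```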

keepInvalid

logical. Whether or not to keep observations for groups for which no valid lag/lead can be created due to an insufficient number of time period observations. If TRUE then these groups are returned to the bottom of the data frame and NA is given for their new lag/lead variable value.
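As a rough sketch of which groups might be affected, the base-R snippet below counts observations per group. The cutoff used here is an assumption made for illustration (a group is treated as too short when it has no more rows than the size of the shift); the exact rule slide applies is not stated in this page. The names slide_by, n_per_group and too_short are hypothetical.

```r
# Four groups; group 4 has a single observation.
ID <- c(rep(1:3, each = 5), 4)
slide_by <- -2

# Assumption for illustration: a group is "too short" when it has no more
# rows than the size of the shift, so every shifted value would be NA.
n_per_group <- table(ID)
too_short <- names(n_per_group)[as.vector(n_per_group) <= abs(slide_by)]

too_short  # "4"
```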

reminder

logical. Whether or not to remind you to order your data by the GroupVar and time variable before running slide, plus other messages.
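Ordering a data frame by a grouping variable and then by time can be done in base R with order before any sliding takes place. The sketch below uses hypothetical column names (unit, year, x) and made-up values.

```r
# Rows arrive in arbitrary order.
unsorted <- data.frame(
  unit = c("b", "a", "b", "a"),
  year = c(2002, 2002, 2001, 2001),
  x    = c(21, 11, 20, 10)
)

# Sort by group first, then by ascending time within each group.
sorted <- unsorted[order(unsorted$unit, unsorted$year), ]
rownames(sorted) <- NULL  # reset row names after reordering

sorted
```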

Details

slide is a function for creating lag and lead variables, including for time-series cross-sectional data.

Value

a data frame

Source

Partially based on TszKin Julian's shift function: http://ctszkin.com/2012/03/11/generating-a-laglead-variables/

See Also

shift, dplyr

Examples

library(DataCombine)

# Create dummy data: four ID groups with five observations each
A <- B <- C <- sample(1:20, size = 20, replace = TRUE)
ID <- rep(1:4, each = 5)
Data <- data.frame(ID, A, B, C)

# Lead the variable by two time units
DataSlid1 <- slide(Data, Var = 'A', NewVar = 'ALead', slideBy = 2)

# Lag the variable one time unit by ID group
DataSlid2 <- slide(data = Data, Var = 'B', GroupVar = 'ID',
                NewVar = 'BLag', slideBy = -1)

# Lag the variable two time units by ID group, keeping groups with invalid lags
# (after subsetting, ID 4 has only one observation)
Data <- Data[1:16, ]

DataSlid3 <- slide(data = Data, Var = 'B', GroupVar = 'ID',
                 NewVar = 'BLag', slideBy = -2, keepInvalid = TRUE)


[Package DataCombine version 0.2.21 Index]