seqblock {blockTools} | R Documentation |

## Sequential assignment of unit(s) into experimental conditions using covariates

### Description

Sequentially assign units to experimental conditions. Blocking begins by creating a measure of multivariate distance between a *current* unit and one or more *prior*, already-assigned units. Then the average distance between the current unit and each treatment condition is calculated, and random assignment is biased toward the conditions more dissimilar to the current unit. Argument values can be specified either as arguments to the function or via a query. The query asks the user to identify the blocking variables and to input, one by one, each unit's variable values.
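The idea of biasing assignment toward dissimilar conditions can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: the data, the use of an absolute difference on a single covariate as the distance, and the rule that the most dissimilar condition is `k` times as likely as each other condition are all assumed here for the example.

```r
# Illustrative sketch only; not the blockTools implementation.
set.seed(1)  # arbitrary seed so the random draw is reproducible

# Previously assigned units (made-up data): one covariate and a condition.
prior <- data.frame(
  age = c(25, 31, 47, 52),
  tr  = c("Treatment 1", "Treatment 2", "Treatment 1", "Treatment 2")
)
new_age <- 50  # covariate value of the current unit

# Distance from the current unit to every prior unit. An absolute
# difference stands in for a multivariate distance here.
d <- abs(prior$age - new_age)

# Average distance between the current unit and each condition.
# mean(x, trim = 0.1) would give a trimmed mean instead.
avg_dist <- sapply(split(d, prior$tr), mean)

# Assumed biasing rule: the most dissimilar condition is k times as
# likely as each of the others.
k <- 2
w <- ifelse(avg_dist == max(avg_dist), k, 1)
p <- w / sum(w)

# Draw the current unit's condition with the biased probabilities.
assignment <- sample(names(p), size = 1, prob = p)
```

Favoring the condition whose members least resemble the current unit tends to even out the covariate distributions across conditions as units arrive.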

### Usage

```
seqblock(object = NULL, id.vars, id.vals, exact.vars = NULL, exact.vals = NULL,
exact.restr = NULL, exact.alg = "single", covar.vars = NULL, covar.vals = NULL,
covar.restr = NULL, covars.ord = NULL, n.tr = 2, tr.names = NULL, assg.prob = NULL,
seed = NULL, seed.dist, assg.prob.stat = NULL, trim = NULL, assg.prob.method = NULL,
assg.prob.kfac = NULL, distance = NULL, file.name = NULL, query = FALSE,
verbose = TRUE, ...)
```

### Arguments

`object` |
a character string giving the file name of a |

`id.vars` |
a string or vector of strings specifying the name of the identifying variable(s); if |

`id.vals` |
a vector of ID values for every unit being assigned to a treatment group; these correspond to the |

`exact.vars` |
a string or vector of strings containing the names of each of the exact blocking variables. |

`exact.vals` |
a vector containing the unit's values on each of the exact blocking variables. |

`exact.restr` |
a list object containing the restricted values that the exact blocking variables can take on. Thus the first element of |

`exact.alg` |
a string specifying the blocking algorithm. Currently the only acceptable value is |

`covar.vars` |
a string or vector of strings containing the names of each of the non-exact blocking variables. |

`covar.vals` |
a vector containing the unit's values on each of the non-exact blocking variables. |

`covar.restr` |
a list object containing the restricted values that the non-exact blocking variables can take on. Thus the first element of |

`covars.ord` |
a string or vector of strings containing the names of the non-exact blocking variables, ordered so that the highest-priority covariate comes first, followed by the second-highest-priority covariate, then the third, etc. |

`n.tr` |
the number of treatment groups. If not specified, this defaults to |

`tr.names` |
a string or vector of strings containing the names of the different treatment groups. |

`assg.prob` |
a numeric vector containing the probabilities that a unit will be assigned to the treatment groups; this vector should sum to 1. |

`seed` |
an optional integer value for the random seed, which is used when assigning units to treatment groups. |

`seed.dist` |
an optional integer value for the random seed set in |

`assg.prob.stat` |
a string specifying which assignment probability summary statistic to use; valid values are |

`trim` |
a numeric value specifying what proportion of the observations are to be dropped from each tail when the assignment probability summary statistic ( |

`assg.prob.method` |
a string specifying which algorithm should be used when assigning treatment probabilities. Acceptable values are |

`assg.prob.kfac` |
a numeric value for |

`distance` |
a string specifying how the multivariate distance used for blocking covariates is calculated. If not specified, this defaults to |

`file.name` |
a string containing the name of the file that one would like the output to be written to. Ideally this file name should have the extension .RData. |

`query` |
a logical stating whether the console should ask the user questions to input the data and assign a treatment condition. If not specified, this defaults to |

`verbose` |
a logical stating whether the function should print the name of the output file, the current working directory, the treatment group that the most recent unit was assigned to, and the dataframe |

`...` |
additional arguments. |

### Details

The `seqblock` function's code is divided into two main parts: the first handles the case in which the unit being assigned is the first unit in a given study to receive an assignment; the second handles subsequent units, which are assigned after at least one assignment has already been made. If the `object` argument is left as `NULL`, the function runs the first part; if `object` is specified, the second part is executed. When `object = NULL`, the researcher has no past file from which to pull variable names and past data; this corresponds to the case in which the unit being assigned is the first one. If the researcher does specify `object`, the user is drawing data from a past file, which means this is not the first unit in the study to be assigned to a treatment.

However, the function can be called for subsequent units even when `object` is not specified. When `query = TRUE`, the console asks the researcher whether this is the first unit to be assigned in the study and, based on the response, decides which part of the code to run.

If the `object` and `file.name` arguments are set to the same value, then `seqblock` overwrites the specified file with a new file, which now contains both the previously-assigned units and the newly-assigned unit. To create a new file when a new unit is assigned, use a new `file.name`.
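The file handling described above can be mimicked in base R. The helper `append_unit()` below is hypothetical and not part of blockTools; it performs no blocking and only shows how a `NULL` `object` starts a fresh record, while a non-`NULL` `object` loads the past file and, when `file.name` matches, overwrites it with the extended record.

```r
# Hypothetical helper mimicking the file handling only; it is not part
# of blockTools and does not assign treatments.
append_unit <- function(new_row, object = NULL, file.name) {
  if (is.null(object)) {
    # First unit: there is no past file, so start a fresh record.
    bdata <- list(x = new_row)
  } else {
    # Subsequent unit: load() restores `bdata` from the past file.
    load(object)
    bdata$x <- rbind(bdata$x, new_row)
  }
  # Writing to the same path as `object` overwrites the past file.
  save(bdata, file = file.name)
  invisible(bdata)
}

f <- tempfile(fileext = ".RData")
append_unit(data.frame(ID = 1, age = 25), file.name = f)
append_unit(data.frame(ID = 2, age = 30), object = f, file.name = f)

# The single file now holds both units.
load(f)
nrow(bdata$x)
```

Passing a different `file.name` in the second call would leave the first file untouched and write the two-unit record to the new path.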

The `single` algorithm (see `exact.alg` in the Arguments section above) creates a variable that has a unique level for every possible combination of the exact variables. As an example, say that there are 3 exact blocking variables: *party* (Democrat, Republican); *region* (North, South); and *education* (HS, NHS). The `single` algorithm creates one level for units with the values Democrat-North-HS, another level for Democrat-North-NHS, a third level for Republican-North-HS, and so forth, until every possible combination of these 3 variables has its own level. Thus, if there are `k` exact blocking variables and the `i`-th exact blocking variable can take on `q_{i}` values, then a total of `\prod_{i=1}^{k} q_{i}` levels are created.
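The level count in the party/region/education example can be checked with a few lines of base R. This sketch only enumerates the combinations; the object names (`exact`, `combos`) are chosen for illustration and it does not reproduce how `seqblock` stores the levels internally.

```r
# The three exact blocking variables from the example and their values.
exact <- list(
  party     = c("Democrat", "Republican"),
  region    = c("North", "South"),
  education = c("HS", "NHS")
)

# One row per possible combination of the exact variables.
combos <- expand.grid(exact, stringsAsFactors = FALSE)

# A single label per combination, e.g. "Democrat-North-HS".
combos$level <- paste(combos$party, combos$region, combos$education, sep = "-")

# Number of levels: the product of the number of values per variable.
n_levels <- prod(lengths(exact))
n_levels == nrow(combos)
```

With two values for each of three variables, the product is 2 x 2 x 2 = 8 levels.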

The `distance = "mcd"` and `distance = "mve"` options call `cov.rob` to calculate measures of multivariate spread robust to outliers. The `distance = "mcd"` option calculates the Minimum Covariance Determinant estimate (Rousseeuw 1985); the `distance = "mve"` option calculates the Minimum Volume Ellipsoid estimate (Rousseeuw and van Zomeren 1990). When `distance = "mcd"`, the interquartile range on blocking variables should not be zero. The `distance = "euclidean"` option calculates the Euclidean distance between the new unit and the previously-assigned units. The default `distance = "mahalanobis"` option calculates the Mahalanobis distance.
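As a rough illustration of the Euclidean and Mahalanobis options, the distances between a new unit and previously-assigned units can be computed in base R. The covariate values are made up, and estimating the covariance matrix from the prior units is an assumption of this sketch rather than a statement about `seqblock`'s internals. For the robust options, a covariance estimate from `cov.rob` in the MASS package would take the place of `cov(prior)`.

```r
# Covariates of previously assigned units (made-up values).
prior <- cbind(age    = c(25, 31, 47, 52, 38),
               income = c(30, 42, 55, 61, 47))
new_unit <- c(age = 40, income = 50)

# Euclidean distance from the new unit to each prior unit.
d_euclid <- sqrt(rowSums(sweep(prior, 2, new_unit)^2))

# Mahalanobis distance, using the sample covariance of the prior units
# (an assumption of this sketch). stats::mahalanobis() returns squared
# distances, hence the sqrt().
d_mahal <- sqrt(mahalanobis(prior, center = new_unit, cov = cov(prior)))
```

Unlike the Euclidean distance, the Mahalanobis distance rescales each covariate by its spread and accounts for correlation between covariates, so variables measured on large scales do not dominate.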

### Value

A list (called `bdata`) with elements

`x` |
a dataframe containing the names and values for the different ID and blocking variables, as well as each unit's initial treatment assignment. |

`nid` |
a string or vector of strings containing the name(s) of the ID variable(s). |

`nex` |
a string or vector of strings containing the name(s) of the exact blocking variable(s). |

`ncv` |
a string or vector of strings containing the name(s) of the non-exact blocking variable(s). |

`rex` |
a list of the restricted values of the exact blocking variables. |

`rcv` |
a list of the restricted values of the non-exact blocking variables. |

`ocv` |
a vector of the order of the non-exact blocking variables. |

`trn` |
a string or vector of strings containing the name(s) of the different treatment groups. |

`apstat` |
a string specifying the assignment probability summary statistic that was used. |

`mtrim` |
a numeric value specifying the proportion of observations to be dropped when the assignment probability statistic takes on the value |

`apmeth` |
a string specifying the assignment probability algorithm that was used. |

`kfac` |
the assignment probability |

`assgpr` |
a vector of assignment probabilities to each treatment group. |

`distance` |
a string specifying how the multivariate distance used for blocking was calculated. |

`trd` |
a list with the length equal to the number of previously assigned treatment conditions; each object in the list contains a vector of the distance between each unit in one treatment group and the new unit. This will be |

`tr.sort` |
a string vector of treatment conditions, sorted from the largest to the smallest. Set to |

`p` |
a vector of assignment probabilities to each treatment group used in assigning a treatment condition to the new unit. |

`trcount` |
a table containing the counts for each experimental/treatment condition. |

`datetime` |
the date and time at which each unit was assigned their treatment group. |

`orig` |
a dataframe containing the names and values for the different ID and blocking variables, as well as each unit's treatment assignment. |

### Author(s)

Ryan T. Moore rtm@wustl.edu, Tommy Carroll tcarroll22@wustl.edu, Jonathan Homola homola@wustl.edu and Jeong Hyun Kim jeonghyun.kim@wustl.edu

### References

Moore, Ryan T. and Sally A. Moore. 2013. "Blocking for Sequential Political Experiments." *Political Analysis* 21(4):507-523.

Moore, Ryan T. 2012. "Multivariate Continuous Blocking to Improve Political Science Experiments." *Political Analysis* 20(4):460-479.

Rousseeuw, Peter J. 1985. "Multivariate Estimation with High Breakdown Point." *Mathematical Statistics and Applications* 8:283-297.

Rousseeuw, Peter J. and Bert C. van Zomeren. 1990. "Unmasking Multivariate Outliers and Leverage Points." *Journal of the American Statistical Association* 85(411):633-639.

### See Also

### Examples

```
## The calls below are commented out because they write to "sdata.RData"
## in the working directory.

## Assign the first unit (a 25-year-old member of the Republican Party)
## to a treatment group and save the results in "sdata.RData":
# seqblock(query = FALSE, id.vars = "ID", id.vals = 001, exact.vars = "party",
#          exact.vals = "Republican", covar.vars = "age", covar.vals = 25,
#          file.name = "sdata.RData")

## Assign the next unit (age 30, Democratic Party):
# seqblock(query = FALSE, object = "sdata.RData", id.vals = 002,
#          exact.vals = "Democrat", covar.vars = "age", covar.vals = 30,
#          file.name = "sdata.RData")
```

Package *blockTools* version 0.6.4