sigint {sigInt} | R Documentation |
Estimating the parameters of the canonical discrete crisis bargaining game.
Description
This function fits the Lewis and Schultz (2003) model to data using either
the pseudo-likelihood (PL) or nested pseudo-likelihood (NPL) method from
Crisman-Cox and Gibilisco (2019). Throughout, we refer to the data as
containing D games, where each game is observed one or more times.
Usage
sigint(
formulas,
data,
subset,
na.action,
fixed.par = list(),
method = c("npl", "pl"),
npl.maxit = 25,
npl.tol = 1e-07,
npl.trace = FALSE,
start.beta,
maxlik.method = "NR",
phat,
phat.formulas,
pl.vcov = FALSE,
phat.vcov,
seed = 12345,
maxlik.options = list()
)
Arguments
formulas |
a |
data |
a data frame containing the variables used to fit the model.
Each row of the data frame describes an individual game
|
subset |
an optional logical expression to specify a subset of observations to be used in fitting the model. |
na.action |
how to deal with missing data ( |
fixed.par |
a list with up to seven (7) named elements for normalizing payoffs to non-zero values.
Names must match a payoff name as listed in "Details."
Each named element should contain a single number that is the fixed (not estimated) value of that payoff.
For example, to fix each side's victory-without-fighting payoff to 1
use |
method |
whether to use the nested-pseudo-likelihood ( |
npl.maxit |
maximum number of outer-loop iterations to be used when fitting the NPL. See "Details" for more information. |
npl.tol |
Convergence criterion for the NPL. When the estimates change by less than this amount, convergence is considered successful. |
npl.trace |
logical. Should the NPL's progress be printed to screen? |
start.beta |
starting values for the model coefficients as a single vector. If missing, random values are drawn from a normal distribution with mean zero and standard deviation 0.05. |
maxlik.method |
method used by |
phat |
a list containing two vectors: |
phat.formulas |
if |
pl.vcov |
number of bootstrap iterations to generate |
phat.vcov |
a covariance matrix for the estimates |
seed |
integer. Used to set the seed for the random forest and for drawing the starting values. The PL can be sensitive to starting values, so this makes results reproducible. The NPL is less sensitive, but we always recommend checking the first-order conditions. |
maxlik.options |
a list of options to be passed to
|
Details
The model corresponds to an extensive-form, discrete-crisis-bargaining game from Lewis and Schultz (2003):
.       A
.      / \
.     /   \
.    /     \
.  S_A      B
.   0      / \
.         /   \
.        /     \
.      V_A      A
.      C_B     / \
.             /   \
.            /     \
.       W_A + e_A  a + e_a
.       W_B + e_B  V_B
If A chooses not to challenge B, then the game ends at the leftmost node (SQ)
and payoffs are S_A and 0 to players A and B, respectively. If A challenges B,
B can concede or resist. If B concedes, the game ends at CD with payoffs V_A
and C_B. However, if B resists, A must decide whether to stand firm or back
down. If A stands firm, the game ends at SF with payoffs W_A + \epsilon_A and
W_B + \epsilon_B. Finally, if A backs down in the face of B's resistance, then
the game ends at the rightmost node BD, with payoffs a + \epsilon_a and V_B.
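As a rough illustration of how the tree maps choices into the four outcomes, the sketch below computes outcome probabilities from three choice probabilities. The names pC, pR, and pF and their values are hypothetical and chosen only for this example; pF is read here as the probability that A stands firm given that A challenged.

```r
# Hypothetical choice probabilities for a single game (illustrative values only)
pC <- 0.40  # Pr(A challenges)
pR <- 0.50  # Pr(B resists | A challenges)
pF <- 0.70  # Pr(A stands firm | A challenged)

# Each terminal node is reached by multiplying along its branch of the tree
outcome_probs <- c(
  SQ = 1 - pC,             # A does not challenge
  CD = pC * (1 - pR),      # A challenges, B concedes
  SF = pC * pR * pF,       # A challenges, B resists, A stands firm
  BD = pC * pR * (1 - pF)  # A challenges, B resists, A backs down
)

print(outcome_probs)
```

The four probabilities sum to one by construction, which is a quick sanity check on any set of choice probabilities.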
The seven right-hand formulas that are specified in the formulas argument
correspond to the regressors to be placed in S_A, V_A, C_B, W_A, W_B, a, and
V_B, respectively. The model is unidentified if any regressor (including a
constant term) is included in all of a player's formulas (Lewis and Schultz
2003). Often the easiest way to meet this requirement is to set one formula
per player to 0. When an identification problem is detected, an error is
issued. For example, the syntax for the formulas argument could be:
formulas = sq + cd + sf + bd ~ x1 + 0 | x2 | x2 | x1 + x2 | x1 | 1 | 0
Where:
- sq + cd + sf + bd are the tallies of how many times each outcome is observed for each observation. When the game is only observed once, that observation will be a 1 and three 0s. When the game is observed multiple times, these variables should count the number of times each outcome is observed. They need to be in the order of SQ, CD, SF, BD.
- S_A is a function of the variable x1 and no constant term.
- V_A is a function of the variable x2 and a constant term.
- C_B is a function of the variable x2 and a constant term.
- W_A is a function of the variables x1, x2 and a constant term.
- W_B is a function of the variable x1 and a constant term.
- a is a constant term.
- V_B is fixed to 0 (or a non-zero value set by fixed.par).
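The identification condition can be checked by hand before fitting. The following is a minimal sketch, not the package's internal check: it lists the regressors of the example specification as character vectors (the list names and the helper shared_regressors are hypothetical) and looks for any regressor that appears in every one of a player's payoffs.

```r
# Regressors per payoff, mirroring the example formula above.
# "(Intercept)" marks a constant term; V_B is fixed, so it has no regressors.
rhs <- list(
  SA = c("x1"),
  VA = c("(Intercept)", "x2"),
  CB = c("(Intercept)", "x2"),
  WA = c("(Intercept)", "x1", "x2"),
  WB = c("(Intercept)", "x1"),
  a  = c("(Intercept)"),
  VB = character(0)
)

# Which payoffs belong to which player
players <- list(A = c("SA", "VA", "WA", "a"), B = c("CB", "WB", "VB"))

# For each player, return the regressors shared by all of that player's payoffs
shared_regressors <- function(rhs, players) {
  lapply(players, function(p) Reduce(intersect, rhs[p]))
}

shared <- shared_regressors(rhs, players)
identified <- all(lengths(shared) == 0)
print(identified)
```

An empty intersection for both players means the specification passes this particular check; a non-empty one names the regressors that would need to be dropped from at least one payoff.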
Each row of the data frame should be a summary of the covariates and outcomes associated with that particular game.
When each game is observed only once, then this will resemble an ordinary dyad-time data frame.
However, if there are multiple observations per game, then each row should be a summary of all the data associated
with that game.
For example, if there are D games in the data, where each is observed T_d times, then the data frame
should have D rows.
The four columns making up the dependent variable will denote the frequencies of each outcome for game d,
such that sq_d + cd_d + sf_d + bd_d = T_d.
The covariates in row d should be summary statistics for the exogenous variables (e.g., mean, median, mode, first observation).
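The layout described above can be built from observation-level data with base R. This is a minimal sketch on made-up data: the data frame obs, its column names, and the choice of the mean as the covariate summary are all assumptions for illustration.

```r
# Hypothetical observation-level data: one row per play of a game
obs <- data.frame(
  game    = c(1, 1, 1, 2, 2, 3),
  outcome = factor(c("sq", "cd", "sq", "sf", "bd", "sq"),
                   levels = c("sq", "cd", "sf", "bd")),
  x1      = c(0.2, 0.4, 0.6, 1.0, 2.0, 3.0)
)

# Tally outcomes per game; the factor levels fix the SQ, CD, SF, BD column order
tallies <- as.data.frame.matrix(table(obs$game, obs$outcome))
tallies$game <- as.numeric(rownames(tallies))

# Summarize the covariate within each game (here: the mean)
covs <- aggregate(x1 ~ game, data = obs, FUN = mean)

# One row per game: outcome frequencies plus summarized covariates
game_data <- merge(tallies, covs, by = "game")
print(game_data)
```

Each row's four tallies add up to the number of times that game was observed, matching the constraint on T_d.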
The model is first fit using a pseudo-likelihood estimator. This approach
requires first stage estimation of the probability that B
resists and
the probability that A
fights conditional on B
choosing to
resist. These first stage estimates should be flexible and we recommend
that users fit a flexible semi-parametric or non-parametric model to
produce them. If these estimates are produced by the analyst prior to using
this function, then they can be supplied as a list to the phat argument.
This list should contain two named elements:
- PRhat is the probability that B resists. This should be a vector of probabilities with one estimated probability for each observation.
- PFhat is the probability that A stands firm conditional on B resisting. This should be a vector of probabilities with one estimated probability for each observation.
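For readers building their own first stage, the sketch below shows one way the phat list might be assembled. It uses simulated data and a plain logit from glm as a simple stand-in for the flexible model recommended above; the variable names and the choice to fit each model only on the games where the relevant decision is observed are assumptions of this example, not a description of what sigint does internally.

```r
set.seed(1)
n <- 200

# Hypothetical data: each game observed once, with one outcome per row
dat <- data.frame(
  x1      = rnorm(n),
  x2      = rnorm(n),
  outcome = sample(c("sq", "cd", "sf", "bd"), n, replace = TRUE)
)

# B's choice is only observed when A challenges (any outcome other than SQ)
challenged <- dat$outcome != "sq"
dat$resist <- as.integer(dat$outcome %in% c("sf", "bd"))

# A's final choice is only observed when B resists (SF or BD)
resisted <- dat$outcome %in% c("sf", "bd")
dat$firm <- as.integer(dat$outcome == "sf")

# Stand-in first-stage models (a flexible learner would replace these logits)
m_R <- glm(resist ~ x1 + x2, family = binomial, data = dat[challenged, ])
m_F <- glm(firm ~ x1 + x2, family = binomial, data = dat[resisted, ])

# Predict for every row so each vector has one probability per observation
phat <- list(
  PRhat = as.numeric(predict(m_R, newdata = dat, type = "response")),
  PFhat = as.numeric(predict(m_F, newdata = dat, type = "response"))
)
str(phat)
```

The resulting list has the two named elements expected by the phat argument, each with one entry per row of the data.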
If the user leaves the phat argument empty, then these first-stage
estimates are produced internally using the randomForest function.
Users wanting to use the random forest can supply a formula for it using
the argument phat.formulas.
This argument can take a formula with nothing on the left-hand side and
1-2 right-hand sides.
If two right-hand sides are provided, then the first is used to generate
PRhat, and the second is used for PFhat.
If only one right-hand side is provided, it is used for both. Some examples:
- phat.formulas = ~ x1 + x2: predict PRhat and PFhat using x1 and x2.
- phat.formulas = ~ x1 + x2 | x1 + x2: predict PRhat and PFhat using x1 and x2.
- phat.formulas = ~ x1 + x2 | x1: predict PRhat using x1 and x2, but predict PFhat using only x1.
If both phat and phat.formulas are missing, then a random forest is fit using all the
exogenous variables listed in the formulas argument.
If method = "npl", then estimation continues.
For each iteration of the NPL, the estimates of PRhat and PFhat are updated
by one best-response iteration using the current parameter estimates.
The model is then refit using these updated choice probabilities.
This process continues until the maximum absolute change in
parameters and choice probabilities is less than npl.tol (default, 1e-7), or
the number of outer iterations exceeds npl.maxit (default, 25).
In the latter case, a warning is produced.
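The stopping rule for the outer loop can be sketched generically. In the example below, refit and update are toy stand-ins (simple linear maps) for the re-estimation and best-response steps; they are hypothetical and are not the estimator or best-response function used by sigint. Only the loop structure, the tolerance check, and the warning mirror the description above.

```r
# Toy stand-ins (hypothetical), chosen so the iteration has a known fixed point
refit  <- function(p) 0.5 * p + 1       # "re-estimate parameters given probabilities"
update <- function(theta) 0.25 * theta  # "one best-response update given parameters"

npl_loop <- function(p0, tol = 1e-7, maxit = 25) {
  p <- p0
  theta <- refit(p)  # initial fit from the first-stage probabilities
  for (iter in seq_len(maxit)) {
    p_new     <- update(theta)
    theta_new <- refit(p_new)
    # Largest absolute change across parameters and choice probabilities
    change <- max(abs(c(theta_new - theta, p_new - p)))
    theta <- theta_new
    p <- p_new
    if (change < tol) {
      return(list(theta = theta, p = p, iter = iter, converged = TRUE))
    }
  }
  warning("maximum number of outer iterations reached without convergence")
  list(theta = theta, p = p, iter = maxit, converged = FALSE)
}

res <- npl_loop(p0 = 0.9)
print(res)
```

With these toy maps the loop settles near theta = 8/7 and p = 2/7 in well under 25 iterations; lowering maxit far enough triggers the warning branch instead.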
If pseudo-likelihood (method = "pl") is used, then pl.vcov is checked.
There are four possibilities here:
- pl.vcov = FALSE (default), then no covariance matrix or standard errors are returned, only the point estimates.
- pl.vcov > 0 and phat.vcov is supplied, then phat.vcov is used to estimate the PL's covariance matrix.
- pl.vcov > 0, phat.vcov is missing, and phat is missing, then the random forest used to estimate PRhat and PFhat is bootstrapped (simple, nonparametric bootstrap) pl.vcov times.
- pl.vcov > 0, phat.vcov is missing, and phat is not missing, then an error is returned.
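The four cases can be summarized as a small decision function. This is only an illustration of the logic described above; the function pl_vcov_action and its arguments has.phat and has.phat.vcov are hypothetical and do not exist in the package.

```r
# Hypothetical helper that reports which of the four cases applies
pl_vcov_action <- function(pl.vcov = FALSE, has.phat = FALSE, has.phat.vcov = FALSE) {
  # Case 1: no covariance matrix requested
  if (isFALSE(pl.vcov) || pl.vcov <= 0) {
    return("point estimates only")
  }
  # Case 2: a first-stage covariance matrix was supplied
  if (has.phat.vcov) {
    return("use supplied phat.vcov")
  }
  # Case 3: first stage fit internally, so it can be bootstrapped
  if (!has.phat) {
    return(sprintf("bootstrap the random forest %d times", as.integer(pl.vcov)))
  }
  # Case 4: user-supplied phat without its covariance matrix
  stop("phat was supplied without phat.vcov")
}

pl_vcov_action()
pl_vcov_action(pl.vcov = 25)
pl_vcov_action(pl.vcov = TRUE, has.phat = TRUE, has.phat.vcov = TRUE)
```

The last case is the one to watch for in practice: supplying custom first-stage estimates and asking for standard errors requires a matching phat.vcov.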
Value
An object of class sigfit, containing:
coefficients
A vector of estimated model parameters.
vcov
Estimated variance-covariance matrix. When pl.vcov = FALSE, this slot is omitted.
utilities
Each actor's utilities at the estimated values.
fixed.par
The fixed utilities if specified in the call.
logLik
Final log-likelihood value of the model.
gradient
First derivative values at the estimated parameters.
Phat
List of two elements:
- PRhat: The first-stage estimates of the probability that B resists (if method = "pl") or the final estimates of the probability that B resists (if method = "npl").
- PFhat: The first-stage estimates of the probability that A stands firm given that A challenged (if method = "pl") or the final estimates of the probability that A stands firm given that A challenged (if method = "npl").
Note that PRhat will only be an equilibrium if method = "npl" and the NPL converges.
user.phat
Logical. Did the user provide phat?
start.beta
The vector of starting values used in the PL optimization.
call
The call used to produce the object.
model
The data frame used to fit the model.
method
The method ("pl" or "npl") used to fit the model.
maxlik.method
The optimization method used by maxLik to fit the model.
maxlik.code
The convergence code returned by maxLik.
maxlik.message
The convergence message returned by maxLik.
Additionally, when method = "npl", the following are also included in the sigfit object:
npl.iter
Number of best response iterations used in fitting the NPL.
npl.eval
Maximum difference between the parameters at the last two NPL iterations. If the NPL method converged, this should be less than the npl.tol specified in the function call.
eq.constraint
Maximum equilibrium constraint violation.
References
Casey Crisman-Cox and Michael Gibilisco. 2019. "Estimating Signaling Games in International Relations: Problems and Solutions." Political Science Research and Methods. Online First.
Jeffrey B. Lewis and Kenneth A. Schultz. 2003. "Revealing Preferences: Empirical Estimation of a Crisis Bargaining Game with Incomplete Information." Political Analysis 11:345–367.
Examples
data("sanctionsData")
f1 <- sq+cd+sf+bd ~ sqrt(senderecondep) + senderdemocracy + contig + ally -1|#SA
anticipatedsendercosts|#VA
sqrt(targetecondep) + anticipatedtargetcosts + contig + ally|#CB
sqrt(senderecondep) + senderdemocracy + lncaprat | #barWA
targetdemocracy + lncaprat| #barWB
senderdemocracy| #bara
-1#VB
## Using Nested-Pseudo Likelihood with default first stage
## Not run:
fit1 <- sigint(f1, data=sanctionsData, npl.trace=TRUE)
summary(fit1)
## End(Not run)
## Using Pseudo Likelihood with user supplied first stage
Phat <- list(PRhat=sanctionsData$PRnpl, PFhat=sanctionsData$PFnpl)
fit2 <- sigint(f1, data=sanctionsData, method="pl", phat=Phat)
summary(fit2)
## Using Pseudo Likelihood with user made first stage and user covariance
## SIGMA is a bootstrapped first-stage covariance matrix (not provided)
## Not run:
fit3 <- sigint(f1, data=sanctionsData, method="pl", phat=Phat, phat.vcov=SIGMA, pl.vcov=TRUE)
summary(fit3)
## End(Not run)
## Using Pseudo Likelihood with default first stage and
## bootstrapped standard errors for the first stage covariance
## Not run:
fit4 <- sigint(f1, data=sanctionsData, method="pl", pl.vcov=25)
summary(fit4)
## End(Not run)