endoSwitch {endoSwitch}    R Documentation

Endogenous Switching Regression Models


This is the main interface of the endoSwitch package for estimating endogenous switching regression models (Heckman, 1979).


endoSwitch(
  data,
  OutcomeDep,
  SelectDep,
  OutcomeCov,
  SelectCov,
  Weight = NA,
  treatEffect = TRUE,
  method = "BFGS",
  start = NULL,
  verbose = FALSE,
  ...
)



data
a data frame. Data for running the regression analysis.


OutcomeDep
character. Dependent variable in the outcome equation.


SelectDep
character. Dependent variable in the selection equation. The variable must be binary (0 or 1).


OutcomeCov
character vector. Covariates in the outcome equation.


SelectCov
character vector. Covariates in the selection equation.


Weight
optional character. Name of the weight variable in the dataset, or NA (equal weight).


treatEffect
TRUE/FALSE. If TRUE, average treatment effects will be calculated and returned. If FALSE, expected outcome values will be calculated and returned.


method
character. Maximization method to be used. The default is "BFGS" (for Broyden-Fletcher-Goldfarb-Shanno). Other methods can also be used. See maxLik.


start
optional numeric vector. Initial parameter values for the maximization. If NULL, the coefficient estimates from the two-stage estimation will be used.


verbose
TRUE/FALSE. Whether to show the status of the optimization.


...
Other parameters to be passed to the selected maximization routine. See maxLik.


This function estimates the endogenous switching regression model by full maximum likelihood. In this model, a selection equation sorts observation units into two regimes (e.g., treated and untreated, or adopters and non-adopters), and two outcome equations, one per regime, determine the outcome. Estimation relies on joint normality of the error terms in the three-equation system (the selection equation plus the two outcome equations). The model is estimated by maximizing the joint likelihood function provided in Lokshin and Sajaia (2004).
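The three-equation structure described above can be illustrated by simulating data from it. The following is a minimal base-R sketch, not package code: the coefficient values, the variable names (x, z, treated, y), and the zero covariance between the two outcome errors are all assumptions chosen for illustration.

```r
set.seed(1)
n <- 500

# Assumed distributional parameters (illustrative values only)
sigma0 <- 1.0   # error s.d., outcome equation for the untreated
sigma1 <- 1.5   # error s.d., outcome equation for the treated
rho0   <- 0.3   # correlation of selection error with untreated outcome error
rho1   <- -0.4  # correlation of selection error with treated outcome error

# Covariance of (selection error, untreated error, treated error).
# The selection error variance is normalised to 1; the covariance between
# the two outcome errors is set to 0 here as an assumption.
Sigma <- matrix(c(1,             rho0 * sigma0, rho1 * sigma1,
                  rho0 * sigma0, sigma0^2,      0,
                  rho1 * sigma1, 0,             sigma1^2),
                nrow = 3, byrow = TRUE)

# Jointly normal errors: independent draws times the Cholesky factor of Sigma
errors <- matrix(rnorm(n * 3), ncol = 3) %*% chol(Sigma)

x <- rnorm(n)  # covariate in both selection and outcome equations
z <- rnorm(n)  # covariate in the selection equation only

# Selection equation sorts units into the two regimes
treated <- as.integer(0.2 + 0.5 * x + 0.8 * z + errors[, 1] > 0)

# One outcome equation per regime; only the outcome of the realised regime is observed
y0 <- 1 + 0.5 * x + errors[, 2]
y1 <- 2 + 1.0 * x + errors[, 3]
y  <- ifelse(treated == 1, y1, y0)

sim <- data.frame(y = y, treated = treated, x = x, z = z)
head(sim)
```

Because the selection error is correlated with both outcome errors, a regression of y on x within either regime would be biased, which is the problem the joint likelihood addresses.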

The endoSwitch function uses the maxLik function in the maxLik package to do the optimization. It automatically searches for starting values using the results from a two-stage estimation following Maddala (1986, chapter 8). Though not recommended, users may provide starting values manually. Assume that there are M variables (including the constant) in the selection equation and N variables (including the constant) in each outcome equation. Then (M + 2*N + 4) starting values are needed. The first M values are for the variables in the selection equation (with the constant last), followed by N values for the outcome equation for the non-treated individuals (SelectDep = 0) and another N values for the outcome equation for the treated individuals (SelectDep = 1). The last four values are, in order: sigma in the outcome equation for the non-treated, sigma in the outcome equation for the treated, rho in the outcome equation for the non-treated, and rho in the outcome equation for the treated.
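The layout of a manually supplied starting vector can be sketched in base R. This is only an illustration of the ordering and length described above, using the covariates from the package example; the labels (select.*, outcome0.*, outcome1.*, sigma0, and so on) are hypothetical names introduced here, and the NA entries are placeholders to be filled in, not usable starting values.

```r
# Covariates as in the package example; each equation also has a constant
SelectCov  <- c("Age", "Perception")
OutcomeCov <- c("Age")

M <- length(SelectCov) + 1   # selection equation, including the constant
N <- length(OutcomeCov) + 1  # each outcome equation, including the constant

# Hypothetical labels, in the order given in the description above
start_names <- c(
  paste0("select.", c(SelectCov, "const")),  # M values, constant last
  paste0("outcome0.", seq_len(N)),           # N values, non-treated (SelectDep = 0)
  paste0("outcome1.", seq_len(N)),           # N values, treated (SelectDep = 1)
  "sigma0", "sigma1", "rho0", "rho1"         # four distributional parameters
)

# Placeholder vector showing the required length and ordering
start <- setNames(rep(NA_real_, M + 2 * N + 4), start_names)
length(start)  # 11
```

With two selection covariates and one outcome covariate, M = 3 and N = 2, so 3 + 2*2 + 4 = 11 values are required.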

If treatEffect = TRUE, the endoSwitch function will report average treatment effects (for the treated and the untreated) as well as heterogeneity effects. A detailed description of these effects is provided in Di Falco, Veronesi, and Yesuf (2011, p. 837). If treatEffect = FALSE, the endoSwitch function will report expected outcome values in a list of two data frames: data frame EYA1 reports actual (column EY1.A1) and counterfactual (column EY0.A1) expected outcome values for the treated; data frame EYA0 reports actual (column EY0.A0) and counterfactual (column EY1.A0) expected outcome values for the untreated.
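To show how the expected outcome values relate to the average treatment effects, the sketch below builds two toy data frames with the column names given above and averages the actual-minus-counterfactual differences. The numbers are made up for illustration and are not package output; the effect definitions follow the usual actual-versus-counterfactual comparison and are stated here as an assumption about how the reported effects are formed.

```r
# Toy expected outcomes for the treated (values invented for illustration)
EYA1 <- data.frame(EY1.A1 = c(10, 12, 14),  # actual: treated, under treatment
                   EY0.A1 = c(8, 9, 13))    # counterfactual: treated, had they not been treated

# Toy expected outcomes for the untreated (values invented for illustration)
EYA0 <- data.frame(EY0.A0 = c(5, 6, 7),     # actual: untreated, without treatment
                   EY1.A0 = c(6, 8, 7))     # counterfactual: untreated, had they been treated

# Average treatment effect on the treated: treated state minus untreated state
ATT <- mean(EYA1$EY1.A1 - EYA1$EY0.A1)

# Average treatment effect on the untreated: treated state minus untreated state
ATU <- mean(EYA0$EY1.A0 - EYA0$EY0.A0)

c(ATT = ATT, ATU = ATU)  # ATT = 2, ATU = 1
```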


A list containing three elements. The first element is an object of class "maxLik", which includes the parameters in the selection equation, the parameters in the outcome equations, and the transformed distributional parameters (parameters are transformed to facilitate maximization, as recommended by Lokshin and Sajaia (2004)). The second element contains the estimates of the original distributional parameters (transformed back via the delta method). The third element contains a table reporting average treatment effects or a list of expected outcome values, depending on the user's choice of treatEffect.
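The back-transformation via the delta method can be sketched as follows. This assumes the common parameterisation in which sigma is estimated as log(sigma) and rho as atanh(rho); whether endoSwitch uses exactly these transformations is an assumption here, and the input numbers are hypothetical rather than package output.

```r
# Hypothetical transformed estimates and their standard errors
ln_sigma     <- 0.25;  se_ln_sigma  <- 0.05
atanh_rho    <- -0.40; se_atanh_rho <- 0.10

# Point estimates on the original scale
sigma <- exp(ln_sigma)
rho   <- tanh(atanh_rho)

# Delta method: se(g(theta)) is approximately |g'(theta)| * se(theta)
se_sigma <- exp(ln_sigma) * se_ln_sigma            # d/dt exp(t)  = exp(t)
se_rho   <- (1 - tanh(atanh_rho)^2) * se_atanh_rho # d/dt tanh(t) = 1 - tanh(t)^2

data.frame(parameter = c("sigma", "rho"),
           estimate  = c(sigma, rho),
           std.error = c(se_sigma, se_rho))
```

Working on the transformed scale keeps sigma positive and rho inside (-1, 1) during optimization without imposing explicit constraints.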


Lokshin, Michael, and Roger B. Newson. “Impact of Interventions on Discrete Outcomes: Maximum Likelihood Estimation of the Binary Choice Models with Binary Endogenous Regressors.” Stata Journal 11, no. 3 (2011): 368–85.

Heckman, James J. “Sample Selection Bias as a Specification Error.” Econometrica 47, no. 1 (1979): 153–61. https://doi.org/10.2307/1912352.

Maddala, G. S. “Limited-Dependent and Qualitative Variables in Econometrics.” Cambridge Books. Cambridge University Press, 1986.

Di Falco, Salvatore, Marcella Veronesi, and Mahmud Yesuf. “Does Adaptation to Climate Change Provide Food Security? A Micro-Perspective from Ethiopia.” American Journal of Agricultural Economics 93, no. 3 (2011): 829–46. https://doi.org/10.1093/ajae/aar006.

Abdulai, Abdul Nafeo. “Impact of Conservation Agriculture Technology on Household Welfare in Zambia.” Agricultural Economics 47, no. 6 (2016): 729–41. https://doi.org/10.1111/agec.12269.


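library(endoSwitch)

data(ImpactData) # Data are from Abdulai (2016)
OutcomeDep <- 'Output'                 # dependent variable in the outcome equations
SelectDep <- 'CA'                      # binary (0/1) dependent variable in the selection equation
OutcomeCov <- c('Age')                 # covariates in the outcome equations
SelectCov <- c('Age', 'Perception')    # covariates in the selection equation
endoReg <- endoSwitch(ImpactData, OutcomeDep, SelectDep, OutcomeCov, SelectCov)

summary(endoReg) # Summarize the regression results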

[Package endoSwitch version 1.0.0 Index]