On the instrument functional form with a binary endogenous explanatory variable
Introduction
In empirical applications of two-stage least squares (TSLS), the first stage is typically specified as linear in a single instrument and the other exogenous variables. When the structural error term is assumed to satisfy a zero conditional mean assumption, a linear first stage may neglect additional valid instruments, such as higher-order terms of the exogenous variables. Antoine and Lavergne (2020) also construct examples illustrating how an inadequate functional form in the first-stage equations may artificially create a weak identification problem. In this study, I evaluate the case where linear instruments in the first stage fail to exploit the functional link between a binary endogenous explanatory variable (EEV) and the excluded exogenous variables, resulting in a potentially avoidable weak instruments problem.
An alternative approach is based on the two-step IV method proposed in Wooldridge (2010): in the first step, I estimate a binary response model by maximum likelihood; in the second step, I use the obtained fitted probability as an instrument in the IV estimation. In the context of a binary EEV, when the correct specification of its conditional mean and homoskedasticity of the structural error term are assumed, the fitted probability from maximum likelihood estimation (MLE) is the exact feasible optimal instrument.
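The optimality claim can be written out as a short sketch of the standard optimal-instrument argument. All notation below (the symbols for the outcome, the EEV, the exogenous variables, the parameters, and the link function G) is assumed for illustration and may differ from the paper's own; G stands for the correctly specified response probability, for example the standard normal CDF in a probit model.

```latex
% Assumed notation (not taken from the paper):
%   y_i outcome, x_i binary EEV, z_{1i} included exogenous variables (with constant),
%   z_{2i} excluded instruments, z_i = (z_{1i}, z_{2i}), all written as row vectors.
\begin{align*}
y_i &= \beta x_i + z_{1i}\delta + u_i, \qquad
      E(u_i \mid z_i) = 0, \quad \operatorname{Var}(u_i \mid z_i) = \sigma^2, \\
h^{*}(z_i) &= E(x_i \mid z_i) = P(x_i = 1 \mid z_i) = G(z_i\gamma), \\
\hat h_i &= G(z_i\hat\gamma), \qquad
\hat\theta_{IV} = \Big(\sum_{i=1}^{n} \hat q_i' w_i\Big)^{-1} \sum_{i=1}^{n} \hat q_i' y_i,
\end{align*}
% where w_i = (x_i, z_{1i}) collects the regressors, \hat q_i = (\hat h_i, z_{1i})
% collects the instruments, and \hat\gamma is the first-step MLE.
```

Under these assumptions the optimal instrument for the EEV is its conditional mean given all exogenous variables, which for a binary variable is the response probability; replacing the unknown parameter with its MLE gives the feasible version.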
Further, I show that when the partial correlation between the excluded exogenous variables and the binary EEV is weak, the IV estimator obtained using a nonlinear fitted probability outperforms the linear TSLS estimator. Specifically, the alternative instrument can yield a consistent and more efficient IV estimator when the included exogenous variables are strongly correlated with the binary EEV.
Econometric framework and assumptions
I study a linear regression model with a binary EEV and weak linear instruments. The econometrician observes a sample containing a continuous outcome variable, a single binary EEV, a vector of exogenous variables included in Eq. (1), and a vector of instruments excluded from Eq. (1). The parameter of interest is the coefficient on the EEV.
Eq. (2) is a first-stage linear projection defined for any
Asymptotic theory
The estimation procedure is the same as Procedure 21.1 in Wooldridge (2010). In the first step, a binary response model for the EEV is estimated by quasi-maximum likelihood (QMLE), and the fitted probability is computed at the estimated first-step parameters. In the second step, the fitted probability is used as an instrument for the EEV in IV estimation. As noted by Wooldridge (2010), this procedure has a desirable robustness property: the first-step binary response model does not need to be correctly specified.
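The two steps can be sketched in code as follows. This is a minimal illustration, not the paper's implementation: it assumes a probit first step fitted by Fisher scoring, a just-identified second step, and a hypothetical data-generating process chosen only to exercise the functions. The names `fit_probit`, `two_step_iv`, and all simulated variables are illustrative.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def norm_cdf(t):
    """Standard normal CDF, applied elementwise."""
    return 0.5 * (1.0 + _erf(np.asarray(t, dtype=float) / math.sqrt(2.0)))

def norm_pdf(t):
    """Standard normal density, applied elementwise."""
    return np.exp(-0.5 * np.asarray(t, dtype=float) ** 2) / math.sqrt(2.0 * math.pi)

def fit_probit(W, d, iters=50, tol=1e-8):
    """Probit estimates of binary d on W, computed by Fisher scoring."""
    g = np.zeros(W.shape[1])
    for _ in range(iters):
        idx = W @ g
        p = np.clip(norm_cdf(idx), 1e-10, 1.0 - 1e-10)
        f = norm_pdf(idx)
        w = f / (p * (1.0 - p))
        score = W.T @ (w * (d - p))
        info = (W * (w * f)[:, None]).T @ W   # expected information matrix
        step = np.linalg.solve(info, score)
        g = g + step
        if np.max(np.abs(step)) < tol:
            break
    return g

def two_step_iv(y, x, Z1, Z2):
    """Step 1: probit of x on all exogenous variables.
    Step 2: IV with the fitted probability as the instrument for x.
    Returns coefficients ordered as (x, constant, Z1 columns)."""
    n = len(y)
    const = np.ones((n, 1))
    W = np.hstack([const, Z1, Z2])                # all exogenous variables
    p_hat = norm_cdf(W @ fit_probit(W, x))        # fitted probability
    X = np.hstack([x[:, None], const, Z1])        # structural regressors
    H = np.hstack([p_hat[:, None], const, Z1])    # instruments
    return np.linalg.solve(H.T @ X, H.T @ y)      # just-identified IV

# Hypothetical design (an assumption, not the paper's): true coefficient on x is 2.
rng = np.random.default_rng(0)
n = 20_000
z1 = rng.standard_normal(n)
z2 = rng.standard_normal(n)
v = rng.standard_normal(n)
u = 0.5 * v + math.sqrt(0.75) * rng.standard_normal(n)  # corr(u, v) = 0.5
x = (z1 + z2 + v > 0).astype(float)                      # binary EEV
y = 1.0 + 2.0 * x + z1 + u
theta_hat = two_step_iv(y, x, z1[:, None], z2[:, None])
print(theta_hat)
```

Because the fitted probability enters only as an instrument and not as a regressor, the second step is an ordinary IV estimation with the original binary variable kept in the structural equation.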
Let and
Simulation results
I compare the performance of the IV estimators obtained using various fitted probabilities with that of the linear IV estimator in finite samples. In the main design, and are standard normal and independent of each other. and are standard normal with a correlation of 0.5. The simulations are done with 1,000 replications and a sample size of 500.
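A comparison of this kind can be sketched as a small Monte Carlo experiment. The design below is an assumption made for illustration and is not the paper's: the index coefficients `pi1` and `pi2`, the true coefficient `beta`, and the assignment of the normal draws to the instruments and errors are all hypothetical. Only the 1,000 replications, the sample size of 500, and the error correlation of 0.5 follow the text.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def norm_cdf(t):
    """Standard normal CDF, applied elementwise."""
    return 0.5 * (1.0 + _erf(t / math.sqrt(2.0)))

def fit_probit(W, d, iters=50, tol=1e-8):
    """Probit estimates of binary d on W, computed by Fisher scoring."""
    g = np.zeros(W.shape[1])
    for _ in range(iters):
        idx = W @ g
        p = np.clip(norm_cdf(idx), 1e-10, 1.0 - 1e-10)
        f = np.exp(-0.5 * idx ** 2) / math.sqrt(2.0 * math.pi)
        w = f / (p * (1.0 - p))
        step = np.linalg.solve((W * (w * f)[:, None]).T @ W, W.T @ (w * (d - p)))
        g = g + step
        if np.max(np.abs(step)) < tol:
            break
    return g

def iv(y, X, H):
    """Just-identified IV estimator: solves H'X b = H'y."""
    return np.linalg.solve(H.T @ X, H.T @ y)

def simulate(reps=1000, n=500, beta=1.0, pi1=2.0, pi2=0.1, rho=0.5, seed=0):
    """Estimation errors for OLS, linear TSLS, and probit-based IV (columns 0-2)."""
    rng = np.random.default_rng(seed)
    out = np.empty((reps, 3))
    one = np.ones(n)
    for r in range(reps):
        z1, z2, v, e = rng.standard_normal((4, n))
        u = rho * v + math.sqrt(1.0 - rho ** 2) * e       # corr(u, v) = rho
        x = (pi1 * z1 + pi2 * z2 + v > 0).astype(float)   # weak excluded instrument z2
        y = beta * x + z1 + u
        X = np.column_stack([x, one, z1])
        W = np.column_stack([one, z1, z2])
        p_hat = norm_cdf(W @ fit_probit(W, x))
        out[r, 0] = iv(y, X, X)[0]                                    # OLS
        out[r, 1] = iv(y, X, np.column_stack([z2, one, z1]))[0]       # linear TSLS
        out[r, 2] = iv(y, X, np.column_stack([p_hat, one, z1]))[0]    # probit-based IV
    return out - beta

errors = simulate()
med = np.median(errors, axis=0)
q75, q25 = np.percentile(errors, [75, 25], axis=0)
iqr = q75 - q25
for name, m, s in zip(["OLS", "linear TSLS", "probit IV"], med, iqr):
    print(f"{name:12s} median bias {m: .3f}  bias/OLS bias {m / med[0]: .3f}  IQR {s:.3f}")
```

Medians and interquartile ranges are used in the summary because a just-identified IV estimator with a weak instrument has heavy tails, which makes sample means and variances unreliable across replications.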
I focus on the ratio of the bias of the IV estimator relative to the bias of the OLS estimator and the coverage rate
Application
I apply the two-step IV method to Dinkelman (2011), who evaluates the effects of rural electrification on employment growth in South Africa. The dependent variables are the community-level growth of female and male employment rates from 1996 to 2001. Electricity project placement is the binary EEV and the average land gradient is the instrument.
Table 2 mimics Tables 4 and 5 in Dinkelman (2011) with columns (1) and (6) reproducing the main results. In columns (2) and (7), fitted probit
Conclusion
In this study, I show that a more appropriate first-stage parameterization can help with the weak instruments problem when the EEV is binary. A similar argument can be generalized to other limited EEVs, such as count variables and fractional responses, where nonlinear instruments can be constructed.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
I am grateful to Jeffrey Wooldridge, Todd Elder, Kyoo il Kim, Jinyong Hahn, and Tetsuya Kaji for their helpful comments and suggestions. An earlier version of this paper has been circulated under the title “Weak Instruments with a Binary Endogenous Explanatory Variable”. This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
References
- et al., 2016. A simple diagnostic to investigate instrument validity and heterogeneous effects when using a single instrument. Lab. Econ.
- Antoine, B., Lavergne, P., 2020. Identification-robust nonparametric inference in a linear IV model, Unpublished...
- Dinkelman, T., 2011. The effects of rural electrification on employment: new evidence from South Africa. Amer. Econ. Rev.
- et al., 2012. Testing under weak identification with conditional moment restrictions. Econom. Theory.