--------------------------------------------------------------------------------------------------
      name:  <unnamed>
       log:  c:\Users\ccameron\Dropbox\Desktop\TEACHING\190\Causal Methods Slides\iv.txt
  log type:  text
 opened on:   7 Oct 2023, 12:02:26

.
. ********** OVERVIEW OF iv.do **********
.
. * This STATA program is an example of instrumental variables
.
. * Program by A. Colin Cameron Dept. of Economics Univ. of California - Davis
. * Used for ECN 190 Research with Economics Data
.
. * To run you need file
. *   AED_RETURNSTOSCHOOLING.DTA
. * in your directory
.
. * And you need user-written Stata command
. *   rivtest
.
. * The dataset is used in chapter 17.4 of
. * A. Colin Cameron Analysis of Economics Data: An Introduction to Econometrics
. * https://cameron.econ.ucdavis.edu/
. * It is studied in more detail in A. Colin Cameron and Pravin K. Trivedi (2022)
. * Microeconometrics using Stata, Volume 2, chapter 25.7
.
. * Original source is Jeffrey R. Kling (2001)
. * "Interpreting Instrumental Variables Estimates of the Returns to Schooling,"
. * Journal of Business and Economic Statistics, 19, 358-364.
. * and
. * David E. Card (1995) "Using Geographic Variation in College Proximity
. * to Estimate the Return to Schooling,"
. * in Aspects of Labor Market Behavior: Essays in Honor of John Vanderkamp,
. * L.N. Christofides, E.K. Grant and R. Swidinsky (Eds.).
.
. ********** SETUP **********
.
. clear all

. set scheme s1mono   /* Graphics scheme */

.
. ************ INSTRUMENTAL VARIABLES ESTIMATION
.
. clear

. use AED_RETURNSTOSCHOOLING.DTA
(Data for A. Colin Cameron (2022), ANALYSIS OF ECONOMIC DATA, Amazon)

.
. * Describe and summarize dependent variable, key regressor and instrument
. desc wage76 grade76 col4 age76

Variable      Storage   Display    Value
    name         type    format    label      Variable label
--------------------------------------------------------------------------------------------------
wage76          float   %9.0g                 '76 Wage
grade76         float   %9.0g                 '76 Grade level
col4            float   %9.0g                 If any 4-year college nearby (r0004000!=4)
age76           float   %9.0g                 '76 age (age66 +10)

. sum wage76 grade76 col4 age76

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
      wage76 |      3,010    1.656664     .443798          0     3.1797
     grade76 |      3,010    13.26346    2.676913          1         18
        col4 |      3,010    .6820598    .4657535          0          1
       age76 |      3,010     28.1196    3.137004         24         34

. * Correlation
. correlate grade76 col4
(obs=3,010)

             |  grade76     col4
-------------+------------------
     grade76 |   1.0000
        col4 |   0.1442   1.0000

.
. * OLS and IV estimates
. reg wage76 grade76 age76, vce(robust)

Linear regression                               Number of obs     =      3,010
                                                F(2, 3007)        =     321.34
                                                Prob > F          =     0.0000
                                                R-squared         =     0.1813
                                                Root MSE          =     .40169

------------------------------------------------------------------------------
             |               Robust
      wage76 | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     grade76 |    .052511   .0027826    18.87   0.000     .0470551    .0579669
       age76 |   .0406575    .002395    16.98   0.000     .0359616    .0453534
       _cons |  -.1830865   .0772766    -2.37   0.018     -.334607   -.0315661
------------------------------------------------------------------------------

. estimates store OLS

. ivregress 2sls wage76 age76 (grade76 = col4), vce(robust)

Instrumental variables 2SLS regression            Number of obs   =      3,010
                                                  Wald chi2(2)    =     236.58
                                                  Prob > chi2     =     0.0000
                                                  R-squared       =          .
                                                  Root MSE        =     .51659

------------------------------------------------------------------------------
             |               Robust
      wage76 | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
     grade76 |   .1739731   .0242359     7.18   0.000     .1264717    .2214745
       age76 |   .0415634    .003019    13.77   0.000     .0356463    .0474805
       _cons |  -1.819567    .334451    -5.44   0.000    -2.475079   -1.164055
------------------------------------------------------------------------------
Endogenous: grade76
Exogenous:  age76 col4

. estimates store IV

. estimates table OLS IV, b(%8.4f) se t(%8.2f) stats(N r2)

------------------------------------
    Variable |   OLS         IV
-------------+----------------------
     grade76 |   0.0525     0.1740
             |   0.0028     0.0242
             |    18.87       7.18
       age76 |   0.0407     0.0416
             |   0.0024     0.0030
             |    16.98      13.77
       _cons |  -0.1831    -1.8196
             |   0.0773     0.3345
             |    -2.37      -5.44
-------------+----------------------
           N |     3010       3010
          r2 |   0.1813          .
------------------------------------
                      Legend: b/se/t

.
. /* Estimating the IV estimator in two stages
> * Gives the same coefficients but not the correct standard errors
> * First step OLS endogenous variable on instrument and any other exogenous
> * And get the prediction
> * Second step OLS of original model with prediction replacing the endogenous variable
> */
. * (1) First stage OLS regression and get prediction
. quietly regress grade76 col4 age76, vce(robust)

. predict grade76hat
(option xb assumed; fitted values)

. * (2) OLS with predicted value replacing the endogenous variable
. regress wage76 grade76hat age76 , vce(robust)

Linear regression                               Number of obs     =      3,010
                                                F(2, 3007)        =     176.90
                                                Prob > F          =     0.0000
                                                R-squared         =     0.1041
                                                Root MSE          =     .42021

------------------------------------------------------------------------------
             |               Robust
      wage76 | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
  grade76hat |   .1739731   .0200107     8.69   0.000     .1347371    .2132091
       age76 |   .0415634   .0024988    16.63   0.000     .0366638     .046463
       _cons |  -1.819567   .2743573    -6.63   0.000    -2.357514    -1.28162
------------------------------------------------------------------------------

.
. * Repeat analysis with more controls
. global exogregressors black south76 smsa76 reg2-reg9 ///
>         smsa66 momdad14 sinmom14 nodaded nomomed daded momed famed1-famed8

. desc wage76 grade76 exp76 expsq76 col4 age76 agesq76 $exogregressors

Variable      Storage   Display    Value
    name         type    format    label      Variable label
--------------------------------------------------------------------------------------------------
wage76          float   %9.0g                 '76 Wage
grade76         float   %9.0g                 '76 Grade level
exp76           float   %9.0g                 '76 experience, (10 + age66) - grade76 - 6)
expsq76         float   %9.0g                 '76 experience, exp76 ^2/100
col4            float   %9.0g                 If any 4-year college nearby (r0004000!=4)
age76           float   %9.0g                 '76 age (age66 +10)
agesq76         float   %9.0g                 '76 age squared (age76^2)
black           float   %9.0g                 Race (r0002300) n= 5225 mean= 1.296 min= 1 max=3
south76         float   %9.0g                 If lived in South in 1976 (r0437511=1)
smsa76          float   %9.0g                 If lived in SMSA in 1976 (r0437515=1,2)
reg2            float   %9.0g                 If lived in Region 2 (region= MidAtl)
reg3            float   %9.0g                 If lived in Region 3 (region= ENC)
reg4            float   %9.0g                 If lived in Region 4 (region= WNC)
reg5            float   %9.0g                 If lived in Region 5 (region= SA )
reg6            float   %9.0g                 If lived in Region 6 (region= ESC)
reg7            float   %9.0g                 If lived in Region 7 (region= WSC)
reg8            float   %9.0g                 If lived in Region 8 (region= M )
reg9            float   %9.0g                 If lived in Region 9 (region= P )
smsa66          float   %9.0g                 If lived in SMSA in 1966 (r0002455=1,2)
momdad14        float   %9.0g                 If lived with both parents at age 14
sinmom14        float   %9.0g                 If lived with mother only at age 14
nodaded         float   %9.0g                 If father has no formal education
nomomed         float   %9.0g                 If mother has no formal education
daded           float   %9.0g                 Mean grade level of father
momed           float   %9.0g                 Mean grade level of mother
famed1          float   %9.0g                 If mgrade> 12 & fgrade> 12 (famed=1)
famed2          float   %9.0g                 If mgrade>=12 & fgrade>=12 (famed=2)
famed3          float   %9.0g                 If mgrade==12 & fgrade==12 (famed=3)
famed4          float   %9.0g                 If mgrade>=12 & fgrade==-1 (famed=4)
famed5          float   %9.0g                 If fgrade>=12 (famed=5)
famed6          float   %9.0g                 If mgrade>=12 & fgrade> -1 (famed=6)
famed7          float   %9.0g                 If mgrade>=9 & fgrade>=9 (famed=7)
famed8          float   %9.0g                 If mgrade> -1 & fgrade> -1 (famed=8)

. sum wage76 grade76 exp76 expsq76 col4 age76 agesq76 $exogregressors

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
      wage76 |      3,010    1.656664     .443798          0     3.1797
     grade76 |      3,010    13.26346    2.676913          1         18
       exp76 |      3,010    8.856146    4.141672          0         23
     expsq76 |      3,010    .9557907    .8461831          0       5.29
        col4 |      3,010    .6820598    .4657535          0          1
-------------+---------------------------------------------------------
       age76 |      3,010     28.1196    3.137004         24         34
     agesq76 |      3,010    800.5495    180.7484        576       1156
       black |      3,010    .2335548    .4231624          0          1
     south76 |      3,010    .4036545    .4907113          0          1
      smsa76 |      3,010    .7129568    .4524571          0          1
-------------+---------------------------------------------------------
        reg2 |      3,010    .1607973     .367405          0          1
        reg3 |      3,010    .1956811      .39679          0          1
        reg4 |      3,010    .0641196    .2450066          0          1
        reg5 |      3,010    .2083056     .406164          0          1
        reg6 |      3,010    .0960133    .2946584          0          1
-------------+---------------------------------------------------------
        reg7 |      3,010    .1099668    .3129003          0          1
        reg8 |      3,010    .0282392     .165683          0          1
        reg9 |      3,010    .0903654    .2867522          0          1
      smsa66 |      3,010    .6495017    .4772053          0          1
    momdad14 |      3,010    .7893688    .4078247          0          1
-------------+---------------------------------------------------------
    sinmom14 |      3,010    .1006645    .3009339          0          1
     nodaded |      3,010    .2292359    .4204111          0          1
     nomomed |      3,010    .1172757     .321802          0          1
       daded |      3,010    9.988262    3.266511          0         18
       momed |      3,010    10.33675    2.987507          0         18
-------------+---------------------------------------------------------
      famed1 |      3,010    .0614618    .2402153          0          1
      famed2 |      3,010    .0787375    .2693734          0          1
      famed3 |      3,010    .1249169    .3306796          0          1
      famed4 |      3,010    .0475083    .2127588          0          1
      famed5 |      3,010    .0790698    .2698925          0          1
-------------+---------------------------------------------------------
      famed6 |      3,010    .1328904    .3395126          0          1
      famed7 |      3,010    .0504983    .2190073          0          1
      famed8 |      3,010    .2202658    .4144947          0          1

. reg wage76 grade76 $exogregressors, vce(robust)

Linear regression                               Number of obs     =      3,010
                                                F(27, 2982)       =      34.63
                                                Prob > F          =     0.0000
                                                R-squared         =     0.2192
                                                Root MSE          =     .39393

------------------------------------------------------------------------------
             |               Robust
      wage76 | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     grade76 |   .0348294   .0032075    10.86   0.000     .0285402    .0411186
       black |  -.2085796   .0204382   -10.21   0.000    -.2486541   -.1685051
     south76 |  -.1475648   .0296327    -4.98   0.000    -.2056674   -.0894622
      smsa76 |   .1273469   .0206334     6.17   0.000     .0868897    .1678041
        reg2 |   .0718245   .0380278     1.89   0.059    -.0027389    .1463878
        reg3 |   .1212616   .0364779     3.32   0.001     .0497372    .1927861
        reg4 |   .0350835   .0433668     0.81   0.419    -.0499484    .1201155
        reg5 |   .1059933   .0455072     2.33   0.020     .0167645     .195222
        reg6 |   .1072257   .0476346     2.25   0.024     .0138256    .2006257
        reg7 |   .0992566   .0481535     2.06   0.039     .0048392     .193674
        reg8 |  -.0461341   .0534706    -0.86   0.388    -.1509771    .0587089
        reg9 |   .1180226   .0417768     2.83   0.005     .0361084    .1999369
      smsa66 |   .0441862   .0200221     2.21   0.027     .0049277    .0834447
    momdad14 |   .1103225   .0267528     4.12   0.000     .0578667    .1627783
    sinmom14 |   .0215818   .0373176     0.58   0.563    -.0515891    .0947526
     nodaded |  -.0152982   .0522934    -0.29   0.770     -.117833    .0872365
     nomomed |   .0157279   .0352884     0.45   0.656    -.0534641    .0849199
       daded |  -.0032475   .0046861    -0.69   0.488    -.0124357    .0059407
       momed |   .0087595   .0043838     2.00   0.046     .0001639    .0173551
      famed1 |  -.1790964   .0833246    -2.15   0.032     -.342476   -.0157168
      famed2 |  -.1162254    .072579    -1.60   0.109    -.2585354    .0260847
      famed3 |  -.1401684   .0666914    -2.10   0.036    -.2709343   -.0094025
      famed4 |  -.0200873   .0447863    -0.45   0.654    -.1079024    .0677278
      famed5 |  -.0752873   .0644591    -1.17   0.243     -.201676    .0511014
      famed6 |  -.1158021   .0619653    -1.87   0.062    -.2373012     .005697
      famed7 |  -.1255939   .0663979    -1.89   0.059    -.2557842    .0045965
      famed8 |  -.1019673   .0559191    -1.82   0.068    -.2116112    .0076765
       _cons |    1.03638   .0859303    12.06   0.000     .8678909    1.204868
------------------------------------------------------------------------------

. estimates store ols

. ivregress 2sls wage76 $exogregressors (grade76 = col4), vce(robust)

Instrumental variables 2SLS regression            Number of obs   =      3,010
                                                  Wald chi2(27)   =     670.82
                                                  Prob > chi2     =     0.0000
                                                  R-squared       =     0.0321
                                                  Root MSE        =     .43653

------------------------------------------------------------------------------
             |               Robust
      wage76 | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
     grade76 |   .1199825   .0530185     2.26   0.024     .0160682    .2238969
       black |  -.1735051    .030574    -5.67   0.000     -.233429   -.1135813
     south76 |  -.1353794   .0320709    -4.22   0.000    -.1982371   -.0725216
      smsa76 |   .0598065   .0483921     1.24   0.217    -.0350402    .1546533
        reg2 |   .0373795   .0469065     0.80   0.426    -.0545557    .1293146
        reg3 |   .0880322   .0454959     1.93   0.053    -.0011381    .1772025
        reg4 |  -.0141466   .0573732    -0.25   0.805     -.126596    .0983028
        reg5 |   .0933561   .0500849     1.86   0.062    -.0048085    .1915208
        reg6 |   .0977065   .0521856     1.87   0.061    -.0045753    .1999883
        reg7 |   .0643754   .0564988     1.14   0.255    -.0463603     .175111
        reg8 |  -.1057064   .0685142    -1.54   0.123    -.2399918     .028579
        reg9 |   .0659637   .0553787     1.19   0.234    -.0425766    .1745041
      smsa66 |   .0649953   .0253523     2.56   0.010     .0153058    .1146849
    momdad14 |   .0756596   .0374642     2.02   0.043      .002231    .1490881
    sinmom14 |   .0116047   .0418279     0.28   0.781    -.0703766    .0935859
     nodaded |  -.0220022   .0590792    -0.37   0.710    -.1377953     .093791
     nomomed |   .0364682   .0411234     0.89   0.375    -.0441322    .1170686
       daded |   -.016703   .0098527    -1.70   0.090    -.0360139    .0026079
       momed |  -.0061751   .0104787    -0.59   0.556    -.0267131    .0143629
      famed1 |  -.2901527   .1149989    -2.52   0.012    -.5155464   -.0647591
      famed2 |  -.2441777   .1133576    -2.15   0.031    -.4663546   -.0220009
      famed3 |  -.2158222   .0884402    -2.44   0.015    -.3891619   -.0424826
      famed4 |  -.0885927   .0645969    -1.37   0.170    -.2152004    .0380149
      famed5 |  -.1321952   .0803181    -1.65   0.100    -.2896158    .0252254
      famed6 |  -.1717061   .0776202    -2.21   0.027    -.3238389   -.0195733
      famed7 |  -.1665693    .078149    -2.13   0.033    -.3197386      -.0134
      famed8 |  -.1645073   .0733404    -2.24   0.025    -.3082519   -.0207628
       _cons |   .3310691   .4486024     0.74   0.461    -.5481755    1.210314
------------------------------------------------------------------------------
Endogenous: grade76
Exogenous:  black south76 smsa76 reg2 reg3 reg4 reg5 reg6 reg7 reg8 reg9
            smsa66 momdad14 sinmom14 nodaded nomomed daded momed famed1
            famed2 famed3 famed4 famed5 famed6 famed7 famed8 col4

. estimates store iv

. estimates table ols iv, se stats(N r2) b(%8.4f) keep(grade76)

------------------------------------
    Variable |   ols         iv
-------------+----------------------
     grade76 |   0.0348     0.1200
             |   0.0032     0.0530
-------------+----------------------
           N |     3010       3010
          r2 |   0.2192     0.0321
------------------------------------
                        Legend: b/se

.
. ********* Following material is more advanced
.
. * First-stage diagnostic for weak instruments
. correlate grade76 col4
(obs=3,010)

             |  grade76     col4
-------------+------------------
     grade76 |   1.0000
        col4 |   0.1442   1.0000

. regress grade76 col4 age76, vce(robust) noheader
------------------------------------------------------------------------------
             |               Robust
     grade76 | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        col4 |    .832565   .1067308     7.80   0.000     .6232922    1.041838
       age76 |  -.0126164   .0156219    -0.81   0.419    -.0432471    .0180142
       _cons |   13.05037   .4366304    29.89   0.000     12.19424    13.90649
------------------------------------------------------------------------------

. test col4

 ( 1)  col4 = 0

       F(  1,  3007) =   60.85
            Prob > F =    0.0000

.
. * Weak instruments inference using Anderson-Rubin Wald test
. ivregress 2sls wage76 age76 (grade76 = col4), vce(robust)

Instrumental variables 2SLS regression            Number of obs   =      3,010
                                                  Wald chi2(2)    =     236.58
                                                  Prob > chi2     =     0.0000
                                                  R-squared       =          .
                                                  Root MSE        =     .51659

------------------------------------------------------------------------------
             |               Robust
      wage76 | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
     grade76 |   .1739731   .0242359     7.18   0.000     .1264717    .2214745
       age76 |   .0415634    .003019    13.77   0.000     .0356463    .0474805
       _cons |  -1.819567    .334451    -5.44   0.000    -2.475079   -1.164055
------------------------------------------------------------------------------
Endogenous: grade76
Exogenous:  age76 col4

. rivtest, ci null(0)
Estimating confidence sets over grid points
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
..................................................    50
..................................................   100

Weak instrument robust tests and confidence sets for linear IV with robust VCE
H0: beta[wage76:grade76] = 0
--------------------------------------------------------------------------------------------------
 Test |        Statistic            p-value             95% Confidence Set
------+-------------------------------------------------------------------------------------------
   AR |    chi2(1) =    75.66    Prob > chi2 = 0.0000   [ .132709, .230591]
------+-------------------------------------------------------------------------------------------
 Wald |    chi2(1) =    51.53    Prob > chi2 = 0.0000   [ .126472, .221475]
--------------------------------------------------------------------------------------------------
Note: Wald test not robust to weak instruments.
Confidence sets estimated for 100 points in [ .07897, .268976].

.
. ********** CLOSE OUTPUT
. log close
      name:  <unnamed>
       log:  c:\Users\ccameron\Dropbox\Desktop\TEACHING\190\Causal Methods Slides\iv.txt
  log type:  text
 closed on:   7 Oct 2023, 12:02:28
--------------------------------------------------------------------------------------------------
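
* Follow-up sketch in do-file form. It was not part of the logged run, so no
* output is shown. It offers one way to check the two-stage logic by hand,
* assuming AED_RETURNSTOSCHOOLING.DTA is in the working directory.
* With one instrument (col4), one endogenous regressor (grade76), and the same
* exogenous control (age76) in both regressions, the 2SLS coefficient on grade76
* equals the reduced-form coefficient on col4 divided by the first-stage
* coefficient on col4 (indirect least squares).
* The scalar names rf and fs are illustrative choices, not taken from iv.do.
use AED_RETURNSTOSCHOOLING.DTA, clear
* Reduced form: outcome on instrument and exogenous control
quietly regress wage76 col4 age76
scalar rf = _b[col4]
* First stage: endogenous regressor on instrument and exogenous control
quietly regress grade76 col4 age76
scalar fs = _b[col4]
* The ratio should reproduce the grade76 coefficient reported by
* ivregress 2sls wage76 age76 (grade76 = col4), vce(robust)
display "Reduced form / first stage = " rf/fs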