------------------------------------------------------------------------------------------------------
       log:  c:\Imbook\bwebpage\Section2\mma05p2nls.txt
  log type:  text
 opened on:  17 May 2005, 13:53:31

. 
. ********** OVERVIEW OF MMA05P2NLS.DO **********
. 
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
. 
. * Chapter 5.9 pp.159-63
. * Nonlinear least squares
. 
. * Provides last three columns of Table 5.7 results for
. * (1) NLS    using Stata command nl (hard to get robust s.e.'s)
. * (2) FGNLS  using Stata command nl (hard to get robust s.e.'s)
. * (3) WNLS   using Stata command nl (hard to get robust s.e.'s)
. * using generated data set mma05data.asc
. 
. * Note: Stata 8 does not give robust se's for nl
. *       But ml does - see program mma05p3nlsbyml.do
. *       New Stata 9 does have a robust se option (unlike Stata 8)
. 
. * Related programs:
. *   mma05p1mle.do           OLS and MLE for the same data
. *   mma05p3nlsbyml.do       NLS using ml rather than nl
. *   mma05p4margeffects.do   Calculates marginal effects
. 
. * To run this program you need data and dictionary files
. *   mma05data.asc           ASCII data set generated by mma05p1mle.do
. 
. ********** SETUP **********
. 
. set more off
. version 8
. 
. ********** READ IN DATA and SUMMARIZE **********
. 
. * Model is  y ~ exponential(exp(a + bx))
. *           x ~ N[mux, sigx^2]
. *   f(y)   = exp(a + bx)*exp(-y*exp(a + bx))
. *   lnf(y) = (a + bx) - y*exp(a + bx)
. *   E[y]   = exp(-(a + bx))       note sign reversal for the mean
. *   V[y]   = exp(-2(a + bx)) = E[y]^2
. * Here a = 2, b = -1 and x ~ N[mux = 1, sigx^2 = 1]
. * and Table 5.7 uses N = 10,000
. 
. * Data was generated by program mma05p1mle.do
. infile y x using mma05data.asc
(10000 observations read)

. 
. * Descriptive Statistics
. describe

Contains data
  obs:        10,000
 vars:             2
 size:       120,000 (98.8% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
y               float  %9.0g
x               float  %9.0g
-------------------------------------------------------------------------------
Sorted by:
     Note:  dataset has changed since last saved

. summarize

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
           y |     10000    .6194352    1.291416   .0000445   30.60636
           x |     10000    1.014313    1.004905  -2.895741   4.994059

. 
. ********** DO THE ANALYSIS: NLS, WNLS and FGNLS **********
. 
. *** (1) NLS ESTIMATION USING STATA NL COMMAND (Nonlinear LS)
. 
. * To do this in Stata
. * (A) program define nlfcn    where fcn is the function name
. *     defines g(x_i'b) and says what the regressors x are
. * (B) nl fcn y                where fcn is the function name in (A)
. *                             and y is the dependent variable
. *     does NLS of y on fcn defined in (A)
. * (C) Heteroskedastic-consistent standard errors require extra coding
. 
. * (1A) Define g(x'b)
. * Note: Since E[y] = exp(-(a + bx)) there is sign reversal for the mean
. program define nlexpnls
  1.   version 7.0
  2.   if "`1'" == "?" {                     /* if query call ...  */
  3.     global S_1 "b1int b2x"              /* declare parameters */
  4.     global b1int=1                      /* initial values     */
  5.     global b2x=0
  6.     exit}
  7.   replace `1'=exp(-$b1int-$b2x*x)       /* calculate function */
  8. end
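
[Note: sketch only - not run in this logged session. As remarked in the overview,
 Stata 9 and later add a robust s.e. option to nl and accept substitutable
 expressions, so steps (1A)-(1C) can be collapsed into a single command along the
 following lines (the exact option syntax is an assumption and varies by release):

     nl (y = exp(-{b1int=1} - {b2x=0}*x)), vce(robust)

 The terms {b1int=1} and {b2x=0} declare the parameters and their starting values,
 playing the same role as the globals set inside nlexpnls above.]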
. 
. * (1B) Do NLS of y on the function expnls defined in (A)
. nl expnls y
(obs = 10000)

Iteration 0:  residual SS =  17308.68
Iteration 1:  residual SS =  10333.37
Iteration 2:  residual SS =  10150.66
Iteration 3:  residual SS =  10149.86
Iteration 4:  residual SS =  10149.86
Iteration 5:  residual SS =  10149.86

      Source |       SS       df       MS             Number of obs =    10000
-------------+------------------------------          F(  2,  9998) =  5103.98
       Model |  10363.0157     2  5181.50784          Prob > F      =   0.0000
    Residual |  10149.8633  9998  1.01518937          R-squared     =   0.5052
-------------+------------------------------          Adj R-squared =   0.5051
       Total |   20512.879 10000   2.0512879          Root MSE      = 1.007566
                                                       Res. dev.    = 28527.52
(expnls)
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       b1int |   1.887563   .0306819    61.52   0.000      1.82742   1.947705
         b2x |  -.9574684   .0097419   -98.28   0.000    -.9765645  -.9383724
------------------------------------------------------------------------------
(SEs, P values, CIs, and correlations are asymptotic approximations)

. estimates store bnls
. 
. * Complications now begin: getting standard errors.  Easier to use (1) !!
. 
. * (1C) Get sandwich heteroskedastic-robust standard errors for NLS
. 
. * Note that the robust option does not work for nl
. * So the wrong standard errors are given for this problem, as the errors are heteroskedastic
. 
. * Getting robust standard errors is not straightforward
. 
. * Obtain them by OLS regression of y - g(x,b) on dg/db with the robust option.
. * Explanation: OLS regress   y - g(x,b) = (dg/db)'a + v
. *   This is the NR algorithm for the update of b
. *   But a = 0 since the iterations have converged, so v = y - g(x,b)
. *   So nonrobust standard errors from this OLS regression yield
. *     V[a] = s^2 * (Sum_i (dg_i/db)(dg_i/db)')^(-1)
. *   where s^2 = Sum_i (y_i - g(x_i,b))^2 / (N-K)
. *   These are the nonrobust standard errors for NLS
. *   And the robust option gives robust standard errors from this OLS regression.
. 
. * Obtain the derivatives dg/db
. * Here g = exp(x'b) so dg/db = exp(x'b)*x = yhat*x
. quietly nl expnls y
. predict residnls, residuals
. predict yhatnls, yhat
. scalar snls = e(rmse)       /* Use in earlier code */
. gen d1 = yhatnls
. gen d2 = x*yhatnls
. * This OLS regression gives robust standard errors
. regress residnls d1 d2, noconstant robust

Regression with robust standard errors                 Number of obs =   10000
                                                       F(  2,  9998) =    0.00
                                                       Prob > F      =  1.0000
                                                       R-squared     =  0.0000
                                                       Root MSE      =  1.0076

------------------------------------------------------------------------------
             |               Robust
    residnls |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          d1 |   4.46e-07   .1420794     0.00   1.000    -.2785037   .2785046
          d2 |  -1.49e-07   .0611969    -0.00   1.000    -.1199583    .119958
------------------------------------------------------------------------------

. estimates store bnlsrobust
. 
. * Check: Do the OLS regression that gives nonrobust standard errors
. *        and verify that it gives the same results as in (1B)
. regress residnls d1 d2, noconstant

      Source |       SS       df       MS              Number of obs =   10000
-------------+------------------------------           F(  2,  9998) =    0.00
       Model |  2.6739e-10     2  1.3370e-10           Prob > F      =  1.0000
    Residual |  10149.8633  9998  1.01518937           R-squared     =  0.0000
-------------+------------------------------           Adj R-squared = -0.0002
       Total |  10149.8633 10000  1.01498633           Root MSE      =  1.0076

------------------------------------------------------------------------------
    residnls |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          d1 |   4.46e-07   .0306819     0.00   1.000    -.0601423   .0601432
          d2 |  -1.49e-07   .0097419    -0.00   1.000    -.0190961   .0190958
------------------------------------------------------------------------------

. estimates store bnlscheck
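
[Note: sketch only - not run in this logged session. The robust s.e.'s from the
 auxiliary regression above can be cross-checked by forming the sandwich
 (D'D)^-1 D'SD (D'D)^-1 directly with S = Diag(residnls^2), in the same way that
 (1D) below forms it with S = Diag(yhatnls^2). The names dsw1, dsw2, DSDw, DDw
 and vnlsw are made up for this sketch:

     gen dsw1 = residnls*d1                      /* uhat * dg/db1 */
     gen dsw2 = residnls*d2                      /* uhat * dg/db2 */
     matrix accum DSDw = dsw1 dsw2, noconstant   /* D'SD with S = Diag(uhat^2) */
     matrix accum DDw  = d1 d2, noconstant       /* D'D */
     matrix vnlsw = syminv(DDw)*DSDw*syminv(DDw)
     di sqrt(vnlsw[1,1]) " " sqrt(vnlsw[2,2])

 These should be close to the robust standard errors .1421 and .0612 reported
 above, apart from a small finite-sample degrees-of-freedom adjustment.]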
. 
. * (1D) Alternative to (1C): robust NLS standard errors that are better.
. * These are of sandwich form but use knowledge that V[u] = E[y]^2
. * which can be estimated by Vhat[u] = yhatnls^2
. * Now use this knowledge in computing S in D'SD.
. * Form DSDknown = D'SD with S = Diag(yhat^2)
. gen ds1known = yhatnls*yhatnls
. gen ds2known = x*yhatnls*yhatnls
. matrix accum DSDknown = ds1known ds2known, noconstant
(obs=10000)

. matrix accum DD2 = d1 d2, noconstant        /* DD commented above */
(obs=10000)

. * Form the robust variance matrix estimate
. matrix vnlsknown = syminv(DD2)*DSDknown*syminv(DD2)
. * Calculate the robust standard errors
. scalar seb1intnlsknown = sqrt(vnlsknown[1,1])
. scalar seb2xnlsknown = sqrt(vnlsknown[2,2])
. di "Robust standard errors of NLS estimates of b1int and b2x: "
Robust standard errors of NLS estimates of b1int and b2x: 

. di "Using knowledge that Var[u] = exp(x'b)^2 estimated by yhat"
Using knowledge that Var[u] = exp(x'b)^2 estimated by yhat

. di seb1intnlsknown " " seb2xnlsknown
.21097066 .08798113

. 
. * (1E) Calculate R-squared and log-likelihood at the NLS estimates
. * Note that Stata version 8 reports the wrong R-squared
. * as it uses TSS = Sum_i y_i^2 and not Sum_i (y_i - ybar)^2
. * lnL sums lnf(y) = ln(lamda) - y*lamda
. gen lamdanls = 1 / yhatnls                  /* yhatnls saved earlier */
. gen lnfnls = ln(lamdanls) - y*lamdanls
. quietly means lnfnls
. scalar LLnls = r(mean)*r(N)
. * R-squared = 1 - Sum_i(y_i - yhat_i)^2 / Sum_i(y_i - ybar)^2
. egen ybar = mean(y)
. * quietly means y
. * scalar ybar = r(mean)
. gen y_ybarsq = (y - ybar)^2
. quietly means y_ybarsq
. scalar SStotal = r(mean)
. gen y_yhatsqnls = (y - yhatnls)^2
. quietly means y_yhatsqnls
. scalar SSresidnls = r(mean)
. scalar Rsqnls = 1 - SSresidnls/SStotal      /* SStotal found earlier */
. di LLnls " " Rsqnls
-232.97524 .39134462

. 
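
[Note: sketch only - not run in this logged session. The claim in (1E) that nl's
 reported R-squared is based on the uncentered total sum of squares Sum_i y_i^2
 can be checked directly; the names ysq and RSSnls are made up for this sketch:

     gen ysq = y^2
     quietly summarize y_yhatsqnls
     scalar RSSnls = r(sum)                      /* Sum_i (y_i - yhat_i)^2 */
     quietly summarize ysq
     di "Uncentered R-squared: " 1 - RSSnls/r(sum)

 This should reproduce the 0.5052 reported by nl in (1B), rather than the
 centered value .3913 computed above.]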
. ** (2) FGNLS ESTIMATION USING STATA NL COMMAND
. 
. * The following gives FGNLS in Table 5.7
. * To instead get the WNLS estimates in Table 5.7
. * replace  gen wfgnls = (1/yhatnls)^2  below by  gen wfgnls = 1/yhatnls
. 
. * The feasible generalized NLS estimator minimizes
. *   SUM_i (y_i - g(x_i'b))^2 / s_i^2   where s_i^2 = estimate of sigma_i^2
. * The model is y_i = g(x_i'b) + u_i where u_i ~ (0, sigma_i^2)
. * Can do NLS with weighting option [aweight = 1/(s_i^2)]
. * Here s_i^2 = [exp(x_i'b)]^2 = yhatnls^2
. 
. * The simplest way to proceed is to use the aweights option.
. 
. * (2A) nl program expnls already defined in (1A)
. 
. * (2B) For FGNLS do this nl but now with weights
. gen wfgnls = (1/yhatnls)^2
. * gen wfgnls = 1/yhatnls
. nl expnls y [aweight=wfgnls]
(sum of wgt is   405584.32)

Iteration 0:  residual SS =  1127.256
Iteration 1:  residual SS =  363.8331
Iteration 2:  residual SS =  239.3399
Iteration 3:  residual SS =  220.6796
Iteration 4:  residual SS =  220.2856
Iteration 5:  residual SS =  220.2851
Iteration 6:  residual SS =  220.2851

      Source |       SS       df       MS             Number of obs =    10000
-------------+------------------------------          F(  2,  9998) =  4946.06
       Model |   217.95244     2   108.97622          Prob > F      =   0.0000
    Residual |  220.285065  9998  .022032913          R-squared     =   0.4973
-------------+------------------------------          Adj R-squared =   0.4972
       Total |  438.237505 10000  .043823751          Root MSE      = .1484349
                                                       Res. dev.    = 8924.231
(expnls)
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       b1int |   1.984035   .0147737   134.30   0.000     1.955075   2.012994
         b2x |   -.990691     .01001   -98.97   0.000    -1.010313  -.9710694
------------------------------------------------------------------------------
(SEs, P values, CIs, and correlations are asymptotic approximations)

. estimates store bfgnls
. 
. * (2C) Robust standard errors
. * The standard errors given above are consistent
. * assuming a correct model for the heteroskedasticity.
. * To guard against misspecification use a similar approach to the NLS case
. * Obtain the derivatives dg/db
. * Here g = exp(x'b) so dg/db = exp(x'b)*x = yhat*x
. predict residoptnls, residuals
. predict yhatoptnls, yhat
. gen d1opt = yhatoptnls
. gen d2opt = x*yhatoptnls
. * This OLS regression gives robust standard errors
. regress residoptnls d1opt d2opt [aweight=wfgnls], noconstant robust
(sum of wgt is   4.0558e+05)

Regression with robust standard errors                 Number of obs =   10000
                                                       F(  2,  9998) =    0.00
                                                       Prob > F      =  1.0000
                                                       R-squared     =  0.0000
                                                       Root MSE      =  .14843

------------------------------------------------------------------------------
             |               Robust
 residoptnls |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       d1opt |  -9.85e-09   .0145803    -0.00   1.000    -.0285803   .0285802
       d2opt |   8.81e-09   .0101319     0.00   1.000    -.0198606   .0198606
------------------------------------------------------------------------------

. estimates store bfgnlsrobust
. * This OLS regression gives nonrobust standard errors
. * It is a check and should equal (2B)
. regress residoptnls d1opt d2opt [aweight=wfgnls], noconstant
(sum of wgt is   4.0558e+05)

      Source |       SS       df       MS              Number of obs =   10000
-------------+------------------------------           F(  2,  9998) =    0.00
       Model |  2.2737e-13     2  1.1369e-13           Prob > F      =  1.0000
    Residual |  220.285065  9998  .022032913           R-squared     =  0.0000
-------------+------------------------------           Adj R-squared = -0.0002
       Total |  220.285065 10000  .022028506           Root MSE      =  .14843

------------------------------------------------------------------------------
 residoptnls |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       d1opt |  -9.85e-09   .0147737    -0.00   1.000    -.0289594   .0289594
       d2opt |   8.81e-09     .01001     0.00   1.000    -.0196216   .0196216
------------------------------------------------------------------------------

. estimates store bfgnlscheck
. 
. * (2D) Calculate R-squared and log-likelihood at the FGNLS estimates
. * Note that Stata version 8 reports the wrong R-squared
. * as it uses TSS = Sum_i y_i^2 and not Sum_i (y_i - ybar)^2
. * lnL sums lnf(y) = ln(lamda) - y*lamda
. gen lamdafgnls = 1 / yhatoptnls             /* yhatoptnls saved earlier */
. gen lnffgnls = ln(lamdafgnls) - y*lamdafgnls
. quietly means lnffgnls
. scalar LLfgnls = r(mean)*r(N)
. * R-squared = 1 - Sum_i(y_i - yhat_i)^2 / Sum_i(y_i - ybar)^2
. gen y_yhatsqfgnls = (y - yhatoptnls)^2
. quietly means y_yhatsqfgnls
. scalar SSresidfgnls = r(mean)
. scalar Rsqfgnls = 1 - SSresidfgnls/SStotal  /* SStotal found earlier */
. di LLfgnls " " Rsqfgnls
-208.71965 .39056605

. 
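
[Note: sketch only - not run in this logged session. The FGNLS above is a
 two-step estimator: the weights come from the first-round NLS fitted values,
 followed by one weighted nl. It can be iterated by rebuilding the weights from
 the second-round fitted values and re-running the weighted nl; the name wfgnls2
 is made up for this sketch:

     gen wfgnls2 = (1/yhatoptnls)^2              /* update the weights */
     nl expnls y [aweight=wfgnls2]               /* next FGNLS round   */

 Repeating this update until the estimates stop changing gives fully iterated
 FGNLS.]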
. ** (3) WNLS ESTIMATION USING STATA NL COMMAND
. 
. * To get the WNLS estimates in Table 5.7
. * replace  gen wfgnls = (1/yhatnls)^2  in (2) FGNLS by  gen wfgnls = 1/yhatnls
. * Code is shorter as all comments are dropped
. 
. gen wwnls = 1/yhatnls
. nl expnls y [aweight=wwnls]
(sum of wgt is   39858.614)

Iteration 0:  residual SS =  2630.417
Iteration 1:  residual SS =  1694.802
Iteration 2:  residual SS =  1500.277
Iteration 3:  residual SS =  1494.658
Iteration 4:  residual SS =  1494.653

      Source |       SS       df       MS             Number of obs =    10000
-------------+------------------------------          F(  2,  9998) =  5073.75
       Model |  1517.00087     2  758.500436          Prob > F      =   0.0000
    Residual |   1494.6525  9998  .149495149          R-squared     =   0.5037
-------------+------------------------------          Adj R-squared =   0.5036
       Total |  3011.65337 10000  .301165337          Root MSE      =  .386646
                                                       Res. dev.    = 14035.49
(expnls)
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       b1int |   1.990623   .0224903    88.51   0.000     1.946537   2.034708
         b2x |  -.9960671    .009777  -101.88   0.000    -1.015232  -.9769022
------------------------------------------------------------------------------
(SEs, P values, CIs, and correlations are asymptotic approximations)

. estimates store bwnls
. predict residwnls, residuals
. predict yhatwnls, yhat
. gen d1w = yhatwnls
. gen d2w = x*yhatwnls
. regress residwnls d1w d2w [aweight=wwnls], noconstant robust
(sum of wgt is   3.9859e+04)

Regression with robust standard errors                 Number of obs =   10000
                                                       F(  2,  9998) =    0.00
                                                       Prob > F      =  1.0000
                                                       R-squared     =  0.0000
                                                       Root MSE      =  .38665

------------------------------------------------------------------------------
             |               Robust
   residwnls |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         d1w |  -1.11e-07   .0358551    -0.00   1.000    -.0702833   .0702831
         d2w |   5.35e-08   .0224175     0.00   1.000    -.0439428    .043943
------------------------------------------------------------------------------

. estimates store bwnlsrobust
. regress residwnls d1w d2w [aweight=wwnls], noconstant
(sum of wgt is   3.9859e+04)

      Source |       SS       df       MS              Number of obs =   10000
-------------+------------------------------           F(  2,  9998) =    0.00
       Model |  1.8190e-12     2  9.0949e-13           Prob > F      =  1.0000
    Residual |   1494.6525  9998  .149495149           R-squared     =  0.0000
-------------+------------------------------           Adj R-squared = -0.0002
       Total |   1494.6525 10000   .14946525           Root MSE      =  .38665

------------------------------------------------------------------------------
   residwnls |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         d1w |  -1.11e-07   .0224903    -0.00   1.000    -.0440856   .0440853
         d2w |   5.35e-08    .009777     0.00   1.000    -.0191649    .019165
------------------------------------------------------------------------------

. estimates store bwnlscheck
. gen lamdawnls = 1 / yhatwnls                /* yhatwnls saved earlier */
. gen lnfwnls = ln(lamdawnls) - y*lamdawnls
. quietly means lnfwnls
. scalar LLwnls = r(mean)*r(N)
. gen y_yhatsqwnls = (y - yhatwnls)^2
. quietly means y_yhatsqwnls
. scalar SSresidwnls = r(mean)
. scalar Rsqwnls = 1 - SSresidwnls/SStotal    /* SStotal found earlier */
. di LLwnls " " Rsqwnls
-208.93381 .39017996

. 
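
[Note: sketch only - not run in this logged session. Since the NLS, FGNLS and
 WNLS results were all saved with estimates store, they can also be displayed
 side by side in one table ahead of the Table 5.7 printout below; the s.e.'s
 shown would be the nonrobust ones saved by nl:

     estimates table bnls bfgnls bwnls, b(%10.4f) se(%10.4f) stats(N)

 The robust s.e.'s still come from the auxiliary regressions stored as
 bnlsrobust, bfgnlsrobust and bwnlsrobust.]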
. ***** PRINT RESULTS: Last three columns of Table 5.7 page 161
. 
. * (1) NLS using NL - nonrobust and robust standard errors
. *     Here nonrobust differs from robust asymptotically
. 
. * Table 5.7 NLS nonrobust standard errors
. estimates table bnls, b(%10.4f) se(%10.4f) t stats(N ll)

---------------------------
    Variable |    bnls     
-------------+-------------
       b1int |     1.8876  
             |     0.0307  
             |      61.52  
         b2x |    -0.9575  
             |     0.0097  
             |     -98.28  
-------------+-------------
           N | 10000.0000  
          ll |             
---------------------------
              legend: b/se/t

. * Table 5.7 NLS robust standard errors
. estimates table bnlscheck bnlsrobust, b(%10.4f) se(%10.4f) t stats(N ll)

----------------------------------------
    Variable |  bnlscheck   bnlsrobust  
-------------+--------------------------
          d1 |     0.0000       0.0000  
             |     0.0307       0.1421  
             |       0.00         0.00  
          d2 |    -0.0000      -0.0000  
             |     0.0097       0.0612  
             |      -0.00        -0.00  
-------------+--------------------------
           N | 10000.0000   10000.0000  
          ll | -1.426e+04   -1.426e+04  
----------------------------------------
                           legend: b/se/t

. 
. /*
> * Check: Nonrobust standard errors of NLS b1int and b2x:
> di seb1intnlsnr " " seb2xnlsnr
> * Robust standard errors of NLS estimates of b1int and b2x:
> di seb1intnls " " seb2xnls
> */
. * Alternative robust standard errors of NLS estimates of b1int and b2x:
. * These use knowledge that Var[u] = exp(x'b)^2
. di seb1intnlsknown " " seb2xnlsknown
.21097066 .08798113

. 
. * (3) WNLS - nonrobust and robust standard errors
. *     Here nonrobust and robust differ, as the WNLS working variance exp(x'b)
. *     is not the true variance exp(x'b)^2 (compare the FGNLS results below)
. * Table 5.7 WNLS nonrobust standard errors
. estimates table bwnls, b(%10.4f) se(%10.4f) t stats(N ll)

---------------------------
    Variable |   bwnls     
-------------+-------------
       b1int |     1.9906  
             |     0.0225  
             |      88.51  
         b2x |    -0.9961  
             |     0.0098  
             |    -101.88  
-------------+-------------
           N | 10000.0000  
          ll |             
---------------------------
              legend: b/se/t

. * Table 5.7 WNLS robust standard errors
. estimates table bwnlscheck bwnlsrobust, b(%10.4f) se(%10.4f) t stats(N ll)

----------------------------------------
    Variable |  bwnlscheck  bwnlsrob~t  
-------------+--------------------------
         d1w |    -0.0000      -0.0000  
             |     0.0225       0.0359  
             |      -0.00        -0.00  
         d2w |     0.0000       0.0000  
             |     0.0098       0.0224  
             |       0.00         0.00  
-------------+--------------------------
           N | 10000.0000   10000.0000  
          ll | -4685.9286   -4685.9286  
----------------------------------------
                           legend: b/se/t

. 
. * (2) FGNLS - nonrobust and robust standard errors
. *     Here nonrobust = robust asymptotically as FGNLS uses the correct
. *     model for the heteroskedasticity (LEF)
. *     Also should be the same as the MLE asymptotically
. * Table 5.7 FGNLS nonrobust standard errors
. estimates table bfgnls, b(%10.4f) se(%10.4f) t stats(N ll)

---------------------------
    Variable |   bfgnls    
-------------+-------------
       b1int |     1.9840  
             |     0.0148  
             |     134.30  
         b2x |    -0.9907  
             |     0.0100  
             |     -98.97  
-------------+-------------
           N | 10000.0000  
          ll |             
---------------------------
              legend: b/se/t

. * Table 5.7 FGNLS robust standard errors
. estimates table bfgnlscheck bfgnlsrobust, b(%10.4f) se(%10.4f) t stats(N ll)

----------------------------------------
    Variable |  bfgnlsch~k  bfgnlsro~t  
-------------+--------------------------
       d1opt |    -0.0000      -0.0000  
             |     0.0148       0.0146  
             |      -0.00        -0.00  
       d2opt |     0.0000       0.0000  
             |     0.0100       0.0101  
             |       0.00         0.00  
-------------+--------------------------
           N | 10000.0000   10000.0000  
          ll |  4887.7042    4887.7042  
----------------------------------------
                           legend: b/se/t

. 
. * (4) Print the various log-likelihoods and R-squareds
. * Log-likelihood for NLS, FGNLS and WNLS
. di "LLnls: " LLnls " LLfgnls: " LLfgnls " LLwnls: " LLwnls
LLnls: -232.97524 LLfgnls: -208.71965 LLwnls: -208.93381

. * R-squared for NLS, FGNLS and WNLS
. di "Rsqnls: " Rsqnls " Rsqfgnls: " Rsqfgnls " Rsqwnls: " Rsqwnls
Rsqnls: .39134462 Rsqfgnls: .39056605 Rsqwnls: .39017996

. 
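
[Note: sketch only - not run in this logged session. The log-likelihoods and
 R-squareds just displayed can also be collected into a labelled matrix for a
 more compact printout; the matrix name results is made up for this sketch:

     matrix results = (LLnls, Rsqnls \ LLfgnls, Rsqfgnls \ LLwnls, Rsqwnls)
     matrix rownames results = NLS FGNLS WNLS
     matrix colnames results = lnL Rsq
     matrix list results, format(%10.4f)

 The same scalars defined in (1E), (2D) and the WNLS block are used, so no new
 computation is involved.]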
. ********** CLOSE OUTPUT **********
. 
. log close
       log:  c:\Imbook\bwebpage\Section2\mma05p2nls.txt
  log type:  text
 closed on:  17 May 2005, 13:53:34
----------------------------------------------------------------------------------------------------