MACHINE LEARNING or STATISTICAL LEARNING
Based on Econ/ARE 240F

Department of Economics, University of California - Davis  Spring 2016

TYPED LECTURE SLIDES

trstatisticallearning.pdf   Based on the two books by Hastie, Tibshirani and coauthors

HAND-WRITTEN  LECTURE SLIDES

Statistical_learning_3_econometrics.pdf    Covers applications in econometrics with inference that controls for data-mining

USEFUL TEXTS

For statistical learning the main text used in 240F is an undergraduate / masters level book
ISL: Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani (2013), An Introduction to Statistical Learning: with Applications in R, Springer.
A free legal pdf is at http://www-bcf.usc.edu/~gareth/ISL/ and a $25 hardcopy can be obtained via http://www.springer.com/gp/products/books/mycopy

Supplementary material on statistical learning came from the Ph.D. level book
ESL: Trevor Hastie, Robert Tibshirani and Jerome Friedman (2009), The Elements of Statistical Learning: Data Mining, Inference and Prediction, Springer.
A free legal pdf is at http://statweb.stanford.edu/~tibs/ElemStatLearn/index.html and a $25 hardcopy can be obtained via http://www.springer.com/gp/products/books/mycopy

A new book that should be good, though I have not yet used it, is
Bradley Efron and Trevor Hastie (2016), Computer Age Statistical Inference: Algorithms, Evidence and Data Science, Cambridge University Press.

LEADERS IN ECONOMETRICS

Bringing established machine learning methods into econometrics is currently an active area of research. The literature focuses on causal inference and on valid statistical inference that controls for first-stage data mining. Leading econometricians include
Victor Chernozhukov    http://web.mit.edu/~vchern/www/
Guido Imbens    https://www.gsb.stanford.edu/faculty-research/faculty/guido-w-imbens  https://people.stanford.edu/imbens/publications
Susan Athey  https://www.gsb.stanford.edu/faculty-research/faculty/susan-athey   https://people.stanford.edu/athey/research

ONLINE COURSES

Coursera has many courses   https://www.coursera.org/browse/data-science/machine-learning?languages=en

REFERENCES FOR 240F Spring 2016

This is a very active area: All the papers below were published in 2012 or later.

Partial survey focused on using LASSO: A. Belloni, V. Chernozhukov and C. Hansen, "High-Dimensional Methods and Inference on Treatment and Structural Effects in Economics," J. Economic Perspectives Spring 2014, pp. 29-50, with accompanying Stata and Matlab programs and Stata replication code.

Lasso and IV: A. Belloni, V. Chernozhukov, D. Chen, and C. Hansen, "Sparse Models and Methods for Instrumental Regression, with an Application to Eminent Domain," arXiv 2010, Econometrica 2012, pp. 2369-2429.

Lasso and control function: A. Belloni, V. Chernozhukov and C. Hansen, "Inference on Treatment Effects After Selection Among High-Dimensional Controls," The Review of Economic Studies 2014, pp. 608-650.

Lasso and propensity score weighting: M. Farrell, "Robust Inference on Average Treatment Effects with Possibly More Covariates than Observations," Journal of Econometrics, 2015, vol. 189, pp. 1-23.

H. Varian, "Big Data: New Tricks for Econometrics," J. Economic Perspectives Spring 2014, pp. 3-28.
Dataset can be obtained from https://www.aeaweb.org/articles.php?doi=10.1257/jep.28.2

Other papers by Chernozhukov and coauthors on this topic are at http://www.mit.edu/~vchern/#veryhigh

G. Imbens and S. Athey, "Machine Learning Methods for Estimating Heterogeneous Causal Effects."

Brief overview paper by S. Athey, "Machine Learning and Causal Inference for Policy Evaluation," http://faculty-gsb.stanford.edu/athey/documents/AtheyKDDfinal.pdf


Other papers by Athey are at http://faculty-gsb.stanford.edu/athey/research.html#Econometric_Theory_%28Identification_and_E