MACHINE LEARNING or STATISTICAL LEARNING
Colin Cameron, Department of Economics,
University of California - Davis June 2019
Machine learning methods for
prediction are well-established in the statistical and
computer science literature.
Applying machine learning methods for causal inference is a
very active area in the economics literature.
A summary such as that in the slides below can become dated
SLIDES: MACHINE LEARNING BRIEF OVERVIEW
This 60-slide overview was presented in June 2019
SLIDES: MORE DETAIL ON MACHINE LEARNING IN GENERAL
The following two sets of slides provide much
more detail on basic machine learning methods.
They were created in April 2019 for short courses in Germany
(Basics: selection, shrinkage, dimension reduction, LASSO)
(Flexible methods: including random forests, classification and
SLIDES: MORE DETAIL ON MACHINE LEARNING FOR
The following set of slides
provides much more detail on use in economics of machine
These slides were created in April 2019 for short courses
in Germany and presentation at U.C. Riverside.
They cover a prediction example in economics and then various
methods for causal inference in the partially linear model and
in heterogeneous effects models.
The slides also list key references in the current economics
USEFUL TEXTS FOR MACHINE LEARNING (NOT ECONOMICS)
For statistical learning the main text used in 240F is an
undergraduate / masters level book
ISL: Gareth James, Daniela Witten, Trevor Hastie and
Robert Tibshirani (2013), An Introduction to Statistical
Learning: with Applications in R, Springer.
A free legal pdf is at http://www-bcf.usc.edu/~gareth/ISL/
and a $25 hardcopy can be obtained via
Supplementary material on statistical learning came from the
Ph.D. level book
ESL: Trevor Hastie, Robert Tibshirani and Jerome
Friedman (2009), The Elements of Statistical Learning: Data
Mining, Inference and Prediction, Springer.
A free legal pdf is at
and a $25 hardcopy can be obtained via
A newer book that is good but I haven't used is
Bradley Efron and Trevor Hastie (2016) Computer Age
Statistical Inference: Algorithms, Evidence and Data Science,
Cambridge University Press.
LEADERS IN ECONOMETRICS
Bringing established machine learning methods into
econometrics is currently an active area. The literature focuses
on valid statistical inference controlling for first-stage data
mining, and causal inference. Leading econometricians include
Alex Belloni https://faculty.fuqua.duke.edu/~abn5/belloni-index.html
Christian Hansen http://faculty.chicagobooth.edu/christian.hansen/research/
Susan Athey https://www.gsb.stanford.edu/faculty-research/faculty/susan-athey
Guido Imbens https://www.gsb.stanford.edu/faculty-research/faculty/guido-w-imbens
Coursera has many courses https://www.coursera.org/browse/data-science/machine-learning?languages=en
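The concern noted in this section, valid inference after first-stage data mining, can be illustrated for the partially linear model y = theta*d + g(X) + u. The sketch below is a minimal, illustrative version of the cross-fitted partialling-out idea associated with the double/debiased machine learning literature; it is not taken from the slides. The function names (ridge_fit_predict, dml_partially_linear), the simulated data, and the use of closed-form ridge regression as the first-stage learner are all assumptions made for illustration; in practice a LASSO or random forest would typically take the place of ridge.

```python
import numpy as np

def ridge_fit_predict(X_train, y_train, X_test, alpha=1.0):
    """Fit ridge regression on centered training data; predict for X_test.

    Hypothetical helper: a simple stand-in for any first-stage ML learner.
    """
    x_mean = X_train.mean(axis=0)
    y_mean = y_train.mean()
    Xc = X_train - x_mean
    p = Xc.shape[1]
    beta = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(p),
                           Xc.T @ (y_train - y_mean))
    return y_mean + (X_test - x_mean) @ beta

def dml_partially_linear(y, d, X, n_folds=5, alpha=1.0, seed=0):
    """Cross-fitted partialling-out estimate of theta in y = theta*d + g(X) + u."""
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    y_res = np.empty(n)
    d_res = np.empty(n)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        # Predict y and d from X using only the other folds, then residualize.
        y_res[test_idx] = y[test_idx] - ridge_fit_predict(
            X[train], y[train], X[test_idx], alpha)
        d_res[test_idx] = d[test_idx] - ridge_fit_predict(
            X[train], d[train], X[test_idx], alpha)
    # OLS of residualized y on residualized d (no intercept).
    theta = (d_res @ y_res) / (d_res @ d_res)
    eps = y_res - theta * d_res
    # Heteroskedasticity-robust standard error for the residual regression.
    se = np.sqrt(np.sum(d_res**2 * eps**2)) / (d_res @ d_res)
    return theta, se

# Simulated data (assumed design): true theta is 1.0, controls enter linearly.
rng = np.random.default_rng(42)
n, p = 2000, 20
X = rng.normal(size=(n, p))
d = X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)
y = 1.0 * d + X @ np.linspace(1.0, 0.0, p) + rng.normal(size=n)

theta_hat, se_hat = dml_partially_linear(y, d, X)
print(f"theta_hat = {theta_hat:.3f}, robust se = {se_hat:.3f}")
```

Each observation's residuals come from a model fitted on the other folds, so first-stage overfitting does not feed directly into the second-stage regression. The estimate should land close to the true value of 1.0 in this simulated design.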
SOME ECONOMICS REFERENCES
This is a very active area: all the papers below were published
in 2011 or later.
Machine learning prediction in economics
Hal Varian (2014), "Big Data: New Tricks for
Econometrics", Journal of Economic Perspectives, Spring, 3-28.
Sendhil Mullainathan and J. Spiess (2017), "Machine Learning: An
Applied Econometric Approach", Journal of Economic Perspectives,
Spring, 87-106.
Jon Kleinberg, H. Lakkaraju, Jure Leskovec, Jens Ludwig, Sendhil
Mullainathan (2018), "Human Decisions and Machine Predictions",
Quarterly Journal of Economics, 237-293.
Surveys of causal inference in economics
Susan Athey (2018), "The Impact of Machine Learning on
Susan Athey and Guido Imbens (2019), "Machine Learning Methods
Economists Should Know About."
Alex Belloni, Victor Chernozhukov and Christian Hansen (2014),
"High-dimensional methods and inference on structural and
treatment effects," Journal of Economic Perspectives, Spring,
Causal inference in economics
Alex Belloni, Victor Chernozhukov and Christian Hansen
(2011), "Inference Methods for High-Dimensional Sparse Econometric
Models," Advances in Economics and Econometrics, ES World Congress
2010, ArXiv 2011.
Alex Belloni, D. Chen, Victor Chernozhukov and Christian Hansen
(2012), "Sparse Models and Methods for Optimal Instruments with an
Application to Eminent Domain", Econometrica, Vol. 80, 2369-2429.
Alex Belloni, Victor Chernozhukov, Ivan Fernandez-Val and
Christian Hansen (2017), "Program Evaluation and Causal Inference
with High-Dimensional Data," Econometrica, 233-299.
Victor Chernozhukov, Denis Chetverikov, Mert Demirer, Esther
Duflo, Christian Hansen, Whitney Newey and James Robins (2018),
"Double/debiased machine learning for treatment and structural
parameters," The Econometrics Journal, 21, C1-C68.
Max Farrell (2015), "Robust Inference on Average Treatment Effects
with Possibly More Covariates than Observations", Journal of
Econometrics, 189, 1-23.
Max Farrell, Tengyuan Liang and Sanjog Misra (2018), "Deep Neural
Networks for Estimation and Inference: Application to Causal
Effects and Other Semiparametric Estimands," arXiv:1809.09953v2.
Stefan Wager and Susan Athey (2018), "Estimation and Inference of
Heterogeneous Treatment Effects using Random Forests," JASA,
Achim Ahrens, Christian Hansen and Mark Schaffer (2019),
"lassopack: Model selection and prediction with regularized
regression in Stata,"