Econometrics Vs ML

Econometrics and Machine Learning: A Comprehensive Comparison

Dr Merwan Roudane

July 24, 2024

Dr Merwan Roudane Econometrics and Machine Learning: A Comprehensive Comparison


July 24, 2024 1 / 45
Table of Contents
1 Leo Breiman’s Philosophy of Modeling
2 Econometrics and Machine Learning: Definitions and Types
3 Terminology: Econometrics vs. Machine Learning
4 Differences Between Econometrics and Machine Learning
5 Challenges and Limitations
6 What Econometrics Can Learn from Machine Learning
7 What Machine Learning Can Learn from Econometrics
8 Data Splitting in ML vs. Full Data Use in Econometrics
9 Combining Econometrics and Machine Learning
10 Research Problems: Econometrics vs. ML
11 Recent Developments

Leo Breiman’s Philosophy of Modeling

Leo Breiman’s “Two Cultures” in Statistical Modeling

Data Modeling Culture
  Assumes a stochastic data model
  Focus on parameter estimation and inference
Algorithmic Modeling Culture
  Treats the data mechanism as unknown
  Focus on predictive accuracy


Data Modeling Culture

Rooted in traditional statistics and econometrics
Assumes data is generated by a specific stochastic model
Goals:
  Estimate model parameters
  Test hypotheses
  Make inferences about the population
Examples: Linear regression, logistic regression, time series models
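
As a rough illustration of the data modeling goals (estimate a parameter, then test a hypothesis about it), the sketch below fits a one-regressor linear model by ordinary least squares. The simulated data, the helper name `ols_simple`, and the true coefficients are assumptions made for this example, not part of the slides.

```python
import math
import random


def ols_simple(x, y):
    """Fit y = a + b*x by ordinary least squares.

    Returns the intercept, the slope, and the classical
    (homoskedastic) standard error of the slope.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = rss / (n - 2)  # error-variance estimate with 2 fitted parameters
    se_slope = math.sqrt(sigma2 / sxx)
    return intercept, slope, se_slope


# Simulated data from an assumed stochastic model: y = 1 + 2x + noise.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(500)]
y = [1.0 + 2.0 * xi + rng.gauss(0, 1) for xi in x]

intercept, slope, se_slope = ols_simple(x, y)
t_stat = slope / se_slope  # test statistic for H0: slope = 0
print(f"slope = {slope:.3f}, se = {se_slope:.3f}, t = {t_stat:.1f}")
```

The output of interest here is the coefficient and its uncertainty, not a prediction score.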


Algorithmic Modeling Culture

Emerged with the rise of machine learning and data science
Treats the data mechanism as a complex, unknown “black box”
Goals:
  Achieve high predictive accuracy
  Find a function that maps inputs to outputs
Examples: Random forests, neural networks, support vector machines
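
A minimal sketch of the algorithmic view, assuming simulated data: the mechanism is treated as unknown and two predictors are judged only by held-out error. A k-nearest-neighbour regressor stands in for the methods listed above (it is a substitution chosen to keep the example dependency-free), and the function names and the value `k=10` are illustrative assumptions.

```python
import math
import random


def knn_predict(train_x, train_y, x0, k):
    """Average the targets of the k training points closest to x0."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x0))[:k]
    return sum(train_y[i] for i in nearest) / k


def linear_fit(x, y):
    """Least-squares line, used here only as a baseline predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


def mse(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Simulated non-linear mechanism, treated as unknown by both predictors.
rng = random.Random(1)
xs = [rng.uniform(-3, 3) for _ in range(600)]
ys = [math.sin(2 * x) + rng.gauss(0, 0.2) for x in xs]
train_x, train_y = xs[:400], ys[:400]
test_x, test_y = xs[400:], ys[400:]

intercept, slope = linear_fit(train_x, train_y)
mse_linear = mse(test_y, [intercept + slope * x for x in test_x])
mse_knn = mse(test_y, [knn_predict(train_x, train_y, x, k=10) for x in test_x])
print(f"held-out MSE: linear = {mse_linear:.3f}, kNN = {mse_knn:.3f}")
```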


Implications of the Two Cultures

Different approaches to model validation
Trade-offs between interpretability and predictive power
Varying emphasis on theoretical foundations
Distinct perspectives on the role of domain knowledge
Ongoing debate about the most appropriate approach in different contexts

Econometrics and Machine Learning: Definitions and Types

Econometrics: Definition

Definition
Econometrics is the application of statistical methods to economic data to give empirical content to economic relationships.

Combines economic theory, mathematics, and statistical inference
Aims to quantify economic relationships and test economic theories
Focuses on causal inference and parameter estimation


Types of Econometric Methods

Cross-sectional analysis
Time series analysis
Panel data methods
Instrumental variables estimation
Difference-in-differences
Regression discontinuity design
Structural equation modeling
Vector autoregression (VAR)
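
To make one item on this list concrete, here is a hedged sketch of instrumental variables estimation with a single instrument, on simulated data where the regressor is correlated with an unobserved confounder. The data-generating process, variable names, and the true coefficient of 0.5 are assumptions for illustration.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (n - 1)


rng = random.Random(2)
n = 2000
true_beta = 0.5
z = [rng.gauss(0, 1) for _ in range(n)]  # instrument: shifts x, excluded from y
c = [rng.gauss(0, 1) for _ in range(n)]  # unobserved confounder
x = [zi + ci + rng.gauss(0, 1) for zi, ci in zip(z, c)]
y = [true_beta * xi + ci + rng.gauss(0, 1) for xi, ci in zip(x, c)]

beta_ols = cov(x, y) / cov(x, x)  # biased: x is correlated with the error term
beta_iv = cov(z, y) / cov(z, x)   # IV estimator with one instrument
print(f"OLS = {beta_ols:.3f}, IV = {beta_iv:.3f}, truth = {true_beta}")
```

In this setup OLS overstates the effect because the confounder raises both x and y, while the IV ratio uses only the variation in x that comes from z.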


Machine Learning: Definition

Definition
Machine Learning is a field of study that gives computers the ability to learn without being explicitly programmed.

Focuses on developing algorithms that can learn from and make predictions or decisions based on data
Emphasizes predictive accuracy and pattern recognition
Often deals with high-dimensional and unstructured data


Types of Machine Learning Methods

Supervised Learning
  Classification (e.g., logistic regression, decision trees)
  Regression (e.g., linear regression, random forests)
Unsupervised Learning
  Clustering (e.g., k-means, hierarchical clustering)
  Dimensionality reduction (e.g., PCA, t-SNE)
Reinforcement Learning
Deep Learning (neural networks)

Terminology: Econometrics vs. Machine Learning

Terminology Comparison

Concept           Econometrics                    Machine Learning
Output variable   Dependent variable              Label / Target
Input variables   Independent variables           Features / Predictors
Model fit         R-squared, Adjusted R-squared   Performance metrics
Model assessment  Hypothesis testing, p-values    Cross-validation
Error term        Residual                        Loss
Model             Estimator                       Learner / Algorithm


Terminology Differences: Implications

Reflects different focuses and philosophical approaches
Econometrics terminology emphasizes statistical inference
Machine learning terminology reflects focus on prediction and algorithm performance
Understanding both sets of terminology is crucial for interdisciplinary work
Bridging terminological gaps can lead to better integration of methods

Differences Between Econometrics and Machine Learning

Key Differences

Primary goals
Approach to model selection
Handling of high-dimensional data
Emphasis on interpretability
Treatment of causality
Theoretical foundations
Typical applications


Primary Goals

Econometrics          Machine Learning
Causal inference      Predictive accuracy
Parameter estimation  Pattern recognition
Hypothesis testing    Automation
Policy evaluation     Scalability


Approach to Model Selection

Econometrics
  Theory-driven
  Emphasis on model interpretability
  Focus on unbiasedness and efficiency
  Use of information criteria (AIC, BIC)
Machine Learning
  Data-driven
  Emphasis on predictive performance
  Use of cross-validation
  Ensemble methods and model averaging
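
The two selection styles above can be compared on the same candidates. The sketch below, under assumed simulated data, scores a constant-only model and a linear model with Gaussian AIC and BIC (computed up to an additive constant) and with 5-fold cross-validated mean squared error. The candidate set and all function names are hypothetical.

```python
import math
import random


def fit_constant(x, y):
    mean_y = sum(y) / len(y)
    return lambda x0: mean_y


def fit_linear(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return lambda x0: my + slope * (x0 - mx)


def rss(model, x, y):
    return sum((yi - model(xi)) ** 2 for xi, yi in zip(x, y))


def info_criteria(fit, n_params, x, y):
    """Gaussian AIC and BIC, each up to an additive constant."""
    n = len(x)
    fit_term = n * math.log(rss(fit(x, y), x, y) / n)
    return fit_term + 2 * n_params, fit_term + n_params * math.log(n)


def cv_mse(fit, x, y, k=5):
    """k-fold cross-validated mean squared error (interleaved folds)."""
    n = len(x)
    total = 0.0
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train = [i for i in range(n) if i not in test_idx]
        model = fit([x[i] for i in train], [y[i] for i in train])
        total += sum((y[i] - model(x[i])) ** 2 for i in test_idx)
    return total / n


rng = random.Random(3)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [1.0 + 0.8 * xi + rng.gauss(0, 1) for xi in x]

candidates = {"constant": (fit_constant, 1), "linear": (fit_linear, 2)}
aic = {name: info_criteria(f, p, x, y)[0] for name, (f, p) in candidates.items()}
bic = {name: info_criteria(f, p, x, y)[1] for name, (f, p) in candidates.items()}
cv = {name: cv_mse(f, x, y) for name, (f, p) in candidates.items()}

best_by_aic = min(aic, key=aic.get)
best_by_bic = min(bic, key=bic.get)
best_by_cv = min(cv, key=cv.get)
print(best_by_aic, best_by_bic, best_by_cv)
```

Information criteria penalize in-sample fit by parameter count, while cross-validation measures error on observations left out of the fit. With a clear signal, as assumed here, they tend to agree.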


Handling of High-Dimensional Data

Econometrics
  Traditional focus on low-dimensional data
  Instrumental variables for endogeneity
  Recent developments in high-dimensional econometrics
Machine Learning
  Designed to handle high-dimensional data
  Feature selection and dimensionality reduction
  Regularization techniques (Lasso, Ridge)


Emphasis on Interpretability

Econometrics
  High emphasis on interpretability
  Focus on marginal effects and elasticities
  Importance of economic significance
Machine Learning
  Often sacrifices interpretability for predictive power
  “Black box” models common
  Recent focus on interpretable ML
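
As an example of an interpretable parameter, the sketch below recovers a price elasticity as the slope of a log-log regression. The constant-elasticity demand curve, the elasticity of -1.5, and the noise level are assumptions used to simulate data.

```python
import math
import random

rng = random.Random(4)
true_elasticity = -1.5

# Assumed constant-elasticity demand: q = 100 * p^elasticity * exp(noise).
prices = [rng.uniform(1.0, 10.0) for _ in range(400)]
quantities = [100.0 * p ** true_elasticity * math.exp(rng.gauss(0, 0.2))
              for p in prices]

log_p = [math.log(p) for p in prices]
log_q = [math.log(q) for q in quantities]
n = len(log_p)
mean_lp, mean_lq = sum(log_p) / n, sum(log_q) / n

# In ln(q) = a + b*ln(p), the slope b is the price elasticity of demand.
elasticity = (sum((a - mean_lp) * (b - mean_lq) for a, b in zip(log_p, log_q))
              / sum((a - mean_lp) ** 2 for a in log_p))
print(f"estimated elasticity = {elasticity:.3f}")
```

The slope has a direct economic reading: a 1% price increase is associated with roughly a 1.5% fall in quantity in this simulated market.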


Treatment of Causality

Econometrics
  Central focus on causal relationships
  Extensive toolbox for causal inference
  Emphasis on identifying assumptions
Machine Learning
  Traditionally focused on correlation, not causation
  Recent developments in causal ML
  Challenges with high-dimensional causality


Theoretical Foundations

Econometrics
  Grounded in economic theory
  Strong statistical foundations
  Emphasis on asymptotic properties
Machine Learning
  Rooted in computer science and optimization
  Focus on computational efficiency
  Emphasis on empirical performance


Typical Applications

Econometrics               Machine Learning
Policy evaluation          Image and speech recognition
Demand estimation          Recommender systems
Macroeconomic forecasting  Fraud detection
Labor market analysis      Autonomous vehicles

Challenges and Limitations

Challenges in Econometrics

Endogeneity and identification issues
Model misspecification
Limited external validity
Difficulty handling very large datasets
Assumptions of linearity and normality
Interpretability vs. complexity trade-off


Limitations of Econometrics

Often relies on strong assumptions
May struggle with high-dimensional data
Can be computationally intensive for large datasets
Limited flexibility in modeling complex, non-linear relationships
May overfit in small samples if model is too complex
Challenges in handling unstructured data (text, images)


Challenges in Machine Learning

Lack of causal interpretation
Overfitting and poor generalization
Black-box nature of complex models
Data quality and bias
Computational intensity
Difficulty in quantifying uncertainty
Limited theoretical guarantees


Limitations of Machine Learning

Often lacks clear causal interpretation
May capture spurious correlations
Requires large amounts of data for complex models
Can be sensitive to distribution shifts
May struggle with small, nuanced datasets
Limited ability to incorporate domain knowledge
Challenges in model interpretability and explainability

What Econometrics Can Learn from Machine Learning

Lessons from Machine Learning for Econometrics

Cross-validation for model selection and evaluation
Regularization techniques for high-dimensional data
Ensemble methods for improved prediction
Flexible modeling of non-linear relationships
Handling of unstructured data (text, images)
Scalable algorithms for big data


Cross-validation in Econometrics

Provides a more robust measure of out-of-sample performance
Helps mitigate overfitting
Can be used alongside traditional model selection criteria
Challenges:
  Maintaining temporal structure in time series data
  Accounting for clustered or hierarchical data structures
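
One possible way to handle the two challenges above is to change how folds are built. The sketch below generates expanding-window splits, so training data always precedes test data, and grouped splits, so every observation from a cluster lands in the same fold. Both helpers and the firm labels are hypothetical names for this example.

```python
def expanding_window_splits(n_obs, n_splits, min_train):
    """Yield (train_idx, test_idx) pairs where training always precedes testing."""
    test_size = (n_obs - min_train) // n_splits
    for s in range(n_splits):
        train_end = min_train + s * test_size
        yield list(range(train_end)), list(range(train_end, train_end + test_size))


def grouped_kfold_splits(groups, k):
    """Yield (train_idx, test_idx) pairs that keep each group inside one fold."""
    unique = sorted(set(groups))
    for fold in range(k):
        held_out = set(unique[fold::k])
        test = [i for i, g in enumerate(groups) if g in held_out]
        train = [i for i, g in enumerate(groups) if g not in held_out]
        yield train, test


# Time series: 20 periods, at least 8 for the first training window.
ts_splits = list(expanding_window_splits(n_obs=20, n_splits=3, min_train=8))

# Clustered data: two observations per (hypothetical) firm.
groups = ["firm_a", "firm_a", "firm_b", "firm_b",
          "firm_c", "firm_c", "firm_d", "firm_d"]
group_splits = list(grouped_kfold_splits(groups, k=2))

for train, test in ts_splits:
    print("train up to", train[-1], "-> test", test)
```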


Regularization in Econometrics

Useful for high-dimensional problems (e.g., many covariates)
Lasso: Can perform variable selection
Ridge: Handles multicollinearity
Elastic Net: Combines Lasso and Ridge
Applications:
  Selecting instrumental variables
  Estimating treatment effects with many controls
  High-dimensional fixed effects models
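
To show how Lasso performs variable selection, here is a small sketch that minimises $\frac{1}{2n}\lVert y - Xb\rVert^2 + \lambda\lVert b\rVert_1$ by cyclic coordinate descent. The simulated design (two relevant covariates, four irrelevant ones), the penalty `lam=0.3`, and the function names are assumptions. In practice the penalty would usually be chosen by cross-validation.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; exact zero inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=50):
    """Minimise (1/(2n)) * ||y - Xb||^2 + lam * ||b||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    resid = list(y)  # residuals y - Xb for the current b (b = 0 at the start)
    col_scale = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of feature j with the partial residual excluding j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * coef[j])
                      for i in range(n)) / n
            new_coef = soft_threshold(rho, lam) / col_scale[j]
            delta = new_coef - coef[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                coef[j] = new_coef
    return coef


rng = random.Random(5)
n, p = 400, 6
true_coef = [3.0, -2.0, 0.0, 0.0, 0.0, 0.0]  # only two covariates matter
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [sum(b * v for b, v in zip(true_coef, row)) + rng.gauss(0, 1) for row in X]

coef = lasso_coordinate_descent(X, y, lam=0.3)
print([round(c, 2) for c in coef])
```

The soft-thresholding step is what sets small coefficients to exactly zero. The surviving coefficients are shrunk toward zero, which is why a post-selection refit is often used when unbiased estimates are needed.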


Ensemble Methods in Econometrics

Can improve prediction accuracy
Examples:
  Bagging for more stable estimates
  Random forests for non-linear relationships
  Boosting for iterative improvements
Challenges:
  Maintaining interpretability
  Incorporating economic theory
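
A rough sketch of bagging, assuming simulated data: a deliberately unstable predictor (1-nearest-neighbour regression) is refit on bootstrap resamples and the predictions are averaged. The base learner, the number of resamples, and the function names are illustrative choices, not something the slides prescribe.

```python
import math
import random


def nearest_neighbor_predict(train_x, train_y, x0):
    """Predict with the target of the single closest training point."""
    i = min(range(len(train_x)), key=lambda j: abs(train_x[j] - x0))
    return train_y[i]


def bagged_predict(train_x, train_y, query_x, n_boot, rng):
    """Average 1-NN predictions over bootstrap resamples of the training set."""
    n = len(train_x)
    totals = [0.0] * len(query_x)
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        boot_x = [train_x[i] for i in idx]
        boot_y = [train_y[i] for i in idx]
        for m, x0 in enumerate(query_x):
            totals[m] += nearest_neighbor_predict(boot_x, boot_y, x0)
    return [t / n_boot for t in totals]


def mse(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


rng = random.Random(6)
train_x = [rng.uniform(-3, 3) for _ in range(200)]
train_y = [math.sin(x) + rng.gauss(0, 0.5) for x in train_x]
test_x = [rng.uniform(-3, 3) for _ in range(400)]
test_y = [math.sin(x) + rng.gauss(0, 0.5) for x in test_x]

single = [nearest_neighbor_predict(train_x, train_y, x0) for x0 in test_x]
bagged = bagged_predict(train_x, train_y, test_x, n_boot=25, rng=rng)
mse_single = mse(test_y, single)
mse_bagged = mse(test_y, bagged)
print(f"test MSE: single = {mse_single:.3f}, bagged = {mse_bagged:.3f}")
```

Averaging over resamples smooths out the noise that a single nearest neighbour passes straight into the prediction, which is the sense in which bagging gives more stable estimates.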

What Machine Learning Can Learn from Econometrics

Lessons from Econometrics for Machine Learning

Causal inference frameworks
Treatment of endogeneity
Incorporation of domain knowledge
Emphasis on model interpretability
Rigorous statistical inference
Handling of panel and time series data


Causal Inference in Machine Learning

Moving beyond prediction to causal relationships
Adapting econometric tools for causal ML:
  Instrumental variables
  Difference-in-differences
  Regression discontinuity
Challenges:
  Maintaining flexibility of ML models
  Scaling causal inference to high dimensions
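
As a worked instance of one of the tools listed above, the sketch below computes a two-group, two-period difference-in-differences estimate from cell means. The simulated outcome equation, with a fixed group gap, a common time trend, and a true effect of 2.0, is an assumption, and parallel trends holds by construction.

```python
import random


def mean(values):
    return sum(values) / len(values)


rng = random.Random(7)
true_effect = 2.0
rows = []  # each row: (treated_group, post_period, outcome)
for _ in range(2000):
    treated = rng.random() < 0.5
    post = rng.random() < 0.5
    outcome = (5.0
               + 1.0 * treated                      # fixed gap between groups
               + 0.5 * post                         # common time trend
               + true_effect * (treated and post)   # effect only after treatment
               + rng.gauss(0, 1))
    rows.append((treated, post, outcome))


def cell_mean(treated, post):
    return mean([y for t, p, y in rows if t == treated and p == post])


# Change over time in the treated group minus the change in the control group.
did = ((cell_mean(True, True) - cell_mean(True, False))
       - (cell_mean(False, True) - cell_mean(False, False)))

# Naive post-period comparison mixes the effect with the fixed group gap.
naive = cell_mean(True, True) - cell_mean(False, True)
print(f"DiD = {did:.3f}, naive post-period gap = {naive:.3f}")
```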


Incorporating Domain Knowledge in ML

Econometrics often relies heavily on domain expertise
Ways to incorporate domain knowledge in ML:
  Feature engineering based on theory
  Constrained optimization respecting economic laws
  Transfer learning from theoretical models
  Bayesian priors informed by economic intuition
Benefits:
  Improved interpretability
  Better generalization to out-of-sample scenarios
  Alignment with existing theoretical frameworks

Data Splitting in ML vs. Full Data Use in Econometrics

Data Splitting Philosophy

Econometrics (Full Data Use)
  Maximizes statistical power
  Focuses on in-sample fit and inference
  Relies on asymptotic theory
Machine Learning (Data Splitting)
  Estimates out-of-sample performance
  Mitigates overfitting
  Validates model generalizability


Rationale Behind Data Splitting

Provides unbiased estimate of model performance on new data
Helps detect and prevent overfitting
Allows for model selection and hyperparameter tuning
Mimics real-world scenario of predicting on unseen data
Essential for complex, flexible models prone to overfitting


How to Split Data

Common split ratios: 70-30, 80-20 (train-test)
Cross-validation: k-fold, leave-one-out
Stratified sampling for imbalanced datasets
Time-based splitting for time series data
Nested cross-validation for model selection and evaluation
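
Two of the splitting schemes above can be sketched in a few lines: shuffled k-fold indices and a stratified train-test split that preserves class shares. The helper names, the 80-20 ratio, and the 90/10 class imbalance are assumptions for this example.

```python
import random


def kfold_indices(n_obs, k, rng):
    """Yield (train_idx, test_idx) for k shuffled, non-overlapping folds."""
    idx = list(range(n_obs))
    rng.shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    for f in range(k):
        train = [i for g, fold in enumerate(folds) if g != f for i in fold]
        yield train, folds[f]


def stratified_train_test(labels, test_share, rng):
    """Split indices so each class keeps about the same share in both parts."""
    by_class = {}
    for i, label in enumerate(labels):
        by_class.setdefault(label, []).append(i)
    train, test = [], []
    for members in by_class.values():
        members = members[:]  # copy before shuffling
        rng.shuffle(members)
        n_test = round(len(members) * test_share)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test


rng = random.Random(8)
folds = list(kfold_indices(10, k=5, rng=rng))

labels = [0] * 90 + [1] * 10  # imbalanced: 10% positives
train_idx, test_idx = stratified_train_test(labels, test_share=0.2, rng=rng)
print("positives in test set:", sum(labels[i] for i in test_idx))
```

With a purely random 80-20 split, the rare class could be under-represented or missing from the test set. Stratifying fixes its share in both parts.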


Limitations of Data Splitting

Reduced sample size for model estimation
Potential loss of statistical power
May not work well for small datasets
Can be sensitive to specific random splits
Challenges with non-i.i.d. data (e.g., time series, spatial data)
May not capture long-term or rare events in test set


When Data Splitting is Useful

Large datasets where statistical power is not an issue
Complex models with many parameters
When the primary goal is predictive performance
In presence of potential overfitting
When assessing model generalizability is crucial
For model comparison and selection

Combining Econometrics and Machine Learning

Hybrid Approaches

Double Machine Learning for Treatment Effects
Causal Forests
Neural Network-based Instrumental Variables
High-dimensional Econometrics with ML Feature Selection
ML for Heterogeneous Treatment Effects
Synthetic Control Methods with ML
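
As a sketch of the first hybrid approach, the code below applies the partialling-out form of double machine learning to a partially linear model $y = \theta d + g(x) + e$: flexible learners predict $y$ and $d$ from $x$ on held-out folds (cross-fitting), and $\theta$ is estimated by regressing the outcome residuals on the treatment residuals. The k-nearest-neighbour learner, the two-fold scheme, the simulated data, and every function name are assumptions chosen to keep the example self-contained.

```python
import math
import random


def knn_fit_predict(train_x, train_y, query_x, k=15):
    """k-nearest-neighbour regression on a scalar covariate."""
    preds = []
    for x0 in query_x:
        nearest = sorted(range(len(train_x)),
                         key=lambda i: abs(train_x[i] - x0))[:k]
        preds.append(sum(train_y[i] for i in nearest) / k)
    return preds


def double_ml_plr(y, d, x, n_folds=2):
    """Cross-fitted partialling-out estimate of theta in y = theta*d + g(x) + e."""
    n = len(y)
    y_res, d_res = [0.0] * n, [0.0] * n
    for fold in range(n_folds):
        test = list(range(fold, n, n_folds))
        test_set = set(test)
        train = [i for i in range(n) if i not in test_set]
        x_tr = [x[i] for i in train]
        x_te = [x[i] for i in test]
        # Nuisance predictions come from a model that never saw the test fold.
        y_hat = knn_fit_predict(x_tr, [y[i] for i in train], x_te)
        d_hat = knn_fit_predict(x_tr, [d[i] for i in train], x_te)
        for i, yh, dh in zip(test, y_hat, d_hat):
            y_res[i] = y[i] - yh
            d_res[i] = d[i] - dh
    # Residual-on-residual regression without intercept.
    return sum(a * b for a, b in zip(d_res, y_res)) / sum(a * a for a in d_res)


rng = random.Random(9)
n = 600
true_theta = 1.0
x = [rng.uniform(-3, 3) for _ in range(n)]
d = [math.sin(xi) + rng.gauss(0, 1) for xi in x]            # treatment depends on x
y = [true_theta * di + 2.0 * math.sin(xi) + rng.gauss(0, 1)  # so does the outcome
     for di, xi in zip(d, x)]

theta_hat = double_ml_plr(y, d, x)

# Naive regression of y on d alone, ignoring the confounding through x.
md, my = sum(d) / n, sum(y) / n
naive = (sum((a - md) * (b - my) for a, b in zip(d, y))
         / sum((a - md) ** 2 for a in d))
print(f"DML = {theta_hat:.3f}, naive = {naive:.3f}, truth = {true_theta}")
```

The machine learning step handles the unknown functions of $x$, while the final step remains a simple regression whose coefficient has a causal reading under the model's assumptions.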


Applications of Combined Approaches

Policy Evaluation
Consumer Demand Estimation
Labor Market Analysis
Financial Risk Assessment
Macroeconomic Forecasting
Text-based Economic Indicators

Research Problems: Econometrics vs. ML

Problems Better Suited for Econometrics

Causal impact of minimum wage on employment
Effect of education on earnings
Evaluating the effectiveness of a new economic policy
Estimating price elasticity of demand
Analyzing the determinants of economic growth


Problems Better Suited for Machine Learning

Predicting consumer purchasing behavior
Credit scoring and fraud detection
Stock price prediction
Sentiment analysis of economic news
Clustering countries based on economic indicators

Recent Developments

Recent Developments in Econometrics

Machine Learning for Heterogeneous Treatment Effects
Synthetic Control Methods
High-dimensional Econometrics
Econometrics of Networks
Structural Estimation with Deep Learning
Text Analysis in Economics


Recent Developments in Machine Learning for Economics

Causal Machine Learning
Interpretable and Explainable AI
Reinforcement Learning for Economic Decision Making
Federated Learning for Privacy-preserving Analysis
Automated Machine Learning (AutoML)
Transfer Learning in Economics


Conclusion

Econometrics and Machine Learning have distinct strengths
Increasing convergence and cross-pollination of ideas
Hybrid approaches leverage the best of both worlds
Future research likely to further integrate these fields
Importance of understanding both paradigms for modern data analysis in economics


References
Breiman, L. (2001). Statistical modeling: The two cultures. Statistical Science, 16(3), 199-231.
Wooldridge, J. M. (2010). Econometric Analysis of Cross Section and Panel Data. MIT Press.
Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning. Springer.
Varian, H. R. (2014). Big data: New tricks for econometrics. Journal of Economic Perspectives, 28(2), 3-28.
Athey, S., & Imbens, G. W. (2019). Machine learning methods economists should know about. Annual Review of Economics, 11, 685-725.
Mullainathan, S., & Spiess, J. (2017). Machine learning: An applied econometric approach. Journal of Economic Perspectives, 31(2), 87-106.

Thank You

Questions? Comments?
