A list of papers, notes, and books on statistical learning, machine learning, data science, and statistics, with some leaning towards R.
- Discovering general multidimensional associations by Ben Murrell, Daniel Murrell, Hugh Murrell
- Do little interactions get lost in dark random forests? by Marvin N. Wright et al
- A Brief Survey of Text Mining: Classification, Clustering and Extraction Techniques by Mehdi Allahyari et al
- An Overview on Data Representation Learning: From Traditional Feature Learning to Recent Deep Learning by Guoqiang Zhong et al
- Random Forests by Leo Breiman (2001)
- An Introduction to Variable and Feature Selection by Guyon et al
- Correlation and variable importance in random forests by Baptiste Gregorutti et al
- A Comparison of Resampling and Recursive Partitioning Methods in Random Forest for Estimating the Asymptotic Variance Using the Infinitesimal Jackknife by Cole Brokamp, MB Rao, Patrick Ryan, Roman Jandarov
- An analytic approach for interpretable predictive models in high dimensional data, in the presence of interactions with exposures by Bhatnagar, SR., Yang, Y., Blanchette, M., Bouchard, L., Khundrakpam, B., Evans, A., Greenwood, CMT
- Randomer Forests by Tyler M. Tomita, Mauro Maggioni, Joshua T. Vogelstein
- Estimating Optimal Transformations for Multiple Regression Using the ACE Algorithm by Wang et al
- The Random Forest Kernel and other kernels for big data from random partitions by Alex Davies, Zoubin Ghahramani
- Quantile Regression Forests by Nicolai Meinshausen
- Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations by Sander Greenland et al
- The Matrix Calculus You Need For Deep Learning by Terence Parr, Jeremy Howard
- TabNet: Attentive Interpretable Tabular Learning by Sercan O Arik, Tomas Pfister
- Neural Oblivious Decision Ensembles for Deep Learning on Tabular Data by Sergei Popov, Stanislav Morozov, Artem Babenko
- Hierarchical Shrinkage: Improving the accuracy and interpretability of tree-based models by Abhineet Agarwal et al
- Splitting criteria for ordinal decision trees: an experimental study by Rafael Ayllón-Gavilán et al
- Statistics 701: Special Topics in Applied Statistics II by Josh Errickson
- Spring 2006: Introduction to Machine Learning (CMSC 726)
- CS534: Machine Learning
- Data Mining: Spring 2013 by Ryan Tibshirani
- Statistical Machine Learning: Spring 2017 by Ryan Tibshirani, Larry Wasserman
- Online resources to learn statistics from Penn State Eberly College of Science
- Data Mining and Machine Learning by Taylor Arnold
- Statistical Methods for Behavioral and Social Sciences by Ewart Thomas, Benoît Monin
- Reproducible Research Course by Eric C. Anderson
- A workshop on analyzing topic modeling (LDA, CTM, STM) using R
- Bayesian Deep Learning -- Tufts CS Special Topics Course
- Data processing and more with R: LLAP 2019 workshop
- ECON 305: Economics, Causality, and Analytics by Nick Huntington-Klein
- Machine learning 2015 by Tom Mitchell
- Practical Applications in R for Psychologists by Mattan S. Ben-Shachar
- Full Stack deep learning by Karayev, Tobin, Abbeel
- Mathematical Python by Patrick Walls
- Mathematical Tools for Data Science by Carlos Fernandez-Granda et al
- DEEP LEARNING NYU course 2020 by Yann LeCun & Alfredo Canziani
- Program Evaluation for Public Service -- Combine research design, causal inference, and econometric tools to measure the effects of social programs by Dr. Andrew Heiss
- Causal Inference R workshop by Malcolm Barrett
- Machine learning @VU
- Deep learning @ VU
- Applied Machine Learning course by Cornell Univ
- Causal Inference Crash Course for Scientists by Shoepaladin
- LLM Course by Maxime Labonne
- https://www.gsb.stanford.edu/faculty-research/centers-initiatives/sil/research/methods/ai-machine-learning/short-course by Susan Athey et al
- Practical Bandits: A tutorial by Bram van den Akker et al
- I2ML by Bernd Bischl et al
- Causal ML by MC Knaus
- Introduction to R and Geographic Information Systems (GIS) by Dr. Vallicrosa
- Examples to implement asynchronous programming in Shiny
- Data Visualization by Andrew Heiss
- Econometrics with Unobserved Heterogeneity by Vladislav Morozov
- Advanced Econometrics (Econometrics II) by Vladislav Morozov
- Data Science Interviews by Alexey Grigorev
- The Illustrated Machine Learning website by Francesco Di Salvo et al
- Machine learning tutorials by MingYu (Ethen) Liu
- Data-Science-Interview-Questions-Answers by youssefHosni
- Data-Science-Interview-Preperation-Resources by youssefHosni
- Data Science Topics by Jee Vang
- An Introduction to Statistical Learning by Hastie et al
- The Elements of Statistical Learning by Hastie et al
- Understanding Machine Learning: From Theory to Algorithms by Shai Shalev-Shwartz and Shai Ben-David
- Advanced Data Analysis from an Elementary Point of View by Cosma Rohilla Shalizi
- A course in Machine learning by Hal Daumé III
- Data Analysis for the Life Sciences by Irizarry
- Statlect: digital textbook on probability and statistics
- Forecasting: Principles and Practice by Rob J Hyndman, George Athanasopoulos
- Time Series Analysis with R by A. Ian McLeod, Hao Yu, Esam Mahdi
- Time Series Analysis by DD Stoffer
- Statistical foundations of machine learning by G Bontempi
- Broadening Your Statistical Horizons: Generalized Linear Models and Multilevel Models by J. Legler and P. Roback
- Principles of econometrics with R by Constantin Colonescu
- A Compendium of Clean Graphs in R by Eric-Jan Wagenmakers and Quentin F. Gronau
- R for statistical learning by David Dalpiaz
- Applied Statistics with R by David Dalpiaz
- Fundamentals of Data Visualization by Claus O. Wilke
- Introduction to Data Science by Irizarry
- MATH 3070 R Lecture Notes by Curtis Miller
- plotly for R by Carson Sievert
- Tutorials on Bayesian Nonparametrics
- Hands-on Machine Learning with R by Bradley Boehmke
- Process Improvement Using Data by Kevin Dunn
- R BGU Course by Jonathan D. Rosenblatt
- Statistical Thinking for the 21st Century by Russell A. Poldrack
- Geocomputation with R by Robin Lovelace, Jakub Nowosad, Jannes Muenchow et al
- Learning Statistics with R by Danielle Navarro
- An Introduction to Statistical and Data Sciences via R by Chester Ismay and Albert Y. Kim
- R for healthcare data analysis by Ewen Harrison and Riinu Ots
- Statistical Machine Learning course at Uppsala University by Lindholm et al
- Mathematics for machine learning by Marc Peter Deisenroth et al
- Five minute Stats by Matthew Stephens
- Another Book on Data Science: Learn R and Python in Parallel by Nailong Zhang
- Gaussian Processes for Machine Learning by Carl Edward Rasmussen and Christopher K. I. Williams
- Computer age statistical inference by Bradley Efron, Trevor Hastie
- Predictive modeling with text by Emil Hvitfeldt, Julia Silge
- Text Mining with R by Julia Silge and David Robinson
- Using Spark from R for performance with arbitrary code by Jozef Hajnala
- Mastering spark with R by Javier Luraschi, Kevin Kuo, Edgar Ruiz
- Explanatory Model Analysis: Explore, Explain and Examine Predictive Models by Przemyslaw Biecek and Tomasz Burzykowski
- Rcpp for everyone by Masaki E. Tsuda
- GAMs: Generalized additive models by M Clark
- Data Science: Theories, Models, Algorithms, and Analytics by Sanjiv Ranjan Das
- Deep Learning V1 and V2 by Subir Varma and Sanjiv Das
- Data Visualization: A practical introduction by Kieran Healy
- ggplot2: Elegant Graphics for Data Analysis by Hadley Wickham
- Analyzing Financial and Economic Data with R by Marcelo S. Perlin
- Lightweight Machine Learning Classics with R by Marek Gagolewski
- HPC with R by Vega Yon et al from USC Epidemiology
- Evidence-based Software Engineering by Derek Jones
- JavaScript for Data Science by Maya Gans, Toby Hodges, and Greg Wilson
- R for python programmers by Greg Wilson
- Spatial Data Science by Edzer Pebesma, Roger Bivand
- Intro to GIS and Spatial Analysis by Manuel Gimond
- Limitations of Interpretable Machine Learning Methods
- XAI Stories: case studies for explainable artificial intelligence
- Supervised Machine Learning for Text Analysis in R by Emil Hvitfeldt and Julia Silge
- A Business Analyst’s Introduction to Business Analytics by Adam Fleischhacker
- Modern Statistics for Modern Biology by Susan Holmes, Wolfgang Huber
- Machine Learning from Scratch by Daniel Friedman
- Bayes Rules! An Introduction to Bayesian Modeling with R by Alicia A. Johnson, Miles Ott, Mine Dogucu
- An Introduction to Bayesian Thinking by Merlise Clyde et al
- Causal Inference for The Brave and True by Matheus Facure Alves with Python code
- Handbook of Regression Modeling in People Analytics by Keith McNulty
- Applications of Deep neural networks by Jeff Heaton
- Data Science: A First Introduction by Tiffany-Anne Timbers, Trevor Campbell, Melissa Lee
- Dive into Deep Learning by multiple authors with mxnet, pytorch and tensorflow adaptations
- Modern Statistics with R by Mans Thulin
- Getting Started with Causal Inference by Emre Kiciman, Amit Sharma
- Machine Learning and AI in TensorFlow and R by Maximilian Pichler and Florian Hartig
- Notes of linear algebra chapters of Goodfellow's book by Hadrien Jean
- Data Visualization with R by Rob Kabacoff
- Surrogates: Gaussian process modeling, design and optimization for the applied sciences by Robert B. Gramacy
- Statistics for (Micro/Immuno)Biologists by Avinash Shenoy
- Spatial Modelling for Data Scientists by Francisco Rowe, Dani Arribas-Bel
- Practical Data Science by Michael Clark
- The Effect: An Introduction to Research Design and Causality by Nick Huntington-Klein
- The Mechanics of Machine Learning by Terence Parr, Jeremy Howard
- Machine learning foundations by John Krohn
- Multivariate Statistics and Machine Learning by Korbinian Strimmer
- Statistical Methods: Likelihood, Bayes and Regression by Korbinian Strimmer
- Targeted Learning in R: Causal Data Science with the tlverse by Mark van der Laan et al
- Modern Data Science with R (2ed) by Benjamin S. Baumer, Daniel T. Kaplan, and Nicholas J. Horton
- Doing Bayesian Data Analysis in brms and the tidyverse by A Solomon Kurz
- Reproducible Data Science: Accessible Data Analysis with Open Source Python Tools and Real-World Data by Valentin Danchev
- Linear Algebra for Data Science with examples in R by Shaina Race Bennett
- Beyond multiple linear regression by Paul Roback and Julie Legler
- Handbook of Graphs and Networks in People Analytics by Keith McNulty
- Reinforcement Learning Course Materials by Paderborn University
- STAT 447 : Data Science Programming Methods by Dirk Eddelbuettel
- Tidy Finance with R by Christoph Scheuch, Stefan Voigt, and Patrick Weiss
- Introduction to deep learning by MIT
- The Mathematical Engineering of Deep Learning by Benoit Liquet, Sarat Moka and Yoni Nazarathy
- Improving your statistical inferences by Daniel Lakens
- Statistics and Machine Learning in Python by Edouard Duchesnay, Tommy Lofstedt, Feki Younes
- Practical Data Analysis for Political Scientists by Brenton Kenkel
- Applied Causal Analysis with R by Paul C. Bauer
- The Hitchhiker’s Guide to Longitudinal Models by Ethan M. McCormick et al
- Understanding Deep Learning book by Simon J.D. Prince
- Geographic Data Science with R: Visualizing and Analyzing Environmental Change by Michael C. Wimberly
- R Without Statistics by David Keyes
- Deep Learning and Scientific Computing with R torch by Sigrid Keydana
- Deep R programming by Gagolewski
- Causal Inference in R by Malcolm Barrett, Lucy D’Agostino McGowan, Travis
- OSE data science (Evaluating causal claims) by Prof. Dr. Philipp Eisenhauer et al
- First course in causal inference by Peng Ding
- Spatial Statistics for Data Science: Theory and Practice with R by Paula Moraga
- Devops for data science by Alex Gold
- Building reproducible analytical pipelines with R by Bruno Rodrigues
- That’s weird! Anomaly detection using R by Rob Hyndman
- Machine Learning: The basics by Alexander Jung
- Applied Multivariate Statistics in R by Jonathan Bakker
- Supervised Machine Learning for Science by Christoph Molnar and Timo Freiesleben
- Big Data Analytics by Ulrich Matter
- Fraud Detection (practical handbook) by Yann-Aël Le Borgne et al
- Imbalanced Binary Classification: A Survey with code by Alessandro Morita et al
- Introduction to Data mining (R book) by Hahsler
- Data Science Notes by Foley
- End-to-end data science with R by Rene Essomba
- Bandit algorithms by Tor Lattimore et al
- Data Analytics: Small Data Approach by Shuai Huang and Houtao Deng
- Veridical Data Science by Bin Yu et al
- Efficient Machine Learning with R by Simon P. Couch
- Applied Causal Inference by Uday Kamath et al
- Mathematical Methods in Data Science (MMiDS) by Sebastien Roch
- Causal Inference: Design Patterns in Causal Inference by Gorkem Turgut Ozer
- Models Demystified: A Practical Guide from Linear Regression to Deep Learning by Michael Clark et al
- Design and Analysis of Experiments and Observational Studies using R by Nathan Taback
- Bayesian Modeling and Computation in Python by Osvaldo Martin et al
- Introduction to Mathematical Optimization with Python by Indranil Ghosh
- Probability and Statistics for Data Science by Carlos Fernandez-Granda
- Causal Artificial Intelligence by Elias Bareinboim
- Regression modeling strategies by Frank Harrell
- Introductory econometrics (R, Python, Julia) by Florian Heiss, Brunner
- Introduction to Econometrics with R by Christoph Hanck, Martin Arnold, Alexander Gerber, and Martin Schmelzer