CRC Press sent CHANCE this book for review. Since the topic was of clear interest to me, and the author has significantly contributed to the field (my only recollection of meeting Roderick Little is from the Australian Statistical Conference in Adelaide, in 2012, at the start of my Oz 2012 Tour!), I took the opportunity of the next weekend to browse through Seminal Ideas and Controversies in Statistics. I very much like the idea of selecting a dozen key papers in the history of Statistics and of discussing why they matter. In fact, this reminded me of my classics seminar, which ran for the few years I was 100% in charge of the Master's program at Dauphine (and which I hope to restart!). Checking the list of papers I suggested to my students at the time, I see an overlap of 9 papers with the book's 15 groups. (I also remember Steve Fienberg making suggestions for that list while he was spending a sabbatical in Paris at CREST.) Given that commonality of focus and purpose, and contrary to my wont, I have very little of substance to criticize or to wish for in the book. All the less when reading the following
“On a personal note, I met Yates [author of a 1984 paper on tests for 2×2 contingency tables discussing the relevance of conditioning on one or both margins], a charming man, when I was a young graduate student who knew next to nothing about statistics; we discussed the joys of traversing the Cuillin Ridge in Skye.”
since completing that ridge remains high on my mountain-climbing bucket list! (Possibly next year, since we are running an ICMS workshop on the island.)
The first paper in the series is more than a foundational paper, since Fisher's 1922 paper (almost) created ex nihilo the field of (modern) mathematical statistics. I do not know if there is any equivalent in other scientific disciplines of such an impact (and of such a man)… Roderick Little manages to engage convincingly with Fisher's dismissive views on (what was not yet called) Bayesian analysis, although, in the latter's defence, the formalisation of Bayesian inference had not yet emerged at that time. The second chapter discusses Yates' 1984 paper on tests for 2×2 contingency tables, written 50 years after the original one in the first volume of JRSS. Roderick Little adds a detailed Bayesian analysis with the three standard reference priors, the Jeffreys version proving quite close to Fisher's exact test (conditional on both margins); a toy illustration of that closeness is sketched after this paragraph. The third chapter aims at the generic challenge of hypothesis testing, from the well-known opposition between Fisher and Neyman (both on the cover), to questioning the sanity of hard-set thresholds (with a mention of our American Statistician call to abandon (shi)p!). The latter (thus) refers to the recent literature on the replicability crisis and the now famous ASA statement on p-values by Ron Wasserstein and Nicole Lazar, analysed in the chapter. But I would have liked to read another full section on alternatives to hypothesis testing. While now a niche interest (imho), Fisher's attempt at creating a posterior distribution without a prior, aka fiducial inference, is discussed in Chapter 4 with the Behrens-Fisher problem as the illustrative example. The chapter feels rather anticlimactic, with the comparison relying on the (Malay) Ghosh and Kim (2001) simulation results.
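Back to Chapter 2's 2×2 tables: as a quick check on how close a Jeffreys-prior analysis can come to Fisher's exact test, here is a minimal Python sketch (obviously not the book's analysis), assuming an independent-binomial sampling model with Jeffreys Beta(1/2,1/2) priors and entirely made-up counts.

```python
# Minimal sketch (not the book's analysis): compare Fisher's exact test on a
# 2x2 table with a Bayesian tail probability under independent-binomial
# sampling and Jeffreys Beta(1/2, 1/2) priors. The table entries are made up.
import numpy as np
from scipy.stats import beta, fisher_exact

rng = np.random.default_rng(0)

# hypothetical 2x2 table: rows = groups, columns = success / failure
a, b = 12, 8    # group 1: successes, failures
c, d = 5, 15    # group 2: successes, failures

# Fisher's exact test, conditioning on both margins
_, p_fisher = fisher_exact([[a, b], [c, d]], alternative="greater")

# Jeffreys posterior for each success probability: Beta(x + 1/2, n - x + 1/2)
p1 = beta.rvs(a + 0.5, b + 0.5, size=100_000, random_state=rng)
p2 = beta.rvs(c + 0.5, d + 0.5, size=100_000, random_state=rng)
post_prob = np.mean(p1 <= p2)  # posterior P(p1 <= p2), a Bayesian analogue of the p-value

print(f"Fisher exact one-sided p-value: {p_fisher:.4f}")
print(f"Posterior P(p1 <= p2) under Jeffreys priors: {post_prob:.4f}")
```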
Birnbaum's (1962) likelihood principle is the topic of Chapter 5 (and I can hardly remember any of my students choosing this paper over the years, although there was at least one). Roderick Little recalls some sentences from the JASA discussion as an appetiser, a reminder of the time when these discussions could turn into scathing attacks. The chapter contains excerpts from Berger and Wolpert (1988), which they were writing while I was spending a year at Purdue and which I have always recommended to my PhD students, albeit not for the classics seminar. It then moves to the controversies that have surrounded this principle since its inception, in particular those accumulated by Deborah Mayo (also on the cover), as reported on the 'Og. In recent years, I have become less excited about the LP, in part due to the imprecision in its statement, which opens the door to conflicting interpretations. And in part due to the scarcity of models with non-trivial sufficient statistics. (I am also wondering if the sufficiency issue we highlighted in our ABC model choice criticism relates to the mixture example at the end of the chapter.)
The next chapter is all about compromise, through the calibrated Bayes perspective that credible statements should be close to confidence statements in the long run. Which I remember him presenting at ASC 2012. The notion is found in the very 1984 paper by Don Rubin (also on the cover) that contains the idea behind Approximate Bayesian Computation (ABC). And the chapter proceeds by listing strengths and weaknesses of frequentist and Bayesian perspectives, towards a fusion of both, e.g., through posterior predictive checks.
While the choice of a (general public) paper from Scientific American may sound surprising in Chapter 7, with Efron (on the cover) and Morris' 1977 paper on Stein's paradox, I cannot but applaud, all the more because this was the first paper I read when starting my PhD on James-Stein estimators. Although this may sound like eons ago, the James and Stein (1961) paper (which is my age!) "created a considerable backlash" by toppling unbiasedness from its pedestal and exhibiting a paradox whereby 1+1+1≠3… Which Little reinterprets via a random effect (or Bayesian hierarchical) model, and which the simulation sketched after this paragraph illustrates. (It is also a chapter where I learned that Little's father was a journalist, a characteristic he shared with Bruce Lindsay, as I found out at Blonde, Glasgow, during an ICMS workshop.) Relatedly, the next chapter is about the "57 varieties [of regression] paper" by Dempster, Schatzoff and Wermuth (1977), apparently connected with the Heinz 57 varieties of pickles. The paper considers Stein-type, ridge, and variable-selection versions of regression. The chapter also covers the (Bayesian) Lasso and BART, as well as an all too brief mention of spike-and-slab priors (with my friend Veronika Ročková missing from the author index!), but I was expecting from the title other, robust, forms of regression, like L¹ regression, and econometrics digressions. Chapter 10 can however be seen as a proxy, since it covers generalized estimating equations from a 1986 Biometrika paper of Liang and Zeger, with no Bayesian aspect (and an expected appearance of Communications in Statistics B).
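For readers meeting Stein's paradox for the first time, the following minimal simulation (with arbitrary dimension and means, not taken from the book) shows the James-Stein estimator beating the maximum likelihood estimator in total squared error once the dimension reaches three.

```python
# Minimal sketch of Stein's paradox: for p >= 3, the James-Stein estimator
# dominates the MLE (the observation itself) under total squared error.
# The dimension and true means below are arbitrary choices for illustration.
import numpy as np

rng = np.random.default_rng(42)
p, n_rep = 10, 10_000
theta = rng.normal(0.0, 1.0, size=p)   # arbitrary true means

x = rng.normal(theta, 1.0, size=(n_rep, p))            # X ~ N(theta, I_p)
shrink = 1.0 - (p - 2) / np.sum(x**2, axis=1, keepdims=True)
js = shrink * x                                        # James-Stein estimate

risk_mle = np.mean(np.sum((x - theta) ** 2, axis=1))   # should be close to p
risk_js = np.mean(np.sum((js - theta) ** 2, axis=1))   # strictly smaller
print(f"MLE risk: {risk_mle:.2f}  James-Stein risk: {risk_js:.2f}")
```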
Chapter 9 covers the almost immediately classic 1995 paper of Benjamini and Hochberg on multiple comparisons (that Series B turned into a discussion paper ten years later!), although it spends more time on Berry's (2012) recommendations than on FDR; a minimal implementation of the step-up rule is sketched after this paragraph. The computational Chapter 11 brings together Efron's (1979) bootstrap [with his picture on the cover] and MCMC, represented by the founding paper of Gelfand and Smith (1990, though mistakenly dated 1988 on p. 140). A bit of a strange mix imho, as the former is more inferential than computational. And not giving the EM algorithm that much space. And not questioning MCMC methods as a good proxy for posterior distributions. Tukey's Future of Data Analysis (as founding exploratory data analysis) and Breiman's Two Cultures (as launching statistical machine learning) meet in Chapter 12. (With a reminder that the latter invokes Occam's razor, which may not be that appropriate for hugely overparameterised machine learning black boxes, and… the Rashomon principle! Meaning that distinct models may all fit the same data. Let me nitpickingly add the reference to Ryûnosuke Akutagawa as the author of Rashômon and Other Stories, which Kurosawa adapted in his splendid movie.) The chapter contains critical remarks from David Cox, Brad Efron, David Bickel, and Andrew Gelman, with a further section on Little's view on modelling.
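Returning to Chapter 9, the Benjamini-Hochberg step-up procedure is simple enough to sketch in a few lines of Python; the simulated mix of null and non-null tests below is made up for illustration.

```python
# Minimal sketch of the Benjamini-Hochberg (1995) step-up procedure on
# simulated p-values; the mix of null and non-null tests is made up.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
m, m_alt, alpha = 1000, 100, 0.05

# p-values: 900 true nulls (uniform) and 100 shifted alternatives
z = np.concatenate([rng.normal(0, 1, m - m_alt), rng.normal(3, 1, m_alt)])
pvals = norm.sf(z)  # one-sided p-values

# step-up rule: reject the k smallest p-values, with k the largest index
# such that p_(k) <= k * alpha / m
order = np.argsort(pvals)
below = pvals[order] <= alpha * np.arange(1, m + 1) / m
k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
rejected = np.zeros(m, dtype=bool)
rejected[order[:k]] = True
print(f"{rejected.sum()} rejections out of {m} tests at FDR level {alpha}")
```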
The last three chapters are on design and sampling, in connection with Little's (and Rubin's) work in the area. With a 1934 paper of Neyman (whose picture on the cover could have been chosen differently, albeit no fault of Neyman [or of Little!] that his toothbrush style of moustache dramatically went out of fashion!). With a return to calibrated Bayes and a reminiscence of Little's time at the World Fertility Survey, but (apparently) no mention of the probabilistic aspects of modern censuses (which saw my friends Steve Fienberg on one side, and Larry Brown and Marty Wells on the other, argue for and against them!), again relating to the reliance on statistical models. Chapter 14 relates randomized clinical trials to causality, which makes a (worthy) appearance there. Roderick Little also makes a clear case there against the retracted study linking vaccines and autism, a call that will most likely not reach the current Trump administration and its Secretary of Health.
The book concludes with a list of twenty style and grammar suggestions for improved writing.
As should be crystal-clear from the above, I quite enjoyed the book and would definitely use its reading list in a graduate course whenever the opportunity arises. Once again, some choices are more personal to the author than others, and I would have placed more emphasis on the fantastic Dawid, Stone and Zidek (1973) (with Jim Zidek also missing from the author index), but all make sense in a walk through statistical classics. Let me however regret the absence therein of major actors like, e.g., D. Blackwell, C.R. Rao, or G. Wahba (except in a stylistic example, p. 199), two of whom were awarded the International Prize in Statistics.
[Disclaimer about potential self-plagiarism: this post or an edited version will eventually appear in my Books Review section in CHANCE.]