By Adrian Bevan
Cambridge University Press
Hardback: £40 $75
Paperback: £18.99 $31.99
E-book: $26
Also available at the CERN bookshop
The numerous foundational errors and misunderstandings in this book make it inappropriate for use by students or research physicists at any level. There is space here to indicate only a few of the more serious problems.
The fundamental concepts – probability, probability density function (PDF) and likelihood function – are confused throughout. Likelihood is defined as being “proportional to probability”, and both are confused with a PDF in section 3.8(6). Exercise 3.11 invites the reader to “re-express the PDF as a likelihood function”, which is absurd because the two are functions of different arguments: a PDF is a function of the data for fixed parameter values, whereas a likelihood is a function of the parameters for the fixed, observed data.
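To make the distinction concrete, here is a minimal numerical sketch (my own illustration, not taken from the book) using a binomial example: the same formula is a probability of the data k when the parameter p is held fixed, but a likelihood function of p when the observed data are held fixed, and only the former is normalised.

    from scipy.stats import binom
    from scipy.integrate import quad

    n, k_obs, p_fixed = 10, 3, 0.4   # illustrative numbers only

    # Viewed as a function of the data k with p fixed, the binomial
    # probabilities sum to one:
    print(sum(binom.pmf(k, n, p_fixed) for k in range(n + 1)))   # -> 1.0

    # Viewed as a function of the parameter p with the data fixed, the same
    # formula is the likelihood L(p); it is not normalised in p and is not
    # a PDF (the integral is 1/(n+1), not 1):
    print(quad(lambda p: binom.pmf(k_obs, n, p), 0.0, 1.0)[0])   # -> 0.0909...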
Probability and probability density are confused most notably in section 5.5 (χ² distribution), where the “probability of χ²” is given as the value of the PDF instead of its integral from χ² to infinity. (The latter quantity is in fact the p value, which is introduced later in section 8.2, but is needed here already.) The student who evaluates the PDFs labelled P(χ², ν) in figure 5.6 to do exercises 5.10 to 5.12 will get the wrong answers, but the numbers given in table E11 – miraculously – are the correct p values. Fortunately, the formulas in the book were not used to produce the tables.
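The distinction is easy to check numerically; a two-line sketch (using SciPy rather than the book's formulas, with arbitrary illustrative values) shows the difference between the value of the PDF and the upper-tail integral that gives the p value:

    from scipy.stats import chi2

    chisq, ndf = 10.0, 5            # arbitrary illustrative values
    print(chi2.pdf(chisq, ndf))     # value of the PDF at chi-squared
    print(chi2.sf(chisq, ndf))      # integral from chi-squared to infinity: the p value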
From the beginning there is confusion about what is Bayesian and what is not. Bayesian probability is defined correctly as a degree of belief, but Bayes’s theorem is introduced in the section entitled “Bayesian probability”, even though it can be used equally well in frequentist statistics, and in fact nearly all of the examples use frequentist probabilities. The different factors in Bayes’s theorem are given Bayesian names (one of which is wrong: the likelihood function is inexplicably called “a priori probability”), but the examples labelled “Bayesian” do not use the theorem in a Bayesian way. Worse, example 3.7.4, labelled Bayesian, confuses the two arguments of conditional probability throughout, and equation 3.17 is wrong (as can be seen by comparing it with P(A) in section 3.2, which is correct). On the other hand, in section 8.7.1 a similar example – with frequentist probabilities again – is presented clearly and correctly. Example 3.7.5 (also labelled Bayesian) is, as far as I can see, nonsense (what is outcome A?).
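For reference, the standard form of the theorem and the usual names of its factors (standard nomenclature, not taken from this book) are, in LaTeX notation:

    P(\theta \mid x) \;=\; \frac{P(x \mid \theta)\,P(\theta)}{P(x)},
    \qquad\text{i.e.}\qquad
    \text{posterior} \;=\; \frac{\text{likelihood}\times\text{prior}}{\text{evidence}},

so P(x|θ) is the likelihood, P(θ) is the prior (the genuine “a priori probability”), and the application is Bayesian only when the parameter θ is itself assigned a probability distribution.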
The most serious errors occur in chapter 7 (confidence intervals). Confidence intervals are frequentist by definition; their Bayesian counterparts should be called credible intervals. But the treatment here is a curious mixture of Bayesian, frequentist and pure invention. The definition of the confidence level (CL) is novel and involves integration under a PDF that could be the Bayesian posterior but in some examples turns out to be a likelihood function. Coverage is then defined in a frequentist-inspired way (invoking repeated experiments), but it is not the correct frequentist definition, namely the probability, over repeated experiments, that the interval produced by the procedure contains the true value of the parameter. The Feldman–Cousins (F–C) frequentist method is presented without first describing the more general Neyman construction on which it is based. A good treatment of the Neyman construction would have allowed the reader to understand coverage better, which the book correctly identifies as the most important property of confidence intervals. It is true that for discrete (e.g. Poisson) data the F–C method in general over-covers, but it should also have been stated that in this case any method (including Bayesian) that covers for all parameter values must over-cover for some. The “coverage” that this book claims to be exact for Bayesian methods is not an accepted definition, because it represents subjective belief only and does not have the frequentist properties required by physicists.
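The frequentist meaning of coverage is easy to demonstrate. Below is a minimal Monte Carlo sketch (my own illustration, not the book's method and not the F–C construction; it uses the classical central interval for a Poisson mean) of the definition invoked above: generate many repeated experiments at a fixed true value, construct the interval for each, and count how often the interval contains the truth. For discrete data the result exceeds the nominal 68% for most true values, which illustrates the unavoidable over-coverage just mentioned.

    import numpy as np
    from scipy.stats import chi2

    def poisson_central_interval(n, cl=0.68):
        # Classical ("Garwood") central confidence interval for a Poisson mean,
        # used here only as an example of a construction that covers for all mu.
        alpha = 1.0 - cl
        n = np.asarray(n)
        lo = 0.5 * chi2.ppf(alpha / 2, 2 * np.maximum(n, 1))
        lo = np.where(n > 0, lo, 0.0)           # empty-sample lower limit is zero
        hi = 0.5 * chi2.ppf(1 - alpha / 2, 2 * (n + 1))
        return lo, hi

    rng = np.random.default_rng(1)
    for mu_true in (0.5, 2.0, 5.0, 10.0):            # arbitrary true values
        n_obs = rng.poisson(mu_true, size=200_000)   # repeated experiments
        lo, hi = poisson_central_interval(n_obs)
        coverage = np.mean((lo <= mu_true) & (mu_true <= hi))
        print(f"true mu = {mu_true:5.1f}:  coverage = {coverage:.3f}  (nominal 0.68)")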