Preface   xiii
Software support   xv
Acknowledgements   xvii
|
Role of probability theory in science   1
    Scientific inference   1
    Inference requires a probability theory   2
    The two rules for manipulating probabilities   4
    Usual form of Bayes' theorem   5
    Discrete hypothesis space   5
    Continuous hypothesis space   6
    Bayes' theorem -- model of the learning process   7
    Example of the use of Bayes' theorem   8
    Probability and frequency   10
    Example: incorporating frequency information   11
    (untitled)   12
    The two basic problems in statistical inference   15
    Advantages of the Bayesian approach   16
    Problems   17
|
Probability theory as extended logic   21
    Overview   21
    (untitled)   21
    (untitled)   21
    (untitled)   22
    Truth tables and Boolean algebra   22
    (untitled)   24
    Inductive or plausible inference   25
    (untitled)   25
    An adequate set of operations   26
    Examination of a logic function   27
    Operations for plausible inference   29
    The desiderata of Bayesian probability theory   30
    Development of the product rule   30
    Development of the sum rule   34
    Qualitative properties of product and sum rules   36
    Uniqueness of the product and sum rules   37
    (untitled)   39
    Problems   39
|
The how-to of Bayesian inference   41
    Overview   41
    (untitled)   41
    (untitled)   43
    (untitled)   45
    Model comparison and Occam's razor   45
    Sample spectral line problem   50
    (untitled)   50
    (untitled)   52
    Choice of prior p(T|M1, I)   53
    Calculation of p(D|M1, T, I)   55
    (untitled)   58
    (untitled)   58
    (untitled)   58
    Parameter estimation problem   59
    Sensitivity of odds to T_max   59
    (untitled)   61
    (untitled)   63
    (untitled)   65
    (untitled)   66
    Problems   69

Assigning probabilities   72
    Overview   72
    Binomial distribution   72
    Bernoulli's law of large numbers   75
    The gambler's coin problem   75
    Bayesian analysis of an opinion poll   77
    (untitled)   79
    Can you really answer that question?   80
    Logical versus causal connections   82
    Exchangeable distributions   83
    (untitled)   85
    Bayesian and frequentist comparison   87
    Constructing likelihood functions   89
    (untitled)   90
    (untitled)   91
    (untitled)   93
    Problems   94
|
Frequentist statistical inference   96
    Overview   96
    The concept of a random variable   96
    (untitled)   97
    Probability distributions   98
    Descriptive properties of distributions   100
    Relative line shape measures for distributions   101
    (untitled)   102
    Other measures of central tendency and dispersion   103
    Median baseline subtraction   104
    Moment generating functions   105
    Some discrete probability distributions   107
    Binomial distribution   107
    Poisson distribution   109
    Negative binomial distribution   112
    Continuous probability distributions   113
    (untitled)   113
    (untitled)   116
    (untitled)   116
    (untitled)   117
    Negative exponential distribution   118
    (untitled)   119
    Bayesian demonstration of the Central Limit Theorem   120
    Distribution of the sample mean   124
    (untitled)   125
    Transformation of a random variable   125
    Random and pseudo-random numbers   127
    Pseudo-random number generators   131
    (untitled)   132
    (untitled)   136
    Problems   137

What is a statistic?   139
    (untitled)   139
    (untitled)   141
    (untitled)   143
    The Student's t distribution   147
    (untitled)   150
    Confidence intervals   152
    Confidence intervals for μ, known variance   152
    Confidence intervals for μ, unknown variance   156
    Confidence intervals: difference of two means   158
    Confidence intervals for σ²   159
    Confidence intervals: ratio of two variances   159
    (untitled)   160
    Problems   161
|
Frequentist hypothesis testing   162
    Overview   162
    (untitled)   162
    Hypothesis testing with the χ² statistic   163
    Hypothesis test on the difference of two means   167
    One-sided and two-sided hypothesis tests   170
    Are two distributions the same?   172
    Pearson χ² goodness-of-fit test   173
    Comparison of two binned data sets   177
    Problem with frequentist hypothesis testing   177
    Bayesian resolution to optional stopping problem   179
    Problems   181
|
Maximum entropy probabilities   184
    Overview   184
    The maximum entropy principle   185
    Shannon's theorem   186
    Alternative justification of MaxEnt   187
    (untitled)   190
    (untitled)   190
    Continuous probability distributions   191
    How to apply the MaxEnt principle   191
    Lagrange multipliers of variational calculus   191
    (untitled)   192
    (untitled)   192
    Uniform distribution   194
    Exponential distribution   195
    Normal and truncated Gaussian distributions   197
    Multivariate Gaussian distribution   202
    MaxEnt image reconstruction   203
    The kangaroo justification   203
    MaxEnt for uncertain constraints   206
    Pixon multiresolution image reconstruction   208
    Problems   211
|
Bayesian inference with Gaussian errors   212
    Overview   212
    Bayesian estimate of a mean   212
    Mean: known noise σ   213
    Mean: known noise, unequal σ   217
    Mean: unknown noise σ   218
    (untitled)   224
    (untitled)   227
    Comparison of two independent samples   228
    (untitled)   230
    How do the samples differ?   233
    (untitled)   233
    (untitled)   236
    Ratio of the standard deviations   237
    Effect of the prior ranges   239
    (untitled)   240
    Problems   241
|
Linear model fitting (Gaussian errors)   243
    Overview   243
    (untitled)   244
    (untitled)   249
    More powerful matrix formulation   253
    (untitled)   256
    The posterior is a Gaussian   257
    (untitled)   260
    (untitled)   264
    Marginalization and the covariance matrix   264
    (untitled)   268
    More on model parameter errors   272
    (untitled)   273
    Model comparison with Gaussian posteriors   275
    Frequentist testing and errors   279
    Other model comparison methods   281
    (untitled)   283
    Problems   284

Nonlinear model fitting   287
    Overview   287
    Asymptotic normal approximation   288
    (untitled)   291
    (untitled)   291
    Marginal parameter posteriors   293
    Finding the most probable parameters   294
    (untitled)   296
    (untitled)   297
    (untitled)   298
    Levenberg--Marquardt method   300
    (untitled)   301
    (untitled)   302
    (untitled)   304
    Marginal and projected distributions   306
    Errors in both coordinates   307
    (untitled)   309
    Problems   309

Markov chain Monte Carlo   312
    Overview   312
    Metropolis--Hastings algorithm   313
    Why does Metropolis--Hastings work?   319
    (untitled)   321
    (untitled)   321
    Parallel tempering   322
    Simulated annealing   326
    Towards an automated MCMC   330
    Extrasolar planet example   331
    (untitled)   335
    (untitled)   337
    MCMC robust summary statistic   342
    (untitled)   346
    Problems   349
|
Bayesian revolution in spectral analysis   352
    Overview   352
    New insights on the periodogram   352
    (untitled)   356
    Strong prior signal model   358
    No specific prior signal model   360
    (untitled)   362
    (untitled)   363
    Generalized Lomb--Scargle periodogram   365
    Relationship to Lomb--Scargle periodogram   367
    (untitled)   367
    (untitled)   370
    Problems   373
|
Bayesian inference with Poisson sampling   376
    Overview   376
    Infer a Poisson rate   377
    (untitled)   378
    Signal + known background   379
    Analysis of ON/OFF measurements   380
    Estimating the source rate   381
    Source detection question   384
    Time-varying Poisson rate   386
    Problems   388
|
Appendix A Singular value decomposition   389
|
Appendix B Discrete Fourier Transforms   392
    B.1 Overview   392
    B.2 Orthogonal and orthonormal functions   392
    B.3 Fourier series and integral transform   394
        (untitled)   395
        (untitled)   396
    B.4 Convolution and correlation   398
        B.4.1 Convolution theorem   399
        B.4.2 Correlation theorem   400
        B.4.3 Importance of convolution in science   401
    B.5 Waveform sampling   403
    B.6 Nyquist sampling theorem   404
        (untitled)   406
    B.7 Discrete Fourier Transform   407
        B.7.1 Graphical development   407
        B.7.2 Mathematical development of the DFT   409
        (untitled)   410
    B.8 (untitled)   411
        B.8.1 DFT as an approximate Fourier transform   411
        B.8.2 Inverse discrete Fourier transform   413
    B.9 The Fast Fourier Transform   415
    B.10 Discrete convolution and correlation   417
        B.10.1 Deconvolving a noisy signal   418
        B.10.2 Deconvolution with an optimal Wiener filter   420
        B.10.3 Treatment of end effects by zero padding   421
    B.11 Accurate amplitudes by zero padding   422
    B.12 Power-spectrum estimation   424
        B.12.1 Parseval's theorem and power spectral density   424
        B.12.2 Periodogram power-spectrum estimation   425
        B.12.3 Correlation spectrum estimation   426
    B.13 Discrete power spectral density estimation   428
        B.13.1 Discrete form of Parseval's theorem   428
        B.13.2 One-sided discrete power spectral density   429
        B.13.3 Variance of periodogram estimate   429
        B.13.4 Yule's stochastic spectrum estimation model   431
        B.13.5 Reduction of periodogram variance   431
    (untitled)   432

Appendix C Difference in two samples   434
    (untitled)   434
    C.2 Probabilities of the four hypotheses   434
        C.2.1 Evaluation of p(C,S|D1, D2, I)   434
        C.2.2 Evaluation of p(C,S̄|D1, D2, I)   436
        C.2.3 Evaluation of p(C̄,S|D1, D2, I)   438
        C.2.4 Evaluation of p(C̄,S̄|D1, D2, I)   439
|
    C.3 The difference in the means   439
        C.3.1 The two-sample problem   440
        C.3.2 The Behrens--Fisher problem   441
    C.4 The ratio of the standard deviations   442
        C.4.1 Estimating the ratio, given the means are the same   442
        C.4.2 Estimating the ratio, given the means are different   443
|
Appendix D Poisson ON/OFF details   445
|
    D.1 Derivation of p(s|N_on, I)   445
        (untitled)   446
        (untitled)   447
    D.2 Derivation of the Bayes factor B_{s+b,b}   448
|
Appendix E Multivariate Gaussian from maximum entropy   450
References   455
Index   461