|
|
Preface |
|
xxvii | |
P.1 Emphasis on Foundations |
|
xxvii | |
P.2 Glimpse of History |
|
xxix | |
P.3 Organization of the Text |
|
xxxi | |
P.4 How to Use the Text |
|
xxxiv | |
P.5 Simulation Datasets |
|
xxxvii | |
P.6 Acknowledgments |
|
xl | |
Notation |
|
xlv | |
|
27 Mean-Square-Error Inference |
|
|
1053 | (39) |
|
27.1 Inference without Observations |
|
|
1054 | (3) |
|
27.2 Inference with Observations |
|
|
1057 | (3) |
|
27.3 Gaussian Random Variables |
|
|
1060 | (12) |
|
27.4 Bias-Variance Relation |
|
|
1072 | (10) |
|
27.5 Commentaries and Discussion |
|
|
1082 | (6) |
|
|
1085 | (3) |
|
27.A Circular Gaussian Distribution |
|
|
1088 | (4) |
|
|
1090 | (2) |
|
|
1092 | (29) |
|
28.1 Bayesian Formulation |
|
|
1092 | (2) |
|
28.2 Maximum A-Posteriori Inference |
|
|
1094 | (3) |
|
|
1097 | (9) |
|
28.4 Logistic Regression Inference |
|
|
1106 | (4) |
|
28.5 Discriminative and Generative Models |
|
|
1110 | (3) |
|
28.6 Commentaries and Discussion |
|
|
1113 | (8) |
|
|
1116 | (3) |
|
|
1119 | (2) |
|
|
1121 | (33) |
|
|
1121 | (7) |
|
29.2 Centering and Augmentation |
|
|
1128 | (3) |
|
|
1131 | (3) |
|
|
1134 | (2) |
|
|
1136 | (3) |
|
29.6 Minimum-Variance Unbiased Estimation |
|
|
1139 | (4) |
|
29.7 Commentaries and Discussion |
|
|
1143 | (8) |
|
|
1145 | (6) |
|
29.A Consistency of Normal Equations |
|
|
1151 | (3) |
|
|
1153 | (1) |
|
|
1154 | (57) |
|
30.1 Uncorrelated Observations |
|
|
1154 | (3) |
|
|
1157 | (2) |
|
|
1159 | (12) |
|
30.4 Measurement- and Time-Update Forms |
|
|
1171 | (6) |
|
|
1177 | (4) |
|
|
1181 | (4) |
|
30.7 Ensemble Kalman Filter |
|
|
1185 | (6) |
|
|
1191 | (10) |
|
30.9 Commentaries and Discussion |
|
|
1201 | (10) |
|
|
1204 | (4) |
|
|
1208 | (3) |
|
|
1211 | (65) |
|
|
1211 | (3) |
|
31.2 Gaussian Distribution |
|
|
1214 | (9) |
|
31.3 Multinomial Distribution |
|
|
1223 | (3) |
|
31.4 Exponential Family of Distributions |
|
|
1226 | (3) |
|
31.5 Cramer-Rao Lower Bound |
|
|
1229 | (8) |
|
|
1237 | (14) |
|
31.7 Commentaries and Discussion |
|
|
1251 | (14) |
|
|
1259 | (6) |
|
31.A Derivation of the Cramer-Rao Bound |
|
|
1265 | (1) |
|
31.B Derivation of the AIC Formulation |
|
|
1266 | (5) |
|
31.C Derivation of the BIC Formulation |
|
|
1271 | (5) |
|
|
1273 | (3) |
|
32 Expectation Maximization |
|
|
1276 | (43) |
|
|
1276 | (6) |
|
32.2 Derivation of the EM Algorithm |
|
|
1282 | (5) |
|
32.3 Gaussian Mixture Models |
|
|
1287 | (15) |
|
32.4 Bernoulli Mixture Models |
|
|
1302 | (6) |
|
32.5 Commentaries and Discussion |
|
|
1308 | (4) |
|
|
1310 | (2) |
|
32.A Exponential Mixture Models |
|
|
1312 | (7) |
|
|
1316 | (3) |
|
|
1319 | (33) |
|
33.1 Posterior Distributions |
|
|
1320 | (8) |
|
|
1328 | (5) |
|
33.3 Markov Chain Monte Carlo Method |
|
|
1333 | (13) |
|
33.4 Commentaries and Discussion |
|
|
1346 | (6) |
|
|
1348 | (1) |
|
|
1349 | (3) |
|
34 Expectation Propagation |
|
|
1352 | (28) |
|
34.1 Factored Representation |
|
|
1352 | (5) |
|
|
1357 | (14) |
|
|
1371 | (4) |
|
34.4 Assumed Density Filtering |
|
|
1375 | (3) |
|
34.5 Commentaries and Discussion |
|
|
1378 | (2) |
|
|
1378 | (1) |
|
|
1379 | (1) |
|
|
1380 | (25) |
|
|
1380 | (5) |
|
|
1385 | (8) |
|
35.3 Particle Filter Implementations |
|
|
1393 | (7) |
|
35.4 Commentaries and Discussion |
|
|
1400 | (5) |
|
|
1401 | (2) |
|
|
1403 | (2) |
|
|
1405 | (67) |
|
36.1 Evaluating Evidences |
|
|
1405 | (6) |
|
36.2 Evaluating Posterior Distributions |
|
|
1411 | (2) |
|
36.3 Mean-Field Approximation |
|
|
1413 | (27) |
|
36.4 Exponential Conjugate Models |
|
|
1440 | (14) |
|
|
1454 | (4) |
|
36.6 Stochastic Gradient Solution |
|
|
1458 | (3) |
|
|
1461 | (6) |
|
36.8 Commentaries and Discussion |
|
|
1467 | (5) |
|
|
1467 | (3) |
|
|
1470 | (2) |
|
37 Latent Dirichlet Allocation |
|
|
1472 | (45) |
|
|
1473 | (9) |
|
37.2 Coordinate-Ascent Solution |
|
|
1482 | (11) |
|
|
1493 | (7) |
|
37.4 Estimating Model Parameters |
|
|
1500 | (14) |
|
37.5 Commentaries and Discussion |
|
|
1514 | (3) |
|
|
1515 | (1) |
|
|
1515 | (2) |
|
|
1517 | (46) |
|
38.1 Gaussian Mixture Models |
|
|
1517 | (5) |
|
|
1522 | (16) |
|
38.3 Forward-Backward Recursions |
|
|
1538 | (9) |
|
38.4 Validation and Prediction Tasks |
|
|
1547 | (4) |
|
38.5 Commentaries and Discussion |
|
|
1551 | (12) |
|
|
1557 | (3) |
|
|
1560 | (3) |
|
39 Decoding Hidden Markov Models |
|
|
1563 | (46) |
|
|
1563 | (2) |
|
39.2 Decoding Transition Probabilities |
|
|
1565 | (4) |
|
39.3 Normalization and Scaling |
|
|
1569 | (5) |
|
|
1574 | (12) |
|
39.5 EM Algorithm for Dependent Observations |
|
|
1586 | (18) |
|
39.6 Commentaries and Discussion |
|
|
1604 | (5) |
|
|
1605 | (2) |
|
|
1607 | (2) |
|
40 Independent Component Analysis |
|
|
1609 | (34) |
|
|
1610 | (7) |
|
40.2 Maximum-Likelihood Formulation |
|
|
1617 | (5) |
|
40.3 Mutual Information Formulation |
|
|
1622 | (5) |
|
40.4 Maximum Kurtosis Formulation |
|
|
1627 | (7) |
|
|
1634 | (3) |
|
40.6 Commentaries and Discussion |
|
|
1637 | (6) |
|
|
1638 | (2) |
|
|
1640 | (3) |
|
|
1643 | (39) |
|
41.1 Curse of Dimensionality |
|
|
1644 | (3) |
|
41.2 Probabilistic Graphical Models |
|
|
1647 | (23) |
|
41.3 Active and Blocked Pathways |
|
|
1661 | |
|
41.4 Conditional Independence Relations |
|
|
1670 | (7) |
|
41.5 Commentaries and Discussion |
|
|
1677 | (5) |
|
|
1679 | (1) |
|
|
1680 | (2) |
|
|
1682 | (58) |
|
42.1 Probabilistic Inference |
|
|
1682 | (3) |
|
42.2 Inference by Enumeration |
|
|
1685 | (6) |
|
42.3 Inference by Variable Elimination |
|
|
1691 | (7) |
|
|
1698 | (7) |
|
|
1705 | (6) |
|
42.6 Learning Graph Parameters |
|
|
1711 | (22) |
|
42.7 Commentaries and Discussion |
|
|
1733 | (7) |
|
|
1735 | (2) |
|
|
1737 | (3) |
|
|
1740 | (67) |
|
43.1 Cliques and Potentials |
|
|
1740 | (12) |
|
43.2 Representation Theorem |
|
|
1752 | (4) |
|
|
1756 | (5) |
|
43.4 Message-Passing Algorithms |
|
|
1761 | (32) |
|
43.5 Commentaries and Discussion |
|
|
1793 | (6) |
|
|
1796 | (3) |
|
43.A Proof of the Hammersley-Clifford Theorem |
|
|
1799 | (4) |
|
43.B Equivalence of Markovian Properties |
|
|
1803 | (4) |
|
|
1804 | (3) |
|
44 Markov Decision Processes |
|
|
1807 | (46) |
|
|
1807 | (14) |
|
|
1821 | (4) |
|
|
1825 | (15) |
|
44.4 Linear Function Approximation |
|
|
1840 | (8) |
|
44.5 Commentaries and Discussion |
|
|
1848 | (5) |
|
|
1850 | (1) |
|
|
1851 | (2) |
|
45 Value and Policy Iterations |
|
|
1853 | (64) |
|
|
1853 | (13) |
|
|
1866 | (13) |
|
45.3 Partially Observable MDP |
|
|
1879 | (14) |
|
45.4 Commentaries and Discussion |
|
|
1893 | (10) |
|
|
1900 | (3) |
|
45.A Optimal Policy and State-Action Values |
|
|
1903 | (2) |
|
45.B Convergence of Value Iteration |
|
|
1905 | (1) |
|
45.C Proof of ε-Optimality |
|
|
1906 | (1) |
|
45.D Convergence of Policy Iteration |
|
|
1907 | (2) |
|
45.E Piecewise Linear Property |
|
|
1909 | (1) |
|
45.F Bellman Principle of Optimality |
|
|
1910 | (7) |
|
|
1914 | (3) |
|
46 Temporal Difference Learning |
|
|
1917 | (54) |
|
46.1 Model-Based Learning |
|
|
1918 | (2) |
|
46.2 Monte Carlo Policy Evaluation |
|
|
1920 | (8) |
|
|
1928 | (8) |
|
46.4 Look-Ahead TD Algorithm |
|
|
1936 | (4) |
|
|
1940 | (9) |
|
46.6 True Online TD(λ) Algorithm |
|
|
1949 | (3) |
|
|
1952 | (5) |
|
46.8 Commentaries and Discussion |
|
|
1957 | (2) |
|
|
1958 | (1) |
|
46.A Useful Convergence Result |
|
|
1959 | (1) |
|
46.B Convergence of TD(0) Algorithm |
|
|
1960 | (3) |
|
46.C Convergence of TD(λ) Algorithm |
|
|
1963 | (4) |
|
46.D Equivalence of Offline Implementations |
|
|
1967 | (4) |
|
|
1969 | (2) |
|
|
1971 | (37) |
|
|
1971 | (4) |
|
47.2 Look-Ahead SARSA Algorithm |
|
|
1975 | (2) |
|
|
1977 | (2) |
|
|
1979 | (1) |
|
47.5 Optimal Policy Extraction |
|
|
1980 | (2) |
|
47.6 Q-Learning Algorithm |
|
|
1982 | (3) |
|
47.7 Exploration versus Exploitation |
|
|
1985 | (8) |
|
47.8 Q-Learning with Replay Buffer |
|
|
1993 | (1) |
|
|
1994 | (5) |
|
47.10 Commentaries and Discussion |
|
|
1996 | |
|
Problems |
|
|
1999 | (2) |
|
47.A Convergence of SARSA(0) Algorithm |
|
|
2001 | (2) |
|
47.B Convergence of Q-Learning Algorithm |
|
|
2003 | (5) |
|
|
2005 | (3) |
|
48 Value Function Approximation |
|
|
2008 | (39) |
|
48.1 Stochastic Gradient TD-Learning |
|
|
2008 | (10) |
|
48.2 Least-Squares TD-Learning |
|
|
2018 | (1) |
|
48.3 Projected Bellman Learning |
|
|
2019 | (7) |
|
|
2026 | (6) |
|
|
2032 | (9) |
|
48.6 Commentaries and Discussion |
|
|
2041 | (6) |
|
|
2043 | (2) |
|
|
2045 | (2) |
|
49 Policy Gradient Methods |
|
|
2047 | (74) |
|
|
2047 | (1) |
|
49.2 Finite-Difference Method |
|
|
2048 | (2) |
|
|
2050 | (2) |
|
|
2052 | (5) |
|
49.5 Policy Gradient Theorem |
|
|
2057 | (2) |
|
49.6 Actor-Critic Algorithms |
|
|
2059 | (12) |
|
49.7 Natural Gradient Policy |
|
|
2071 | (3) |
|
49.8 Trust Region Policy Optimization |
|
|
2074 | (19) |
|
49.9 Deep Reinforcement Learning |
|
|
2093 | (5) |
|
|
2098 | (8) |
|
49.11 Commentaries and Discussion |
|
|
2106 | (7) |
|
|
2109 | (4) |
|
49.A Proof of Policy Gradient Theorem |
|
|
2113 | (4) |
|
49.B Proof of Consistency Theorem |
|
|
2117 | (4) |
|
|
2118 | (3) |
Author Index |
|
2121 | (24) |
Subject Index |
|
2145 | |