Foreword | xv
Preface to second edition | xvii
Preface to first edition | xxi
About the author | xxiii
List of symbols | xxv
List of algorithms | xxvii
I Basics | 1 | (160)
| 3 | (26)
1.1 The problem of missing data | 3 | (5)
| 3 | (3)
1.1.2 Changing perspective on missing data | 6 | (2)
1.2 Concepts of MCAR, MAR and MNAR | 8 | (1)
| 9 | (10)
| 9 | (2)
| 11 | (1)
| 12 | (1)
1.3.4 Regression imputation | 13 | (1)
1.3.5 Stochastic regression imputation | 14 | (2)
| 16 | (1)
| 17 | (1)
| 18 | (1)
1.4 Multiple imputation in a nutshell | 19 | (4)
| 19 | (1)
1.4.2 Reasons to use multiple imputation | 20 | (1)
1.4.3 Example of multiple imputation | 21 | (2)
| 23 | (1)
1.6 What the book does not cover | 23 | (3)
| 24 | (1)
1.6.2 Weighting procedures | 24 | (1)
1.6.3 Likelihood-based approaches | 25 | (1)
1.7 Structure of the book | 26 | (1)
| 26 | (3)
|
|
| 29 | (34)
| 29 | (4)
| 29 | (1)
2.1.2 Multiple imputation | 30 | (2)
2.1.3 The expanding literature on multiple imputation | 32 | (1)
2.2 Concepts in incomplete data | 33 | (8)
2.2.1 Incomplete-data perspective | 33 | (1)
2.2.2 Causes of missing data | 33 | (2)
| 35 | (1)
2.2.4 MCAR, MAR, and MNAR again | 36 | (2)
2.2.5 Ignorable and nonignorable | 38 | (1)
2.2.6 Implications of ignorability | 39 | (2)
2.3 Why and when multiple imputation works | 41 | (8)
2.3.1 Goal of multiple imputation | 41 | (1)
2.3.2 Three sources of variation | 41 | (3)
| 44 | (2)
2.3.4 Scope of the imputation model | 46 | (1)
| 46 | (1)
| 47 | (1)
| 48 | (1)
2.4 Statistical intervals and tests | 49 | (2)
2.4.1 Scalar or multi-parameter inference? | 49 | (1)
| 50 | (1)
| 50 | (1)
2.5 How to evaluate imputation methods | 51 | (4)
2.5.1 Simulation designs and performance measures | 51 | (1)
2.5.2 Evaluation criteria | 52 | (1)
| 53 | (2)
2.6 Imputation is not prediction | 55 | (2)
2.7 When not to use multiple imputation | 57 | (1)
2.8 How many imputations? | 58 | (3)
| 61 | (2)
|
3 Univariate missing data | 63 | (42)
3.1 How to generate multiple imputations | 63 | (4)
| 65 | (1)
3.1.2 Predict + noise method | 65 | (1)
3.1.3 Predict + noise + parameter uncertainty | 65 | (1)
| 66 | (1)
3.1.5 Drawing from the observed data | 66 | (1)
| 66 | (1)
3.2 Imputation under the normal linear normal | 67 | (7)
| 67 | (1)
| 67 | (2)
| 69 | (1)
3.2.4 Generating MAR missing data | 70 | (2)
3.2.5 MAR missing data generation in multivariate data | 72 | (1)
| 73 | (1)
3.3 Imputation under non-normal distributions | 74 | (3)
| 74 | (1)
3.3.2 Imputation from the t-distribution | 75 | (2)
3.4 Predictive mean matching | 77 | (7)
| 77 | (2)
3.4.2 Computational details | 79 | (2)
| 81 | (1)
| 82 | (2)
| 84 | (1)
3.5 Classification and regression trees | 84 | (3)
| 84 | (3)
| 87 | (4)
3.6.1 Generalized linear model | 87 | (2)
| 89 | (1)
| 90 | (1)
| 91 | (5)
| 91 | (1)
3.7.2 Semi-continuous data | 92 | (1)
3.7.3 Censored, truncated and rounded data | 93 | (3)
3.8 Nonignorable missing data | 96 | (6)
| 96 | (1)
| 97 | (1)
3.8.3 Pattern-mixture model | 98 | (1)
3.8.4 Converting selection and pattern-mixture models | 99 | (1)
3.8.5 Sensitivity analysis | 100 | (1)
3.8.6 Role of sensitivity analysis | 101 | (1)
3.8.7 Recent developments | 102 | (1)
| 102 | (3)
|
4 Multivariate missing data | 105 | (34)
| 105 | (6)
| 105 | (2)
| 107 | (2)
| 109 | (2)
4.2 Issues in multivariate imputation | 111 | (1)
4.3 Monotone data imputation | 112 | (3)
| 112 | (1)
| 113 | (2)
| 115 | (4)
| 115 | (1)
| 115 | (2)
| 117 | (2)
4.5 Fully conditional specification | 119 | (11)
| 119 | (1)
| 120 | (2)
| 122 | (2)
4.5.4 Congeniality or compatibility? | 124 | (1)
4.5.5 Model-based and data-based imputation | 125 | (1)
4.5.6 Number of iterations | 126 | (1)
4.5.7 Example of slow convergence | 126 | (3)
| 129 | (1)
| 130 | (5)
4.6.1 Relations between FCS and JM | 130 | (1)
| 130 | (1)
| 131 | (4)
| 135 | (2)
4.7.1 Skipping imputations and overimputation | 135 | (1)
4.7.2 Blocks of variables, hybrid imputation | 135 | (1)
4.7.3 Blocks of units, monotone blocks | 136 | (1)
| 136 | (1)
| 137 | (1)
| 137 | (2)
|
5 Analysis of imputed data | 139 | (22)
| 139 | (6)
5.1.1 Recommended workflows | 140 | (2)
5.1.2 Not recommended workflow: Averaging the data | 142 | (2)
5.1.3 Not recommended workflow: Stack imputed data | 144 | (1)
| 144 | (1)
| 145 | (2)
5.2.1 Scalar inference of normal quantities | 145 | (1)
5.2.2 Scalar inference of non-normal quantities | 146 | (1)
5.3 Multi-parameter inference | 147 | (6)
5.3.1 D1 Multivariate Wald test | 147 | (2)
5.3.2 D2 Combining test statistics | 149 | (1)
5.3.3 D3 Likelihood ratio test | 150 | (2)
| 152 | (1)
5.4 Stepwise model selection | 153 | (4)
5.4.1 Variable selection techniques | 153 | (1)
| 154 | (1)
| 155 | (2)
| 157 | (1)
| 158 | (1)
| 158 | (3)
II Advanced techniques | 161 | (96)
| 163 | (34)
6.1 Overview of modeling choices | 163 | (2)
6.2 Ignorable or nonignorable? | 165 | (1)
6.3 Model form and predictors | 166 | (4)
| 166 | (1)
| 167 | (3)
| 170 | (14)
6.4.1 Ratio of two variables | 170 | (5)
| 175 | (1)
6.4.3 Quadratic relations | 176 | (1)
| 177 | (4)
| 181 | (1)
6.4.6 Conditional imputation | 182 | (2)
| 184 | (5)
| 184 | (3)
| 187 | (2)
| 189 | (5)
6.6.1 Model fit versus distributional discrepancy | 190 | (1)
| 190 | (4)
| 194 | (1)
| 195 | (2)
|
7 Multilevel multiple imputation | 197 | (44)
| 197 | (1)
7.2 Notation for multilevel models | 197 | (3)
7.3 Missing values in multilevel data | 200 | (4)
7.3.1 Practical issues in multilevel imputation | 201 | (1)
7.3.2 Ad-hoc solutions for multilevel data | 202 | (1)
7.3.3 Likelihood solutions | 203 | (1)
7.4 Multilevel imputation by joint modeling | 204 | (1)
7.5 Multilevel imputation by fully conditional specification | 205 | (2)
7.5.1 Add cluster means of predictors | 206 | (1)
7.5.2 Model cluster heterogeneity | 207 | (1)
| 207 | (7)
| 208 | (1)
| 209 | (1)
| 209 | (5)
| 214 | (4)
| 214 | (1)
| 215 | (3)
7.8 Imputation of level-2 variable | 218 | (1)
| 219 | (1)
7.10 Guidelines and advice | 220 | (20)
7.10.1 Intercept-only model, missing outcomes | 222 | (1)
7.10.2 Random intercepts, missing level-1 predictor | 222 | (2)
7.10.3 Random intercepts, contextual model | 224 | (2)
7.10.4 Random intercepts, missing level-2 predictor | 226 | (2)
7.10.5 Random intercepts, interactions | 228 | (4)
7.10.6 Random slopes, missing outcomes and predictors | 232 | (2)
7.10.7 Random slopes, interactions | 234 | (4)
| 238 | (2)
| 240 | (1)
|
8 Individual causal effects | 241 | (16)
8.1 Need for individual causal effects | 241 | (2)
8.2 Problem of causal inference | 243 | (2)
| 245 | (1)
8.4 Generating imputations by FCS | 246 | (8)
| 246 | (1)
8.4.2 FCS with a prior for p | 247 | (6)
| 253 | (1)
| 254 | (3)
III Case studies | 257 | (80)
| 259 | (36)
| 259 | (12)
9.1.1 Scientific question | 260 | (1)
| 260 | (1)
| 261 | (2)
| 263 | (2)
9.1.5 Finding problems: loggedEvents | 265 | (2)
9.1.6 Quick predictor selection: quickpred | 267 | (1)
9.1.7 Generating the imputations | 268 | (2)
9.1.8 A further improvement: Survival as predictor variable | 270 | (1)
| 270 | (1)
| 271 | (6)
9.2.1 Causes and consequences of missing data | 272 | (2)
| 274 | (1)
9.2.3 Generating imputations under the δ-adjustment | 274 | (1)
9.2.4 Complete-data model | 275 | (2)
| 277 | (1)
9.3 Correct prevalence estimates from self-reported data | 277 | (6)
9.3.1 Description of the problem | 277 | (1)
9.3.2 Don't count on predictions | 278 | (2)
| 280 | (1)
| 281 | (1)
| 281 | (2)
| 283 | (1)
9.4 Enhancing comparability | 283 | (11)
9.4.1 Description of the problem | 283 | (1)
9.4.2 Full dependence: Simple equating | 284 | (2)
9.4.3 Independence: Imputation without a bridge study | 286 | (2)
9.4.4 Fully dependent or independent? | 288 | (1)
9.4.5 Imputation using a bridge study | 289 | (3)
| 292 | (1)
| 293 | (1)
| 294 | (1)
|
|
| 295 | (16)
10.1 Correcting for selective drop-out | 295 | (7)
10.1.1 POPS study: 19 years follow-up | 295 | (1)
10.1.2 Characterization of the drop-out | 296 | (1)
| 296 | (3)
10.1.4 A solution "that does not look good" | 299 | (2)
| 301 | (1)
| 302 | (1)
10.2 Correcting for nonresponse | 302 | (7)
10.2.1 Fifth Dutch Growth Study | 303 | (1)
| 303 | (1)
10.2.3 Comparison to known population totals | 304 | (1)
10.2.4 Augmenting the sample | 304 | (2)
| 306 | (1)
10.2.6 Influence of nonresponse on final height | 307 | (1)
| 308 | (1)
| 309 | (2)
|
|
| 311 | (26)
11.1 Long and wide format | 311 | (2)
11.2 SE Fireworks Disaster Study | 313 | (7)
11.2.1 Intention to treat | 314 | (1)
| 315 | (2)
11.2.3 Inspecting imputations | 317 | (1)
11.2.4 Complete-data model | 318 | (1)
11.2.5 Results from the complete-data model | 319 | (1)
11.3 Time raster imputation | 320 | (12)
| 321 | (1)
11.3.2 Scientific question: Critical periods | 322 | (2)
11.3.3 Broken stick model | 324 | (2)
11.3.4 Terneuzen Birth Cohort | 326 | (2)
11.3.5 Shrinkage and the change score | 328 | (1)
| 328 | (2)
11.3.7 Complete-data model | 330 | (2)
| 332 | (2)
| 334 | (3)
IV Extensions | 337 | (14)
| 339 | (12)
12.1 Some dangers, some do's and some don'ts | 339 | (3)
| 339 | (1)
| 340 | (1)
| 341 | (1)
| 342 | (3)
12.2.1 Reporting guidelines | 343 | (1)
| 344 | (1)
| 345 | (2)
12.3.1 Synthetic datasets for data protection | 345 | (1)
12.3.2 Analysis of coarsened data | 345 | (1)
12.3.3 File matching of multiple datasets | 346 | (1)
12.3.4 Planned missing data for efficient designs | 346 | (1)
12.3.5 Adjusting for verification bias | 347 | (1)
| 347 | (2)
| 347 | (1)
12.4.2 Algorithms for blocks and batches | 347 | (1)
| 348 | (1)
12.4.4 Better trials with dynamic treatment regimes | 348 | (1)
12.4.5 Distribution-free pooling rules | 348 | (1)
12.4.6 Improved diagnostic techniques | 349 | (1)
12.4.7 Building block in modular statistics | 349 | (1)
| 349 | (2)
References | 351 | (42)
Author index | 393 | (12)
Subject index | 405