Preface |
|
xix | |
Acknowledgments |
|
xxi | |
About this book |
|
xxiii | |
About the author |
|
xxx | |
About the cover illustration |
|
xxxi | |
|
|
1 | (114) |
|
|
3 | (17) |
|
|
5 | (2) |
|
1.2 Obtaining and installing R |
|
|
7 | (1) |
|
|
7 | (8) |
|
|
8 | (2) |
|
|
10 | (2) |
|
|
12 | (1) |
|
|
13 | (1) |
|
|
14 | (1) |
|
|
15 | (2) |
|
|
15 | (1) |
|
|
15 | (1) |
|
|
16 | (1) |
|
|
16 | (1) |
|
1.5 Using output as input: Reusing results |
|
|
17 | (1) |
|
1.6 Working with large datasets |
|
|
18 | (1) |
|
1.7 Working through an example |
|
|
18 | (2) |
|
|
20 | (26) |
|
2.1 Understanding datasets |
|
|
21 | (1) |
|
|
22 | (11) |
|
|
23 | (1) |
|
|
23 | (2) |
|
|
25 | (1) |
|
|
26 | (2) |
|
|
28 | (2) |
|
|
30 | (1) |
|
|
31 | (2) |
|
|
33 | (10) |
|
Entering data from the keyboard |
|
|
34 | (1) |
|
Importing data from a delimited text file |
|
|
35 | (4) |
|
Importing data from Excel |
|
|
39 | (1) |
|
|
39 | (1) |
|
Importing data from the web |
|
|
39 | (1) |
|
|
40 | (1) |
|
|
40 | (1) |
|
Importing data from Stata |
|
|
41 | (1) |
|
Accessing database management systems |
|
|
41 | (1) |
|
Importing data via Stat/Transfer |
|
|
42 | (1) |
|
|
43 | (1) |
|
|
43 | (1) |
|
|
44 | (1) |
|
2.5 Useful functions for working with data objects |
|
|
44 | (2) |
|
|
46 | (22) |
|
|
47 | (1) |
|
3.2 Creating new variables |
|
|
48 | (2) |
|
|
50 | (1) |
|
|
51 | (1) |
|
|
52 | (2) |
|
Recoding values to missing |
|
|
53 | (1) |
|
Excluding missing values from analyses |
|
|
53 | (1) |
|
|
54 | (2) |
|
Converting dates to character variables |
|
|
56 | (1) |
|
|
56 | (1) |
|
|
56 | (1) |
|
|
57 | (1) |
|
|
58 | (1) |
|
Adding columns to a data frame |
|
|
58 | (1) |
|
Adding rows to a data frame |
|
|
58 | (1) |
|
|
59 | (3) |
|
|
59 | (1) |
|
|
59 | (1) |
|
|
60 | (1) |
|
|
61 | (1) |
|
|
62 | (1) |
|
3.11 Using dplyr to manipulate data frames |
|
|
62 | (4) |
|
|
62 | (3) |
|
Using pipe operators to chain statements |
|
|
65 | (1) |
|
3.12 Using SQL statements to manipulate data frames |
|
|
66 | (2) |
|
4 Getting started with graphs |
|
|
68 | (20) |
|
4.1 Creating a graph with ggplot2 |
|
|
69 | (13) |
|
|
69 | (1) |
|
|
70 | (4) |
|
|
74 | (2) |
|
|
76 | (2) |
|
|
78 | (2) |
|
|
80 | (1) |
|
|
80 | (2) |
|
|
82 | (6) |
|
Placing the data and mapping options |
|
|
82 | (2) |
|
|
84 | (1) |
|
|
85 | (1) |
|
|
86 | (2) |
|
5 Advanced data management |
|
|
88 | (27) |
|
5.1 A data management challenge |
|
|
89 | (1) |
|
5.2 Numerical and character functions |
|
|
90 | (14) |
|
|
90 | (1) |
|
|
91 | (2) |
|
|
93 | (3) |
|
|
96 | (2) |
|
|
98 | (1) |
|
Applying functions to matrices and data frames |
|
|
99 | (1) |
|
A solution for the data management challenge |
|
|
100 | (4) |
|
|
104 | (2) |
|
|
104 | (1) |
|
|
105 | (1) |
|
5.4 User-written functions |
|
|
106 | (3) |
|
|
109 | (3) |
|
|
109 | (1) |
|
Converting from wide to long dataset formats |
|
|
109 | (3) |
|
|
112 | (3) |
|
|
115 | (62) |
|
|
117 | (30) |
|
|
118 | (10) |
|
|
118 | (1) |
|
Stacked, grouped, and filled bar charts |
|
|
119 | (2) |
|
|
121 | (2) |
|
|
123 | (5) |
|
|
128 | (2) |
|
|
130 | (3) |
|
|
133 | (2) |
|
|
135 | (3) |
|
|
138 | (5) |
|
Using parallel box plots to compare groups |
|
|
139 | (3) |
|
|
142 | (1) |
|
|
143 | (4) |
|
|
147 | (30) |
|
7.1 Descriptive statistics |
|
|
148 | (8) |
|
|
148 | (2) |
|
|
150 | (2) |
|
Descriptive statistics by group |
|
|
152 | (2) |
|
Summarizing data interactively with dplyr |
|
|
154 | (1) |
|
|
155 | (1) |
|
7.2 Frequency and contingency tables |
|
|
156 | (8) |
|
Generating frequency tables |
|
|
156 | (6) |
|
|
162 | (1) |
|
|
163 | (1) |
|
|
164 | (1) |
|
|
164 | (5) |
|
|
165 | (2) |
|
Testing correlations for significance |
|
|
167 | (2) |
|
|
169 | (1) |
|
|
169 | (2) |
|
|
169 | (1) |
|
|
170 | (1) |
|
When there are more than two groups |
|
|
171 | (1) |
|
7.5 Nonparametric tests of group differences |
|
|
171 | (4) |
|
|
171 | (2) |
|
Comparing more than two groups |
|
|
173 | (2) |
|
7.6 Visualizing group differences |
|
|
175 | (2) |
|
Part 3 Intermediate methods |
|
|
177 | (136) |
|
|
179 | (42) |
|
8.1 The many faces of regression |
|
|
180 | (3) |
|
Scenarios for using OLS regression |
|
|
181 | (1) |
|
|
182 | (1) |
|
|
183 | (11) |
|
Fitting regression models with lm() |
|
|
184 | (1) |
|
|
185 | (3) |
|
|
188 | (2) |
|
Multiple linear regression |
|
|
190 | (2) |
|
Multiple linear regression with interactions |
|
|
192 | (2) |
|
8.3 Regression diagnostics |
|
|
194 | (9) |
|
|
195 | (2) |
|
|
197 | (5) |
|
|
202 | (1) |
|
|
203 | (4) |
|
|
203 | (1) |
|
|
203 | (1) |
|
|
204 | (3) |
|
|
207 | (4) |
|
|
208 | (1) |
|
|
208 | (2) |
|
Adding or deleting variables |
|
|
210 | (1) |
|
Trying a different approach |
|
|
210 | (1) |
|
8.6 Selecting the "best" regression model |
|
|
211 | (4) |
|
|
211 | (1) |
|
|
212 | (3) |
|
8.7 Taking the analysis further |
|
|
215 | (6) |
|
|
215 | (2) |
|
|
217 | (4) |
|
|
221 | (28) |
|
9.1 A crash course on terminology |
|
|
222 | (2) |
|
|
224 | (2) |
|
|
224 | (1) |
|
The order of formula terms |
|
|
225 | (1) |
|
|
226 | (7) |
|
|
228 | (4) |
|
Assessing test assumptions |
|
|
232 | (1) |
|
|
233 | (4) |
|
Assessing test assumptions |
|
|
235 | (1) |
|
|
236 | (1) |
|
9.5 Two-way factorial ANOVA |
|
|
237 | (2) |
|
9.6 Repeated measures ANOVA |
|
|
239 | (3) |
|
9.7 Multivariate analysis of variance (MANOVA) |
|
|
242 | (4) |
|
Assessing test assumptions |
|
|
244 | (1) |
|
|
245 | (1) |
|
|
246 | (3) |
|
|
249 | (16) |
|
10.1 A quick review of hypothesis testing |
|
|
250 | (2) |
|
10.2 Implementing power analysis with the pwr package |
|
|
252 | (10) |
|
|
253 | (2) |
|
|
255 | (1) |
|
|
255 | (1) |
|
|
256 | (1) |
|
|
257 | (1) |
|
|
258 | (1) |
|
Choosing an appropriate effect size in novel situations |
|
|
259 | (3) |
|
10.3 Creating power analysis plots |
|
|
262 | (1) |
|
|
263 | (2) |
|
|
265 | (28) |
|
|
266 | (16) |
|
|
269 | (3) |
|
High-density scatter plots |
|
|
272 | (3) |
|
|
275 | (2) |
|
Spinning 3D scatter plots |
|
|
277 | (2) |
|
|
279 | (3) |
|
|
282 | (2) |
|
|
284 | (5) |
|
|
289 | (4) |
|
12 Resampling statistics and bootstrapping |
|
|
293 | (20) |
|
|
294 | (2) |
|
12.2 Permutation tests with the coin package |
|
|
296 | (4) |
|
Independent two-sample and k-sample tests |
|
|
297 | (1) |
|
Independence in contingency tables |
|
|
298 | (1) |
|
Independence between numeric variables |
|
|
299 | (1) |
|
Dependent two-sample and k-sample tests |
|
|
300 | (1) |
|
|
300 | (1) |
|
12.3 Permutation tests with the lmPerm package |
|
|
300 | (4) |
|
Simple and polynomial regression |
|
|
301 | (1) |
|
|
302 | (1) |
|
|
303 | (1) |
|
|
304 | (1) |
|
12.4 Additional comments on permutation tests |
|
|
304 | (1) |
|
|
305 | (1) |
|
12.6 Bootstrapping with the boot package |
|
|
306 | (7) |
|
Bootstrapping a single statistic |
|
|
307 | (2) |
|
Bootstrapping several statistics |
|
|
309 | (4) |
|
|
313 | (144) |
|
13 Generalized linear models |
|
|
315 | (18) |
|
13.1 Generalized linear models and the glm() function |
|
|
316 | (4) |
|
|
317 | (1) |
|
|
318 | (1) |
|
Model fit and regression diagnostics |
|
|
319 | (1) |
|
|
320 | (6) |
|
Interpreting the model parameters |
|
|
323 | (1) |
|
Assessing the impact of predictors on the probability of an outcome |
|
|
323 | (1) |
|
|
324 | (1) |
|
|
325 | (1) |
|
|
326 | (7) |
|
Interpreting the model parameters |
|
|
328 | (1) |
|
|
329 | (2) |
|
|
331 | (2) |
|
14 Principal components and factor analysis |
|
|
333 | (22) |
|
14.1 Principal components and factor analysis in R |
|
|
335 | (1) |
|
14.2 Principal components |
|
|
336 | (9) |
|
Selecting the number of components to extract |
|
|
337 | (1) |
|
Extracting principal components |
|
|
338 | (4) |
|
Rotating principal components |
|
|
342 | (1) |
|
Obtaining principal component scores |
|
|
343 | (2) |
|
14.3 Exploratory factor analysis |
|
|
345 | (7) |
|
Deciding how many common factors to extract |
|
|
346 | (1) |
|
Extracting common factors |
|
|
347 | (1) |
|
|
348 | (4) |
|
|
352 | (1) |
|
Other EFA-related packages |
|
|
352 | (1) |
|
14.4 Other latent variable models |
|
|
352 | (3) |
|
|
355 | (31) |
|
15.1 Creating a time-series object in R |
|
|
358 | (2) |
|
15.2 Smoothing and seasonal decomposition |
|
|
360 | (8) |
|
Smoothing with simple moving averages |
|
|
360 | (2) |
|
|
362 | (6) |
|
15.3 Exponential forecasting models |
|
|
368 | (8) |
|
Simple exponential smoothing |
|
|
369 | (3) |
|
Holt and Holt-Winters exponential smoothing |
|
|
372 | (2) |
|
The ets() function and automated forecasting |
|
|
374 | (2) |
|
15.4 ARIMA forecasting models |
|
|
376 | (8) |
|
|
376 | (2) |
|
|
378 | (5) |
|
Automated ARIMA forecasting |
|
|
383 | (1) |
|
|
384 | (2) |
|
|
386 | (23) |
|
16.1 Common steps in cluster analysis |
|
|
388 | (2) |
|
16.2 Calculating distances |
|
|
390 | (1) |
|
16.3 Hierarchical cluster analysis |
|
|
391 | (5) |
|
16.4 Partitioning-cluster analysis |
|
|
396 | (8) |
|
|
396 | (7) |
|
Partitioning around medoids |
|
|
403 | (1) |
|
16.5 Avoiding nonexistent clusters |
|
|
404 | (4) |
|
|
408 | (1) |
|
|
409 | (25) |
|
|
410 | (2) |
|
|
412 | (1) |
|
|
413 | (5) |
|
|
413 | (4) |
|
Conditional inference trees |
|
|
417 | (1) |
|
|
418 | (3) |
|
17.5 Support vector machines |
|
|
421 | (4) |
|
|
423 | (2) |
|
17.6 Choosing a best predictive solution |
|
|
425 | (3) |
|
17.7 Understanding black box predictions |
|
|
428 | (4) |
|
|
428 | (3) |
|
|
431 | (1) |
|
|
432 | (2) |
|
18 Advanced methods for missing data |
|
|
434 | (23) |
|
18.1 Steps in dealing with missing data |
|
|
435 | (2) |
|
18.2 Identifying missing values |
|
|
437 | (1) |
|
18.3 Exploring missing-values patterns |
|
|
438 | (6) |
|
Visualizing missing values |
|
|
439 | (3) |
|
Using correlations to explore missing values |
|
|
442 | (2) |
|
18.4 Understanding the sources and impact of missing data |
|
|
444 | (1) |
|
18.5 Rational approaches for dealing with incomplete data |
|
|
445 | (1) |
|
18.6 Deleting missing data |
|
|
446 | (2) |
|
Complete-case analysis (listwise deletion) |
|
|
446 | (2) |
|
Available case analysis (pairwise deletion) |
|
|
448 | (1) |
|
|
448 | (3) |
|
|
449 | (1) |
|
K-nearest neighbor imputation |
|
|
449 | (1) |
|
|
450 | (1) |
|
|
451 | (4) |
|
18.9 Other approaches to missing data |
|
|
455 | (2) |
|
Part 5 Expanding your skills |
|
|
457 | (111) |
|
|
459 | (32) |
|
|
460 | (10) |
|
|
460 | (6) |
|
|
466 | (4) |
|
|
470 | (8) |
|
|
471 | (1) |
|
|
472 | (3) |
|
|
475 | (2) |
|
Customizing the plot area |
|
|
477 | (1) |
|
|
478 | (7) |
|
|
485 | (2) |
|
19.5 Making graphs interactive |
|
|
487 | (4) |
|
|
491 | (34) |
|
20.1 A review of the language |
|
|
492 | (11) |
|
|
492 | (6) |
|
|
498 | (3) |
|
|
501 | (2) |
|
20.2 Working with environments |
|
|
503 | (2) |
|
20.3 Non-standard evaluation |
|
|
505 | (3) |
|
20.4 Object-oriented programming |
|
|
508 | (2) |
|
|
508 | (2) |
|
Limitations of the S3 model |
|
|
510 | (1) |
|
20.5 Writing efficient code |
|
|
510 | (4) |
|
|
510 | (1) |
|
|
511 | (1) |
|
|
512 | (1) |
|
|
512 | (2) |
|
|
514 | (9) |
|
|
514 | (1) |
|
|
515 | (3) |
|
Session options that support debugging |
|
|
518 | (3) |
|
Using RStudio's visual debugger |
|
|
521 | (2) |
|
|
523 | (2) |
|
21 Creating dynamic reports |
|
|
525 | (18) |
|
21.1 A template approach to reports |
|
|
528 | (1) |
|
21.2 Creating a report with R and R Markdown |
|
|
529 | (5) |
|
21.3 Creating a report with R and LaTeX |
|
|
534 | (6) |
|
Creating a parameterized report |
|
|
536 | (4) |
|
21.4 Avoiding common R Markdown problems |
|
|
540 | (1) |
|
|
541 | (2) |
|
|
543 | (1) |
|
22.1 The edatools package |
|
|
544 | (2) |
|
|
546 | (1) |
|
Installing development tools |
|
|
546 | (1) |
|
Creating a package project |
|
|
547 | (1) |
|
Writing the package functions |
|
|
548 | (4) |
|
Adding function documentation |
|
|
552 | (2) |
|
Adding a general help file (optional) |
|
|
554 | (1) |
|
Adding sample data to the package (optional) |
|
|
555 | (1) |
|
Adding a vignette (optional) |
|
|
556 | (1) |
|
Editing the DESCRIPTION file |
|
|
556 | (2) |
|
Building and installing the package |
|
|
558 | (4) |
|
22.3 Sharing your package |
|
|
562 | (5) |
|
Distributing a source package file |
|
|
562 | (1) |
|
|
562 | (1) |
|
|
563 | (2) |
|
Creating a package website |
|
|
565 | (2) |
|
|
567 | (1) |
Afterword Into the rabbit hole |
|
568 | (3) |
Appendix A Graphical user interfaces |
|
571 | (3) |
Appendix B Customizing the startup environment |
|
574 | (3) |
Appendix C Exporting data from R |
|
577 | (2) |
Appendix D Matrix algebra in R |
|
579 | (2) |
Appendix E Packages used in this book |
|
581 | (6) |
Appendix F Working with large datasets |
|
587 | (5) |
Appendix G Updating an R installation |
|
592 | (3) |
References |
|
595 | (4) |
Index |
|
599 | |