E-book: R in Action, Third Edition: Data analysis and graphics with R and Tidyverse

  • Format: 656 pages
  • Publication date: 28-Jun-2022
  • Publisher: Manning Publications
  • Language: English
  • ISBN-13: 9781638357018
  • Format: EPUB+DRM
  • Price: 49,74 €*
  • * This is the final price, i.e., no additional discounts apply.
  • This e-book is intended for personal use only. E-books cannot be returned, and money paid for purchased e-books is not refunded.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital Rights Management (DRM)
    The publisher has supplied this book in encrypted form, which means that you need to install free software to unlock and read it. To read this e-book, you need to create an Adobe ID. The e-book can be read and downloaded on up to 6 devices (by a single user with the same Adobe ID).

    Required software
    To read this e-book on a mobile device (phone or tablet), you need to install this free app: PocketBook Reader (iOS / Android)

    To download and read this e-book on a PC or Mac, you need Adobe Digital Editions (a free app designed specifically for e-books; it is not the same as Adobe Reader, which you probably already have on your computer).

    You cannot read this e-book on an Amazon Kindle.

Built specifically for statistical computing and graphics, the R language, along with its amazing collection of libraries and tools, is one of the most powerful tools you can use to tackle data analysis for business, research, and other data-intensive domains. This revised and expanded third edition of R in Action covers the new tidyverse approach to data analysis and R's state-of-the-art graphing capabilities with the ggplot2 package.

R in Action, Third Edition teaches you to use the R language, including the popular tidyverse packages, through hands-on examples relevant to scientific, technical, and business developers. Focusing on practical solutions to real-world data challenges, R expert Rob Kabacoff takes you on a crash course in statistics, from dealing with messy and incomplete data to creating stunning visualisations.

The R language is the most powerful platform you can choose for modern data analysis. R is free and open source, and its community has created thousands of modules to tackle challenges from data-crunching to presentation. R's graphical capabilities are also state-of-the-art, with a comprehensive and powerful feature set available for data visualization. R runs on all major operating systems and is used by businesses, researchers, and organizations worldwide.

Reviews

Read it and master the invaluable art of solving data analysis problems efficiently: a must! Alain Lompo

Excellent primer for starting R. Martin Perry

The book gives an amazing introduction to R and the applicable methods for machine learning and statistics. Nicole Koenigstein

Amusing writing style and great material in general, great book for those who are beginning in Statistics programming. Luis Felipe Medeiro Alves

This is an awesome book on R. Tiklu Ganguly

The definitive guide to bring you from beginner to advanced with R. Jean-François Morin

A clear and comprehensive guide to using R for real work. I was able to get an R environment up and running with minimal difficulty! Jim Frohnhofer

Preface
Acknowledgments
About this book
About the author
About the cover illustration
Part 1 Getting started
1 Introduction to R
1.1 Why use R?
1.2 Obtaining and installing R
1.3 Working with R
    Getting started
    Using RStudio
    Getting help
    The workspace
    Projects
1.4 Packages
    What are packages?
    Installing a package
    Loading a package
    Learning about a package
1.5 Using output as input: Reusing results
1.6 Working with large datasets
1.7 Working through an example
2 Creating a dataset
2.1 Understanding datasets
2.2 Data structures
    Vectors
    Matrices
    Arrays
    Data frames
    Factors
    Lists
    Tibbles
2.3 Data input
    Entering data from the keyboard
    Importing data from a delimited text file
    Importing data from Excel
    Importing data from JSON
    Importing data from the web
    Importing data from SPSS
    Importing data from SAS
    Importing data from Stata
    Accessing database management systems
    Importing data via Stat/Transfer
2.4 Annotating datasets
    Variable labels
    Value labels
2.5 Useful functions for working with data objects
3 Basic data management
3.1 A working example
3.2 Creating new variables
3.3 Recoding variables
3.4 Renaming variables
3.5 Missing values
    Recoding values to missing
    Excluding missing values from analyses
3.6 Date values
    Converting dates to character variables
    Going further
3.7 Type conversions
3.8 Sorting data
3.9 Merging datasets
    Adding columns to a data frame
    Adding rows to a data frame
3.10 Subsetting datasets
    Selecting variables
    Dropping variables
    Selecting observations
    The subset() function
    Random samples
3.11 Using dplyr to manipulate data frames
    Basic dplyr functions
    Using pipe operators to chain statements
3.12 Using SQL statements to manipulate data frames
4 Getting started with graphs
4.1 Creating a graph with ggplot2
    ggplot
    Geoms
    Grouping
    Scales
    Facets
    Labels
    Themes
4.2 ggplot2 details
    Placing the data and mapping options
    Graphs as objects
    Saving graphs
    Common mistakes
5 Advanced data management
5.1 A data management challenge
5.2 Numerical and character functions
    Mathematical functions
    Statistical functions
    Probability functions
    Character functions
    Other useful functions
    Applying functions to matrices and data frames
    A solution for the data management challenge
5.3 Control flow
    Repetition and looping
    Conditional execution
5.4 User-written functions
5.5 Reshaping data
    Transposing
    Converting from wide to long dataset formats
5.6 Aggregating data
Part 2 Basic methods
6 Basic graphs
6.1 Bar charts
    Simple bar charts
    Stacked, grouped, and filled bar charts
    Mean bar charts
    Tweaking bar charts
6.2 Pie charts
6.3 Tree maps
6.4 Histograms
6.5 Kernel density plots
6.6 Box plots
    Using parallel box plots to compare groups
    Violin plots
6.7 Dot plots
7 Basic statistics
7.1 Descriptive statistics
    A menagerie of methods
    Even more methods
    Descriptive statistics by group
    Summarizing data interactively with dplyr
    Visualizing results
7.2 Frequency and contingency tables
    Generating frequency tables
    Tests of independence
    Measures of association
    Visualizing results
7.3 Correlations
    Types of correlations
    Testing correlations for significance
    Visualizing correlations
7.4 T-tests
    Independent t-test
    Dependent t-test
    When there are more than two groups
7.5 Nonparametric tests of group differences
    Comparing two groups
    Comparing more than two groups
7.6 Visualizing group differences
Part 3 Intermediate methods
8 Regression
8.1 The many faces of regression
    Scenarios for using OLS regression
    What you need to know
8.2 OLS regression
    Fitting regression models with lm()
    Simple linear regression
    Polynomial regression
    Multiple linear regression
    Multiple linear regression with interactions
8.3 Regression diagnostics
    A typical approach
    An enhanced approach
    Multicollinearity
8.4 Unusual observations
    Outliers
    High-leverage points
    Influential observations
8.5 Corrective measures
    Deleting observations
    Transforming variables
    Adding or deleting variables
    Trying a different approach
8.6 Selecting the "best" regression model
    Comparing models
    Variable selection
8.7 Taking the analysis further
    Cross-validation
    Relative importance
9 Analysis of variance
9.1 A crash course on terminology
9.2 Fitting ANOVA models
    The aov() function
    The order of formula terms
9.3 One-way ANOVA
    Multiple comparisons
    Assessing test assumptions
9.4 One-way ANCOVA
    Assessing test assumptions
    Visualizing the results
9.5 Two-way factorial ANOVA
9.6 Repeated measures ANOVA
9.7 Multivariate analysis of variance (MANOVA)
    Assessing test assumptions
    Robust MANOVA
9.8 ANOVA as regression
10 Power analysis
10.1 A quick review of hypothesis testing
10.2 Implementing power analysis with the pwr package
    T-tests
    ANOVA
    Correlations
    Linear models
    Tests of proportions
    Chi-square tests
    Choosing an appropriate effect size in novel situations
10.3 Creating power analysis plots
10.4 Other packages
11 Intermediate graphs
11.1 Scatter plots
    Scatter plot matrices
    High-density scatter plots
    3D scatter plots
    Spinning 3D scatter plots
    Bubble plots
11.2 Line charts
11.3 Corrgrams
11.4 Mosaic plots
12 Resampling statistics and bootstrapping
12.1 Permutation tests
12.2 Permutation tests with the coin package
    Independent two-sample and k-sample tests
    Independence in contingency tables
    Independence between numeric variables
    Dependent two-sample and k-sample tests
    Going further
12.3 Permutation tests with the lmPerm package
    Simple and polynomial regression
    Multiple regression
    One-way ANOVA and ANCOVA
    Two-way ANOVA
12.4 Additional comments on permutation tests
12.5 Bootstrapping
12.6 Bootstrapping with the boot package
    Bootstrapping a single statistic
    Bootstrapping several statistics
Part 4 Advanced methods
13 Generalized linear models
13.1 Generalized linear models and the glm() function
    The glm() function
    Supporting functions
    Model fit and regression diagnostics
13.2 Logistic regression
    Interpreting the model parameters
    Assessing the impact of predictors on the probability of an outcome
    Overdispersion
    Extensions
13.3 Poisson regression
    Interpreting the model parameters
    Overdispersion
    Extensions
14 Principal components and factor analysis
14.1 Principal components and factor analysis in R
14.2 Principal components
    Selecting the number of components to extract
    Extracting principal components
    Rotating principal components
    Obtaining principal component scores
14.3 Exploratory factor analysis
    Deciding how many common factors to extract
    Extracting common factors
    Rotating factors
    Factor scores
    Other EFA-related packages
14.4 Other latent variable models
15 Time series
15.1 Creating a time-series object in R
15.2 Smoothing and seasonal decomposition
    Smoothing with simple moving averages
    Seasonal decomposition
15.3 Exponential forecasting models
    Simple exponential smoothing
    Holt and Holt-Winters exponential smoothing
    The ets() function and automated forecasting
15.4 ARIMA forecasting models
    Prerequisite concepts
    ARMA and ARIMA models
    Automated ARIMA forecasting
15.5 Going further
16 Cluster analysis
16.1 Common steps in cluster analysis
16.2 Calculating distances
16.3 Hierarchical cluster analysis
16.4 Partitioning-cluster analysis
    K-means clustering
    Partitioning around medoids
16.5 Avoiding nonexistent clusters
16.6 Going further
17 Classification
17.1 Preparing the data
17.2 Logistic regression
17.3 Decision trees
    Classical decision trees
    Conditional inference trees
17.4 Random forests
17.5 Support vector machines
    Tuning an SVM
17.6 Choosing a best predictive solution
17.7 Understanding black box predictions
    Break-down plots
    Plotting Shapley values
17.8 Going further
18 Advanced methods for missing data
18.1 Steps in dealing with missing data
18.2 Identifying missing values
18.3 Exploring missing-values patterns
    Visualizing missing values
    Using correlations to explore missing values
18.4 Understanding the sources and impact of missing data
18.5 Rational approaches for dealing with incomplete data
18.6 Deleting missing data
    Complete-case analysis (listwise deletion)
    Available case analysis (pairwise deletion)
18.7 Single imputation
    Simple imputation
    K-nearest neighbor imputation
    missForest
18.8 Multiple imputation
18.9 Other approaches to missing data
Part 5 Expanding your skills
19 Advanced graphs
19.1 Modifying scales
    Customizing axes
    Customizing colors
19.2 Modifying themes
    Prepackaged themes
    Customizing fonts
    Customizing legends
    Customizing the plot area
19.3 Adding annotations
19.4 Combining graphs
19.5 Making graphs interactive
20 Advanced programming
20.1 A review of the language
    Data types
    Control structures
    Creating functions
20.2 Working with environments
20.3 Non-standard evaluation
20.4 Object-oriented programming
    Generic functions
    Limitations of the S3 model
20.5 Writing efficient code
    Efficient data input
    Vectorization
    Correctly sizing objects
    Parallelization
20.6 Debugging
    Common sources of errors
    Debugging tools
    Session options that support debugging
    Using RStudio's visual debugger
20.7 Going further
21 Creating dynamic reports
21.1 A template approach to reports
21.2 Creating a report with R and R Markdown
21.3 Creating a report with R and LaTeX
    Creating a parameterized report
21.4 Avoiding common R Markdown problems
21.5 Going further
22 Creating a package
22.1 The edatools package
22.2 Creating a package
    Installing development tools
    Creating a package project
    Writing the package functions
    Adding function documentation
    Adding a general help file (optional)
    Adding sample data to the package (optional)
    Adding a vignette (optional)
    Editing the DESCRIPTION file
    Building and installing the package
22.3 Sharing your package
    Distributing a source package file
    Submitting to CRAN
    Hosting on GitHub
    Creating a package website
22.4 Going further
Afterword Into the rabbit hole
Appendix A Graphical user interfaces
Appendix B Customizing the startup environment
Appendix C Exporting data from R
Appendix D Matrix algebra in R
Appendix E Packages used in this book
Appendix F Working with large datasets
Appendix G Updating an R installation
References
Index

R in Action, Third Edition teaches you to use the R language, including the popular tidyverse packages, through hands-on examples relevant to scientific, technical, and business developers. Focusing on practical solutions to real-world data challenges, R expert Rob Kabacoff takes you on a crash course in statistics, from dealing with messy and incomplete data to creating stunning visualisations. In this revised and expanded third edition, new coverage has been added for R's state-of-the-art graphing capabilities with the ggplot2 package.