Graphics of Large Datasets: Visualizing a Million 2006 ed. [Hardback]

3.40/5 (10 ratings by Goodreads)
  • Format: Hardback, 271 pages, height x width: 235x155 mm, weight: 600 g, XIII, 271 p., 1 Hardback
  • Series: Statistics and Computing
  • Publication date: 24-Jul-2006
  • Publisher: Springer-Verlag New York Inc.
  • ISBN-10: 0387329064
  • ISBN-13: 9780387329062
  • Price: 91,53 €*
  • * this is the final price, i.e., no additional discounts are applied
  • Standard price: 107,69 €
  • Save 15%
  • Delivery time is 3-4 weeks if the book is in stock at the publisher's warehouse. If the publisher needs to print a new run, delivery may be delayed.
  • Delivery time: 4-6 weeks
Graphics are great for exploring data, but how can they be used for looking at the large datasets that are commonplace today? This book shows how to look at ways of visualizing large datasets, whether large in numbers of cases, large in numbers of variables, or large in both. Data visualization is useful for data cleaning, exploring data, identifying trends and clusters, spotting local patterns, evaluating modeling output, and presenting results. It is essential for exploratory data analysis and data mining. Data analysts, statisticians, computer scientists, and indeed anyone who has to explore a large dataset of their own should benefit from reading this book.

New approaches to graphics are needed to visualize the information in large datasets, and most of the innovations described in this book are developments of standard graphics. There are considerable advantages in extending displays which are well-known and well-tried, both in understanding how best to make use of them in your work and in presenting results to others. It should also make the book readily accessible for readers who already have a little experience of drawing statistical graphics. All ideas are illustrated with displays from analyses of real datasets, and the authors emphasize the importance of interpreting displays effectively. Graphics should be drawn to convey information, and the book includes many insightful examples.

Reviews

From the reviews:



"Anyone interested in modern techniques for visualizing data will be well rewarded by reading this book. There is a wealth of important plotting types and techniques." Paul Murrell for the Journal of Statistical Software, December 2006



"This fascinating book looks at the question of visualizing large datasets from many different perspectives. Different authors are responsible for different chapters and this approach works well in giving the reader alternative viewpoints of the same problem. Interestingly the authors have cleverly chosen a definition of 'large dataset'. Essentially they focus on datasets with the order of a million cases. As the authors point out there are now many examples of much larger datasets but by limiting to ones that can be loaded in their entirety in standard statistical software they end up with a book that has great utility to the practitioner rather than just the theorist. Another very attractive feature of the book is the many colour plates, showing clearly what can now routinely be seen on the computer screen. The interactive nature of data analysis with large datasets is hard to reproduce in a book but the authors make an excellent attempt to do just this." P. Marriott for the Short Book Reviews of the ISI



"The intended readership of this book is anyone who is willing to explore a large dataset of their own as well as statisticians, computer scientists, and data analysts. This is a valuable book for all the researchers who need practical guidance to explore their large datasets by the help of a variety of visualization methods. We recommend this book for everyone interested in discovering structures hidden in the large datasets by a variety of state-of-the-art visualization techniques." (Tulay Koru-Sengul, Technometrics, Vol. 49 (3), August, 2007)



"The chief attractions of the book are its focus and clarity while dealing with a diverse range of topics and its simplicity of presentation. ... Thus, we can safely say that we recommend the book to anyone interested in the field of data visualization." (Rajesh Natarajan, Lev: Book Reviews, Interfaces 37(5), 2007)



"Data visualization is useful for data cleaning, exploring data, identifying trends and clusters, spotting local patterns, evaluating modelling output, and presenting results. It is also essential for exploratory data analysis and data mining. Given its subject matter, the book is well addressed to data analysts, statisticians, computer scientists, and anyone who has to explore a large dataset." (Christina Diakaki, Zentralblatt MATH, Vol. 1118 (20), 2007)



"I found the book informative, easy to read, and on target addressing an important and evolving need - the visualization of large datasets. The book contains many important points supported by examples, datasets, software, and websites, which are accessible both intellectually and electronically." (Thomas Bradstreet, The American Statistician, Vol. 62 (2), 2008)



"The book has a Web site with files and code-settings for figures, links to software, and the most important datasets used. The style is very clear and makes pleasant reading. This book is full of inspiring ideas for all practicing statisticians, even for those who do not usually deal with large datasets." (Ricardo Maronna, Statistical Papers, Vol. 50, 2009)

1 Introduction
1.1 Introduction
1.2 Data Visualization
1.3 Research Literature
1.4 How Large Is a Large Dataset?
1.5 The Effects of Largeness
1.5.1 Storage
1.5.2 Quality
1.5.3 Complexity
1.5.4 Speed
1.5.5 Analyses
1.5.6 Displays
1.5.7 Graphical Formats
1.6 What Is in This Book
1.7 Software
1.8 What Is on the Website
1.8.1 Files and Code for Figures
1.8.2 Links to Software
1.8.3 Datasets
1.9 Contributing Authors
Part I Basics
2 Statistical Graphics
2.1 Introduction
2.2 Plots for Categorical Data
2.2.1 Barcharts and Spineplots for Univariate Categorical Data
2.2.2 Mosaic Plots for Multi-dimensional Categorical Data
2.3 Plots for Continuous Data
2.3.1 Dotplots, Boxplots, and Histograms
2.3.2 Scatterplots, Parallel Coordinates, and the Grand Tour
2.4 Data on Mixed Scales
2.5 Maps
2.6 Contour Plots and Image Maps
2.7 Time Series Plots
2.8 Structure Plots
3 Scaling Up Graphics
3.1 Introduction
3.2 Upscaling as a General Problem in Statistics
3.3 Area Plots
3.3.1 Histograms
3.3.2 Barcharts
3.3.3 Mosaic Plots
3.4 Point Plots
3.4.1 Boxplots
3.4.2 Scatterplots
3.4.3 Parallel Coordinates
3.5 From Areas to Points and Back
3.5.1 α-Blending and Tonal Highlighting
3.6 Modifying Plots
3.7 Summary
4 Interacting with Graphics
4.1 Introduction
4.2 Interaction
4.3 Interaction and Data Displays
4.3.1 Querying
4.3.2 Selection and Linking
4.3.3 Selection Sequences
4.3.4 Varying Plot Characteristics
4.3.5 Interfaces and Interaction
4.3.6 Degrees of Linking
4.3.7 Warnings and Redmarking
4.4 Interaction and Large Datasets
4.4.1 Querying
4.4.2 Selection, Linking, and Highlighting
4.4.3 Varying Plot Characteristics for Large Datasets
4.5 New Interactive Tasks
4.5.1 Subsetting
4.5.2 Aggregation and Recoding
4.5.3 Transformations
4.5.4 Weighting
4.5.5 Managing Screen Layout
4.6 Summary and Future Directions
Part II Applications
5 Multivariate Categorical Data: Mosaic Plots
5.1 Introduction
5.2 Area-based Displays
5.2.1 Weighted Displays and Weights in Datasets
5.3 Displays and Techniques in One Dimension
5.3.1 Sorting and Reordering
5.3.2 Grouping, Averaging, and Zooming
5.4 Mosaic Plots
5.4.1 Combinatorics of Mosaic Plots
5.4.2 Cases per Pixel and Pixels per Case
5.4.3 Calibrating the Eye
5.4.4 Gray-shading
5.4.5 Rescaling Binsizes
5.4.6 Rankings
5.5 Summary
6 Rotating Plots
6.1 Introduction
6.1.1 Type of Data
6.1.2 Visual Methods for Continuous Variables
6.1.3 Scaling Up Multiple Views for Larger Datasets
6.2 Beginning to Work with a Million Cases
6.2.1 What Happens in GGobi, a Real-time System?
6.2.2 Reducing the Number of Cases
6.2.3 Density Estimation
6.2.4 Screen Real Estate Indexing
6.3 Software System
6.4 Application
6.4.1 Data Description
6.4.2 Viewing a Tour of the Data
6.4.3 Scatterplot Matrix
6.5 Current and Future Developments
6.5.1 Improving the Methods
6.5.2 Software
6.5.3 How Might These Tools Be Used?
7 Multivariate Continuous Data: Parallel Coordinates
7.1 Introduction
7.2 Interpolations and Inner Products
7.3 Generalized Parallel Coordinate Geometry
7.4 A New Family of Smooth Plots
7.5 Examples
7.5.1 Automobile Data
7.5.2 Hyperspectral Data: Dealing with Massive Datasets
7.6 Detecting Second-Order Structures
7.7 Summary
8 Networks
8.1 Introduction
8.2 Layout Algorithms
8.2.1 Simple Tree Layout
8.2.2 Force Layout Methods
8.2.3 Individual Node Movement Algorithms
8.3 Interactivity
8.3.1 Speed Considerations
8.3.2 Interaction and Layout
8.4 NicheWorks
8.5 Example: International Calling Fraud
8.6 Languages for Description and Layouts
8.6.1 Defining a Graph
8.6.2 Graph Specification via VizML
8.7 Summary
9 Trees
9.1 Introduction
9.2 Growing Trees for Large Datasets
9.2.1 Scalability of the CART Growing Algorithm
9.2.2 Scalability of Pruning Methods
9.2.3 Statistical Tests and Large Datasets
9.2.4 Using Trees for Large Datasets in Practice
9.3 Visualization of Large Trees
9.3.1 Hierarchical Plots
9.3.2 Sectioned Scatterplots
9.3.3 Recursive Plots
9.4 Forests for Large Datasets
9.5 Summary
10 Transactions
10.1 Introduction and Background
10.2 Mice and Elephant Plots and Random Sampling
10.3 Biased Sampling
10.3.1 Windowed Biased Sampling
10.3.2 Box-Cox Biased Sampling
10.4 Quantile Window Sampling
10.5 Commonality of Flow Rates
11 Graphics of a Large Dataset
11.1 Introduction
11.2 QuickStart Guide Data Visualization for Large Datasets
11.3 Visualizing the InfoVis 2005 Contest Dataset
11.3.1 Preliminaries
11.3.2 Variables
11.3.3 First Analyses
11.3.4 Multivariate Displays
11.3.5 Grouping and Selection
11.3.6 Special Features
11.3.7 Presenting Results
11.3.8 Summary
References
Authors
Index