
Probability and Statistics for Computer Science 1st ed. 2018 [Hardback]

  • Format: Hardback, 367 pages, height x width: 279x210 mm, weight: 1651 g; XXIV, 367 p., 124 illustrations (84 in color, 40 in black and white)
  • Publication date: 20-Feb-2018
  • Publisher: Springer International Publishing AG
  • ISBN-10: 3319644092
  • ISBN-13: 9783319644097
  • Price: 55,83 €*
  • * This is the final price, i.e., no further discounts apply.
  • List price: 65,69 €
  • Save 15%
  • Delivery usually takes 3-4 weeks if the book is in stock at the publisher's warehouse. If the publisher needs to print a new run, delivery may take longer.

This textbook is aimed at computer science undergraduates late in their sophomore or early in their junior year, supplying a comprehensive background in qualitative and quantitative data analysis, probability, random variables, and statistical methods, including machine learning.

With careful treatment of topics that fill the curricular needs for the course, Probability and Statistics for Computer Science features:

   A treatment of random variables and expectations dealing primarily with the discrete case.

   A practical treatment of simulation, showing how many interesting probabilities and expectations can be extracted, with particular emphasis on Markov chains.

   A clear and concise account of simple point inference strategies (maximum likelihood; Bayesian inference) in simple contexts. This is extended to cover some confidence intervals, samples and populations for random sampling with replacement, and the simplest hypothesis testing.

   A chapter dealing with classification, explaining why it's useful; how to train SVM classifiers with stochastic gradient descent (a brief illustrative sketch follows this list); and how to use implementations of more advanced methods such as random forests and nearest neighbors.

   A chapter dealing with regression, explaining how to set up, use, and understand linear regression and nearest neighbors regression in practical problems.

   A chapter dealing with principal components analysis, developing intuition carefully, and including numerous practical examples. There is a brief description of multivariate scaling via principal coordinate analysis.

   A chapter dealing with clustering via agglomerative methods and k-means, showing how to build vector quantized features for complex signals.
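
To give a flavor of the classification material, here is a minimal sketch of training a linear SVM with stochastic gradient descent on the regularized hinge loss. This sketch is ours, not the book's code; the toy data, step-length schedule, and regularization constant are illustrative assumptions only.

```python
# Minimal sketch: linear SVM trained by stochastic gradient descent on the
# regularized hinge loss. Toy data and all parameter choices are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Toy two-class data: 2D features, labels in {+1, -1}.
n = 200
X = np.vstack([rng.normal(loc=[+1.5, +1.5], size=(n // 2, 2)),
               rng.normal(loc=[-1.5, -1.5], size=(n // 2, 2))])
y = np.hstack([np.ones(n // 2), -np.ones(n // 2)])

lam = 0.01        # regularization constant (chosen by search in practice)
a = np.zeros(2)   # weight vector
b = 0.0           # bias

for epoch in range(50):
    eta = 1.0 / (0.1 * epoch + 1.0)       # decreasing step length
    for i in rng.permutation(n):          # one random-order pass over the data
        margin = y[i] * (X[i] @ a + b)
        if margin >= 1:                   # enough margin: only the regularizer acts
            a -= eta * (lam * a)
        else:                             # inside the margin: hinge-loss gradient
            a -= eta * (lam * a - y[i] * X[i])
            b -= eta * (-y[i])

print("training accuracy:", np.mean(np.sign(X @ a + b) == y))
```

In the book, the regularization constant is chosen by a search (see Sect. 11.4.4 in the contents below), rather than fixed up front as in this sketch.
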

Illustrated throughout, each main chapter includes many worked examples and other pedagogical elements such as boxed Procedures, Definitions, Useful Facts, and Remember This (short tips). Problems and Programming Exercises are at the end of each chapter, with a summary of what the reader should know.

Instructor resources include a full set of model solutions for all problems, and an Instructor's Manual with accompanying presentation slides.
Part I Describing Datasets
1 First Tools for Looking at Data
3(26)
1.1 Datasets
3(1)
1.2 What's Happening? Plotting Data
4(3)
1.2.1 Bar Charts
5(1)
1.2.2 Histograms
6(1)
1.2.3 How to Make Histograms
6(1)
1.2.4 Conditional Histograms
7(1)
1.3 Summarizing 1D Data
7(9)
1.3.1 The Mean
7(2)
1.3.2 Standard Deviation
9(3)
1.3.3 Computing Mean and Standard Deviation Online
12(1)
1.3.4 Variance
12(1)
1.3.5 The Median
13(1)
1.3.6 Interquartile Range
14(1)
1.3.7 Using Summaries Sensibly
15(1)
1.4 Plots and Summaries
16(4)
1.4.1 Some Properties of Histograms
16(2)
1.4.2 Standard Coordinates and Normal Data
18(2)
1.4.3 Box Plots
20(1)
1.5 Whose is Bigger? Investigating Australian Pizzas
20(4)
1.6 You Should
24(5)
1.6.1 Remember These Definitions
24(1)
1.6.2 Remember These Terms
25(1)
1.6.3 Remember These Facts
25(1)
1.6.4 Be Able to
25(4)
2 Looking at Relationships
29(24)
2.1 Plotting 2D Data
29(7)
2.1.1 Categorical Data, Counts, and Charts
29(2)
2.1.2 Series
31(2)
2.1.3 Scatter Plots for Spatial Data
33(1)
2.1.4 Exposing Relationships with Scatter Plots
34(2)
2.2 Correlation
36(9)
2.2.1 The Correlation Coefficient
39(3)
2.2.2 Using Correlation to Predict
42(2)
2.2.3 Confusion Caused by Correlation
44(1)
2.3 Sterile Males in Wild Horse Herds
45(2)
2.4 You Should
47(6)
2.4.1 Remember These Definitions
47(1)
2.4.2 Remember These Terms
47(1)
2.4.3 Remember These Facts
47(1)
2.4.4 Use These Procedures
47(1)
2.4.5 Be Able to
47(6)
Part II Probability
3 Basic Ideas in Probability
53(34)
3.1 Experiments, Outcomes and Probability
53(2)
3.1.1 Outcomes and Probability
53(2)
3.2 Events
55(6)
3.2.1 Computing Event Probabilities by Counting Outcomes
56(2)
3.2.2 The Probability of Events
58(2)
3.2.3 Computing Probabilities by Reasoning About Sets
60(1)
3.3 Independence
61(5)
3.3.1 Example: Airline Overbooking
64(2)
3.4 Conditional Probability
66(9)
3.4.1 Evaluating Conditional Probabilities
67(3)
3.4.2 Detecting Rare Events Is Hard
70(1)
3.4.3 Conditional Probability and Various Forms of Independence
71(1)
3.4.4 Warning Example: The Prosecutor's Fallacy
72(1)
3.4.5 Warning Example: The Monty Hall Problem
73(2)
3.5 Extra Worked Examples
75(5)
3.5.1 Outcomes and Probability
75(1)
3.5.2 Events
76(1)
3.5.3 Independence
77(1)
3.5.4 Conditional Probability
78(2)
3.6 You Should
80(7)
3.6.1 Remember These Definitions
80(1)
3.6.2 Remember These Terms
80(1)
3.6.3 Remember and Use These Facts
80(1)
3.6.4 Remember These Points
80(1)
3.6.5 Be Able to
81(6)
4 Random Variables and Expectations
87(28)
4.1 Random Variables
87(6)
4.1.1 Joint and Conditional Probability for Random Variables
89(2)
4.1.2 Just a Little Continuous Probability
91(2)
4.2 Expectations and Expected Values
93(6)
4.2.1 Expected Values
93(2)
4.2.2 Mean, Variance and Covariance
95(3)
4.2.3 Expectations and Statistics
98(1)
4.3 The Weak Law of Large Numbers
99(4)
4.3.1 IID Samples
99(1)
4.3.2 Two Inequalities
100(1)
4.3.3 Proving the Inequalities
100(2)
4.3.4 The Weak Law of Large Numbers
102(1)
4.4 Using the Weak Law of Large Numbers
103(5)
4.4.1 Should You Accept a Bet?
103(1)
4.4.2 Odds, Expectations and Bookmaking: A Cultural Diversion
104(1)
4.4.3 Ending a Game Early
105(1)
4.4.4 Making a Decision with Decision Trees and Expectations
105(1)
4.4.5 Utility
106(2)
4.5 You Should
108(7)
4.5.1 Remember These Definitions
108(1)
4.5.2 Remember These Terms
108(1)
4.5.3 Use and Remember These Facts
109(1)
4.5.4 Remember These Points
109(1)
4.5.5 Be Able to
109(6)
5 Useful Probability Distributions
115(26)
5.1 Discrete Distributions
115(5)
5.1.1 The Discrete Uniform Distribution
115(1)
5.1.2 Bernoulli Random Variables
116(1)
5.1.3 The Geometric Distribution
116(1)
5.1.4 The Binomial Probability Distribution
116(2)
5.1.5 Multinomial Probabilities
118(1)
5.1.6 The Poisson Distribution
118(2)
5.2 Continuous Distributions
120(3)
5.2.1 The Continuous Uniform Distribution
120(1)
5.2.2 The Beta Distribution
120(1)
5.2.3 The Gamma Distribution
121(1)
5.2.4 The Exponential Distribution
122(1)
5.3 The Normal Distribution
123(3)
5.3.1 The Standard Normal Distribution
123(1)
5.3.2 The Normal Distribution
124(1)
5.3.3 Properties of the Normal Distribution
124(2)
5.4 Approximating Binomials with Large N
126(4)
5.4.1 Large N
127(1)
5.4.2 Getting Normal
128(1)
5.4.3 Using a Normal Approximation to the Binomial Distribution
129(1)
5.5 You Should
130(11)
5.5.1 Remember These Definitions
130(1)
5.5.2 Remember These Terms
130(1)
5.5.3 Remember These Facts
131(1)
5.5.4 Remember These Points
131(10)
Part III Inference
6 Samples and Populations
141(18)
6.1 The Sample Mean
141(5)
6.1.1 The Sample Mean Is an Estimate of the Population Mean
141(1)
6.1.2 The Variance of the Sample Mean
142(2)
6.1.3 When The Urn Model Works
144(1)
6.1.4 Distributions Are Like Populations
145(1)
6.2 Confidence Intervals
146(8)
6.2.1 Constructing Confidence Intervals
146(1)
6.2.2 Estimating the Variance of the Sample Mean
146(2)
6.2.3 The Probability Distribution of the Sample Mean
148(1)
6.2.4 Confidence Intervals for Population Means
149(3)
6.2.5 Standard Error Estimates from Simulation
152(2)
6.3 You Should
154(5)
6.3.1 Remember These Definitions
154(1)
6.3.2 Remember These Terms
154(1)
6.3.3 Remember These Facts
154(1)
6.3.4 Use These Procedures
154(1)
6.3.5 Be Able to
154(5)
7 The Significance of Evidence
159(20)
7.1 Significance
160(5)
7.1.1 Evaluating Significance
160(1)
7.1.2 P-Values
161(4)
7.2 Comparing the Mean of Two Populations
165(4)
7.2.1 Assuming Known Population Standard Deviations
165(2)
7.2.2 Assuming Same, Unknown Population Standard Deviation
167(1)
7.2.3 Assuming Different, Unknown Population Standard Deviation
168(1)
7.3 Other Useful Tests of Significance
169(5)
7.3.1 F-Tests and Standard Deviations
169(2)
7.3.2 χ² Tests of Model Fit
171(3)
7.4 P-Value Hacking and Other Dangerous Behavior
174(1)
7.5 You Should
174(5)
7.5.1 Remember These Definitions
174(1)
7.5.2 Remember These Terms
175(1)
7.5.3 Remember These Facts
175(1)
7.5.4 Use These Procedures
175(1)
7.5.5 Be Able to
175(4)
8 Experiments
179(18)
8.1 A Simple Experiment: The Effect of a Treatment
179(7)
8.1.1 Randomized Balanced Experiments
180(1)
8.1.2 Decomposing Error in Predictions
180(1)
8.1.3 Estimating the Noise Variance
181(1)
8.1.4 The ANOVA Table
182(1)
8.1.5 Unbalanced Experiments
183(2)
8.1.6 Significant Differences
185(1)
8.2 Two Factor Experiments
186(8)
8.2.1 Decomposing the Error
188(1)
8.2.2 Interaction Between Effects
189(1)
8.2.3 The Effects of a Treatment
190(1)
8.2.4 Setting Up An ANOVA Table
191(3)
8.3 You Should
194(3)
8.3.1 Remember These Definitions
194(1)
8.3.2 Remember These Terms
194(1)
8.3.3 Remember These Facts
194(1)
8.3.4 Use These Procedures
194(1)
8.3.5 Be Able to
194(3)
9 Inferring Probability Models from Data
197(28)
9.1 Estimating Model Parameters with Maximum Likelihood
197(9)
9.1.1 The Maximum Likelihood Principle
198(1)
9.1.2 Binomial, Geometric and Multinomial Distributions
199(2)
9.1.3 Poisson and Normal Distributions
201(3)
9.1.4 Confidence Intervals for Model Parameters
204(2)
9.1.5 Cautions About Maximum Likelihood
206(1)
9.2 Incorporating Priors with Bayesian Inference
206(5)
9.2.1 Conjugacy
209(1)
9.2.2 MAP Inference
210(1)
9.2.3 Cautions About Bayesian Inference
211(1)
9.3 Bayesian Inference for Normal Distributions
211(4)
9.3.1 Example: Measuring Depth of a Borehole
212(1)
9.3.2 Normal Prior and Normal Likelihood Yield Normal Posterior
212(2)
9.3.3 Filtering
214(1)
9.4 You Should
215(10)
9.4.1 Remember These Definitions
215(1)
9.4.2 Remember These Terms
216(1)
9.4.3 Remember These Facts
216(1)
9.4.4 Use These Procedures
217(1)
9.4.5 Be Able to
217(8)
Part IV Tools
10 Extracting Important Relationships in High Dimensions
225(28)
10.1 Summaries and Simple Plots
225(6)
10.1.1 The Mean
226(1)
10.1.2 Stem Plots and Scatterplot Matrices
226(1)
10.1.3 Covariance
227(1)
10.1.4 The Covariance Matrix
228(3)
10.2 Using Mean and Covariance to Understand High Dimensional Data
231(5)
10.2.1 Mean and Covariance Under Affine Transformations
231(1)
10.2.2 Eigenvectors and Diagonalization
232(1)
10.2.3 Diagonalizing Covariance by Rotating Blobs
233(2)
10.2.4 Approximating Blobs
235(1)
10.2.5 Example: Transforming the Height-Weight Blob
235(1)
10.3 Principal Components Analysis
236(6)
10.3.1 The Low Dimensional Representation
236(2)
10.3.2 The Error Caused by Reducing Dimension
238(3)
10.3.3 Example: Representing Colors with Principal Components
241(1)
10.3.4 Example: Representing Faces with Principal Components
242(1)
10.4 Multi-Dimensional Scaling
242(5)
10.4.1 Choosing Low D Points Using High D Distances
243(2)
10.4.2 Factoring a Dot-Product Matrix
245(1)
10.4.3 Example: Mapping with Multidimensional Scaling
246(1)
10.5 Example: Understanding Height and Weight
247(3)
10.6 You Should
250(3)
10.6.1 Remember These Definitions
250(1)
10.6.2 Remember These Terms
250(1)
10.6.3 Remember These Facts
250(1)
10.6.4 Use These Procedures
250(1)
10.6.5 Be Able to
250(3)
11 Learning to Classify
253(28)
11.1 Classification: The Big Ideas
253(3)
11.1.1 The Error Rate, and Other Summaries of Performance
254(1)
11.1.2 More Detailed Evaluation
254(1)
11.1.3 Overfitting and Cross-Validation
255(1)
11.2 Classifying with Nearest Neighbors
256(1)
11.2.1 Practical Considerations for Nearest Neighbors
256(1)
11.3 Classifying with Naive Bayes
257(3)
11.3.1 Cross-Validation to Choose a Model
259(1)
11.4 The Support Vector Machine
260(8)
11.4.1 The Hinge Loss
261(1)
11.4.2 Regularization
262(1)
11.4.3 Finding a Classifier with Stochastic Gradient Descent
262(2)
11.4.4 Searching for λ
264(2)
11.4.5 Example: Training an SVM with Stochastic Gradient Descent
266(2)
11.4.6 Multi-Class Classification with SVMs
268(1)
11.5 Classifying with Random Forests
268(6)
11.5.1 Building a Decision Tree: General Algorithm
270(1)
11.5.2 Building a Decision Tree: Choosing a Split
270(2)
11.5.3 Forests
272(2)
11.6 You Should
274(7)
11.6.1 Remember These Definitions
274(1)
11.6.2 Remember These Terms
274(1)
11.6.3 Remember These Facts
275(1)
11.6.4 Use These Procedures
275(1)
11.6.5 Be Able to
276(5)
12 Clustering: Models of High Dimensional Data
281(24)
12.1 The Curse of Dimension
281(2)
12.1.1 Minor Banes of Dimension
281(1)
12.1.2 The Curse: Data Isn't Where You Think It Is
282(1)
12.2 Clustering Data
283(4)
12.2.1 Agglomerative and Divisive Clustering
283(2)
12.2.2 Clustering and Distance
285(2)
12.3 The K-Means Algorithm and Variants
287(7)
12.3.1 How to Choose K
288(2)
12.3.2 Soft Assignment
290(1)
12.3.3 Efficient Clustering and Hierarchical K Means
291(1)
12.3.4 K-Medoids
292(1)
12.3.5 Example: Groceries in Portugal
292(1)
12.3.6 General Comments on K-Means
293(1)
12.4 Describing Repetition with Vector Quantization
294(6)
12.4.1 Vector Quantization
296(2)
12.4.2 Example: Activity from Accelerometer Data
298(2)
12.5 The Multivariate Normal Distribution
300(2)
12.5.1 Affine Transformations and Gaussians
301(1)
12.5.2 Plotting a 2D Gaussian: Covariance Ellipses
301(1)
12.6 You Should
302(3)
12.6.1 Remember These Definitions
302(1)
12.6.2 Remember These Terms
302(1)
12.6.3 Remember These Facts
303(1)
12.6.4 Use These Procedures
303(2)
13 Regression
305(26)
13.1 Regression to Make Predictions
305(1)
13.2 Regression to Spot Trends
306(2)
13.3 Linear Regression and Least Squares
308(5)
13.3.1 Linear Regression
308(1)
13.3.2 Choosing β
309(1)
13.3.3 Solving the Least Squares Problem
309(1)
13.3.4 Residuals
310(1)
13.3.5 R-Squared
310(3)
13.4 Producing Good Linear Regressions
313(8)
13.4.1 Transforming Variables
313(1)
13.4.2 Problem Data Points Have Significant Impact
314(3)
13.4.3 Functions of One Explanatory Variable
317(1)
13.4.4 Regularizing Linear Regressions
318(3)
13.5 Exploiting Your Neighbors for Regression
321(2)
13.5.1 Using Your Neighbors to Predict More than a Number
323(1)
13.6 You Should
323(8)
13.6.1 Remember These Definitions
323(1)
13.6.2 Remember These Terms
324(1)
13.6.3 Remember These Facts
324(1)
13.6.4 Remember These Procedures
324(7)
14 Markov Chains and Hidden Markov Models
331(24)
14.1 Markov Chains
331(7)
14.1.1 Transition Probability Matrices
333(2)
14.1.2 Stationary Distributions
335(1)
14.1.3 Example: Markov Chain Models of Text
336(2)
14.2 Estimating Properties of Markov Chains
338(4)
14.2.1 Simulation
338(1)
14.2.2 Simulation Results as Random Variables
339(2)
14.2.3 Simulating Markov Chains
341(1)
14.3 Example: Ranking the Web by Simulating a Markov Chain
342(2)
14.4 Hidden Markov Models and Dynamic Programming
344(5)
14.4.1 Hidden Markov Models
344(1)
14.4.2 Picturing Inference with a Trellis
344(2)
14.4.3 Dynamic Programming for HMMs: Formalities
346(2)
14.4.4 Example: Simple Communication Errors
348(1)
14.5 You Should
349(6)
14.5.1 Remember These Definitions
349(1)
14.5.2 Remember These Terms
349(1)
14.5.3 Remember These Facts
350(1)
14.5.4 Be Able to
350(5)
Part V Mathematical Bits and Pieces
15 Resources and Extras
355(8)
15.1 Useful Material About Matrices
355(3)
15.1.1 The Singular Value Decomposition
356(1)
15.1.2 Approximating A Symmetric Matrix
356(2)
15.2 Some Special Functions
358(1)
15.3 Splitting a Node in a Decision Tree
359(4)
15.3.1 Accounting for Information with Entropy
359(1)
15.3.2 Choosing a Split with Information Gain
360(3)
Index 363
David Alexander Forsyth is Fulton Watson Copp Chair in Computer Science at the University of Illinois at Urbana-Champaign, where he is a leading researcher in computer vision. Professor Forsyth has regularly served as a program or general chair for the top conferences in computer vision, and has just finished a second term as Editor-in-Chief for IEEE Transactions on Pattern Analysis and Machine Intelligence.

A Fellow of the ACM (2014) and IEEE (2009), Forsyth has also been recognized with the IEEE Computer Society's Technical Achievement Award (2005), the Marr Prize, and a prize for best paper in cognitive computer vision (ECCV 2002). Many of his former students are famous in their own right as academics or industry leaders.

He is co-author, with Jean Ponce, of Computer Vision: A Modern Approach (2002; 2011), a leading textbook on the topic that has been published in four languages.

Among a variety of odd hobbies, he is a compulsive diver, certified up to normoxic trimix level.