
E-book: High-Dimensional Statistics: A Non-Asymptotic Viewpoint

4.68/5 (50 ratings by Goodreads)
By Martin J. Wainwright (University of California, Berkeley)
  • Format: EPUB+DRM
  • Price: 77,31 €*
  • * This is the final price, i.e., no additional discounts are applied
  • This e-book is intended for personal use only. E-books cannot be returned, and money paid for purchased e-books is not refunded.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital Rights Management (DRM)
    The publisher has supplied this book in encrypted form, which means that you need to install free software to unlock and read it. To read this e-book you must create an Adobe ID. More information here. The e-book can be read and downloaded on up to 6 devices (by one user with the same Adobe ID).

    Required software
    To read this e-book on a mobile device (phone or tablet), you will need to install this free app: PocketBook Reader (iOS / Android)

    To download and read this e-book on a PC or Mac, you need Adobe Digital Editions (this is a free app designed specifically for e-books; it is not the same as Adobe Reader, which you may already have on your computer).

    You cannot read this e-book on an Amazon Kindle.

Recent years have witnessed an explosion in the volume and variety of data collected in all scientific disciplines and industrial settings. Such massive data sets present a number of challenges to researchers in statistics and machine learning. This book provides a self-contained introduction to the area of high-dimensional statistics, aimed at the first-year graduate level. It includes chapters focused on core methodology and theory - including tail bounds, concentration inequalities, uniform laws and empirical processes, and random matrices - as well as chapters devoted to in-depth exploration of particular model classes - including sparse linear models, matrix models with rank constraints, graphical models, and various types of non-parametric models. With hundreds of worked examples and exercises, this text is intended both for courses and for self-study by graduate students and researchers in statistics, machine learning, and related fields who must understand, apply, and adapt modern statistical methods suited to large-scale data.

Recent years have seen an explosion in the volume and variety of data collected in scientific disciplines from astronomy to genetics and industrial settings ranging from Amazon to Uber. This graduate text equips readers in statistics, machine learning, and related fields to understand, apply, and adapt modern methods suited to large-scale data.
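As a rough illustration of the "non-asymptotic viewpoint" in the title (this example is ours, not quoted from the book): the tail bounds and concentration inequalities treated in the early chapters hold for every finite sample size n, with explicit constants, rather than only in the limit as n tends to infinity. A standard Hoeffding-type bound of this flavor, for independent σ-sub-Gaussian random variables X_1, ..., X_n with common mean μ, reads

\[
\mathbb{P}\left( \left| \frac{1}{n} \sum_{i=1}^{n} X_i - \mu \right| \ge t \right) \;\le\; 2 \exp\left( - \frac{n t^2}{2 \sigma^2} \right) \qquad \text{for all } t \ge 0 \text{ and every } n \ge 1.
\]

Here the dependence on the sample size n and the deviation level t is completely explicit, which is the kind of guarantee the book develops for progressively richer models.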

Reviews

'Non-asymptotic, high-dimensional theory is critical for modern statistics and machine learning. This book is unique in providing a crystal clear, complete and unified treatment of the area. With topics ranging from concentration of measure to graphical models, the author weaves together probability theory and its applications to statistics. Ideal for graduate students and researchers. This will surely be the standard reference on the topic for many years.' Larry Wasserman, Carnegie Mellon University, Pennsylvania

'Martin J. Wainwright brings his large box of analytical power tools to bear on the problems of the day - the analysis of models for wide data. A broad knowledge of this new area combines with his powerful analytical skills to deliver this impressive and intimidating work - bound to be an essential reference for all the brave souls that try their hand.' Trevor Hastie, Stanford University, California

'This book provides an excellent treatment of perhaps the fastest growing area within high-dimensional theoretical statistics - non-asymptotic theory that seeks to provide probabilistic bounds on estimators as a function of sample size and dimension. It offers the most thorough, clear, and engaging coverage of this area to date, and is thus poised to become the definitive reference and textbook on this topic.' Genevera Allen, William Marsh Rice University, Texas

'Statistical theory and practice have undergone a renaissance in the past two decades, with intensive study of high-dimensional data analysis. No researcher has deepened our understanding of high-dimensional statistics more than Martin Wainwright. This book brings the signature clarity and incisiveness of his published research into book form. It will be a fantastic resource for both beginning students and seasoned researchers, as the field continues to make exciting breakthroughs.' John Lafferty, Yale University, Connecticut

'This is an outstanding book on high-dimensional statistics, written by a creative and celebrated researcher in the field. It gives comprehensive treatments on many important topics in statistical machine learning and, furthermore, is self-contained, from introductory materials to most updated results on various research frontiers. This book is a must-read for those who wish to learn and to develop modern statistical machine learning theory, methods and algorithms.' Jianqing Fan, Princeton University, New Jersey

'This book provides an in-depth mathematical treatment and methodological intuition of high-dimensional statistics. The main technical tools from probability theory are carefully developed and the construction and analysis of statistical methods and algorithms for high-dimensional problems is presented in an outstandingly clear way. Martin J. Wainwright has written a truly exceptional, inspiring and beautiful masterpiece!' Peter Bühlmann, Eidgenössische Technische Hochschule Zürich

'This new book by Martin J. Wainwright covers modern topics in high-dimensional statistical inference, and focuses primarily on explicit non-asymptotic results related to sparsity and non-parametric estimation. This is a must-read for all graduate students in mathematical statistics and theoretical machine learning, both for the breadth of recent advances it covers and the depth of results which are presented. The exposition is outstandingly clear, starting from the first introductory chapters on the necessary probabilistic tools. Then, the book covers state-of-the-art advances in high-dimensional statistics, with always a clever choice of results which have the perfect mix of significance and mathematical depth.' Francis Bach, INRIA Paris

'Wainwright's book on those parts of probability theory and mathematical statistics critical to understanding of the new phenomena encountered in high dimensions is marked by the clarity of its presentation and the depth to which it travels. In every chapter he starts with intuitive examples and simulations which are systematically developed either into powerful mathematical tools or complete answers to fundamental questions of inference. It is not easy, but elegant and rewarding whether read systematically or dipped into as a reference.' Peter Bickel, University of California, Berkeley

'... this is a very valuable book, covering a variety of important topics, self-contained and nicely written.' Fabio Mainardi, MAA Reviews

'This is an excellent book. It provides a lucid, accessible and in-depth treatment of nonasymptotic high-dimensional statistical theory, which is critical as the underpinning of modern statistics and machine learning. It succeeds brilliantly in providing a self-contained overview of high-dimensional statistics, suitable for use in formal courses or for self-study by graduate-level students or researchers. The treatment is outstandingly clear and engaging, and the production is first-rate. It will quickly become essential reading and the key reference text in the field.' G. Alastair Young, International Statistical Review

'Martin Wainwright takes great care to polish every sentence of each part of the book. He introduces state-of-the-art theory in every chapter, as should probably be expected from an acknowledged specialist of the field. But it is certainly an enormous amount of work to organize all these results in a complete, coherent, rigorous yet easy-to-follow theory. I am simply amazed by the quality of the writing. The explanations on the motivations (Chapter 1) and on the core of the theory are extremely pedagogical. The proofs of the main results are rigorous and complete, but most of them are also built in a way that makes them seem easier to the reader than they actually are. This is the kind of magic only a few authors are capable of.' Pierre Alquier, MathSciNet

'... provides a masterful exposition of various mathematical tools that are becoming increasingly common in the analysis of contemporary statistical problems. In addition to providing a rigorous and comprehensive overview of these tools, the author delves into the details of many illustrative examples to provide a convincing case for the general usefulness of the methods that are introduced.' Po-Ling Loh, Bulletin of the American Mathematical Society

'An excellent statistical masterpiece is in the hands of the reader, which is a must read book for all graduate students in both mathematical statistics and mathematical machine learning.' Rózsa Horváth-Bokor, zbMATH

Additional information

A coherent introductory text from a groundbreaking researcher, focusing on clarity and motivation to build intuition and understanding.
List of illustrations
xv
1 Introduction
1(20)
1.1 Classical versus high-dimensional theory
1(1)
1.2 What can go wrong in high dimensions?
2(7)
1.2.1 Linear discriminant analysis
2(3)
1.2.2 Covariance estimation
5(2)
1.2.3 Nonparametric regression
7(2)
1.3 What can help us in high dimensions?
9(5)
1.3.1 Sparsity in vectors
10(1)
1.3.2 Structure in covariance matrices
11(1)
1.3.3 Structured forms of regression
12(2)
1.4 What is the non-asymptotic viewpoint?
14(1)
1.5 Overview of the book
15(4)
1.5.1 Chapter structure and synopses
15(2)
1.5.2 Recommended background
17(1)
1.5.3 Teaching possibilities and a flow diagram
17(2)
1.6 Bibliographic details and background
19(2)
2 Basic tail and concentration bounds
21(37)
2.1 Classical bounds
21(11)
2.1.1 From Markov to Chernoff
21(1)
2.1.2 Sub-Gaussian variables and Hoeffding bounds
22(3)
2.1.3 Sub-exponential variables and Bernstein bounds
25(6)
2.1.4 Some one-sided results
31(1)
2.2 Martingale-based methods
32(8)
2.2.1 Background
33(2)
2.2.2 Concentration bounds for martingale difference sequences
35(5)
2.3 Lipschitz functions of Gaussian variables
40(5)
2.4 Appendix A: Equivalent versions of sub-Gaussian variables
45(3)
2.5 Appendix B: Equivalent versions of sub-exponential variables
48(1)
2.6 Bibliographic details and background
49(1)
2.7 Exercises
50(8)
3 Concentration of measure
58(40)
3.1 Concentration by entropic techniques
58(9)
3.1.1 Entropy and its properties
58(2)
3.1.2 Herbst argument and its extensions
60(2)
3.1.3 Separately convex functions and the entropic method
62(2)
3.1.4 Tensorization and separately convex functions
64(3)
3.2 A geometric perspective on concentration
67(9)
3.2.1 Concentration functions
67(3)
3.2.2 Connection to Lipschitz functions
70(2)
3.2.3 From geometry to concentration
72(4)
3.3 Wasserstein distances and information inequalities
76(11)
3.3.1 Wasserstein distances
76(2)
3.3.2 Transportation cost and concentration inequalities
78(2)
3.3.3 Tensorization for transportation cost
80(2)
3.3.4 Transportation cost inequalities for Markov chains
82(2)
3.3.5 Asymmetric coupling cost
84(3)
3.4 Tail bounds for empirical processes
87(4)
3.4.1 A functional Hoeffding inequality
87(2)
3.4.2 A functional Bernstein inequality
89(2)
3.5 Bibliographic details and background
91(1)
3.6 Exercises
92(6)
4 Uniform laws of large numbers
98(23)
4.1 Motivation
98(6)
4.1.1 Uniform convergence of cumulative distribution functions
98(3)
4.1.2 Uniform laws for more general function classes
101(3)
4.2 A uniform law via Rademacher complexity
104(5)
4.2.1 Necessary conditions with Rademacher complexity
107(2)
4.3 Upper bounds on the Rademacher complexity
109(8)
4.3.1 Classes with polynomial discrimination
109(2)
4.3.2 Vapnik-Chervonenkis dimension
111(4)
4.3.3 Controlling the VC dimension
115(2)
4.4 Bibliographic details and background
117(1)
4.5 Exercises
117(4)
5 Metric entropy and its uses
121(38)
5.1 Covering and packing
121(11)
5.2 Gaussian and Rademacher complexity
132(2)
5.3 Metric entropy and sub-Gaussian processes
134(9)
5.3.1 Upper bound by one-step discretization
135(2)
5.3.2 Some examples of discretization bounds
137(2)
5.3.3 Chaining and Dudley's entropy integral
139(4)
5.4 Some Gaussian comparison inequalities
143(5)
5.4.1 A general comparison result
143(2)
5.4.2 Slepian and Sudakov-Fernique inequalities
145(1)
5.4.3 Gaussian contraction inequality
146(2)
5.5 Sudakov's lower bound
148(2)
5.6 Chaining and Orlicz processes
150(3)
5.7 Bibliographic details and background
153(1)
5.8 Exercises
154(5)
6 Random matrices and covariance estimation
159(35)
6.1 Some preliminaries
159(2)
6.1.1 Notation and basic facts
159(1)
6.1.2 Set-up of covariance estimation
160(1)
6.2 Wishart matrices and their behavior
161(4)
6.3 Covariance matrices from sub-Gaussian ensembles
165(3)
6.4 Bounds for general matrices
168(12)
6.4.1 Background on matrix analysis
168(1)
6.4.2 Tail conditions for matrices
169(3)
6.4.3 Matrix Chernoff approach and independent decompositions
172(2)
6.4.4 Upper tail bounds for random matrices
174(5)
6.4.5 Consequences for covariance matrices
179(1)
6.5 Bounds for structured covariance matrices
180(5)
6.5.1 Unknown sparsity and thresholding
180(3)
6.5.2 Approximate sparsity
183(2)
6.6 Appendix: Proof of Theorem 6.1
185(3)
6.7 Bibliographic details and background
188(1)
6.8 Exercises
189(5)
7 Sparse linear models in high dimensions
194(42)
7.1 Problem formulation and applications
194(5)
7.1.1 Different sparsity models
194(2)
7.1.2 Applications of sparse linear models
196(3)
7.2 Recovery in the noiseless setting
199(7)
7.2.1 ℓ1-based relaxation
200(1)
7.2.2 Exact recovery and restricted nullspace
200(2)
7.2.3 Sufficient conditions for restricted nullspace
202(4)
7.3 Estimation in noisy settings
206(10)
7.3.1 Restricted eigenvalue condition
207(2)
7.3.2 Bounds on ℓ2-error for hard sparse models
209(4)
7.3.3 Restricted nullspace and eigenvalues for random designs
213(3)
7.4 Bounds on prediction error
216(2)
7.5 Variable or subset selection
218(6)
7.5.1 Variable selection consistency for the Lasso
219(3)
7.5.2 Proof of Theorem 7.21
222(2)
7.6 Appendix: Proof of Theorem 7.16
224(3)
7.7 Bibliographic details and background
227(2)
7.8 Exercises
229(7)
8 Principal component analysis in high dimensions
236(23)
8.1 Principal components and dimension reduction
236(6)
8.1.1 Interpretations and uses of PCA
237(4)
8.1.2 Perturbations of eigenvalues and eigenspaces
241(1)
8.2 Bounds for generic eigenvectors
242(6)
8.2.1 A general deterministic result
242(3)
8.2.2 Consequences for a spiked ensemble
245(3)
8.3 Sparse principal component analysis
248(7)
8.3.1 A general deterministic result
249(3)
8.3.2 Consequences for the spiked model with sparsity
252(3)
8.4 Bibliographic details and background
255(1)
8.5 Exercises
256(3)
9 Decomposability and restricted strong convexity
259(53)
9.1 A general regularized M-estimator
259(10)
9.2 Decomposable regularizers and their utility
269(7)
9.2.1 Definition and some examples
269(3)
9.2.2 A key consequence of decomposability
272(4)
9.3 Restricted curvature conditions
276(3)
9.3.1 Restricted strong convexity
277(2)
9.4 Some general theorems
279(7)
9.4.1 Guarantees under restricted strong convexity
280(4)
9.4.2 Bounds under Φ-curvature
284(2)
9.5 Bounds for sparse vector regression
286(4)
9.5.1 Generalized linear models with sparsity
286(1)
9.5.2 Bounds under restricted strong convexity
287(1)
9.5.3 Bounds under ℓ∞-curvature conditions
288(2)
9.6 Bounds for group-structured sparsity
290(3)
9.7 Bounds for overlapping decomposition-based norms
293(4)
9.8 Techniques for proving restricted strong convexity
297(9)
9.8.1 Lipschitz cost functions and Rademacher complexity
298(4)
9.8.2 A one-sided bound via truncation
302(4)
9.9 Appendix: Star-shaped property
306(1)
9.10 Bibliographic details and background
306(1)
9.11 Exercises
307(5)
10 Matrix estimation with rank constraints
312(35)
10.1 Matrix regression and applications
312(5)
10.2 Analysis of nuclear norm regularization
317(4)
10.2.1 Decomposability and subspaces
317(2)
10.2.2 Restricted strong convexity and error bounds
319(1)
10.2.3 Bounds under operator norm curvature
320(1)
10.3 Matrix compressed sensing
321(5)
10.4 Bounds for phase retrieval
326(3)
10.5 Multivariate regression with low-rank constraints
329(1)
10.6 Matrix completion
330(7)
10.7 Additive matrix decompositions
337(4)
10.8 Bibliographic details and background
341(2)
10.9 Exercises
343(4)
11 Graphical models for high-dimensional data
347(36)
11.1 Some basics
347(5)
11.1.1 Factorization
347(3)
11.1.2 Conditional independence
350(1)
11.1.3 Hammersley-Clifford equivalence
351(1)
11.1.4 Estimation of graphical models
352(1)
11.2 Estimation of Gaussian graphical models
352(13)
11.2.1 Graphical Lasso: ℓ1-regularized maximum likelihood
353(6)
11.2.2 Neighborhood-based methods
359(6)
11.3 Graphical models in exponential form
365(3)
11.3.1 A general form of neighborhood regression
366(1)
11.3.2 Graph selection for Ising models
367(1)
11.4 Graphs with corrupted or hidden variables
368(8)
11.4.1 Gaussian graph estimation with corrupted data
368(5)
11.4.2 Gaussian graph selection with hidden variables
373(3)
11.5 Bibliographic details and background
376(2)
11.6 Exercises
378(5)
12 Reproducing kernel Hilbert spaces
383(33)
12.1 Basics of Hilbert spaces
383(2)
12.2 Reproducing kernel Hilbert spaces
385(9)
12.2.1 Positive semidefinite kernel functions
386(1)
12.2.2 Feature maps in ℓ2(N)
387(1)
12.2.3 Constructing an RKHS from a kernel
388(2)
12.2.4 A more abstract viewpoint and further examples
390(4)
12.3 Mercer's theorem and its consequences
394(6)
12.4 Operations on reproducing kernel Hilbert spaces
400(5)
12.4.1 Sums of reproducing kernels
400(3)
12.4.2 Tensor products
403(2)
12.5 Interpolation and fitting
405(4)
12.5.1 Function interpolation
405(2)
12.5.2 Fitting via kernel ridge regression
407(2)
12.6 Distances between probability measures
409(2)
12.7 Bibliographic details and background
411(1)
12.8 Exercises
412(4)
13 Nonparametric least squares
416(37)
13.1 Problem set-up
416(4)
13.1.1 Different measures of quality
416(1)
13.1.2 Estimation via constrained least squares
417(1)
13.1.3 Some examples
418(2)
13.2 Bounding the prediction error
420(12)
13.2.1 Bounds via metric entropy
425(2)
13.2.2 Bounds for high-dimensional parametric problems
427(2)
13.2.3 Bounds for nonparametric problems
429(1)
13.2.4 Proof of Theorem 13.5
430(2)
13.3 Oracle inequalities
432(7)
13.3.1 Some examples of oracle inequalities
434(3)
13.3.2 Proof of Theorem 13.13
437(2)
13.4 Regularized estimators
439(9)
13.4.1 Oracle inequalities for regularized estimators
439(1)
13.4.2 Consequences for kernel ridge regression
439(4)
13.4.3 Proof of Corollary 13.18
443(1)
13.4.4 Proof of Theorem 13.17
444(4)
13.5 Bibliographic details and background
448(1)
13.6 Exercises
449(4)
14 Localization and uniform laws
453(32)
14.1 Population and empirical L2-norms
453(9)
14.1.1 A uniform law with localization
454(4)
14.1.2 Specialization to kernel classes
458(2)
14.1.3 Proof of Theorem 14.1
460(2)
14.2 A one-sided uniform law
462(7)
14.2.1 Consequences for nonparametric least squares
466(2)
14.2.2 Proof of Theorem 14.12
468(1)
14.3 A uniform law for Lipschitz cost functions
469(6)
14.3.1 General prediction problems
469(3)
14.3.2 Uniform law for Lipschitz cost functions
472(3)
14.4 Some consequences for nonparametric density estimation
475(5)
14.4.1 Density estimation via the nonparametric maximum likelihood estimate
475(2)
14.4.2 Density estimation via projections
477(3)
14.5 Appendix: Population and empirical Rademacher complexities
480(1)
14.6 Bibliographic details and background
481(1)
14.7 Exercises
482(3)
15 Minimax lower bounds
485(39)
15.1 Basic framework
485(6)
15.1.1 Minimax risks
486(1)
15.1.2 From estimation to testing
487(2)
15.1.3 Some divergence measures
489(2)
15.2 Binary testing and Le Cam's method
491(9)
15.2.1 Bayes error and total variation distance
491(6)
15.2.2 Le Cam's convex hull method
497(3)
15.3 Fano's method
500(15)
15.3.1 Kullback-Leibler divergence and mutual information
501(1)
15.3.2 Fano lower bound on minimax risk
501(2)
15.3.3 Bounds based on local packings
503(3)
15.3.4 Local packings with Gaussian entropy bounds
506(6)
15.3.5 Yang-Barron version of Fano's method
512(3)
15.4 Appendix: Basic background in information theory
515(3)
15.5 Bibliographic details and background
518(1)
15.6 Exercises
519(5)
References 524(16)
Subject index 540(8)
Author index 548
Martin J. Wainwright is a Chancellor's Professor at the University of California, Berkeley, with a joint appointment between the Department of Statistics and the Department of Electrical Engineering and Computer Sciences. His research lies at the nexus of statistics, machine learning, optimization, and information theory, and he has published widely in all of these disciplines. He has written two other books, one on graphical models together with Michael I. Jordan, and one on sparse learning together with Trevor Hastie and Robert Tibshirani. Among other awards, he has received the COPSS Presidents' Award, has been a Medallion Lecturer and Blackwell Lecturer for the Institute of Mathematical Statistics, and has received Best Paper Awards from the Neural Information Processing Systems (NIPS), International Conference on Machine Learning (ICML), and Uncertainty in Artificial Intelligence (UAI) conferences, as well as from the Institute of Electrical and Electronics Engineers (IEEE) Information Theory Society.