Klientu atbalsts: 27018494

Grāmatu iegāde | Jauns profils | Ienākt

Essential Math for AI: Next-Level Mathematics for Efficient and Successful AI Systems [Mīkstie vāki]

3.81/5 (32 ratings by Goodreads)

Hala Nelson

Formāts: Paperback / softback, 602 pages, height x width: 233x178 mm
Izdošanas datums: 17-Jan-2023
Izdevniecība: O'Reilly Media
ISBN-10: 1098107632
ISBN-13: 9781098107635

Citas grāmatas par šo tēmu:

Machine learning
Mathematical & statistical software - (Noliktavā: 1 punkts)
Natural language & machine translation
Neural networks & fuzzy systems
Probability & statistics - (Noliktavā: 2 punkts)
Complex analysis, complex variables

Mīkstie vāki
Cena: 73,03 €*
* ši ir gala cena, t.i., netiek piemērotas nekādas papildus atlaides
Standarta cena: 85,92 €
Ietaupiet 15%
Grāmatu piegādes laiks ir 3-4 nedēļas, ja grāmata ir uz vietas izdevniecības noliktavā. Ja izdevējam nepieciešams publicēt jaunu tirāžu, grāmatas piegāde var aizkavēties.
Daudzums:
- - 1
  - 2
  - 3
  - 4
  - 5
  - 6
  - 7
  - 8
  - 9
  - 10
Ielikt grozā
Piegādes laiks - 4-6 nedēļas
Pievienot vēlmju sarakstam

Formāts: Paperback / softback, 602 pages, height x width: 233x178 mm
Izdošanas datums: 17-Jan-2023
Izdevniecība: O'Reilly Media
ISBN-10: 1098107632
ISBN-13: 9781098107635

Citas grāmatas par šo tēmu:

Machine learning
Mathematical & statistical software - (Noliktavā: 1 punkts)
Natural language & machine translation
Neural networks & fuzzy systems
Probability & statistics - (Noliktavā: 2 punkts)
Complex analysis, complex variables

Permanent link: https://www.kriso.lv/db/9781098107635.html

Keywords:

Artificial intelligence - Mathematics

Companies are scrambling to integrate AI into their systems and operations. But to build truly successful solutions, you need a firm grasp of the underlying mathematics. This accessible guide walks you through the math necessary to thrive in the AI field such as focusing on real-world applications rather than dense academic theory.

Engineers, data scientists, and students alike will examine mathematical topics critical for AI--including regression, neural networks, optimization, backpropagation, convolution, Markov chains, and more--through popular applications such as computer vision, natural language processing, and automated systems. And supplementary Jupyter notebooks shed light on examples with Python code and visualizations. Whether you're just beginning your career or have years of experience, this book gives you the foundation necessary to dive deeper in the field.

Understand the underlying mathematics powering AI systems, including generative adversarial networks, random graphs, large random matrices, mathematical logic, optimal control, and more
Learn how to adapt mathematical methods to different applications from completely different fields
Gain the mathematical fluency to interpret and explain how AI systems arrive at their decisions

Preface

xvii

1 Why Learn the Mathematics of Al?

(12)

What Is AI?

(1)

Why Is AI So Popular Now?

(1)

What Is AI Able to Do?

(3)

An AI Agent's Specific Tasks

(2)

What Are AI's Limitations?

(2)

What Happens When AI Systems Fail?

(1)

Where Is AI Headed?

(2)

Who Are the Current Main Contributors to the AI Field?

(1)

What Math Is Typically Involved in AI?

(1)

Summary and Looking Ahead

(2)

2 Data, Data, Data

(38)

Data for AI

(2)

Real Data Versus Simulated Data

(1)

Mathematical Models: Linear Versus Nonlinear

(2)

An Example of Real Data

(3)

An Example of Simulated Data

(4)

Mathematical Models: Simulations and AI

(2)

Where Do We Get Our Data From?

(2)

The Vocabulary of Data Distributions, Probability, and Statistics

(6)

Random Variables

(1)

Probability Distributions

(1)

Marginal Probabilities

(1)

The Uniform and the Normal Distributions

(1)

Conditional Probabilities and Bayes' Theorem

(1)

Conditional Probabilities and Joint Distributions

(1)

Prior Distribution, Posterior Distribution, and Likelihood Function

(1)

Mixtures of Distributions

(1)

Sums and Products of Random Variables

(1)

Using Graphs to Represent Joint Probability Distributions

(1)

Expectation, Mean, Variance, and Uncertainty

(1)

Covariance and Correlation

(1)

Markov Process

(1)

Normalizing, Scaling, and/or Standardizing a Random Variable or Data Set

(1)

Common Examples

(1)

Continuous Distributions Versus Discrete Distributions (Density Versus Mass)

(2)

The Power of the Joint Probability Density Function

(1)

Distribution of Data: The Uniform Distribution

(2)

Distribution of Data: The Bell-Shaped Normal (Gaussian) Distribution

(3)

Distribution of Data: Other Important and Commonly Used Distributions

(4)

The Various Uses of the Word "Distribution"

(1)

A/B Testing

(1)

Summary and Looking Ahead

(3)

3 Fitting Functions to Data

(62)

Traditional and Very Useful Machine Learning Models

(2)

Numerical Solutions Versus Analytical Solutions

(1)

Regression: Predict a Numerical Value

(29)

Training Function

(2)

Loss Function

(11)

Optimization

(14)

Logistic Regression: Classify into Two Classes

(3)

Training Function

(1)

Loss Function

(2)

Optimization

(1)

Softmax Regression: Classify into Multiple Classes

(5)

Training Function

(2)

Loss Function

(1)

Optimization

(1)

Incorporating These Models into the Last Layer of a Neural Network

(1)

Other Popular Machine Learning Techniques and Ensembles of Techniques

(15)

Support Vector Machines

(4)

Decision Trees

(9)

Random Forests

107

(1)

K-means Clustering

108

(1)

Performance Measures for Classification Models

108

(2)

Summary and Looking Ahead

110

(3)

4 Optimization for Neural Networks

113

(48)

The Brain Cortex and Artificial Neural Networks

113

(2)

Training Function: Fully Connected, or Dense, Feed Forward Neural Networks

115

(16)

A Neural Network Is a Computational Graph Representation of the Training Function

117

(1)

Linearly Combine, Add Bias, Then Activate

117

(5)

Common Activation Functions

122

(3)

Universal Function Approximation

125

(6)

Approximation Theory for Deep Learning

131

(1)

Loss Functions

131

(2)

Optimization

133

(12)

Mathematics and the Mysterious Success of Neural Networks

134

(1)

Gradient Descent ω→i+1 = ω→i- ηΔL(ω→i)

135

(2)

Explaining the Role of the Learning Rate Hyperparameter η

137

(3)

Convex Versus Nonconvex Landscapes

140

(3)

Stochastic Gradient Descent

143

(1)

Initializing the Weights ω→0 for the Optimization Process

144

(1)

Regularization Techniques

145

(7)

Dropout

145

(1)

Early Stopping

146

(1)

Batch Normalization of Each Layer

146

(1)

Control the Size of the Weights by Penalizing Their Norm

147

(3)

Penalizing the l2 Norm Versus Penalizing the l2 Norm

150

(1)

Explaining the Role of the Regularization Hyperparameter α

151

(1)

Hyperparameter Examples That Appear in Machine Learning

152

(1)

Chain Rule and Backpropagation: Calculating ΔL(ω→i)

153

(4)

Backpropagation Is Not Too Different from How Our Brain Learns

154

(1)

Why Is It Better to Backpropagate?

155

(1)

Backpropagation in Detail

155

(2)

Assessing the Significance of the Input Data Features

157

(1)

Summary and Looking Ahead

158

(3)

5 Convolutional Neural Networks and Computer Vision

161

(26)

Convolution and Cross-Correlation

163

(5)

Translation Invariance and Translation Equivariance

167

(1)

Convolution in Usual Space Is a Product in Frequency Space

167

(1)

Convolution from a Systems Design Perspective

168

(3)

Convolution and Impulse Response for Linear and Translation Invariant Systems

169

(2)

Convolution and One-Dimensional Discrete Signals

171

(1)

Convolution and Two-Dimensional Discrete Signals

172

(7)

Filtering Images

174

(4)

Feature Maps

178

(1)

Linear Algebra Notation

179

(4)

The One-Dimensional Case: Multiplication by a Toeplitz Matrix

182

(1)

The Two-Dimensional Case: Multiplication by a Doubly Block Circulant Matrix

182

(1)

Pooling

183

(1)

A Convolutional Neural Network for Image Classification

184

(2)

Summary and Looking Ahead

186

(1)

6 Singular Value Decomposition: Image Processing, Natural Language Processing, and Social Media

187

(32)

Matrix Factorization

188

(3)

Diagonal Matrices

191

(2)

Matrices as Linear Transformations Acting on Space

193

(7)

Action of A on the Right Singular Vectors

194

(1)

Action of A on the Standard Unit Vectors and the Unit Square Determined by Them

195

(1)

Action of A on the Unit Circle

196

(1)

Breaking Down the Circle-to-Ellipse Transformation According to the Singular Value Decomposition

197

(1)

Rotation and Reflection Matrices

198

(1)

Action of A on a General Vector →x

199

(1)

Three Ways to Multiply Matrices

200

(1)

The Big Picture

201

(3)

The Condition Number and Computational Stability

203

(1)

The Ingredients of the Singular Value Decomposition

204

(1)

Singular Value Decomposition Versus the Eigenvalue Decomposition

204

(2)

Computation of the Singular Value Decomposition

206

(2)

Computing an Eigenvector Numerically

207

(1)

The Pseudoinverse

208

(1)

Applying the Singular Value Decomposition to Images

209

(3)

Principal Component Analysis and Dimension Reduction

212

(2)

Principal Component Analysis and Clustering

214

(1)

A Social Media Application

214

(1)

Latent Semantic Analysis

215

(1)

Randomized Singular Value Decomposition

216

(1)

Summary and Looking Ahead

216

(3)

7 Natural Language and Finance AI: Vectorization and Time Series

219

(44)

Natural Language AI

222

(1)

Preparing Natural Language Data for Machine Processing

223

(3)

Statistical Models and the log Function

226

(1)

Zipf's Law for Term Counts

226

(1)

Various Vector Representations for Natural Language Documents

227

(14)

Term Frequency Vector Representation of a Document or Bag of Words

227

(1)

Term Frequency-Inverse Document Frequency Vector Representation of a Document

228

(1)

Topic Vector Representation of a Document Determined by Latent Semantic Analysis

228

(4)

Topic Vector Representation of a Document Determined by Latent Dirichlet Allocation

232

(1)

Topic Vector Representation of a Document Determined by Latent Discriminant Analysis

233

(1)

Meaning Vector Representations of Words and of Documents Determined by Neural Network Embeddings

234

(7)

Cosine Similarity

241

(2)

Natural Language Processing Applications

243

(4)

Sentiment Analysis

243

(1)

Spam Filter

244

(1)

Search and Information Retrieval

244

(2)

Machine Translation

246

(1)

Image Captioning

247

(1)

Chatbots

247

(1)

Other Applications

247

(1)

Transformers and Attention Models

247

(8)

The Transformer Architecture

248

(3)

The Attention Mechanism

251

(4)

Transformers Are Far from Perfect

255

(1)

Convolutional Neural Networks for Time Series Data

255

(2)

Recurrent Neural Networks for Time Series Data

257

(4)

How Do Recurrent Neural Networks Work?

258

(2)

Gated Recurrent Units and Long Short-Term Memory Units

260

(1)

An Example of Natural Language Data

261

(1)

Finance AI

261

(1)

Summary and Looking Ahead

262

(1)

8 Probabilistic Generative Models

263

(34)

What Are Generative Models Useful For?

264

(1)

The Typical Mathematics of Generative Models

265

(3)

Shifting Our Brain from Deterministic Thinking to Probabilistic Thinking

268

(2)

Maximum Likelihood Estimation

270

(2)

Explicit and Implicit Density Models

272

(1)

Explicit Density-Tractable: Fully Visible Belief Networks

273

(3)

Example: Generating Images via PixelCNN and Machine Audio via WaveNet

273

(3)

Explicit Density-Tractable: Change of Variables Nonlinear Independent Component Analysis

276

(1)

Explicit Density-Intractable: Variational Autoencoders Approximation via Variational Methods

277

(2)

Explicit Density-Intractable: Boltzman Machine Approximation via Markov Chain

279

(1)

Implicit Density-Markov Chain: Generative Stochastic Network

279

(1)

Implicit Density-Direct: Generative Adversarial Networks

280

(1)

How Do Generative Adversarial Networks Work?

281

(2)

Example: Machine Learning and Generative Networks for High Energy Physics

283

(2)

Other Generative Models

285

(4)

Naive Bayes Classification Model

286

(2)

Gaussian Mixture Model

288

(1)

The Evolution of Generative Models

289

(4)

Hopfield Nets

290

(1)

Boltzmann Machine

291

(1)

Restricted Boltzmann Machine (Explicit Density and Intractable)

291

(1)

The Original Autoencoder

292

(1)

Probabilistic Language Modeling

293

(2)

Summary and Looking Ahead

295

(2)

9 Graph Models

297

(50)

Graphs: Nodes, Edges, and Features for Each

299

(3)

Example: PageRank Algorithm

302

(5)

Inverting Matrices Using Graphs

307

(1)

Cayley Graphs of Groups: Pure Algebra and Parallel Computing

308

(1)

Message Passing Within a Graph

309

(1)

The Limitless Applications of Graphs

310

(14)

Brain Networks

311

(1)

Spread of Disease

312

(1)

Spread of Information

312

(1)

Detecting and Tracking Fake News Propagation

312

(2)

Web-Scale Recommendation Systems

314

(1)

Fighting Cancer

314

(2)

Biochemical Graphs

316

(1)

Molecular Graph Generation for Drug and Protein Structure Discovery

316

(1)

Citation Networks

316

(1)

Social Media Networks and Social Influence Prediction

316

(1)

Sociological Structures

317

(1)

Bayesian Networks

317

(1)

Traffic Forecasting

317

(1)

Logistics and Operations Research

318

(1)

Language Models

318

(2)

Graph Structure of the Web

320

(1)

Automatically Analyzing Computer Programs

321

(1)

Data Structures in Computer Science

321

(1)

Load Balancing in Distributed Networks

322

(1)

Artificial Neural Networks

323

(1)

Random Walks on Graphs

324

(2)

Node Representation Learning

326

(1)

Tasks for Graph Neural Networks

327

(3)

Node Classification

327

(1)

Graph Classification

328

(1)

Clustering and Community Detection

329

(1)

Graph Generation

329

(1)

Influence Maximization

329

(1)

Link Prediction

330

(1)

Dynamic Graph Models

330

(1)

Bayesian Networks

331

(7)

A Bayesian Network Represents a Compactified Conditional Probability Table

333

(1)

Making Predictions Using a Bayesian Network

334

(1)

Bayesian Networks Are Belief Networks, Not Causal Networks

334

(1)

Keep This in Mind About Bayesian Networks

335

(1)

Chains, Forks, and Colliders

336

(1)

Given a Data Set, How Do We Set Up a Bayesian Network for the Involved Variables?

337

(1)

Graph Diagrams for Probabilistic Causal Modeling

338

(2)

A Brief History of Graph Theory

340

(1)

Main Considerations in Graph Theory

341

(3)

Spanning Trees and Shortest Spanning Trees

341

(1)

Cut Sets and Cut Vertices

342

(1)

Planarity

342

(1)

Graphs as Vector Spaces

343

(1)

Realizability

343

(1)

Coloring and Matching

344

(1)

Enumeration

344

(1)

Algorithms and Computational Aspects of Graphs

344

(1)

Summary and Looking Ahead

345

(2)

10 Operations Research

347

(64)

No Free Lunch

349

(1)

Complexity Analysis and 0() Notation

350

(3)

Optimization: The Heart of Operations Research

353

(3)

Thinking About Optimization

356

(9)

Optimization: Finite Dimensions, Unconstrained

357

(1)

Optimization: Finite Dimensions, Constrained Lagrange Multipliers

357

(3)

Optimization: Infinite Dimensions, Calculus of Variations

360

(5)

Optimization on Networks

365

(5)

Traveling Salesman Problem

365

(1)

Minimum Spanning Tree

366

(1)

Shortest Path

367

(1)

Max-Flow Min-Cut

368

(1)

Max-Flow Min-Cost

369

(1)

The Critical Path Method for Project Design

369

(1)

The n-Queens Problem

370

(1)

Linear Optimization

371

(31)

The General Form and the Standard Form

372

(1)

Visualizing a Linear Optimization Problem in Two Dimensions

373

(1)

Convex to Linear

374

(3)

The Geometry of Linear Optimization

377

(2)

The Simplex Method

379

(7)

Transportation and Assignment Problems

386

(1)

Duality, Lagrange Relaxation, Shadow Prices, Max-Min, Min-Max, and All That

386

(15)

Sensitivity

401

(1)

Game Theory and Multiagents

402

(2)

Queuing

404

(1)

Inventory

405

(1)

Machine Learning for Operations Research

405

(1)

Hamilton-Jacobi-Bellman Equation

406

(1)

Operations Research for AI

407

(1)

Summary and Looking Ahead

407

(4)

11 Probability

411

(40)

Where Did Probability Appear in This Book?

412

(3)

What More Do We Need to Know That Is Essential for AI?

415

(1)

Causal Modeling and the Do Calculus

415

(5)

An Alternative: The Do Calculus

417

(3)

Paradoxes and Diagram Interpretations

420

(4)

Monty Hall Problem

420

(2)

Berkson's Paradox

422

(1)

Simpson's Paradox

422

(2)

Large Random Matrices

424

(8)

Examples of Random Vectors and Random Matrices

424

(3)

Main Considerations in Random Matrix Theory

427

(2)

Random Matrix Ensembles

429

(1)

Eigenvalue Density of the Sum of Two Large Random Matrices

430

(1)

Essential Math for Large Random Matrices

430

(2)

Stochastic Processes

432

(6)

Bernoulli Process

433

(1)

Poisson Process

433

(1)

Random Walk

434

(1)

Wiener Process or Brownian Motion

435

(1)

Martingale

435

(1)

Levy Process

436

(1)

Branching Process

436

(1)

Markov Chain

436

(1)

Ito's Lemma

437

(1)

Markov Decision Processes and Reinforcement Learning

438

(3)

Examples of Reinforcement Learning

438

(1)

Reinforcement Learning as a Markov Decision Process

439

(2)

Reinforcement Learning in the Context of Optimal Control and Nonlinear Dynamics

441

(1)

Python Library for Reinforcement Learning

441

(1)

Theoretical and Rigorous Grounds

441

(7)

Which Events Have a Probability?

442

(1)

Can We Talk About a Wider Range of Random Variables?

443

(1)

A Probability Triple (Sample Space, Sigma Algebra, Probability Measure)

443

(1)

Where Is the Difficulty?

444

(1)

Random Variable, Expectation, and Integration

445

(1)

Distribution of a Random Variable and the Change of Variable Theorem

446

(1)

Next Steps in Rigorous Probability Theory

447

(1)

The Universality Theorem for Neural Networks

448

(1)

Summary and Looking Ahead

448

(3)

12 Mathematical Logic

451

(14)

Various Logic Frameworks

452

(1)

Propositional Logic

452

(5)

From Few Axioms to a Whole Theory

455

(1)

Codifying Logic Within an Agent

456

(1)

How Do Deterministic and Probabilistic Machine Learning Fit In?

456

(1)

First-Order Logic

457

(3)

Relationships Between For All and There Exist

458

(2)

Probabilistic Logic

460

(1)

Fuzzy Logic

460

(1)

Temporal Logic

461

(1)

Comparison with Human Natural Language

462

(1)

Machines and Complex Mathematical Reasoning

462

(1)

Summary and Looking Ahead

463

(2)

13 Artificial Intelligence and Partial Differential Equations

465

(66)

What Is a Partial Differential Equation?

466

(1)

Modeling with Differential Equations

467

(5)

Models at Different Scales

468

(1)

The Parameters of a PDE

468

(1)

Changing One Thing in a PDE Can Be a Big Deal

469

(2)

Can AI Step In?

471

(1)

Numerical Solutions Are Very Valuable

472

(21)

Continuous Functions Versus Discrete Functions

472

(2)

PDE Themes from My Ph.D. Thesis

474

(3)

Discretization and the Curse of Dimensionality

477

(1)

Finite Differences

478

(6)

Finite Elements

484

(5)

Variational or Energy Methods

489

(1)

Monte Carlo Methods

490

(3)

Some Statistical Mechanics: The Wonderful Master Equation

493

(2)

Solutions as Expectations of Underlying Random Processes

495

(1)

Transforming the PDE

495

(4)

Fourier Transform

495

(3)

Laplace Transform

498

(1)

Solution Operators

499

(10)

Example Using the Heat Equation

499

(2)

Example Using the Poisson Equation

501

(2)

Fixed Point Iteration

503

(6)

AI for PDEs

509

(13)

Deep Learning to Learn Physical Parameter Values

509

(1)

Deep Learning to Learn Meshes

510

(2)

Deep Learning to Approximate Solution Operators of PDEs

512

(7)

Numerical Solutions of High-Dimensional Differential Equations

519

(1)

Simulating Natural Phenomena Directly from Data

520

(2)

Hamilton-Jacobi-Bellman PDE for Dynamic Programming

522

(6)

PDEs for AI?

528

(1)

Other Considerations in Partial Differential Equations

528

(2)

Summary and Looking Ahead

530

(1)

14 Artificial Intelligence, Ethics, Mathematics, Law, and Policy

531

(18)

Good AI

533

(1)

Policy Matters

534

(2)

What Could Go Wrong?

536

(3)

From Math to Weapons

536

(1)

Chemical Warfare Agents

537

(1)

AI and Politics

538

(1)

Unintended Outcomes of Generative Models

539

(1)

How to Fix It?

539

(5)

Addressing Underrepresentation in Training Data

539

(1)

Addressing Bias in Word Vectors

540

(1)

Addressing Privacy

540

(1)

Addressing Fairness

541

(1)

Injecting Morality into AI

542

(1)

Democratization and Accessibility of AI to Nonexperts

543

(1)

Prioritizing High Quality Data

543

(1)

Distinguishing Bias from Discrimination

544

(1)

The Hype

545

(1)

Final Thoughts

546

(3)

Index

549

Hala Nelson is an Associate Professor of Mathematics at James Madison University. She has a Ph.D. in mathematics from the Courant Institute of Mathematical Sciences at New York University. Prior to James Madison University, she was a postdoctoral assistant professor at the University of Michigan, Ann Arbor. She specializes in mathematical modeling and consults for emergency and infrastructure services in the public sector. She likes to translate complex ideas into simple and practical terms. To her, most mathematical concepts are painless and relatable, unless the person presenting them either does not understand them very well or is trying to show off. Other facts: Hala Nelson grew up in Lebanon during its brutal civil war. She lost her hair at a very young age in a missile explosion. This event, and many that followed, shaped her interests in human behavior, the nature of intelligence, and AI. Her dad taught her math, at home and in French, until she graduated high school. Her favorite quote from her dad about math is, "It is the one clean science".

Essential Math for AI: Next-Level Mathematics for Efficient and Successful AI Systems [Mīkstie vāki]

Konts un iestatījumi

Meklēšana

Meklēt datubāzē

Refine By

Tēmas Grāmatas angļu valodā

Izvēlieties iepirkumu grozu