Preface  vii
Acknowledgments  xi
1.2 Difficulties with Classical Optimization  3
1.3 Recent Advances in Simulation Optimization  3
3.1 Some Basic Conventions  6
3.3 Notation for Matrices  8
3.4 Notation for n-tuples  8
3.6 Notation for Sequences  9
3.7 Notation for Transformations  9
3.8 Max, Min, and Arg Max  10
4.1 Random Number Generation  16
4.3 Independence of Samples Collected  25
3 Simulation-Based Optimization: An Overview  29
2 Parametric Optimization  29
4 Parametric Optimization: Response Surfaces and Neural Networks  37
3.3 How Good Is the Guessed Metamodel?  46
3.4 Optimization with a Metamodel  46
4 Neuro-Response Surface Methods  47
4.1 Linear Neural Networks  48
4.2 Non-linear Neural Networks  53
5 Parametric Optimization: Stochastic Gradients and Adaptive Search  71
2 Continuous Optimization  72
2.2 Non-derivative Methods  83
3.1 Ranking and Selection  86
3.3 Stochastic Adaptive Search  98
6 Control Optimization with Stochastic Dynamic Programming  123
3 Markov, Semi-Markov, and Decision Processes  125
3.2 Semi-Markov Processes  135
3.3 Markov Decision Problems  137
4 Average Reward MDPs and DP  150
4.1 Bellman Policy Equation  151
4.3 Value Iteration and Its Variants  154
5 Discounted Reward MDPs and DP  159
5.2 Discounted Reward MDPs  161
5.3 Bellman Policy Equation  162
5.6 Gauss-Seidel Value Iteration  166
6 Bellman Equation Revisited  167
7 Semi-Markov Decision Problems  169
7.1 Natural and Decision-Making Processes  171
7.3 Discounted Reward SMDPs  180
8 Modified Policy Iteration  184
8.1 Steps for Discounted Reward MDPs  185
8.2 Steps for Average Reward MDPs  186
9 The MDP and Mathematical Programming  187
7 Control Optimization with Reinforcement Learning  197
3 Reinforcement Learning: Fundamentals  203
3.2 Q-Factor Value Iteration  206
3.3 Robbins-Monro Algorithm  207
3.4 Robbins-Monro and Q-Factors  208
3.5 Asynchronous Updating and Step Sizes  209
4.3 R-SMART and Other Algorithms  233
6 Model-Building Algorithms  244
6.2 Model-Building Q-Learning  246
6.3 Indirect Model-Building  247
7 Finite Horizon Problems  248
8 Control Optimization with Stochastic Search  269
2.1 Step-by-Step Details of an MCAT Algorithm  272
2.2 An Illustrative 3-State Example  274
3.1 Discounted Reward MDPs  277
9 Convergence: Background Material  281
2 Vectors and Vector Spaces  282
5.2 The Notation for Transformations  287
7.2 Increasing and Decreasing Sequences  293
7.4 Limit Theorems and Squeeze Theorem  299
10 Contraction Mappings in R^n  303
11 Stochastic Approximation  309
11.1 Convergence with Probability 1  309
11.2 Ordinary Differential Equations  310
11.3 Stochastic Approximation and ODEs  313
10 Convergence Analysis of Parametric Optimization Methods  319
2.3 A Continuously Differentiable Function  320
2.4 Stationary Points and Local and Global Optima  321
4 Finite Differences Perturbation Estimates  327
5 Simultaneous Perturbation  328
6 Stochastic Adaptive Search  336
6.2 Learning Automata Search Technique  340
6.3 Backtracking Adaptive Search  341
6.5 Modified Stochastic Ruler  349
11 Convergence Analysis of Control Optimization Methods  351
2 Dynamic Programming: Background  351
2.2 Monotonicity of T, T_μ, L, and L_μ  354
2.3 Key Results for Average and Discounted MDPs  355
3 Discounted Reward DP: MDPs  358
3.1 Bellman Equation for Discounted Reward  358
4 Average Reward DP: MDPs  371
4.1 Bellman Equation for Average Reward  371
6 Asynchronous Stochastic Approximation  390
6.1 Asynchronous Convergence  390
6.2 Two-Time-Scale Convergence  396
7 Reinforcement Learning: Convergence Background  400
7.1 Discounted Reward MDPs  401
8 Reinforcement Learning for MDPs: Convergence  404
8.1 Q-Learning: Discounted Reward MDPs  404
8.2 Relative Q-Learning: Average Reward MDPs  415
8.3 CAP-I: Discounted Reward MDPs  417
8.4 Q-P-Learning: Discounted Reward MDPs  422
9 Reinforcement Learning for SMDPs: Convergence  424
9.1 Q-Learning: Discounted Reward SMDPs  424
10 Reinforcement Learning for Finite Horizon: Convergence  447
2 Airline Revenue Management  451
4 Production Line Buffer Optimization  463
Appendix  473
Bibliography  477
Index  504